Models

1. CLMGraph framework

The overview of the Contrastive Learning based Multi-Task Graph Neural Network framework (CLMGraph) is shown in Figure 1. CLMGraph consists of an input module, a feature extraction module, and an ADMET predictor module.

Input Module:

Convert compound $x$ into a molecular graph $G_x = (N_x, E_x)$, where $N_x$ represents the node matrix and $E_x$ represents the edge feature matrix.
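
As a rough illustration of what such a graph can look like in memory, the sketch below encodes ethanol by hand. The atom and bond features chosen here (atomic number, heavy-atom degree, one-hot bond type) and the helper name `build_graph` are hypothetical; the actual featurization used by CLMGraph is not specified in this section.

```python
# Hand-written ethanol (SMILES "CCO"), heavy atoms only, so no toolkit is needed.
atoms = ["C", "C", "O"]
bonds = [(0, 1, "single"), (1, 2, "single")]

ATOMIC_NUMBER = {"C": 6, "N": 7, "O": 8}
BOND_TYPES = ["single", "double", "triple", "aromatic"]


def build_graph(atoms, bonds):
    """Return (node matrix, edge feature matrix, edge index) as plain lists.

    Hypothetical features: [atomic number, heavy-atom degree] per node and a
    one-hot bond type per directed edge.
    """
    degree = [0] * len(atoms)
    for u, v, _ in bonds:
        degree[u] += 1
        degree[v] += 1
    node_matrix = [[ATOMIC_NUMBER[a], degree[i]] for i, a in enumerate(atoms)]

    edge_index, edge_matrix = [], []
    for u, v, kind in bonds:
        one_hot = [1 if kind == t else 0 for t in BOND_TYPES]
        # Store each bond in both directions so messages can flow both ways.
        for src, dst in ((u, v), (v, u)):
            edge_index.append((src, dst))
            edge_matrix.append(one_hot)
    return node_matrix, edge_matrix, edge_index


N_x, E_x, edge_index = build_graph(atoms, bonds)
```

Here `N_x` has one row per atom and `E_x` one row per directed bond, which mirrors the pair $(N_x, E_x)$ described above.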

Feature Extraction Module:

Update the node and edge information of the molecular graph using a Graph Neural Network (GNN): $G_x' = (N_x', E_x')$. Optimize the GNN by maximizing the mutual information between pairs of augmented molecular graphs using a self-supervised pre-training strategy based on contrastive learning. Use NT-Xent as the objective function to train the model:

$$\mathrm{Loss}_{i,j} = -\log \frac{\exp\left(\mathrm{sim}(z_i, z_i')/\tau\right)}{\sum_{k=1}^{N} \exp\left(\mathrm{sim}(z_i, z_k)/\tau\right)}$$

where $z_i$, $z_i'$, and $z_k$ are the feature vectors extracted by the GNN from positive pairs and negative samples, $N$ is the batch size, $\tau$ is the temperature parameter, and $\mathrm{sim}(z_i, z_i')$ represents cosine similarity.
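
The NT-Xent objective can be sketched in plain Python as follows. This is a minimal illustration, not the training code: it assumes that the candidates $z_k$ in the denominator are the second-view embeddings of all $N$ molecules in the batch (so $k = i$ is the positive), and the default temperature `tau=0.1` is an arbitrary placeholder. Embeddings must be non-zero vectors.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def nt_xent(z, z_prime, tau=0.1):
    """Batch mean of -log(exp(sim(z_i, z_i')/tau) / sum_k exp(sim(z_i, z_k')/tau)).

    Assumption: z[i] and z_prime[i] are embeddings of two augmentations of the
    same molecule; every z_prime[k] with k != i acts as a negative sample.
    """
    n = len(z)
    total = 0.0
    for i in range(n):
        logits = [cosine(z[i], z_prime[k]) / tau for k in range(n)]
        # Log-sum-exp with the maximum subtracted for numerical stability.
        top = max(logits)
        log_denominator = top + math.log(sum(math.exp(l - top) for l in logits))
        total += log_denominator - logits[i]
    return total / n


views_a = [[1.0, 0.0], [0.0, 1.0]]
aligned = nt_xent(views_a, [[1.0, 0.0], [0.0, 1.0]])   # positives agree
shuffled = nt_xent(views_a, [[0.0, 1.0], [1.0, 0.0]])  # positives disagree
```

The loss is close to zero when each positive pair points in the same direction and grows when a negative sample is more similar to the anchor than its own positive, which is the behavior the pre-training step relies on.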

ADMET Prediction Module:

Jointly train the pre-trained model on 108 ADMET prediction tasks to leverage and share relevant information across similar tasks. Optimize the ADMET prediction module using a related task loss:

$$\mathrm{Loss}_t = \sum_{n=1}^{N} \mathrm{Loss}_n$$

where $\mathrm{Loss}_n$ is the loss of the $n$-th task and $N$ here counts the tasks being summed, not the batch size used in the NT-Xent loss.

Overall, the CLMGraph framework combines contrastive learning and multi-task learning: self-supervised pre-training extracts features from compound structures, and joint training on the ADMET tasks shares information among related tasks to improve prediction performance and accelerate learning.
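
The summed task loss can be sketched as below. The text only states that the per-task losses are added, so everything else here is an assumption made for illustration: binary cross-entropy for classification tasks, squared error for regression tasks, averaging within a task, and skipping compounds whose label for a task is missing (`None`). The function name `related_task_loss` is hypothetical.

```python
import math


def bce(p, y):
    """Binary cross-entropy for one predicted probability p and label y."""
    eps = 1e-7
    p = min(max(p, eps), 1.0 - eps)
    return -(y * math.log(p) + (1 - y) * math.log(1.0 - p))


def squared_error(pred, y):
    """Squared error for one regression prediction."""
    return (pred - y) ** 2


def related_task_loss(predictions, labels, task_kinds):
    """Sum over tasks of the mean per-compound loss (Loss_t = sum_n Loss_n).

    Assumption: a label of None means the compound was not measured for that
    task, so it contributes nothing to that task's loss.
    """
    total = 0.0
    for task, kind in task_kinds.items():
        loss_fn = bce if kind == "classification" else squared_error
        pairs = [(p, y) for p, y in zip(predictions[task], labels[task])
                 if y is not None]
        if not pairs:
            continue  # no labeled compounds for this task in the batch
        total += sum(loss_fn(p, y) for p, y in pairs) / len(pairs)
    return total


task_kinds = {"BBB": "classification", "logS": "regression"}
predictions = {"BBB": [0.9, 0.2, 0.6], "logS": [-2.0, -3.5, -1.0]}
labels = {"BBB": [1, 0, None], "logS": [-2.5, None, -1.0]}
loss = related_task_loss(predictions, labels, task_kinds)
```

Because every task contributes to one scalar, a single backward pass updates the shared feature extractor with gradients from all tasks, which is how information is shared among them.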

2. Model data
Table 1. Data information of 107 predictive models

Property | Total (positive/negative) | Training set (positive/negative) | Validation set (positive/negative) | Test set (positive/negative)
logS | 1514 | 1080 | 275 | 159
logP | 13076 | 9392 | 2368 | 1316
pKa | 5149 | 3723 | 936 | 490
Acidic pKa | 2719 | 1965 | 491 | 263
Basic pKa | 2725 | 1954 | 514 | 257
Caco-2 classify | 962/749 | 679/543 | 183/130 | 100/76
Caco-2 | 1750 | 1255 | 316 | 179
HIA | 65/437 | 51/301 | 12/91 | 2/45
MDCK | 217/205 | 151/143 | 43/39 | 23/23
F50% | 854/1086 | 625/768 | 154/207 | 75/111
F30% | 598/1342 | 433/960 | 115/246 | 50/136
F20% | 445/1495 | 323/1070 | 88/273 | 34/152
BBB | 451/1976 | 323/1444 | 74/338 | 54/194
OATP1B1 inhibitor | 176/1276 | 131/914 | 36/211 | 9/151
OATP1B3 inhibitor | 118/1348 | 84/971 | 27/221 | 7/156
OATP2B1 inhibitor | 132/38 | 92/24 | 34/11 | 6/3
OCT1 inhibitor | 84/34 | 70/21 | 9/11 | 5/2
OCT2 inhibitor | 522/150 | 367/115 | 99/21 | 56/14
BCRP inhibitor | 483/423 | 341/306 | 87/71 | 55/46
BSEP inhibitor | 212/258 | 160/178 | 31/47 | 21/33
MATE1 inhibitor | 557/59 | 400/43 | 107/7 | 50/9
Pgp inhibitor | 651/1118 | 485/842 | 103/178 | 63/98
Pgp substrate | 797/669 | 569/490 | 149/116 | 79/63
PPB | 3804 | 2716 | 668 | 420
VDss | 1166 | 845 | 207 | 114
CYP1A2 inhibitor | 9037/8095 | 6564/5784 | 1568/1506 | 905/805
CYP3A4 inhibitor | 13790/5734 | 9932/4084 | 2491/1025 | 1367/625
CYP2B6 inhibitor | 306/182 | 215/135 | 58/28 | 33/19
CYP2C9 inhibitor | 11516/6277 | 8338/4507 | 2074/1099 | 1104/671
CYP2C19 inhibitor | 10087/7974 | 7301/5719 | 1799/1433 | 987/822
CYP2D6 inhibitor | 14264/3773 | 10269/2698 | 2577/670 | 1418/405
CYP1A2 substrate | 1055/457 | 746/341 | 198/71 | 111/45
CYP3A4 substrate | 1747/1330 | 1257/966 | 314/240 | 176/124
CYP2B6 substrate | 248/229 | 183/178 | 40/34 | 25/17
CYP2C9 substrate | 1486/511 | 1071/361 | 271/97 | 144/53
CYP2C19 substrate | 1038/398 | 743/284 | 200/76 | 95/38
CYP2D6 substrate | 1519/615 | 1104/452 | 279/98 | 136/65
HLM | 3871/2126 | 2800/1525 | 689/407 | 382/194
RLM | 2270/3294 | 1613/2361 | 419/600 | 238/333
UGT substrate | 537/597 | 390/441 | 95/95 | 52/61
CLp | 565/521 | 409/380 | 94/98 | 62/43
CLr | 204/280 | 152/207 | 34/43 | 18/30
T1/2 | 1155 | 837 | 208 | 110
MRT | 1165 | 844 | 209 | 112
Neurotoxicity | 419 | 303 | 67 | 49
DILI | 901/1255 | 672/903 | 145/229 | 84/123
hERG 1uM | 3696/797 | 2633/599 | 718/135 | 345/63
hERG 10uM | 1826/2667 | 1285/1947 | 360/493 | 181/227
hERG 30uM | 912/3581 | 630/2602 | 183/670 | 99/309
hERG 1-10uM | 1826/797 | 1285/599 | 360/135 | 181/63
hERG 10-30uM | 912/2667 | 630/1947 | 183/493 | 99/227
Respiratory toxicity | 997/1185 | 697/875 | 188/202 | 112/108
Nephrotoxicity | 350/301 | 262/215 | 62/55 | 26/31
Eye corrosion | 1194/855 | 849/613 | 218/141 | 127/101
Eye irritation | 1137/3761 | 813/2739 | 208/680 | 116/342
Skin corrosion | 1187/698 | 854/469 | 199/136 | 134/93
Skin irritation | 1187/1256 | 854/913 | 199/218 | 134/125
Skin sensitisation | 565/354 | 388/250 | 109/66 | 68/38
Acute dermal toxicity | 664/784 | 453/561 | 138/146 | 73/77
Ames mutagenesis | 3601/4372 | 2561/3197 | 672/762 | 368/413
Mouse carcinogenicity classify | 386/371 | 267/263 | 76/75 | 43/33
Mouse carcinogenicity | 376 | 265 | 79 | 32
Rat carcinogenicity classify | 474/500 | 338/349 | 92/101 | 44/50
Rat carcinogenicity | 498 | 345 | 105 | 48
Rodents carcinogenicity | 394/498 | 279/359 | 73/92 | 42/47
Micronucleus | 364/225 | 267/157 | 58/48 | 39/20
Reproductive toxicity | 917/783 | 653/553 | 175/148 | 89/82
Mitochondrial toxicity | 1504/1361 | 1101/981 | 253/249 | 150/131
Hemolytic toxicity | 496/387 | 348/271 | 97/74 | 51/42
Repeated dose toxicity | 674/1145 | 474/822 | 135/203 | 65/120
Acute oral toxicity classify | 1343/2581 | 936/1908 | 242/417 | 165/256
Acute oral toxicity | 13493 | 9773 | 2332 | 1388
FDAMDD classify | 505/428 | 380/315 | 86/72 | 39/41
FDAMDD | 1074 | 801 | 176 | 97
AR | 5745/524 | 4143/391 | 1030/87 | 572/46
ER | 6220/391 | 4484/293 | 1113/66 | 623/32
AR-LBD | 5818/542 | 4206/394 | 1037/96 | 575/52
ER-LBD | 5625/663 | 4047/502 | 1021/104 | 557/57
Aromatase | 4977/655 | 3593/473 | 887/117 | 497/65
AhR | 5205/707 | 3782/509 | 916/138 | 507/60
ARE | 4256/814 | 3030/606 | 791/145 | 435/63
ATAD5 | 6162/328 | 4457/254 | 1094/47 | 611/27
HSE | 5295/318 | 3807/238 | 954/51 | 534/29
p53 | 5772/404 | 4170/297 | 1038/74 | 564/33
PPARγ | 5949/471 | 4316/335 | 1046/95 | 587/41
MMP | 4214/797 | 3026/586 | 769/135 | 419/76
TR | 4578/331 | 3272/249 | 847/52 | 459/30
GR | 4946/549 | 3555/390 | 893/110 | 498/49
Honey bee toxicity | 462/158 | 333/117 | 83/23 | 46/18
Avian toxicity (Colinus virginianus) | 395/84 | 282/71 | 71/9 | 42/4
Avian toxicity (Anas platyrhynchos) | 416/82 | 310/62 | 65/14 | 41/6
Aquatic toxicity (P. subcapitata) | 385/464 | 264/327 | 74/93 | 47/44
Aquatic toxicity (Crustaceans) | 326/369 | 236/279 | 53/54 | 37/36
Aquatic toxicity (D. magna) | 302/287 | 221/216 | 45/41 | 36/30
Aquatic toxicity (Fish) | 597/831 | 436/628 | 102/125 | 59/78
Aquatic toxicity (Fathead minnow) | 437/424 | 312/322 | 79/56 | 46/46
Aquatic toxicity (Bluegill sunfish) | 264/504 | 195/382 | 42/77 | 27/45
Aquatic toxicity (Rainbow trout) | 217/553 | 156/422 | 41/81 | 20/50
Aquatic toxicity (Sheepshead minnow) | 102/228 | 84/168 | 11/38 | 7/22
Aquatic toxicity (T. pyriformis, classify) | 337/965 | 250/735 | 51/150 | 36/80
Aquatic toxicity (T. pyriformis) | 1310 | 991 | 202 | 117
BCF classify | 488/156 | 346/107 | 91/30 | 51/19
BCF | 463 | 331 | 79 | 53
Biodegradability | 906/550 | 644/401 | 158/94 | 104/55
Photoinduced toxicity | 659/627 | 473/458 | 118/115 | 68/54
Phototoxicity/Photoirritation | 221/146 | 150/108 | 43/26 | 28/12
Photoallergy | 660/462 | 475/333 | 117/85 | 68/44

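
The counts in Table 1 can be sanity-checked with a few lines of Python: for every row, the training, validation, and test counts should add up to the total, separately for each class. The sketch below does this for three rows copied from the table; the helper names `parse_counts` and `check_row` are hypothetical.

```python
def parse_counts(cell):
    """'962/749' -> (962, 749) for classification; '1514' -> (1514,) for regression."""
    return tuple(int(part) for part in cell.split("/"))


def check_row(total, train, valid, test):
    """Verify train + validation + test == total and return the split fractions."""
    parts = [parse_counts(cell) for cell in (train, valid, test)]
    summed = tuple(sum(column) for column in zip(*parts))
    if summed != parse_counts(total):
        raise ValueError(f"splits sum to {summed}, expected {parse_counts(total)}")
    n_compounds = sum(summed)
    return [round(sum(part) / n_compounds, 3) for part in parts]


# Three rows copied from Table 1: (total, training, validation, test).
rows = {
    "logS": ("1514", "1080", "275", "159"),
    "Caco-2 classify": ("962/749", "679/543", "183/130", "100/76"),
    "BBB": ("451/1976", "323/1444", "74/338", "54/194"),
}
fractions = {name: check_row(*cells) for name, cells in rows.items()}
```

For these three rows the split works out to roughly 71-73% training, 17-18% validation, and 10% test.
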
3. Model performance
Table 2. Predictive performance of regression models

Property | Test PCC | Test R2 | Test MAE | Test RMSE | Validation PCC | Validation R2 | Validation MAE | Validation RMSE
logS | 0.961 | 0.924 | 0.459 | 0.603 | 0.956 | 0.915 | 0.499 | 0.69
logP | 0.968 | 0.936 | 0.346 | 0.495 | 0.967 | 0.935 | 0.363 | 0.507
pKa | 0.803 | 0.645 | 1.054 | 2.014 | 0.869 | 0.755 | 0.924 | 1.63
Acidic pKa | 0.883 | 0.78 | 0.918 | 1.626 | 0.903 | 0.815 | 0.829 | 1.449
Basic pKa | 0.781 | 0.61 | 1.046 | 2.183 | 0.893 | 0.797 | 0.85 | 1.565
Caco-2 | 0.857 | 0.735 | 0.306 | 0.402 | 0.818 | 0.669 | 0.33 | 0.419
PPB | 0.814 | 0.662 | 0.108 | 0.154 | 0.846 | 0.716 | 0.107 | 0.151
VDss | 0.778 | 0.605 | 0.322 | 0.447 | 0.819 | 0.671 | 0.309 | 0.422
T1/2 | 0.506 | 0.256 | 0.43 | 0.593 | 0.599 | 0.358 | 0.379 | 0.544
MRT | 0.523 | 0.273 | 0.449 | 0.615 | 0.603 | 0.364 | 0.383 | 0.542
Neurotoxicity | 0.817 | 0.668 | 0.238 | 0.312 | 0.752 | 0.566 | 0.199 | 0.293
Mouse carcinogenicity | 0.68 | 0.462 | 0.729 | 0.922 | 0.72 | 0.518 | 0.851 | 1.085
Rat carcinogenicity | 0.69 | 0.476 | 0.864 | 1.071 | 0.61 | 0.372 | 1.095 | 1.334
Acute oral toxicity | 0.77 | 0.592 | 0.378 | 0.534 | 0.765 | 0.585 | 0.387 | 0.54
FDAMDD | 0.585 | 0.342 | 0.738 | 1.025 | 0.755 | 0.57 | 0.525 | 0.693
Aquatic toxicity (T. pyriformis) | 0.937 | 0.879 | 0.283 | 0.39 | 0.945 | 0.892 | 0.274 | 0.35
BCF | 0.858 | 0.737 | 0.505 | 0.645 | 0.853 | 0.727 | 0.511 | 0.684

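
For readers who want to reproduce the four regression metrics, a minimal sketch is given below. It assumes that R2 is the squared Pearson correlation coefficient, which appears consistent with the tabulated values within rounding (for logS on the test set, 0.961 squared is about 0.924); the evaluation code itself is not shown in this document, so that reading is an inference. The function name `regression_metrics` is hypothetical.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return PCC, R2 (taken here as PCC squared), MAE and RMSE."""
    n = len(y_true)
    mean_true = sum(y_true) / n
    mean_pred = sum(y_pred) / n
    covariance = sum((t - mean_true) * (p - mean_pred)
                     for t, p in zip(y_true, y_pred))
    var_true = sum((t - mean_true) ** 2 for t in y_true)
    var_pred = sum((p - mean_pred) ** 2 for p in y_pred)
    pcc = covariance / math.sqrt(var_true * var_pred)

    errors = [p - t for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    return {"PCC": pcc, "R2": pcc ** 2, "MAE": mae, "RMSE": rmse}


# Toy data: predictions are perfectly correlated but offset by 0.5.
metrics = regression_metrics([1.0, 2.0, 3.0, 4.0], [1.5, 2.5, 3.5, 4.5])
```

The toy data illustrates why the table reports both kinds of metric: a constant offset leaves PCC at 1 while MAE and RMSE are both 0.5.
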
Table 3. Predictive performance of classification models

Property | Test AUC | Test Precision | Test Recall | Test Specificity | Validation AUC | Validation Precision | Validation Recall | Validation Specificity
Caco-2 classify | 0.95 | 0.846 | 0.898 | 0.878 | 0.905 | 0.795 | 0.816 | 0.84
HIA | 0.977 | 0.961 | 0.936 | 0.824 | 1.0 | 0.978 | 1.0 | 0.5
MDCK | 0.838 | 0.862 | 0.641 | 0.879 | 0.786 | 0.773 | 0.739 | 0.783
F50% | 0.78 | 0.738 | 0.756 | 0.654 | 0.808 | 0.771 | 0.73 | 0.68
F30% | 0.792 | 0.802 | 0.892 | 0.514 | 0.868 | 0.881 | 0.875 | 0.68
F20% | 0.763 | 0.826 | 0.937 | 0.354 | 0.879 | 0.92 | 0.908 | 0.647
BBB | 0.927 | 0.932 | 0.983 | 0.69 | 0.946 | 0.917 | 0.974 | 0.685
OATP1B1 inhibitor | 0.823 | 0.939 | 0.943 | 0.462 | 0.709 | 0.967 | 0.98 | 0.444
OATP1B3 inhibitor | 0.891 | 0.947 | 0.983 | 0.235 | 0.889 | 0.963 | 1.0 | 0.143
OATP2B1 inhibitor | 0.851 | 0.333 | 0.143 | 0.909 | 1.0 | 0.0 | 0.0 | 1.0
OCT1 inhibitor | 0.889 | 0.6 | 0.6 | 0.889 | 1.0 | 1.0 | 0.5 | 1.0
OCT2 inhibitor | 0.84 | 0.643 | 0.29 | 0.94 | 0.839 | 0.625 | 0.357 | 0.946
BCRP inhibitor | 0.886 | 0.794 | 0.89 | 0.761 | 0.861 | 0.724 | 0.913 | 0.709
BSEP inhibitor | 0.883 | 0.776 | 0.865 | 0.675 | 0.938 | 0.838 | 0.939 | 0.714
MATE1 inhibitor | 0.851 | 0.0 | 0.0 | 1.0 | 0.733 | 0.0 | 0.0 | 1.0
Pgp inhibitor | 0.924 | 0.837 | 0.938 | 0.678 | 0.892 | 0.836 | 0.939 | 0.714
Pgp substrate | 0.839 | 0.75 | 0.745 | 0.765 | 0.826 | 0.721 | 0.778 | 0.759
CYP1A2 inhibitor | 0.912 | 0.809 | 0.837 | 0.826 | 0.922 | 0.824 | 0.873 | 0.834
CYP3A4 inhibitor | 0.901 | 0.701 | 0.735 | 0.871 | 0.897 | 0.74 | 0.715 | 0.885
CYP2B6 inhibitor | 0.741 | 0.571 | 0.48 | 0.83 | 0.817 | 0.667 | 0.737 | 0.788
CYP2C9 inhibitor | 0.888 | 0.732 | 0.759 | 0.852 | 0.895 | 0.781 | 0.768 | 0.87
CYP2C19 inhibitor | 0.897 | 0.796 | 0.836 | 0.827 | 0.909 | 0.79 | 0.845 | 0.813
CYP2D6 inhibitor | 0.847 | 0.67 | 0.524 | 0.935 | 0.872 | 0.73 | 0.548 | 0.942
CYP1A2 substrate | 0.923 | 0.758 | 0.758 | 0.878 | 0.914 | 0.68 | 0.756 | 0.856
CYP3A4 substrate | 0.873 | 0.78 | 0.786 | 0.802 | 0.867 | 0.721 | 0.855 | 0.767
CYP2B6 substrate | 0.885 | 0.78 | 0.852 | 0.735 | 0.856 | 0.722 | 0.765 | 0.8
CYP2C9 substrate | 0.857 | 0.58 | 0.554 | 0.861 | 0.851 | 0.667 | 0.566 | 0.896
CYP2C19 substrate | 0.944 | 0.816 | 0.827 | 0.922 | 0.912 | 0.727 | 0.632 | 0.905
CYP2D6 substrate | 0.885 | 0.726 | 0.625 | 0.886 | 0.92 | 0.79 | 0.754 | 0.904
HLM | 0.85 | 0.743 | 0.661 | 0.882 | 0.822 | 0.669 | 0.552 | 0.861
RLM | 0.813 | 0.763 | 0.829 | 0.624 | 0.828 | 0.772 | 0.853 | 0.647
UGT substrate | 0.806 | 0.768 | 0.729 | 0.757 | 0.716 | 0.676 | 0.754 | 0.577
CLp | 0.807 | 0.713 | 0.72 | 0.722 | 0.792 | 0.675 | 0.628 | 0.79
CLr | 0.685 | 0.592 | 0.913 | 0.341 | 0.517 | 0.641 | 0.833 | 0.222
DILI | 0.716 | 0.742 | 0.715 | 0.615 | 0.707 | 0.7 | 0.74 | 0.536
hERG 1uM | 0.868 | 0.654 | 0.614 | 0.92 | 0.879 | 0.561 | 0.587 | 0.916
hERG 10uM | 0.884 | 0.803 | 0.894 | 0.675 | 0.88 | 0.761 | 0.912 | 0.641
hERG 30uM | 0.919 | 0.906 | 0.963 | 0.637 | 0.908 | 0.854 | 0.968 | 0.485
hERG 1-10uM | 0.946 | 0.806 | 0.873 | 0.896 | 0.96 | 0.679 | 0.905 | 0.851
hERG 10-30uM | 0.941 | 0.92 | 0.958 | 0.769 | 0.938 | 0.869 | 0.96 | 0.667
Respiratory toxicity | 0.913 | 0.889 | 0.804 | 0.866 | 0.904 | 0.827 | 0.796 | 0.839
Nephrotoxicity | 0.777 | 0.719 | 0.695 | 0.789 | 0.744 | 0.655 | 0.613 | 0.615
Eye corrosion | 0.985 | 0.886 | 0.986 | 0.92 | 0.98 | 0.923 | 0.95 | 0.937
Eye irritation | 0.951 | 0.921 | 0.981 | 0.736 | 0.978 | 0.944 | 0.991 | 0.828
Skin corrosion | 0.931 | 0.785 | 0.848 | 0.866 | 0.937 | 0.865 | 0.828 | 0.91
Skin irritation | 0.782 | 0.698 | 0.77 | 0.673 | 0.794 | 0.683 | 0.76 | 0.672
Skin sensitisation | 0.915 | 0.8 | 0.727 | 0.881 | 0.854 | 0.683 | 0.737 | 0.809
Acute dermal toxicity | 0.848 | 0.835 | 0.722 | 0.812 | 0.764 | 0.689 | 0.662 | 0.685
Ames mutagenesis | 0.88 | 0.808 | 0.837 | 0.745 | 0.855 | 0.773 | 0.833 | 0.726
Mouse carcinogenicity classify | 0.718 | 0.644 | 0.613 | 0.687 | 0.704 | 0.618 | 0.636 | 0.698
Rat carcinogenicity classify | 0.708 | 0.615 | 0.709 | 0.598 | 0.779 | 0.75 | 0.78 | 0.705
Rodents carcinogenicity | 0.731 | 0.64 | 0.679 | 0.557 | 0.85 | 0.776 | 0.809 | 0.738
Micronucleus | 0.949 | 0.767 | 0.868 | 0.873 | 0.936 | 0.85 | 0.85 | 0.923
Reproductive toxicity | 0.887 | 0.939 | 0.655 | 0.963 | 0.885 | 0.9 | 0.659 | 0.933
Mitochondrial toxicity | 0.908 | 0.843 | 0.836 | 0.853 | 0.914 | 0.829 | 0.817 | 0.853
Hemolytic toxicity | 0.847 | 0.709 | 0.847 | 0.709 | 0.845 | 0.721 | 0.738 | 0.765
Repeated dose toxicity | 0.732 | 0.741 | 0.827 | 0.544 | 0.736 | 0.761 | 0.875 | 0.492
Acute oral toxicity classify | 0.892 | 0.892 | 0.859 | 0.777 | 0.87 | 0.828 | 0.867 | 0.721
FDAMDD classify | 0.829 | 0.76 | 0.76 | 0.791 | 0.894 | 0.783 | 0.878 | 0.744
AR | 0.851 | 0.649 | 0.363 | 0.981 | 0.934 | 0.526 | 0.435 | 0.969
ER | 0.902 | 0.72 | 0.462 | 0.987 | 0.949 | 0.586 | 0.531 | 0.981
AR-LBD | 0.863 | 0.574 | 0.39 | 0.972 | 0.931 | 0.591 | 0.5 | 0.969
ER-LBD | 0.867 | 0.546 | 0.44 | 0.952 | 0.934 | 0.558 | 0.509 | 0.959
Aromatase | 0.822 | 0.565 | 0.416 | 0.955 | 0.85 | 0.481 | 0.4 | 0.944
AhR | 0.864 | 0.557 | 0.412 | 0.955 | 0.869 | 0.617 | 0.483 | 0.964
ARE | 0.856 | 0.604 | 0.384 | 0.946 | 0.884 | 0.655 | 0.603 | 0.954
ATAD5 | 0.835 | 0.765 | 0.191 | 0.996 | 0.917 | 0.6 | 0.222 | 0.993
HSE | 0.852 | 0.625 | 0.185 | 0.994 | 0.831 | 0.5 | 0.172 | 0.991
p53 | 0.866 | 0.455 | 0.205 | 0.983 | 0.884 | 0.5 | 0.242 | 0.986
PPARγ | 0.922 | 0.274 | 0.586 | - | 0.939 | 0.296 | 0.615 | -
MMP | 0.93 | 0.755 | 0.669 | 0.955 | 0.949 | 0.75 | 0.789 | 0.952
TR | 0.865 | 0.636 | 0.233 | 0.99 | 0.857 | 0.571 | 0.133 | 0.993
GR | 0.906 | 0.812 | 0.544 | 0.985 | 0.912 | 0.812 | 0.531 | 0.988
Honey bee toxicity | 0.799 | 0.579 | 0.367 | 0.907 | 0.955 | 0.923 | 0.667 | 0.978
Avian toxicity (Colinus virginianus) | 0.764 | 0.636 | 0.368 | 0.953 | 1.0 | 1.0 | 1.0 | 1.0
Avian toxicity (Anas platyrhynchos) | 0.821 | 0.727 | 0.421 | 0.966 | 0.907 | 0.8 | 0.667 | 0.976
Aquatic toxicity (P. subcapitata) | 0.895 | 0.821 | 0.897 | 0.729 | 0.811 | 0.667 | 0.864 | 0.596
Aquatic toxicity (Crustaceans) | 0.888 | 0.768 | 0.828 | 0.765 | 0.872 | 0.829 | 0.806 | 0.838
Aquatic toxicity (D. magna) | 0.895 | 0.741 | 0.833 | 0.788 | 0.856 | 0.815 | 0.733 | 0.861
Aquatic toxicity (Fish) | 0.911 | 0.826 | 0.902 | 0.721 | 0.91 | 0.829 | 0.872 | 0.763
Aquatic toxicity (Fathead minnow) | 0.911 | 0.843 | 0.843 | 0.817 | 0.891 | 0.844 | 0.826 | 0.848
Aquatic toxicity (Bluegill sunfish) | 0.92 | 0.848 | 0.918 | 0.709 | 0.942 | 0.878 | 0.956 | 0.778
Aquatic toxicity (Rainbow trout) | 0.893 | 0.852 | 0.945 | 0.647 | 0.916 | 0.9 | 0.9 | 0.75
Aquatic toxicity (Sheepshead minnow) | 0.938 | 0.818 | 0.957 | 0.643 | 0.89 | 0.955 | 0.955 | 0.857
Aquatic toxicity (T. pyriformis, classify) | 0.974 | 0.931 | 0.964 | 0.821 | 0.949 | 0.923 | 0.9 | 0.833
BCF classify | 0.958 | 0.583 | 0.875 | 0.851 | 0.943 | 0.739 | 0.895 | 0.882
Biodegradability | 0.839 | 0.628 | 0.772 | 0.753 | 0.872 | 0.643 | 0.818 | 0.76
Photoinduced toxicity | 0.712 | 0.659 | 0.68 | 0.656 | 0.743 | 0.633 | 0.704 | 0.676
Phototoxicity/Photoirritation | 0.719 | 0.609 | 0.467 | 0.8 | 0.747 | 0.438 | 0.583 | 0.679
Photoallergy | 0.703 | 0.6 | 0.6 | 0.73 | 0.744 | 0.667 | 0.636 | 0.794

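
The four classification metrics can be computed as in the sketch below. The decision threshold of 0.5 and the convention of reporting precision as 0.0 when no compound is predicted positive are assumptions for illustration, since the evaluation settings are not stated here; the function names are hypothetical.

```python
def confusion_metrics(y_true, y_score, threshold=0.5):
    """Precision, recall and specificity after thresholding the scores."""
    tp = fp = tn = fn = 0
    for label, score in zip(y_true, y_score):
        predicted = 1 if score >= threshold else 0
        if predicted == 1 and label == 1:
            tp += 1
        elif predicted == 1 and label == 0:
            fp += 1
        elif predicted == 0 and label == 0:
            tn += 1
        else:
            fn += 1
    return {
        # Ratios with an empty denominator are reported as 0.0 by convention.
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
        "specificity": tn / (tn + fp) if tn + fp else 0.0,
    }


def roc_auc(y_true, y_score):
    """Probability that a random positive outscores a random negative (ties count 0.5)."""
    positives = [s for y, s in zip(y_true, y_score) if y == 1]
    negatives = [s for y, s in zip(y_true, y_score) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in positives for n in negatives)
    return wins / (len(positives) * len(negatives))


# Toy case: the ranking is perfect, but every score falls below the threshold.
labels = [1, 1, 1, 0, 0]
scores = [0.40, 0.30, 0.45, 0.10, 0.20]
thresholded = confusion_metrics(labels, scores)
auc = roc_auc(labels, scores)
```

In this toy case AUC is 1.0 while precision and recall are both 0.0 and specificity is 1.0. AUC depends only on the ranking of the scores, whereas the other three metrics depend on the threshold, which is one possible reading of rows such as MATE1 inhibitor that combine a high AUC with zero precision and recall.
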
4. Implementation

The admetSAR3.0 frontend was built with HTML, CSS, and JavaScript, using jQuery for page control and event handling. It incorporates JSME as the molecular editor and Apache ECharts for data visualization. The core logic of the website is implemented in PHP, with MariaDB as the database management system. The prediction models were built in Python, with the modeling process supported by deep learning and cheminformatics packages such as PyTorch, DGL, DGL-LifeSci (dgllife), and RDKit. The server has been successfully tested on Google Chrome, Microsoft Edge, Mozilla Firefox, and Apple Safari.

Table 4. The development environment of admetSAR3.0

Software and tool | Version
PHP | 5.4.16
MySQL | 5.5.64
Anaconda | 4.12.0
NVIDIA CUDA Toolkit | 11.5
Python | 3.9.18
RDKit | 2022.03.2
dgl | 0.9.1
dgllife | 0.3.2
pytorch | 1.11.0
torchvision | 0.12.0
jQuery | 2.1.1
Bootstrap | 3.3.7
JSME | 2022-06-10
Apache ECharts | 5.4.0

5. Browser compatibility
Table 5. The browser compatibility of admetSAR3.0

OS | Version | Chrome | Edge | Firefox | Safari
Linux | Ubuntu 18.04.6 LTS | 79.0.3945.88 | - | 74.0.1 | -
macOS | Catalina 10.15.7 | 75.0.3770.80 | - | 71.0 | 13.1
Windows | 10 | 75.0.3770.80 | 80.0.361.109 | 71.0 | -