Human Intestinal Absorption (HIA) Database

The original HIA data set was collected from the literature and contains 578 compounds with FA% values. A threshold of 30% was used to divide the molecules into HIA+ and HIA- classes. To allow comparison, the 578 compounds (500 HIA+ and 78 HIA-) were divided into HIA_TrainingSet1, with 480 compounds (407 HIA+ and 73 HIA-), and HIA_TestSet1, with 98 compounds (93 HIA+ and 5 HIA-). Furthermore, to validate the generalization ability of the model built using our method, 634 oral drugs not contained in the HIA data set were collected from the DrugBank database to form HIA_ExternalSet. These drugs, which have oral dosage formulations, were considered to be HIA+ compounds. The chemical name, SMILES and class label (1 for HIA+ and -1 for HIA-) of each compound can be downloaded from the following:


A detailed description of the database can be found in reference 1.
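
The labelling rule and the split sizes described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the text does not say how a value of exactly 30% is treated, so the sketch counts it as HIA+; the tab-separated column order (name, SMILES, label) is hypothetical; and the three sample rows are placeholders, not entries from the database.

```python
import csv
import io
from collections import Counter

FA_THRESHOLD = 30.0  # percent; the threshold stated in the text


def label_from_fa(fa_percent, threshold=FA_THRESHOLD):
    """Return 1 (HIA+) or -1 (HIA-) for a given FA% value.

    Assumption: a value exactly at the threshold is treated as HIA+.
    """
    return 1 if fa_percent >= threshold else -1


def read_labelled_set(handle):
    """Parse (name, SMILES, label) rows from a tab-separated stream.

    Assumption: the delimiter and column order are hypothetical.
    """
    reader = csv.reader(handle, delimiter="\t")
    return [(name, smiles, int(label)) for name, smiles, label in reader]


# Placeholder rows for illustration only; not taken from the database.
sample = io.StringIO(
    "compound_a\tCCO\t1\n"
    "compound_b\tCC(=O)O\t1\n"
    "compound_c\tC1CCCCC1\t-1\n"
)
records = read_labelled_set(sample)
counts = Counter(label for _, _, label in records)
print("HIA+:", counts[1], "HIA-:", counts[-1])

# Split sizes quoted in the text, as (HIA+, HIA-) pairs.
SPLITS = {"HIA_TrainingSet1": (407, 73), "HIA_TestSet1": (93, 5)}
total_pos = sum(pos for pos, _ in SPLITS.values())
total_neg = sum(neg for _, neg in SPLITS.values())
print("Total:", total_pos + total_neg, "HIA+:", total_pos, "HIA-:", total_neg)
```

The last two lines simply confirm that the training and test counts add up to the 500 HIA+ and 78 HIA- compounds of the full set. The strong class imbalance, especially the 5 HIA- compounds in the test set, is worth keeping in mind when reading accuracy figures.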


(1) Jie Shen, Feixiong Cheng, You Xu, Weihua Li* and Yun Tang*. Estimation of ADME properties with substructure pattern recognition. J. Chem. Inf. Model. 2010, 50, 1034-1041.