The low success rate and high cost of drug discovery require the development of new paradigms to identify molecules of therapeutic value. The Anatomical Therapeutic Chemical (ATC) Code System is a classification proposed by the World Health Organization (WHO) that assigns multi-level codes to compounds based on their therapeutic, pharmacological and chemical characteristics as well as the in vivo site(s) of activity. The ability to predict ATC codes of compounds can assist in the creation of high-quality chemical libraries for drug screening and in applications such as drug repositioning. We propose a machine learning architecture called tiered learning for prediction of ATC codes that relies on the prediction results of the higher levels of the ATC code to simplify the predictions of the lower levels.
The proposed approach was validated using a number of compounds in both cross-validation and test settings. The validation experiments compared chemical descriptors, initialization methods and classification algorithms. The prediction accuracy obtained with tiered learning was found to be either comparable to or better than that of established methods. Additionally, the experiments demonstrated the generalizability of the tiered learning architecture, in that its use was found to improve prediction rates for a majority of machine learning algorithms when compared to their stand-alone application.
The basis of our approach lies in the observation that anatomical-therapeutic biological activity of certain types typically precludes activities of many other types. Thus, there exists a characteristic distribution of the ATC codes, which can be leveraged to limit the search space of possible codes that can be ascribed at a particular level once the codes at the preceding levels are known. Tiered learning utilizes this observation to constrain the learning space for ATC codes at a particular level based on the ATC code at higher levels. This simplifies the prediction and allows for improved accuracy.
Discovery of efficacious drugs against diseases is one of the key challenges of modern science. Drug discovery efforts typically start by screening a large number of compounds to identify "leads", which subsequently undergo optimization and in vivo tests of efficacy and pharmacokinetics to identify candidates for clinical trials. The selection of a large number of compounds for primary screening is often driven both by the need to capture chemical diversity and by the fact that small structural variations can strongly influence binding to a target. However, outside general principles such as the Lipinski rules, few rigorous criteria exist to guide selection of the initial set of molecules for primary screening. Finally, repositioning an existing drug to a novel pathology is an alluring, though limited, alternative to de novo drug design. In both of the above problem formulations, the ability to identify compounds that are therapeutically of interest vis-à-vis a particular pathology is critical.
Automatically determining the ATC code of a compound constitutes an attractive approach to both these problems. Towards this goal, we present a learning architecture called tiered learning that can be utilized by any prediction method (classifier) to obtain highly accurate ATC predictions. The proposed architecture is based on the premise that while chemical compounds may exhibit polypharmacology, that is, compounds may modulate multiple targets, this phenomenon has limits, and anatomical-therapeutic biological activity of certain types must preclude activities of many other types. In other words, ATC classes must have characteristic distributions, which become increasingly specific as one traverses the ATC classification levels.
In Figure 1, we plot the distribution of ATC classes in our data at the first and second code levels. As can be seen from this figure, for each ATC class at the first level, the distribution of second-level classes is highly specific. Figure 1 thus underlines the existence of a characteristic distribution for ATC codes, an observation that can be employed to constrain the search space of possible codes once preceding levels are known. Based on this premise, in tiered learning, prediction of the ATC code at a certain level is constrained by the ATC code at the higher levels. An advantage of such an approach is that with each successive prediction, the learning space becomes not only smaller but also more specific, thereby reducing the informational heterogeneity and simplifying the learning task.
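The constraint that tiered learning places on lower ATC levels can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the toy training codes and the per-level scores are all hypothetical, and the scores stand in for the output of whatever base classifier is used. The sketch only shows how codes observed at one level restrict the candidates considered at the next.

```python
from collections import Counter, defaultdict

# Prefix lengths of the five ATC levels, e.g. A / A10 / A10B / A10BA / A10BA02.
ATC_LEVEL_LENGTHS = (1, 3, 4, 5, 7)


def atc_prefixes(code):
    """Split a full ATC code into its prefix at each of the five levels."""
    return [code[:n] for n in ATC_LEVEL_LENGTHS if len(code) >= n]


def child_counts(training_codes):
    """Count how often each next-level prefix follows every observed prefix.

    The empty string acts as the root, so the result's "" entry holds the
    first-level distribution.
    """
    counts = defaultdict(Counter)
    for code in training_codes:
        parent = ""
        for prefix in atc_prefixes(code):
            counts[parent][prefix] += 1
            parent = prefix
    return counts


def tiered_predict(level_scores, counts):
    """Pick one label per level, restricted to children of the previous pick.

    level_scores: one dict per level mapping a candidate prefix to a score
    from a (hypothetical) base classifier; higher means more likely.
    """
    path, parent = [], ""
    for scores in level_scores:
        candidates = counts.get(parent)
        if not candidates:
            break  # no observed children: stop descending
        # Candidates the classifier did not score are treated as least likely.
        best = max(sorted(candidates), key=lambda c: scores.get(c, float("-inf")))
        path.append(best)
        parent = best
    return path


# Toy training codes, used here only for illustration.
train = ["A10BA02", "A10BB01", "A02BC01", "C07AB02", "C07AB07", "C09AA01"]
counts = child_counts(train)

# Second-level distribution observed under first-level class "A".
print(dict(counts["A"]))

# "C07" has the highest raw score at level 2, but it is never considered
# because the level-1 pick was "A".
scores = [
    {"A": 0.6, "C": 0.4},
    {"C07": 0.5, "A10": 0.3, "A02": 0.2},
    {"A10B": 0.9},
]
print(tiered_predict(scores, counts))
```

In this sketch the restriction is a hard filter on candidates. The counts are retained so that a softer variant, for example weighting classifier scores by the observed conditional frequencies, could be tried instead; which variant matches the published method is not stated in the text above.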