Abstract
In the lead optimization process, medicinal chemists must consider various chemical properties of active compounds, including ADME/Tox properties, and find the best compromise among these. This study presents a novel data mining method for multiobjective optimization of chemical properties, which consists of the hierarchical classification and visualization of multidimensional data. A hierarchical classification tree model is generated by an extension of recursive partitioning that utilizes averaged information gains for multiple objective variables as a quality-of-split criterion. All the hierarchically structured data objects are represented using a large-scale data visualization technique. The technique is an extension of HeiankyoView, which displays data objects as colored icons and group nodes as rectangular borders. Each icon is divided into subregions with different colors, so that it can present multidimensional data according to brightness of the colors. The proposed method was applied to the structure-activity relationship analysis for cytochrome P450 (CYP) substrates. The substrate specificity of six CYP isoforms was successfully delineated: e.g., CYP2C9 substrates are anionic compounds, while CYP2D6 substrates are cationic; and CYP2E1 substrates are smaller compounds, while CYP3A4 substrates are larger compounds.
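The quality-of-split criterion described above can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes Shannon entropy as the information measure, binary objective variables, and a single numeric descriptor split by a threshold. All function names and the toy data are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(labels, mask):
    """Entropy reduction for one objective when split by a boolean mask."""
    left = [y for y, m in zip(labels, mask) if m]
    right = [y for y, m in zip(labels, mask) if not m]
    n = len(labels)
    weighted = (len(left) / n) * entropy(left) + (len(right) / n) * entropy(right)
    return entropy(labels) - weighted


def averaged_gain(objectives, mask):
    """Mean information gain over all objective variables for one split."""
    return sum(information_gain(y, mask) for y in objectives) / len(objectives)


def best_split(feature_values, objectives):
    """Return (threshold, gain) maximizing the averaged gain on one descriptor."""
    best_threshold, best_gain = None, 0.0
    # The largest value is skipped because it would leave one side empty.
    for t in sorted(set(feature_values))[:-1]:
        mask = [v <= t for v in feature_values]
        g = averaged_gain(objectives, mask)
        if g > best_gain:
            best_threshold, best_gain = t, g
    return best_threshold, best_gain


# Hypothetical toy data: one descriptor and two binary objective variables.
descriptor = [1, 2, 3, 4]
objective_a = [0, 0, 1, 1]
objective_b = [0, 0, 0, 1]

threshold, gain = best_split(descriptor, [objective_a, objective_b])
print(threshold, round(gain, 4))
```

In this sketch a split that separates one objective perfectly can still lose to a split that is moderately good for every objective, which is the compromise that averaging is meant to capture. A full recursive partitioning procedure would apply `best_split` across all descriptors and recurse on each resulting subset.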