
Main.HomePage History

November 28, 2006, at 11:05 PM by 210.212.45.144 -
Deleted lines 2-41:

A structure-activity model for identifying anticancer and non-anticancer drugs, based on a set of 13 molecular moment descriptors and machine learning techniques such as artificial neural networks and support vector machines.

http://i121.photobucket.com/albums/o234/kunaljaiswal/Figure.jpg Pictorial representation of the prediction methods of anticancer and non-anticancer drugs.

Summary

The quantitative structure-activity relationship (QSAR) model developed here discriminates between anticancer and non-anticancer drugs using two machine learning techniques: an artificial neural network (ANN) and a support vector machine (SVM). The ANN is a feed-forward neural network trained with the standard back-propagation algorithm. The performance of the two methods was compared using 13 shape and electrostatic (molecular moment) descriptors. For the complete set of 13 molecular moment descriptors, the ANN yields a superior model (accuracy = 86.7%, Qpred = 76.7%, Qobs. = 95.8%, sensitivity = 0.958, specificity = 0.805, enrichment factor (EF) = 0.884) in comparison to the SVM model (accuracy = 51.7%, Qpred = 30%, Qobs. = 52.9%, sensitivity = 0.529, specificity = 0.511, EF = 0.581). The Matthews correlation coefficient was also much higher for the ANN (0.74) than for the SVM (0.03). Both methods were trained and tested on a non-redundant data set of 180 drugs (90 anticancer and 90 non-anticancer). The proposed model can be used to predict the anticancer activity of novel classes of compounds, enabling virtual screening of large databases.

Key words: Artificial neural network, comparative molecular moment descriptors, support vector machine, drug design, structure-activity relationship.

Introduction

Discriminating between anticancer and non-anticancer drugs is a major challenge in current cancer research. A number of natural and synthetic products have been found to exhibit anticancer activity against tumor cell lines [1-5]. As a result, the number of anticancer drugs is increasing rapidly. The worldwide pharmaceutical industry is investing in technologies for high-throughput screening (HTS) of such compounds. The development of in silico techniques for anticancer drug screening is therefore a pressing need in today’s anticancer drug discovery. Such techniques gather and systematically use chemical information from existing molecules, and use those data to predict the behavior of unknown compounds. However, the structural factors that determine the anticancer activity of these compounds are still not clear. Recently, quantitative structure–activity relationships (QSARs) have been used extensively to develop models that estimate and predict the biological or toxicological behavior of organic molecules using computational descriptors derived solely from chemical structures [6-9]. Using computational tools to discriminate anticancer drugs from lead molecules prior to their chemical synthesis will accelerate the drug discovery process in the pharmaceutical industry [10-12].

The focus of this study is to develop a QSAR model for classifying drugs as anticancer or non-anticancer from their chemical structures. We used Comparative Molecular Moment Analysis (CoMMA) [13] to calculate the molecular descriptors. CoMMA uses information from moment expansions of the molecular mass and charge distributions, up to and including second order, to compare molecules, and it also yields shape and electrostatic descriptors. Specifically, it uses the lower-order moments of the molecular mass and charge distributions plus one higher-order multipole moment of the charge density distribution, namely the quadrupole moment, together with a description of the relationship between the two distributions given by projections of the electrostatic moments onto the principal inertial axes. This, together with the ability to assess similarity between different molecules without molecular superposition, makes the CoMMA descriptors a powerful three-dimensional representation of molecular structure.
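
To make the idea of mass and charge moments concrete, the sketch below computes three low-order quantities for a toy molecule: the center of mass, the inertia tensor about it, and the dipole moment about it. This is an illustration under stated assumptions, not the CoMMA program: the function names and the two-atom input are hypothetical, and the quadrupole moment and the projections onto the principal axes are omitted. The principal moments of inertia are the eigenvalues of the inertia tensor; for the toy input the tensor is already diagonal.

```python
def center_of_mass(masses, coords):
    """First moment of the mass distribution divided by the total mass."""
    total = sum(masses)
    return tuple(
        sum(m * r[k] for m, r in zip(masses, coords)) / total for k in range(3)
    )


def inertia_tensor(masses, coords):
    """Second moment of the mass distribution about the center of mass."""
    c = center_of_mass(masses, coords)
    tensor = [[0.0] * 3 for _ in range(3)]
    for m, r in zip(masses, coords):
        d = [r[k] - c[k] for k in range(3)]
        d2 = sum(x * x for x in d)
        for i in range(3):
            for j in range(3):
                # I_ij = sum_m m * (|d|^2 * delta_ij - d_i * d_j)
                tensor[i][j] += m * ((d2 if i == j else 0.0) - d[i] * d[j])
    return tensor


def dipole_moment(charges, coords, origin):
    """First moment of the charge distribution about the given origin."""
    return tuple(
        sum(q * (r[k] - origin[k]) for q, r in zip(charges, coords))
        for k in range(3)
    )


# Hypothetical two-atom "molecule" lying on the x axis (arbitrary units).
masses = [1.0, 3.0]
charges = [0.5, -0.5]
coords = [(0.0, 0.0, 0.0), (4.0, 0.0, 0.0)]

com = center_of_mass(masses, coords)
tensor = inertia_tensor(masses, coords)
dipole = dipole_moment(charges, coords, com)

print("center of mass:", com)
print("inertia tensor:", tensor)
print("dipole about center of mass:", dipole)
```

Because every quantity is referred to the molecule's own center of mass, and the principal axes come from the inertia tensor itself, such descriptors do not depend on how the molecule is positioned in space, which is why no superposition of molecules is needed.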

Neural networks have been widely used for classification and approximation in various fields of chemistry and bioinformatics [14]. Xue et al. [15] successfully used probabilistic neural networks to classify 102 active compounds with anticancer activity from diverse medicinal plants using molecular descriptors. The ANN results were found to be superior to those of linear discriminant analysis [15]. The potential of support vector machines (SVMs) for distinguishing biologically active from inactive substances has also been investigated [16-17]. In the present paper, we attempt to develop a structure-activity relationship model that could help to predict novel classes of compounds with anticancer activity. We used two machine learning techniques as binary classifiers: a support vector machine (SVM) and an artificial neural network (ANN).

Methods

Data Set

To discriminate between anticancer and non-anticancer drugs, a data set of 180 drug molecules, consisting of 90 non-redundant anticancer drugs and the same number of non-redundant non-anticancer drugs, was used for training, validation and testing. The 3D structures of all the drug molecules were obtained in MOL2 format from the DrugBank database [18]. The complete list of the drug molecules used, along with their properties, is given in the supplementary material.

November 28, 2006, at 11:01 PM by 210.212.45.144 -
Changed lines 24-25 from:

Discriminating between anticancer and non-anticancer drugs is a major challenge in current cancer research. A number of natural and synthetic products have been found to exhibit anticancer activity against tumor cell lines1-5. As a result, the number of anticancer drugs is increasing rapidly. The worldwide pharmaceutical industry is investing in technologies for high-throughput screening (HTS) of such compounds. The development of in silico techniques for anticancer drug screening is therefore a pressing need in today’s anticancer drug discovery. Such techniques gather and systematically use chemical information from existing molecules, and use those data to predict the behavior of unknown compounds. However, the structural factors that determine the anticancer activity of these compounds are still not clear. Recently, quantitative structure–activity relationships (QSARs) have been used extensively to develop models that estimate and predict the biological or toxicological behavior of organic molecules using computational descriptors derived solely from chemical structures. 6-9 Using computational tools to discriminate anticancer drugs from lead molecules prior to their chemical synthesis will accelerate the drug discovery process in the pharmaceutical industry.10-12

to:

Discriminating between anticancer and non-anticancer drugs is a major challenge in current cancer research. A number of natural and synthetic products have been found to exhibit anticancer activity against tumor cell lines [1-5]. As a result, the number of anticancer drugs is increasing rapidly. The worldwide pharmaceutical industry is investing in technologies for high-throughput screening (HTS) of such compounds. The development of in silico techniques for anticancer drug screening is therefore a pressing need in today’s anticancer drug discovery. Such techniques gather and systematically use chemical information from existing molecules, and use those data to predict the behavior of unknown compounds. However, the structural factors that determine the anticancer activity of these compounds are still not clear. Recently, quantitative structure–activity relationships (QSARs) have been used extensively to develop models that estimate and predict the biological or toxicological behavior of organic molecules using computational descriptors derived solely from chemical structures [6-9]. Using computational tools to discriminate anticancer drugs from lead molecules prior to their chemical synthesis will accelerate the drug discovery process in the pharmaceutical industry [10-12].

Changed line 28 from:

Neural networks have been widely used for classification and approximation in various fields of chemistry and bioinformatics [14]. Xue et al. [15] successfully used probabilistic neural networks to classify 102 active compounds with anticancer activity from diverse medicinal plants using molecular descriptors. The ANN results were found to be superior to those of linear discriminant analysis [15]. The potential of support vector machines (SVMs) for distinguishing biologically active from inactive substances has also been investigated 16-17.

to:

Neural networks have been widely used for classification and approximation in various fields of chemistry and bioinformatics [14]. Xue et al. [15] successfully used probabilistic neural networks to classify 102 active compounds with anticancer activity from diverse medicinal plants using molecular descriptors. The ANN results were found to be superior to those of linear discriminant analysis [15]. The potential of support vector machines (SVMs) for distinguishing biologically active from inactive substances has also been investigated [16-17].

November 28, 2006, at 11:00 PM by 210.212.45.144 -
Changed lines 1-2 from:

AntPred: Anti-Cancer Drug Prediction Tool.

to:

AntPred: Anti-Cancer Drug Prediction Tool.

Changed lines 8-10 from:

Summary

to:

Summary

Changed lines 17-23 from:

Key words: Artificial neural network, comparative molecular moment descriptors, support vector machine, drug design, structure-activity relationship.

Introduction

to:

Key words: Artificial neural network, comparative molecular moment descriptors, support vector machine, drug design, structure-activity relationship.

Introduction

Changed lines 34-39 from:

Methods

Data Set

to:

Methods

Data Set

November 28, 2006, at 10:59 PM by 210.212.45.144 -
Changed lines 6-9 from:

Pictorial representation of the prediction methods of anticancer and non-anticancer drugs.

to:

Pictorial representation of the prediction methods of anticancer and non-anticancer drugs.

Summary

The quantitative structure-activity relationship (QSAR) model developed here discriminates between anticancer and non-anticancer drugs using two machine learning techniques: an artificial neural network (ANN) and a support vector machine (SVM). The ANN is a feed-forward neural network trained with the standard back-propagation algorithm. The performance of the two methods was compared using 13 shape and electrostatic (molecular moment) descriptors. For the complete set of 13 molecular moment descriptors, the ANN yields a superior model (accuracy = 86.7%, Qpred = 76.7%, Qobs. = 95.8%, sensitivity = 0.958, specificity = 0.805, enrichment factor (EF) = 0.884) in comparison to the SVM model (accuracy = 51.7%, Qpred = 30%, Qobs. = 52.9%, sensitivity = 0.529, specificity = 0.511, EF = 0.581). The Matthews correlation coefficient was also much higher for the ANN (0.74) than for the SVM (0.03). Both methods were trained and tested on a non-redundant data set of 180 drugs (90 anticancer and 90 non-anticancer). The proposed model can be used to predict the anticancer activity of novel classes of compounds, enabling virtual screening of large databases.

Key words: Artificial neural network, comparative molecular moment descriptors, support vector machine, drug design, structure-activity relationship.

Introduction

Discriminating between anticancer and non-anticancer drugs is a major challenge in current cancer research. A number of natural and synthetic products have been found to exhibit anticancer activity against tumor cell lines1-5. As a result, the number of anticancer drugs is increasing rapidly. The worldwide pharmaceutical industry is investing in technologies for high-throughput screening (HTS) of such compounds. The development of in silico techniques for anticancer drug screening is therefore a pressing need in today’s anticancer drug discovery. Such techniques gather and systematically use chemical information from existing molecules, and use those data to predict the behavior of unknown compounds. However, the structural factors that determine the anticancer activity of these compounds are still not clear. Recently, quantitative structure–activity relationships (QSARs) have been used extensively to develop models that estimate and predict the biological or toxicological behavior of organic molecules using computational descriptors derived solely from chemical structures. 6-9 Using computational tools to discriminate anticancer drugs from lead molecules prior to their chemical synthesis will accelerate the drug discovery process in the pharmaceutical industry.10-12

The focus of this study is to develop a QSAR model for classifying drugs as anticancer or non-anticancer from their chemical structures. We used Comparative Molecular Moment Analysis (CoMMA) [13] to calculate the molecular descriptors. CoMMA uses information from moment expansions of the molecular mass and charge distributions, up to and including second order, to compare molecules, and it also yields shape and electrostatic descriptors. Specifically, it uses the lower-order moments of the molecular mass and charge distributions plus one higher-order multipole moment of the charge density distribution, namely the quadrupole moment, together with a description of the relationship between the two distributions given by projections of the electrostatic moments onto the principal inertial axes. This, together with the ability to assess similarity between different molecules without molecular superposition, makes the CoMMA descriptors a powerful three-dimensional representation of molecular structure.

Neural networks have been widely used for classification and approximation in various fields of chemistry and bioinformatics [14]. Xue et al. [15] successfully used probabilistic neural networks to classify 102 active compounds with anticancer activity from diverse medicinal plants using molecular descriptors. The ANN results were found to be superior to those of linear discriminant analysis [15]. The potential of support vector machines (SVMs) for distinguishing biologically active from inactive substances has also been investigated 16-17. In the present paper, we attempt to develop a structure-activity relationship model that could help to predict novel classes of compounds with anticancer activity. We used two machine learning techniques as binary classifiers: a support vector machine (SVM) and an artificial neural network (ANN).

Methods

Data Set

To discriminate between anticancer and non-anticancer drugs, a data set of 180 drug molecules, consisting of 90 non-redundant anticancer drugs and the same number of non-redundant non-anticancer drugs, was used for training, validation and testing. The 3D structures of all the drug molecules were obtained in MOL2 format from the DrugBank database [18]. The complete list of the drug molecules used, along with their properties, is given in the supplementary material.

November 28, 2006, at 10:58 PM by 210.212.45.144 -
Changed lines 5-7 from:
to:

http://i121.photobucket.com/albums/o234/kunaljaiswal/Figure.jpg Pictorial representation of the prediction methods of anticancer and non-anticancer drugs.

November 28, 2006, at 10:52 PM by 210.212.45.144 -
Changed lines 5-6 from:
to:
November 28, 2006, at 10:49 PM by 210.212.45.144 -
Changed lines 1-6 from:

AntPred: Anti-Cancer Drug Prediction Tool.

to:

AntPred: Anti-Cancer Drug Prediction Tool.

A structure-activity model for identifying anticancer and non-anticancer drugs, based on a set of 13 molecular moment descriptors and machine learning techniques such as artificial neural networks and support vector machines.

November 28, 2006, at 10:48 PM by 210.212.45.144 -
Added line 1:

AntPred: Anti-Cancer Drug Prediction Tool.