Review
Predictive non-linear modeling of complex data by artificial neural networks
Introduction
The past decade has seen a host of data analysis tools based on biological phenomena, such as artificial intelligence and evolutionary computing, develop into well-established modeling techniques [1]. Artificial neural networks (ANNs) are now the most popular artificial learning tool in biotechnology, with applications ranging from pattern recognition in chromatographic spectra [2,3••] and expression profiles to functional analyses of genomic and proteomic sequences. ANNs are currently used as the primary modeling tool in biotechnology, appearing in the literature at a rate of approximately 15 peer-reviewed publications per month [4]. Their growing importance is also reflected in the programs of major international life science conferences, such as the 9th International Conference on Intelligent Systems for Molecular Biology, Copenhagen 2001 [5]. The technique also forms the main focus of the Society for Artificial Neural Networks in Medicine and Biology, which organized its first conference last year [6]. The maturation of ANN usage is further reflected in the fact that most reports no longer appear in specialized literature, such as Neural Computation, Artificial Intelligence in Medicine or IEEE Transactions on Biomedical Engineering, but instead span the entire range of biotechnology literature, from clinical to environmental, from bioreaction engineering to molecular modeling.
The suitability of ANNs for resolving complex relationships makes them the method of choice to calibrate and deconvolute complex signals in analytical biochemistry. Furthermore, as ANN modeling is not conditioned by the need to assume a mechanistic dependency, biochemical signals can be used directly for complex inference tasks, such as automated medical diagnosis/prognosis from biochemical analysis, pattern recognition from expression profiles, compound detection from spectral information and, more recently, prediction of biological function from genomic information. Recent literature documents a very rapid expansion of ANN applications in biotechnology that has not been fully accompanied by the dissemination of the underlying theory, causing some published ANN models to be statistically defective. Further adding to the problem, several software implementations are flawed or misleading. To help counteract the lack of a standard procedure, a synthetic overview of ANN identification methodology is included in this review. On the basis of this description, a checklist is proposed to help avoid the most common pitfalls. In this report, the review of ANNs was restricted to multilayered, feed-forward, fully connected networks of perceptrons, by far the most widely used neural network topology in biotechnology applications.
Overview of recent literature
The success of ANNs in capturing the characteristic non-linearity of biological systems has translated into numerous applications for deconvolution and modeling of analytical biochemistry signals [7,8,9••,10]. It is interesting to note that the use of ANNs is quickly going beyond calibration, aiming instead to provide a predictive model of how complex physiological or functional target properties depend on basic instrumental results. This trend is particularly strong in the field of
ANN development, procedure and pitfalls
The fast and wide embrace of neural computing by biotechnology has been marked by uneven accuracy in its application. Several factors are involved, including peer-review committees unfamiliar with ANNs, a problem further aggravated by the wide distribution of misleading or even faulty software packages. In addition, there is a general lack of well-accepted exploratory techniques to describe non-linear dependencies when no accurate mechanistic description is available. This is particularly
Advice for evaluating ANN software
There is plenty of anecdotal evidence that a significant amount of ANN software is severely flawed, for example, by not including early stop criteria. In addition, some software is notoriously misleading for inexperienced users, for example, by not optimizing topology as a default option. The most thorough way to verify the accuracy of the implementation is to try it with benchmark datasets and to compare predictions with observations for test and validation data. A quicker approach is to try
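The early stop criterion mentioned above can be illustrated with a minimal sketch. It is not taken from the review: the toy data, the single tanh hidden unit, the learning rate, the `patience` setting and the names `train`, `valid` and `test` are all assumptions chosen for illustration, and the naming of the held-out subsets varies between authors. Training stops once the error on a held-out subset stops improving, and the remaining subset is used only for the final comparison of predictions with observations.

```python
import math
import random

def predict(params, x):
    """Single tanh hidden unit with a linear output: y = v * tanh(w*x + b) + c."""
    w, b, v, c = params
    return v * math.tanh(w * x + b) + c

def mse(params, data):
    """Mean squared error of the model on a list of (x, y) pairs."""
    return sum((predict(params, x) - y) ** 2 for x, y in data) / len(data)

def gradient_step(params, data, lr=0.05):
    """One batch gradient-descent update of the mean squared error."""
    w, b, v, c = params
    gw = gb = gv = gc = 0.0
    for x, y in data:
        h = math.tanh(w * x + b)
        err = 2.0 * (v * h + c - y) / len(data)
        gv += err * h
        gc += err
        gw += err * v * (1.0 - h * h) * x
        gb += err * v * (1.0 - h * h)
    return (w - lr * gw, b - lr * gb, v - lr * gv, c - lr * gc)

def train_with_early_stop(train, valid, params, max_epochs=2000, patience=50):
    """Stop once the held-out error has not improved for `patience` epochs."""
    best_params, best_err, best_epoch = params, mse(params, valid), 0
    history = [best_err]  # history[k] is the held-out error after k epochs
    for epoch in range(1, max_epochs + 1):
        params = gradient_step(params, train)
        val_err = mse(params, valid)
        history.append(val_err)
        if val_err < best_err:
            best_params, best_err, best_epoch = params, val_err, epoch
        elif epoch - best_epoch >= patience:
            break  # early stop: keep the parameters from the best epoch
    return best_params, best_err, best_epoch, history

# Hypothetical noisy data, split into three subsets (illustration only).
random.seed(0)
points = [(x / 10.0, math.sin(x / 10.0) + random.gauss(0.0, 0.1))
          for x in range(-30, 31)]
random.shuffle(points)
train, valid, test = points[:31], points[31:46], points[46:]

best_params, best_err, best_epoch, history = train_with_early_stop(
    train, valid, (0.5, 0.0, 0.5, 0.0))
test_err = mse(best_params, test)  # error on data never used during training
```

A software package that offers no equivalent of the `patience` check trains until the error on the training data stops decreasing, which is the situation the early stop criterion is meant to prevent.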
Shedding light on the ANN black box
A common criticism of ANN models is that a successful prediction does not necessarily lead to a better mechanistic understanding of the process. However, the analysis of the ANN function itself, namely by determining the sensitivity to input parameters (Eq. (2)), can in fact shed some light on the predictive non-linear association [25].
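Eq. (2) is not reproduced in this excerpt, so the following is only a sketch under the assumption that sensitivity means the partial derivative of the network output with respect to each input. The network (one hidden layer of tanh units, one linear output) and all weights are hypothetical values chosen for illustration. The analytical derivative obtained from the chain rule is checked against a finite-difference estimate, and the sensitivities are evaluated at two different input points.

```python
import math

def forward(x, W1, b1, w2, b2):
    """One hidden layer of tanh units followed by a single linear output."""
    hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(W1, b1)]
    return sum(v * h for v, h in zip(w2, hidden)) + b2, hidden

def sensitivities(x, W1, b1, w2, b2):
    """Analytical partial derivatives dy/dx_i at the point x (chain rule)."""
    _, hidden = forward(x, W1, b1, w2, b2)
    return [sum(w2[j] * (1.0 - hidden[j] ** 2) * W1[j][i]
                for j in range(len(hidden)))
            for i in range(len(x))]

def finite_difference(x, W1, b1, w2, b2, eps=1e-6):
    """Numerical check of the same derivatives by central differences."""
    grads = []
    for i in range(len(x)):
        up = list(x)
        up[i] += eps
        down = list(x)
        down[i] -= eps
        grads.append((forward(up, W1, b1, w2, b2)[0]
                      - forward(down, W1, b1, w2, b2)[0]) / (2 * eps))
    return grads

# Hypothetical weights for a 2-input, 3-hidden-unit network (illustration only).
W1 = [[0.8, -0.5], [0.3, 1.2], [-1.0, 0.4]]
b1 = [0.1, -0.2, 0.05]
w2 = [1.5, -0.7, 0.9]
b2 = 0.0

point_a = [0.0, 0.0]
point_b = [1.0, -1.0]
sens_a = sensitivities(point_a, W1, b1, w2, b2)
sens_b = sensitivities(point_b, W1, b1, w2, b2)
```

Because the hidden units are non-linear, `sens_a` and `sens_b` differ: in this sketch the sensitivity is a local property of the fitted function rather than a single coefficient per input.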
It is noteworthy that the sensitivity of an output parameter to a particular input parameter varies not only with the values of that input, but
Conclusions
Machine learning and, in particular, ANNs are increasingly becoming the method of choice for calibrating complex biotechnology instrumentation and for modeling biological responses. A decade ago, when ANNs made their first inroads into biotechnology, their use was mostly restricted to biochemical engineering applications. By contrast, ANNs now occupy a central position as a predictive modeling tool in areas as diverse as computer-aided medical diagnosis and biological sequence
Acknowledgements
The author acknowledges the help of Elisabeth Pickelsimer of the Medical University of South Carolina for editing the text of this review.
References and recommended reading
Papers of particular interest, published within the annual period of review, have been highlighted as:
• of special interest
•• of outstanding interest
References (35)
- et al. Evolutionary computation in medicine: an overview. Artif Intell Med (2000)
- et al. Prediction of retention times for anions in linear gradient elution ion chromatography with hydroxide eluents using artificial neural networks. J Chromatogr A (2001)
- et al. Ranitidine hydrochloride X-ray assay using a neural network. J Pharm Biomed Anal (2000)
- et al. Basic concepts of artificial neural network (ANN) modeling and its application in pharmaceutical research. J Pharm Biomed Anal (2000)
- et al. Pressurized-flow anion-exchange capillary electrochromatography using a polymeric ion-exchange stationary phase. J Chromatogr A (2000)
- et al. Formula optimization of theophylline controlled-release tablet based on artificial neural networks. J Control Release (2000)
- et al. The application of artificial neural networks to the identification of new spinosoids with improved biological activity toward larvae of Heliothis virescens. Pestic Biochem Physiol (2000)
- et al. Evaluation and structure-activity relationship of synthesized cyclohexanol derivatives on percutaneous absorption of ketoprofen using artificial neural network. Int J Pharm (2001)
- Large-scale predictions of secretory proteins from mammalian genomic EST sequences. Curr Opin Biotechnol (2000)
- et al. Quantitative nuclear grade (QNG): a new image analysis-based biomarker of clinically relevant nuclear structure alterations. J Cell Biochem (2000)
- Use of artificial neural networks in biomedical diagnosis
- Microarrays — the 21st century divining rod? Nat Med
- Classification and diagnostic prediction of cancers using gene expression profiling and artificial neural networks. Nat Med
- Synthesis and structure-activity relationships of a new model of arylpiperazines. 6. Study of the 5-HT1A/α1-adrenergic receptor affinity by classical Hansch analysis, artificial neural networks, and computational simulation of ligand recognition. J Med Chem