Review
Predictive non-linear modeling of complex data by artificial neural networks

https://doi.org/10.1016/S0958-1669(02)00288-4

Abstract

An artificial neural network (ANN) is an artificial intelligence tool that identifies arbitrary nonlinear multiparametric discriminant functions directly from experimental data. The use of ANNs has gained increasing popularity for applications where a mechanistic description of the dependency between dependent and independent variables is either unknown or very complex. This machine learning technique can be roughly described as a universal algebraic function that will distinguish signal from noise directly from experimental data. The application of ANNs to complex relationships makes them highly attractive for the study of biological systems. Recent applications include the analysis of expression profiles and genomic and proteomic sequences.


Introduction

The past decade has seen a host of data analysis tools based on biological phenomena develop into well-established modeling techniques, such as artificial intelligence and evolutionary computing [1]. Artificial neural networks (ANNs) are now the most popular artificial learning tool in biotechnology, with applications ranging from pattern recognition in chromatographic spectra [2,3••] and expression profiles, to functional analyses of genomic and proteomic sequences. ANNs are currently being used as the primary modeling tool in biotechnology, appearing in the literature at a rate of approximately 15 peer-reviewed publications per month [4]. The growing importance of ANNs has also been echoed by the program of major international life science conferences, such as the 9th International Conference on Intelligent Systems for Molecular Biology, Copenhagen 2001 [5]. The technique also forms the main focus of the Society for Artificial Neural Networks in Medicine and Biology, which organized its first conference last year [6]. The maturation of ANN usage is also reflected in the fact that most reports no longer appear in specialized literature, such as Neural Computation, Artificial Intelligence in Medicine or IEEE Transactions on Biomedical Engineering, but instead span the entire range of biotechnology literature, from clinical to environmental, from bioreaction engineering to molecular modeling.

The suitability of ANNs to resolve complex relationships makes them the method of choice to calibrate and deconvolute complex signals in analytical biochemistry. Furthermore, as ANN modeling is not conditioned by the need to assume a mechanistic dependency, biochemical signals can be directly used to infer complex conditions such as automated medical diagnosis/prognosis from biochemical analysis, pattern recognition from expression profiles, compound detection from spectral information and, more recently, biological function from genomic information. Recent literature documents a very rapid expansion of ANN applications in biotechnology that has not been fully accompanied by the dissemination of the underlying theory, thus causing some published ANN models to be statistically defective. Further adding to the problem, several software implementations are flawed or misleading. To help counteract the lack of a standard procedure, a synthetic overview of ANN identification methodology is included in this review. On the basis of this description, a checklist is proposed to help avoid the most common pitfalls. In this report, the review of ANNs was restricted to multilayered, feed-forward, fully connected networks of perceptrons, by far the most widely used neural network topology in biotechnology applications.
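As an illustration of the topology this review is restricted to, the following is a minimal sketch of a forward pass through a multilayered, feed-forward, fully connected network. It is not taken from the review: the function names, the 2-3-1 layer sizes, the sigmoid hidden units, the linear output unit, and all weight values are assumptions chosen only for the example.

```python
import math


def sigmoid(z):
    """Logistic activation, a common choice for hidden units (assumed here)."""
    return 1.0 / (1.0 + math.exp(-z))


def dense_layer(inputs, weights, biases, activation):
    """Fully connected layer: every unit receives every input."""
    return [
        activation(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


def feed_forward(inputs, layers):
    """Propagate the signal layer by layer, with no feedback connections."""
    signal = inputs
    for weights, biases, activation in layers:
        signal = dense_layer(signal, weights, biases, activation)
    return signal


# Hypothetical 2-3-1 network; the weights are made up, not fitted to any data.
layers = [
    ([[0.5, -0.4], [0.3, 0.8], [-0.6, 0.1]], [0.0, 0.1, -0.1], sigmoid),
    ([[1.0, -1.0, 0.5]], [0.2], lambda z: z),  # linear output unit
]
output = feed_forward([1.0, 2.0], layers)
print(output)
```

In practice the weights would be identified from training data rather than written by hand; the sketch only shows how inputs are mapped to outputs once the weights are fixed.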

Section snippets

Overview of recent literature

The success of ANNs in capturing the characteristic non-linearity of biological systems has been translated into numerous applications for deconvolution and modeling of analytical biochemistry signals [7,8,9••,10]. It is interesting to note that the use of ANNs is quickly going beyond calibration, instead aiming to be a predictive model of how complex physiological or functional target properties are dependent on basic instrumental results. This trend is particularly strong in the field of

ANN development, procedure and pitfalls

The fast and wide embrace of neural computing by biotechnology is noted for uneven accuracy in its application. There is a combination of factors involved, including peer-review committees unfamiliar with ANN, further aggravated by the wide distribution of misleading or even faulty software packages. In addition, there is a general lack of well-accepted exploratory techniques to describe non-linear dependencies when no accurate mechanistic description is available. This is particularly

Advice for evaluating ANN software

There is plenty of anecdotal evidence that a significant amount of ANN software is severely flawed, for example, by not including early stop criteria. In addition, some software is notoriously misleading for inexperienced users, for example, by not optimizing topology as a default option. The most thorough way to verify the accuracy of the implementation is to try it with benchmark datasets and to compare predictions with observations for test and validation data. A quicker approach is to try
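The early stop criterion mentioned above can be sketched as follows. This is a generic illustration, not the procedure of any particular package: the callback names (`train_one_epoch`, `validation_error`, `get_weights`), the `patience` parameter, and the `ToyModel` class with its U-shaped validation curve are all hypothetical stand-ins for a real network and a real held-out dataset.

```python
import copy


def train_with_early_stopping(train_one_epoch, validation_error, get_weights,
                              max_epochs=1000, patience=10):
    """Stop training once validation error has not improved for `patience` epochs.

    train_one_epoch(): updates the model in place using the training data.
    validation_error(): returns the current error on held-out validation data.
    get_weights(): returns the current weights so the best set can be kept.
    """
    best_error = float("inf")
    best_weights = None
    best_epoch = 0
    epochs_run = 0
    for epoch in range(1, max_epochs + 1):
        train_one_epoch()
        epochs_run = epoch
        error = validation_error()
        if error < best_error:
            # New best model on the validation data: remember it.
            best_error = error
            best_weights = copy.deepcopy(get_weights())
            best_epoch = epoch
        elif epoch - best_epoch >= patience:
            break  # further training only fits noise in the training data
    return {"weights": best_weights, "best_epoch": best_epoch,
            "best_error": best_error, "epochs_run": epochs_run}


class ToyModel:
    """Stand-in for a network whose validation error is U-shaped in the epoch."""

    def __init__(self):
        self.epoch = 0

    def train_one_epoch(self):
        self.epoch += 1

    def validation_error(self):
        # Simulated curve: improves until epoch 30, then degrades (overfitting).
        return 1.0 + (self.epoch - 30) ** 2 / 100.0

    def get_weights(self):
        return {"epoch": self.epoch}


model = ToyModel()
result = train_with_early_stopping(model.train_one_epoch, model.validation_error,
                                   model.get_weights, max_epochs=200, patience=10)
print(result)
```

A package without such a criterion would keep training until `max_epochs` and return the last weights rather than the ones that performed best on the validation data.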

Shedding light on the ANN black box

A common criticism of ANN models is that a successful prediction does not necessarily lead to a better mechanistic understanding of the process. However, the analysis of the ANN function itself, namely by determining the sensitivity to input parameters (Eq. (2)), can in fact shed some light on the predictive non-linear association [25].

$$S_{y_i \leftarrow x_j} = \frac{dy_i}{dx_j} \cdot \frac{x_j}{y_i} \qquad (2)$$
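The sensitivity of Eq. (2) can be estimated numerically for any fitted model by perturbing one input at a time. The sketch below assumes a central finite difference for the derivative, which is one possible choice and not necessarily the one used in [25]. The function `relative_sensitivity`, the `rel_step` parameter, and the `toy_network` with its made-up weights are hypothetical names introduced for illustration.

```python
import math


def toy_network(x):
    """Hypothetical stand-in for a trained ANN: two tanh hidden units, one output."""
    # Weights are made up for illustration only.
    h1 = math.tanh(0.8 * x[0] - 0.5 * x[1] + 0.1)
    h2 = math.tanh(-0.3 * x[0] + 0.9 * x[1] - 0.2)
    return [1.5 * h1 + 0.7 * h2 + 2.0]


def relative_sensitivity(model, x, i, j, rel_step=1e-6):
    """Estimate S = (dy_i/dx_j) * (x_j / y_i) at the point x.

    The derivative is approximated with a central difference (an assumption).
    """
    step = rel_step * max(abs(x[j]), 1.0)
    x_hi = list(x)
    x_lo = list(x)
    x_hi[j] += step
    x_lo[j] -= step
    dy_dx = (model(x_hi)[i] - model(x_lo)[i]) / (2.0 * step)
    y_i = model(x)[i]
    return dy_dx * x[j] / y_i


# Sensitivity of the single output to the first input at two different points.
print(relative_sensitivity(toy_network, [1.0, 2.0], 0, 0))
print(relative_sensitivity(toy_network, [2.0, 2.0], 0, 0))
```

Because the sensitivity is normalized by x_j / y_i, it is dimensionless: a pure power law y = x^n gives S = n everywhere, whereas a non-linear network generally gives a different value at each point of the input space.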

It is noteworthy that the sensitivity of an output parameter to a particular input parameter varies not only with the values of that input, but

Conclusions

The use of machine learning and, in particular, the use of ANNs is increasingly becoming the method of choice to calibrate complex biotechnology instrumentation and for modeling biological responses. A decade ago, when ANN made the first inroads into biotechnology, its use was mostly restricted to biochemical engineering applications. By contrast, ANN now occupies a central position as a predictive modeling tool in areas as diverse as computer-aided medical diagnosis and biological sequence

Acknowledgements

The author acknowledges the help of Elisabeth Pickelsimer of the Medical University of South Carolina for editing the text of this review.

References and recommended reading

Papers of particular interest, published within the annual period of review, have been highlighted as:

  • of special interest

  •• of outstanding interest

References (35)

  • Result obtained by searching Current Contents...
  • Proceedings of the 9th International conference on Intelligent Systems for Molecular Biology. 2001 July 21–25; Denmark...
  • The Society for Artificial Neural Networks in Medicine and Biology. URL:...
  • J. Schmitt et al. Use of artificial neural networks in biomedical diagnosis
  • Y.D.D. He et al. Microarrays — the 21st century divining rod? Nat Med (2001)
  • J. Khan et al. Classification and diagnostic prediction of cancers using gene expression profiling and artificial neural networks. Nat Med (2001)
  • M.L. Lopez-Rodriguez et al. Synthesis and structure-activity relationships of a new model of arylpiperazines. 6. Study of the 5-HT1A/α1-adrenergic receptor affinity by classical Hansch analysis, artificial neural networks, and computational simulation of ligand recognition. J Med Chem (2001)