The pace at which scientific knowledge is produced and shared today has never been so fast. Different areas of science are converging to give rise to new disciplines. Bioinformatics is one such newly emerging field; it applies computer science, mathematics, and statistics to molecular biology in order to archive, retrieve, and analyse biological data.
Although still in its infancy, it has become one of the fastest-growing fields and has quickly established itself as an integral component of biological research. Its popularity stems from its ability to analyse huge amounts of biological data quickly and cost-effectively. Bioinformatics helps a biologist extract valuable information from biological data by providing various web- and/or computer-based tools, the majority of which are freely available. The present review summarises some of the tools available to a life scientist for analysing biological data. It focuses exclusively on those areas of biological research that such tools can greatly assist: analysing DNA and protein sequences to identify various features, predicting the 3D structure of protein molecules, studying molecular interactions, and performing simulations that mimic a biological phenomenon in order to extract useful information from biological data. The functioning and specificity of tools such as Entrez, I-TASSER, GENSCAN, ORF finder, and Modeller are discussed in this review.

Introduction

Bioinformatics is an interdisciplinary science that emerged from the combination of disciplines such as biology, mathematics, computer science, and statistics to develop methods for the storage, retrieval, and analysis of biological data.
Paulien Hogeweg, a Dutch systems biologist, first used the term "bioinformatics" in 1970, together with Ben Hesper, referring to the use of information technology for studying biological systems. The launch of user-friendly, interactive automated modelling, along with the creation of the SWISS-MODEL server around 18 years ago, resulted in massive growth of the discipline. Since then, it has become an essential part of the biological sciences, processing biological data at a much faster rate with databases and informatics working at the back end.

Computational tools are routinely used to characterise genes, determine the structural and physicochemical properties of proteins, perform phylogenetic analyses, and run simulations to study how biomolecules interact in a living cell. These tools cannot generate information as reliable as experimentation, which is expensive, time-consuming, and tedious. In silico analyses can nevertheless support an informed decision before a costly experiment is conducted. For example, a druggable molecule must have certain ADMET (absorption, distribution, metabolism, excretion, and toxicity) properties to pass through clinical trials. If a compound does not have the required ADMET properties, it is likely to be rejected.
To avoid such failures, different bioinformatics tools have been developed to predict ADMET properties, allowing researchers to screen a large number of compounds and select the most druggable molecules before clinical trials are launched. A number of reviews on specialised aspects of bioinformatics have been written previously. However, none of these articles is well suited to a scientist who does not work in computational biology. Here, we take the opportunity to introduce various bioinformatics tools to a non-specialist reader, to help extract useful information for his/her project. We have therefore selected only those areas where these tools could be most useful for obtaining information from biological data.
These areas include the analysis of DNA/protein sequences, phylogenetic studies, prediction of the 3D structures of protein molecules, molecular interactions and simulations, and drug design. Each section starts with a simple overview of the area, followed by key reports from the literature and, where necessary, a tabulated summary of related tools towards the end of the section.

• GenBank
• ExPASy
• Ensembl
• READSEQ
• Entrez
• Magpie
• GenQuiz
• GENSCAN
• ORF finder
• Modeller
• I-TASSER

i. I-TASSER

Iterative Threading ASSEmbly Refinement (I-TASSER) is a bioinformatics method for predicting three-dimensional structure models of protein molecules from amino acid sequences.

Specificity

It detects structure templates from the Protein Data Bank by a technique called fold recognition, or threading. The full-length structure models are constructed by reassembling structural fragments from threading templates using replica exchange Monte Carlo simulations.
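To give a feel for the replica exchange Monte Carlo idea, here is a minimal sketch on a toy problem. It is not I-TASSER's implementation: the one-dimensional double-well `energy` function, the temperature ladder, the step size, and all function names are assumptions chosen purely for illustration. Several copies (replicas) of the system are sampled at different temperatures, and neighbouring replicas occasionally swap states, so that conformations found at high temperature can be refined at low temperature.

```python
import math
import random


def energy(x):
    """Toy double-well energy with minima at x = -1 and x = +1 (illustrative only)."""
    return (x * x - 1.0) ** 2


def replica_exchange(temperatures, n_sweeps=2000, step=0.5, seed=0):
    """Return (best_x, best_energy) found by a toy replica exchange Monte Carlo run."""
    if len(temperatures) < 2:
        raise ValueError("need at least two temperatures to exchange replicas")
    rng = random.Random(seed)
    # One state per temperature; here a "conformation" is a single coordinate.
    states = [rng.uniform(-3.0, 3.0) for _ in temperatures]
    best_x = min(states, key=energy)
    for sweep in range(n_sweeps):
        # Metropolis move inside each replica at its own temperature.
        for i, temp in enumerate(temperatures):
            trial = states[i] + rng.uniform(-step, step)
            delta = energy(trial) - energy(states[i])
            if delta <= 0 or rng.random() < math.exp(-delta / temp):
                states[i] = trial
                if energy(trial) < energy(best_x):
                    best_x = trial
        # Attempt to swap one pair of neighbouring replicas per sweep.
        i = sweep % (len(temperatures) - 1)
        j = i + 1
        arg = (1.0 / temperatures[i] - 1.0 / temperatures[j]) * (
            energy(states[i]) - energy(states[j])
        )
        if arg >= 0 or rng.random() < math.exp(arg):
            states[i], states[j] = states[j], states[i]
    return best_x, energy(best_x)


if __name__ == "__main__":
    x, e = replica_exchange([0.05, 0.2, 1.0, 5.0])
    print(f"lowest-energy state found: x = {x:.3f}, E = {e:.5f}")
```

The swap step uses the standard acceptance rule min(1, exp[(1/T_i - 1/T_j)(E_i - E_j)]). In a real structure prediction setting the state would be a full protein conformation and the energy a force field or knowledge-based potential, neither of which is modelled here.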
I-TASSER is one of the most successful protein structure prediction methods in the community-wide CASP experiments. It has been extended to structure-based protein function prediction, which provides annotations on ligand-binding sites, gene ontology, and Enzyme Commission classification by structurally matching models of the target protein to known proteins in protein function databases. An online server, built in the Yang Zhang Lab at the University of Michigan, Ann Arbor, allows users to submit sequences and obtain structure and function predictions. A standalone package of I-TASSER is available for download from the I-TASSER website.
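Before a sequence is submitted to such a server, it can be worth checking it locally. The following is a minimal sketch, assuming the 10 to 1,500 residue limit that the server states for its mandatory input. The function names are hypothetical, and restricting input to the 20 standard one-letter amino acid codes is an assumption made for this example, not a documented rule of the server.

```python
# The 20 standard one-letter amino acid codes (an assumed whitelist for this sketch).
VALID_RESIDUES = set("ACDEFGHIKLMNPQRSTVWY")
# Length limits as stated for the server's mandatory input.
MIN_LEN, MAX_LEN = 10, 1500


def read_fasta_sequence(text):
    """Return the sequence from a single-record FASTA string, dropping header lines."""
    lines = [line.strip() for line in text.splitlines()]
    return "".join(line for line in lines if line and not line.startswith(">")).upper()


def check_sequence(seq):
    """Return a list of problems found; an empty list means the checks passed."""
    problems = []
    if not MIN_LEN <= len(seq) <= MAX_LEN:
        problems.append(f"length {len(seq)} outside {MIN_LEN}-{MAX_LEN}")
    unknown = sorted(set(seq) - VALID_RESIDUES)
    if unknown:
        problems.append("non-standard residues: " + "".join(unknown))
    return problems


if __name__ == "__main__":
    record = ">demo\nMKTAYIAKQR\nQISFVKSHFS\n"
    sequence = read_fasta_sequence(record)
    print(sequence, check_sequence(sequence))
```

Returning a list of problems rather than raising on the first one lets a user fix everything in a single pass before submitting.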
Functioning

The I-TASSER server allows users to generate protein structure and function predictions.

· Input
  · Mandatory:
    · Amino acid sequence with length from 10 to 1,500 residues
  · Optional:
    · Contact restraints
    · Distance maps
    · Inclusion of special templates
    · Exclusion of special templates
    · Secondary structures
· Output
  · Structure prediction:
    · Secondary structure prediction
    · Solvent accessibility prediction
    · Top 10 threading alignments from LOMETS
    · Top 5 full-length atomic models (ranked based on cluster density)
    · Top 10 proteins in PDB which are structurally closest to the predicted models
    · Estimated accuracy of the predicted models
    · B-factor estimation
  · Function prediction:
    · Enzyme Classification and the confidence score
    · Gene Ontology terms and the confidence score
    · Ligand-binding sites and the confidence score
    · An image of the predicted ligand-binding sites

Conclusion and Future Prospects

Bioinformatics is a comparatively young discipline that has progressed very fast in the last few years. It has made it possible to test hypotheses virtually and therefore allows a better-informed decision before costly experiments are launched. Although more and more tools for analysing genomes and proteomes, predicting structures, rational drug design, and molecular simulation are being developed, none of them is ‘perfect’. The hunt for a better package for solving a given problem will therefore continue.
One thing is clear: future research will be guided largely by the availability of databases, which may be either generic or specific. Based on developments in the field, it can also be safely assumed that bioinformatics tools and software packages will deliver more accurate results and thus more reliable interpretations. Prospects for the field include its contribution to a functional understanding of the human genome, leading to enhanced discovery of drug targets and individualised therapy. Thus, bioinformatics and other scientific disciplines have to move hand in hand to flourish for the welfare of humanity.

Some other tools and software packages are:

1. GenBank
2. ExPASy
3. Ensembl
4. Readseq
5. Entrez
6. Magpie
7. GenQuiz
8. GENSCAN
9. ORF finder
10. Modeller
11. DDBJ
12. PIR
13. AceDB
14. BankIt
15. Sequin
16. Spin
17. Panther
18. NCBI ORF finder
19. ORF Prediction
20. ORF Investigation
21. RNA-Seq, etc.

REFERENCES

Mount DW (2004) Sequence and genome analysis. New York: Cold Spring.
Hesper B, Hogeweg P (1970) Bioinformatica: een werkconcept. Kameleon 1: 28-29.
Hogeweg P (2011) The roots of bioinformatics in theoretical biology. PLoS Comput Biol 7: e1002021.
Peitsch MC (1996) ProMod and Swiss-Model: Internet-based tools for automated comparative protein modelling. Biochem Soc Trans 24: 274-279.
Dibyajyoti S, Bin ET, Swati P (2013) Bioinformatics: The effects on the cost of drug discovery. Galle Med J 18: 44-50.
Ouzounis CA, Valencia A (2003) Early bioinformatics: the birth of a discipline–a personal view. Bioinformatics 19: 2176-2190.
Molatudi M, Molotja N, Pouris A (2009) A bibliometric study of bioinformatics research in South Africa. Scientometrics 81: 47-59.
Ouzounis CA (2012) Rise and demise of bioinformatics? Promise and progress. PLoS Comput Biol 8: e1002487.
Geer RC, Sayers EW (2003) Entrez: making use of its power. Brief Bioinform 4: 179-184.
Parmigiani G, Garrett ES, Irizarry RA, Zeger SL (2003) The analysis of gene expression data: an overview of methods and software. Springer, New York.