Development of an integrated suite of bioinformatic tools for genomic and proteomic research

  1. SEVILLA CAMPO, JOSE LUIS FRANCISCO
Supervised by:
  1. Angel Rubio Díaz-Cordovés (Director)

University of defence: Universidad de Navarra

Date of defence: 15 March 2005

Examination committee:
  1. Pedro Crespo Bofill (Chair)
  2. Javier Santos Garcia (Secretary)
  3. José Manuel García Carrasco (Member)
  4. Pedro Larrañaga Múgica (Member)
  5. Javier de las Rivas (Member)

Type: Thesis

Teseo: 300491 DIALNET

Abstract

Recent technological developments in the biological sciences, such as the advent of microarray technologies, have created a demand for new bioinformatic tools able to deal effectively with the overwhelming wealth of data available. This doctoral thesis explores the development of a novel system for genomic and proteomic studies that we have named GARBAN (Genomic Analysis for Rapid Biological Annotation). The new software suite integrates the management of gene and protein expression levels with a comprehensive set of tools that carry out an exhaustive analysis of data gathered through multiple microarray or proteomic assays. The thesis combines concepts from data mining and mathematical algorithms with related statistical techniques. A relational database system is used to store, maintain and retrieve microarray data. Appropriate algorithms have been implemented to analyse the validity and relevance of the information gathered, and to manipulate data in order to draw biological conclusions of interest. A client-server architecture has been chosen to provide simultaneous access to multiple users via a local intranet or the internet. Our implementation allows institutional sharing of data while safeguarding the security and privacy of the information. We also provide project, analysis and sample management at the user level. The system is accessed through a friendly user interface that runs in a standard web browser; no special client installation is required. GARBAN also provides automatic annotation of gene products and direct links to major public databases. The work comprises the design of a suitable database schema that supports the input of data from experimental samples, their annotation and further manipulation with the integrated set of analysis tools.
We have also developed a comprehensive process for filtering and normalisation of expression levels in order to help the researcher identify the gene products that are most affected by the experimental conditions and are, therefore, likely to lead to biological inferences. We include graphical access to Boehringer-Mannheim and KEGG (Kyoto Encyclopedia of Genes and Genomes) metabolic pathways, which provide a powerful insight into the functional information of genes and proteins. Our implementation graphically displays the location of enzymes derived from genes and proteins within the metabolic pathways in which they are involved, so that it becomes possible to see, at a glance, those functional processes that have been influenced or otherwise altered in the experimental samples. GARBAN integrates the annotation developed by the Gene Ontology (GO) Consortium to classify and extract functional information about gene products. This approach has proven to be extremely useful for elucidating the significance of over- and under-expressed gene products under the GO hierarchical categories of molecular function, biological process and cellular component. Finally, we have developed GECCO (Gene Product Coexpression) as a web-based system to select relevant genes using available information on gene coexpression in humans. Coexpression information may be combined with standard statistical techniques to increase the power of the analysis and reduce the number of potential errors. GECCO also provides a graphical display of the network of coexpressed genes. Although it may be used as a stand-alone application, its functionality has also been integrated into the GARBAN suite of tools.
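
As an illustration only: the abstract does not say which statistic GARBAN uses to assess over- and under-expressed gene products within GO categories. A test commonly used for this kind of question is the hypergeometric tail probability, sketched below. The function name, parameter names and the numbers in the usage lines are hypothetical and are not taken from the thesis.

```python
from math import comb


def hypergeom_enrichment_p(population, annotated, selected, hits):
    """Return P(X >= hits) for a hypergeometric draw.

    population: total number of gene products measured
    annotated:  how many of them carry the GO term of interest
    selected:   how many were flagged as differentially expressed
    hits:       how many flagged gene products carry the GO term

    NOTE: an assumed, generic over-representation test, not GARBAN's
    documented method.
    """
    upper = min(annotated, selected)
    total = comb(population, selected)
    # Sum the probability mass for hits, hits + 1, ..., upper.
    tail = sum(
        comb(annotated, k) * comb(population - annotated, selected - k)
        for k in range(hits, upper + 1)
    )
    return tail / total


if __name__ == "__main__":
    # Hypothetical numbers: 5000 measured gene products, 40 annotated with
    # the term, 200 flagged as differentially expressed, 8 of them annotated.
    p = hypergeom_enrichment_p(5000, 40, 200, 8)
    print(f"enrichment p-value: {p:.3g}")
```

A small p-value suggests that the GO term is more frequent among the flagged gene products than chance alone would explain. When many terms are tested at once, a multiple-testing correction would normally be applied as well.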