Aproximación Bio-Bibliométrica a la Detección de Relaciones Biológicas entre Genes

Galvez, Carmen and Moya-Anegón, Félix. Aproximación Bio-Bibliométrica a la Detección de Relaciones Biológicas entre Genes, 2007. In 2nd Iberian Conference on Information Systems and Technologies - CISTI 2007, Porto (Portugal), 21-23 June 2007. [Conference paper]

English abstract

Bioinformatics research has generated a large body of biomedical literature stored in databases such as MEDLINE. Information extraction from this published literature can be applied to detect biological relations among genes. The premise of bio-bibliometric analysis is the following: if two gene symbols appear in the same document, they are likely to be related, by the principle of co-occurrence. These data can be used to calculate the 'biobibliometric distance' between pairs of genes in a complete genome. In this work, we carry out a simple experiment based on this approach, with the aim of extracting and visualizing information from the biomedical literature related to lymphoma. The main limitations of this method are the unification of the different gene-name variants (to avoid incorrect co-occurrences) and the identification of the type of genomic interaction.
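
As a rough illustration of the co-occurrence premise, the sketch below counts how often pairs of gene symbols appear in the same document and turns those counts into a pairwise distance. The abstract does not state the formula behind the 'biobibliometric distance', so the Jaccard distance used here is an assumption chosen for illustration, not the authors' measure. The function name `cooccurrence_distances` and the toy documents are hypothetical, and the gene symbols are assumed to be already normalized to a single variant per gene.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_distances(docs):
    """Return {(gene_a, gene_b): distance} for every pair that co-occurs.

    docs: iterable of collections of normalized gene symbols, one per document.
    Distance is the Jaccard distance over document sets (an assumption;
    the source does not specify its distance formula).
    """
    doc_freq = Counter()   # number of documents mentioning each gene
    pair_freq = Counter()  # number of documents mentioning both genes
    for genes in docs:
        genes = set(genes)  # count each gene once per document
        doc_freq.update(genes)
        pair_freq.update(combinations(sorted(genes), 2))

    distances = {}
    for (a, b), both in pair_freq.items():
        either = doc_freq[a] + doc_freq[b] - both  # documents with a or b
        distances[(a, b)] = 1.0 - both / either
    return distances


# Hypothetical toy corpus: each set stands for the gene symbols in one abstract.
docs = [
    {"BCL2", "MYC"},
    {"BCL2", "MYC", "TP53"},
    {"TP53"},
    {"BCL2"},
]

distances = cooccurrence_distances(docs)
for pair, d in sorted(distances.items()):
    print(pair, round(d, 3))
```

Pairs that never share a document are simply absent from the result, which is one way to treat them as unrelated. The sketch also shows why name unification matters: if the same gene appeared under two spellings, its co-occurrences would be split across two keys and the distances would be skewed.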

Spanish abstract

La investigación bioinformática ha generado una gran cantidad de literatura biomédica almacenada en bases de datos tales como MEDLINE. La extracción de información de la literatura publicada se puede aplicar para detectar relaciones biológicas entre genes. La premisa del análisis Bio-Bibliométrico es la siguiente: si dos símbolos de gen aparecen en el mismo documento es probable que estén relacionados (por el principio de co-ocurrencia). Estos datos se pueden utilizar para calcular la ‘distancia biobibliométrica’ entre pares de genes de un genoma completo. En este trabajo, realizamos un sencillo experimento basado en este planteamiento con el objetivo de extraer y visualizar información de la literatura biomédica relacionada con la enfermedad del linfoma. Las principales limitaciones de este método son la unificación de las diferentes variantes de nombres de gen, para que no se produzcan co-ocurrencias incorrectas, y la identificación del tipo de interacción genómica.

Item type: Conference paper
Keywords: Análisis bio-bibliométrico; minería de textos; redes de genes; BioBibliometrics; Genes; Gene Networks; Text Mining
Subjects: I. Information treatment for information services > IC. Index languages, processes and schemes.
B. Information use and sociology of information > BB. Bibliometric methods
Depositing user: Carmen Galvez
Date deposited: 08 Aug 2007
Last modified: 02 Oct 2014 12:09
URI: http://hdl.handle.net/10760/10185
