Sistemas de recuperación de información adaptados al dominio biomédico

Marrero, Monica and Sanchez-Cuadrado, Sonia and Urbano, Julian and Morato, Jorge and Moreiro, Jose-Antonio. Sistemas de recuperación de información adaptados al dominio biomédico. El profesional de la información, 2010, vol. 19, n. 3, pp. 246-254. [Journal article (Paginated)]

English abstract

The terminology used in biomedicine has lexical characteristics that have required the development of terminological resources and information retrieval systems with specific functionality. The main characteristics are high rates of synonymy and homonymy, caused by phenomena such as the proliferation of polysemous acronyms and their overlap with everyday language. Information retrieval systems in the biomedical domain use techniques designed to handle these lexical peculiarities. In this paper we review some of these techniques, such as the application of Natural Language Processing (BioNLP), the incorporation of lexical-semantic resources, and the application of Named Entity Recognition (BioNER). Finally, we present the evaluation methods adopted to assess the suitability of these techniques for retrieving biomedical resources.

Spanish abstract

La terminología usada en biomedicina tiene rasgos léxicos que han requerido la elaboración de recursos terminológicos y sistemas de recuperación de información con funciones específicas. Las principales características son las elevadas tasas de sinonimia y homonimia, debidas a fenómenos como la proliferación de siglas polisémicas y su interacción con el lenguaje común. Los sistemas de recuperación de información en el dominio biomédico utilizan técnicas orientadas al tratamiento de estas peculiaridades léxicas. Se revisan algunas de estas técnicas, como la aplicación de Procesamiento del Lenguaje Natural (BioNLP), la incorporación de recursos léxico-semánticos, y la aplicación de Reconocimiento de Entidades (BioNER). Se presentan los métodos de evaluación adoptados para comprobar la adecuación de estas técnicas en la recuperación de recursos biomédicos.

Item type: Journal article (Paginated)
Keywords: Biomedicina, BioNER, BioNLP, Text-mining, Recuperación de información, Proceso del lenguaje natural, NLP; Biomedicine, BioNER, BioNLP, Text-mining, Information retrieval, Natural Language Processing, NLP
Subjects: L. Information technology and library technology > LL. Automated language processing.
L. Information technology and library technology > LM. Automatic text retrieval.
L. Information technology and library technology > LS. Search engines.
Depositing user: Sonia Sanchez-Cuadrado
Date deposited: 19 Jun 2012
Last modified: 02 Oct 2014 12:22
URI: http://hdl.handle.net/10760/17153

