Recuperación de información utilizando el modelo vectorial. Participación en el taller CLEF-2001

Zazo, Ángel F. and G.-Figuerola, Carlos and Alonso-Berrocal, José-Luis and Gómez-Díaz, Raquel. Recuperación de información utilizando el modelo vectorial. Participación en el taller CLEF-2001. 2002 (Unpublished) [Report]

English abstract

This document describes how we built the information retrieval system (IRS) used for our participation in CLEF-2001. The system is based on the well-known vector space model. We first give a brief account of the information retrieval problem, the vector space model, and the evaluation of an IRS on a test collection. We then describe the lexical processing applied to the content of documents and queries. Finally, we present the results obtained in the experiments and our conclusions.

Spanish abstract

En este documento se describe el proceso seguido para la construcción del sistema de recuperación de información que hemos utilizado en nuestra participación en CLEF-2001. El sistema utiliza el conocido modelo vectorial. Se analiza primeramente el problema de la recuperación de información, el modelo vectorial y la evaluación sobre una colección de pruebas. Seguidamente se describe el procesado léxico realizado sobre el contenido de documentos y consultas. Se finaliza con los resultados obtenidos en los experimentos y las conclusiones.
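
As a rough illustration of the approach the abstract summarises (lexical processing of documents and queries, followed by ranking in the vector space model), the sketch below may help. It is not the system described in the report: the tf-idf weighting, the cosine similarity, the tiny stopword list and every function name here are assumptions chosen for illustration, since the abstract does not state which weighting scheme or lexical rules were used.

```python
import math
import re
from collections import Counter

# Hypothetical stopword list, for illustration only.
STOPWORDS = {"the", "of", "a", "in", "and", "is"}


def tokenize(text):
    """Lowercase the text, keep runs of letters, and drop stopwords."""
    return [t for t in re.findall(r"[^\W\d_]+", text.lower()) if t not in STOPWORDS]


def build_index(docs):
    """Return one sparse tf-idf vector per document, plus the idf table."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(docs)
    # Document frequency: number of documents containing each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    idf = {term: math.log(n_docs / count) for term, count in df.items()}
    vectors = [
        {term: tf * idf[term] for term, tf in Counter(tokens).items()}
        for tokens in tokenized
    ]
    return vectors, idf


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank(query, docs):
    """Return (document index, score) pairs, best match first."""
    vectors, idf = build_index(docs)
    # Query terms that never occur in the collection get weight 0.
    q = {t: tf * idf.get(t, 0.0) for t, tf in Counter(tokenize(query)).items()}
    scored = [(i, cosine(q, vec)) for i, vec in enumerate(vectors)]
    return sorted(scored, key=lambda pair: (-pair[1], pair[0]))
```

Calling `rank("vector retrieval", docs)` on a small list of strings returns the documents ordered by decreasing similarity to the query; documents sharing no weighted term with the query score 0.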

Item type: Report
Keywords: Information retrieval. Vector space model
Subjects: L. Information technology and library technology > LL. Automated language processing.
Depositing user: R. Gómez-Díaz
Date deposited: 07 Dec 2009
Last modified: 02 Oct 2014 12:16
URI: http://hdl.handle.net/10760/13963

