REINA at WebCLEF 2006: Mixing Fields to Improve Retrieval

Zazo, Ángel F., G.-Figuerola, Carlos and Alonso-Berrocal, José-Luis. REINA at WebCLEF 2006: Mixing Fields to Improve Retrieval, 2006. In: ABSTRACTS CLEF 2006 Workshop, 20-22 September, Alicante, Spain. Results of the CLEF 2006 Cross-Language System Evaluation Campaign. UNSPECIFIED. [Book chapter]


English abstract

This paper describes our work in the CLEF 2006 Robust task, an ad-hoc task that explores methods for stable retrieval by focusing on poorly performing topics. We carried out experiments for all subtasks: monolingual (EN, ES, FR and IT), bilingual (IT→ES) and multilingual (ES→[EN ES FR IT]) retrieval. For monolingual retrieval we focused on local query expansion, i.e. using only information from the retrieved documents; external corpora, such as the Web, were not used. Our document retrieval system is simple and is based on the vector space model. Several local expansion techniques were applied to the training topics. The best improvement was achieved using association thesauri built from co-occurrence relations within term windows rather than within complete documents. This technique is effective and easy to implement without having to tune parameters. Our mandatory runs (title+description topic fields) obtained good positions in all the monolingual subtasks in which we participated. For bilingual retrieval, two machine translation programs were used to translate the topics from Italian into Spanish, and both translations were joined before searching. The same expansion technique was also applied. Our mandatory run obtained the top rank in the bilingual subtask. For multilingual retrieval we used the same procedure to obtain the retrieval list for each target language, and we combined the lists with the MAX-MIN data fusion method. In this subtask, our mandatory run placed in the lower part of the ranking of runs.
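As an illustration of the term-window idea mentioned in the abstract, the following is a minimal sketch, not the authors' implementation. It assumes that the input documents are the already tokenized top-ranked documents for a query, that the window is a fixed number of consecutive tokens, and that association strength is a Dice-style ratio. The abstract does not specify the window size, the association measure, or how expansion terms are selected, so the function names, the `window` and `top_k` parameters and the scoring formula are all hypothetical.

```python
from collections import Counter


def window_cooccurrences(tokens, window=5):
    """Count unordered term pairs that co-occur within `window` consecutive tokens."""
    pair_counts = Counter()
    for i, left in enumerate(tokens):
        # Pair token i only with the tokens that follow it inside the window,
        # so each positional pair is counted once.
        for right in tokens[i + 1 : i + window]:
            if left != right:
                pair_counts[tuple(sorted((left, right)))] += 1
    return pair_counts


def association_thesaurus(docs, window=5):
    """Build a symmetric term-to-term association table from tokenized documents."""
    term_counts = Counter()
    pair_counts = Counter()
    for tokens in docs:
        term_counts.update(tokens)
        pair_counts.update(window_cooccurrences(tokens, window))
    thesaurus = {}
    for (a, b), n_ab in pair_counts.items():
        # Assumed measure: Dice-style ratio of window co-occurrences to term
        # frequencies. It is not guaranteed to stay below 1.
        score = 2 * n_ab / (term_counts[a] + term_counts[b])
        thesaurus.setdefault(a, {})[b] = score
        thesaurus.setdefault(b, {})[a] = score
    return thesaurus


def expand_query(query_terms, thesaurus, top_k=2):
    """Return the `top_k` terms most strongly associated with the whole query."""
    scores = Counter()
    for term in query_terms:
        for related, weight in thesaurus.get(term, {}).items():
            if related not in query_terms:
                scores[related] += weight
    return [term for term, _ in scores.most_common(top_k)]
```

In this sketch, a smaller `window` restricts associations to terms that appear close together, which is the contrast with whole-document co-occurrence drawn in the abstract.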

Item type: Book chapter
Keywords: Robust Retrieval, Query Expansion, Term Windows, Association Thesauri, CLIR, Machine Translation
Subjects: L. Information technology and library technology > LM. Automatic text retrieval.
Depositing user: R. Gómez-Díaz
Date deposited: 15 Dec 2009
Last modified: 02 Oct 2014 12:16
URI: http://hdl.handle.net/10760/14004


