Where do good query terms come from?

Muresan, Gheorghe and Roussinov, Dmitri. Where do good query terms come from? 2006. In 69th Annual Meeting of the American Society for Information Science and Technology (ASIST), Austin (US), 3-8 November 2006. [Conference paper]

English abstract

This paper describes a framework for investigating the quality of different query expansion approaches, and applies it in the HARD TREC experimental setting. The intuition behind our approach is that each topic has an optimal term-based representation, i.e. a set of terms that best describe it, and that the effectiveness of any other representation is correlated with the overlap that it has with the optimal representation. Indeed, we find that, for a wide range of candidate topic representations, obtained through various query-expansion approaches, there is a high correlation between standard effectiveness measures (R-P, P@10, MAP) and term overlap with what is estimated to be the optimal representation. An important conclusion from comparing different query expansion approaches is that machines are better than humans at doing statistical calculations and at estimating which query terms are more likely to discriminate documents relevant to a given topic. This explains why, in the HARD track of TREC 2005, the overall conclusion was that interaction with the searcher and elicitation of additional information could not outperform automatic procedures for query improvement. However, the best results are obtained from hybrid approaches, in which human relevance judgments are used by algorithms for deriving term representations. This result suggests that the best approach to improving retrieval performance is probably to focus on implicit relevance feedback and novel interaction models based on ostension or mediation, which have shown great potential.
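The core idea of the abstract, that a representation's effectiveness tracks its term overlap with an estimated optimal representation, can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration only: the abstract does not say how overlap is computed, so the sketch uses the fraction of optimal terms covered by a candidate; it uses Pearson correlation, which may differ from the statistic used in the paper; and the term sets and MAP values are invented toy data, not results from the paper.

```python
from math import sqrt


def term_overlap(candidate, optimal):
    """Fraction of the optimal terms that also occur in the candidate.

    Assumption: the abstract does not define the overlap measure;
    this coverage ratio is one plausible choice.
    """
    candidate, optimal = set(candidate), set(optimal)
    if not optimal:
        return 0.0
    return len(candidate & optimal) / len(optimal)


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


# Hypothetical estimate of the optimal representation for one topic.
optimal_terms = {"oil", "spill", "tanker", "cleanup"}

# Hypothetical candidate representations, each paired with an invented
# MAP score (toy numbers, NOT taken from the paper).
candidates = {
    "title_only": (["oil", "spill"], 0.20),
    "user_expanded": (["oil", "spill", "ocean", "damage"], 0.22),
    "feedback_expanded": (["oil", "spill", "tanker", "coast"], 0.31),
}

overlaps = [term_overlap(terms, optimal_terms) for terms, _ in candidates.values()]
map_scores = [score for _, score in candidates.values()]

print("overlap per representation:", overlaps)
print("correlation between overlap and MAP:", pearson(overlaps, map_scores))
```

In this toy setting, a correlation close to 1 would mean that representations sharing more terms with the estimated optimum also retrieve better, which is the relationship the abstract reports across query-expansion approaches.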

Item type: Conference paper
Keywords: query expansion ; HARD TREC ; automated information retrieval
Subjects: L. Information technology and library technology > LL. Automated language processing.
I. Information treatment for information services > IB. Content analysis (A and I, class.)
Depositing user: Norm Medeiros
Date deposited: 21 Dec 2006
Last modified: 02 Oct 2014 12:05
URI: http://hdl.handle.net/10760/8687
