Where do good query terms come from?

Muresan, Gheorghe and Roussinov, Dmitri. Where do good query terms come from? 2006. In 69th Annual Meeting of the American Society for Information Science and Technology (ASIST), Austin (US), 3-8 November 2006. [Conference paper]

English abstract

This paper describes a framework for investigating the quality of different query expansion approaches, and applies it in the HARD TREC experimental setting. The intuition behind our approach is that each topic has an optimal term-based representation, i.e. a set of terms that best describes it, and that the effectiveness of any other representation is correlated with its overlap with the optimal representation. Indeed, we find that, for a wide range of candidate topic representations obtained through various query expansion approaches, there is a high correlation between standard effectiveness measures (R-P, P@10, MAP) and term overlap with what is estimated to be the optimal representation. An important conclusion from comparing different query expansion approaches is that machines are better than humans at doing statistical calculations and at estimating which query terms are more likely to discriminate documents relevant to a given topic. This explains why, in the HARD track of TREC 2005, the overall conclusion was that interaction with the searcher and elicitation of additional information could not outperform automatic procedures for query improvement. However, the best results are obtained from hybrid approaches, in which human relevance judgments are used by algorithms for deriving term representations. This result suggests that the best approach to improving retrieval performance is probably to focus on implicit relevance feedback and on novel interaction models based on ostension or mediation, which have shown great potential.
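The central idea of the abstract, that a representation's effectiveness tracks its term overlap with an estimated optimal representation, can be illustrated with a minimal sketch. The abstract does not say how overlap or correlation were computed, so the Jaccard coefficient and Pearson correlation used here are assumptions, and the term sets and effectiveness scores are invented placeholders, not results from the paper.

```python
from math import sqrt


def jaccard_overlap(candidate, optimal):
    """Share of terms common to both sets (Jaccard coefficient, assumed measure)."""
    candidate, optimal = set(candidate), set(optimal)
    union = candidate | optimal
    return len(candidate & optimal) / len(union) if union else 0.0


def pearson(xs, ys):
    """Pearson correlation between two equal-length sequences of numbers."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


# Hypothetical "optimal" representation for one topic.
optimal = {"query", "expansion", "relevance", "feedback"}

# Hypothetical candidate representations, each paired with an invented
# effectiveness score (standing in for a measure such as MAP).
candidates = {
    "title_only": ({"query", "expansion"}, 0.20),
    "user_terms": ({"query", "search", "engine"}, 0.15),
    "feedback_terms": ({"query", "expansion", "relevance", "ranking"}, 0.35),
}

overlaps = [jaccard_overlap(terms, optimal) for terms, _ in candidates.values()]
scores = [score for _, score in candidates.values()]
correlation = pearson(overlaps, scores)
print(f"overlaps={overlaps}, correlation={correlation:.3f}")
```

In a real study the overlap and the effectiveness measure would be computed per topic and per representation over a test collection; the toy values above only show the shape of the calculation.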

Item type:	Conference paper
Keywords:	query expansion ; HARD TREC ; automated information retrieval
Subjects:	L. Information technology and library technology > LL. Automated language processing ; I. Information treatment for information services > IB. Content analysis (A and I, class.)
Depositing user:	Norm Medeiros
Date deposited:	21 Dec 2006
Last modified:	02 Oct 2014 12:05
URI:	http://hdl.handle.net/10760/8687
