Jensen, Niels and Mandl, Thomas. Different Indexing Strategies for Multilingual Web Retrieval: Experiments with the EuroGOV Corpus. 2006. In 17th ACM Conference on Hypertext and Hypermedia (HT '06), Odense, Denmark, 22-25 August 2006. (Unpublished) [Presentation]
PDF: ht06PosterPresentedJensenMandl.PDF (165kB)
English abstract
Experiments with a multilingual web collection are presented. The EuroGOV corpus is the first multilingual web corpus for retrieval evaluation. We show how indexes based on words and n-grams are developed for different document parts. Separate indexes were built from the full document content, from partial content, and from the title. The best results were achieved with a title-only index based on words.
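As a rough illustration of the indexing variants named in the abstract, the following minimal sketch builds a word-based index over titles and a character n-gram index over full content. It is not the authors' implementation. The function names, the `title` and `content` field names, the n-gram length of 4, and the two toy documents are all assumptions made for this example and do not come from the presentation or the EuroGOV corpus.

```python
import re
from collections import defaultdict


def word_terms(text):
    """Return lowercased word tokens."""
    return re.findall(r"\w+", text.lower())


def ngram_terms(text, n=4):
    """Return overlapping character n-grams of each word.

    Words no longer than n are kept whole. n=4 is an assumed value.
    """
    terms = []
    for word in word_terms(text):
        if len(word) <= n:
            terms.append(word)
        else:
            terms.extend(word[i:i + n] for i in range(len(word) - n + 1))
    return terms


def build_index(docs, field, tokenizer):
    """Map each term to the set of ids of documents whose `field` contains it."""
    index = defaultdict(set)
    for doc_id, doc in docs.items():
        for term in tokenizer(doc[field]):
            index[term].add(doc_id)
    return index


# Toy documents invented for illustration (hypothetical, not EuroGOV data).
docs = {
    "d1": {"title": "Ministry of Finance",
           "content": "Budget reports and tax information"},
    "d2": {"title": "Finanzministerium",
           "content": "Informationen zu Steuern und Haushalt"},
}

# One index per (document part, term type) combination.
title_word_index = build_index(docs, "title", word_terms)
content_ngram_index = build_index(docs, "content", ngram_terms)
```

In this sketch the n-gram index lets the English "information" and the German "Informationen" share terms such as `info`, while the title word index only matches whole words. That is one plausible reason n-grams are considered for multilingual collections, though the abstract reports the best results for the title-only word index.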
Item type: | Presentation |
---|---|
Keywords: | web information retrieval, multilingual information systems |
Subjects: | L. Information technology and library technology > LM. Automatic text retrieval. L. Information technology and library technology > LS. Search engines. L. Information technology and library technology > LC. Internet, including WWW. |
Depositing user: | Thomas Mandl |
Date deposited: | 28 Aug 2006 |
Last modified: | 02 Oct 2014 12:04 |
URI: | http://hdl.handle.net/10760/8033 |