Enabling Multilingual Search through Controlled Vocabularies: the AGRIS Approach

Celli, Fabrizio and Keizer, Johannes. Enabling Multilingual Search through Controlled Vocabularies: the AGRIS Approach, 2016. [Preprint]

English abstract

AGRIS is a bibliographic database of scientific publications in the food and agricultural domain. The AGRIS web portal is heavily visited, with peaks of 350,000 visits per month from more than 200 countries and territories. Given the variety of AGRIS users, support for cross-language information retrieval is crucial to the usefulness of the website. This paper describes a lightweight approach adopted to enable this feature in the AGRIS system. The approach relies on a controlled vocabulary. We also discuss how expanding user queries with synonyms increases the sensitivity of a search engine, and how a controlled vocabulary can be used to achieve this result.
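
The idea summarized in the abstract can be illustrated with a minimal sketch. It is not the AGRIS implementation: the vocabulary content, the concept identifiers, and the names `VOCABULARY`, `build_label_index`, and `expand_query` are all hypothetical, chosen only to show how a multilingual controlled vocabulary might turn one query term into a set of synonyms and translations.

```python
from typing import Dict, List, Set

# Toy controlled vocabulary (hypothetical data): concept id -> language -> labels.
# Each concept groups synonyms and translations of the same notion.
VOCABULARY: Dict[str, Dict[str, List[str]]] = {
    "c_maize": {
        "en": ["maize", "corn"],
        "fr": ["maïs"],
        "es": ["maíz"],
    },
    "c_rice": {
        "en": ["rice"],
        "fr": ["riz"],
        "es": ["arroz"],
    },
}


def build_label_index(vocabulary: Dict[str, Dict[str, List[str]]]) -> Dict[str, Set[str]]:
    """Map each lower-cased label to the ids of the concepts that carry it."""
    index: Dict[str, Set[str]] = {}
    for concept_id, labels_by_lang in vocabulary.items():
        for labels in labels_by_lang.values():
            for label in labels:
                index.setdefault(label.lower(), set()).add(concept_id)
    return index


def expand_query(query: str,
                 vocabulary: Dict[str, Dict[str, List[str]]],
                 index: Dict[str, Set[str]]) -> List[str]:
    """Return the query term followed by every label of each matching concept."""
    term = query.strip().lower()
    expanded = [term]
    # Sorting concept ids and language codes keeps the output deterministic.
    for concept_id in sorted(index.get(term, ())):
        for lang in sorted(vocabulary[concept_id]):
            for label in vocabulary[concept_id][lang]:
                if label.lower() not in expanded:
                    expanded.append(label.lower())
    return expanded


LABEL_INDEX = build_label_index(VOCABULARY)
print(" OR ".join(expand_query("corn", VOCABULARY, LABEL_INDEX)))
```

With this toy data, the query "corn" expands to `corn OR maize OR maíz OR maïs`, so a record indexed only under a French or Spanish label would still be retrieved. A term that matches no concept is returned unchanged, which keeps the expansion step safe to apply to every query.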

Item type: Preprint
Keywords: Cross-language Information Retrieval, Controlled Vocabulary, Query Expansion, Search Engine, Digital Repository, Agriculture, FAO_AIMS, FAO, AIMS
Subjects: I. Information treatment for information services > IC. Index languages, processes and schemes.
I. Information treatment for information services > IE. Data and metadata structures.
I. Information treatment for information services > IL. Semantic web
Depositing user: Thembani Malapela
Date deposited: 07 Nov 2016 07:57
Last modified: 07 Nov 2016 07:57
URI: http://hdl.handle.net/10760/30220

References

Basili, R., Stellato, A., Daniele, P., Salvatore, P., & Wurzer, J. (2012, September). Innovation-related enterprise semantic search: the INSEARCH experience. In Semantic Computing (ICSC), 2012 IEEE Sixth International Conference on (pp. 194-201). IEEE.

Bibliographic Services Task Force of the University of California Libraries (2005). Rethinking How We Provide Bibliographic Services for the University of California: Final Report.

Carpineto, C., & Romano, G. (2012). A survey of automatic query expansion in information retrieval. ACM Computing Surveys (CSUR), 44(1), 1.

Celli, F., Keizer, J., Jaques, Y., Konstantopoulos, S., & Vudragović, D. (2015). Discovering, Indexing and Interlinking Information Resources. F1000Research.

Marcum, D. B. (2005). The Future of Cataloging: Address to the Ebsco Leadership Seminar. Boston, Massachusetts.

FAOSTAT Food and Agriculture commodities production, http://faostat3.fao.org/browse/rankings/commodities_by_regions/E

Gardner, S. A. (2008). The changing landscape of contemporary cataloging. Cataloging & Classification Quarterly, 45(4), 81-99.

Ghorab, M. R., Leveling, J., Lawless, S., O’Connor, A., Zhou, D., Jones, G. J., & Wade, V. (2011). Multilingual adaptive search for digital libraries. In Research and Advanced Technology for Digital Libraries (pp. 244-251). Springer Berlin Heidelberg.

Gale, W. A., Church, K. W., & Yarowsky, D. (1992, February). One sense per discourse. In Proceedings of the workshop on Speech and Natural Language (pp. 233-237). Association for Computational Linguistics.

Gross, T., & Taylor, A. G. (2005). What have we got to lose? The effect of controlled vocabulary on keyword searching results. College & Research Libraries, 66(3), 212-230.

Gross, T., Taylor, A. G., & Joudrey, D. N. (2015). Still a lot to lose: the role of controlled vocabulary in keyword searching. Cataloging & Classification Quarterly, 53(1), 1-39.

Kaplan, A., Sándor, Á., Severiens, T., & Vorndran, A. (2014). Finding Quality: A Multilingual Search Engine for Educational Research. In Assessing Quality in European Educational Research (pp. 22-30). Springer Fachmedien Wiesbaden.

Lu, C., Park, J. R., & Hu, X. (2010). User tags versus expert-assigned subject terms: A comparison of LibraryThing tags and Library of Congress Subject Headings. Journal of information science, 36(6), 763-779.

McCutcheon, S. (2009). Keyword vs controlled vocabulary searching: the one with the most tools wins. The Indexer, 27(2), 62-65.

Peters, C., Braschler, M., & Clough, P. (2012). Cross-Language Information Retrieval. In Multilingual Information Retrieval (pp. 57-84). Springer Berlin Heidelberg.

Rowley, J. (1994). The controlled versus natural indexing languages debate revisited: a perspective on information retrieval practice and research. Journal of information science, 20(2), 108-118.

Scicluna, R. (2015). Should libraries discontinue using and maintaining controlled subject vocabularies?

Spink, A., Wolfram, D., Jansen, M. B., & Saracevic, T. (2001). Searching the web: The public and their queries. Journal of the American society for information science and technology, 52(3), 226-234.

Stellato, A., Rajbhandari, S., Turbati, A., Fiorelli, M., Caracciolo, C., Lorenzetti, T., ... & Pazienza, M. T. (2015). VocBench: a web application for collaborative development of multilingual thesauri. In The Semantic Web. Latest Advances and New Domains (pp. 38-53). Springer International Publishing.

Voorbij, H. J. (1998). Title keywords and subject descriptors: A comparison of subject search entries of books in the humanities and social sciences. Journal of documentation, 54(4), 466-476.

Zavalina, O. L. (2010). Collection-Level Subject Access in Aggregations of Digital Collections: Metadata Application and Use. PhD dissertation, University of Illinois at Urbana-Champaign.

