Mayr, Philipp, Mutschke, Peter and Petras, Vivien Reducing semantic complexity in distributed Digital Libraries: treatment of term vagueness and document re-ranking. Library Review, 2008, vol. 57, n. 3. (Unpublished) [Journal article (Unpaginated)]
English abstract
Purpose - The general science portal vascoda merges structured, high-quality information collections from more than 40 providers, using search engine technology (FAST) and a concept for treating semantic heterogeneity between different controlled vocabularies. First experiences with the portal reveal weaknesses of this approach that also appear in most metadata-driven Digital Libraries (DLs) and subject-specific portals. The purpose of the paper is to propose models that reduce semantic complexity in heterogeneous DLs. The aim is to introduce value-added services (treatment of term vagueness and document re-ranking) that add quality to DLs when they are combined with the heterogeneity components established in the project “Competence Center Modeling and Treatment of Semantic Heterogeneity”.

Design/methodology/approach - First, semantic heterogeneity components translate automatically between different indexing languages. This approach affects search when the searcher uses controlled vocabularies that are linked through cross-concordances. However, users usually formulate query terms freely, without any vocabulary support. Empirical observations show that freely formulated user terms and terms from controlled vocabularies often differ or match only by coincidence. Therefore, a value-added service, the Search Term Recommender (STR), will be developed that maps the searcher's natural-language terms to suggestions from the controlled vocabulary. Second, the result sets of transformed or expanded queries in distributed collections are often very large, and tests show that conventional web-based ranking methods are not appropriate for presenting heterogeneous metadata records as suitable result sets to the user. Therefore, two methods derived from scientometrics and network analysis will be implemented to re-rank result sets by the following structural properties: ranking of results by core journals (so-called Bradfordizing) and ranking by the centrality of authors in co-authorship networks.

Findings - The methods to be implemented address the query side and the result side of a search and are designed to reinforce each other. Conceptually, they will improve search quality and ensure that the most relevant documents in result sets are ranked higher.

Originality/value - The central contribution of the paper is the integration of three structural value-adding methods that aim to reduce the semantic complexity represented in distributed DLs at several stages of the information retrieval process: query construction, search and ranking, and re-ranking.

Paper type - Research paper
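
The two re-ranking ideas named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The record layout (`id`, `journal`, `authors`), the sample data, and all function names are hypothetical. The abstract does not say which centrality measure is used, so degree centrality (the number of distinct co-authors within the result set) is assumed here purely for illustration.

```python
from collections import Counter
from itertools import combinations

# Hypothetical result set; field names and values are illustrative only.
records = [
    {"id": 1, "journal": "J. Alpha", "authors": ["A", "B"]},
    {"id": 2, "journal": "J. Beta", "authors": ["C"]},
    {"id": 3, "journal": "J. Alpha", "authors": ["B", "D"]},
    {"id": 4, "journal": "J. Gamma", "authors": ["B", "E"]},
    {"id": 5, "journal": "J. Alpha", "authors": ["F"]},
]


def bradfordize(records):
    """Put records from the journals that occur most often in the result set first."""
    journal_counts = Counter(r["journal"] for r in records)
    # Sort by descending journal frequency, then by journal name so that
    # records of one journal stay together. sorted() is stable, so records
    # within a journal keep their original order.
    return sorted(records, key=lambda r: (-journal_counts[r["journal"]], r["journal"]))


def author_degree_centrality(records):
    """Count distinct co-authors per author within the result set (assumed measure)."""
    neighbours = {}
    for r in records:
        authors = sorted(set(r["authors"]))
        for a in authors:
            neighbours.setdefault(a, set())
        # Every pair of authors on the same record is linked in the network.
        for a, b in combinations(authors, 2):
            neighbours[a].add(b)
            neighbours[b].add(a)
    return {a: len(n) for a, n in neighbours.items()}


def rank_by_author_centrality(records):
    """Put records whose most central author has the highest degree first."""
    degree = author_degree_centrality(records)
    return sorted(
        records,
        key=lambda r: -max((degree[a] for a in r["authors"]), default=0),
    )


bradford_ids = [r["id"] for r in bradfordize(records)]
centrality_ids = [r["id"] for r in rank_by_author_centrality(records)]
```

With this sample data, `bradford_ids` is `[1, 3, 5, 2, 4]` because "J. Alpha" is the most frequent journal, and `centrality_ids` is `[1, 3, 4, 2, 5]` because author "B" has the most co-authors. A production service would presumably build the co-authorship network from much larger result sets and might use a different centrality measure.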
Item type: | Journal article (Unpaginated) |
---|---|
Keywords: | Digital Library, Semantic Heterogeneity, Search Term Recommender, Re-Ranking, Bradfordizing, Co-Author Networks, Network Analysis |
Subjects: | H. Information sources, supports, channels > HL. Databases and database Networking; I. Information treatment for information services > IC. Index languages, processes and schemes; B. Information use and sociology of information > BB. Bibliometric methods; H. Information sources, supports, channels > HR. Portals; L. Information technology and library technology > LS. Search engines |
Depositing user: | Philipp Mayr |
Date deposited: | 16 Dec 2007 |
Last modified: | 02 Oct 2014 12:10 |
URI: | http://hdl.handle.net/10760/10893 |