Term conflation methods in information retrieval: non-linguistic and linguistic approaches

Galvez, Carmen, de Moya-Anegon, Felix and Herrero-Solana, Victor. Term conflation methods in information retrieval: non-linguistic and linguistic approaches. Journal of Documentation, 2005, vol. 61, n. 4, pp. 520-547. [Journal article (Paginated)]

English abstract

Purpose – To propose a categorization of the different conflation procedures under the two basic approaches, non-linguistic and linguistic techniques, and to justify the application of normalization methods within the framework of linguistic techniques. Design/methodology/approach – Presents a range of term conflation methods that can be used in information retrieval. Uniterm and multiterm variants can be considered equivalent units for the purposes of automatic indexing. Stemming algorithms, segmentation rules, association measures and clustering techniques are well-evaluated non-linguistic methods, and experiments with these techniques show a wide variety of results. Alternatively, lemmatisation and syntactic pattern-matching, through equivalence relations represented in finite-state transducers (FSTs), are emerging methods for the recognition and standardization of terms. Findings – The survey attempts to point out the positive and negative effects of the linguistic approach and its potential as a term conflation method. Originality/value – Outlines the importance of FSTs for the normalization of term variants.

Item type:	Journal article (Paginated)
Keywords:	Finite-State Transducers
Subjects:	I. Information treatment for information services > IC. Index languages, processes and schemes.
Depositing user:	Carmen Galvez
Date deposited:	06 Aug 2007
Last modified:	02 Oct 2014 12:06
URI:	http://hdl.handle.net/10760/8818
