An evaluation of conflation accuracy using finite-state transducers

Galvez, Carmen and De-Moya-Anegón, Félix. An evaluation of conflation accuracy using finite-state transducers. Journal of Documentation, 2006, vol. 62, n. 3. [Journal article (Unpaginated)]

English abstract

Purpose – To evaluate the accuracy of conflation methods based on finite-state transducers (FSTs).

Design/methodology/approach – Incorrectly lemmatized and stemmed forms may lead to the retrieval of inappropriate documents. Most experimental studies to date have focused on retrieval performance; very few have examined conflation performance. The normalization process we used relied on a linguistic toolbox that allowed us to construct, through graphical interfaces, electronic dictionaries represented internally by FSTs. The lexical resources developed were applied to a Spanish test corpus to merge term variants into canonical lemmatized forms. Conflation performance was evaluated with an adaptation of the recall and precision measures, based on accuracy and coverage rather than on actual retrieval. The results were compared with those obtained using a Spanish version of the Porter algorithm.

Findings – We conclude that the main strength of lemmatization using finite-state technology is its accuracy, whereas its main limitation is the underanalysis of variant forms.

Originality/value – The report outlines the potential of transducers in their application to normalization processes.

Item type:	Journal article (Unpaginated)
Keywords:	Natural Language Processing; Finite-State Transducers; Information Retrieval; Conflation; Linguistics; Semantics; Programming and algorithm theory; Accuracy
Subjects:	L. Information technology and library technology > LM. Automatic text retrieval.
Depositing user:	Carmen Galvez
Date deposited:	08 Aug 2007
Last modified:	02 Oct 2014 12:09
URI:	http://hdl.handle.net/10760/10184
