Automatic Keyword Extraction from Documents Using Conditional Random Fields

Zhang, Chengzhi. Automatic Keyword Extraction from Documents Using Conditional Random Fields. Journal of Computational Information Systems, 2008, vol. 4, n. 3, pp. 1169-1180. [Journal article (Paginated)]

English abstract

Keywords are a subset of the words or phrases in a document that describe its meaning. Many text mining applications can take advantage of them. Unfortunately, a large portion of documents still have no keywords assigned, and manual assignment of high-quality keywords is expensive, time-consuming, and error-prone. Many algorithms and systems have therefore been proposed to help people extract keywords automatically. The Conditional Random Fields (CRF) model is a state-of-the-art sequence labeling method that can exploit the features of documents more fully and effectively. Keyword extraction, in turn, can be treated as a string labeling task. In this paper, keyword extraction based on CRF is proposed and implemented. As far as we know, the use of the CRF model for keyword extraction has not been investigated previously. Experimental results show that the CRF model outperforms other machine learning methods, such as support vector machines and multiple linear regression, on the keyword extraction task.
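
To illustrate what treating keyword extraction as a labeling task can look like, the following minimal sketch converts a tokenized sentence and its known keyword phrases into one label per token. The B-KW / I-KW / O tag names and the `bio_labels` helper are assumptions made for this illustration; they are not taken from the paper, whose label set and features may differ. A CRF would then be trained on such label sequences together with per-token features, a step not shown here.

```python
from typing import List, Sequence


def bio_labels(tokens: Sequence[str],
               keywords: Sequence[Sequence[str]]) -> List[str]:
    """Tag each token as B-KW (keyword start), I-KW (inside a keyword) or O.

    Hypothetical helper: the tag names are an illustrative assumption.
    """
    labels = ["O"] * len(tokens)
    lowered = [t.lower() for t in tokens]
    # Try longer phrases first so a short keyword cannot split a longer one.
    for phrase in sorted(keywords, key=len, reverse=True):
        target = [w.lower() for w in phrase]
        n = len(target)
        if n == 0:
            continue
        for start in range(len(tokens) - n + 1):
            # Only label a span whose tokens are all still unlabeled.
            window_free = all(lab == "O" for lab in labels[start:start + n])
            if window_free and lowered[start:start + n] == target:
                labels[start] = "B-KW"
                for i in range(start + 1, start + n):
                    labels[i] = "I-KW"
    return labels


tokens = "Conditional random fields are used for keyword extraction".split()
keywords = [["conditional", "random", "fields"], ["keyword", "extraction"]]
labels = bio_labels(tokens, keywords)
for token, label in zip(tokens, labels):
    print(f"{token}\t{label}")
```

Each token receives exactly one label, so a document becomes a sequence of (token, label) pairs, which is the input shape a sequence labeling model expects.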

Item type:	Journal article (Paginated)
Keywords:	Keywords Extraction; Conditional Random Fields; Automatic Indexing; Machine Learning
Subjects:	L. Information technology and library technology > LL. Automated language processing.
Depositing user:	Chengzhi Zhang
Date deposited:	23 Sep 2008
Last modified:	02 Oct 2014 12:12
URI:	http://hdl.handle.net/10760/12305
