Detecting multiword phrases in mathematical text corpora

Gödert, Winfried (2012). Detecting multiword phrases in mathematical text corpora. [Report] (Unpublished)

English abstract

We present an approach for detecting multiword phrases in mathematical text corpora. The method is based on characteristic features of mathematical terminology. It uses a software tool named Lingo, which identifies words by means of previously defined dictionaries for specific word classes such as adjectives, personal names, or nouns. Multiword groups are then detected algorithmically. We discuss possible advantages of the method for indexing and information retrieval, as well as implications for applying dictionary-based methods of automatic indexing instead of stemming procedures.
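
The two steps named in the abstract (dictionary lookup of word classes, then algorithmic grouping) can be illustrated with a minimal Python sketch. This is not Lingo's API or the report's actual algorithm. The tiny dictionaries, the function names `word_class` and `detect_phrases`, and the grouping rule (any run of adjectives or personal names followed by one or more nouns) are assumptions chosen only for illustration.

```python
import re

# Toy dictionaries standing in for word-class lexicons (hypothetical entries).
ADJECTIVES = {"abelian", "compact", "riemannian", "partial", "differential"}
NAMES = {"hilbert", "banach", "riemann"}
NOUNS = {"group", "space", "manifold", "equation", "operator"}


def word_class(token):
    """Return a one-letter class code: A (adjective), P (personal name),
    N (noun), or x (not found in any dictionary)."""
    word = token.lower()
    if word in ADJECTIVES:
        return "A"
    if word in NAMES:
        return "P"
    if word in NOUNS:
        return "N"
    return "x"


def detect_phrases(text):
    """Find runs of (adjective | name)* noun+ that span at least two words.

    Each token maps to exactly one code character, so match offsets in the
    code string are also token indices.
    """
    tokens = re.findall(r"[^\W\d_]+", text)  # sequences of letters only
    codes = "".join(word_class(t) for t in tokens)
    return [
        " ".join(tokens[m.start():m.end()]).lower()
        for m in re.finditer(r"[AP]*N+", codes)
        if m.end() - m.start() >= 2
    ]


sentence = ("Every compact Riemannian manifold admits a partial "
            "differential equation on a Hilbert space.")
phrases = detect_phrases(sentence)
print(phrases)
```

Because the lookup is dictionary-based, the grouping step sees word classes rather than stemmed strings, which is the contrast with stemming that the abstract raises. A realistic setup would need far larger dictionaries and a rule set tuned to mathematical terminology.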

Item type: Report
Keywords: Automatic indexing, Multiword phrases, Mathematical text
Subjects: I. Information treatment for information services > IC. Index languages, processes and schemes.
L. Information technology and library technology > LL. Automated language processing.
Depositing user: Winfried Gödert
Date deposited: 05 Oct 2012
Last modified: 02 Oct 2014 12:23
URI: http://hdl.handle.net/10760/17742
