SKRIPTOR - program za raščlanjivanje bibliografskih informacija [SKRIPTOR: a program for parsing bibliographic information]

Pajić, Dejan, Šipka, Pero and Kosanović, Biljana. SKRIPTOR - program za raščlanjivanje bibliografskih informacija. Infoteka, 2002, vol. 3, n. 1-2, pp. 13-22. [Journal article (Paginated)]

English abstract

Skriptor, a program developed for use in maintaining SocioFact and designed for parsing journal contents and references, is described. Using auxiliary databases (e.g. lists of authors' and publishers' names) and simple algorithms for processing Serbian as a natural language, the program recognizes the elements of journal contents and article references (e.g. author name, book title, journal title, page numbers) and assigns a standardized label to each element, enabling automatic transfer of the information into the corresponding database field. Apart from the basic parsing module, the program provides subroutines for converting between character sets, (de)capitalizing words according to orthographic rules, inverting the order of an author's name and surname, filling in missing data, and interactively checking and correcting the parsed information. Skriptor comes with an installation program and a detailed help file, which gives operators specific instructions on using the program effectively and defines the bibliographic standards used in maintaining SocioFact. Skriptor is written in Visual Basic for Applications as a Microsoft Word template.
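
The labelling step described in the abstract can be illustrated with a minimal sketch. It is not Skriptor's code (which is written in Visual Basic for Applications and is not reproduced in this record); it is a Python illustration under stated assumptions: references follow the "Authors (Year) Title, Journal, Volume (Issue), Pages" pattern used in the reference list of this record, the journal list `KNOWN_JOURNALS` stands in for the auxiliary databases, and the field labels (`AU`, `PY`, `TI`, `SO`, `VL`, `IS`, `PG`) are hypothetical.

```python
import re

# Hypothetical auxiliary list of journal titles; stands in for the
# helper databases mentioned in the abstract.
KNOWN_JOURNALS = ["Library Trends", "IEEE Computer"]

# "Authors (Year) rest-of-reference"
REF_HEAD = re.compile(r"^(?P<authors>.+?)\s*\((?P<year>\d{4})\)\s*(?P<rest>.+)$")
# Trailing ", 182-208"
PAGES = re.compile(r",\s*(?P<pages>\d+-\d+)\s*$")
# Trailing ", 48 (1)"
VOLUME = re.compile(r",\s*(?P<volume>\d+)\s*\((?P<issue>\d+)\)\s*$")


def split_authors(authors):
    """Split 'Surname, I., Surname, I.' into ['Surname, I.', 'Surname, I.']."""
    parts = [p.strip() for p in authors.split(",") if p.strip()]
    return [f"{parts[i]}, {parts[i + 1]}" for i in range(0, len(parts) - 1, 2)]


def invert_name(name):
    """Turn 'Surname, Initials' into 'Initials Surname'."""
    surname, _, given = name.partition(",")
    return f"{given.strip()} {surname.strip()}".strip()


def parse_reference(ref):
    """Return a dict of labelled fields, or None if the pattern does not fit."""
    m = REF_HEAD.match(ref.strip())
    if m is None:
        return None
    fields = {"AU": split_authors(m.group("authors")), "PY": m.group("year")}
    rest = m.group("rest")

    # Peel recognizable elements off the end of the string, right to left.
    pm = PAGES.search(rest)
    if pm:
        fields["PG"] = pm.group("pages")
        rest = rest[:pm.start()]
    vm = VOLUME.search(rest)
    if vm:
        fields["VL"] = vm.group("volume")
        fields["IS"] = vm.group("issue")
        rest = rest[:vm.start()]

    # Use the auxiliary list to separate the source title from the article title.
    for journal in KNOWN_JOURNALS:
        suffix = ", " + journal
        if rest.endswith(suffix):
            fields["SO"] = journal
            rest = rest[:-len(suffix)]
            break

    fields["TI"] = rest.strip()
    return fields


if __name__ == "__main__":
    sample = ("Chowdhury, G.G. (1999) Template mining for information extraction "
              "from digital documents, Library Trends, 48 (1), 182-208")
    for label, value in parse_reference(sample).items():
        print(label, value)
```

Looking the source title up in a list, rather than guessing where the article title ends, reflects the role the abstract assigns to auxiliary databases; a title that itself contains commas would otherwise be split incorrectly. The sketch does not attempt the Serbian-language processing, character set conversion or interactive correction that the abstract also mentions.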

Serbian abstract

Opisan je Skriptor, program za raščlanjivanje sadržaja i referenci iz periodičnih publikacija, razvijen za potrebe održavanja baze podataka SocioFakta. S osloncem na pomoćne baze (liste autorskih imena, izdavača i sl.) i jednostavne algoritme za obradu srpskog kao prirodnog jezika, program automatski prepoznaje elemente iz sadržaja časopisa i referenci članaka (ime autora, naslov, izvor, kolaciju itd.) i dodeljuje im standardne labele, čime se obezbeđuje automatski transfer podataka u odgovarajuća polja baze. Pored osnovnog modula za raščlanjivanje, program sadrži potprograme za konverziju kodnih rasporeda, pretvaranje velikih slova u mala u skladu sa pravopisom, promenu redosleda autorskog imena i prezimena, dopunjavanje nedostajućih informacija, kao i interaktivnu kontrolu i korekciju raščlanjenog materijala. Skriptor sadrži program za instalaciju i detaljan sistem pomoći, koji operatora upućuje u upotrebu programa i upoznaje ga sa bibliografskim standardima koji se koriste pri izradi SocioFakta. Napisan je u jeziku Visual Basic for Application kao dodatak programu Microsoft Word.

Item type: Journal article (Paginated)
Keywords: bibliografske informacije, raščlanjivanje, bibliografske baze podataka, citatne informacije, softver, bibliographic information, parsing, bibliographic databases, citation information, software
Subjects: I. Information treatment for information services
Depositing user: Biljana Kosanovic
Date deposited: 02 Jul 2008
Last modified: 02 Oct 2014 12:11
URI: http://hdl.handle.net/10760/11821

References

Aiello, M., Monz, C., Todoran, L. (2000) Combining linguistic and spatial information for document analysis. In: Mariani, J., Harman, D. (eds.) Proceedings of RIAO '2000 Content-Based Multimedia Information Access, CID, 266-275

Bergmark, D. (2000) Automatic extraction of reference linking information from online documents, Technical Report TR 2000-1821, Cornell University - Computer Science Department, November, URL: http://www.cs.cornell.edu/cdlrg/Reference Linking/extraction.pdf, retrieved 25.02.2002.

Chowdhury, G.G. (1999) Template mining for information extraction from digital documents, Library Trends, 48 (1), 182-208

Connan, J., Omlin, C.W. (2000) Bibliography extraction with hidden Markov models, URL: http://citeseer.nj.nec.com/294556.html, retrieved 11.06.2001.

Ding, Y., Chowdhury, G.G., Foo, S. (1999) Template mining for the extraction of citation from digital documents. In: Second Asian Digital Libraries Conference, National Taiwan University, November 8-9

Giles, L.C., Bollacker, D., Lawrence, S. (1998) CiteSeer: An automatic citation indexing system. In: Witten I., Akscyn R., Shipman F.III (eds.) Digital Libraries - Third ACM Conference on Digital Libraries, ACM Press, New York, 89-98

Kosanović, B., Šipka, P. (1996) SocioFakt - Jugoslovenska baza za društvene činjeničke nauke. In: Kostić P. (ed.) Merenje u psihologiji, IKSI i Centar za primenjenu psihologiju, Beograd, 2, 85-95

Lawrence, S., Giles, L.C., Bollacker, D. (1999) Digital libraries and autonomous citation indexing, IEEE Computer, 32 (6), 67-71

Seymore, K., McCallum, A., Rosenfeld, R. (1999) Learning hidden Markov model structure for information extraction. In: AAAI '99 - Workshop on Machine Learning for Information Extraction
