Keywords given by authors of scientific articles in database descriptors

Gil-Leiva, Isidoro and Alonso-Arroyo, Adolfo. Keywords given by authors of scientific articles in database descriptors. Journal of the American Society for Information Science and Technology, 2007, vol. 58, n. 8, pp. 1175-1187. [Journal article (Paginated)]

English abstract

This paper analyses the keywords given by authors of scientific articles and the descriptors assigned to the articles in order to ascertain the presence of the keywords in the descriptors. A total of 640 records from the INSPEC, CAB Abstracts, ISTA and LISA databases were consulted. After detailed comparisons it was found that keywords provided by authors have an important presence in the database descriptors studied: nearly 25% of all the keywords appeared in exactly the same form as descriptors, and another 21%, although normalized, are still detected in the descriptors. This means that almost 46% of keywords appear in the descriptors, either as such or after normalization. In addition, three distinct indexing policies appear: one represented by INSPEC and LISA (indexers seem to have freedom to assign the descriptors they deem necessary); another represented by CAB (no record has fewer than four descriptors and, in general, a large number of descriptors is employed); and, in contrast, ISTA, where there appears to be a certain institutional directive towards economy in indexing, since 84% of records contain only four descriptors.
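The comparison described in the abstract (keyword found as an identical descriptor, found only after normalization, or not found) can be illustrated with a minimal Python sketch. This is not the authors' procedure: the abstract does not state the normalization rules used, so the `normalize` function below (lowercasing, accent stripping, hyphen removal, naive plural stripping) is an assumption, and the function names and sample terms are hypothetical.

```python
import unicodedata


def normalize(term: str) -> str:
    """Assumed normalization: strip accents, lowercase, treat hyphens as
    spaces, and drop a trailing plural 's' from each word."""
    decomposed = unicodedata.normalize("NFKD", term)
    no_accents = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    words = no_accents.lower().replace("-", " ").split()
    singular = [
        w[:-1] if len(w) > 3 and w.endswith("s") and not w.endswith("ss") else w
        for w in words
    ]
    return " ".join(singular)


def classify_keywords(keywords, descriptors):
    """Label each author keyword as 'exact', 'normalized' or 'absent'
    with respect to the descriptors of the same record."""
    exact_forms = set(descriptors)
    normalized_forms = {normalize(d) for d in descriptors}
    labels = {}
    for keyword in keywords:
        if keyword in exact_forms:
            labels[keyword] = "exact"
        elif normalize(keyword) in normalized_forms:
            labels[keyword] = "normalized"
        else:
            labels[keyword] = "absent"
    return labels


def presence_rates(labels):
    """Share of keywords in each category (0.0 for every category if empty)."""
    counts = {"exact": 0, "normalized": 0, "absent": 0}
    for label in labels.values():
        counts[label] += 1
    total = len(labels)
    return {name: (count / total if total else 0.0) for name, count in counts.items()}


# Invented sample record, for illustration only (not data from the study).
author_keywords = ["Indexing", "Journal paper", "Thesauri", "Databases"]
record_descriptors = ["Indexing", "Journal papers", "Databases", "Information retrieval"]

labels = classify_keywords(author_keywords, record_descriptors)
rates = presence_rates(labels)
print(labels)
print(rates)
```

Summing the `exact` and `normalized` shares over all records would give a figure comparable in kind to the combined percentage reported in the abstract, under the normalization assumed here.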

Spanish abstract

Se analizan las palabras clave proporcionadas por los autores de los artículos científicos y los descriptores asignados a esos artículos para averiguar la presencia de las mismas en los descriptores. Para ello, se manejaron 640 registros de las Bases de datos INSPEC, CAB Abstracts, ISTA y LISA. Una vez realizadas minuciosamente las comparaciones se concluyó que las palabras clave de los autores tienen una presencia importante en los descriptores de las bases de datos estudiadas, ya que el 25% de todas las palabras clave aparecen exactamente igual como descriptores; y otro 21% aunque ha sufrido un proceso de normalización se sigue detectando en los descriptores, lo que propicia que el 45% de las palabras clave aparezcan exactas o normalizadas como descriptores. Por otra parte, se observa lo que parecen tres políticas de indización distintas, una representada por INSPEC y LISA (los indizadores parecen tener libertad para asignar los descriptores que estimen necesarios); otra por CAB (ningún registro posee menos de cuatro descriptores y en general, se emplea un cuantioso número de descriptores); y por el contrario, en ISTA se percibe cierta consigna institucional hacia el ahorro en la indización, ya que el 84% de los registros tiene sólo cuatro descriptores.

Item type: Journal article (Paginated)
Keywords: Descriptors ; Indexing ; Automatic indexing ; Journal papers ; Databases ; Comparative study ; INSPEC ; CAB ; ISTA ; LISA ; Descriptores ; Indización ; Indización automática ; Artículos de revista ; Bases de datos ; Estudio comparativo ; INSPEC ; CAB ; ISTA ; LISA.
Subjects: I. Information treatment for information services > IB. Content analysis (A and I, class.)
Depositing user: Isidoro Gil Leiva
Date deposited: 10 Jun 2008
Last modified: 02 Oct 2014 12:11
URI: http://hdl.handle.net/10760/11726

