The Role of Vocabularies in the Age of Data: The Question of Research Data

Marcondes, Carlos H. The Role of Vocabularies in the Age of Data: The Question of Research Data. Knowledge Organization, 2022, vol. 49, n. 7, pp. 467-482. [Journal article (Paginated)]

English abstract

Objective: This paper discusses the role of vocabularies in addressing the issues associated with Big Data.

Methodology: The materials used are definitions of Big Data found in the literature, standards and technologies used in the Semantic Web and Linked Open Data, and the use case of a research dataset. We use the conceptual bases of semiotics and ontology to analyze the role of vocabularies in knowledge organization (KO), treating the assignment of subjects to documents as a special, limited use case that may be expanded within this context.

Results: We develop and expand the conception of data as an artificial, intentional construction that represents a property of an entity within a specific domain and serves as the essential component of Big Data. We present a comprehensive conceptualization of semantic expressivity and use it to classify the different vocabularies. We suggest and specify features for vocabularies that may be used within the context of the Semantic Web and Linked Open Data to assign machine-processable semantics to Big Data. We identify computational ontologies as a type of knowledge organization system with a higher degree of semantic expressivity. We suggest that these themes be incorporated into professional qualifications in KO.

Item type: Journal article (Paginated)
Keywords: Big Data, research data, vocabulary, Semantic Web
Subjects: H. Information sources, supports, channels. > HZ. None of these, but in this section.
I. Information treatment for information services > IE. Data and metadata structures.
Depositing user: Carlos Marcondes
Date deposited: 10 May 2023 06:20
Last modified: 10 May 2023 06:20
URI: http://hdl.handle.net/10760/44301
