The Development of an audio-visual aggregator: the Open Audio-Visual Archives (OAVA) project in Greece

Malliari, Afrodite, Nitsos, Ilias, Zapounidou, Sofia and Doropoulos, Stavros. The Development of an audio-visual aggregator: the Open Audio-Visual Archives (OAVA) project in Greece. 2022. [Other]

tilt_newsletterJune2022_OAVA.pdf - Published version
Available under License Creative Commons Attribution.

English abstract

A large amount of audiovisual material is available online, maintained by public and private organizations and institutions. It ranges from videos and narrations of historical and everyday-life events to recordings of scientific, academic, and cultural events. Despite this apparent availability, finding useful resources is not as easy as one might think, mainly because of poor indexing by search engines and the lack of metadata schemes. Aggregation services largely solve the problem of dispersed audiovisual resources and diverse resource providers. In Greece, however, there is no single reference point for searching and accessing audiovisual material, and a Greek National Registry of audiovisual providers has not yet been implemented. The Open Audio-Visual Archives (OAVA) project aims to gather audiovisual material that is of Greek interest or contains Greek speech. It provides a unified search mechanism over both the aggregated metadata of the audiovisual material and its searchable content. Deep learning models are used to develop algorithms that perform Automatic Speech Recognition in Greek and in English. During the project, 500 Greek-language audiovisual content providers were reviewed, and 233 of them were found eligible according to specific criteria. The metadata used by each provider were mainly application-specific schemes that did not follow a standard metadata scheme. These were recorded and mapped to VuFind and to the EBUCore schema. The open-source software VuFind was configured to operate as the basis of the unified search platform. This article briefly presents the objectives of the project, the selection criteria, and the results for content and audiovisual providers in Greece. It also gives an overview of licensing issues, the OAVA metadata scheme, and the basic functions of the search mechanism.
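
The mapping of provider-specific metadata to a common schema, as mentioned in the abstract, can be pictured as a simple crosswalk. The following is only a minimal sketch under stated assumptions: the provider field names (`video_title`, `speaker`, and so on) and the target field names are hypothetical and are not taken from the OAVA metadata scheme, EBUCore, or the VuFind index.

```python
from typing import Dict, List

# Hypothetical crosswalk from one provider's field names to common field names.
# Both sides are illustrative assumptions, not the project's actual mapping.
CROSSWALK: Dict[str, str] = {
    "video_title": "title",
    "speaker": "creator",
    "recorded_on": "date",
    "lang": "language",
    "summary": "description",
}


def map_record(provider_record: Dict[str, str]) -> Dict[str, List[str]]:
    """Rename known provider fields; collect unknown ones under 'unmapped'."""
    mapped: Dict[str, List[str]] = {}
    for field, raw_value in provider_record.items():
        value = raw_value.strip()
        if not value:
            continue  # ignore empty values
        if field in CROSSWALK:
            mapped.setdefault(CROSSWALK[field], []).append(value)
        else:
            # Keep the original field name so nothing is silently lost.
            mapped.setdefault("unmapped", []).append(f"{field}={value}")
    return mapped


record = {
    "video_title": "Lecture on Byzantine art",
    "speaker": "A. Example",
    "lang": "el",
    "views": "120",
}
result = map_record(record)
print(result)
```

Keeping unmapped fields visible, instead of discarding them, makes it easier to review which provider fields still need a decision when a new provider is added.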

Item type: Other
Keywords: Audiovisual material; speech to text technologies; cultural heritage; open access; content aggregators
Subjects: I. Information treatment for information services > IE. Data and metadata structures.
I. Information treatment for information services > IG. Information presentation: hypertext, hypermedia.
I. Information treatment for information services > IK. Design, development, implementation and maintenance
I. Information treatment for information services > IM. Open data
Depositing user: Sofia Zapounidou
Date deposited: 08 Jul 2022 06:47
Last modified: 08 Jul 2022 06:47
URI: http://hdl.handle.net/10760/43379
