How to automate the extraction and analysis of information for educational purposes

Calvera-Isabal, Miriam, Santos, Patricia, Hoppe, H.-Ulrich and Schulten, Cleo. How to automate the extraction and analysis of information for educational purposes. Comunicar, 2023, vol. 31, n. 74, pp. 23-35. [Journal article (Paginated)]

Text (Research article (English))
10.3916_C74-2023-02-english.pdf - Published version
Available under License Creative Commons Attribution.

Text (Research article (Spanish))
10.3916_C74-2023-02.pdf - Published version
Available under License Creative Commons Attribution.


English abstract

There is increasing interest and growing practice in Citizen Science (CS), which goes along with the use of websites for communication as well as for capturing and processing data and materials. From an educational perspective, integrating information about CS in a formal educational setting is expected to inspire teachers to create learning activities. This is an interesting case for using bots to automate the extraction of data from online CS platforms in order to better understand their use in educational contexts. Although this information is publicly available, its collection has to follow GDPR rules. This paper aims to explain (1) how CS is communicated and promoted on websites, (2) how web scraping methods and anonymization techniques have been designed, developed and applied to collect information from online sources, and (3) how these data could be used for educational purposes. An analysis of 72 websites shows, among other results, that only 24.8% include detailed information about the CS project and 48.61% include information about educational purposes or materials.

Spanish abstract

El interés y la práctica de la ciencia ciudadana (CC) han aumentado en los últimos años. Esto ha derivado en el uso de páginas web como herramienta de comunicación, recolección o análisis de datos, o como repositorio de materiales y recursos. Desde una perspectiva educativa, se espera que al integrar información sobre proyectos de CC en un entorno educativo formal, se inspire a los maestros a crear actividades de aprendizaje. Este es un caso interesante para usar bots que automaticen el proceso de extracción de datos de webs de CC y ayuden a comprender mejor su uso en contextos educativos. Aunque esta información está disponible públicamente, se deben seguir las reglas de la ley de protección de datos o GDPR. Este artículo tiene como objetivo explicar: 1) cómo la CC se comunica y promueve en los sitios web; 2) cómo se diseñan, desarrollan y aplican los métodos de web scraping y las técnicas de anonimización para recopilar información en línea; y 3) cómo se podrían usar estos datos con fines educativos. Tras el análisis de 72 webs, algunos de los resultados son que solo el 24,8% incluye información detallada sobre el proyecto y el 48,61% incluye información sobre propósitos o materiales educativos.

Item type: Journal article (Paginated)
Keywords: Citizen science; informal learning; algorithms; automatization; education; privacy protection; Ciencia ciudadana; aprendizaje informal; algoritmos; automatización; educación; protección de la privacidad
Subjects: B. Information use and sociology of information > BJ. Communication
G. Industry, profession and education.
G. Industry, profession and education. > GH. Education.
Depositing user: Alex Ruiz
Date deposited: 09 Jan 2023 06:11
Last modified: 09 Jan 2023 06:11



