How to automate the extraction and analysis of information for educational purposes

Calvera-Isabal, Miriam, Santos, Patricia, Hoppe, H.-Ulrich and Schulten, Cleo. How to automate the extraction and analysis of information for educational purposes. Comunicar, 2023, vol. 31, n. 74, pp. 23-35. [Journal article (Paginated)]

Text (Research article (English)): 10.3916_C74-2023-02-english.pdf - Published version
Available under License Creative Commons Attribution.

Text (Research article (Spanish)): 10.3916_C74-2023-02.pdf - Published version
Available under License Creative Commons Attribution.

English abstract

There is increasing interest and growing practice in Citizen Science (CS), accompanied by the use of websites for communication as well as for capturing and processing data and materials. From an educational perspective, integrating information about CS into formal educational settings is expected to inspire teachers to create learning activities. This makes CS an interesting case for using bots to automate data extraction from online CS platforms in order to better understand their use in educational contexts. Although this information is publicly available, its collection has to comply with GDPR rules. This paper aims to explain (1) how CS is communicated and promoted on websites, (2) how web scraping methods and anonymization techniques have been designed, developed and applied to collect information from online sources and (3) how these data could be used for educational purposes. An analysis of 72 websites shows, among other results, that only 24.8% include detailed information about the CS project and 48.61% include information about educational purposes or materials.
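The abstract does not give implementation details, so the following is only a minimal sketch of two ideas it names: a bot that respects a site's crawling rules, and anonymization of personal fields before storage. The robots.txt content, the field names, the bot name and the choice of a keyed SHA-256 digest are all assumptions made for illustration and are not taken from the paper.

```python
import hashlib
import hmac
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; a real bot would download it from the site.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def may_fetch(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt rules allow user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


def pseudonymize(value: str, secret_key: bytes) -> str:
    """Replace a personal identifier with a keyed SHA-256 digest (hex string)."""
    return hmac.new(secret_key, value.encode("utf-8"), hashlib.sha256).hexdigest()


def anonymize_record(record: dict, personal_fields: set, secret_key: bytes) -> dict:
    """Return a copy of record with the listed personal fields replaced by digests."""
    return {
        key: pseudonymize(value, secret_key) if key in personal_fields else value
        for key, value in record.items()
    }


if __name__ == "__main__":
    # Hypothetical scraped record; "contact" is assumed to be personal data.
    scraped = {"project": "Bird count", "contact": "Jane Doe"}
    if may_fetch(ROBOTS_TXT, "cs-edu-bot", "https://example.org/projects/1"):
        print(anonymize_record(scraped, {"contact"}, b"replace-with-a-secret-key"))
```

A keyed digest keeps records from the same contributor linkable without storing the name itself. Whether this level of protection is sufficient under GDPR depends on the context and is not settled by the sketch.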

Spanish abstract

El interés y la práctica de la ciencia ciudadana (CC) han aumentado en los últimos años. Esto ha derivado en el uso de páginas web como herramienta de comunicación, recolección o análisis de datos o repositorio de materiales y recursos. Desde una perspectiva educativa, se espera que al integrar información sobre proyectos de CC en un entorno educativo formal, se inspire a los maestros a crear actividades de aprendizaje. Este es un caso interesante para usar bots que automaticen el proceso de extracción de datos de webs de CC que ayuden a comprender mejor su uso en contextos educativos. Aunque esta información está disponible públicamente, se deben seguir las reglas de la ley de protección de datos o GDPR. Este artículo tiene como objetivo explicar: 1) cómo la CC se comunica y promueve en los sitios web; 2) cómo se diseñan, desarrollan y aplican los métodos de web scraping y las técnicas de anonimización para recopilar información en línea; y 3) cómo se podrían usar estos datos con fines educativos. Tras el análisis de 72 webs, algunos de los resultados son que solo el 24,8% incluye información detallada sobre el proyecto, y el 48,61% incluye información sobre propósitos o materiales educativos.

Item type: Journal article (Paginated)
Keywords: Citizen science; informal learning; algorithms; automatization; education; privacy protection; Ciencia ciudadana; aprendizaje informal; algoritmos; automatización; educación; protección de la privacidad
Subjects: B. Information use and sociology of information > BJ. Communication
G. Industry, profession and education.
G. Industry, profession and education. > GH. Education.
Depositing user: Alex Ruiz
Date deposited: 09 Jan 2023 06:11
Last modified: 09 Jan 2023 06:11
URI: http://hdl.handle.net/10760/43877
