News analysis for the detection of cyber security issues in digital healthcare: A text mining approach to uncover actors, attack methods and technologies for cyber defense

Bertl, Markus. News analysis for the detection of cyber security issues in digital healthcare: A text mining approach to uncover actors, attack methods and technologies for cyber defense. Young Information Scientist, 2019, vol. 4, pp. 1-15. [Journal article (Paginated)]

YIS_4_2019_1_Bertl.pdf - Published version

English abstract

Objectives: This research reviews the possibilities of text mining in the area of cybercrime in digital healthcare and shows how advanced information retrieval and natural language processing can be used to gain insights. The aim is to mine news data to find out what is reported about digital healthcare, which security-critical events have occurred, and which actors, attack methods, and technologies play a role. Methods: Several projects already apply text mining successfully in the cyber domain. However, none of them is specifically tailored to threats in the digital healthcare sector or uses a comparably large data foundation for analysis. To close this gap, text mining methods such as fact extraction and semantic fields were combined with statistical methods such as frequency, correlation, and trend calculations. The news data for the analysis was provided by the DocCenter of the National Defense Academy (DocCenter/NDA) of the Austrian Armed Forces. About 300,000 news articles were processed and analyzed. Additionally, the open source GDELT dataset was investigated. Results & Conclusion: The data indicate that cyber threats are present in digital health technologies and that cyberattacks increasingly threaten organizations, governments, and individuals. Not only hacker groups, firms, and governments are involved in these attacks; terrorist organizations also use cyberwarfare. This, together with the amount of technology in digital healthcare, such as pacemakers, IoT, and wearables, as well as the importance of healthcare as critical infrastructure and the dependence on electronic health records, makes our society vulnerable.

German abstract (translated into English)

Nachrichtenanalyse zur Erkennung von Cybersicherheitsproblemen im digitalen Gesundheitswesen: ein Text-Mining-Ansatz zur Aufdeckung von Akteuren, Angriffsmethoden und Technologien für die Cyber-Abwehr [title in German translation]. Objectives: This publication examines the possibilities that text mining offers in the area of cybercrime in digital healthcare. The aim of this work is to find out what is reported about digital healthcare, which actors operate in this domain, and which attack methods and technologies play a role. Research methods: Various projects use text mining successfully in the cyber domain, but none is specifically adapted to the requirements of the healthcare sector. To this end, different text mining methods, such as fact extraction and semantic fields, as well as statistical methods such as correlations and trend analyses were applied. The data came from the Zentraldokumentation der Landesverteidigungsakademie (ZentDok/LVAk) of the Austrian Armed Forces. In total, about 300,000 articles were evaluated. In addition, the metadata of the GDELT dataset was examined. Results & conclusion: Text mining is a central tool for cybersecurity and trend analysis. The data show that technologies in digital healthcare are steadily increasing and carry risks. These risks are deliberately exploited by organizations, states, and individuals. Terrorist groups, too, increasingly use methods of digital warfare as a complement to conventional terrorist attacks. Together with the penetration of healthcare by digital technologies such as pacemakers, IoT, and wearables, but also hospital information systems and electronic patient records, this shows the danger we are facing.

Item type: Journal article (Paginated)
Keywords: digital healthcare, cybercrime, text mining, media mining, new technologies, Watson Explorer, GDELT, OSInfo, OSINT, association rule mining
Subjects: L. Information technology and library technology > LH. Computer and network security.
Depositing user: Austrian E-LIS editors
Date deposited: 12 Nov 2019 07:25
Last modified: 23 Feb 2021 12:25
URI: http://hdl.handle.net/10760/39224

