Persian Causality Corpus (PerCause) and the Causality Detection Benchmark

Rahimi, Zeinab and ShamsFard, Mehrnoush. Persian Causality Corpus (PerCause) and the Causality Detection Benchmark. 2022 [Article]

JIPM_Volume 38_Issue 2_Pages 273-303.pdf

English abstract

Recognizing causal elements and causal relations in text is a challenging problem in natural language processing (NLP), particularly in low-resource languages such as Persian. In this research, we prepare a human-annotated causality corpus for the Persian language. The corpus consists of 4446 sentences and 5128 causal relations. Where possible, three labels (Cause, Effect, and Causal mark) are assigned to each relation. We used this corpus to train a system for detecting the boundaries of causal elements. We also present a causality detection benchmark for three machine-learning methods and two deep learning systems based on this corpus. Performance evaluations indicate that the best overall result is obtained with the CRF classifier, which achieves an F-measure of 0.76. In addition, the best accuracy (91.4) is obtained with the BiLSTM-CRF deep learning method.
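
As an illustration of what detecting causal element boundaries can look like as a sequence-labeling task, the following minimal sketch encodes Cause, Effect, and Causal mark spans as BIO tags and scores a prediction by exact span match. The BIO scheme, the label names `CAUSE`, `EFFECT`, and `MARK`, the exact-match F-measure, and the English toy sentence are all assumptions made for illustration; the abstract does not state the corpus format or the evaluation protocol, and the example is not taken from PerCause.

```python
from typing import List, Tuple

# (label, start, end_exclusive) over token indices; a hypothetical span format.
Span = Tuple[str, int, int]


def spans_to_bio(n_tokens: int, spans: List[Span]) -> List[str]:
    """Encode non-overlapping labeled spans as one BIO tag per token."""
    tags = ["O"] * n_tokens
    for label, start, end in spans:
        tags[start] = f"B-{label}"
        for i in range(start + 1, end):
            tags[i] = f"I-{label}"
    return tags


def bio_to_spans(tags: List[str]) -> List[Span]:
    """Decode BIO tags back into labeled spans."""
    spans: List[Span] = []
    label, start = None, 0
    for i, tag in enumerate(tags + ["O"]):  # sentinel closes a trailing span
        continues = tag.startswith("I-") and tag[2:] == label
        if label is not None and not continues:
            spans.append((label, start, i))
            label = None
        if tag.startswith("B-"):
            label, start = tag[2:], i
    return spans


def span_f1(gold: List[Span], pred: List[Span]) -> float:
    """F-measure where a predicted span counts only on exact label and boundary match."""
    tp = len(set(gold) & set(pred))
    precision = tp / len(pred) if pred else 0.0
    recall = tp / len(gold) if gold else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Toy English sentence (assumed for illustration, not corpus data).
tokens = ["Heavy", "rain", "caused", "severe", "flooding"]
gold_spans = [("CAUSE", 0, 2), ("MARK", 2, 3), ("EFFECT", 3, 5)]
gold_tags = spans_to_bio(len(tokens), gold_spans)

# A hypothetical system output that misses the first token of the effect.
pred_tags = ["B-CAUSE", "I-CAUSE", "B-MARK", "O", "B-EFFECT"]
pred_spans = bio_to_spans(pred_tags)
score = span_f1(gold_spans, pred_spans)
```

In this sketch a boundary error on the Effect span costs the whole span, which is why span-level F-measure and token-level accuracy can rank systems differently.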

Item type: Article
Keywords: PerCause, Causality annotated corpus, causality detection, Deep Learning, CRF
Depositing user: elahe naseri
Date deposited: 14 Jan 2024 09:34
Last modified: 14 Jan 2024 09:34
URI: http://hdl.handle.net/10760/45232
