[examples/turing] - rename it to turing

This commit is contained in:
Luccas Mateus de Medeiros Gomes
2023-05-11 16:13:09 -03:00
parent 82773b5e8a
commit 7822440f0d
43 changed files with 0 additions and 0 deletions

@@ -0,0 +1,14 @@
---
title: AbuseEval v1.0
link-to-publication: http://www.lrec-conf.org/proceedings/lrec2020/pdf/2020.lrec-1.760.pdf
link-to-data: https://github.com/tommasoc80/AbuseEval
task-description: Explicitness annotation of offensive and abusive content
details-of-task: "Enriched version of the OffensEval/OLID dataset that distinguishes explicit from implicit offensive messages and adds a new dimension for abusive messages. Labels for offensive language: EXPLICIT, IMPLICIT, NOT; Labels for abusive language: EXPLICIT, IMPLICIT, NOTABU"
size-of-dataset: 14100
percentage-abusive: 20.75
language: English
level-of-annotation: ["Tweets"]
platform: ["Twitter"]
medium: ["Text"]
reference: "Caselli, T., Basile, V., Mitrović, J., Kartoziya, I., and Granitzer, M. 2020. \"I feel offended, don't be abusive! Implicit/explicit messages in offensive and abusive language\". In Proceedings of the 12th Language Resources and Evaluation Conference (pp. 6193-6202). European Language Resources Association."
---

@@ -0,0 +1,16 @@
---
title: "Abusive Language Detection on Arabic Social Media (Al Jazeera)"
link-to-publication: https://www.aclweb.org/anthology/W17-3008
link-to-data: http://alt.qcri.org/~hmubarak/offensive/AJCommentsClassification-CF.xlsx
task-description: Ternary (Obscene, Offensive but not obscene, Clean)
details-of-task: Incivility
size-of-dataset: 32000
percentage-abusive: 81
language: Arabic
level-of-annotation: ["Posts"]
platform: ["AlJazeera"]
medium: ["Text"]
reference: "Mubarak, H., Darwish, K. and Magdy, W., 2017. Abusive Language Detection on Arabic Social Media. In: Proceedings of the First Workshop on Abusive Language Online. Vancouver, Canada: Association for Computational Linguistics, pp.52-56."
---

@@ -0,0 +1,14 @@
---
title: "CoRAL: a Context-aware Croatian Abusive Language Dataset"
link-to-publication: https://aclanthology.org/2022.findings-aacl.21/
link-to-data: https://github.com/shekharRavi/CoRAL-dataset-Findings-of-the-ACL-AACL-IJCNLP-2022
task-description: Multi-class based on context dependency categories (CDC)
details-of-task: Detecting CDC from abusive comments
size-of-dataset: 2240
percentage-abusive: 100
language: "Croatian"
level-of-annotation: ["Posts"]
platform: ["Posts"]
medium: ["Newspaper Comments"]
reference: "Ravi Shekhar, Mladen Karan and Matthew Purver (2022). CoRAL: a Context-aware Croatian Abusive Language Dataset. Findings of the ACL: AACL-IJCNLP."
---

@@ -0,0 +1,14 @@
---
title: Detecting Abusive Albanian
link-to-publication: https://arxiv.org/abs/2107.13592
link-to-data: https://doi.org/10.6084/m9.figshare.19333298.v1
task-description: Hierarchical (offensive/not; untargeted/targeted; person/group/other)
details-of-task: Detect and categorise abusive language in social media data
size-of-dataset: 11874
percentage-abusive: 13.2
language: Albanian
level-of-annotation: ["Posts"]
platform: ["Instagram", "Youtube"]
medium: ["Text"]
reference: "Nurce, E., Keci, J., Derczynski, L., 2021. Detecting Abusive Albanian. arXiv:2107.13592"
---

@@ -0,0 +1,15 @@
---
title: "Hate Speech Detection in the Bengali language: A Dataset and its Baseline Evaluation"
link-to-publication: https://arxiv.org/pdf/2012.09686.pdf
link-to-data: https://www.kaggle.com/naurosromim/bengali-hate-speech-dataset
task-description: Binary (hateful, not)
details-of-task: "Several categories: sports, entertainment, crime, religion, politics, celebrity and meme"
size-of-dataset: 30000
percentage-abusive: 33
language: Bengali
level-of-annotation: ["Posts"]
platform: ["Youtube", "Facebook"]
medium: ["Text"]
reference: "Romim, N., Ahmed, M., Talukder, H., & Islam, M. S. (2021). Hate speech detection in the bengali language: A dataset and its baseline evaluation. In Proceedings of International Joint Conference on Advances in Computational Intelligence (pp. 457-468). Springer, Singapore."
---

@@ -0,0 +1,14 @@
---
title: Large-Scale Hate Speech Detection with Cross-Domain Transfer
link-to-publication: https://aclanthology.org/2022.lrec-1.238/
link-to-data: https://github.com/avaapm/hatespeech
task-description: Three-class (Hate speech, Offensive language, None)
details-of-task: Hate speech detection on social media (Twitter) including 5 target groups (gender, race, religion, politics, sports)
size-of-dataset: "100k English (27593 hate, 30747 offensive, 41660 none)"
percentage-abusive: 58.3
language: English
level-of-annotation: ["Posts"]
platform: ["Twitter"]
medium: ["Text", "Image"]
reference: "Cagri Toraman, Furkan Şahinuç, Eyup Yilmaz. 2022. Large-Scale Hate Speech Detection with Cross-Domain Transfer. In Proceedings of the Thirteenth Language Resources and Evaluation Conference, pages 2215-2225, Marseille, France. European Language Resources Association."
---

@@ -0,0 +1,14 @@
---
title: "Let-Mi: An Arabic Levantine Twitter Dataset for Misogynistic Language"
link-to-publication: https://arxiv.org/abs/2103.10195
link-to-data: https://drive.google.com/file/d/1mM2vnjsy7QfUmdVUpKqHRJjZyQobhTrW/view
task-description: Binary (misogyny/none) and Multi-class (none, discredit, derailing, dominance, stereotyping & objectification, threat of violence, sexual harassment, damning)
details-of-task: Introducing an Arabic Levantine Twitter dataset for Misogynistic language
size-of-dataset: 6603
percentage-abusive: 48.76
language: Arabic
level-of-annotation: ["Posts"]
platform: ["Twitter"]
medium: ["Text", "Images"]
reference: "Hala Mulki and Bilal Ghanem. 2021. Let-Mi: An Arabic Levantine Twitter Dataset for Misogynistic Language. In Proceedings of the Sixth Arabic Natural Language Processing Workshop, pages 154-163, Kyiv, Ukraine (Virtual). Association for Computational Linguistics."
---

@@ -0,0 +1,14 @@
---
title: Measuring Hate Speech
link-to-publication: https://arxiv.org/abs/2009.10277
link-to-data: https://huggingface.co/datasets/ucberkeley-dlab/measuring-hate-speech
task-description: 10 ordinal labels (sentiment, (dis)respect, insult, humiliation, inferior status, violence, dehumanization, genocide, attack/defense, hate speech), which are debiased and aggregated into a continuous hate speech severity score (hate_speech_score) that includes a region for counterspeech & supportive speech. Includes 8 target identity groups (race/ethnicity, religion, national origin/citizenship, gender, sexual orientation, age, disability, political ideology) and 42 identity subgroups.
details-of-task: Hate speech measurement on social media in English
size-of-dataset: "39,565 comments annotated by 7,912 annotators on 10 ordinal labels, for 1,355,560 total labels."
percentage-abusive: 25
language: English
level-of-annotation: ["Social media comment"]
platform: ["Twitter", "Reddit", "Youtube"]
medium: ["Text"]
reference: "Kennedy, C. J., Bacon, G., Sahn, A., & von Vacano, C. (2020). Constructing interval variables via faceted Rasch measurement and multitask deep learning: a hate speech application. arXiv preprint arXiv:2009.10277."
---

@@ -0,0 +1,14 @@
---
title: Offensive Language and Hate Speech Detection for Danish
link-to-publication: http://www.derczynski.com/papers/danish_hsd.pdf
link-to-data: https://figshare.com/articles/Danish_Hate_Speech_Abusive_Language_data/12220805
task-description: "Branching structure of tasks: Binary (Offensive, Not), Within Offensive (Target, Not), Within Target (Individual, Group, Other)"
details-of-task: Group-directed + Person-directed
size-of-dataset: 3600
percentage-abusive: 12
language: Danish
level-of-annotation: ["Posts"]
platform: ["Twitter", "Reddit", "Newspaper comments"]
medium: ["Text"]
reference: "Sigurbergsson, G. and Derczynski, L., 2019. Offensive Language and Hate Speech Detection for Danish. ArXiv."
---