datahub/examples/turing/content/datasets/measuring-hate-speech.md
2023-05-11 16:13:09 -03:00

| Field | Value |
| --- | --- |
| title | Measuring Hate Speech |
| link-to-publication | https://arxiv.org/abs/2009.10277 |
| link-to-data | https://huggingface.co/datasets/ucberkeley-dlab/measuring-hate-speech |
| task-description | 10 ordinal labels (sentiment, (dis)respect, insult, humiliation, inferior status, violence, dehumanization, genocide, attack/defense, hate speech), which are debiased and aggregated into a continuous hate speech severity score (hate_speech_score) that includes a region for counterspeech & supportive speech. Includes 8 target identity groups (race/ethnicity, religion, national origin/citizenship, gender, sexual orientation, age, disability, political ideology) and 42 identity subgroups. |
| details-of-task | Hate speech measurement on social media in English |
| size-of-dataset | 39,565 comments annotated by 7,912 annotators on 10 ordinal labels, for 1,355,560 total labels. |
| percentage-abusive | 25 |
| language | English |
| level-of-annotation | Social media comment |
| platform | Twitter, Reddit, YouTube |
| medium | Text |
| reference | Kennedy, C. J., Bacon, G., Sahn, A., & von Vacano, C. (2020). Constructing interval variables via faceted Rasch measurement and multitask deep learning: a hate speech application. arXiv preprint arXiv:2009.10277. |
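
The size-of-dataset figures can be cross-checked with a little arithmetic. The sketch below is illustrative only: it assumes every annotation (one annotator rating one comment) carries a value for all 10 ordinal labels, which the entry implies but does not state outright. The function name `annotation_stats` and the constant names are hypothetical and are not part of the dataset or its tooling.

```python
# Figures quoted in the size-of-dataset field of this entry.
COMMENTS = 39_565
ANNOTATORS = 7_912
LABELS_PER_ANNOTATION = 10
TOTAL_LABELS = 1_355_560


def annotation_stats(comments, annotators, labels_per_annotation, total_labels):
    """Derive annotation counts from the published totals.

    Assumption (not stated in the entry): each annotation contributes
    exactly one value per ordinal label, with no missing labels.
    """
    annotations, remainder = divmod(total_labels, labels_per_annotation)
    if remainder:
        raise ValueError("total labels is not a multiple of labels per annotation")
    return {
        "annotations": annotations,
        "per_comment": annotations / comments,
        "per_annotator": annotations / annotators,
    }


stats = annotation_stats(COMMENTS, ANNOTATORS, LABELS_PER_ANNOTATION, TOTAL_LABELS)
print(f"annotations: {stats['annotations']:,}")
print(f"mean annotations per comment: {stats['per_comment']:.2f}")
print(f"mean annotations per annotator: {stats['per_annotator']:.2f}")
```

Under that assumption, the totals imply 135,556 comment-annotator pairs, or roughly 3.4 annotations per comment and 17 per annotator on average.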