datahub/examples/turing/content/datasets/measuring-hate-speech.md

---
title: Measuring Hate Speech
link-to-publication: https://arxiv.org/abs/2009.10277
link-to-data: https://huggingface.co/datasets/ucberkeley-dlab/measuring-hate-speech
task-description: 10 ordinal labels (sentiment, (dis)respect, insult, humiliation, inferior status, violence, dehumanization, genocide, attack/defense, hate speech), which are debiased and aggregated into a continuous hate speech severity score (hate_speech_score) that includes a region for counterspeech & supportive speech. Includes 8 target identity groups (race/ethnicity, religion, national origin/citizenship, gender, sexual orientation, age, disability, political ideology) and 42 identity subgroups.
details-of-task: Hate speech measurement on social media in English
size-of-dataset: "39,565 comments annotated by 7,912 annotators on 10 ordinal labels, for 1,355,560 total labels."
percentage-abusive: 25
language: English
level-of-annotation: ["Social media comment"]
platform: ["Twitter", "Reddit", "YouTube"]
medium: ["Text"]
reference: "Kennedy, C. J., Bacon, G., Sahn, A., & von Vacano, C. (2020). Constructing interval variables via faceted Rasch measurement and multitask deep learning: a hate speech application. arXiv preprint arXiv:2009.10277."
---