[alan-turing][m] - individual pages

Luccas Mateus de Medeiros Gomes
2023-05-01 21:06:58 -03:00
parent cc43597130
commit 7c845fe0e3
19 changed files with 6022 additions and 404 deletions


@@ -0,0 +1,22 @@
---
title: Contributing
---
We accept entries to our catalogue based on pull requests to the content folder. The dataset must be available for download to be included in the list. If you want to add an entry, follow these steps!
Please send just one dataset addition/edit at a time: edit it in, then save. This will make everyone's life easier (including yours!).
- Go to the repository page, click the "Add file" dropdown, and then click "Create new file".
![](https://i.imgur.com/2PR0ZgL.png)
- On the following page, type `content/datasets/<name-of-the-file>.md` if you want to add an entry to the datasets catalogue, or `content/keywords/<name-of-the-file>.md` if you want to add an entry to the lists of abusive keywords.
![](https://i.imgur.com/rr3uSYu.png)
- Copy the contents of `templates/dataset.md` or `templates/keywords.md`, respectively, into the text box below, and fill in each field using the correct data format.
![](https://i.imgur.com/x6JIjhz.png)
- Click "Commit changes". In the popup, give a brief description of the proposed change, and then click "Propose changes".
![](https://i.imgur.com/BxuxKEJ.png)
- Submit the pull request on the next page when prompted.
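
If you prefer to prepare the file locally before pasting it in, the path and front matter from the steps above can be sketched as follows. This is a minimal illustration, not the project's tooling: the helper names `entry_path` and `render_entry` and the lowercase-hyphen file naming are assumptions, and the field names are taken from entries already in the catalogue rather than from `templates/dataset.md` or `templates/keywords.md`, which remain the authoritative list of fields.

```python
import json
from pathlib import PurePosixPath

# Folder for each kind of entry, as named in the steps above.
FOLDERS = {"dataset": "content/datasets", "keywords": "content/keywords"}


def entry_path(kind: str, name: str) -> str:
    """Return the path to type into the "Create new file" box."""
    if kind not in FOLDERS:
        raise ValueError(f"kind must be one of {sorted(FOLDERS)}")
    # Hypothetical naming rule: lowercase, whitespace replaced by hyphens.
    slug = "-".join(name.lower().split())
    return str(PurePosixPath(FOLDERS[kind]) / f"{slug}.md")


def render_entry(fields: dict) -> str:
    """Render the fields as a front-matter block delimited by '---' lines."""
    lines = ["---"]
    for key, value in fields.items():
        # JSON strings and lists are also valid YAML values, so json.dumps
        # gives safely quoted output without needing a YAML library.
        lines.append(f"{key}: {json.dumps(value, ensure_ascii=False)}")
    lines.append("---")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Example values copied from an existing keywords entry.
    print(entry_path("keywords", "Hurtlex"))
    print(render_entry({
        "title": "Hurtlex",
        "data-link": "https://github.com/valeriobasile/hurtlex",
    }))
```

Check the rendered block against the relevant template before committing, since the template may require fields that this sketch does not know about.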


@@ -12,3 +12,5 @@ platform: ["AlJazeera"]
medium: ["Text"]
reference: "Mubarak, H., Darwish, K. and Magdy, W., 2017. Abusive Language Detection on Arabic Social Media. In: Proceedings of the First Workshop on Abusive Language Online. Vancouver, Canada: Association for Computational Linguistics, pp.52-56."
---
SOMETHING TEST


@@ -12,3 +12,4 @@ platform: ["Youtube", "Facebook"]
medium: ["Text"]
reference: "Romim, N., Ahmed, M., Talukder, H., & Islam, M. S. (2021). Hate speech detection in the bengali language: A dataset and its baseline evaluation. In Proceedings of International Joint Conference on Advances in Computational Intelligence (pp. 457-468). Springer, Singapore."
---


@@ -1,3 +1,7 @@
---
title: Hate Speech Dataset Catalogue
---
This page catalogues datasets annotated for hate speech, online abuse, and offensive language. They may be useful, for example, for training a natural language processing system to detect such language.
The list is maintained by Leon Derczynski, Bertie Vidgen, Hannah Rose Kirk, Pica Johansson, Yi-Ling Chung, Mads Guldborg Kjeldgaard Kongsbak, Laila Sprejer, and Philine Zeinert.


@@ -0,0 +1,10 @@
---
title: Hurtlex
description: HurtLex is a lexicon of offensive, aggressive, and hateful words in over 50 languages. The words are divided into 17 categories, plus a macro-category indicating whether a stereotype is involved.
data-link: https://github.com/valeriobasile/hurtlex
reference: http://ceur-ws.org/Vol-2253/paper49.pdf, Proc. CLiC-it 2018
---
## Markdown TEST
Some text


@@ -0,0 +1,5 @@
---
title: SexHateLex is a Chinese lexicon of hateful and sexist words.
data-link: https://doi.org/10.5281/zenodo.4773875
reference: http://ceur-ws.org/Vol-2253/paper49.pdf, Journal of OSNEM, Vol.27, 2022, 100182, ISSN 2468-6964.
---