---
title: Hate Speech
category: [project]
tag: [hatespeech,nlp,nlm]
---

# Hate Speech Scientific Project

Goal:

- Recognize the parts of a text that contain hate speech or vulgarisms.

Possible applications:

- Management of discussion forums / detection of spam or abuse.
- Post-processing for biased generative language models, to prevent them from generating inappropriate responses.
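
The post-processing idea can be illustrated with a minimal sketch. It assumes a simple word lexicon rather than a trained model, and the names `BLOCKLIST`, `find_flagged_spans`, and `filter_response` are hypothetical placeholders, not part of any existing system. The lexicon entries are illustrative only.

```python
import re

# Hypothetical placeholder lexicon (assumption); a real system would use
# a curated word list or a trained classifier instead.
BLOCKLIST = {"idiot", "moron"}


def find_flagged_spans(text, lexicon=BLOCKLIST):
    """Return (start, end, token) for every word that appears in the lexicon."""
    spans = []
    for match in re.finditer(r"\w+", text):
        if match.group().lower() in lexicon:
            spans.append((match.start(), match.end(), match.group()))
    return spans


def filter_response(text, fallback="[response withheld]"):
    """Return the text unchanged if nothing is flagged, otherwise the fallback."""
    return fallback if find_flagged_spans(text) else text
```

A generated response would be passed through `filter_response` before it is shown to the user. Returning character offsets from `find_flagged_spans` also allows the flagged parts to be highlighted for a forum moderator instead of blocking the whole text.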

Plan:

- Review the state of the art.
- Select established English corpora.
- Formalize the problem: sentiment classification, topic recognition, or keyword selection.
- Propose a preliminary system by reproducing an existing approach.
- Create a small evaluation set in Slovak.
- Try a multilingual or cross-lingual approach, possibly using machine translation.
- Annotate a larger Slovak corpus.
- Identify and publish the scientific contribution.
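
If the problem is formalized as text classification, a preliminary baseline might look like the following sketch. It is only an assumption about one possible starting point: a multinomial Naive Bayes classifier with Laplace smoothing, written with the standard library. The class name `NaiveBayes`, the labels, and the four training sentences in `TRAIN` are invented for illustration and do not come from any corpus.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


class NaiveBayes:
    """Multinomial Naive Bayes with Laplace (add-alpha) smoothing."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)
        self.word_counts = {label: Counter() for label in self.class_counts}
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict(self, text):
        total_docs = sum(self.class_counts.values())
        scores = {}
        for label, n_docs in self.class_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + self.alpha * len(self.vocab)
            score = math.log(n_docs / total_docs)  # log prior
            for word in tokenize(text):
                if word in self.vocab:  # words unseen in training are skipped
                    score += math.log((counts[word] + self.alpha) / denom)
            scores[label] = score
        return max(scores, key=scores.get)


# Invented toy examples (assumption), standing in for a real annotated corpus.
TRAIN = [
    ("you are a stupid idiot", "offensive"),
    ("shut up you moron", "offensive"),
    ("thank you for the helpful answer", "neutral"),
    ("have a nice day", "neutral"),
]

model = NaiveBayes().fit([t for t, _ in TRAIN], [l for _, l in TRAIN])
```

Calling `model.predict(text)` returns the most probable label for a new text. A baseline of this kind gives a reference score on the established English corpora before the multilingual or cross-lingual models are tried on the Slovak evaluation set.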

People:

- Ján Staš
- Daniel Hládek
- Zuzana Sokolová
- [Manohar Gowdru Shridharu](/students/2021/manohar_gowdru_shridharu)

Links:

- https://hatespeechdata.com/
- https://oznacuj-dezinfo.kinit.sk/