du707zr/dmytro_ushatenko

forked from KEMT/zpwiki

dano 4213f05c92 Update 'pages/topics/hatespeech/README.md'

2022-01-28 12:00:19 +00:00

title: Hate Speech
category: project
tag: hatespeech, nlp, nlm

Hate Speech Scientific Project

Goal:

To recognize parts of a text that contain hate speech or vulgarisms.

Possible applications:

Moderation of discussion forums: detection of spam or abuse.
"Postprocessing" for biased generative language models: preventing them from generating inappropriate responses.

Plan:

Review the state of the art.
Pick established (English) corpora.
Formalize the problem - sentiment classification, topic recognition, keyword selection.
Propose a preliminary system by reproducing an existing approach.
Create a small evaluation set in Slovak.
Try a multilingual/cross-lingual approach, possibly with machine translation.
Annotate a bigger Slovak corpus.
Identify and publish the scientific contribution.
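
As an illustration of the "keyword selection" formulation and of what a very first preliminary system might look like, here is a minimal sketch in Python. It is not the project's method: the function names (`find_flagged_spans`, `is_flagged`) and the two-word `LEXICON` are hypothetical placeholders, and a real baseline would use a curated word list or a classifier trained on one of the established corpora.

```python
import re
from typing import Iterable, List, Tuple

# Hypothetical placeholder lexicon (assumption, not from the project).
# A real system would load a curated list or use a trained model.
LEXICON = {"idiot", "stupid"}


def find_flagged_spans(
    text: str, lexicon: Iterable[str] = LEXICON
) -> List[Tuple[int, int, str]]:
    """Return (start, end, token) for each word found in the lexicon.

    Matching is case-insensitive and works on whole words only, so the
    result marks the parts of the text that triggered the detection.
    """
    vocabulary = {word.lower() for word in lexicon}
    spans = []
    for match in re.finditer(r"\w+", text):
        token = match.group(0)
        if token.lower() in vocabulary:
            spans.append((match.start(), match.end(), token))
    return spans


def is_flagged(text: str, lexicon: Iterable[str] = LEXICON) -> bool:
    """Document-level decision: True if any lexicon word occurs."""
    return bool(find_flagged_spans(text, lexicon))


if __name__ == "__main__":
    print(find_flagged_spans("You are an Idiot, really."))
    # prints [(11, 16, 'Idiot')]
```

A lexicon lookup like this misses paraphrases and flags harmless uses of a word, which is why the plan continues with established corpora and a proper evaluation set; the sketch is only useful as a trivial point of comparison.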

People:

Ján Staš
Daniel Hládek
Zuzana Sokolová
Manohar Gowdru Shridharu

Sources:

https://hatespeechdata.com/