From 3843b1868ebcee0e4a51169c0ce9f81b798de140 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Daniel=20Hl=C3=A1dek?=
Date: Fri, 18 Aug 2023 09:07:14 +0000
Subject: [PATCH] Update 'pages/topics/hatespeech/README.md'

---
 pages/topics/hatespeech/README.md | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/pages/topics/hatespeech/README.md b/pages/topics/hatespeech/README.md
index db7bbd71..aa4ebd8e 100644
--- a/pages/topics/hatespeech/README.md
+++ b/pages/topics/hatespeech/README.md
@@ -26,6 +26,18 @@ Plan:
 - Annotate a bigger Slovak Corpus
 - Recognize and publish scientific contribution
+Tasks:
+
+- Evaluate an existing multilingual model, e.g. https://huggingface.co/Andrazp/multilingual-hate-speech-robacofi
+- Translate an existing English dataset into Slovak using the OPUS English-Slovak Marian NMT model, then train a Slovak monolingual model.
+
+Future tasks:
+
+- Annotate a Twitter dataset. Possible guidelines are: https://developers.perspectiveapi.com/s/about-the-api-training-data?language=en_US
+- Annotate a Facebook dataset. Use other guidelines, e.g. sentence-level annotation for context-sensitive hate.
+- Prepare the existing Slovak Twitter dataset, then train and evaluate a model.
+
+
 People: