Compare commits

2 Commits

Author SHA1 Message Date
97c6100dd8 zz 2024-05-03 15:00:20 +02:00
ed0603679a zz 2024-05-03 14:56:22 +02:00
3 changed files with 14 additions and 5 deletions

@@ -27,6 +27,7 @@ Related topic:
 - [Python](/topics/python)
 - [Hate Speech](/topics/hatespeech)
+- [Tetiana Mohorian](/students/2022/tetiana_mohorian)
 Meeting 5.4.

@@ -13,6 +13,10 @@ year of study start: 2022
 ## Bachelor's thesis 2025
+- Collaboration: [P. Pokrivčák](/students/2019/patrik_pokrivcak)
+- [Python](/topics/python)
+- [Hate Speech](/topics/hatespeech)
 Topic proposal:

@@ -24,16 +24,17 @@ Plan:
 - Create small evaluation set in Slovak
 - Try multilingual/crosslingual approach. Possibility of machine translation.
 - Annotate a bigger Slovak Corpus
 - Recognize and publish scientific contribution
-Futire Tasks:
+Future Tasks:
 - Evaluate existing multilingual model. E.g. https://huggingface.co/Andrazp/multilingual-hate-speech-robacofi
 - Translate existing English dataset into Slovak. Use OPUS English Slovak Marian NMT model. Train Slovak monolingual model.
-- Annotate a Twitter Dataset. Possible guidelines are: https://developers.perspectiveapi.com/s/about-the-api-training-data?language=en_US
+- Train or finetune or prompt a large language model.
 In progress tasks:
+- Annotate a Twitter Dataset. Possible guidelines are: https://developers.perspectiveapi.com/s/about-the-api-training-data?language=en_US
 - Annotate a Facebook Dataset. Use some other guidelines, e.g. sentence-level annotation, for context sensitive hate.
 - Prepare existing Slovak Twitter dataset, train and evaluate a model.
@@ -50,10 +51,13 @@ People:
 - Daniel Hládek
 - Zuzana Sokolová
 - [Vladimír Ferko](/students/2021/vladimir_ferko)
-- [Sevval Bulburu](/interns/sevval_bulburu)
+- [Tetiana Mohorian](/students/2022/tetiana_mohorian)
+- [Patrik Pokrivčák](/students/2019/patrik_pokrivcak)
 Former participants:
+- [Sevval Bulburu](/interns/sevval_bulburu)
 - [Manohar Gowdru Shridharu](/students/2021/manohar_gowdru_shridharu)
@@ -62,4 +66,4 @@ Links:
 - https://europeanonlinehatelab.com/
 - https://hatespeechdata.com/
 - https://oznacuj-dezinfo.kinit.sk/