---
title: Sevval Bulburu
published: true
taxonomy:
    category: [iaeste2023]
    tag: [hatespeech,nlp]
    author: Daniel Hladek
---

Sevval Bulburu


IAESTE Intern Summer 2023, two months

Goal: Help with the [Hate Speech Project](/topics/hatespeech)

Meeting 22.8.2023

State:

- Familiar with Python, Anaconda, Tensorflow, AI projects

Tasks:

- Get familiar with the task of hate speech detection. Find out how we can use Transformer neural networks to detect and categorize hate speech in internet comments written by arbitrary users.
- Get familiar with the basic tools: Hugging Face Transformers. Learn how to use https://huggingface.co/Andrazp/multilingual-hate-speech-robacofi in a Python script. Learn the basics of Transformer neural networks.
- Get familiar with the Prodi.gy annotation tool.
- Set up a web-based annotation environment for students (open, in cooperation with [Vladimir Ferko](/students/2021/vladimir_ferko)).
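
A possible starting point for the second task is sketched below. It is only a sketch and makes several assumptions: that the model loads through the generic `text-classification` pipeline of Hugging Face Transformers, and that `transformers` and PyTorch are installed. The helper names `load_classifier` and `summarize` and the example comments are made up for illustration. The label names come from the model's configuration, so check the model card for their meaning.

```python
from typing import Dict, List

MODEL_NAME = "Andrazp/multilingual-hate-speech-robacofi"


def load_classifier(model_name: str = MODEL_NAME):
    """Build a text-classification pipeline (downloads the model on first use)."""
    # Imported lazily so that `summarize` works without transformers installed.
    from transformers import pipeline

    return pipeline("text-classification", model=model_name)


def summarize(texts: List[str], predictions: List[Dict]) -> List[Dict]:
    """Pair each comment with the label and score returned by the pipeline."""
    return [
        {"text": t, "label": p["label"], "score": round(float(p["score"]), 3)}
        for t, p in zip(texts, predictions)
    ]


if __name__ == "__main__":
    # Hypothetical example comments; replace with real data.
    comments = ["Have a nice day!", "I can't stand people like you."]
    classifier = load_classifier()
    for row in summarize(comments, classifier(comments)):
        print(row)
```

Running the script prints one dictionary per comment with the predicted label and its confidence score.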

Ideas for annotation tools:

- https://github.com/UniversalDataTool/universal-data-tool
- https://www.johnsnowlabs.com/top-6-text-annotation-tools/
- https://app.labelbox.com/
- https://github.com/recogito/recogito-js
- https://github.com/topics/text-annotation?l=javascript

Future tasks (to be decided):

- Evaluate an existing multilingual model, e.g. https://huggingface.co/Andrazp/multilingual-hate-speech-robacofi, on Slovak data.
- Translate an existing English dataset into Slovak using the OPUS English-Slovak Marian NMT model. Train a Slovak monolingual model.
- Prepare the existing Slovak Twitter dataset, then train and evaluate a model.
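
The translation and evaluation steps above could be sketched roughly as follows. This is not a decided design. The checkpoint name `Helsinki-NLP/opus-mt-en-sk` is an assumption about which OPUS Marian model is meant, the helper names (`batches`, `translate`, `precision_recall_f1`) are hypothetical, and the code assumes binary labels where `1` marks hate speech. Translation needs `transformers`, `sentencepiece`, and PyTorch installed.

```python
from typing import Iterator, List, Sequence, Tuple

# Assumed checkpoint name for the OPUS English-to-Slovak Marian model.
MT_MODEL = "Helsinki-NLP/opus-mt-en-sk"


def batches(items: Sequence[str], size: int) -> Iterator[Sequence[str]]:
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def translate(texts: Sequence[str], model_name: str = MT_MODEL,
              batch_size: int = 16) -> List[str]:
    """Translate English texts to Slovak; dataset labels stay unchanged."""
    # Lazy import so the pure-Python helpers work without transformers.
    from transformers import MarianMTModel, MarianTokenizer

    tokenizer = MarianTokenizer.from_pretrained(model_name)
    model = MarianMTModel.from_pretrained(model_name)
    out: List[str] = []
    for batch in batches(texts, batch_size):
        inputs = tokenizer(list(batch), return_tensors="pt",
                           padding=True, truncation=True)
        generated = model.generate(**inputs)
        out.extend(tokenizer.batch_decode(generated, skip_special_tokens=True))
    return out


def precision_recall_f1(gold: Sequence[int], pred: Sequence[int],
                        positive: int = 1) -> Tuple[float, float, float]:
    """Precision, recall and F1 for the `positive` class (here: hate speech)."""
    tp = sum(1 for g, p in zip(gold, pred) if g == positive and p == positive)
    fp = sum(1 for g, p in zip(gold, pred) if g != positive and p == positive)
    fn = sum(1 for g, p in zip(gold, pred) if g == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1


if __name__ == "__main__":
    # Hypothetical example sentence; replace with the English dataset.
    print(translate(["You are welcome here."]))
```

The same `precision_recall_f1` helper can score the multilingual model on Slovak data and, later, a Slovak monolingual model trained on the translated dataset, which makes the two approaches directly comparable.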