- Proposed and tried extra layers above BERT model to make a classifier in seriees of experiments. There is a single sigmoid neuron on the output.
- Manually adjusted the slovak HS dataset. Slovak dataset is not balanced. Tried some methods for "balancing" the dataset. By google translate - augmentation. Other samples are "generated" by translation into random langauge and translating back. This creates "paraphrases" of the original samples. It helps.
- Please send me your work report. Please upload your scripts and notebooks on git and send me a link. git is git.kemt.fei.tuke.sk or github. Prepare some short comment about scripts.You can also upload the slovak datasetthere is some work done on it.
- Use WinSCP to Copy Files. Use anaconda virtual env to create and activate a new python virtual environment. Use Visual Studio Code Remote to delvelop on your computer and run on remote computer (idoc.fei.tuke.sk). Use the same credentials for idoc server.
- [ ] Get familiar with the task of Hate speech detection. Find out how can we use Transformer neural networks to detect and categorize hate speech in internet comments created by random people.
- [ ] Get familiar with the basic tools: Huggingface Transformers, Learn how to use https://huggingface.co/Andrazp/multilingual-hate-speech-robacofi in Python script. Learn something about Transformer neural networks.