zpwiki/pages/interns/oliver_pejic/README.md

---
title: Oliver Pejic
published: true
taxonomy:
    category: [iaeste]
    tag: [hatespeech,nlp]
    author: Daniel Hladek
---

Oliver Pejic

IAESTE intern, summer 2024 (six weeks in August and September)

Goal:

- Help with the [Hate Speech Project](/topics/hatespeech)
- Help with the evaluation of sentence transformer models using the [MTEB](https://github.com/embeddings-benchmark/mteb) toolkit

Final Tasks:

- Prepare an MTEB evaluation task for [Slovak hate speech](https://huggingface.co/datasets/TUKE-KEMT/hate_speech_slovak).
- Prepare an MTEB evaluation task for [Slovak question answering](https://huggingface.co/datasets/TUKE-KEMT/retrieval-skquad).
- [Machine translate](https://huggingface.co/google/madlad400-3b-mt) an SBERT evaluation set into multiple Slavic languages.
- Write a short scientific paper with results.
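
The machine translation task could be sketched roughly as follows. This is a minimal sketch, not the script used in the internship: the list of target languages, the helper names `with_target_prefix`, `load_model` and `translate`, and the `max_new_tokens` value are all assumptions made for illustration. The only model-specific detail relied on is that MADLAD-400 selects the output language with a `<2xx>` token placed in front of the source text.

```python
# Target languages are an assumption; the task only says "multiple Slavic languages".
SLAVIC_TARGETS = ["cs", "pl", "sk", "sl", "hr", "uk"]


def with_target_prefix(text: str, lang: str) -> str:
    """Prepend the "<2xx>" token that tells MADLAD-400 which language to produce."""
    return f"<2{lang}> {text}"


def load_model(model_name: str = "google/madlad400-3b-mt"):
    """Load the tokenizer and model once; this downloads several GB of weights."""
    # Imported lazily so the prefix helper works without transformers installed.
    from transformers import T5ForConditionalGeneration, T5Tokenizer

    tokenizer = T5Tokenizer.from_pretrained(model_name)
    model = T5ForConditionalGeneration.from_pretrained(model_name)
    return tokenizer, model


def translate(sentences, lang, tokenizer, model, max_new_tokens=128):
    """Translate a batch of sentences into the language given by `lang`."""
    prompts = [with_target_prefix(s, lang) for s in sentences]
    batch = tokenizer(prompts, return_tensors="pt", padding=True)
    outputs = model.generate(**batch, max_new_tokens=max_new_tokens)
    return tokenizer.batch_decode(outputs, skip_special_tokens=True)


if __name__ == "__main__":
    tokenizer, model = load_model()
    for lang in SLAVIC_TARGETS:
        print(lang, translate(["A man is playing a guitar."], lang, tokenizer, model))
```

Loading the model once and reusing it across languages avoids repeating the expensive download and initialization for every target language.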

Preparation:

- Get familiar with the [SentenceTransformer](https://sbert.net/) framework, study the fundamental papers, and write down notes.
- Get familiar with the [MTEB](https://github.com/embeddings-benchmark/mteb) evaluation framework.
- Prepare a working environment on Google Colab, on the school server, or with Anaconda.
- Get familiar with the [existing finetuning scripts](https://git.kemt.fei.tuke.sk/dano/slovakretrieval).
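
When getting familiar with the frameworks above, it may help to see what an embedding-based retrieval score measures. The sketch below uses hand-written toy vectors in place of real sentence embeddings and computes mean reciprocal rank (MRR) from cosine similarities. MRR is chosen here only because it is easy to follow; MTEB reports its own set of retrieval metrics, and the function names and data are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two vectors given as lists of floats."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_documents(query_vec, doc_vecs):
    """Return document indices sorted from most to least similar to the query."""
    scores = [cosine(query_vec, d) for d in doc_vecs]
    return sorted(range(len(doc_vecs)), key=lambda i: scores[i], reverse=True)


def mean_reciprocal_rank(query_vecs, doc_vecs, relevant):
    """Average of 1/rank of the single relevant document for each query."""
    total = 0.0
    for query_vec, relevant_idx in zip(query_vecs, relevant):
        ranking = rank_documents(query_vec, doc_vecs)
        total += 1.0 / (ranking.index(relevant_idx) + 1)
    return total / len(query_vecs)


# Toy "embeddings": in practice these would come from a sentence transformer model.
docs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
queries = [[0.9, 0.1], [0.2, 0.8]]
relevant = [0, 2]  # index of the correct document for each query

mrr = mean_reciprocal_rank(queries, docs, relevant)
print(mrr)  # 0.75: first query hits at rank 1, second at rank 2
```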