Daniel Hladek 2024-08-06 01:05:54 +02:00
parent b06eb9c21c
commit 946bc7f9f1
3 changed files with 46 additions and 1 deletions


@@ -1,5 +1,16 @@
---
title: Cesar Abascal Gutierrez
published: true
taxonomy:
category: [iaeste]
tag: [ner,nlp]
author: Daniel Hladek
---
## Named entity annotations
Intern, probably summer 2019
Cesar Abascal Gutierrez <cesarbielva1994@gmail.com>
## Goals


@@ -0,0 +1,34 @@
---
title: Oliver Pejic
published: true
taxonomy:
category: [iaeste]
tag: [hatespeech,nlp]
author: Daniel Hladek
---
Oliver Pejic
IAESTE Intern Summer 2024, six weeks in August and September
Goal:
- Help with the [Hate Speech Project](/topics/hatespeech)
- Help with the evaluation of sentence transformer models using the [MTEB](https://github.com/embeddings-benchmark/mteb) toolkit
Final Tasks:
- Prepare an MTEB evaluation task for [Slovak hate speech](https://huggingface.co/datasets/TUKE-KEMT/hate_speech_slovak).
- Prepare an MTEB evaluation task for [Slovak question answering](https://huggingface.co/datasets/TUKE-KEMT/retrieval-skquad).
- [Machine translate](https://huggingface.co/google/madlad400-3b-mt) an SBERT evaluation set for multiple Slavic languages.
- Write a short scientific paper with results.
Preparation:
- Get familiar with the [SentenceTransformer](https://sbert.net/) framework, study the fundamental papers, and write down notes.
- Get familiar with the [MTEB](https://github.com/embeddings-benchmark/mteb) evaluation framework.
- Prepare a working environment on Google Colab, on the school server, or with Anaconda.
- Get familiar with [existing finetuning scripts](https://git.kemt.fei.tuke.sk/dano/slovakretrieval).
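
The machine translation task listed under Final Tasks could start from a sketch like the one below. It is only an illustration, not part of the project scripts: the helper names (`tag_for_madlad`, `build_inputs`, `translate`) and the chosen list of Slavic language codes are assumptions. What it relies on is that MADLAD-400 expects the target language as a `<2xx>` token at the start of the input, and that the model can be loaded with the T5 classes from `transformers`.

```python
# Sketch only: the language list and helper names are illustrative assumptions.
SLAVIC_TARGETS = ["cs", "pl", "sk", "sl", "hr", "uk"]


def tag_for_madlad(sentence, target_lang):
    """Prefix a sentence with the MADLAD-400 target-language token, e.g. '<2sk> ...'."""
    return f"<2{target_lang}> {sentence.strip()}"


def build_inputs(sentences, target_langs=SLAVIC_TARGETS):
    """Return {lang: [tagged sentences]} ready to be passed to translate()."""
    return {
        lang: [tag_for_madlad(s, lang) for s in sentences]
        for lang in target_langs
    }


def translate(tagged_sentences, model_name="google/madlad400-3b-mt", max_new_tokens=128):
    """Translate already-tagged sentences; downloads the model on first use."""
    # Imported lazily so the helpers above work without transformers installed.
    from transformers import T5ForConditionalGeneration, T5Tokenizer

    tokenizer = T5Tokenizer.from_pretrained(model_name)
    model = T5ForConditionalGeneration.from_pretrained(model_name)
    batch = tokenizer(tagged_sentences, return_tensors="pt", padding=True)
    outputs = model.generate(**batch, max_new_tokens=max_new_tokens)
    return tokenizer.batch_decode(outputs, skip_special_tokens=True)
```

Calling `build_inputs(["A cat sleeps."], ["cs", "pl"])` prepares one tagged copy of the sentence per target language; each list can then be passed to `translate` in batches that fit the available GPU memory.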


@@ -2,7 +2,7 @@
title: Sevval Bulburu
published: true
taxonomy:
-category: [iaeste2023]
+category: [iaeste]
tag: [hatespeech,nlp]
author: Daniel Hladek
---