zpwiki/pages/interns/yussef_ressaissi
2025-06-26 09:34:22 +02:00

title: Youssef Ressaissi
published: true
taxonomy:
    category: iaeste
    tag: summarization, nlp
    author: Daniel Hladek

IAESTE Intern Summer 2025, 1.7. - 31.8.2025

Goal: Evaluate and improve language models for summarization in the Slovak medical or legal domain.

Tasks:

  • Get familiar with the basic tools and prepare a working environment: HF transformers, datasets, lm-evaluation-harness, HF trl.
  • Read several recent papers about summarization using LLMs and write a report.
  • Learn how to perform and evaluate document summarization using language models in Slovak.
  • Pick summarization datasets and models. Evaluate several models using the ROUGE and BLEU metrics.
  • Describe the experiments. Summarize results in a table. Describe the results.
  • Improve the performance of a language model. Use more data. Prepare a domain-oriented dataset and fine-tune a model. Possibly generate artificial data to improve summarization.
  • Run new experiments and write down the results.
  • Publish the fine-tuned models on the HF Hub. Publish a paper from the project.
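
To make the evaluation task above more concrete, here is a minimal sketch of how ROUGE-N and ROUGE-L F1 scores can be computed for one candidate summary against one reference. It is a simplified illustration, not the official ROUGE implementation: it assumes whitespace tokenization, lowercasing, and no stemming, and the function names and the Slovak example sentences are hypothetical. For actual experiments an established implementation is likely the better choice.

```python
from collections import Counter


def tokenize(text):
    # Assumption: lowercase and split on whitespace; no stemming or punctuation handling.
    return text.lower().split()


def ngrams(tokens, n):
    # Count all contiguous n-grams in the token list.
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def f1(matches, cand_total, ref_total):
    # Harmonic mean of precision (matches / candidate) and recall (matches / reference).
    if matches == 0 or cand_total == 0 or ref_total == 0:
        return 0.0
    precision = matches / cand_total
    recall = matches / ref_total
    return 2 * precision * recall / (precision + recall)


def rouge_n(candidate, reference, n=1):
    # Clipped n-gram overlap between candidate and reference, reported as F1.
    cand = ngrams(tokenize(candidate), n)
    ref = ngrams(tokenize(reference), n)
    overlap = sum((cand & ref).values())
    return f1(overlap, sum(cand.values()), sum(ref.values()))


def lcs_length(a, b):
    # Length of the longest common subsequence, computed row by row.
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            curr.append(prev[j - 1] + 1 if x == y else max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l(candidate, reference):
    # ROUGE-L F1 based on the longest common subsequence of tokens.
    cand = tokenize(candidate)
    ref = tokenize(reference)
    return f1(lcs_length(cand, ref), len(cand), len(ref))


if __name__ == "__main__":
    # Hypothetical example pair, not taken from any dataset.
    reference = "súd zamietol žalobu v plnom rozsahu"
    candidate = "súd žalobu zamietol"
    print("ROUGE-1:", round(rouge_n(candidate, reference, 1), 3))
    print("ROUGE-2:", round(rouge_n(candidate, reference, 2), 3))
    print("ROUGE-L:", round(rouge_l(candidate, reference), 3))
```

In this example all three candidate words appear in the reference, so ROUGE-1 is high, while the changed word order leaves no shared bigrams and lowers ROUGE-L. Slovak is highly inflected, so exact-match scores like these can underestimate quality when the summary uses different word forms; this is worth noting when describing the results.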