---
title: Youssef Ressaissi
published: true
taxonomy:
    category: [iaeste]
    tag: [summarization,nlp]
    author: Daniel Hladek
---

IAESTE Intern Summer 2025, 1.7. - 31.8.2025

Goal: Evaluate and improve language models for summarization in the Slovak medical or legal domain.

Tasks:

- Get familiar with the basic tools and prepare a working environment: HF transformers, datasets, lm-evaluation-harness, HF trl.
- Read several recent papers about summarization using LLMs and write a report.
- Get familiar with how to perform and evaluate document summarization using language models in Slovak.
- Pick summarization datasets and models. Evaluate several models using the ROUGE and BLEU metrics.
- Describe the experiments. Summarize the results in a table. Describe the results.
- Improve the performance of a language model. Use more data. Prepare a domain-oriented dataset and fine-tune a model. Maybe generate artificial data to improve summarization.
- Run new experiments and write down the results.
- Publish the fine-tuned models on the HF Hub. Publish a paper from the project.
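
To make the evaluation task more concrete, the following is a minimal sketch of what ROUGE-N, ROUGE-L and the clipped n-gram precision used inside BLEU measure. It is not the implementation used in this project: it uses only the Python standard library, assumes lowercase whitespace tokenization, and leaves out stemming, the BLEU brevity penalty and the averaging over n-gram orders. All function names and the example sentences are made up for illustration. For reported results, an established metric implementation would be the safer choice.

```python
from collections import Counter


def tokenize(text):
    # Assumption: lowercase whitespace tokenization. Real toolkits use
    # their own tokenizers and may normalize or stem the text.
    return text.lower().split()


def ngrams(tokens, n):
    # Multiset of all n-grams in the token list.
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def f1(overlap, cand_total, ref_total):
    # Harmonic mean of precision (overlap / candidate size)
    # and recall (overlap / reference size).
    if overlap == 0 or cand_total == 0 or ref_total == 0:
        return 0.0
    precision = overlap / cand_total
    recall = overlap / ref_total
    return 2 * precision * recall / (precision + recall)


def rouge_n(candidate, reference, n=1):
    # ROUGE-N F1: n-gram overlap, with counts clipped by Counter intersection.
    cand = ngrams(tokenize(candidate), n)
    ref = ngrams(tokenize(reference), n)
    overlap = sum((cand & ref).values())
    return f1(overlap, sum(cand.values()), sum(ref.values()))


def lcs_length(a, b):
    # Length of the longest common subsequence (dynamic programming,
    # keeping only the previous row of the table).
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            curr.append(prev[j - 1] + 1 if x == y else max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l(candidate, reference):
    # ROUGE-L F1 based on the longest common subsequence of tokens.
    cand, ref = tokenize(candidate), tokenize(reference)
    return f1(lcs_length(cand, ref), len(cand), len(ref))


def clipped_precision(candidate, reference, n=1):
    # Modified n-gram precision, the building block of BLEU
    # (no brevity penalty, single reference, single n-gram order).
    cand = ngrams(tokenize(candidate), n)
    ref = ngrams(tokenize(reference), n)
    total = sum(cand.values())
    return sum((cand & ref).values()) / total if total else 0.0


# Made-up Slovak example, not taken from any dataset.
reference = "pacient bol prijatý s bolesťou na hrudi"
candidate = "pacient bol prijatý s horúčkou"

print("ROUGE-1 F1:", round(rouge_n(candidate, reference, 1), 3))
print("ROUGE-2 F1:", round(rouge_n(candidate, reference, 2), 3))
print("ROUGE-L F1:", round(rouge_l(candidate, reference), 3))
print("Unigram precision:", round(clipped_precision(candidate, reference), 3))
```

Because Slovak is highly inflected, exact-match token overlap penalizes a summary that uses a different word form of the same lemma, which is worth keeping in mind when comparing scores across models.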