diff --git a/pages/interns/yussef_ressaissi/README.md b/pages/interns/yussef_ressaissi/README.md
index c805ed90..410cc750 100644
--- a/pages/interns/yussef_ressaissi/README.md
+++ b/pages/interns/yussef_ressaissi/README.md
@@ -15,13 +15,18 @@ Goal: Evaluate and improve language models for summarization in Slovak medical o
 
 Tasks:
 
-- Get familiar with basic tools and prepare working environment: HF transformers, datasets, lm-evaluation-harness, HF trl
-- Read several recent papers about summarization using LLM and write a report.
-- Get familiar how to perform and evaluate document summarization using language models in Slovak.
-- Pick summarization datasets and models. Evaluate several models for evaluation using ROUGE and BLEU metrics.
-- Describe the experiments. Summarize results in a table. Describe the results.
-- Improve performance of a languge model. Use more data. Prepare a domain-oriented dataset and finetune a model. Maybe generate artificial data to imporve summarization.
-- Run new expriments and write down the results.
-- Publish the fine-tuned models in HF HUB. Publish the paper from the project.
+1. Get familiar with the basic tools
+   - Prepare the working environment: HF transformers, datasets, lm-evaluation-harness, HF trl.
+   - Read several recent papers about summarization using LLMs and write a report.
+   - Learn how to perform and evaluate document summarization using language models in Slovak.
+2. Make a comparison experiment
+   - Pick summarization datasets and models. Evaluate several models using ROUGE and BLEU metrics.
+   - Describe the experiments. Summarize the results in a table. Describe the results.
+3. Improve the performance of a language model
+   - Use more data. Prepare a domain-oriented dataset and fine-tune a model. Possibly generate artificial data to improve summarization.
+   - Run new experiments and write down the results.
+4. Report and disseminate
+   - Prepare a final report with analysis, experiments and conclusions.
+   - Publish the fine-tuned models on the HF Hub. Publish a paper from the project.