Daniel Hládek 2025-08-04 14:13:40 +02:00
parent cdcc660b16
commit a099634802

@@ -30,6 +30,27 @@ Tasks:
- Prepare a final report with analysis, experiments and conclusions.
- Publish the fine-tuned models in HF HUB. Publish the paper from the project.
Meeting 4.8.
State:
- Tested LMs with ROUGE metrics; most models scored 4-5 ROUGE, while facebook/mbart-large-50 scored 17 (it was trained for translation).
- In my opinion, mbart-large-50 is not a good candidate for fine-tuning, because it is already fine-tuned for translation.
- No fine-tuning done yet.
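The ROUGE numbers above (4-5 for most models vs. 17 for mbart-large-50) are on a 0-100 scale. As a rough illustration only, not the library implementation used in the experiments, ROUGE-1 F1 is just unigram overlap between a generated summary and a reference:

```python
from collections import Counter

def rouge1_f1(candidate: str, reference: str) -> float:
    """ROUGE-1 F1: harmonic mean of unigram precision and recall."""
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    overlap = sum((cand & ref).values())  # clipped unigram matches
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)

print(round(100 * rouge1_f1("the cat sat on the mat",
                            "the cat lay on the mat"), 1))  # → 83.3
```

Real evaluation should use a library implementation (e.g. the `rouge_score` package or HF `evaluate`), which also handles stemming and ROUGE-2/ROUGE-L.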
Tasks:
- Try evaluating google/flan-t5-large, kiviki/mbart-slovaksum-large-sum and similar models. These should already work.
- Continue working on fine-tuning T5 or mBART models, but ask when you are stuck. Use the HF examples script for summarization.
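A sketch of how the HF examples script for summarization is typically invoked; the script and its flags come from the transformers examples, but the dataset id and hyperparameters below are placeholders, not project decisions:

```shell
# run_summarization.py lives in the transformers repo under
# examples/pytorch/summarization/ (install its requirements.txt first).
# The dataset id kiviki/SlovakSum and all hyperparameters are assumptions;
# --text_column/--summary_column may be needed for non-standard column names.
python run_summarization.py \
    --model_name_or_path TUKE-KEMT/slovak-t5-base \
    --dataset_name kiviki/SlovakSum \
    --source_prefix "summarize: " \
    --do_train --do_eval \
    --predict_with_generate \
    --per_device_train_batch_size 4 \
    --num_train_epochs 3 \
    --output_dir ./slovak-t5-summarization
```

The `--source_prefix` flag matters for T5-family checkpoints, which expect a task prefix; mBART models do not need it.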
Future tasks:
- Use LLMs (open or closed) and evaluate (ROUGE) summarization without fine-tuning on the Slovak legal dataset.
- Install lm-eval-harness, learn it, and prepare and run a task for Slovak summarization.
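In lm-eval-harness, new tasks are described as YAML files. A rough sketch of what a Slovak summarization task could look like; the dataset id, column names, and prompt are assumptions, and the available metric names should be checked against the harness documentation:

```yaml
# Sketch of an lm-eval-harness task config (v0.4+ YAML task format).
# Dataset id and the {{text}}/{{sum}} field names are assumptions.
task: slovaksum
dataset_path: kiviki/SlovakSum
output_type: generate_until
test_split: test
doc_to_text: "Zhrň nasledujúci text.\nText: {{text}}\nZhrnutie:"
doc_to_target: "{{sum}}"
generation_kwargs:
  until:
    - "\n"
metric_list:
  - metric: rouge1
    aggregation: mean
    higher_is_better: true
```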
Meeting 24.7.
State:
@@ -50,8 +71,8 @@ State:
- Studied the task and metrics (ROUGE, BLEU).
- Loaded a model, preprocessed a dataset, evaluated the model.
- Loaded more models, used SlovakSum, generated summaries with four models and compared them with ROUGE and BLEU (TUKE-KEMT/slovak-t5-base, google/mt5-small, google/mt5-base, facebook/mbart-large-50).
- The comparison is without fine-tuning (zero-shot); so far, the best is mBART-large.
- Working on the legal dataset "dennlinger/eur-lex-sum".
- Notebooks are on the KEMT git.
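The BLEU metric used alongside ROUGE in the comparison above can be sketched in the same spirit. This is a simplified sentence-level BLEU with uniform n-gram weights and a brevity penalty, not the library implementation used in the notebooks:

```python
import math
from collections import Counter

def ngram_counts(tokens, n):
    """Count all n-grams of length n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def sentence_bleu(candidate: str, reference: str, max_n: int = 2) -> float:
    """Simplified sentence-level BLEU with uniform weights up to max_n."""
    cand, ref = candidate.split(), reference.split()
    log_precision = 0.0
    for n in range(1, max_n + 1):
        c, r = ngram_counts(cand, n), ngram_counts(ref, n)
        overlap = sum((c & r).values())  # clipped n-gram matches
        if overlap == 0:
            return 0.0                   # no smoothing in this sketch
        log_precision += math.log(overlap / sum(c.values())) / max_n
    # Brevity penalty: punish candidates shorter than the reference.
    bp = 1.0 if len(cand) >= len(ref) else math.exp(1 - len(ref) / len(cand))
    return bp * math.exp(log_precision)
```

Library implementations (e.g. sacreBLEU) additionally apply smoothing and standardized tokenization, which matters for comparable scores across papers.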