forked from KEMT/zpwiki

Update 'pages/topics/bert/README.md'

parent 218e472e17
commit 13faa91944

@@ -17,17 +17,9 @@ author: Daniel Hládek

- [SlovakBERT](https://github.com/gerulata/slovakbert) by Kinit, and an [article](https://arxiv.org/abs/2109.15254)
- [SK Quad](/topics/question) - Slovak Question Answering Dataset
- bachelor's thesis of [Ondrej Megela](/students/2018/ondrej_megela)
- https://git.kemt.fei.tuke.sk/dano/bert-train

## Hardware requirements

[How to scale the BERT training with Nvidia GPUs](https://medium.com/nvidia-ai/how-to-scale-the-bert-training-with-nvidia-gpus-c1575e8eaf71):

    When the mini-batch size n is multiplied by k, we should multiply the starting learning rate η by the square root of k as some theories may suggest. However, with experiments from multiple researchers, linear scaling shows better results, i.e. multiply the starting learning rate by k instead.
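The rule quoted above can be made concrete with a short sketch. This is only an illustration of the linear-scaling heuristic; the function name and the base values below are invented for the example, not taken from the article or from any training script in this repository.

```python
def scale_learning_rate(base_lr: float, k: float, rule: str = "linear") -> float:
    """Return the starting learning rate after the mini-batch size is multiplied by k.

    rule="linear": multiply the learning rate by k (empirically the better choice).
    rule="sqrt":   multiply by sqrt(k), as some theoretical arguments suggest.
    """
    if rule == "linear":
        return base_lr * k
    if rule == "sqrt":
        return base_lr * k ** 0.5
    raise ValueError(f"unknown rule: {rule!r}")


# Hypothetical example: the batch grows from 256 to 2048 sequences, so k = 8.
base_lr = 1e-4
k = 2048 / 256
print(scale_learning_rate(base_lr, k, "linear"))  # 0.0008
print(scale_learning_rate(base_lr, k, "sqrt"))    # ~0.000283
```
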
| Model      | Parameters |
|------------|------------|
| BERT Large | 330M |
| BERT Base  | 110M |
Larger input vector size => smaller batch size => smaller learning rate => slower training
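
A rough back-of-envelope sketch of that chain, where every number is a made-up assumption (a fixed activation-memory budget and memory that grows linearly with input length), shown only to illustrate the direction of the effect:

```python
# All values are illustrative assumptions, not measurements.
memory_budget_mb = 16_000        # hypothetical memory available for activations
mb_per_token_per_example = 0.06  # hypothetical activation memory per token

for seq_len in (128, 512):
    # Longer inputs => fewer examples fit into the fixed memory budget.
    batch_size = int(memory_budget_mb / (mb_per_token_per_example * seq_len))
    # Linear-scaling rule: shrink the learning rate together with the batch size.
    lr = 1e-4 * batch_size / 2048
    print(f"seq_len={seq_len}  batch_size={batch_size}  lr={lr:.2e}")

# Longer input => smaller batch => smaller learning rate => more update steps
# per epoch, i.e. slower training.
```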
## Completed tasks
@@ -76,3 +68,14 @@ Larger input vector size => smaller batch size => smaller learning rate => slower training
- Train a BART model.
- Train a character-based model.
- Adapt SlovakBERT to SQuAD, which means adding a SQuAD task to fairseq.