forked from KEMT/zpwiki
		
	Update 'pages/topics/bert/README.md'
parent 218e472e17
commit 13faa91944
				| @ -17,17 +17,9 @@ author: Daniel Hládek | ||||
| - [SlovakBERT](https://github.com/gerulata/slovakbert) by Kinit, and the [paper](https://arxiv.org/abs/2109.15254) | ||||
| - [SK Quad](/topics/question) - Slovak Question Answering Dataset  | ||||
| - bachelor's thesis by [Ondrej Megela](/students/2018/ondrej_megela) | ||||
| - https://git.kemt.fei.tuke.sk/dano/bert-train | ||||
| 
 | ||||
| ## Hardware requirements | ||||
| 
 | ||||
| [https://medium.com/nvidia-ai/how-to-scale-the-bert-training-with-nvidia-gpus-c1575e8eaf71](zz): | ||||
| 
 | ||||
|     When the mini-batch size n is multiplied by k, we should multiply the starting learning rate η by the square root of k as some theories may suggest. However, with experiments from multiple researchers, linear scaling shows better results, i.e. multiply the starting learning rate by k instead. | ||||
| 
 | ||||
| | BERT Large | 330M | | ||||
| | BERT Base | 110M | | ||||
| 
 | ||||
| Larger input vector size => smaller batch size => smaller learning rate => slower training | ||||
| 
 | ||||
| 
 | ||||
| ## Completed tasks | ||||
| @ -76,3 +68,14 @@ Väčšia veľkosť vstupného vektora => menšia veľkosť dávky => menší pa | ||||
| - Train a BART model. | ||||
| - Train a character-based model. | ||||
| - Adapt SlovakBERT to SQuAD. This means implementing a SQuAD task in fairseq. | ||||
| 
 | ||||
| ## Hardware requirements | ||||
| 
 | ||||
| [https://medium.com/nvidia-ai/how-to-scale-the-bert-training-with-nvidia-gpus-c1575e8eaf71](zz): | ||||
| 
 | ||||
|     When the mini-batch size n is multiplied by k, we should multiply the starting learning rate η by the square root of k as some theories may suggest. However, with experiments from multiple researchers, linear scaling shows better results, i.e. multiply the starting learning rate by k instead. | ||||
| 
 | ||||
| | BERT Large | 330M | | ||||
| | BERT Base | 110M | | ||||
| 
 | ||||
| Larger input vector size => smaller batch size => smaller learning rate => slower training | ||||
|  | ||||
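The learning-rate scaling rule quoted in the moved section can be sketched in a few lines of Python. This is a minimal illustration, not code from the wiki or from the linked article: the function name `scale_learning_rate`, its parameters, and the example base values (learning rate 1e-4 at batch size 256) are assumptions chosen only for the example.

```python
import math


def scale_learning_rate(base_lr, base_batch_size, new_batch_size, rule="linear"):
    """Rescale a starting learning rate when the mini-batch size changes.

    k is the factor by which the batch size is multiplied.
    rule="linear": multiply the learning rate by k.
    rule="sqrt":   multiply the learning rate by sqrt(k).
    """
    if base_batch_size <= 0 or new_batch_size <= 0:
        raise ValueError("batch sizes must be positive")
    k = new_batch_size / base_batch_size
    if rule == "linear":
        return base_lr * k
    if rule == "sqrt":
        return base_lr * math.sqrt(k)
    raise ValueError(f"unknown rule: {rule!r}")


# Hypothetical base configuration: lr = 1e-4 at batch size 256.
# Growing the batch 4x gives k = 4.
lr_linear = scale_learning_rate(1e-4, 256, 1024, rule="linear")
lr_sqrt = scale_learning_rate(1e-4, 256, 1024, rule="sqrt")

# Shrinking the batch (for example because longer inputs no longer fit
# in GPU memory) gives k < 1, so the learning rate shrinks as well.
lr_small = scale_learning_rate(1e-4, 256, 64, rule="linear")
```

Read in the other direction, the same arithmetic matches the last line of the section: if a larger input forces the batch size down by some factor, the linear rule suggests lowering the learning rate by that factor too, which slows training.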