Update 'pages/topics/resources/README.md'

dano 2020-05-04 15:20:36 +00:00
parent ab5ebfccf6
commit d5280e1a82

@@ -43,6 +43,7 @@ Europarlament
- [Aranea](http://ucts.uniba.sk/aranea_about/)
- [SkTenTen](https://www.sketchengine.eu/sktenten-slovak-corpus/) automatically POS-annotated, accessible through a web interface
- [CommonCrawl](https://commoncrawl.org/2020/03/february-2020-crawl-archive-now-available/) Does it also contain Slovak data?
- [Oscar](https://traces1.inria.fr/oscar/) classification and deduplication of CommonCrawl data, also available for Slovak (4.5 GB deduplicated, 665M words deduplicated)
### Wikipedia