From d3be1b2bac571b6c92379e2b44948f60f97eef3b Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Maro=C5=A1=20Harahus?=
Date: Sat, 21 Dec 2019 19:18:39 +0000
Subject: [PATCH] =?UTF-8?q?Aktualizovat=20=E2=80=9Epages/students/2016/mar?=
 =?UTF-8?q?os=5Fharahus/timovy=5Fprojekt/README.md=E2=80=9C?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

---
 .../maros_harahus/timovy_projekt/README.md | 82 +++++++++----------
 1 file changed, 41 insertions(+), 41 deletions(-)

diff --git a/pages/students/2016/maros_harahus/timovy_projekt/README.md b/pages/students/2016/maros_harahus/timovy_projekt/README.md
index aefa1c19..92dc8ffa 100644
--- a/pages/students/2016/maros_harahus/timovy_projekt/README.md
+++ b/pages/students/2016/maros_harahus/timovy_projekt/README.md
@@ -56,44 +56,44 @@ Tokenization is loosely described as the segmentation of a text document into words
 | X | other | iné |
-| Tag | Description |Slovak meaning | Example |
-|------|-------------------------------------------|---|----------------------------|
-| CC | conjunction, coordinating | | and, or, but |
-| CD | cardinal number | | five, three, 13% |
-| DT | determiner | | the, a, these |
-| EX | existential there | | there were six boys |
-| FW | foreign word | | mais |
-| IN | conjunction, subordinating or preposition | | of, on, before, unless |
-| JJ | adjective | | nice, easy |
-| JJR | adjective, comparative | | nicer, easier |
-| JJS | adjective, superlative | | nicest, easiest |
-| LS | list item marker | | |
-| MD | verb, modal auxiliary | | may, should |
-| NN | noun, singular or mass | | tiger, chair, laughter |
-| NNS | noun, plural | | tigers, chairs, insects |
-| NNP | noun, proper singular | | Germany, God, Alice |
-| NNPS | noun, proper plural | | we met two Christmases ago |
-| PDT | predeterminer | | both his children |
-| POS | possessive ending | | 's |
-| PRP | pronoun, personal | | me, you, it |
-| PRP$ | pronoun, possessive | | my, your, our |
-| RB | adverb | | extremely, loudly, hard |
-| RBR | adverb, comparative | | better |
-| RBS | adverb, superlative | | best |
-| RP | adverb, particle | | about, off, up |
-| SYM | symbol | | % |
-| TO | infinitival to | | what to do |
-| UH | interjection | | oh, oops, gosh |
-| VB | verb, base form | | think |
-| VBZ | verb, 3rd person singular present | | she thinks |
-| VBP | verb, non-3rd person singular present | | I think |
-| VBD | verb, past tense | | they thought |
-| VBN | verb, past participle | | a sunken ship |
-| VBG | verb, gerund or present participle | | thinking is fun |
-| WDT | wh-determiner | | which, whatever, whichever |
-| WP | wh-pronoun, personal | | what, who, whom |
-| WP$ | wh-pronoun, possessive | | whose, whosever |
-| WRB | wh-adverb | | where, when |
+| Tag | Description | Slovak meaning | Example |
+|:----:|:-----------------------------------------:|:----------------:|:--------------------------:|
+| CC | conjunction, coordinating | | and, or, but |
+| CD | cardinal number | | five, three, 13% |
+| DT | determiner | | the, a, these |
+| EX | existential there | | there were six boys |
+| FW | foreign word | | mais |
+| IN | conjunction, subordinating or preposition | | of, on, before, unless |
+| JJ | adjective | | nice, easy |
+| JJR | adjective, comparative | | nicer, easier |
+| JJS | adjective, superlative | | nicest, easiest |
+| LS | list item marker | | |
+| MD | verb, modal auxiliary | | may, should |
+| NN | noun, singular or mass | | tiger, chair, laughter |
+| NNS | noun, plural | | tigers, chairs, insects |
+| NNP | noun, proper singular | | Germany, God, Alice |
+| NNPS | noun, proper plural | | we met two Christmases ago |
+| PDT | predeterminer | | both his children |
+| POS | possessive ending | | 's |
+| PRP | pronoun, personal | | me, you, it |
+| PRP$ | pronoun, possessive | | my, your, our |
+| RB | adverb | | extremely, loudly, hard |
+| RBR | adverb, comparative | | better |
+| RBS | adverb, superlative | | best |
+| RP | adverb, particle | | about, off, up |
+| SYM | symbol | | % |
+| TO | infinitival to | | what to do |
+| UH | interjection | | oh, oops, gosh |
+| VB | verb, base form | | think |
+| VBZ | verb, 3rd person singular present | | she thinks |
+| VBP | verb, non-3rd person singular present | | I think |
+| VBD | verb, past tense | | they thought |
+| VBN | verb, past participle | | a sunken ship |
+| VBG | verb, gerund or present participle | | thinking is fun |
+| WDT | wh-determiner | | which, whatever, whichever |
+| WP | wh-pronoun, personal | | what, who, whom |
+| WP$ | wh-pronoun, possessive | | whose, whosever |
+| WRB | wh-adverb | | where, when |
@@ -101,7 +101,7 @@ Tokenization is loosely described as the segmentation of a text document into words
 # Installing spaCy
-```bash
+```python
 pip install -U spacy  # install spaCy
 python -m spacy download en  # download the English model
 ```
@@ -185,6 +185,6 @@ for ent in doc.ents:
 In this example we can see that spaCy can distinguish between words. It can tell whether a given word is, for example, a city (GPE, geopolitical entity) or an organization (ORG, companies). It can also determine whether a word is a date, a sum of money, a person, and so on.
-
-
+# Conclusion
+This semester I focused on the Python programming language, studying its syntax and related features. I then turned to the spaCy framework, which builds models for part-of-speech tagging, and examined what the framework can do and how it works. In the following semesters I will work on Slovak language support for spaCy, specifically on creating tagging for Slovak. First I need to study the tag set of the Slovak National Corpus and how it works. In the summer semester I intend to create a mapping of the morphological tags of the Slovak National Corpus. The main points of my diploma thesis will be to prepare an overview of approaches to the morphological annotation of Slovak. I will also need to prepare data to work with in my diploma thesis. The final part of my work will be to evaluate the tagging accuracy of the tool I create and to propose improvements that could be implemented in the future.