forked from KEMT/zpwiki
Update "pages/students/2016/maros_harahus/timovy_projekt/README.md"
parent c518bffc3b
commit d3be1b2bac
@@ -56,44 +56,44 @@ Tokenization is loosely defined as the segmentation of a text document into words
| X | other | iné |

| Tag  | Description                               | Slovak meaning   | Example                    |
|:----:|:-----------------------------------------:|:----------------:|:--------------------------:|
| CC   | conjunction, coordinating                 |                  | and, or, but               |
| CD   | cardinal number                           |                  | five, three, 13%           |
| DT   | determiner                                |                  | the, a, these              |
| EX   | existential there                         |                  | there were six boys        |
| FW   | foreign word                              |                  | mais                       |
| IN   | conjunction, subordinating or preposition |                  | of, on, before, unless     |
| JJ   | adjective                                 |                  | nice, easy                 |
| JJR  | adjective, comparative                    |                  | nicer, easier              |
| JJS  | adjective, superlative                    |                  | nicest, easiest            |
| LS   | list item marker                          |                  |                            |
| MD   | verb, modal auxiliary                     |                  | may, should                |
| NN   | noun, singular or mass                    |                  | tiger, chair, laughter     |
| NNS  | noun, plural                              |                  | tigers, chairs, insects    |
| NNP  | noun, proper singular                     |                  | Germany, God, Alice        |
| NNPS | noun, proper plural                       |                  | we met two Christmases ago |
| PDT  | predeterminer                             |                  | both his children          |
| POS  | possessive ending                         |                  | 's                         |
| PRP  | pronoun, personal                         |                  | me, you, it                |
| PRP$ | pronoun, possessive                       |                  | my, your, our              |
| RB   | adverb                                    |                  | extremely, loudly, hard    |
| RBR  | adverb, comparative                       |                  | better                     |
| RBS  | adverb, superlative                       |                  | best                       |
| RP   | adverb, particle                          |                  | about, off, up             |
| SYM  | symbol                                    |                  | %                          |
| TO   | infinitival to                            |                  | what to do                 |
| UH   | interjection                              |                  | oh, oops, gosh             |
| VB   | verb, base form                           |                  | think                      |
| VBZ  | verb, 3rd person singular present         |                  | she thinks                 |
| VBP  | verb, non-3rd person singular present     |                  | I think                    |
| VBD  | verb, past tense                          |                  | they thought               |
| VBN  | verb, past participle                     |                  | a sunken ship              |
| VBG  | verb, gerund or present participle        |                  | thinking is fun            |
| WDT  | wh-determiner                             |                  | which, whatever, whichever |
| WP   | wh-pronoun, personal                      |                  | what, who, whom            |
| WP$  | wh-pronoun, possessive                    |                  | whose, whosever            |
| WRB  | wh-adverb                                 |                  | where, when                |

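
To make the tag table easier to use in practice, here is a minimal sketch of looking up tag descriptions in Python. The dictionary holds only a subset of the rows in the table; the names `PENN_TAGS` and `describe`, and the hand-tagged sentence, are assumptions made for illustration and are not part of the original README or of spaCy.

```python
# A few tags and their descriptions, copied from the tag table in this README.
PENN_TAGS = {
    "CC": "conjunction, coordinating",
    "CD": "cardinal number",
    "DT": "determiner",
    "JJ": "adjective",
    "NN": "noun, singular or mass",
    "NNS": "noun, plural",
    "VB": "verb, base form",
    "VBD": "verb, past tense",
    "VBZ": "verb, 3rd person singular present",
}


def describe(tagged_tokens):
    """Return (token, tag, description) triples; tags not in the subset map to 'unknown tag'."""
    return [(token, tag, PENN_TAGS.get(tag, "unknown tag")) for token, tag in tagged_tokens]


# Hand-tagged example sentence (hypothetical input, not real tagger output).
sentence = [("the", "DT"), ("tigers", "NNS"), ("thought", "VBD")]
for token, tag, description in describe(sentence):
    print(f"{token:10} {tag:5} {description}")
```

A plain dictionary is enough here because the tag set is small and fixed; the same lookup could be applied to the tags a real tagger returns.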
@@ -101,7 +101,7 @@ Tokenization is loosely defined as the segmentation of a text document into words

# Installing spaCy

```bash
pip install -U spacy         # install spaCy
python -m spacy download en  # download the English model
```

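
After installation it can be useful to confirm that Python actually sees the package. The following is a small sketch using only the standard library; the helper name `is_installed` is an assumption for illustration, and the check only tells whether the package can be found, not whether a language model has been downloaded.

```python
import importlib.util


def is_installed(package_name):
    """Return True if a top-level package with this name can be found by the import system."""
    return importlib.util.find_spec(package_name) is not None


if is_installed("spacy"):
    print("spaCy is available in this environment")
else:
    print("spaCy was not found; re-run the pip command in this environment")
```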
@@ -185,6 +185,6 @@ for ent in doc.ents:
This example shows that spaCy can tell different kinds of words apart. It can determine whether a given word is, for example, a city (GPE, geopolitical entity) or an organization (ORG, companies). It can also recognize dates, amounts of money, persons, and so on.

# Conclusion

This semester I focused on the Python programming language, studying its syntax and the ecosystem around it. I then turned to the spaCy framework, which builds models for part-of-speech tagging, and examined what the framework can do and how it works. In the coming semesters I will work on Slovak language support for spaCy, specifically on tagging for Slovak. First I need to study the tag set of the Slovak National Corpus and the principle it is based on. In the summer semester I would like to create a mapping of the morphological tags of the Slovak National Corpus. One of the main points of my diploma thesis will be a survey of approaches to the morphological annotation of Slovak. I will also need to prepare data to work with in the thesis. The final part of my work will be to evaluate the tagging accuracy of the tool I create and to propose improvements that could be implemented in the future.

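
The planned evaluation of tagging accuracy can be sketched as a token-level comparison of predicted tags against a gold standard. This is only one possible way to measure accuracy, not the method the thesis will necessarily use; the function name `tagging_accuracy` and the example tag sequences are assumptions made for illustration.

```python
def tagging_accuracy(gold_tags, predicted_tags):
    """Return the fraction of tokens whose predicted tag equals the gold tag."""
    if len(gold_tags) != len(predicted_tags):
        raise ValueError("gold and predicted sequences must have the same length")
    if not gold_tags:
        return 0.0  # convention chosen here for empty input
    correct = sum(gold == predicted for gold, predicted in zip(gold_tags, predicted_tags))
    return correct / len(gold_tags)


# Hypothetical gold and predicted tags for a four-token sentence.
gold = ["DT", "NN", "VBZ", "JJ"]
predicted = ["DT", "NN", "VBD", "JJ"]
print(f"accuracy: {tagging_accuracy(gold, predicted):.2f}")
```

Token-level accuracy treats every token equally; a per-tag breakdown would show which tags the tagger confuses most often.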