Update 'pages/topics/question/README.md'

dano 2021-09-03 06:52:26 +00:00
parent 89ef614a6b
commit 0868ef694e


@ -11,6 +11,7 @@ taxonomy:
- [Project repository](https://git.kemt.fei.tuke.sk/dano/annotation) (private)
- [Annotation Manual for question annotation](navod)
- [Annotation Manual for validations](validacie)
- [Annotation Manual for unanswerable questions](nezodpovedatelne)
- [Summary database application](https://app.question.tukekemt,xyz)
@ -63,6 +64,19 @@ Notes:
- [167 good articles](https://sk.wikipedia.org/wiki/Wikip%C3%A9dia:Zoznam_dobr%C3%BDch_%C4%8Dl%C3%A1nkov)
- [Wiki Facts](https://sk.wikipedia.org/wiki/Wikip%C3%A9dia:Zauj%C3%ADmavosti)
## Finished Tasks
### Annotation Manual
Output: Recommendations for annotators
Done:
- Web Page for annotators (Daniel Hládek)
- Motivation video (Daniel Hládek)
- Video with instructions (Daniel Hládek)
### Question Annotation
An annotation recipe for Prodigy
@ -79,15 +93,6 @@ Done:
- answer annotation together with question (Daniel Hládek)
- prepare final input paragraphs (dataset)
In progress:
- More annotations (volunteers and workers).
To be done:
- Prepare development set
### Annotation Web Application
Annotation work summary, web application
@ -104,11 +109,6 @@ Done:
- application deployment (Daniel Hládek)
- extract annotations from question annotation in SQuAD format (Daniel Hládek)
To be done:
- review of validations
### Annotation Validation
Input: annotated questions and paragraphs
@ -120,60 +120,53 @@ Done:
- Recipe for validations (binary annotation for paragraphs, questions, and answers; text fields for correcting the question and answer). (Daniel Hládek)
- Deployment
To be done:
- Prepare for production
## Tasks in progress
### Annotation Manual
### Unanswerable question annotation
Output: Recommendations for annotators
Input: validated questions and answers
Output: Unanswerable questions and answers
Done:
- Web Page for annotators (Daniel Hládek)
- Motivation video (Daniel Hládek)
- Video with instructions (Daniel Hládek)
- Annotation manual
- Annotation interface
- Database schema modifications
- Modification of the database application
- Export of validations
In progress:
- Should the instructions be part of the annotation web application?
- Annotation process optimization
### Question Answering Model
### Final Data Export
Training the model with annotated data
Input: Validations and unanswerable questions
Input: An annotated QA database
Output: Final database in SQuAD format
Output: An evaluated model for QA
Done:
- Preliminary export script
To be done:
- Selecting existing modelling approach
- Evaluation set selection
- Model evaluation
- Supporting the annotation with the model (pre-selecting answers)
- Final export script
- Database web visualization
- Prepare development set
In progress:
## Resources
- Preliminary model (Ján Staš and Matej Čarňanský)
## Existing implementations
- https://github.com/facebookresearch/DrQA
- https://github.com/brmson/yodaqa
- https://github.com/5hirish/adam_qas
- https://github.com/WDAqua/Qanary - methodology and implementation of QA
## Bibliography
### Bibliography
- Reading Wikipedia to Answer Open-Domain Questions, Danqi Chen, Adam Fisch, Jason Weston, Antoine Bordes
Facebook Research
- SQuAD: 100,000+ Questions for Machine Comprehension of Text https://arxiv.org/abs/1606.05250
- [WDaqua](https://wdaqua.eu/our-work/) publications
## Existing Datasets
### Existing Datasets
- [SQuAD](https://rajpurkar.github.io/SQuAD-explorer/) The Stanford Question Answering Dataset (SQuAD) (Rajpurkar et al., 2016)
- [WebQuestions](https://github.com/brmson/dataset-factoid-webquestions)
@ -210,3 +203,24 @@ Output:
- a trained model
- evaluation of the model (if possible)
### Question Answering Model
Training the model with annotated data
Input: An annotated QA database
Output: An evaluated model for QA
To be done:
- Selecting existing modelling approach
- Evaluation set selection
- Model evaluation
- Supporting the annotation with the model (pre-selecting answers)
In progress:
- Preliminary model (Ján Staš and Matej Čarňanský)