Question Answering
Task definition:
- Create a clone of SQuAD 2.0 in the Slovak language
- Set up annotation infrastructure
- Perform and evaluate annotations
- Consider using machine translation
- Train and evaluate a Question Answering model
 
Tasks
Raw Data Preparation
Input: Wikipedia
Output: a set of paragraphs
- Obtaining and parsing the Wikipedia dump
- Selecting suitable paragraphs (see the sketch below)
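
A minimal sketch of both steps, assuming the Slovak `skwiki` XML dump and the `mwxml` and `mwparserfromhell` libraries; the file name, length bounds, and sample size are illustrative, not project decisions:

```python
import random

import mwxml               # pip install mwxml
import mwparserfromhell    # pip install mwparserfromhell

DUMP_PATH = "skwiki-latest-pages-articles.xml"  # assumed local dump file
MIN_CHARS, MAX_CHARS = 500, 2000                # illustrative length bounds


def iter_paragraphs(dump_path):
    """Yield (title, paragraph) pairs of plain-text paragraphs from articles."""
    dump = mwxml.Dump.from_file(open(dump_path, "rb"))
    for page in dump:
        if page.namespace != 0 or page.redirect:
            continue                        # keep only real article pages
        revision = next(iter(page), None)   # pages-articles dumps hold one revision
        if revision is None or revision.text is None:
            continue
        plain = mwparserfromhell.parse(revision.text).strip_code()
        for paragraph in plain.split("\n\n"):
            paragraph = paragraph.strip()
            if MIN_CHARS <= len(paragraph) <= MAX_CHARS:
                yield page.title, paragraph


# Random sampling avoids the geography bias of popularity-based ranking
# noted below.
pool = list(iter_paragraphs(DUMP_PATH))
selection = random.sample(pool, k=min(1000, len(pool)))
```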
 
Notes:
- PageRank-based selection causes a bias toward geography articles; random selection might be best
- 75 best articles
- 167 good articles
- Wiki Facts
 
Question Annotation
Input: A set of paragraphs
Output: A question for each paragraph
Answer Annotation
Input: A set of paragraphs and questions
Output: An answer for each paragraph-question pair (see the record sketch below)
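
Both annotation steps can target the SQuAD 2.0 JSON layout directly. One annotated record, shown as a Python dict; the Slovak text is only an illustrative placeholder, not real project data:

```python
record = {
    "title": "Bratislava",
    "paragraphs": [{
        "context": "Bratislava je hlavné mesto Slovenska.",
        "qas": [{
            "id": "sk-0001",
            "question": "Čo je hlavné mesto Slovenska?",
            # answer_start is the character offset of the answer in context
            "answers": [{"text": "Bratislava", "answer_start": 0}],
            # SQuAD 2.0 additionally allows unanswerable questions:
            "is_impossible": False,
        }],
    }],
}
```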
Annotation Summary
Annotation work summary
Input: Database of annotations
Output: Summary of work performed by each annotator
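
A sketch of the summary, assuming an SQLite database with a table `annotations(annotator, kind)` where `kind` is "question" or "answer"; the file, table, and column names are illustrative, not the project's actual schema:

```python
import sqlite3

import pandas as pd

con = sqlite3.connect("annotations.db")
work = pd.read_sql("SELECT annotator, kind FROM annotations", con)

# One row per annotator, one column per annotation kind, cells are counts.
summary = work.groupby(["annotator", "kind"]).size().unstack(fill_value=0)
print(summary)
```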
Annotation Manual
Output: Recommendations for annotators
Question Answering Model
Input: An annotated QA database
Output: An evaluated model for QA
Training the model with annotated data:
- Selecting an existing modelling approach
- Evaluation set selection
- Model evaluation
- Supporting the annotation with the model (pre-selecting answers; see the sketch after this list)
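
A sketch of answer pre-selection and evaluation using the Hugging Face `transformers` and `evaluate` libraries; the checkpoint name is an assumption (any multilingual extractive-QA model could serve as a starting point), and the example data matches the record sketch above:

```python
import evaluate                     # pip install evaluate
from transformers import pipeline   # pip install transformers

qa = pipeline("question-answering", model="deepset/xlm-roberta-base-squad2")

context = "Bratislava je hlavné mesto Slovenska."
question = "Čo je hlavné mesto Slovenska?"

# Pre-select a candidate answer for the annotator to confirm or correct.
candidate = qa(question=question, context=context)
print(candidate["answer"], candidate["score"])

# Standard SQuAD 2.0 evaluation (exact match and token-level F1).
metric = evaluate.load("squad_v2")
result = metric.compute(
    predictions=[{"id": "sk-0001", "prediction_text": candidate["answer"],
                  "no_answer_probability": 0.0}],
    references=[{"id": "sk-0001",
                 "answers": {"text": ["Bratislava"], "answer_start": [0]}}],
)
print(result["exact"], result["f1"])
```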
 
Supporting activities
Output: More annotations
Organizing voluntary student challenges to support the annotation process
Existing implementations
- https://github.com/facebookresearch/DrQA
- https://github.com/brmson/yodaqa
- https://github.com/5hirish/adam_qas
- https://github.com/WDAqua/Qanary - QA methodology and implementation
 
Bibliography
- Danqi Chen, Adam Fisch, Jason Weston, Antoine Bordes: Reading Wikipedia to Answer Open-Domain Questions. Facebook AI Research. https://arxiv.org/abs/1704.00051
- Pranav Rajpurkar, Jian Zhang, Konstantin Lopyrev, Percy Liang: SQuAD: 100,000+ Questions for Machine Comprehension of Text. https://arxiv.org/abs/1606.05250
 
Existing Datasets
- SQuAD: The Stanford Question Answering Dataset (Rajpurkar et al., 2016)
- WebQuestions
- https://en.wikipedia.org/wiki/Freebase