Daniel Hládek 2020-06-11 14:43:08 +02:00
parent 01077d3731
commit d88f27564f

@ -1,11 +1,9 @@
# Question Answering
[Project repository](https://git.kemt.fei.tuke.sk/dano/annotation)
[Project repository](https://git.kemt.fei.tuke.sk/dano/annotation) (private)
## Project Description
Task definition:
- Create a clone of [SQuAD 2.0](https://rajpurkar.github.io/SQuAD-explorer/) in the Slovak language
- Set up annotation infrastructure with [Prodigy](https://prodi.gy/)
- Perform and evaluate annotations of [Wikipedia data](https://dumps.wikimedia.org/backup-index.html).
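
A minimal sketch of what a Slovak record in the SQuAD 2.0 JSON layout could look like, plus a small consistency check. The field names (`version`, `data`, `paragraphs`, `context`, `qas`, `is_impossible`, `answers`, `answer_start`) follow the published SQuAD 2.0 files; the example text, the ids, and the helper `find_bad_records` are hypothetical and serve only as illustration.

```python
import json

# Hypothetical example paragraph ("Košice is the second largest city in Slovakia.")
context = "Košice sú druhé najväčšie mesto na Slovensku."
answer_text = "Košice"

dataset = {
    "version": "v2.0",
    "data": [{
        "title": "Košice",
        "paragraphs": [{
            "context": context,
            "qas": [
                {   # answerable question: the answer is a span of the context
                    "id": "sk-0001",  # invented id
                    "question": "Ktoré mesto je druhé najväčšie na Slovensku?",
                    "is_impossible": False,
                    "answers": [{
                        "text": answer_text,
                        "answer_start": context.find(answer_text),
                    }],
                },
                {   # unanswerable question (new in SQuAD 2.0): no answer span
                    "id": "sk-0002",  # invented id
                    "question": "Kto je primátorom mesta?",
                    "is_impossible": True,
                    "answers": [],
                },
            ],
        }],
    }],
}


def find_bad_records(ds):
    """Return ids of questions whose answers do not match the context."""
    bad = []
    for article in ds["data"]:
        for para in article["paragraphs"]:
            for qa in para["qas"]:
                if qa["is_impossible"]:
                    if qa["answers"]:
                        bad.append(qa["id"])
                    continue
                for ans in qa["answers"]:
                    start = ans["answer_start"]
                    span = para["context"][start:start + len(ans["text"])]
                    if start < 0 or span != ans["text"]:
                        bad.append(qa["id"])
    return bad


# ensure_ascii=False keeps Slovak diacritics readable in the output file
serialized = json.dumps(dataset, ensure_ascii=False, indent=2)
```

Running `find_bad_records` on exported annotations is one way to catch character-offset errors before training.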
@ -129,12 +127,13 @@ TBD
- Reading Wikipedia to Answer Open-Domain Questions, Danqi Chen, Adam Fisch, Jason Weston, Antoine Bordes
Facebook Research
- SQuAD: 100,000+ Questions for Machine Comprehension of Text https://arxiv.org/abs/1606.05250
- [WDaqua](https://wdaqua.eu/our-work/) publications
## Existing Datasets
- Squad TheStanfordQuestionAnsweringDataset(SQuAD) (Rajpurkar et al., 2016)
- WebQuestions
- https://en.wikipedia.org/wiki/Freebase
- [SQuAD](https://rajpurkar.github.io/SQuAD-explorer/): The Stanford Question Answering Dataset (Rajpurkar et al., 2016)
- [WebQuestions](https://github.com/brmson/dataset-factoid-webquestions)
- [Freebase](https://en.wikipedia.org/wiki/Freebase)
## Intern tasks
@ -142,6 +141,7 @@ Week 1: Intro
- Get acquainted with the project and the SQuAD dataset
- Download the database and study the bibliography
- Study the [Prodigy annotation](https://Prodi.gy) tool
Week 2 and 3: Web Application
@ -160,7 +160,7 @@ Select and train a working question answering system
Output:
- a deployment script with comments for a selected question answering system
- a working training recipe (can use English ata), a script with comments or Jupyter Notebook
- a working training recipe (can use English data), a script with comments or Jupyter Notebook
- a trained model
- evaluation of the model (if possible)
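
For the evaluation output, one possible starting point is the pair of metrics usually reported for SQuAD: exact match and token-level F1. The sketch below is a simplified assumption, not the official evaluation script. The official script additionally strips English articles, which is omitted here because it does not apply to Slovak. The function names are illustrative.

```python
import string
from collections import Counter


def normalize(text):
    """Lowercase, drop ASCII punctuation, and collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    return " ".join(text.split())


def exact_match(prediction, truth):
    """1 if the normalized strings are identical, else 0."""
    return int(normalize(prediction) == normalize(truth))


def f1_score(prediction, truth):
    """Token-level F1 between a predicted and a reference answer."""
    pred_tokens = normalize(prediction).split()
    truth_tokens = normalize(truth).split()
    # For unanswerable questions both sides are empty: count that as a match.
    if not pred_tokens or not truth_tokens:
        return float(pred_tokens == truth_tokens)
    overlap = sum((Counter(pred_tokens) & Counter(truth_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(truth_tokens)
    return 2 * precision * recall / (precision + recall)
```

Averaging both scores over all questions gives dataset-level numbers; when a question has several reference answers, the maximum score over the references is typically used.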