diff --git a/pages/topics/question/README.md b/pages/topics/question/README.md
index 93a1122a..651a2952 100644
--- a/pages/topics/question/README.md
+++ b/pages/topics/question/README.md
@@ -1,11 +1,9 @@
 # Question Answering
 
-[Project repository](https://git.kemt.fei.tuke.sk/dano/annotation)
+[Project repository](https://git.kemt.fei.tuke.sk/dano/annotation) (private)
 
 ## Project Description
 
-Task definition:
-
 - Create a clone of [SQuaD 2.0](https://rajpurkar.github.io/SQuAD-explorer/) in the Slovak language
 - Setup annotation infrastructure with [Prodigy](https://prodi.gy/)
 - Perform and evaluate annotations of [Wikipedia data](https://dumps.wikimedia.org/backup-index.html).
@@ -129,12 +127,13 @@ TBD
 - Reading Wikipedia to Answer Open-Domain Questions, Danqi Chen, Adam Fisch, Jason Weston, Antoine Bordes
 Facebook Research
 - SQuAD: 100,000+ Questions for Machine Comprehension of Text https://arxiv.org/abs/1606.05250
+- [WDaqua](https://wdaqua.eu/our-work/) publications
 
 ## Existing Datasets
 
-- Squad TheStanfordQuestionAnsweringDataset(SQuAD) (Rajpurkar et al., 2016)
-- WebQuestions
-- https://en.wikipedia.org/wiki/Freebase
+- [Squad](https://rajpurkar.github.io/SQuAD-explorer/) The Stanford Question Answering Dataset(SQuAD) (Rajpurkar et al., 2016)
+- [WebQuestions](https://github.com/brmson/dataset-factoid-webquestions)
+- [Freebase](https://en.wikipedia.org/wiki/Freebase)
 
 ## Intern tasks
 
@@ -142,6 +141,7 @@ Week 1: Intro
 
 - Get acquainted with the project and Squad Database
 - Download the database and study the bibliography
+- Study [Prodigy annnotation](https://Prodi.gy) tool
 
 Week 2 and 3: Web Application
 
@@ -160,7 +160,7 @@ Select and train a working question answering system
 Output:
 
 - a deployment script with comments for a selected question answering system
-- a working training recipe (can use English ata), a script with comments or Jupyter Notebook
+- a working training recipe (can use English data), a script with comments or Jupyter Notebook
 - a trained model
 - evaluation of the model (if possible)