title | published | taxonomy
---|---|---
Manohar Gowdru Shridharu | true |
Manohar Gowdru Shridharu
Beginning of the study: 2021
Dissertation Thesis
in 2023/24
Hate Speech Detection
Goals:
- Write a dissertation thesis
- Publish 2 A-class journal papers
Minimal Thesis
(preliminary dissertation and exam in 2022/23)
Goals:
- Provide state-of-the-art overview.
- Formulate dissertation theses (describe scientific contribution of the thesis).
- Prepare to reach the scientific contribution.
- Publish 4 conference papers.
First year of PhD study
Goals:
- Provide state-of-the-art overview.
- Read and make notes from at least 100 scientific papers or books.
- Publish at least 2 conference papers.
- Prepare for minimal thesis.
Resources:
- Hate Speech Project Page
- https://hatespeechdata.com/
- Hate speech detection: Challenges and solutions
- HateBase
- Resources and benchmark corpora for hate speech detection: a systematic review
Meeting 10.1.22
- Set up a GitHub account (https://github.com/ManoGS) with a script to prepare the "twitter" and "english" datasets for HS detection.
- Configured a laptop with Anaconda, PyCharm, PyTorch and CUDA; went through some basic Python tutorials.
- Read some blogs on how to use Kaggle (dataset database).
- Followed tutorials on HuggingFace Transformers, focusing on sentiment analysis.
Open tasks:
- Continue to work on the review - with datasets and methods (specified below).
- Read and make notes about transformers, neural language models and fine-tuning.
- Pick a feasible dataset and method to start with.
- You can use the school CUDA infrastructure (idoc.fei.tuke.sk).
- Set up a repository for experiments, use the school git server git.kemt.fei.tuke.sk.
- Get ready to submit a paper to the school PhD conference SCYR; the deadline is in the middle of February.
Meeting 16.12.21
- A report was provided (through Teams).
- Installed Anaconda and started a Transformers tutorial.
- Started the Dive into Python book.
Task:
- Report: Create a detailed list of available datasets for HS.
- Report: Create a detailed description of the state-of-the-art approaches for HS detection.
- Practical: Continue with the open tasks below (pick a dataset, perform classification, evaluate the experiment).
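The practical task (pick a dataset, perform classification) can be sketched with a very small baseline. This is only an illustration, assuming a bag-of-words multinomial Naive Bayes classifier written with the Python standard library; it is not the method chosen for the thesis. The example texts, the labels `offensive` / `neutral` and the function names `train_nb` and `predict_nb` are invented placeholders, and a real experiment would load one of the published HS datasets instead.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Lowercase and split on whitespace; a real experiment would need better preprocessing.
    return text.lower().split()


def train_nb(texts, labels):
    """Count documents per class and words per class for multinomial Naive Bayes."""
    class_counts = Counter(labels)
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in zip(texts, labels):
        tokens = tokenize(text)
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return class_counts, word_counts, vocab


def predict_nb(model, text):
    """Return the class with the highest log-probability (add-one smoothing)."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label in sorted(class_counts):
        score = math.log(class_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for tok in tokenize(text):
            if tok in vocab:  # words never seen in training are ignored
                score += math.log(
                    (word_counts[label][tok] + 1) / (total_words + len(vocab))
                )
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Invented toy data, only to make the sketch runnable.
train_texts = [
    "you are awful and stupid",
    "i hate you all",
    "have a nice day",
    "thank you for the help",
]
train_labels = ["offensive", "offensive", "neutral", "neutral"]
test_texts = ["you are stupid", "nice day for a walk"]
test_labels = ["offensive", "neutral"]

model = train_nb(train_texts, train_labels)
predictions = [predict_nb(model, t) for t in test_texts]
accuracy = sum(p == g for p, g in zip(predictions, test_labels)) / len(test_labels)
print(predictions, accuracy)
```

A baseline like this is mainly useful as a reference point: a fine-tuned transformer model should clearly outperform it on the same train/test split.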
Meeting 10.12.21
No report (just a draft) has been provided so far.
- Read the papers listed below and make notes about what you have learned from them. For each note, make a bibliographic citation: write down the authors, the title of the paper, the year, the publisher and other important information. When you find out something, make a numbered reference to that paper. You can use bibliographic manager software such as Mendeley, EndNote or JabRef.
- From the papers find out answers to the questions below.
- Pick a hate speech dataset.
- Pick an approach and Python library for HS classification.
- Create a Git repository and share your experiment files. Do not commit data files, only links showing how to download them.
- Perform and evaluate experiments.
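For the evaluation step, hate speech datasets are usually imbalanced, so accuracy alone can be misleading. A minimal sketch of per-class precision, recall and F1 plus macro-averaged F1 follows, using only the Python standard library. The label lists are hypothetical (1 = hate speech, 0 = not hate speech) and the function names `per_class_prf` and `macro_f1` are illustrative, not taken from any library.

```python
def per_class_prf(y_true, y_pred, label):
    """Precision, recall and F1 with `label` treated as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == label and p == label)
    fp = sum(1 for t, p in pairs if t != label and p == label)
    fn = sum(1 for t, p in pairs if t == label and p != label)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return precision, recall, f1


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores."""
    labels = sorted(set(y_true) | set(y_pred))
    return sum(per_class_prf(y_true, y_pred, l)[2] for l in labels) / len(labels)


# Hypothetical gold labels and system predictions.
y_true = [1, 0, 0, 1, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 0, 0, 1, 0]

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
precision, recall, f1 = per_class_prf(y_true, y_pred, label=1)
print(f"accuracy={accuracy:.3f} precision={precision:.3f} "
      f"recall={recall:.3f} f1={f1:.3f} macro_f1={macro_f1(y_true, y_pred):.3f}")
```

In a real experiment the same numbers can be obtained from an established library; the hand-written version is here only to make the definitions explicit.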
Meeting 10.11.21
First tasks
Prepare a report where you will explain:
- what is hate speech detection,
- where and why you can use hate-speech detection,
- what are state-of-the-art methods for hate speech detection,
- how you can evaluate a hate-speech detection system,
- what datasets for hate-speech detection are available.
The report should properly cite scientific bibliographical sources. Use a bibliography manager software, such as Mendeley.
Create a VPN connection to the university network to have access to the scientific databases. Use scientific indexes to discover literature:
Your review can start with:
- Hate speech detection: Challenges and solutions
- HateBase
- Resources and benchmark corpora for hate speech detection: a systematic review
Get to know the Python programming language
- Read Dive into Python
- Install Anaconda
- Try HuggingFace Transformers library