title: Manohar Gowdru Shridharu
published: true
taxonomy:
    category: phd2024
    tag: lm, nlp, hatespeech
    author: Daniel Hladek

Manohar Gowdru Shridharu

Beginning of the study: 2021

Dissertation Thesis

in 2023/24

Hate Speech Detection

Goals:

  • Write a dissertation thesis
  • Publish 2 A-class journal papers

Minimal Thesis

(preliminary dissertation and exam in 2022/23)

Goals:

  • Provide state-of-the-art overview.
  • Formulate dissertation theses (describe scientific contribution of the thesis).
  • Prepare to reach the scientific contribution.
  • Publish 4 conference papers.

First year of PhD study

Goals:

  • Provide state-of-the-art overview.
  • Read and make notes from at least 100 scientific papers or books.
  • Publish at least 2 conference papers.
  • Prepare for minimal thesis.

Resources:

Meeting 10.1.22

  • Set up a GitHub account https://github.com/ManoGS with a script to prepare the "twitter" dataset and the "english" dataset for HS detection.
  • Configured a laptop with Anaconda / PyCharm, PyTorch and CUDA; went through some basic Python tutorials.
  • Read some blogs on how to use Kaggle (a dataset database).
  • Went through tutorials on Hugging Face Transformers to understand sentiment analysis.
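A quick way to confirm that the PyTorch and CUDA setup mentioned above works is a short sanity check. This is a minimal sketch: the helper name `describe_device` is made up for illustration, and the script degrades gracefully when PyTorch is not installed.

```python
import importlib.util


def describe_device() -> str:
    """Return a short description of the device PyTorch would use."""
    # Skip the check gracefully when PyTorch is not installed at all.
    if importlib.util.find_spec("torch") is None:
        return "PyTorch is not installed"

    import torch

    if torch.cuda.is_available():
        # Report the name of the first visible GPU.
        return f"CUDA: {torch.cuda.get_device_name(0)}"
    return "CPU only (CUDA not available)"


print(describe_device())
```

If the script reports "CPU only" on a machine with an NVIDIA GPU, the installed PyTorch build or the GPU driver is the usual place to look.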

Open tasks:

  • Continue to work on the review - with datasets and methods (specified below).
  • Read and make notes about transformers, neural language models and fine-tuning.
  • Pick feasible dataset and method to start with.
  • You can use the school CUDA infrastructure (idoc.fei.tuke.sk).
  • Set up a repository for experiments, use the school git server git.kemt.fei.tuke.sk.
  • Get ready to submit a paper to the school PhD conference SCYR; the deadline is in the middle of February.

Meeting 16.12.21

  • A report was provided (through Teams).
  • Installed Anaconda and started a Transformers tutorial.
  • Started the Dive into Python book.

Task:

  • Report: Create a detailed list of available datasets for HS.
  • Report: Create a detailed description of the state of the art approaches for HS detection.
  • Practical: Continue with the open tasks below (pick a dataset, perform classification, evaluate the experiment).
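The practical task above (pick a dataset, classify, evaluate) could begin with a very small baseline before moving to transformers. The sketch below is one possible starting point, not a prescribed method: a bag-of-words multinomial Naive Bayes classifier with add-one smoothing, written with the standard library only. The texts and the labels `offensive` / `neutral` are toy placeholders invented for illustration; a real experiment would load a published HS dataset instead.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


def train_naive_bayes(texts, labels):
    """Count documents per class and word occurrences per class."""
    class_counts = Counter(labels)
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in zip(texts, labels):
        tokens = tokenize(text)
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return class_counts, word_counts, vocab


def predict(model, text):
    """Return the class with the highest log-probability for the text."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, n_docs in class_counts.items():
        score = math.log(n_docs / total_docs)  # class prior
        total_words = sum(word_counts[label].values())
        for token in tokenize(text):
            if token not in vocab:
                continue  # ignore words never seen in training
            # Add-one (Laplace) smoothing avoids zero probabilities.
            count = word_counts[label][token] + 1
            score += math.log(count / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Toy placeholder data, NOT a real hate-speech dataset.
train_texts = [
    "you are awful and stupid",
    "i hate you people",
    "have a nice day",
    "thank you for the help",
]
train_labels = ["offensive", "offensive", "neutral", "neutral"]
test_texts = ["you are stupid", "nice help thank you"]
test_labels = ["offensive", "neutral"]

model = train_naive_bayes(train_texts, train_labels)
predictions = [predict(model, text) for text in test_texts]
correct = sum(p == t for p, t in zip(predictions, test_labels))
accuracy = correct / len(test_labels)
print(predictions, accuracy)
```

A baseline like this gives a reference score that later neural models should beat on the same held-out split.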

Meeting 10.12.21

No report (just a draft) was provided so far.

  1. Read the papers listed below and make notes on what you have learned from them. For each note, make a bibliographic citation: write down the authors, title, year, publisher and other important information. When you find out something, reference the paper by number. You can use bibliographic manager software such as Mendeley, EndNote or JabRef.
  2. From the papers find out answers to the questions below.
  3. Pick a hatespeech dataset.
  4. Pick an approach and Python library for HS classification.
  5. Create a Git repository and share your experiment files. Do not commit data files, only links for downloading them.
  6. Perform and evaluate experiments.
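One way to follow step 5 (commit links, not data) is to keep a small text manifest in the repository and fetch the files with a script. The sketch below assumes a made-up manifest format (one `name url` pair per line) and a placeholder URL on `example.org`; neither comes from the tasks above.

```python
from pathlib import Path, PurePosixPath
from urllib.parse import urlparse
from urllib.request import urlretrieve

# Hypothetical manifest: one "name url" pair per line, '#' starts a comment.
MANIFEST = """\
# name url
toy_dataset https://example.org/toy_dataset.csv
"""


def parse_manifest(text):
    """Return a list of (name, url) pairs from the manifest text."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        name, url = line.split(maxsplit=1)
        entries.append((name, url))
    return entries


def local_path(url, target_dir="data"):
    """Map a URL to a file inside the (untracked) data directory."""
    filename = PurePosixPath(urlparse(url).path).name
    return Path(target_dir) / filename


def download_all(entries, target_dir="data"):
    """Download every listed file that is not already present locally."""
    Path(target_dir).mkdir(parents=True, exist_ok=True)
    for name, url in entries:
        destination = local_path(url, target_dir)
        if destination.exists():
            continue  # already downloaded
        urlretrieve(url, str(destination))


entries = parse_manifest(MANIFEST)
print(entries)
# download_all(entries)  # uncomment once the manifest holds real URLs
```

Listing the `data/` directory in `.gitignore` keeps the downloaded files out of the repository while the manifest itself stays under version control.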

Meeting 10.11.21

First tasks

Prepare a report where you will explain:

  • what is hate speech detection,
  • where and why you can use hate-speech detection,
  • what are state-of-the-art methods for hate speech detection,
  • how can you evaluate a hate-speech detection system,
  • what datasets for hate-speech detection are available.
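For the evaluation question in the list above, precision, recall and F1 on the hate class are common starting points, since hate-speech datasets are often imbalanced and plain accuracy can be misleading. The following is a minimal sketch; the function name, the label strings `hate` / `none`, and the example predictions are invented for illustration.

```python
def precision_recall_f1(y_true, y_pred, positive="hate"):
    """Compute precision, recall and F1 for one (positive) class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)

    # Guard against division by zero when a class is never predicted or absent.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return precision, recall, f1


# Invented gold labels and system predictions, for illustration only.
y_true = ["hate", "hate", "none", "none", "hate", "none"]
y_pred = ["hate", "none", "none", "hate", "hate", "none"]

precision, recall, f1 = precision_recall_f1(y_true, y_pred)
print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
```

Here two of the three predicted hate labels are correct and two of the three true hate examples are found, so precision, recall and F1 all come out at 2/3.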

The report should properly cite scientific bibliographical sources. Use a bibliography manager software, such as Mendeley.

Create a VPN connection to the university network to have access to the scientific databases. Use scientific indexes to discover literature:

Your review can start with:

Get to know the Python programming language