Implementation of the Unsupervised Data Augmentation Method for Hate Speech Text Detection in Indonesian

Christianto, Christianto (2021) Implementasi Metode Unsupervised Data Augmentation untuk Deteksi Teks Hate Speech dalam Bahasa Indonesia. Bachelor Thesis, Universitas Multimedia Nusantara.

Files (all available under License Creative Commons Attribution Non-commercial Share Alike):
HALAMAN_AWAL.pdf (504kB), restricted to registered users only
DAFTAR_PUSTAKA.pdf (188kB), restricted to registered users only
BAB_I.pdf (184kB)
BAB_II.pdf (338kB)
BAB_III.pdf (518kB), restricted to registered users only
BAB_IV.pdf (757kB), restricted to registered users only
BAB_V.pdf (113kB), restricted to registered users only
LAMPIRAN.pdf (454kB), restricted to registered users only

Abstract

The continued growth of social media indirectly helps its users spread hate speech. The speed and ease with which hate speech spreads should not be taken lightly, given that hate speech is one of the causes of the high prevalence of depression in Indonesia. Tweets posted on social media therefore need to be screened. However, the sheer number of tweets posted each day makes manual detection of hate speech impractical. Text classification addresses this problem by building a model that assigns a label to a text automatically. The method most often used to build such a model, however, is supervised learning, which is labor-intensive because training requires a large amount of labeled data. This study therefore implements the Unsupervised Data Augmentation method, which minimizes the use of labeled data during training. In testing, using 10% labeled data (1,316 of 13,169 samples), the resulting model achieved an accuracy of 78.8%, precision of 79.1%, recall of 77.9%, and an F1 score of 77%.
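
The idea behind Unsupervised Data Augmentation can be illustrated with a minimal sketch. It assumes the general formulation of the method: a supervised cross-entropy term on the small labeled set, plus a consistency term (KL divergence) that pushes the model to give similar predictions for an unlabeled text and its augmented version. The function names, the weight `lam`, and the example logits are hypothetical and are not taken from the thesis; the thesis itself uses BERT, which is replaced here by hand-written logits to keep the example self-contained.

```python
import math

def softmax(logits):
    """Convert raw scores into a probability distribution."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(logits, label):
    """Supervised loss for one labeled example."""
    return -math.log(softmax(logits)[label])

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) between two discrete distributions."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))

def uda_loss(labeled, unlabeled, lam=1.0):
    """Combine the supervised and consistency terms.

    labeled:   list of (logits, label) pairs
    unlabeled: list of (logits_original, logits_augmented) pairs
    lam:       weight of the consistency term (assumed value, not from the thesis)
    """
    sup = sum(cross_entropy(lg, y) for lg, y in labeled) / len(labeled)
    cons = sum(
        kl_divergence(softmax(orig), softmax(aug)) for orig, aug in unlabeled
    ) / len(unlabeled)
    return sup + lam * cons, sup, cons

# Hypothetical logits for a binary task (0 = not hate speech, 1 = hate speech).
labeled_batch = [([2.0, -1.0], 0), ([-0.5, 1.5], 1)]
unlabeled_batch = [([1.0, 0.0], [0.8, 0.1]), ([-1.0, 2.0], [0.5, 0.4])]

total, sup, cons = uda_loss(labeled_batch, unlabeled_batch)
print(f"supervised={sup:.4f} consistency={cons:.4f} total={total:.4f}")
```

In an actual training loop the prediction on the original unlabeled text is normally treated as a fixed target (no gradient flows through it), so only the prediction on the augmented text is pulled toward it. This sketch computes the loss values only and does not model gradients.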

Item Type:	Thesis (Bachelor Thesis)
Creators:	Christianto, Christianto (00000019431)
Contributors:	Rusli, Andre (0319069201); Christian Young, Julio
Keywords:	BERT, Binary Text Classification, Hate Speech, Text Augmentation, Unsupervised Data Augmentation
Subjects:	000 Computer Science, Information and General Works > 000 Computer Science, Knowledge and Systems > 005 Computer Programming > 005.2 Programming for Specific Computers, Algorithm, HTML, PHP, java, C++; 000 Computer Science, Information and General Works > 000 Computer Science, Knowledge and Systems > 006 Special Computer Methods > 006.8 Augmented Reality, Virtual Reality
Sustainable Development Goals:	Goal 04. Ensure inclusive and equitable quality education and promote lifelong learning; Goal 09. Build resilient infrastructure, promote inclusive and sustainable industrialization and foster innovation
Divisions:	Faculty of Engineering & Informatics > Informatics
Date Deposited:	24 Aug 2021 15:31
URI:	https://kc.umn.ac.id/id/eprint/16558

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.