DOIONLINE

DOIONLINE NO - IJAECS-IRAJ-DOIONLINE-18440

Published In
International Journal of Advances in Electronics and Computer Science-IJAECS
Issue
Volume 9, Issue 2 (Feb 2022)
Paper Title
Spanish Text Classification with Bert
Author Name
Hu Hang, Tad Gonsalves
Affiliation
Department of Information & Communication Sciences, Faculty of Science and Technology, Sophia University, Tokyo, Japan
Pages
26-30
Abstract
This paper presents a training process for classifying Spanish news content. The research is conducted as part of the Spanish e-learning support project, which aims to promote the study of Spanish linguistics and support the development of Spanish teaching. The BERT model is among the most popular pre-trained models used in natural language processing. We start by training a BERT classifier and analyzing its performance on different datasets. We first apply the EDA augmentation method to the Spanish text, which improves classification accuracy. We achieve over 90% accuracy on news topic classification tasks with two datasets. We also conduct further research on how performance differs across training setups and identify an intra-domain migration problem of the model during training. The results reflect the impact of dataset differences on model performance, which highlights the issue of model migration within a domain.
Keywords
Natural Language Processing, Data Augmentation, Visualization, Intra-Domain Migration
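For readers unfamiliar with EDA (Easy Data Augmentation), the sketch below illustrates two of its four standard operations, random swap and random deletion, applied to a whitespace-tokenized Spanish sentence. It is an illustration only and not the authors' implementation: the function names, the default parameter values, and the example sentence are assumptions made here, and the two synonym-based EDA operations are omitted because they would require a Spanish thesaurus that the abstract does not specify.

```python
import random


def random_swap(words, n_swaps, rng):
    """Swap two randomly chosen positions, repeated n_swaps times."""
    words = list(words)
    if len(words) < 2:
        return words
    for _ in range(n_swaps):
        i, j = rng.sample(range(len(words)), 2)
        words[i], words[j] = words[j], words[i]
    return words


def random_deletion(words, p, rng):
    """Drop each word with probability p, always keeping at least one word."""
    if len(words) <= 1:
        return list(words)
    kept = [w for w in words if rng.random() >= p]
    return kept if kept else [rng.choice(words)]


def augment(sentence, n_aug=4, p_delete=0.1, n_swaps=1, seed=0):
    """Return n_aug augmented variants, alternating swap and deletion.

    The defaults are illustrative assumptions, not values from the paper.
    """
    rng = random.Random(seed)
    words = sentence.split()
    variants = []
    for k in range(n_aug):
        if k % 2 == 0:
            new_words = random_swap(words, n_swaps, rng)
        else:
            new_words = random_deletion(words, p_delete, rng)
        variants.append(" ".join(new_words))
    return variants


if __name__ == "__main__":
    # Hypothetical example sentence, not taken from the paper's datasets.
    for variant in augment("el gobierno anunció nuevas medidas económicas"):
        print(variant)
```

Each augmented variant would keep the topic label of its source sentence, so the training set grows without additional annotation.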