DOIONLINE

DOIONLINE NO - IJASEAT-IRAJ-DOIONLNE-17047

Published In
International Journal of Advances in Science, Engineering and Technology (IJASEAT)
Issue
Volume-8, Issue-1 (Jan, 2020)
Paper Title
Dimensionality Reduction for Classification of Filipino Text Documents Based on Improved Bayesian Vectorization Technique
Author Name
Hajah T. Sueno, Bobby D. Gerardo, Ruji P. Medina
Affiliation
Technological Institute of the Philippines, Quezon City, Philippines; West Visayas State University, Iloilo City, Philippines
Pages
56-60
Abstract
Dimensionality reduction plays a vital role in text processing: shrinking the feature vector used in mining tasks helps achieve higher classification accuracy. While dimensionality reduction for text classification is an active area of research in most languages, Filipino documents have received little or no attention from researchers. This paper therefore addresses dimensionality reduction in representing relevant data from Filipino texts using an improved Bayesian vectorization technique. To validate its effectiveness, the model was compared with the Term Frequency-Inverse Document Frequency (TF-IDF) method. The outcomes are reported using standard measures: precision, recall, F-score, and accuracy. The results show that the improved Bayesian vectorization performs significantly better, with 98% classification accuracy compared to 76% for the TF-IDF vectorization technique.
Keywords - Dimensionality Reduction, Bayesian Vectorization, Filipino Text Document, OPM Songs, Lyrics, Text Classification
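For readers unfamiliar with the baseline and the reported measures, the following is a minimal sketch of one common TF-IDF formulation (relative term frequency times log(N/df)) and of precision, recall, F-score, and accuracy for a single class. It does not reproduce the paper's improved Bayesian vectorization, which this page does not describe. The function names `tf_idf` and `binary_metrics`, the toy tokens, and the class labels are illustrative assumptions, and the paper's exact TF-IDF variant may differ.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per tokenized document.

    Assumed variant: weight = (count / doc length) * log(N / document frequency).
    """
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        vectors.append({
            term: (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return vectors


def binary_metrics(y_true, y_pred, positive):
    """Return (precision, recall, f_score, accuracy) for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f_score = (2 * precision * recall / (precision + recall)
               if precision + recall else 0.0)
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    return precision, recall, f_score, accuracy


# Hypothetical toy data, not taken from the paper's corpus.
docs = [["mahal", "kita"], ["mahal", "bayan"]]
vectors = tf_idf(docs)

y_true = ["love", "love", "sad", "sad"]
y_pred = ["love", "sad", "sad", "sad"]
precision, recall, f_score, accuracy = binary_metrics(y_true, y_pred, "love")
```

Under this variant a term that appears in every document (here `mahal`) gets weight zero, so it carries no discriminative signal, but the vector still has one dimension per vocabulary term. That vocabulary-sized representation is the kind of feature vector that dimensionality reduction aims to shrink.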