Publish In
International Journal of Advances in Electronics and Computer Science-IJAECS
Volume Issue
Volume-4, Issue-3 (Mar, 2017)
Paper Title
Cost-Efficient Clustering Techniques on Large Data Using R Environment
Author Name
Kanakamedala Vineela, Naralasetty Sandhya Rani, D. S. Bhupal Naik, Surabhi Chaitanya
Department of Computer Science & Engineering, Vignan’s Foundation for Science, Technology & Research, Vadlamudi, Guntur, Andhra Pradesh
Data mining is the computational process of discovering patterns in large datasets, such as the large volumes of data stored in information systems. Depending on the kind of pattern sought, data mining tasks are classified as predictive or descriptive analytics. Descriptive analytics analyzes past occurrences in the data and offers insight into how to proceed in the future; it is further divided into association, summarization, and clustering. Clustering is a form of exploratory data analysis. Various tools are available for the statistical computations involved in data mining. One of them, Weka, is an open-source toolkit that implements data mining algorithms. Weka has some disadvantages: it cannot handle large data, it cannot import a variety of data formats, and it is implemented only in the Java programming language. We therefore use the R programming environment for data analytics, whose main advantage is that it can handle big data. In this paper we compare various clustering algorithms in R to identify which is most feasible for handling large data.
Keywords - Data mining, Descriptive analytics, Clustering, R programming