DOIONLINE NO - IJMAS-IRAJ-DOIONLNE-10743

Published In
International Journal of Management and Applied Science (IJMAS)
Issue
Volume 4, Issue 1 (Jan 2018)
Paper Title
Author Disambiguation Based on Hierarchical Agglomerative Clustering in Heterogeneous Scholarly Data
Author Name
Jae-Wook Seol, Seok-Hyoung Lee, Hye-Jin Lee, Seo-Young Jeong, Jungsun Yoon, Kwang-Young Kim
Affiliation
Korea Institute of Science and Technology Information
Pages
16-21
Abstract
Many fields rely on several types of scholarly data to provide information to users. However, extracting information from data presented in different formats, or distinguishing between records produced by different authors who share the same name, is time-consuming and cumbersome. To address this issue, we identify author entities across different types of academic data (e.g., papers, patents, and reports) and offer users refined data by linking the author entities that appear in each type. Entity identification aims to match authors who share a name with the actual people they represent; it reduces the time and effort required to search for academic information and provides accurate information. The matching involves merging existing information from different formats into a single format. In this paper, to identify author entities, we extract bibliographic information related to the authors of academic papers and journals, together with similarities between authors, and apply the hierarchical agglomerative clustering method. To validate the proposed method, we identify entities using author data obtained from papers, patents, and reports published in Korea between 1948 and 2016. On this data, our system achieved a precision of 91.29%.
Index Terms - Author Disambiguation, Co-Author Network, Hierarchical Agglomerative Clustering
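The clustering idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the features, the linkage criterion, or the stopping rule, so the Jaccard similarity over co-author sets, the average linkage, the `threshold` value, and the sample records below are all assumptions chosen for illustration.

```python
from itertools import combinations

# Hypothetical records that all carry the same author name but come from
# different document types. Field names and values are illustrative only.
records = [
    {"type": "paper",  "coauthors": {"Lee S", "Park J"}},
    {"type": "patent", "coauthors": {"Lee S", "Park J", "Choi M"}},
    {"type": "report", "coauthors": {"Han D"}},
    {"type": "paper",  "coauthors": {"Han D", "Yoon K"}},
]

def jaccard(a, b):
    """Jaccard similarity of two sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def cluster_similarity(c1, c2, recs):
    """Average linkage: mean pairwise similarity between two clusters."""
    pairs = [(i, j) for i in c1 for j in c2]
    total = sum(jaccard(recs[i]["coauthors"], recs[j]["coauthors"])
                for i, j in pairs)
    return total / len(pairs)

def disambiguate(recs, threshold=0.3):
    """Merge the most similar clusters until none reaches the threshold."""
    clusters = [[i] for i in range(len(recs))]  # start with one record each
    while len(clusters) > 1:
        best_sim, (a, b) = max(
            (cluster_similarity(clusters[a], clusters[b], recs), (a, b))
            for a, b in combinations(range(len(clusters)), 2)
        )
        if best_sim < threshold:  # assumed stopping rule
            break
        clusters[a] = clusters[a] + clusters[b]  # a < b, so index a stays valid
        del clusters[b]
    return clusters

clusters = disambiguate(records)
print(clusters)  # [[0, 1], [2, 3]]
```

Each resulting cluster groups the record indices attributed to one person: here the paper and patent sharing two co-authors are merged, as are the report and paper sharing one, while the two groups stay separate because they have no co-authors in common.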