Machine Learning Based Disambiguation of Author’s Names in ORCID Citations

Sleeman, Jumah

dc.contributor.advisor	Tamimi, Hashem
dc.contributor.author	Sleeman, Jumah
dc.date.accessioned	2018-10-22T08:18:27Z
dc.date.accessioned	2022-05-11T05:32:50Z
dc.date.available	2018-10-22T08:18:27Z
dc.date.available	2022-05-11T05:32:50Z
dc.date.issued	6/1/2018
dc.identifier.uri	http://test.ppu.edu/handle/123456789/887
dc.description	CD, no of pages 79, informatics 2/2018, 30943
dc.description.abstract	Author name disambiguation (AD) is a type of record linkage applied to scholarly documents. Ambiguity arises when an author publishes under more than one name variant, or when several authors share the same name. This makes it difficult to distinguish between the authors of scholarly documents or to group documents by author. Machine learning techniques address this challenge by training a classifier to identify all the documents belonging to a given author and to distinguish them from the works of other authors who share the same name. However, AD remains a major challenge because of the ever-increasing size of digital libraries and the lack of training examples that represent the whole domain. This study aims to provide a solution by using ORCID citations as a large and reliable source of training data. A comparison was made among a group of machine learning approaches: J48, DNN, Naive Bayes, and Random Forest. The experimental results show that the Random Forest classifier performed best among them, with almost 95% accuracy. In addition, the coauthors feature was the most important of the features considered, with an impact of 12.9% in eliminating ambiguity in authors' names.	en_US
dc.language.iso	en	en_US
dc.publisher	Palestine Polytechnic University - Informatics	en_US
dc.subject	Learning, ORCID	en_US
dc.title	Machine Learning Based Disambiguation of Author’s Names in ORCID Citations	en_US
dc.type	Other	en_US
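
The abstract singles out the coauthors feature as the most informative signal for disambiguation. The record does not say how the thesis computes its features, so the following is only a minimal sketch of one plausible approach: pairwise similarity features for two citations that name the same ambiguous author. The function names, the record fields (`authors`, `title`, `venue`), the use of Jaccard overlap, and the sample records are all assumptions made for illustration and are not taken from the thesis.

```python
def jaccard(a, b):
    """Jaccard similarity of two collections; 0.0 when both are empty."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def normalize(text):
    """Lowercase, drop periods, and collapse whitespace."""
    return " ".join(text.lower().replace(".", " ").split())


def pair_features(rec_a, rec_b, target):
    """Hypothetical similarity features for two citations that both list
    the ambiguous author name `target`."""
    target = normalize(target)
    # Coauthors are all listed authors except the ambiguous name itself.
    co_a = {normalize(n) for n in rec_a["authors"]} - {target}
    co_b = {normalize(n) for n in rec_b["authors"]} - {target}
    return {
        "coauthor_overlap": jaccard(co_a, co_b),
        "title_overlap": jaccard(normalize(rec_a["title"]).split(),
                                 normalize(rec_b["title"]).split()),
        "same_venue": float(normalize(rec_a["venue"]) == normalize(rec_b["venue"])),
    }


# Made-up citation records, for illustration only.
rec_a = {"authors": ["J. Smith", "A. Lee", "B. Chen"],
         "title": "Deep learning for record linkage", "venue": "JCDL"}
rec_b = {"authors": ["J. Smith", "A. Lee"],
         "title": "Record linkage at scale", "venue": "JCDL"}

features = pair_features(rec_a, rec_b, "J. Smith")
print(features)
```

Feature vectors of this kind, labeled with whether the two citations belong to the same ORCID iD, could then be passed to any of the classifiers named in the abstract, such as a random forest.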