Wednesday, July 15, 2020


Author :  Diptarka Saha

Affiliation :  WalmartLabs, Bengaluru, Karnataka

Country :  India

Category :  Computer Science & Information Technology

Volume, Issue, Month, Year :  8, 13, September, 2018

Abstract :

Cluster analysis and anomaly detection are the primary methods for database mining. However, most of today's data, generated from multifarious sources, does not adhere to the assumption of a single, or even a known, distribution. Finding clusters in such data is arduous: the clusters differ widely in size, density and shape, and the data contain noise and outliers. We therefore propose a relative-KNN-kernel density-based clustering algorithm. The unclustered (noise) points are further classified as anomalous or non-anomalous using a weighted rank-based anomaly detection method. The method works particularly well when the clusters differ in variability and shape: in these cases our algorithm finds not only the "dense" clusters that other clustering algorithms find, but also the low-density clusters that those approaches fail to identify. This more accurate clustering in turn reduces the number of noise points and makes the anomaly detection more accurate.
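
The abstract does not spell out the relative-KNN-kernel density itself, so the following is only a minimal sketch of the general idea, under stated assumptions and not the author's algorithm. It assumes a simple inverse-mean-distance k-NN density estimate and defines a "relative" density as a point's density divided by the mean density of its k nearest neighbours. The function names `knn_density` and `relative_knn_density` and the parameter `k` are hypothetical, introduced here for illustration. The point of the sketch is that a sparse cluster and a dense cluster both score close to 1 on the relative measure, while an isolated point scores far below 1.

```python
import numpy as np


def knn_density(points, k):
    """Assumed estimator: inverse of the mean distance to the k nearest neighbours."""
    diffs = points[:, None, :] - points[None, :, :]
    dists = np.sqrt((diffs ** 2).sum(axis=-1))
    # Exclude each point from its own neighbour list.
    np.fill_diagonal(dists, np.inf)
    neighbours = np.argsort(dists, axis=1)[:, :k]
    knn_dists = np.take_along_axis(dists, neighbours, axis=1)
    # Small constant guards against division by zero for duplicate points.
    density = 1.0 / (knn_dists.mean(axis=1) + 1e-12)
    return density, neighbours


def relative_knn_density(points, k):
    """Density of each point divided by the mean density of its k neighbours.

    Illustrative assumption only: values near 1 mean the point is about as
    dense as its surroundings, whatever the absolute density of its cluster.
    """
    density, neighbours = knn_density(points, k)
    return density / density[neighbours].mean(axis=1)


def make_grid(spacing, offset, side=5):
    """Regular side x side grid, used as a toy cluster."""
    axis = np.arange(side) * spacing
    xx, yy = np.meshgrid(axis, axis)
    return np.column_stack([xx.ravel(), yy.ravel()]) + np.asarray(offset)


if __name__ == "__main__":
    dense = make_grid(0.1, (0.0, 0.0))      # tight cluster
    sparse = make_grid(1.0, (20.0, 20.0))   # loose cluster
    outlier = np.array([[10.0, 50.0]])      # isolated point
    data = np.vstack([dense, sparse, outlier])

    raw, _ = knn_density(data, k=4)
    rel = relative_knn_density(data, k=4)
    print("raw density, dense vs sparse:", raw[:25].mean(), raw[25:50].mean())
    print("relative density, dense vs sparse:", rel[:25].mean(), rel[25:50].mean())
    print("relative density, outlier:", rel[50])
```

A fixed threshold on the raw density, as in a single-radius density-based method, would have to choose between the two clusters; a threshold on the relative value can keep both and still flag the isolated point as noise, which a separate anomaly-scoring step could then examine.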

Keywords :  Clustering, Relative KNN – kernel density, Varying density clusters, Anomaly Detection, DBSCAN
