Institute of Information Technology


Scientific works of ITI staff - conference abstract

Bibliographic description
Alguliyev, R.M. Batch Clustering Algorithm for Big Data Sets / R.M. Alguliyev, R.M. Aliguliyev // 2016 IEEE 10th International Conference on Application of Information and Communication Technologies (AICT). - Baku, 2016. - P. 79-82.
Abstract
The wide spread of computing technologies has led to an abundance of large data sets. Today, tech companies such as Google, Facebook, Twitter, and Amazon handle big data sets and log terabytes, if not petabytes, of data per day. Thus, there is a need to find similarities and define groupings among the elements of these big data sets. One way to find these similarities is data clustering. Several data clustering algorithms currently exist, differing in application area and efficiency. Increases in computational power and algorithmic improvements have reduced the time needed to cluster big data sets. However, big data sets often cannot be processed whole because of hardware and computational restrictions. In this paper, the classic k-means clustering algorithm is compared with the proposed batch clustering (BC) algorithm in terms of required computation time and objective function value. The BC algorithm is designed to cluster large data sets in batches while maintaining efficiency and quality. Several experiments confirm that the batch clustering algorithm for big data sets uses computational power and data storage more efficiently and yields better clustering than the k-means algorithm. The experiments are conducted on a data set of two million two-dimensional data points.
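The abstract does not spell out how the BC algorithm works, so the following is only a minimal sketch of one common way to cluster a data set in batches, not the authors' method. It assumes each batch is clustered with plain k-means, each batch is reduced to its centroids and cluster sizes, and a final weighted k-means merges those summaries. The names `kmeans`, `batch_cluster`, and `sse` are hypothetical, and `sse` stands in for the k-means objective function (sum of squared distances to the nearest centroid).

```python
import random


def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, weights=None, iters=20, seed=0):
    """Weighted Lloyd's k-means. Returns (centroids, cluster_weights)."""
    rng = random.Random(seed)
    k = min(k, len(points))  # a short final batch may hold fewer than k points
    if weights is None:
        weights = [1.0] * len(points)
    dims = len(points[0])
    centroids = rng.sample(points, k)  # initial centroids drawn from the data
    totals = [0.0] * k
    for _ in range(iters):
        sums = [[0.0] * dims for _ in range(k)]
        totals = [0.0] * k
        # Assignment step: each point goes to its nearest centroid.
        for p, w in zip(points, weights):
            j = min(range(k), key=lambda c: sq_dist(p, centroids[c]))
            totals[j] += w
            for d, x in enumerate(p):
                sums[j][d] += w * x
        # Update step: weighted mean; an empty cluster keeps its old centroid.
        centroids = [
            tuple(s / totals[j] for s in sums[j]) if totals[j] > 0 else centroids[j]
            for j in range(k)
        ]
    # totals holds the cluster weights from the last assignment pass.
    return centroids, totals


def batch_cluster(points, k, batch_size):
    """Assumed scheme: cluster each batch, then cluster the batch summaries."""
    summary, weights = [], []
    for start in range(0, len(points), batch_size):
        batch = points[start:start + batch_size]  # only this slice is processed
        cents, sizes = kmeans(batch, k)
        for c, s in zip(cents, sizes):
            if s > 0:  # skip clusters that ended up empty
                summary.append(c)
                weights.append(s)
    final, _ = kmeans(summary, k, weights)
    return final


def sse(points, centroids):
    """k-means objective: sum of squared distances to the nearest centroid."""
    return sum(min(sq_dist(p, c) for c in centroids) for p in points)
```

Under these assumptions, `batch_cluster(points, k=2, batch_size=100)` returns the merged centroids for a list of coordinate tuples, and `sse(points, centroids)` lets the result be compared against a single k-means run over the whole data set. Only one batch plus the small summary list is processed at a time, which is the property the abstract attributes to clustering in batches.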

Electronic version