| Scientific works of ITI staff - article |
| Bibliographic description | | Alguliyev, R.M. Ibk-means: An iterative batch k-means algorithm for big data clustering / R.M. Alguliyev, R.M. Aliguliyev, A.M. Bagirov // Kibernetika. - 2025. - N: 4, vol 61. - P. 492-508. | | Abstract | | Information technologies such as social media, mobile computing, and the realization of the
industrial Internet of Things (IoT) produce huge amounts of data every day. The development
of powerful tools for knowledge-discovery is imperative to deal with such a volume of data.
Clustering methods are among the most important knowledge-discovery techniques. The growth
in computational power and algorithmic developments allow us to efficiently and accurately
solve clustering problems in large datasets. However, these developments are insufficient to deal
with clustering problems in big datasets. This is because these datasets cannot be processed as a
whole due to hardware and computational restrictions. In this paper, an iterative batch k-means
(ibk-means) algorithm is proposed that yields good clustering results at low computational
cost on big datasets. It is designed to cluster datasets using batch data. The efficiency and
accuracy of the proposed algorithm are investigated as functions of the batch size, the
number of attributes, and the number of clusters. The algorithm is compared with the classic k-means and
mini-batch k-means algorithms using computational results on several real-world datasets, all
of which are available from the UCI Machine Learning Repository. The smallest dataset has
500,000 data points and 2 attributes, and the largest contains 43,930,257 data points and
16 attributes. The results demonstrate that the ibk-means algorithm outperforms both the
k-means and mini-batch k-means algorithms in terms of both efficiency and accuracy and that
it is applicable to the clustering of big datasets. The proposed algorithm provides real-time
clustering and may have direct applications in expert and intelligent systems. Furthermore,
the results of this paper will have a clear impact on the design of more accurate and
efficient clustering algorithms for big datasets that take available computer resources into account. | | Electronic version | | Electronic version |
|
________
© ict.az http://ict.az/az
|
|