Graph Database Indexing with Automatic Selection of Clustering Model Parameters
Abstract
A graph database indexing method based on pattern clustering and automatic selection of mixture model parameters is proposed. The traditional Expectation Maximization (EM) algorithm is an effective way to estimate parameters in mixture model clustering, but it requires the number of components (clusters) to be fixed in advance, which reduces both the accuracy and the efficiency of high-dimensional data indexing. The proposed method combines an improved CEM2 (Component-wise EM of Mixture) algorithm for automatic mixture model selection with vector quantization and a probabilistic approximation indexing mechanism. Experimental results show that retrieval efficiency improves while the true positive rate stays high.
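To make the idea of automatic component selection concrete, the following minimal sketch fits a 1-D Gaussian mixture that starts with deliberately too many components and lets weakly supported ones drop out during EM. It is not the paper's improved CEM2 algorithm: it uses ordinary batch EM rather than component-wise updates, and it borrows only the penalised weight update from the mixture-learning literature cited in [9]. The function name `fit_mixture_with_pruning`, its parameters, and the synthetic data are all assumptions made for illustration.

```python
import math
import random


def normal_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def fit_mixture_with_pruning(data, k_max=6, n_iter=200,
                             params_per_component=2, min_var=1e-3):
    """Batch EM for a 1-D Gaussian mixture with component annihilation.

    Hypothetical helper: each component's effective sample count is reduced
    by params_per_component / 2 before the weights are renormalised, and a
    component whose penalised count reaches zero is removed.
    """
    n = len(data)
    lo, hi = min(data), max(data)
    grand_mean = sum(data) / n
    grand_var = max(min_var, sum((x - grand_mean) ** 2 for x in data) / n)
    # Start with more components than needed, spread evenly over the data range.
    comps = [
        {"w": 1.0 / k_max,
         "mean": lo + (hi - lo) * (k + 0.5) / k_max,
         "var": grand_var}
        for k in range(k_max)
    ]
    for _ in range(n_iter):
        # E-step: responsibility of every component for every point.
        resp = []
        for x in data:
            dens = [c["w"] * normal_pdf(x, c["mean"], c["var"]) for c in comps]
            total = sum(dens) or 1e-300  # guard against numerical underflow
            resp.append([d / total for d in dens])
        # M-step: penalised weights; components with zero weight are dropped.
        counts = [sum(r[k] for r in resp) for k in range(len(comps))]
        penalised = [max(0.0, c - params_per_component / 2.0) for c in counts]
        total_pen = sum(penalised)
        if total_pen == 0.0:
            break  # nothing would survive; keep the current components
        survivors = []
        for k in range(len(comps)):
            if penalised[k] == 0.0:
                continue
            mean = sum(r[k] * x for r, x in zip(resp, data)) / counts[k]
            var = sum(r[k] * (x - mean) ** 2 for r, x in zip(resp, data)) / counts[k]
            survivors.append({"w": penalised[k] / total_pen,
                              "mean": mean,
                              "var": max(min_var, var)})
        comps = survivors
    return comps


# Synthetic example: two well-separated clusters, fixed seed for repeatability.
rng = random.Random(0)
data = ([rng.gauss(0.0, 1.0) for _ in range(200)]
        + [rng.gauss(10.0, 1.0) for _ in range(200)])
components = fit_mixture_with_pruning(data)
for c in components:
    print(f"weight={c['w']:.3f} mean={c['mean']:.2f} var={c['var']:.2f}")
```

The caller supplies only an upper bound `k_max` rather than the exact number of clusters, which is the property the abstract attributes to automatic model selection. The penalty in this sketch is mild, so it may keep more components than the data truly contain; a full method would also compare candidate models with a selection criterion.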
References
[1] Fayyad U, Piatetsky-Shapiro G, Smyth P, et al. Advances in knowledge discovery and data mining[M]. [S.l.]: MIT Press, 1996.
[2] Lim Y W, Lee S U. On the color image segmentation algorithm based on the thresholding and the fuzzy C-means techniques[J]. Pattern Recognition, 1990, 23(9): 935-952.
[3] Uchiyama T, Arbib M A. Color image segmentation using competitive learning[J]. IEEE Trans Pattern Analysis and Machine Intelligence, 1994, 16(12).
[4] Banfield J, Raftery A. Model-based Gaussian and non-Gaussian clustering[J]. Biometrics, 1993, 49: 803-821.
[5] MacQueen J B. Some methods for classification and analysis of multivariate observations[C]//Proc Fifth Berkeley Symp Math Statistics and Probability. Berkeley, Calif: Univ of California Press, 1967: 281-297.
[6] Dempster A P, Laird N M, Rubin D B. Maximum likelihood from incomplete data via the EM algorithm[J]. J Royal Statistical Soc: Series B, 1977, 39(1): 1-38.
[7] Duda R O, Hart P E. Pattern classification and scene analysis[M]. Wiley, 1973.
[8] Huet B, Hancock E R. Relational object recognition from large structural libraries[J]. Pattern Recognition, 2002, 35: 1895-1915.
[9] Figueiredo M A T, Jain A K. Unsupervised learning of finite mixture models[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2002, 24(3).
[10] Oliver J, Baxter R, Wallace C. Unsupervised learning using MML[C]//Proc 13th Int'l Conf Machine Learning, 1996: 364-372.
[11] Titterington D, Smith A, Makov U. Statistical analysis of finite mixture distributions[M]. Chichester, U.K.: John Wiley & Sons, 1985.
[12] Bennett K P, Fayyad U, Geiger D. Density-based indexing for approximate nearest-neighbor queries, MSR-TR-98-58[R]. Microsoft Research, 1999.
[13] Ye Hangjun. Research on indexing and retrieval mechanisms for large-scale image databases[J]. Tsinghua University, Computer Application Technology, 2005: 21-67.
[14] Fu Yan, Zhao Rongchun. Time-space variant dip-angle KL transform for eliminating coherent noise in post-stack seismic records[J]. Geophysical and Geochemical Exploration, 2002, 26(2): 1.
[15] Lloyd S P. Least squares quantization in PCM[J]. IEEE Transactions on Information Theory, 1982, 28: 127-135.