Research on Clustering Methods for Large-Scale Datasets and Their Applications
Abstract
Clustering has long been a topical subject in pattern recognition, and a great many clustering methods have emerged. Most of them perform well on the small datasets suited to their particular characteristics, but on large datasets they are often inefficient, impractical, or even infeasible to run. Motivated by this predicament of clustering in large-scale data environments, this thesis carries out the relevant research and proposes four clustering methods for large datasets together with one fundamental theorem, summarized as follows.
Chapter 2 first derives the Constrained Graph-based Relaxed Clustering (CGRC) algorithm by introducing a constraint condition and a linear term into the objective expression of the Graph-based Relaxed Clustering (GRC) algorithm. Since CGRC can be viewed as a center-constrained Minimum Enclosing Ball (MEB) problem, the Fast Graph-based Relaxed Clustering (FGRC) algorithm then follows from the core-set-based fast MEB approximation technique; FGRC's greatest merit is an asymptotic time complexity linear in the sample size. Probability density estimation is one of the foundations of pattern recognition on which much follow-up work builds; this is true of both the Fast Adaptive Similarity-based Clustering Method (FASCM) of Chapter 3 and the Fast Mean Shift Spectral Clustering (FMSSC) algorithm of Chapter 4, each of which rests on the Fast Reduced Set Density Estimator (FRSDE).
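As a rough illustration of the core-set-based MEB approximation that FGRC builds on, the sketch below implements the classical Badoiu-Clarkson iteration for an approximate minimum enclosing ball. It is a minimal sketch under Euclidean assumptions, not the thesis's FGRC code; the function name and defaults are illustrative.

```python
import numpy as np

def approx_meb(X, eps=0.1):
    """Badoiu-Clarkson (1+eps)-approximation of the minimum enclosing ball.

    Runs ceil(1/eps^2) passes over the data; each pass moves the center
    a shrinking step toward the current farthest point, which is the
    core-set idea behind scaling MEB-type problems.
    """
    X = np.asarray(X, dtype=float)
    c = X[0].copy()
    for i in range(1, int(np.ceil(1.0 / eps**2)) + 1):
        d = np.linalg.norm(X - c, axis=1)
        p = X[d.argmax()]           # farthest point: a core-set member
        c += (p - c) / (i + 1)      # step size 1/(i+1) yields the (1+eps) bound
    return c, np.linalg.norm(X - c, axis=1).max()
```

Each pass touches every point once, so the total cost is O(N/eps^2), which is linear in N for a fixed eps; this is the scaling property that FGRC inherits.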
Chapter 3 first proves that the similarity measure of the Similarity-based Clustering Method (SCM) is equivalent to a probability density estimator with a Gaussian kernel, so FRSDE can quickly produce a similarity function with sparse weight coefficients, greatly reducing the computational cost of the SCA phase inside SCM. The agglomerative hierarchical clustering step is then replaced with graph-based relaxed clustering, which makes the algorithm self-adaptive, removes its dependence on manual experience, and improves its practicality. This is the main idea of FASCM.
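In standard notation (the symbols below are assumed, following the usual Gaussian-kernel form rather than the thesis's exact expressions), the correspondence Chapter 3 exploits is

$$ s(\mathbf{x}) = \frac{1}{N}\sum_{i=1}^{N} \exp\!\left(-\frac{\|\mathbf{x}-\mathbf{x}_i\|^2}{2\sigma^2}\right) \propto \hat{p}(\mathbf{x}), \qquad \tilde{s}(\mathbf{x}) = \sum_{j=1}^{M} \beta_j \exp\!\left(-\frac{\|\mathbf{x}-\mathbf{r}_j\|^2}{2\sigma^2}\right), \quad M \ll N, $$

where the left-hand expression is the SCM similarity measure, proportional to a Gaussian kernel density estimate $\hat{p}$; FRSDE returns the sparse reduced set $\{(\beta_j, \mathbf{r}_j)\}$ on the right, so each evaluation during the SCA phase costs O(M) instead of O(N).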
Chapter 4 identifies the Parzen window (PW) density estimator as the root of the heavy computational cost of the original Mean Shift Spectral Clustering (MSSC) algorithm. The chapter therefore redesigns the MSSC framework, substituting FRSDE for the PW estimator and the CGRC algorithm of Chapter 2 for the simple mode-merging step, and proposes the Fast Mean Shift Spectral Clustering (FMSSC) algorithm. FMSSC is markedly more practical than MSSC, and its overall time complexity is approximately linear in the sample size.
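Below is a minimal sketch of the mode-seeking step on a weighted Gaussian KDE, assuming the pairs (centers, betas) come from an FRSDE-like reduced-set estimator; with uniform weights over all N points it degenerates to the Parzen-window mean shift whose cost the chapter attacks. Names and defaults are illustrative.

```python
import numpy as np

def shift_to_mode(x, centers, betas, sigma, tol=1e-6, max_iter=200):
    """Mean shift iteration on a weighted Gaussian KDE.

    centers, betas: reduced-set points and weights; sigma: kernel width.
    Each update moves x to the kernel-weighted mean of the centers, which
    converges to a mode of the estimated density. One step costs O(M)
    for M reduced-set points, versus O(N) under the full Parzen window.
    """
    x = np.asarray(x, dtype=float)
    for _ in range(max_iter):
        k = betas * np.exp(-((centers - x) ** 2).sum(axis=1) / (2 * sigma**2))
        x_new = (k[:, None] * centers).sum(axis=0) / k.sum()
        if np.linalg.norm(x_new - x) < tol:
            break
        x = x_new
    return x
```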
Chapter 5 shows that the objective expression of GRC can be rewritten as a weighted sum of Parzen window terms plus a quadratic entropy term, so GRC can likewise be viewed as a kernel density estimation (KDE) problem. On the strength of the KDE approximation theorem, the chapter proposes Scaling Up GRC by KDE Approximation (SUGRC-KDEA), a new graph-based relaxed clustering method for large datasets. The crux of SUGRC-KDEA is choosing a suitable sample size, for which the chapter also proposes the Hypersphere Segmentation-based Random Sampling (HSBRS) algorithm. HSBRS guarantees both that the sampled subset is of appropriate size and that it faithfully reflects the distribution of the original dataset.
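HSBRS itself is only summarized above, so the sketch below shows just the generic scheme that the description suggests: cover the data with hyperspheres, then draw the same fraction from every sphere, so that dense regions contribute proportionally more points and the subset mirrors the spatial distribution. This is a hypothetical sketch with illustrative names, not the thesis's algorithm.

```python
import numpy as np

def hypersphere_stratified_sample(X, radius, frac, seed=None):
    """Stratified random sampling over a greedy hypersphere cover."""
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    labels = np.full(len(X), -1)                 # -1 marks uncovered points
    n_balls = 0
    while (labels == -1).any():
        i = np.flatnonzero(labels == -1)[0]      # first uncovered point seeds a ball
        d = np.linalg.norm(X - X[i], axis=1)
        labels[(labels == -1) & (d <= radius)] = n_balls
        n_balls += 1
    picks = []
    for b in range(n_balls):
        idx = np.flatnonzero(labels == b)
        k = max(1, int(round(frac * len(idx))))  # keep at least one point per ball
        picks.append(rng.choice(idx, size=k, replace=False))
    return X[np.concatenate(picks)]
```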
Chapter 6 establishes a fundamental result: the Fast Kernel Density Estimation (FKDE) theorem. Using the Cauchy-Schwarz inequality, the chapter proves that the upper bound on the integrated squared error (ISE) between the KDE built from a sampled subset and the KDE built from the complete dataset depends only on the sample size and the kernel parameter, and on no other factor. In other words, provided the sample size and the kernel width are suitable, the sampled subset can stand in for the original dataset in kernel density estimation. This theorem provides new theoretical support for all sampling-based pattern recognition methods and techniques, a category to which every study in this thesis belongs.
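Schematically (the exact bound and its constants are given in the thesis; the form below merely restates what the abstract asserts, with assumed symbols), the theorem controls

$$ \mathrm{ISE}\big(\hat{p}_N, \hat{p}_M\big) = \int \big(\hat{p}_N(\mathbf{x}) - \hat{p}_M(\mathbf{x})\big)^{2}\, d\mathbf{x} \;\le\; C(M, h), $$

where $\hat{p}_N$ is the KDE over all N points, $\hat{p}_M$ the KDE over a sampled subset of size M, and h the kernel width. The crucial point is that the bound C depends only on M and h, so a suitable choice of the two makes the subset estimate a faithful surrogate for the full-data estimate.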
