Research on Statistical Learning Based Classification Methods and Their Application in Web Mining
Abstract
At present, pattern recognition techniques based on statistical learning have been studied in considerable depth, and a number of the resulting techniques have been applied successfully and efficiently in a variety of fields. However, because statistical learning theory is still at a developing stage, many problems remain to be solved in specific application domains such as Web data mining, among them a series of important tasks: how to achieve robust dimensionality reduction of manifold features, how to optimize classification boundaries according to the distribution structure of the data, and how to transfer learning between different data domains. Concretely, the research of this thesis comprises three main parts, summarized as follows:
     The first part consists of Chapter 2. Addressing the sensitivity of traditional Locally Linear Embedding (LLE) to outliers (or noise), we propose a robust LLE algorithm based on L1-norm minimization (L1-LLE). It computes the local reconstruction matrix by L1-norm minimization, which reduces the energy of the reconstruction matrix and effectively suppresses interference from outliers (or noise); built on existing optimization techniques, L1-LLE is simple and easy to implement. The convergence of L1-LLE is proven. A performance comparison with traditional LLE shows that L1-LLE is stable and effective.
     The second part consists of Chapters 3 through 6 and focuses on how to improve the generalization performance of SVMs (both linear and spherical) by simultaneously minimizing the within-class distribution structure and maximizing the between-class margin. In Chapter 3, for pattern classification problems, we propose a novel large-margin support vector machine with magnetic field effect (MFSVM); in the feature space induced by a Mercer kernel, MFSVM can solve both one-class (novelty detection) and binary classification problems. MFSVM is essentially a constrained linear support vector machine that learns an optimal hyperplane with a magnetic field effect: by introducing a minimized q-magnetic-field band that encloses one class (the normal class) while making the margin between the other class (the abnormal class) and the band as large as possible, it increases within-class cohesion and enlarges the between-class margin, enhancing the generalization of the linear SVM. In Chapter 4, to address the inability of existing classifiers to preserve the local manifold information or the diversity information of the data space, we propose a locality-preserving maximum-information-variance v-support vector machine (v-LPMIVSVM) based on manifold learning. For pattern classification, v-LPMIVSVM introduces the notions of within-locality homogeneous scatter and within-locality heterogeneous scatter, which measure the local manifold structure and the local diversity (or discriminative) information of the input space respectively, and optimizes the projection direction of the classifier by minimizing the former and maximizing the latter; for measuring the similarity between pairs of data points, v-LPMIVSVM adopts the geodesic distance, which is suited to manifold data and better reflects the intrinsic geometry of the manifold, thereby enhancing generalization. In Chapter 5, to improve the classification performance of spherical classifiers, and inspired by support vector machines and the small-sphere-large-margin method, we propose a Large Margin and Minimal Reduced Enclosing Ball (LMMREB) learning machine: in the Mercer-kernel-induced feature space, it optimizes a minimal enclosing ball so as to seek two concentric reduced enclosing balls that contain the two classes of patterns respectively, while maximizing the minimal margins between the two classes and the reduced balls, thereby maximizing the between-class margin and within-class cohesion simultaneously. In Chapter 6, to overcome the overfitting and the loss of statistical characteristics of the data that traditional support vector machines are prone to, we introduce fuzzy memberships and the total-margin idea and propose a total-margin-based maximum-margin minimal-enclosing fuzzy sphere learning machine (TMF-SSLM), in which one class (the positive class) is enclosed in a minimal enclosing hypersphere while the margin between the other class (the negative class) and the hypersphere is maximized, thus simultaneously enlarging the between-class margin and shrinking the within-class volumes of both classes. Differential costs handle imbalanced training samples, and the total margin and fuzzy memberships overcome the overfitting of traditional soft-margin classifiers, markedly improving the generalization of spherical learning machines.
     The third part consists of Chapters 7 and 8 and investigates domain transfer learning in depth. In Chapter 7, addressing the limitation of currently popular domain transfer SVM methods, which consider only the minimization of the mean discrepancy between domain distributions, we work in a reproducing kernel Hilbert space and, on the basis of the structural risk minimization model and with full consideration of minimizing both the mean discrepancy and the scatter discrepancy between domain distributions, propose a domain adaptation kernel support vector machine (DAKSVM) and its least-squares variant (LSDAKSVM), which achieve superior or comparable classification performance. In Chapter 8, for the domain adaptation learning problem, we propose a Kernel Distribution Consistency based Local Domain Adaptation Classifier (KDC-LDAC). In a universal reproducing kernel Hilbert space and based on the structural risk minimization model, KDC-LDAC first learns a kernel-distribution-consistency regularized support vector machine to obtain an initial partition of the target data; then, following the idea of kernel local learning, it locally regresses and reconstructs the label information of the target data; finally, using the learned labels, it trains on the target domain a classifier suited to target discrimination. The proposed methods achieve superior or comparable domain adaptation performance.
     Finally, in Chapter 9, we summarize the research of this thesis and discuss prospects for future work.
Pattern recognition based on statistical learning theory is an important and deeply studied field of machine learning, and a number of pattern recognition techniques have been applied successfully in many fields. However, pattern recognition still confronts many challenges as statistical learning theory develops, and many issues need to be explored more deeply and studied further in specific application domains such as Web data mining. Robust feature dimensionality reduction based on manifold learning, data-dependent SVM learning, and domain transfer learning are three important such topics. Motivated by these challenges, this study addresses several issues organized in three parts, as follows.
     In the first part, which consists of Chapter 2, aiming at the drawback of the Locally Linear Embedding (LLE) algorithm of being sensitive to noise and outliers, a novel L1-norm based LLE (L1-LLE) algorithm is proposed. It is robust to outliers because it relies on the L1 norm, which is less sensitive to them. The proposed L1-norm optimization technique is intuitive, simple, and easy to implement, and it is proven to find a globally minimal solution. The method is applied to several data sets and its performance is compared with that of conventional methods.
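To make the L1 weight step concrete, below is a minimal illustrative sketch, not the thesis implementation: each point's local reconstruction weights are obtained by minimizing the L1 reconstruction error, cast as a linear program, and the embedding then follows from the standard LLE eigenproblem. The function names and the choice of SciPy's linprog as solver are our assumptions for illustration.

```python
# Illustrative L1-LLE sketch (assumed details, not the thesis code):
# L1-norm reconstruction weights via linear programming, followed by
# the standard LLE embedding step.
import numpy as np
from scipy.optimize import linprog
from scipy.spatial.distance import cdist
from scipy.linalg import eigh

def l1_weights(x, N):
    """min_w ||x - N^T w||_1  s.t.  sum(w) = 1, cast as a linear program."""
    k, d = N.shape                                   # k neighbors in R^d
    c = np.concatenate([np.zeros(k), np.ones(d)])    # minimize sum of slacks t
    A = N.T                                          # columns are the neighbors
    # |x - A w| <= t, split into two one-sided inequalities on [w, t]
    A_ub = np.block([[-A, -np.eye(d)], [A, -np.eye(d)]])
    b_ub = np.concatenate([-x, x])
    A_eq = np.concatenate([np.ones(k), np.zeros(d)])[None, :]
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=[1.0],
                  bounds=[(None, None)] * (k + d))
    return res.x[:k]

def l1_lle(X, n_neighbors=8, n_components=2):
    """Embed the rows of X; only the weight step differs from classical LLE."""
    n = X.shape[0]
    dist = cdist(X, X)
    W = np.zeros((n, n))
    for i in range(n):
        idx = np.argsort(dist[i])[1:n_neighbors + 1]  # skip the point itself
        W[i, idx] = l1_weights(X[i], X[idx])
    M = (np.eye(n) - W).T @ (np.eye(n) - W)
    _, vecs = eigh(M)                      # eigenvalues in ascending order
    return vecs[:, 1:n_components + 1]     # drop the constant eigenvector
```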
     In the second part, which consists of Chapters 3 through 6, we discuss how to improve the performance of support vector machines by simultaneously considering the between-class margin and within-class cohesion. In Chapter 3, a novel maximal-margin support vector machine with magnetic field effect (MFSVM) is proposed for pattern classification problems. In the Mercer-kernel-induced feature space, MFSVM can effectively solve both one-class and binary classification problems. By introducing a minimal q-magnetic-field tube, the basic idea of MFSVM is to find an optimal hyperplane with a magnetic field effect such that one class (the normal patterns) is enclosed in the q-magnetic-field tube, owing to the magnetic attractive effect, while the margin between the tube and the other class (the abnormal patterns) is as large as possible, owing to magnetic repulsion, thus achieving both a maximal between-class margin and a minimal within-class volume and improving the generalization capability of the proposed method. In Chapter 4, aiming at the drawback of state-of-the-art pattern classifiers that they cannot efficiently preserve the local geometrical structure or the diversity (or discriminative) information of data points embedded in a high-dimensional space, information that is useful for pattern recognition, a Locality-Preserved Maximum Information Variance v-Support Vector Machine (v-LPMIVSVM) based on manifold learning is presented. v-LPMIVSVM introduces the within-locality homogeneous scatter, which measures the within-locality manifold information of the data points, and the within-locality heterogeneous scatter, which measures their within-locality diversity information, and obtains an optimal projection weight vector by minimizing the former while maximizing the latter. Meanwhile, v-LPMIVSVM adopts the geodesic distance to measure distances between data points on the manifold, since only the geodesic distance reflects the true geometry of the manifold. In addition, a parameter is introduced to control both the upper bound on the fraction of margin errors and the lower bound on the fraction of support vectors, further improving the generalization capacity of v-LPMIVSVM. In Chapter 5, inspired by support vector classification and the small-sphere-large-margin method, we present a novel large margin minimal reduced enclosing ball learning machine (LMMREB) for pattern classification, which improves the performance of gap-tolerant classifiers by constructing, in the Mercer-kernel-induced feature space, minimal enclosing hyperspheres that separate the data with maximal margin and minimal enclosing volume. The basic idea is to find two optimal minimal reduced enclosing balls by adjusting a reduction factor parameter q such that each of the two classes is enclosed by one of the balls and the margin between each class and the corresponding reduced enclosing ball is maximized, thus achieving both a maximal between-class margin and a minimal within-class volume.
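For a rough formal picture of the spherical large-margin idea, the small-sphere-large-margin problem that inspired LMMREB can be sketched as follows. This is our hedged reconstruction from the description above, with generic trade-off constants C and \nu; the thesis's LMMREB replaces the single ball with two concentric reduced balls controlled by the reduction factor q, which is not shown here.

\[
\begin{aligned}
\min_{R,\,c,\,\rho,\,\xi}\quad & R^{2} - \nu\,\rho^{2} + C\sum_{i}\xi_{i}\\
\text{s.t.}\quad & \lVert \phi(x_i) - c \rVert^{2} \le R^{2} + \xi_i, \quad y_i = +1 \quad \text{(one class inside the ball)},\\
 & \lVert \phi(x_j) - c \rVert^{2} \ge R^{2} + \rho^{2} - \xi_j, \quad y_j = -1 \quad \text{(other class outside, with margin } \rho\text{)},\\
 & \xi_i \ge 0.
\end{aligned}
\]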
In Chapter 6, to deal with several problems that arise in classical support vector machines, such as the overfitting caused by outliers and class-imbalanced learning, and the loss of statistical information about the training examples, we present a novel classifier called the total margin based fuzzy hypersphere learning machine (TMF-SSLM). It constructs a minimal hypersphere in the Mercer-kernel-induced feature space so that one of the two classes is enclosed in the hypersphere while the other is separated from it with maximal margin, thus achieving both a maximal between-class margin and a minimal within-class volume. TMF-SSLM solves not only the overfitting problem caused by outliers, by fuzzifying the penalty and using a total-margin algorithm, but also the problem of imbalanced datasets, by using different costs, thereby obtaining a lower generalization error bound.
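TMF-SSLM itself is a hypersphere machine; the short sketch below only illustrates its two generic ingredients, fuzzy membership weighting of the penalties and class-dependent costs for imbalance, grafted onto an ordinary kernel SVM. Using scikit-learn's SVC with sample_weight and class_weight for this purpose is our assumption for illustration, not the thesis implementation.

```python
# Sketch of two TMF-SSLM ingredients on a plain kernel SVM (assumed
# setup, NOT the hypersphere machine itself): fuzzy memberships
# down-weight likely outliers; class-dependent costs handle imbalance.
import numpy as np
from sklearn.svm import SVC

def fuzzy_memberships(X, y, eps=1e-6):
    """Common heuristic: points far from their class centroid get small
    weights, so outliers contribute little to the penalty term."""
    m = np.empty(len(y))
    for label in np.unique(y):
        idx = np.where(y == label)[0]
        d = np.linalg.norm(X[idx] - X[idx].mean(axis=0), axis=1)
        m[idx] = 1.0 - d / (d.max() + eps)   # in (0, 1]; smaller = farther
    return m

# Toy imbalanced data: 200 negatives, 20 positives.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 1.0, (200, 2)), rng.normal(2.5, 1.0, (20, 2))])
y = np.array([0] * 200 + [1] * 20)

clf = SVC(kernel="rbf", C=10.0, class_weight="balanced")   # differential costs
clf.fit(X, y, sample_weight=fuzzy_memberships(X, y))       # fuzzified penalty
print("training accuracy:", clf.score(X, y))
```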
     In the third part, which consists of Chapters 7 and 8, several issues in domain adaptation learning are explored in depth. In Chapter 7, addressing the drawback of existing domain adaptation learning methods, which may not work well when only the distribution mean discrepancy between the source and target domains is minimized, we design a novel domain adaptation learning method based on the structural risk minimization model, called DAKSVM (kernel support vector machine for domain adaptation), together with LSDAKSVM, its least-squares SVM (LS-SVM) variant, to effectively minimize both the distribution mean discrepancy and the distribution scatter discrepancy between the source and target domains in a reproducing kernel Hilbert space, which in turn improves classification performance. In Chapter 8, we propose a novel Kernel Distribution Consistency based Local Domain Adaptation Classifier (KDC-LDAC). First, in a universal reproducing kernel Hilbert space (URKHS), KDC-LDAC trains a kernel-distribution-consistency regularized domain adaptation support vector machine (SVM) based on the structural risk minimization model. Second, following the idea of local learning, it predicts the label of each data point in the target domain from its neighbors and their labels in the URKHS. Last but not least, KDC-LDAC learns a discriminant function to classify unseen data in the target domain, using the target data labeled in the local learning procedure as training data.
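The quantity at the heart of Chapter 7 is the distribution mean discrepancy between domains in an RKHS, i.e., the empirical maximum mean discrepancy. Below is a small self-contained sketch of the standard biased MMD estimator with an RBF kernel; it is our illustration of that single quantity, and DAKSVM's additional scatter-discrepancy term and SVM training are not shown.

```python
# Biased empirical MMD^2 between source and target samples under an RBF
# kernel: the distribution mean discrepancy that Chapter 7's methods
# drive toward zero. Standard estimator, not the full DAKSVM objective.
import numpy as np

def rbf_kernel(A, B, gamma=1.0):
    """k(a, b) = exp(-gamma * ||a - b||^2) for all pairs of rows."""
    sq = (A**2).sum(1)[:, None] + (B**2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.exp(-gamma * np.maximum(sq, 0.0))

def mmd2(Xs, Xt, gamma=1.0):
    """Estimate ||mean_k(Xs) - mean_k(Xt)||^2 in the RKHS of the kernel."""
    return (rbf_kernel(Xs, Xs, gamma).mean()
            + rbf_kernel(Xt, Xt, gamma).mean()
            - 2.0 * rbf_kernel(Xs, Xt, gamma).mean())

rng = np.random.default_rng(0)
Xs = rng.normal(0.0, 1.0, (100, 5))              # source domain sample
Xt = rng.normal(0.5, 1.0, (100, 5))              # shifted target domain
print("MMD^2, shifted target:", mmd2(Xs, Xt))
print("MMD^2, same dist.:    ", mmd2(Xs, rng.normal(0.0, 1.0, (100, 5))))
```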
     Finally, Chapter 9 concludes the thesis and summarizes our overall work.
