Quantitative Remote Sensing of Weihe River Water Quality Based on Semi-Supervised Learning
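Abstract
The Shaanxi section of the Weihe River basin has fertile land and abundant produce. It is one of China's important crop-farming and livestock regions and the political, economic, and cultural center of Shaanxi. However, as the population grows and urban industry develops, water shortages and water pollution have become increasingly severe; they directly harm the basin's ecological environment and residents' production and daily life, and they constrain economic development. Water quality monitoring in the Shaanxi section of the Weihe River basin therefore has real practical significance. Traditional water quality monitoring is limited by manpower, material resources, climate, and other objective conditions, which makes continuous, rapid tracking surveys and analysis difficult. Building a remote sensing inversion model of water quality from remote sensing data and field measurements makes remote monitoring possible, compensates for the shortcomings of traditional methods, and allows comprehensive, rapid, and dynamic monitoring of the water environment. Objective conditions, however, make it hard to collect field water quality data in large quantities. Semi-supervised learning theory addresses this: by treating the large volume of easily obtained remote sensing imagery as unlabeled samples, a semi-supervised remote sensing inversion model of water quality can be built, which offers an effective solution to the monitoring problem.
     Taking the Shaanxi section of the Weihe River basin as its study area, this thesis summarizes the principles of remote sensing water quality monitoring and gives an overview of the study area, analyzes semi-supervised learning theory, in particular research methods for semi-supervised regression, and introduces statistical learning theory, support vector machines, and related theory. The main work is as follows:
     (1) Two intelligent optimization algorithms commonly used to select support vector machine parameters are introduced: particle swarm optimization and the genetic algorithm. After the basic principles of particle swarm optimization (PSO) are presented, the standard PSO-based support vector machine regression model (PSO-SVM) is improved using the idea of semi-supervised self-training, yielding a semi-supervised regression model based on a PSO-tuned support vector machine (PSSRM). The model is applied to quantitative remote sensing inversion of Weihe River water quality, and its results are compared with those of PSO-SVM. To some extent the model improves regression accuracy, and it offers fast convergence, few parameters to tune, and ease of implementation.
     (2) The basic theory and principles of the co-training algorithm and the genetic algorithm (GA) are introduced. Co-training is combined with an SVM regression model whose parameters are selected by GA (GA-SVM) to build a GA-tuned semi-supervised co-training regression model (GSSRCM). The model is applied to quantitative remote sensing inversion of Weihe River water quality, and its regression results are compared with those of GA-SVM. The model overcomes the instability, lower accuracy, and tendency to diverge of the PSO-based model, and it brings in unlabeled samples; compared with the traditional GA-SVM model, it effectively improves regression accuracy and generalization and can effectively invert and predict a range of water quality variables. The experimental results show that regression models based on semi-supervised learning can effectively perform quantitative remote sensing inversion and prediction of water quality in the Shaanxi section of the Weihe River. GSSRCM is also applied to invert water quality variables along the whole river within the Shaanxi section of the basin; the predictions agree with actual conditions, which further demonstrates the model's effectiveness.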
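The self-training idea in item (1) can be illustrated with a minimal sketch. It is not the thesis's PSSRM: a k-nearest-neighbour regressor stands in for the PSO-tuned SVM so that the example needs only the standard library, each sample has a single feature (imagine one band reflectance), and the confidence rule (a small spread among neighbour targets) is an assumption, since the abstract does not say how confident pseudo-labels are chosen. All names and numbers are hypothetical.

```python
from statistics import mean, pstdev

def neighbours(train, x, k):
    """Return the k labeled samples whose feature value is closest to x."""
    return sorted(train, key=lambda s: abs(s[0] - x))[:k]

def predict(train, x, k=3):
    """k-NN regression: average the targets of the nearest labeled samples."""
    return mean([y for _, y in neighbours(train, x, k)])

def spread(train, x, k=3):
    """Assumed confidence proxy: a small spread of neighbour targets = confident."""
    return pstdev([y for _, y in neighbours(train, x, k)])

def self_train(labeled, unlabeled, rounds=3, per_round=1, k=3):
    """Grow the training set with the most confident pseudo-labeled samples."""
    train = list(labeled)   # (feature, target) pairs
    pool = list(unlabeled)  # feature values without targets
    for _ in range(rounds):
        if not pool:
            break
        # Rank the unlabeled samples, most confident first.
        pool.sort(key=lambda x: spread(train, x, k))
        chosen, pool = pool[:per_round], pool[per_round:]
        # Pseudo-label with the model as it stood at the start of the round.
        pseudo = [(x, predict(train, x, k)) for x in chosen]
        train.extend(pseudo)
    return train

# Made-up numbers for illustration only, not data from the thesis.
labeled = [(0.10, 2.0), (0.20, 4.1), (0.30, 5.9), (0.40, 8.2)]
unlabeled = [0.15, 0.25, 0.35]
augmented = self_train(labeled, unlabeled)
print(len(augmented), "training samples after self-training")
```

In a setting closer to the thesis, `predict` would be replaced by a support vector regressor that is refitted, with its parameters re-selected by PSO, each time the training set grows.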
The Weihe River basin in Shaanxi Province, with its fertile land and abundant resources, is an important crop and livestock region. It is the political, economic, and cultural centre of Shaanxi Province. With population growth and industrial development, water shortages and water pollution have become increasingly serious; they have damaged the ecological environment, harmed people's lives, and restricted economic development. Water quality monitoring of the Weihe River in Shaanxi Province therefore has important practical significance. Traditional water quality monitoring is limited by manpower, material resources, climate, and other objective conditions, which makes continuous, rapid tracking, investigation, and analysis difficult. Building a water quality monitoring model from remote sensing imagery and field water quality data makes it possible to monitor the water environment comprehensively, rapidly, and dynamically. However, large amounts of field water quality data are hard to obtain because of objective conditions. Using semi-supervised learning theory to build a semi-supervised remote sensing retrieval model is therefore an effective approach to water quality monitoring.
     This thesis studies the Shaanxi section of the Weihe River. It summarizes the principles of remote sensing water quality monitoring and the profile of the study area, analyzes semi-supervised learning theory, particularly methods for semi-supervised regression, and introduces statistical learning theory, support vector machines, and related theory. The main research work of this thesis includes:
     (1) Two intelligent optimization algorithms commonly used to optimize support vector machine parameters are described: particle swarm optimization and the genetic algorithm. The basic principles of particle swarm optimization (PSO) are introduced, and the standard PSO-based support vector machine regression model (PSO-SVM) is improved using the semi-supervised self-training method. The resulting semi-supervised support vector machine regression model based on the particle swarm algorithm (PSSRM) is applied to quantitative remote sensing inversion of Weihe River water quality, and its results are compared with those of PSO-SVM. To some extent the model improves regression accuracy; it also converges quickly, has few adjustable parameters, and is easy to implement.
     (2) The basic theory and principles of the co-training algorithm and the genetic algorithm (GA) are introduced. Co-training is combined with the SVM regression model whose parameters are optimized by GA (GA-SVM) to establish a semi-supervised co-training regression model based on GA-optimized parameters (GSSRCM). The model is applied to quantitative remote sensing inversion of Weihe River water quality, and its regression results are compared with those of GA-SVM. The model overcomes the instability, lower precision, and tendency to diverge of the PSO model, effectively improves regression precision and generalization, and can predict various water quality variables by inversion. The results show that regression models based on semi-supervised learning can achieve quantitative remote sensing inversion and prediction of Weihe River water quality. The GSSRCM regression model is also applied to invert water quality variables for the whole river within the Shaanxi section; the predicted results are consistent with the actual situation, which is further evidence of the model's validity.
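The co-training scheme in item (2) can be sketched in the same spirit. This is not the thesis's GSSRCM: two k-nearest-neighbour regressors with different Minkowski distance orders stand in for the GA-tuned SVMs, and the confidence rule (how much a pseudo-labeled sample reduces squared error on its own labeled neighbours) is an assumption borrowed from co-training regression with kNN learners. Function names and data are hypothetical.

```python
from statistics import mean

def distance(a, b, p):
    """Minkowski distance of order p between two feature tuples."""
    return sum(abs(u - v) ** p for u, v in zip(a, b)) ** (1.0 / p)

def neighbours(train, x, k, p):
    """Return the k labeled samples nearest to x under the order-p distance."""
    return sorted(train, key=lambda s: distance(s[0], x, p))[:k]

def predict(train, x, k, p):
    """k-NN regression: average the targets of the nearest labeled samples."""
    return mean([y for _, y in neighbours(train, x, k, p)])

def gain(train, x, k, p):
    """Error reduction on x's neighbours if (x, predicted y) joins the training set."""
    y_hat = predict(train, x, k, p)
    enlarged = train + [(x, y_hat)]
    delta = 0.0
    for xi, yi in neighbours(train, x, k, p):
        delta += (yi - predict(train, xi, k, p)) ** 2
        delta -= (yi - predict(enlarged, xi, k, p)) ** 2
    return delta, y_hat

def co_train(labeled, unlabeled, rounds=5, k=3, orders=(1, 2)):
    """Each regressor pseudo-labels one sample per round for the other regressor."""
    sets = [list(labeled), list(labeled)]  # one training set per regressor
    pool = list(unlabeled)
    for _ in range(rounds):
        picks = []  # (regressor index, features, pseudo-label)
        for j, p in enumerate(orders):
            taken = [x for _, x, _ in picks]
            best = None
            for x in pool:
                if x in taken:
                    continue
                delta, y_hat = gain(sets[j], x, k, p)
                if delta > 0 and (best is None or delta > best[0]):
                    best = (delta, x, y_hat)
            if best is not None:
                picks.append((j, best[1], best[2]))
        if not picks:
            break  # neither regressor found a helpful sample
        for j, x, y_hat in picks:
            sets[1 - j].append((x, y_hat))  # teach the other regressor
            pool.remove(x)
    return sets

def co_predict(sets, x, k=3, orders=(1, 2)):
    """Final estimate: average the two regressors' predictions."""
    return mean([predict(s, x, k, p) for s, p in zip(sets, orders)])

# Made-up two-band samples for illustration only, not data from the thesis.
labeled = [((0.1, 0.3), 2.0), ((0.2, 0.4), 3.1), ((0.3, 0.2), 2.4),
           ((0.4, 0.5), 4.6), ((0.5, 0.1), 2.9)]
unlabeled = [(0.15, 0.35), (0.35, 0.30), (0.45, 0.40)]
sets = co_train(labeled, unlabeled)
print(co_predict(sets, (0.25, 0.30)))
```

The two regressors differ only in their distance order here; in the thesis the diversity between the co-trained learners would come from the GA-SVM models themselves, and the abstract does not give those details.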
