Clustering of Lake Water Pollution Characteristics Based on Pattern Recognition Methods
  • English title: Clustering of Lake Variables Based on Pattern Recognition Method
  • Authors: REN Tingyu; LIANG Zhongyao; CHEN Huili; LIU Yong
  • Affiliation: College of Environmental Science and Engineering, Key Laboratory of Water and Sediment Sciences (Ministry of Education), Peking University
  • Keywords: pattern recognition; water pollution; self-organizing feature map; random forest
  • Journal: Acta Scientiarum Naturalium Universitatis Pekinensis (北京大学学报(自然科学版))
  • Publication date: 2019-02-28 07:00
  • Publisher: Acta Scientiarum Naturalium Universitatis Pekinensis
  • Year: 2019
  • Issue: 02
  • Funding: National Natural Science Foundation of China (51779002)
  • Language: Chinese
  • Pages: 142-148
  • Page count: 7
  • CN: 11-2442/N
  • ISSN: 0479-8023
  • Classification code: X524
Abstract
A method coupling the self-organizing feature map (SOFM) with random forest (RF) was developed to recognize patterns in nine water quality indicators recorded over 11 years for 63 lakes in China (5,110 records). SOFM was first used to cluster the lakes and identify their pollution status; RF was then used to assess how well each water quality indicator determines lake category and thereby to select representative indicators. The SOFM results show that the lakes fall into three classes by degree of pollution. The RF results show that, at a classification accuracy of 80%, the permanganate index and chlorophyll a concentration alone suffice to determine a lake's pollution level. The method can pick out, from large and heterogeneous data sets, the water quality indicators that reflect pollution characteristics, and offers a reference for rapidly assessing pollution status and choosing monitoring indicators.
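The clustering stage described in the abstract can be illustrated with a small example. What follows is a minimal sketch of a one-dimensional SOFM with three nodes, not the authors' implementation: the function names, the decay schedules, the learning rate, and the data are all assumptions. The samples are synthetic stand-ins for standardized permanganate index and chlorophyll a values, generated around three hypothetical centers. The RF stage is not shown.

```python
import math
import random


def best_matching_unit(weights, x):
    """Return the index of the node whose weight vector is closest to x."""
    return min(
        range(len(weights)),
        key=lambda j: sum((w - v) ** 2 for w, v in zip(weights[j], x)),
    )


def train_sofm(samples, n_nodes=3, epochs=200, lr0=0.5, seed=0):
    """Train a 1-D self-organizing feature map and return its node weights."""
    rng = random.Random(seed)
    dim = len(samples[0])
    # Initialize every node from a randomly chosen sample.
    weights = [list(rng.choice(samples)) for _ in range(n_nodes)]
    sigma0 = max(n_nodes / 2.0, 1.0)
    for epoch in range(epochs):
        frac = epoch / epochs
        lr = lr0 * (1.0 - frac)                # learning rate decays linearly
        sigma = sigma0 * (1.0 - frac) + 0.01   # neighborhood radius shrinks
        for x in rng.sample(samples, len(samples)):  # shuffled pass over the data
            bmu = best_matching_unit(weights, x)
            for j, w in enumerate(weights):
                # Gaussian neighborhood on the 1-D grid of node indices.
                h = math.exp(-((j - bmu) ** 2) / (2.0 * sigma ** 2))
                for k in range(dim):
                    w[k] += lr * h * (x[k] - w[k])
    return weights


# Synthetic data (assumption): 20 samples around each of three hypothetical
# centers, standing in for (permanganate index, chlorophyll a) after scaling.
data_rng = random.Random(42)
centers = [(0.0, 0.0), (5.0, 5.0), (10.0, 10.0)]
samples = [
    (cx + data_rng.gauss(0, 0.3), cy + data_rng.gauss(0, 0.3))
    for cx, cy in centers
    for _ in range(20)
]

weights = train_sofm(samples)
# Each sample's cluster label is the index of its best-matching node.
labels = [best_matching_unit(weights, x) for x in samples]
print("nodes in use:", sorted(set(labels)))
```

Fixing the map at three nodes mirrors the three pollution classes reported above; on real monitoring data the map size would be chosen by inspection or a validity index rather than assumed.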
