An Environmental Big Data Analysis System Integrating the R Language
Abstract
In recent years, growing public concern for the ecological environment has placed higher demands on environmental management decisions. Years of information system construction in the environmental protection sector, together with the steady accumulation of baseline environmental data, have laid a solid foundation for big data analysis. However, environmental data sets are large and complex, and statistical analysis tools demand specialist expertise; both factors severely restrict in-depth analysis by environmental practitioners and limit effective use of the data. This paper attempts to integrate the statistical analysis tool R into a web-based information system so that environmental data can be mined in depth and the results can inform decision-making. Multi-model fitting and prediction of chlorophyll concentration is used as a worked example, and the fitting and prediction results of the various models are presented. Model validation shows that the generalized additive model (GAM) is the most suitable for fitting chlorophyll. The system demonstrates that a wide range of statistical analyses can be carried out within a web application, which makes in-depth mining, fitting, and prediction of environmental data more convenient and gives environmental managers access to professional statistical analysis.
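The multi-model comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the paper runs its models in R, whereas this example uses plain Python, and every name in it (`fit_mean`, `fit_linear`, `fit_binned`, `compare_models`) is hypothetical. The binned smoother is only a crude stand-in for a nonparametric model such as a GAM, and the data are a synthetic seasonal signal, not chlorophyll measurements. The point is the workflow: fit several candidate models on training data, score each on held-out data, and pick the one with the lowest validation error.

```python
import math


def fit_mean(xs, ys):
    """Baseline: predict the training mean everywhere."""
    mean = sum(ys) / len(ys)
    return lambda x: mean


def fit_linear(xs, ys):
    """Ordinary least squares line y = a + b*x (closed form)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = my - b * mx
    return lambda x: a + b * x


def fit_binned(xs, ys, bins=6):
    """Crude nonparametric smoother: mean response per equal-width bin."""
    lo, hi = min(xs), max(xs)
    width = (hi - lo) / bins

    def index(x):
        # Clamp so points outside the training range use the edge bins.
        return min(bins - 1, max(0, int((x - lo) / width)))

    sums, counts = [0.0] * bins, [0] * bins
    for x, y in zip(xs, ys):
        i = index(x)
        sums[i] += y
        counts[i] += 1
    overall = sum(ys) / len(ys)
    means = [s / c if c else overall for s, c in zip(sums, counts)]
    return lambda x: means[index(x)]


def rmse(model, xs, ys):
    """Root mean squared error of a fitted model on (xs, ys)."""
    return math.sqrt(sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs))


def compare_models(train, valid):
    """Fit each candidate on train and return its validation RMSE."""
    tx, ty = train
    vx, vy = valid
    fitters = {"mean": fit_mean, "linear": fit_linear, "binned": fit_binned}
    return {name: rmse(fit(tx, ty), vx, vy) for name, fit in fitters.items()}


# Synthetic seasonal signal (an assumption for illustration, not real data).
xs = [i * 0.1 for i in range(120)]
ys = [5.0 + 3.0 * math.sin(x) for x in xs]
train = (xs[0::2], ys[0::2])   # even-indexed points for fitting
valid = (xs[1::2], ys[1::2])   # odd-indexed points for validation

scores = compare_models(train, valid)
best = min(scores, key=scores.get)
print(scores)
print("lowest validation RMSE:", best)
```

Selecting by held-out error rather than training error matters here because flexible models can always fit the training data more closely; the validation score is what indicates whether that flexibility generalizes.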
