Design and Implementation of a Big Data Platform for Grouting in Water Conservancy Projects
  • Published English title: A Big Data Platform for Grouting of Water Conservancy Project: Design and Implementation
  • Author: RAO Xiao-kang (饶小康)
  • Affiliation: Key Laboratory of Geotechnical Mechanics and Engineering of Ministry of Water Resources, Yangtze River Scientific Research Institute
  • Keywords: big data platform; water conservancy project; grouting; Hadoop; Spark; random forest; K-Means
  • Journal: Journal of Yangtze River Scientific Research Institute (长江科学院院报)
  • Publication date: 2019-06-14
  • Publisher: Journal of Yangtze River Scientific Research Institute
  • Year: 2019
  • Issue: 06
  • Funding: National Key Research and Development Program of China (2017YFC1502600)
  • Language: Chinese
  • Pages: 143-149+174
  • Page count: 8
  • CN: 42-1171/TV
  • ISSN: 1001-5485
  • Classification: TP311.13; TV543
Abstract
With the development of cloud computing, big data, and the Internet of Things, the volume of data collected in water conservancy projects grows daily, and traditional storage and computing theories and methods can no longer handle the storage, retrieval, and processing of such massive, multi-source, heterogeneous data. For grouting data from water conservancy projects, this work designs an overall platform architecture, builds a Hadoop distributed cluster, designs parallelized data mining algorithms, and implements a big data platform for grouting, with presentation, application, and management delivered through a B/S (browser/server) service mode. The platform's functional modules mainly include data resource downloading, dataset uploading and running, custom algorithms, running status and results, and big data visualization. As an application demonstration on the Baihetan water conservancy project, two models were built: a random-forest model for predicting the unit grout injection amount and a K-Means clustering model for detecting anomalies in grouting results. The platform integrates structured and unstructured data and applies cluster-based parallel computing and data mining to water conservancy engineering. Instead of traditional random sampling and a single mining model, it uses multi-granularity, multi-level, multi-channel analysis models to mine the full dataset, extracting information useful for management, decision-making, and production. It achieves integrated sharing of data resources, efficient business processing, and knowledge discovery from data, improves the efficiency and accuracy of data storage and processing, and offers a new approach to the storage and computation of big data in water conservancy projects.
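The abstract does not state how the K-Means anomaly detection model decides that a grouting result is abnormal. One common reading of "K-Means-based anomaly detection" is to cluster the records and flag those that lie unusually far from their nearest cluster centre. The sketch below illustrates that idea only; it is not the paper's implementation. The synthetic 2-D points, the evenly spaced initialisation, and the "three times the median distance" threshold are all assumptions made for illustration, and the code runs on a single machine rather than on a Hadoop or Spark cluster.

```python
import math


def kmeans(points, k, iterations=50):
    """Plain Lloyd's algorithm on a list of tuples; returns the centroids."""
    # Assumption: deterministic initialisation from evenly spaced input points.
    centroids = [points[i * len(points) // k] for i in range(k)]
    for _ in range(iterations):
        # Assignment step: index of the nearest centroid for every point.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                new_centroids.append(
                    tuple(sum(axis) / len(members) for axis in zip(*members))
                )
            else:
                new_centroids.append(centroids[c])  # keep an empty cluster in place
        if new_centroids == centroids:
            break  # assignments are stable, so the algorithm has converged
        centroids = new_centroids
    return centroids


def find_anomalies(points, centroids, threshold_factor=3.0):
    """Return indices of points far from every centroid.

    Assumption: "far" means more than threshold_factor times the median
    nearest-centroid distance. This rule is hypothetical, not from the paper.
    """
    distances = [min(math.dist(p, c) for c in centroids) for p in points]
    median = sorted(distances)[len(distances) // 2]
    return [i for i, d in enumerate(distances) if d > threshold_factor * median]


# Synthetic stand-in for grouting records: two tight groups plus one outlier.
group_a = [(1 + 0.1 * i, 1 + 0.1 * (i % 3)) for i in range(10)]
group_b = [(8 + 0.1 * i, 8 + 0.1 * (i % 3)) for i in range(10)]
points = group_a + group_b + [(4.5, 15.0)]

centroids = kmeans(points, k=2)
anomalies = find_anomalies(points, centroids)
print(anomalies)  # prints [20], the index of the outlier
```

With real grouting data the features would normally be standardised first so that no single quantity dominates the Euclidean distance, and the number of clusters and the threshold would need to be chosen against known good and bad records.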