Resolution of Overlapped Chromatographic Peaks by Wavelet Feature Extraction and Random Forest Model
  • Authors: ZHANG Peng-cheng; WANG Ai-min
  • Affiliation: School of Instrument Science and Engineering, Southeast University
  • Keywords: overlapped chromatographic peak resolution; wavelet transform; random forest model; cross-validation
  • Journal title (Chinese): IKJS
  • Journal title (English): Measurement & Control Technology
  • Publication date: 2019-05-18
  • Publisher: 测控技术
  • Year: 2019
  • Issue: v.38; No.327
  • Language: Chinese
  • Page: IKJS201905007
  • Page count: 5
  • CN: 05
  • ISSN: 11-1764/TB
  • Classification number: 36-37+41-43
Abstract
Neural-network approaches to resolving overlapped chromatographic peaks are prone to over-fitting, require complex network structures, and learn slowly. To address these problems, a random forest model is introduced. The gaus1 wavelet is used to approximate the derivative of the original signal; an appropriate scale is selected and the characteristic inflection points of the signal are extracted. The feature points serve as the model input and the sub-peak area ratio as the output, and the random forest model fits the mapping between the two. The model parameters are determined by cross-validation, and the model is built and trained with the CART algorithm. A series of experiments and comparisons with existing methods show that the proposed method fits the relationship between the characteristic inflection points and the sub-peak areas accurately and is also efficient in model training.
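The feature-extraction step described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes that the first-derivative-of-Gaussian wavelet can be approximated by a sampled, unnormalized derivative-of-Gaussian kernel, and that inflection points of the signal correspond to local extrema of the resulting response. The function names, the scale of 2 samples, and the 5% threshold are all hypothetical choices made for the example. The random forest stage is not shown, since the abstract does not specify how the feature points are encoded as model inputs.

```python
import math


def gaus1_kernel(scale, half_width=4.0):
    """Sample the first derivative of a Gaussian with standard deviation `scale` (in samples).

    Assumption: unnormalized kernel; only the positions of extrema matter here.
    """
    n = int(math.ceil(half_width * scale))
    return [-(k / scale ** 2) * math.exp(-k * k / (2.0 * scale ** 2))
            for k in range(-n, n + 1)]


def wavelet_derivative(signal, scale):
    """Convolve `signal` with the kernel.

    The result is proportional to the derivative of the Gaussian-smoothed
    signal. Edges are handled by repeating the first and last samples.
    """
    kernel = gaus1_kernel(scale)
    n = len(kernel) // 2
    last = len(signal) - 1
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for j, h in enumerate(kernel):
            k = j - n
            idx = min(max(i - k, 0), last)  # clamp to the valid index range
            acc += signal[idx] * h
        out.append(acc)
    return out


def inflection_points(signal, scale, rel_threshold=0.05):
    """Return indices where the smoothed derivative has a significant local extremum."""
    d = wavelet_derivative(signal, scale)
    limit = rel_threshold * max(abs(v) for v in d)
    points = []
    for i in range(1, len(d) - 1):
        is_max = d[i] > limit and d[i] > d[i - 1] and d[i] >= d[i + 1]
        is_min = d[i] < -limit and d[i] < d[i - 1] and d[i] <= d[i + 1]
        if is_max or is_min:
            points.append(i)
    return points


def gaussian_peak(length, height, center, width):
    """Synthetic Gaussian peak sampled at integer positions 0..length-1."""
    return [height * math.exp(-((t - center) ** 2) / (2.0 * width ** 2))
            for t in range(length)]


# Synthetic overlapped pair (illustrative values only, not data from the paper).
peak_a = gaussian_peak(300, 1.0, 120, 12)
peak_b = gaussian_peak(300, 0.6, 165, 12)
mixed = [a + b for a, b in zip(peak_a, peak_b)]
features = inflection_points(mixed, scale=2)
```

In a pipeline of the kind the abstract outlines, positions such as `features` would be turned into a fixed-length input vector for a regressor that predicts the sub-peak area ratio. A larger scale suppresses noise but shifts the detected inflection points outward, which is presumably why the abstract stresses choosing an appropriate scale.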
