Intelligent determination and data mining for tectonic settings of basalts based on big data methods
  • English title: Intelligent determination and data mining for tectonic settings of basalts based on big data methods
  • Authors: HAN Shuai (韩帅); LI MingChao (李明超); REN QiuBing (任秋兵); LIU ChengZhao (刘承照)
  • Affiliation: State Key Laboratory of Hydraulic Engineering Simulation and Safety, Tianjin University (天津大学水利工程仿真与安全国家重点实验室)
  • Keywords: Basalt; Tectonic setting; Big data; Discrimination diagram; Geochemistry; Intelligent algorithm
  • Chinese journal title: YSXB
  • English journal title: Acta Petrologica Sinica
  • Publication date: 2018-11-15
  • Publisher: Acta Petrologica Sinica (岩石学报)
  • Year: 2018
  • Issue: v.34
  • Funding: jointly supported by the National Natural Science Fund for Excellent Young Scholars (51622904) and the Tianjin Science Fund for Distinguished Young Scholars (17JCJQJC44000)
  • Language: Chinese
  • Pages: YSXB201811006
  • Page count: 10
  • CN: 11
  • ISSN: 11-1922/P
  • Classification number: 43-52
Abstract
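Inferring the tectonic setting in which a basalt formed from discrimination diagrams has a long history. Since Pearce proposed the tectono-magmatic discrimination diagram method in 1971, dozens of different diagrams have appeared. However, these diagrams are built from only a few elements and from small, unrepresentative sample sets, so their range of application is limited and their accuracy is insufficient. To improve the efficiency and accuracy of tectonic-setting discrimination, this paper proposes building discrimination models with big-data intelligent mining methods, so that the tectonic setting of a basalt can be determined quickly and accurately from its chemical composition. Three types of basalt are used: mid-ocean ridge basalt (MORB), ocean island basalt (OIB) and island arc basalt (IAB), 755 samples in total. First, major-element and trace-element discrimination diagrams are applied to the three classes of data, namely the TiO_2-MnO-P_2O_5, FeO^T-MgO-Al_2O_3, Ti-Zr-Y, Zr/Y-Zr and Ti-Zr diagrams. Because each diagram targets specific elements or compounds, and some samples have incomplete composition records or lack a measurement of a required component, such samples cannot be plotted; part of the data must therefore be screened out before each diagram is drawn. The results show that, when invalid data are ignored, the Zr/Y-Zr diagram is the most accurate, reaching more than 90%. If the screened-out data are counted, however, the accuracy of all five diagrams for the three rock types falls below 75%. For discrimination by data-mining algorithms, the paper tests Naive Bayes (NB), K-Nearest Neighbors (KNN), Support Vector Machine (SVM) and Random Forest (RF). To obtain good recognition performance, all compounds and trace elements are combined into a 51-dimensional parameter set for training, and no data are screened out, that is, every sample is treated as valid. The training results show that NB gives the worst classification, yet still exceeds 75%, while RF reaches a training accuracy of 100%. In the further analysis of the algorithms, the validation accuracy of RF is measured at 88.46%. To make the algorithms more practical, Bayes' theorem is used to compute inverse probabilities for their outputs, which allows reasonable inference "from effect to cause". Missing data are also simulated artificially to test the robustness of the algorithms, and RF and NB are found to be the two that should be preferred. Finally, by extracting the decision trees of the RF model, the importance of the elements in the samples is analysed and the major and trace elements with the greatest influence on discrimination are identified. In summary, determining tectonic settings with data-mining algorithms is more accurate, faster and more versatile than using discrimination diagrams, and the approach merits wider application in this field.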
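The two accuracy figures quoted for the discrimination diagrams differ only in their denominator: samples that cannot be plotted are either ignored or counted as failures. The short Python sketch below illustrates that arithmetic. The function name `diagram_accuracy` and the counts passed to it are hypothetical, chosen only to reproduce the pattern described in the abstract (about 90% on plotted samples, below 75% over all 755); the abstract does not report the actual counts.

```python
def diagram_accuracy(n_correct, n_plotted, n_total):
    """Return (accuracy on plotted samples, accuracy over all samples).

    Samples that cannot be plotted (missing elements) are excluded from the
    first figure and counted as misclassified in the second.
    """
    if not (0 <= n_correct <= n_plotted <= n_total) or n_plotted == 0:
        raise ValueError("expected 0 <= n_correct <= n_plotted <= n_total, n_plotted > 0")
    return n_correct / n_plotted, n_correct / n_total


# Hypothetical counts: 600 of 755 samples can be plotted, 540 of them correctly.
filtered_acc, overall_acc = diagram_accuracy(n_correct=540, n_plotted=600, n_total=755)
print(f"plotted samples only: {filtered_acc:.1%}")
print(f"all samples:          {overall_acc:.1%}")
```

With these made-up counts the same diagram scores 90.0% on the plotted samples and 71.5% over the whole data set.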
        Basalt discrimination diagrams have long been used to determine tectonic settings. Since Pearce proposed the first basalt discrimination diagram in 1971, dozens more have emerged. However, a discrimination diagram usually draws on only 2 to 3 elements, and the number of samples used to design one was usually small, which limits its applicability. To improve the efficiency and accuracy of determination, this study presents a set of methods based on intelligent algorithms and the chemical composition of basalts. The samples comprise three kinds of basalt: mid-ocean ridge basalts (MORB), ocean island basalts (OIB) and island arc basalts (IAB), 755 samples in total. First, three trace-element and two major-element discrimination diagrams (the TiO_2-MnO-P_2O_5, FeO^T-MgO-Al_2O_3, Ti-Zr-Y, Zr/Y-Zr and Ti-Zr diagrams) are used to plot the samples. Because each diagram requires specific components, the samples must be filtered before they are plotted. The Zr/Y-Zr diagram reaches an accuracy of 90% on the filtered samples, but less than 75% when all samples are counted. Naive Bayes (NB), K-Nearest Neighbors (KNN), Support Vector Machine (SVM) and Random Forest (RF) are then applied. For training, every sample is represented by a 51-dimensional vector comprising 11 major elements, 35 trace elements and 5 isotopes, and no samples are filtered out. NB gives the worst result, yet its accuracy still exceeds 75%; RF gives the best, with a training accuracy of 100%. In the further analysis, RF reaches a validation accuracy of 88.46%. To make the algorithms more practical, Bayes' theorem is used to calculate inverse probabilities. The robustness of the algorithms is then tested by simulating missing data, which shows RF and NB to be the best. Finally, by extracting the decision trees of the RF model, the importance of the 51 sample features is calculated, and the major and trace elements that most affect the determination are identified. In conclusion, intelligent algorithms determine tectonic settings more efficiently, more accurately and with more functionality than discrimination diagrams, and this set of methods is worth wider adoption.
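The inverse-probability step mentioned in the abstract can be illustrated with Bayes' theorem: given that a classifier predicted class k, the probability that the true class is c is P(c | k) = P(k | c) P(c) / Σ_j P(k | j) P(j). The Python sketch below is only an illustration under stated assumptions. The abstract does not say how the authors estimate these terms, so the likelihood table (as might be read off a validation confusion matrix), the priors, and the function name `inverse_probability` are all hypothetical.

```python
CLASSES = ("MORB", "OIB", "IAB")

# Hypothetical P(predicted | true): outer key is the true class,
# inner key is the predicted class. Each row sums to 1.
LIKELIHOOD = {
    "MORB": {"MORB": 0.90, "OIB": 0.04, "IAB": 0.06},
    "OIB":  {"MORB": 0.05, "OIB": 0.90, "IAB": 0.05},
    "IAB":  {"MORB": 0.10, "OIB": 0.05, "IAB": 0.85},
}

# Hypothetical prior probability of each tectonic setting.
PRIOR = {"MORB": 0.4, "OIB": 0.3, "IAB": 0.3}


def inverse_probability(predicted, likelihood=LIKELIHOOD, prior=PRIOR):
    """Return P(true class | predicted class) for every class via Bayes' theorem."""
    joint = {c: likelihood[c][predicted] * prior[c] for c in CLASSES}
    evidence = sum(joint.values())  # P(predicted)
    if evidence == 0:
        raise ValueError(f"class {predicted!r} is never predicted under this table")
    return {c: joint[c] / evidence for c in CLASSES}


posterior = inverse_probability("IAB")
for name, p in posterior.items():
    print(f"P(true={name} | predicted=IAB) = {p:.3f}")
```

Read this way, a prediction comes with a statement of how far it can be trusted given the classifier's known error pattern, which is the "effect to cause" inference the abstract refers to.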
References
    Altman NS. 1992. An introduction to kernel and nearest-neighbor nonparametric regression. The American Statistician, 46(3): 175-185
    Bishop CM. 2006. Pattern Recognition and Machine Learning (Information Science and Statistics). Heidelberg: Springer-Verlag
    Breiman L. 2001. Random forests. Machine Learning, 45(1): 5-32
    Chen WF, Wang JR, Zhang Q, Liu YX, Ma L and Jiao ST. 2017. Data mining of ocean island basalt and ocean plateau basalt: Geochemical characteristics and comparison with MORB. Acta Geologica Sinica, 91(11): 2443-2455 (in Chinese with English abstract)
    Cortes C and Vapnik V. 1995. Support-vector networks. Machine Learning, 20(3): 273-297
    Cover T and Hart P. 1967. Nearest neighbor pattern classification. IEEE Transactions on Information Theory, 13(1): 21-27
    Di PF, Wang JR, Zhang Q, Yang J, Chen WF, Pan ZJ, Du XL and Jiao ST. 2017. The evaluation of basalt tectonic discrimination diagrams: Constraints on the Research of global basalt data. Bulletin of Mineralogy, Petrology and Geochemistry, 36(6): 891-896 (in Chinese with English abstract)
    Domingos P and Pazzani M. 1997. On the optimality of the simple Bayesian classifier under zero-one loss. Machine Learning, 29(2-3): 103-130
    Floyd PA and Winchester JA. 1975. Magma type and tectonic setting discrimination using immobile elements. Earth and Planetary Science Letters, 27(2): 211-218
    Karpatne A, Ebert-Uphoff I, Ravela S, Babaie HA and Kumar V. 2018. Machine learning for the geosciences: Challenges and opportunities. IEEE Transactions on Knowledge & Data Engineering, 1-12
    Kononenko I. 1993. Inductive and Bayesian learning in medical diagnosis. Applied Artificial Intelligence, 7(4): 317-337
    Li C, Arndt NT, Tang Q and Ripley EM. 2015. Trace element indiscrimination diagrams. Lithos, 232: 76-83
    Li H. 2012. Statistical Learning Method. Beijing: Tsinghua University Press (in Chinese)
    Li MC, Miao L and Shi J. 2014. Analyzing heating equipment’s operations based on measured data. Energy and Buildings, 82: 47-56
    Luo JM, Wang XW, Song BT, Yang ZM, Zhang Q, Zhao YQ and Liu SY. 2018. Discussion on the method for quantitative classification of magmatic rocks: Taking it’s application in West Qinling of Gansu Province for example. Acta Petrologica Sinica, 34(2): 326-332 (in Chinese with English abstract)
    Mullen ED. 1983. MnO/TiO2/P2O5: A minor element discriminant for basaltic rocks of oceanic environments and its implications for petrogenesis. Earth and Planetary Science Letters, 62(1): 53-62
    Pearce JA and Cann JR. 1971. Ophiolite origin investigated by discriminant analysis using Ti, Zr and Y. Earth and Planetary Science Letters, 12(3): 339-349
    Pearce JA and Cann JR. 1973. Tectonic setting of basic volcanic rocks determined using trace element analyses. Earth and Planetary Science Letters, 19(2): 290-300
    Pearce JA and Norry MJ. 1979. Petrogenetic implications of Ti, Zr, Y, and Nb variations in volcanic rocks. Contributions to Mineralogy and Petrology, 69(1): 33-47
    Pearce JA. 1982. Trace element characteristics of lavas from destructive plate boundaries. In: Thorpe RS (ed.). Andesites: Orogenic Andesites and Related Rocks. Chichester, England: John Wiley and Sons, 528-548
    Pearce JA, Lippard SJ and Roberts S. 1984. Characteristics and tectonic significance of supra subduction zone ophiolites. In: Gass IG, Lippard SJ and Shelton AW (eds.). Ophiolites and Oceanic Lithosphere. Geological Society, London, Special Publication, 16: 77-94
    Pearce TH, Gorman BE and Birkett TC. 1977. The relationship between major element chemistry and tectonic environment of basic and intermediate volcanic rocks. Earth and Planetary Science Letters, 36(1): 121-132
    Petrelli M and Perugini D. 2016. Solving petrological problems through machine learning: the study case of tectonic discrimination using geochemical and isotopic data. Contributions to Mineralogy and Petrology, 171(10): 81
    Sheng Z. 2001. Probability and Statistics. 3rd Edition. Beijing: Higher Education Press (in Chinese)
    Wang JR, Pan ZJ, Zhang Q, Chen WF, Yang J, Jiao ST and Wang SH. 2016. Intra-continental basalt data mining: The diversity of their constituents and the performance in basalt discrimination diagrams. Acta Petrologica Sinica, 32(7): 1919-1933 (in Chinese with English abstract)
    Wang JR, Chen WF, Zhang Q, Jiao ST, Yang J, Pan ZJ and Wang SH. 2017. Preliminary research on data mining of N-MORB and E-MORB: Discussion on method of the basalt discrimination diagrams and the character of MORB’s mantle source. Acta Petrologica Sinica, 33(3): 993-1005 (in Chinese with English abstract)
    Wang YL, Zhang CJ and Xiu SZ. 2001. Th/Hf-Ta/Hf identification of tectonic setting of basalts. Acta Petrologica Sinica, 17(3): 413-421 (in Chinese with English abstract)
    Whalen JB, Currie KL and Chappell BW. 1987. A-type granites: Geochemical characteristics, discrimination and petrogenesis. Contributions to Mineralogy and Petrology, 95(4): 407-419
    Wilkinson L. 2006. Revising the Pareto chart. The American Statistician, 60(4): 332-334
    Yang J, Wang JR, Zhang Q, Chen WF, Pan ZJ, Du XL, Jiao ST and Wang SH. 2016a. Global IAB data excavation: The performance in basalt discrimination diagrams and preliminary interpretation. Geological Bulletin of China, 35(12): 1937-1949 (in Chinese with English abstract)
    Yang J, Wang JR, Zhang Q, Chen WF, Pan ZJ, Jiao ST and Wang SH. 2016b. Back-arc basin basalt (BABB) data mining: comparison with MORB and IAB. Advances in Earth Science, 31(1): 66-77 (in Chinese with English abstract)
    Zhang Q. 1990. The correct use of the basalt discrimination diagram. Acta Petrologica Sinica, 6(2): 87-94 (in Chinese with English abstract)
    Zhou YZ, Chen S, Zhang Q, Xiao F, Wang SG, Liu YP and Jiao SJ. 2018. Advances and prospects of big data and mathematical geoscience. Acta Petrologica Sinica, 34(2): 255-263 (in Chinese with English abstract)
    Zhou ZH. 2016. Machine Learning. Beijing: Tsinghua University Press (in Chinese)
    陈万峰,王金荣,张旗,刘懿馨,马骊,焦守涛.2017.洋岛和洋底高原玄武岩数据挖掘:地球化学特征及其与MORB的对比.地质学报,91(11):2443-2455
    第鹏飞,王金荣,张旗,杨婧,陈万峰,潘振杰,杜学亮,焦守涛.2017.玄武岩构造环境判别图评估---全体数据研究的启示.矿物岩石地球化学通报,36(6):891-896
    李航.2012.统计学习方法.北京:清华大学出版社
    罗建民,王晓伟,宋秉田,杨忠明,张琪,赵彦庆,刘升有.2018.岩浆岩定量分类方法探讨---以甘肃省西秦岭地区为例.岩石学报,34(2):326-332
    盛骤.2001.概率论与数理统计.第3版.北京:高等教育出版社
    王金荣,潘振杰,张旗,陈万峰,杨婧,焦守涛,王淑华.2016.大陆板内玄武岩数据挖掘:成分多样性及在判别图中的表现.岩石学报,32(7):1919-1933
    王金荣,陈万峰,张旗,焦守涛,杨婧,潘振杰,王淑华.2017.N-MORB和E-MORB数据挖掘---玄武岩判别图及洋中脊源区地幔性质的讨论.岩石学报,33(3):993-1005
    汪云亮,张成江,修淑芝.2001.玄武岩类形成的大地构造环境的Th/Hf-Ta/Hf图解判别.岩石学报,17(3):413-421
    杨婧,王金荣,张旗,陈万峰,潘振杰,杜雪亮,焦守涛,王淑华.2016a.全球岛弧玄武岩数据挖掘---在玄武岩判别图上的表现及初步解释.地质通报,35(12):1937-1949
    杨婧,王金荣,张旗,陈万峰,潘振杰,焦守涛,王淑华.2016b.弧后盆地玄武岩(BABB)数据挖掘:与MORB及IAB的对比.地球科学进展,31(1):66-77
    张旗.1990.如何正确使用玄武岩判别图.岩石学报,6(2):87-94
    周永章,陈烁,张旗,肖凡,王树功,刘艳鹏,焦守涛.2018.大数据与数学地球科学研究进展---大数据与数学地球科学专题代序.岩石学报,34(2):255-263
    周志华.2016.机器学习.北京:清华大学出版社
