TAS-PF: An Extended TAS Diagram Powered by a Big-Data Probability Field

    Abstract: In the era of big data, the volume of geoscience data keeps growing, and traditional discrimination diagrams such as the TAS (total alkali-silica) diagram face two difficulties. First, plotting too many points within a limited diagram area reduces readability and prevents effective, intuitive visualization. Second, the data underlying traditional diagrams are dated, and introducing new data may perturb the classification boundaries, which reduces the stability of the discrimination results and makes them hard to reconcile with plots in the existing literature. To address these problems, this study first builds on earlier extensions of the TAS diagram by constructing a position-based category partition for each lithology label in the classic diagram. Samples are discriminated according to the spatial relationship between their plotted positions and the category partitions, and the classification results are presented as a data table, which compensates for the loss of readability caused by larger data volumes. In addition, more than 240,000 major-element records for igneous rocks were extracted from the GEOROC database, visualized on the TAS diagram, and analyzed by kernel density estimation for each lithology class. From these results, a probability field is constructed for each category over the plotting coordinate range; the probability for a sample to be classified is calculated from its position in each probability field, and the results for the different lithology labels are compared. Using data with known lithology labels to discriminate unclassified data in this way supplements the traditional boundary-based classification and yields more quantitative discrimination results.
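    The probability-field step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the isotropic Gaussian kernel, the fixed bandwidth, the equal weighting of classes, the function names (`gaussian_kde_2d`, `build_probability_fields`, `classify`), and the toy data points are all assumptions made for illustration. Each sample is a pair (SiO2 wt%, Na2O + K2O wt%), the two axes of a TAS diagram.

```python
import math

def gaussian_kde_2d(samples, bandwidth):
    """Return a density function built from 2-D samples.

    Assumption: isotropic Gaussian kernel with one fixed bandwidth;
    the abstract does not state the kernel or bandwidth actually used.
    """
    n = len(samples)
    norm = 1.0 / (n * 2.0 * math.pi * bandwidth ** 2)

    def density(x, y):
        total = 0.0
        for sx, sy in samples:
            d2 = (x - sx) ** 2 + (y - sy) ** 2
            total += math.exp(-0.5 * d2 / bandwidth ** 2)
        return norm * total

    return density

def build_probability_fields(labelled, bandwidth=1.0):
    """Build one density field per lithology label from labelled points."""
    return {label: gaussian_kde_2d(points, bandwidth)
            for label, points in labelled.items()}

def classify(fields, sio2, alkali):
    """Evaluate every field at one position and normalise to relative
    probabilities (assumption: classes are weighted equally)."""
    densities = {label: f(sio2, alkali) for label, f in fields.items()}
    total = sum(densities.values())
    if total == 0.0:
        return {label: 0.0 for label in densities}
    return {label: d / total for label, d in densities.items()}

# Hypothetical toy data, not taken from GEOROC.
toy = {
    "basalt": [(48.0, 3.0), (50.0, 2.5), (49.0, 3.5)],
    "rhyolite": [(74.0, 8.0), (76.0, 7.5), (72.0, 8.5)],
}
fields = build_probability_fields(toy, bandwidth=2.0)
probs = classify(fields, 49.5, 3.0)
best = max(probs, key=probs.get)
print(best, probs)
```

    Comparing the per-label values in `probs` corresponds to the comparison of probability results across lithology labels mentioned above; with a large labelled set, one would likely precompute each field on a grid rather than summing over all samples per query.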

       
