Abstract:
Oil content is an important evaluation index for oil shale resources. Traditionally, the oil content of oil shale is calculated from well logs with a regression model, which suffers from large errors or over-fitting. This paper integrates a classical data mining algorithm with the "Big Data" concept and well-logging application knowledge to quantify oil content, with the aim of improving accuracy and model generalization. The explanatory variables DT_s, DEN_s and GR_s used in the analysis are obtained with the improved ΔlogR technique. The support vector regression (SVR) data mining algorithm can greatly improve the generalization and precision of oil content quantification. The R² score of the model on the training samples is 0.82, and a high fitting precision is also achieved on the test samples, with an R² score of 0.70. The SVR model generalizes better than the traditional regression model, avoids the over-fitting problem, and can be applied effectively.
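For readers unfamiliar with the metric, the R² scores quoted above can be illustrated with a minimal sketch. It assumes the paper uses the usual coefficient of determination, 1 - SS_res / SS_tot; the function name `r_squared` and the sample values are hypothetical and are not taken from the paper.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot (assumed definition)."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_true = sum(y_true) / len(y_true)
    # Residual sum of squares: error of the model predictions.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Total sum of squares: error of always predicting the mean.
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when y_true is constant")
    return 1.0 - ss_res / ss_tot


# Hypothetical oil-content values (wt%), purely illustrative, not from the paper.
measured = [3.0, 5.0, 7.0, 9.0]
predicted = [4.0, 5.0, 6.0, 9.0]
score = r_squared(measured, predicted)
```

Under this definition a score of 1 means perfect prediction and a score of 0 means the model does no better than predicting the mean. A training score well above the test score is the usual symptom of over-fitting, which is why both figures are reported.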