Ecology and Environmental Sciences, 2025, Vol. 34, Issue (6): 950-960. DOI: 10.16258/j.cnki.1674-5906.2025.06.012

• Research Article [Environmental Science] •

Collaborative Enhancement of Soil Heavy Metal Prediction Accuracy Using Hyperspectral Sensitive Band Selection and Machine Learning

MENG Chang1,2, HONG Mei1,2,*, LI Fei1,2,*

  1. Inner Mongolia Key Laboratory of Soil Quality and Nutrient Resources, Hohhot 010018, P. R. China
  2. Inner Mongolia Key Laboratory of Soil Quality and Nutrient Resources/Key Laboratory of Agricultural Ecological Security and Green Development at Universities of Inner Mongolia Autonomous Region, Hohhot 010018, P. R. China
  • Received: 2024-11-14 Online: 2025-06-18 Published: 2025-06-11

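  • Corresponding authors: * LI Fei, E-mail: Lifei@imau.edu.cn; HONG Mei, E-mail: nmczhm1970@126.com
  • About the author: MENG Chang (born 1997), female, master's student; main research interest: agricultural remote sensing. E-mail: 2021202040012@emails.imau.edu.cn
  • Funding:
    National Natural Science Foundation of China (M2042002); National Key Research and Development Program of China (2022YFD1500901); Natural Science Foundation of Inner Mongolia Autonomous Region (2021ZD10)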

Abstract:

This study investigated effective methods for extracting sensitive spectral bands and assessed how well those bands, coupled with machine learning models, can estimate soil heavy metal concentrations. Hyperspectral remote sensing data were used to predict the concentrations of Cu, Zn, Pb, and Cr in soils from typical polluted sites near several abandoned tailings areas in Inner Mongolia. Sixteen sensitive band extraction methods, classified into filter, wrapper, and embedded approaches, were coupled with decision tree (DT), random forest (RF), and gradient boosting decision tree (GBDT) models for concentration inversion. The results showed that the sensitive bands extracted by the wrapper methods explained heavy metal concentrations better than those from the filter and embedded methods; these bands were concentrated mainly in the 450-750 nm and 1829-2493 nm ranges. Among the six wrapper methods, competitive adaptive reweighted sampling (CARS) and the variable iterative space shrinkage approach (VISSA) provided the key spectral information for Cu and Cr, respectively, whereas the successive projections algorithm (SPA) was highly sensitive to Zn and Pb. Compared with the DT and RF models, GBDT fitted the sensitive bands better, and coupling it with CARS, VISSA, and SPA yielded more accurate estimates of soil heavy metal concentrations. Validation with independent mining-area data confirmed that the combinations of CARS, VISSA, and SPA with the GBDT model maintained stable estimation performance, with coefficients of determination (R²) of 0.91, 0.89, 0.87, and 0.84 for Cu, Zn, Pb, and Cr, respectively. The soil heavy metal monitoring model developed in this study improves the interpretability of soil spectral information and offers a new method with practical potential for rapid monitoring of heavy metals in mining-area soils.

Key words: hyperspectral inversion, feature band extraction, tailings area, soil heavy metals, decision tree algorithm, independent validation
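
The general idea behind the abstract, a wrapper-style band selection scored by a boosted-tree regressor and R², can be illustrated with a small sketch. This is not the authors' pipeline: it swaps in a plain greedy forward selection for CARS, VISSA, and SPA, uses hand-written boosted one-split trees instead of a full GBDT library, and runs on synthetic "spectra" in which, by assumption, only bands 1 and 4 carry signal. All function names, constants, and data below are hypothetical.

```python
import random


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def fit_stump(X, residuals, bands):
    """Return the one-split regression stump over `bands` with the lowest squared error."""
    best = None  # (sse, band, threshold, left_mean, right_mean)
    for b in bands:
        values = sorted(set(row[b] for row in X))
        for lo, hi in zip(values, values[1:]):
            thr = (lo + hi) / 2.0
            left = [r for row, r in zip(X, residuals) if row[b] <= thr]
            right = [r for row, r in zip(X, residuals) if row[b] > thr]
            if not left or not right:
                continue
            lm, rm = sum(left) / len(left), sum(right) / len(right)
            sse = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
            if best is None or sse < best[0]:
                best = (sse, b, thr, lm, rm)
    return best[1:]


def fit_gbdt(X, y, bands, n_rounds=25, lr=0.3):
    """Gradient boosting for squared loss: each stump is fitted to the current residuals."""
    base = sum(y) / len(y)
    pred = [base] * len(y)
    stumps = []
    for _ in range(n_rounds):
        residuals = [t - p for t, p in zip(y, pred)]
        b, thr, lm, rm = fit_stump(X, residuals, bands)
        stumps.append((b, thr, lm, rm))
        pred = [p + lr * (lm if row[b] <= thr else rm) for row, p in zip(X, pred)]
    return base, lr, stumps


def predict(model, X):
    base, lr, stumps = model
    return [
        base + lr * sum(lm if row[b] <= thr else rm for b, thr, lm, rm in stumps)
        for row in X
    ]


def forward_select(X_tr, y_tr, X_va, y_va, n_bands, k):
    """Greedy wrapper: repeatedly add the band that most improves held-out R^2."""
    selected = []
    while len(selected) < k:
        scores = {}
        for b in range(n_bands):
            if b in selected:
                continue
            model = fit_gbdt(X_tr, y_tr, selected + [b])
            scores[b] = r2_score(y_va, predict(model, X_va))
        selected.append(max(scores, key=scores.get))
    return selected


# Synthetic data (assumption): 6 "bands", concentration driven by bands 1 and 4 only.
rng = random.Random(0)
N_BANDS = 6
X, y = [], []
for _ in range(90):
    row = [rng.random() for _ in range(N_BANDS)]
    X.append(row)
    y.append(3.0 * row[1] - 2.0 * row[4] + rng.gauss(0.0, 0.05))

X_train, y_train = X[:60], y[:60]
X_val, y_val = X[60:], y[60:]

selected = forward_select(X_train, y_train, X_val, y_val, N_BANDS, k=2)
model = fit_gbdt(X_train, y_train, selected)
validation_r2 = r2_score(y_val, predict(model, X_val))
print("selected bands:", sorted(selected), "held-out R^2:", round(validation_r2, 3))
```

Because the held-out split here is also used to choose the bands, the printed R² is optimistic; a study like the one described would report R² on a separate, independent set of samples.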

CLC Number: