Ecology and Environment ›› 2023, Vol. 32 ›› Issue (6): 1007-1015. DOI: 10.16258/j.cnki.1674-5906.2023.06.001

• Research Articles •

Estimation of Aboveground Biomass in the Arid Oasis Based on the Machine Learning Algorithm

WANG Xuemei1,2, YANG Xuefeng1,2, ZHAO Feng1,2, AN Baisong1, HUANG Xiaoyu1

  1. College of Geographic Science and Tourism, Xinjiang Normal University, Urumqi 830054, P. R. China
  2. Xinjiang Laboratory of Lake Environment and Resources in Arid Zone, Urumqi 830054, P. R. China
  • Received: 2023-01-28 Online: 2023-06-18 Published: 2023-09-01

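  • About the author: WANG Xuemei (born 1976), female, professor, Ph.D., master's supervisor; research interests: remote sensing applications for resources and environment in arid zones. E-mail: wangxm_1225@sina.com
  • Funding:
    National Natural Science Foundation of China (41561051); Natural Science Foundation of Xinjiang Uygur Autonomous Region (2020D01A79); National Natural Science Foundation of China (42261062)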

Abstract:

The aboveground biomass of vegetation is an important indicator of the carbon sequestration capacity of terrestrial ecosystems. Estimating and spatially inverting vegetation aboveground biomass in arid areas with remote sensing provides an important basis for health assessment and carbon storage estimation in desert oasis ecosystems. Based on field survey and sampling data, seven vegetation indices and 13 band variables were extracted from Landsat 8 OLI multispectral images and grouped into four variable combinations for modeling. Four machine learning algorithms, Support Vector Machine (SVM), Back Propagation Neural Network (BPNN), eXtreme Gradient Boosting (XGBoost), and Random Forest (RF), were used for remote sensing estimation and spatial inversion of aboveground biomass in the delta oasis of the Weigan-Kuqa rivers in Xinjiang. The results showed that: (1) The inversion models built from the band variables and from the variables selected by the random frog algorithm had significantly better estimation accuracy and more stable prediction ability than the models built from all variables or from the index variables. Compared with SVM and BPNN, the models built with the XGBoost and RF algorithms performed better and estimated the aboveground biomass of vegetation in the study area more accurately. (2) Among the constructed estimation models, the band variables combined with the RF algorithm gave the highest accuracy and the strongest stability. The coefficients of determination for the modeling set and validation set were 0.898 and 0.742, the mean absolute errors were 82.1 g·m⁻² and 79.2 g·m⁻², and the root-mean-square errors were 110.8 g·m⁻² and 132.1 g·m⁻², respectively. The relative analysis errors of both sets were greater than 1.8, so this model had the best fit. (3) The spatial differentiation of vegetation aboveground biomass in the study area was pronounced: biomass was higher in the oasis area and lower in the desert area, and it decreased gradually from the interior of the oasis to its periphery. Compared with the other three machine learning algorithms, the estimation model built with the RF algorithm had better estimation ability and stability and could accurately estimate the aboveground biomass of the arid oasis. In general, machine learning models based on the optimal variable combination provide a scientific basis for aboveground biomass inversion.
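
The four accuracy measures reported in the abstract can be computed from paired observed and predicted biomass values roughly as follows. This is a minimal sketch, not the authors' code: the function name `evaluate` and the sample values are hypothetical, and it assumes that the "relative analysis error" is the ratio of the sample standard deviation of the observed values to the RMSE (often abbreviated RPD), which the abstract does not define.

```python
import math
from statistics import mean, stdev


def evaluate(observed, predicted):
    """Return R^2, MAE, RMSE and RPD for paired biomass values (g·m^-2).

    Assumption: RPD = sample standard deviation of observed / RMSE.
    """
    if len(observed) != len(predicted) or len(observed) < 2:
        raise ValueError("need two equal-length sequences with at least 2 samples")
    residuals = [o - p for o, p in zip(observed, predicted)]
    obs_mean = mean(observed)
    ss_res = sum(r * r for r in residuals)               # residual sum of squares
    ss_tot = sum((o - obs_mean) ** 2 for o in observed)  # total sum of squares
    rmse = math.sqrt(ss_res / len(observed))
    return {
        "r2": 1.0 - ss_res / ss_tot,
        "mae": mean(abs(r) for r in residuals),
        "rmse": rmse,
        "rpd": stdev(observed) / rmse,
    }


# Hypothetical sample values, for illustration only (not data from the study).
observed = [100.0, 200.0, 300.0, 400.0]
predicted = [110.0, 190.0, 310.0, 390.0]
print(evaluate(observed, predicted))
```

Applying the same function separately to the modeling set and the validation set gives the pair of values quoted for each measure. The sketch does not handle a perfect fit (RMSE of zero) or constant observed values.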

Key words: machine learning algorithm, vegetation index, spectral band, aboveground biomass, spatial inversion, delta oasis of the Weigan-Kuqa rivers

