Paper title
Using Machine Learning to generate an open-access cropland map from satellite images time series in the Indian Himalayan Region
Paper authors
Paper abstract
Crop maps are crucial for agricultural monitoring and food management, and can also support domain-specific applications such as setting up cold supply chain infrastructure in developing countries. Machine learning (ML) models, combined with freely available satellite imagery, can be used to produce cost-effective crop maps at high spatial resolution. However, obtaining ground truth data for supervised learning is especially challenging in developing countries because of factors such as smallholder farming and fragmented geography, which often results in a lack of crop type maps or even reliable cropland maps. Our area of interest for this study lies in Himachal Pradesh, India, where we aim to produce an open-access binary cropland map at 10-meter resolution for the Kullu, Shimla, and Mandi districts. To this end, we developed an ML pipeline that relies on Sentinel-2 satellite image time series. We investigated two pixel-based supervised classifiers, support vector machines (SVM) and random forest (RF), which classify per-pixel time series for binary cropland mapping. The ground truth data used for training, validation, and testing were manually annotated from a combination of field survey reference points and visual interpretation of very high resolution (VHR) imagery. We trained and validated the models via spatial cross-validation to account for local spatial autocorrelation, and selected the RF model for its overall robustness and lower computational cost. We tested the generalization capability of the chosen model at the pixel level by computing the accuracy, recall, precision, and F1-score on hold-out test sets for each district, achieving an average accuracy of 87% for RF, our best model. We used this model to generate a cropland map for three districts of Himachal Pradesh, spanning 14,600 km², which improves on the resolution and quality of existing public maps.
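The spatial cross-validation step described in the abstract can be illustrated with a minimal sketch. This is not the authors' pipeline: the data are synthetic stand-ins for per-pixel time series, the helper names (`make_synthetic_pixels`, `spatial_blocks`, `spatial_cv_scores`) are hypothetical, and grouping pixels into square blocks with scikit-learn's `GroupKFold` is only one possible way to keep spatially neighboring pixels out of both the training and validation folds at once. The block size and model hyperparameters are arbitrary assumptions.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, f1_score, precision_score, recall_score
from sklearn.model_selection import GroupKFold
from sklearn.svm import SVC


def make_synthetic_pixels(n_pixels=600, n_dates=12, seed=0):
    """Synthetic stand-in for labeled pixels (assumption, not real Sentinel-2 data)."""
    rng = np.random.default_rng(seed)
    xy = rng.uniform(0.0, 100.0, size=(n_pixels, 2))  # pixel coordinates
    labels = rng.integers(0, 2, size=n_pixels)        # 1 = cropland, 0 = other
    seasonal = np.sin(np.pi * np.linspace(0.0, 1.0, n_dates))
    # Cropland pixels get a seasonal bump in a vegetation-index-like signal.
    series = 0.3 + 0.4 * labels[:, None] * seasonal[None, :]
    series += rng.normal(0.0, 0.1, size=series.shape)
    return xy, series, labels


def spatial_blocks(xy, block_size=25.0):
    """Assign each pixel to a square spatial block; block_size is arbitrary."""
    cols = np.floor(xy[:, 0] / block_size).astype(int)
    rows = np.floor(xy[:, 1] / block_size).astype(int)
    return rows * 1000 + cols


def spatial_cv_scores(model, X, y, groups, n_splits=4):
    """Mean validation metrics over folds that never split a spatial block."""
    fold_scores = []
    for train_idx, val_idx in GroupKFold(n_splits=n_splits).split(X, y, groups):
        model.fit(X[train_idx], y[train_idx])
        pred = model.predict(X[val_idx])
        fold_scores.append({
            "accuracy": accuracy_score(y[val_idx], pred),
            "precision": precision_score(y[val_idx], pred),
            "recall": recall_score(y[val_idx], pred),
            "f1": f1_score(y[val_idx], pred),
        })
    return {k: float(np.mean([s[k] for s in fold_scores])) for k in fold_scores[0]}


if __name__ == "__main__":
    xy, X, y = make_synthetic_pixels()
    groups = spatial_blocks(xy)
    candidates = {
        "RF": RandomForestClassifier(n_estimators=50, random_state=0),
        "SVM": SVC(kernel="rbf", C=1.0),
    }
    for name, model in candidates.items():
        print(name, spatial_cv_scores(model, X, y, groups))
```

Because the folds are formed from whole blocks, a validation pixel never has a training pixel in its own block, which reduces the optimistic bias that spatial autocorrelation introduces under a purely random split.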