Spectral Unmixing
-
Land Use and Land Cover (LULC) maps are crucial for environmental monitoring. This workflow uses Remote Sensing (RS) and Artificial Intelligence (AI) to automatically create LULC maps by estimating the relative abundance of LULC classes. Using MODIS data and ancillary geographic information, an AI model was trained and validated in Andalusia, Spain, providing a tool for accurate and efficient LULC mapping.

Background
LULC maps provide precise information for the dynamic monitoring, planning, and management of the Earth. Regularly updated global LULC datasets provide the basis for understanding the status, trends, and pressures of human activity on carbon cycles, biodiversity, and other natural and anthropogenic processes. Creating these maps automatically, without manual labor, by using new RS and AI technologies is therefore a promising avenue to explore.

Introduction
In the last few decades, LULC maps have been created from RS images following the "raster data model", in which the Earth's surface is divided into squares of a given spatial resolution, called pixels. Each pixel is then assigned a "LULC class" (e.g., forest, water, urban) that represents the underlying type of Earth surface in that pixel. The number of different classes in a LULC map is referred to as its thematic resolution. Frequently, the spatial and thematic resolutions do not match, which leads to the mixed pixel problem: pixels are not pure but contain several LULC classes. Under a "hard" classification approach, a mixed pixel is assigned just one LULC class (e.g., the dominant class), whereas under a "soft" classification approach (also called spectral unmixing or abundance estimation) the relative abundance of each LULC class is provided per pixel.
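The difference between hard and soft classification can be illustrated with a minimal sketch. The class names and abundance values below are made up for illustration and do not come from the dataset described here.

```python
import numpy as np

# Hypothetical class names and per-pixel abundances (illustrative values only).
classes = ["forest", "water", "urban"]
abundances = np.array([
    [0.70, 0.20, 0.10],  # mostly forest
    [0.40, 0.35, 0.25],  # strongly mixed pixel
    [0.00, 1.00, 0.00],  # pure water pixel
])

# Soft classification keeps the whole abundance vector; each row sums to 1.
assert np.allclose(abundances.sum(axis=1), 1.0)

# Hard classification keeps only the dominant class of each pixel.
hard_labels = [classes[i] for i in abundances.argmax(axis=1)]
print(hard_labels)  # ['forest', 'forest', 'water']

# Share of each pixel's area that the hard label fails to describe.
lost = 1.0 - abundances.max(axis=1)
print(lost)
```

In this toy case the second pixel receives the label "forest" although 60% of its area belongs to other classes, which is the information a soft classification retains.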
Moreover, ancillary geographic, topographic, and climatic information about the study area can also help to assign each pixel to its corresponding LULC class. Specifically, the following ancillary variables are studied: GPS coordinates, altitude, slope, precipitation, potential evapotranspiration, mean temperature, maximum temperature, and minimum temperature.

Aims
To estimate the relative abundance of LULC classes in Andalusia and to develop an AI model that performs this task automatically, a new labeled dataset of MODIS pixels over Andalusia at 460 m resolution was built. Each pixel is a multi-spectral time series accompanied by its ancillary information, and it is labeled with the abundance of each LULC class inside that pixel. The labels are provided at two hierarchical levels, N1 (coarser) and N2 (finer). They were created from the SIPNA (Sistema de Información sobre el Patrimonio Natural de Andalucía) product, which aims to build an information system on the natural heritage of Andalusia. The first level, "N1", contains four high-level LULC classes, whereas the second level, "N2", contains ten finer LULC classes. The model was therefore trained and validated mainly on the region of Andalusia in Spain. Once the dataset was created, the AI model was trained on about 80% of the data and validated on the remaining 20%, following a careful spatial block splitting strategy to avoid spatial autocorrelation between the two sets. The AI model processes the multi-spectral MODIS time series at 460 m together with the ancillary information to predict the LULC abundances in each pixel. Both the RS dataset with the ancillary data used to create the AI model and the AI model itself are the deliverables of this project.
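The idea behind a spatial block split can be sketched as follows. This is only an illustration under stated assumptions: the function `spatial_block_split`, the square block shape, the 4600 m block size, and the random selection of blocks are all hypothetical, since the exact splitting procedure is not detailed here.

```python
import numpy as np

def spatial_block_split(x, y, block_size, test_fraction=0.2, seed=0):
    """Return boolean (train_mask, test_mask); whole blocks go to one side only.

    Hypothetical helper: x and y are projected pixel coordinates in metres.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Column and row index of the square block that contains each pixel.
    bx = np.floor(x / block_size).astype(int)
    by = np.floor(y / block_size).astype(int)
    # Collapse the two indices into a single integer key per block.
    key = (bx - bx.min()) * (by.max() - by.min() + 1) + (by - by.min())
    blocks, inverse = np.unique(key, return_inverse=True)
    # Randomly pick about test_fraction of the blocks for validation.
    rng = np.random.default_rng(seed)
    n_test = max(1, int(round(test_fraction * len(blocks))))
    test_blocks = rng.choice(len(blocks), size=n_test, replace=False)
    test_mask = np.isin(inverse, test_blocks)
    return ~test_mask, test_mask

# Hypothetical 100 x 100 grid of pixel centres spaced 460 m apart.
coords = np.arange(100) * 460.0
xs, ys = np.meshgrid(coords, coords)
xs, ys = xs.ravel(), ys.ravel()
train_mask, test_mask = spatial_block_split(xs, ys, block_size=4600.0)
print(train_mask.sum(), test_mask.sum())  # 8000 2000
```

Because neighbouring pixels tend to be similar, assigning whole blocks rather than individual pixels to the validation set keeps close neighbours from landing on both sides of the split. The fraction applies to blocks, so the pixel ratio is only approximately 80/20 when blocks hold different numbers of pixels.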
In summary, we provide an automatic tool to estimate the LULC class abundances of MODIS pixels from Andalusia using a soft classification approach, and we set out a methodology that could in the future be applied to other satellites whose better spatial resolution allows the use of finer LULC classes. The AI model could also serve as a starting point for researchers interested in applying it to other locations: they can fine-tune the existing model with data from the new region of interest, which requires far less training data because the patterns learned by our model are transferred.

Scientific Questions
Through the development of this workflow, we aim to address three main scientific questions:
1. Can we predict LULC abundances in a particular place through remote sensing, ancillary data, and AI technologies?
-
This workflow employs a deep learning model for blind spectral unmixing, avoiding the need for expensive hyperspectral data. The model processes 224x224 pixel RGB images and associated environmental data to generate CSV files detailing LULC abundance at two levels of detail (N1 and N2). The aim is to provide an efficient tool for LULC monitoring, answering the question: can LULC abundance be estimated from RGB images and environmental data? This framework supports environmental monitoring and land cover analysis.

Background
Land Use and Land Cover (LULC) represents the biophysical properties of the Earth's surface, of natural or human origin, such as forests, water bodies, agricultural fields, or urban areas. Often, different LULC types are mixed together in the same analyzed area. Nowadays, spectral imaging sensors allow us to capture these mixed LULC types (i.e., endmembers) together as different spectral data signals. Identifying the LULC types within a spectral mixture (i.e., endmember identification) and quantitatively assessing their abundances (i.e., endmember abundance estimation) play a key role in understanding Earth surface transformations and climate change effects. These two tasks are carried out by spectral unmixing algorithms, which decompose the measured spectrum of a mixed image into a collection of constituent spectra (the endmembers) and a set of fractions indicating their abundances.

Introduction
Early research on spectral unmixing dates back more than three decades. The first attempts, referred to as linear unmixing, assumed that the spectral response recorded for an LULC mixture is simply an additive function of the spectral response of each class, weighted by its proportional coverage. Notably, some authors used linear regression and similar linear mixture-based techniques to relate the spectral response to its class composition.
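The linear mixing assumption can be made concrete with a small numerical sketch. The endmember spectra and abundances below are invented for illustration, and ordinary least squares stands in for the linear regression techniques mentioned above; it is not the method used in this workflow.

```python
import numpy as np

# Hypothetical endmember spectra: one row per band, one column per class
# (vegetation, water, bare soil). Values are illustrative reflectances.
E = np.array([
    [0.04, 0.08, 0.15],
    [0.08, 0.06, 0.20],
    [0.05, 0.03, 0.25],
    [0.50, 0.01, 0.30],
])
true_abundances = np.array([0.6, 0.1, 0.3])

# Linear mixing: the observed spectrum is the abundance-weighted sum
# of the endmember spectra (noise-free in this toy example).
pixel = E @ true_abundances

# Linear unmixing as an ordinary least-squares regression of the
# observed spectrum on the known endmember spectra.
estimated, *_ = np.linalg.lstsq(E, pixel, rcond=None)
print(np.round(estimated, 3))
```

Plain least squares does not by itself force the estimates to be non-negative or to sum to one; with noisy data those constraints usually have to be imposed separately. The sketch also shows why this family of methods depends on knowing the endmember spectra `E` in advance.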
Afterwards, other authors argued that the linearity assumption needed to be overcome and proposed non-linear unmixing methods. However, non-linear methods require the endmember spectra to be extracted for each LULC class, which several works have found difficult. Moreover, some studies indicated that the spectra are unlikely to be derivable directly from the remotely sensed data, since the majority of image pixels may be mixed. To overcome these limitations, several works introduced blind spectral unmixing, an alternative that avoids the need to derive any endmember spectra or to make any prior assumption about how they mix. However, most works that adopted blind spectral unmixing used deep learning-based models trained with expensive and hard-to-process hyperspectral or multispectral images. Many researchers during the last decade have therefore pointed out that more effort should be dedicated to using more affordable remote sensing data with few bands in spectral unmixing. They justified this need by two important factors: (1) in real situations, we might only have access to images with a few bands because of their availability, cost-effectiveness, and faster acquisition compared with imagery gathered with multi-band devices, which requires more processing effort and expense; (2) in some cases a huge number of bands is not really needed, since multi-band data can serve as a fundamental dataset from which the optimal wavebands for a particular application are determined. In parallel, high-quality research on applying artificial intelligence to remote sensing imagery, such as computer vision-based techniques and especially deep learning (DL), is continuously achieving new breakthroughs that encourage researchers to entrust remote sensing imagery analysis tasks to these models and to be confident about their performance.
Aims
The objective of this work is to present what is, to our knowledge, the first study that explores a multi-task deep learning approach for blind spectral unmixing using only 224x224 pixel RGB images derived from Sentinel-2, enriched with their corresponding environmental ancillary data (topographic and climatic), without the need for expensive and complex hyperspectral or multispectral data. The proposed deep learning model is trained with a multi-task learning (MTL) approach, because this machine learning method combines information from several tasks to improve the model's performance on each specific task, motivated by the idea that different tasks can share common feature representations. Thus, the model provided in this workflow was optimized for the endmember abundance estimation task, which quantifies the spatial percentage covered by each LULC type within the analyzed RGB image, while also being trained on other spectral unmixing related tasks that improve its accuracy on this main task. For each input (RGB image + ancillary data), the model returns the abundance values of the endmembers contained in its area, summarized in an output CSV file. The results can be computed for two different levels, N1 and N2. These correspond to the two land use/cover level definitions of the SIPNA land use/cover mapping campaign (Sistema de Información sobre el Patrimonio Natural de Andalucía), which aims to build an information system on the natural heritage of Andalusia in Spain (https://www.juntadeandalucia.es/medioambiente/portal/landing-page-%C3%ADndice/-/asset_publisher/zX2ouZa4r1Rf/content/sistema-de-informaci-c3-b3n-sobre-el-patrimonio-natural-de-andaluc-c3-ada-sipna-/20151). The first level, "N1", contains four high-level LULC classes, whereas the second level, "N2", contains ten finer LULC classes.
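The shape of the CSV output can be sketched as below. Everything in this sketch is an assumption made for illustration: the softmax normalisation, the helper `abundances_to_csv`, the column names, the image identifiers, and the random scores standing in for model outputs. The actual output layer and file layout of the workflow are not described here.

```python
import csv
import io
import numpy as np

def softmax(scores):
    """Turn raw scores into non-negative fractions that sum to 1 per row."""
    shifted = scores - scores.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)

def abundances_to_csv(image_ids, scores, prefix):
    """Hypothetical helper: one CSV row of class abundances per input image."""
    abund = softmax(np.asarray(scores, dtype=float))
    header = ["image_id"] + [f"{prefix}_class_{k + 1}" for k in range(abund.shape[1])]
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(header)
    for image_id, row in zip(image_ids, abund):
        writer.writerow([image_id] + [f"{v:.4f}" for v in row])
    return buf.getvalue()

# Random scores stand in for the outputs of the two abundance heads:
# four classes at level N1 and ten classes at level N2.
rng = np.random.default_rng(0)
ids = ["tile_a", "tile_b"]
n1_csv = abundances_to_csv(ids, rng.normal(size=(2, 4)), "N1")
n2_csv = abundances_to_csv(ids, rng.normal(size=(2, 10)), "N2")
print(n1_csv)
```

Each row then holds the estimated fraction of the image area covered by each class at the chosen level, so the values in a row add up to approximately one after rounding.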
Because its labels come from SIPNA, this model was mainly trained and validated on the region of Andalusia in Spain.

Scientific Questions
Through the development of this workflow, we aim to address the following main scientific question:
- Can we estimate the abundance of each land use/land cover type inside an RGB satellite image using only the RGB image and the environmental ancillary data corresponding to the area covered by this image?