Harnessing AI to Monitor and Optimize Reforestation Efforts in Madagascar

May 23, 2024


Reforestation is a critical solution to combat climate change and restore degraded ecosystems. However, monitoring the success of reforestation efforts can be challenging, especially over large areas. To address this problem, we partnered with Bôndy to develop a machine learning system that uses satellite and drone imagery to track and quantify the progress of their reforestation program in Madagascar. By leveraging advanced AI algorithms, we aimed to provide valuable insights into the effectiveness of reforestation efforts and support data-driven decision making for environmental conservation.

Why We Need Reforestation

Reforestation done by voluntary group. Source: Freepik

The world is facing a climate crisis. Global temperatures are rising, the ice caps are melting, and the oceans are warming. The Earth’s natural resources are being depleted, and we need to act now before it is too late. Reforestation is one of the most immediate actions we can take to help mitigate the climate crisis.

According to the World Resources Institute, the world has lost one-third of its forests since the last ice age, and deforestation continues at an alarming rate of about 10 million hectares per year. Reforestation can help reverse this trend and provide numerous benefits. 

Reforestation can sequester an estimated 1.1–1.6 Gt of CO2 per year, equivalent to about 25% of current annual fossil fuel emissions.

In addition to carbon sequestration, reforestation can also improve biodiversity, prevent soil erosion, and provide economic opportunities for local communities.

Why Reforestation is Challenging

  1. Harsh Environmental Conditions: Reforestation efforts often take place in areas with challenging environmental conditions, such as drought, extreme temperatures, or poor soil quality. These factors can hinder the growth and survival of newly planted trees, making it difficult for them to establish themselves in the ecosystem.
  2. Lack of Resources: Reforestation projects require significant resources, including funding, labor, and equipment. In many cases, these resources may be limited, making it challenging to implement and maintain large-scale reforestation efforts effectively.
  3. Pests and Diseases: Young trees are particularly vulnerable to pests and diseases, which can cause significant damage and mortality. Without proper monitoring and intervention, these threats can quickly spread and compromise the success of the entire reforestation project.
  4. Competition with Invasive Species: Invasive plant species can outcompete native tree seedlings for resources, such as water, nutrients, and sunlight. This competition can severely impact the growth and survival of the planted trees, making it crucial to monitor and manage invasive species populations.
  5. Human Interference: Reforestation sites may be subject to human interference, such as grazing, logging, or land conversion for agriculture. These activities can damage or destroy young trees, undermining the reforestation efforts. Close monitoring is necessary to identify and prevent such interference, ensuring the long-term success of the project.

The Partner

Our partner Bôndy works to mitigate climate change through a reforestation program that plants thousands of trees in Madagascar, one of the world’s biodiversity hotspots. The program aims to create social and environmental impact and to help the rural population earn more sustainable revenues. Bôndy’s community-based reforestation program covers 5 regions, spanning a range of climate zones and land uses, from agroforestry to mangroves.

The Goal

This Omdena challenge aimed to develop an AI algorithm using satellite and drone imagery to monitor the success of reforestation efforts, focusing on the critical first 5 years of tree growth and survival. By analyzing imagery data, the algorithm would assess the health and progress of newly planted trees, providing insights to ensure the long-term success of the reforestation initiative.

The project brought together AI experts, environmental scientists, and forestry specialists to create a robust and scalable solution that could be applied to reforestation projects worldwide, contributing to the fight against climate change and the preservation of our planet’s ecosystems.

Our Approach

Project Pipeline

Data Collection

The following data sources were used in the project.

Satellite Data Sources

Planet satellite monthly data

Norway’s International Climate & Forests Initiative (NICFI) mosaics are available as both monthly and biannual collections (the biannual ones are generated every 6 months). For this challenge, the team used the monthly data.

  • RGB (Red Green Blue) and Near InfraRed bands
  • 4.7m resolution
  • Timeframe: May 2020 – May 2022

Sentinel 2 satellite cloud-free L3A data

Level 3A products for Sentinel-2 are monthly, cloudless, surface reflectance syntheses.

  • RGB and Near InfraRed bands
  • 10m resolution
  • Timeframe: May 2020 – May 2022

Drone Image Sources:

The drone images were provided by Bôndy.

Bôndy-collected data

  • RGB bands
  • Ground sampling distance: 5 cm and above
  • Timeframe: May 2020 – July 2022

Georeferenced drone image of a Bôndy field

Meteorological Data:

ERA5-Land data

ERA5-Land is a reanalysis dataset providing a consistent view of the evolution of land variables over several decades at an enhanced resolution compared to ERA5.

  • Temperature, Precipitation, and Evapotranspiration
  • 10km (0.1 degree) spatial resolution
  • Timeframe: May 2020 – July 2022

ERA 5 Land Reanalysis Data

Field Data

Bôndy-collected field information: Bôndy provided information about their 150 fields, including the field outlines. The data was delivered as KMZ files for the 5 regions.

  • 150 Bôndy field outlines
  • Tree planting data per field
  • 1,800 tree locations and tree images

Data Preprocessing and Analysis

The team used Python libraries such as Rasterio to process and visualize the satellite data. QGIS with KMLTools was used to clean the field data and extract valid reforestation site polygons.

Python’s PIL library was used to extract drone image metadata, while OpenDroneMap generated orthomosaics, elevation models, and 3D textured models. The Ground Sample Distance (GSD), which is essential for estimating tree counts from images, was also obtained.
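
To illustrate where a GSD value comes from, the sketch below estimates it from camera geometry, assuming a nadir-pointing camera over flat ground. It is not necessarily how the team obtained their GSD. The function name and the camera numbers are assumptions chosen for the example.

```python
def ground_sample_distance_cm(sensor_width_mm, focal_length_mm, flight_height_m, image_width_px):
    """Return the ground sample distance in cm per pixel for a nadir-pointing camera."""
    if min(sensor_width_mm, focal_length_mm, flight_height_m, image_width_px) <= 0:
        raise ValueError("all inputs must be positive")
    # Similar triangles: footprint width / flight height = sensor width / focal length.
    footprint_width_m = sensor_width_mm * flight_height_m / focal_length_mm
    # Convert the footprint to cm and spread it over the image width in pixels.
    return footprint_width_m * 100.0 / image_width_px


# Illustrative numbers only (not taken from the Bôndy flights).
gsd = ground_sample_distance_cm(
    sensor_width_mm=13.2, focal_length_mm=8.8, flight_height_m=100.0, image_width_px=5472
)
print(f"GSD: {gsd:.2f} cm/pixel")  # GSD: 2.74 cm/pixel
```

Because GSD scales linearly with flight height, halving the height halves the GSD under these assumptions.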

Georeferenced Digital Surface Models (DSM)

Applying Super-Resolution to Images

The team experimented with pre-trained super-resolution models, particularly the SRCNN model from Dong et al., using an implementation by WarrenGreen. They began by applying transfer learning, then trained the model on high-resolution drone images from Bôndy. Subsequently, they retrained the model using medium-resolution images and evaluated its performance on lower-resolution ones.

The model accepts 400×400 images, so larger images were cropped into patches, augmented, and used for training. To create lower-resolution training inputs, the team scaled the images down and then back up. After training, the model was converted to the ONNX format for faster inference. It can then be applied like a convolutional filter to transform images into their super-resolution counterparts.
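
The crop-and-degrade step can be sketched as follows. This is a minimal illustration, not the team's code. It assumes non-overlapping patches, block averaging for the downscale, and pixel repetition for the upscale, where the project may well have used a different interpolation. The helper names `crop_patches` and `degrade` are hypothetical.

```python
import numpy as np

PATCH = 400  # input size stated for the model


def crop_patches(image, size=PATCH):
    """Split an H x W x C array into non-overlapping size x size patches, dropping ragged edges."""
    h, w = image.shape[:2]
    return [
        image[r:r + size, c:c + size]
        for r in range(0, h - size + 1, size)
        for c in range(0, w - size + 1, size)
    ]


def degrade(patch, factor=2):
    """Scale a patch down by block averaging, then back up by pixel repetition."""
    h, w, ch = patch.shape
    if h % factor or w % factor:
        raise ValueError("patch size must be divisible by factor")
    small = patch.reshape(h // factor, factor, w // factor, factor, ch).mean(axis=(1, 3))
    return np.repeat(np.repeat(small, factor, axis=0), factor, axis=1)


rng = np.random.default_rng(0)
image = rng.random((850, 1230, 3))  # synthetic stand-in for a drone image
# Each pair is (low-resolution input, high-resolution target) at the same pixel size.
pairs = [(degrade(p), p) for p in crop_patches(image)]
print(len(pairs), pairs[0][0].shape)  # 6 (400, 400, 3)
```

Keeping the degraded input at the same pixel dimensions as the target matches the SRCNN setup, in which the network refines an already upscaled image.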

Super Resolution Images

Vegetation Indices

Three vegetation indices were calculated: the Normalized Difference Vegetation Index (NDVI), the Normalized Difference Water Index (NDWI), and the Modified Soil Adjusted Vegetation Index (MSAVI2). The indices were computed in Python using the rasterio library.
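
For reference, the three indices can be written in a few lines of NumPy. This is a sketch under assumptions: NDWI is taken in its green/NIR (McFeeters) form, since only RGB and NIR bands are listed among the data sources, and the inputs are assumed to be reflectance values scaled to the range 0 to 1. In practice the band arrays would be read with rasterio. Small synthetic arrays are used here.

```python
import numpy as np


def _safe_ratio(num, den):
    """Divide elementwise, returning NaN where the denominator is zero."""
    out = np.full(num.shape, np.nan, dtype=float)
    np.divide(num, den, out=out, where=den != 0)
    return out


def ndvi(nir, red):
    nir, red = np.asarray(nir, dtype=float), np.asarray(red, dtype=float)
    return _safe_ratio(nir - red, nir + red)


def ndwi(green, nir):
    # Green/NIR (McFeeters) form; an assumption, since the article does not say which form was used.
    green, nir = np.asarray(green, dtype=float), np.asarray(nir, dtype=float)
    return _safe_ratio(green - nir, green + nir)


def msavi2(nir, red):
    # Expects reflectance scaled to [0, 1].
    nir, red = np.asarray(nir, dtype=float), np.asarray(red, dtype=float)
    return (2 * nir + 1 - np.sqrt((2 * nir + 1) ** 2 - 8 * (nir - red))) / 2


# Synthetic two-pixel bands: one vegetated pixel and one empty (all-zero) pixel.
nir = np.array([0.5, 0.0])
red = np.array([0.1, 0.0])
green = np.array([0.1, 0.0])
print(ndvi(nir, red), ndwi(green, nir), msavi2(nir, red))
```

Returning NaN for zero denominators keeps no-data pixels from being mistaken for real index values when the results are averaged over a parcel.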

Vegetation Indices (NDVI, NDWI and MSAVI2)

Time Series for Meteorological Data

Three environmental parameters (temperature, precipitation, and evaporation) were chosen to monitor the impact of the environment on tree health in 3 regions of Madagascar. A time series model was built for each parameter using Facebook’s Prophet library. The input came from the ERA5-Land hourly dataset (1950 to present) and was extracted with the Python library xarray at the grid point closest to each parcel.
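
The "closest grid point" lookup can be sketched in plain NumPy as shown below. The grid bounds, the parcel coordinates, and the temperature values are all synthetic assumptions, and `nearest_grid_point` is a hypothetical helper. With xarray, the same selection is typically written with `.sel(..., method="nearest")` on the latitude and longitude coordinates.

```python
import numpy as np


def nearest_grid_point(lats, lons, lat, lon):
    """Return (i, j) indices of the node closest to (lat, lon) on a regular lat/lon grid."""
    i = int(np.abs(np.asarray(lats) - lat).argmin())
    j = int(np.abs(np.asarray(lons) - lon).argmin())
    return i, j


# Regular 0.1-degree grid roughly covering Madagascar (illustrative bounds).
lats = np.round(np.arange(-11.0, -26.01, -0.1), 1)
lons = np.round(np.arange(43.0, 51.01, 0.1), 1)

# Synthetic temperature cube with shape (time, lat, lon); real values would come from ERA5-Land.
temps = np.random.default_rng(0).normal(25.0, 3.0, size=(24, lats.size, lons.size))

parcel_lat, parcel_lon = -18.93, 47.52  # hypothetical parcel centroid
i, j = nearest_grid_point(lats, lons, parcel_lat, parcel_lon)
series = temps[:, i, j]  # one value per time step at the closest grid point
print(lats[i], lons[j], series.shape)
```

The resulting one-dimensional series is the kind of input a forecasting library such as Prophet expects, once it is paired with timestamps.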

Temperature and evaporation time series

Precipitation Trend

Modeling

The model was trained with the UNet++ architecture, a semantic segmentation architecture based on UNet that was originally developed for medical image segmentation. We chose UNet++ for this challenge because it outperforms UNet: its nested, connected decoder sub-networks improve the processing of the extracted features. EfficientNet serves as the backbone (the feature extractor) of our model.

Modeling Steps

Postprocessing

The model outputs raw masks showing pixel-wise tree locations. To get the approximate tree count and vegetation percentage, the outputs must be processed using the ground sampling distance (GSD) in cm and average tree size in cm². While the model can’t detect individual trees, it can detect tree patches and estimate the tree count, albeit with a potentially large error margin. However, this can still provide insights into vegetation changes over time. Calculating vegetation percentage and approximating tree numbers involved basic mathematical operations.
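
The arithmetic behind these two numbers can be sketched as below. This is an illustration under assumptions: the raw mask is treated as per-pixel probabilities with a confidence threshold, every pixel is taken to cover GSD × GSD on the ground, and the tree count is the detected area divided by an average tree size. The function name and the example values are hypothetical.

```python
import numpy as np


def summarize_mask(prob_mask, gsd_cm, avg_tree_area_cm2, threshold=0.5):
    """Return (vegetation percent, approximate tree count) for one predicted mask."""
    mask = np.asarray(prob_mask, dtype=float)
    tree_pixels = int((mask >= threshold).sum())
    vegetation_pct = 100.0 * tree_pixels / mask.size
    # Each pixel covers gsd_cm x gsd_cm on the ground.
    tree_area_cm2 = tree_pixels * gsd_cm ** 2
    approx_trees = round(tree_area_cm2 / avg_tree_area_cm2)
    return vegetation_pct, approx_trees


mask = np.zeros((100, 100))
mask[20:60, 30:80] = 0.9  # a 40 x 50 pixel "tree patch"
pct, trees = summarize_mask(mask, gsd_cm=5.0, avg_tree_area_cm2=10_000.0)
print(f"{pct:.1f}% vegetation, about {trees} trees")  # 20.0% vegetation, about 5 trees
```

The tree count depends directly on the assumed average tree size, which is one reason the error margin can be large for patches that mix saplings and older trees.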

Vegetation Percent

Number of trees

Dashboard Deployment

The main objective of the dashboard is to estimate the number of trees in a given region from a drone image. To do so, the UNet++ model was integrated into the dashboard, along with the vegetation index time series and the meteorological forecasts. The dashboard was packaged as a Docker container.

Deployment with docker container

Results and Insights

After experimenting with deep learning architectures, the team selected UNet++ for model training. This robust semantic segmentation model can identify areas with planted trees. While it may not detect every small sapling, it will become more useful as the trees mature. The model can help detect and estimate tree counts in a given parcel.

Model performance was evaluated using 4 metrics: IoU (Intersection over Union), Precision, Recall, and Loss.
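
As a reminder of how the three mask-based metrics are defined, here is a minimal NumPy sketch for a pair of binary masks. It is not the team's evaluation code, and the small `eps` term, added to avoid division by zero, is an assumption.

```python
import numpy as np


def segmentation_scores(pred, target, eps=1e-7):
    """Return IoU, precision and recall for two binary masks of the same shape."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    tp = np.logical_and(pred, target).sum()   # tree pixels predicted as tree
    fp = np.logical_and(pred, ~target).sum()  # background predicted as tree
    fn = np.logical_and(~pred, target).sum()  # tree pixels that were missed
    return {
        "iou": tp / (tp + fp + fn + eps),
        "precision": tp / (tp + fp + eps),
        "recall": tp / (tp + fn + eps),
    }


target = np.zeros((4, 4), dtype=bool)
target[:2, :] = True   # 8 true tree pixels
pred = np.zeros((4, 4), dtype=bool)
pred[:3, :2] = True    # 6 predicted pixels, 4 of them correct
scores = segmentation_scores(pred, target)
print({k: round(float(v), 3) for k, v in scores.items()})
```

In this toy case IoU is 0.4, precision about 0.667, and recall 0.5: the prediction covers only half of the true trees, and a third of what it marks is background.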

As the model trained, the IoU on the training data increased with the number of batches (steps). A similar trend is seen on the test data.

Intersection over Union (IoU) on test and training data

The loss function measures how far the model’s predictions are from the actual values: the closer the predictions are to the actual values, the lower the loss, and the further away they are, the higher the loss. Our model’s loss decreased as training progressed.

Loss function on test and training data

As training progressed, precision on the training data increased with the number of batches, while precision on the test data fluctuated more.

Precision on test and training data

Recall would normally be expected to increase during training. Between batches 10 and 25, however, our model showed an unusual decrease. We kept training, and the upward trend resumed from batch 25 onwards. Recall on the test data is also somewhat bumpy, but it shows an overall increasing trend.

Recall on test and training data

Dashboard

From a drone image, the model can predict tree patches, the approximate number of trees, and vegetation health (as a percentage). Predictions can be generated with the model version that achieved the best precision, the best loss, or the best recall. The user can also adjust the ground sampling distance, tree size, and confidence threshold.

Model Prediction Using Dashboard

The vegetation indices page lets users choose from NDVI, NDWI, and MSAVI2 for a particular date and particular region. It also shows the time series data for that index.

Dashboard with Vegetation Indices and Time Series (NDVI over time)

The meteorological data page shows time series forecasts of temperature, precipitation, and evaporation for 3 regions of Madagascar.

Time series for meteorological data

Key Achievements

The partnership between Bôndy and Omdena’s machine learning team yielded several notable achievements in advancing reforestation monitoring capabilities in Madagascar:

  1. Developed Machine Learning Model for Reforestation Monitoring: Successfully built and trained a machine learning model to track and quantify reforestation progress in Madagascar using satellite and drone imagery. The model can predict tree patches, estimate tree counts, and assess vegetation health from drone images.
  2. Created Interactive Monitoring Dashboard: Developed a user-friendly dashboard that visualizes model predictions, allows customization of parameters like ground sampling distance and confidence thresholds, and displays vegetation indices and meteorological time series data for different regions of interest.
  3. Integrated Multiple Data Sources: Leveraged a combination of satellite imagery, drone imagery, and meteorological data to provide a comprehensive view of reforestation efforts over time. This multi-modal approach enables more robust monitoring and analysis.
  4. Established Methodology for Scaling Monitoring Efforts: Provided recommendations for collecting valid geometry parcels, automating drone flight paths, establishing ground control points, and exploring advanced imaging technologies. These steps lay the groundwork for scaling and optimizing the reforestation monitoring methodology.

Future Scope

The achievements of the Bôndy-Omdena partnership have laid a strong foundation for expanding and enhancing reforestation monitoring capabilities in Madagascar and beyond. To build upon this progress, several key areas of focus have been identified for future development:

  • Collecting Valid Geometry Parcels: Launch a survey for all sites to collect valid geometry parcels per site.
  • Automated Drone Flight Paths: For drone imagery, Bôndy could plan automated flight paths in advance. A constant flight height is recommended to keep the Ground Sampling Distance (GSD) uniform, which improves model predictions. Each plot requires 5-65 images (32 recommended) for a good balance between processing time and accuracy. Image overlap should be at least 65% (72% recommended), and images used for 3D reconstruction must overlap by at least 83%.
  • Establishing Ground Control Points: Ground Control Points (GCPs) can be established at a single site to check the processing results for different automatic flight path settings until the best one is found.
  • Exploring Advanced Imaging Technologies: In the future, Bôndy could try a multi-spectral camera with NIR band, point cloud (LiDAR) data, or commercial satellite data. However, LiDAR data may be too precise for the purposes and may not detect very small saplings in the first year or two of growth. It may be better suited for trees at 4-5+ years post-planting.
  • Calculating Tree Height and Topography: An alternative to LiDAR would be to calculate tree height and topography using photogrammetry, with imagery from a DJI Phantom 4 drone.
  • Crowdsourcing Drone Data Collection: Another alternative solution would be to rent a drone to fly LiDAR once per field site. For example, GLOBHE (https://globhe.com/) offers crowdsourced drone data collection for a low cost.
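
To make the overlap figures in the flight-path recommendation concrete, the sketch below converts an overlap fraction into the distance between consecutive exposures. It assumes a nadir camera over flat ground. The 5 cm GSD and the 4000-pixel image side are illustrative values, and `shot_spacing_m` is a hypothetical helper.

```python
def shot_spacing_m(gsd_cm, image_side_px, overlap):
    """Distance between consecutive exposures that yields the requested overlap fraction."""
    if not 0.0 <= overlap < 1.0:
        raise ValueError("overlap must be in [0, 1)")
    # Ground footprint of one image along the flight direction, in metres.
    footprint_m = gsd_cm * image_side_px / 100.0
    return footprint_m * (1.0 - overlap)


for overlap in (0.65, 0.72, 0.83):
    print(f"{overlap:.0%} overlap -> {shot_spacing_m(5.0, 4000, overlap):.1f} m between shots")
```

Under these assumptions each image covers 200 m along track, so raising the overlap from 65% to 83% roughly halves the spacing (from 70 m to 34 m) and roughly doubles the number of images per flight line.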

Potential Applications in Other Industries

  1. Precision Agriculture: Adapt the machine learning model to assess crop health, estimate yield, and detect disease or pests, helping farmers optimize resources and improve productivity.
  2. Urban Forestry Management: Use the technology to assess urban tree health, identify areas needing maintenance, and quantify environmental benefits like carbon sequestration and air pollution reduction.
  3. Wetland and Coastal Restoration: Monitor the progress of restoration projects, such as mangrove reforestation or seagrass bed rehabilitation, by analyzing vegetation establishment and water quality changes.
  4. Mine Site Rehabilitation: Utilize the model to monitor revegetation efforts, detect soil erosion or instability, and ensure compliance with environmental regulations.
  5. Wildfire Recovery Monitoring: Track natural regeneration, assess the effectiveness of post-fire restoration, and identify areas requiring additional interventions like erosion control.
  6. Habitat Conservation and Restoration: Assess the health and extent of critical habitats to inform conservation strategies, prioritize restoration efforts, and evaluate habitat management practices.
  7. Carbon Offset Project Verification: Verify the success and environmental benefits of carbon offset projects involving reforestation or afforestation, ensuring the integrity of carbon offset markets.
