
Flood Risk Assessment Using Analytical Hierarchy Process (AHP) and Machine Learning Models

A step-by-step case study on flood risk assessment using the Analytical Hierarchy Process and machine learning, applied in Togo.

October 15, 2021

15 minute read


A step-by-step case study on how Analytical Hierarchy Process (AHP) and machine learning models can support decision-makers in understanding and managing flood risk. These models help quantify the likelihood and severity of damage to buildings, crops, and communities — enabling faster, more informed, and more compassionate responses during crises.

This project was carried out in collaboration with the impact-driven startup Finz.

Introduction

Natural disasters are among the most pressing issues to be addressed at the global, regional, and local levels. Climate change may increase the frequency and magnitude of catastrophic events such as floods, droughts, and wildfires.

Togo, in West Africa, is highly vulnerable to such natural calamities. Flooding and drought are common occurrences in the country, with negative socioeconomic consequences for its inhabitants, the environment, and the economy. Floods have been especially devastating in recent years, wrecking infrastructure and destroying cultivated land.

While excessive rainfall is the primary cause of flooding, numerous other factors contribute to it, including deforestation, land degradation, rapid population growth, urbanization, poor land use planning, and inadequate drainage and discharge management.

Figure 2: Monthly prediction of temperature and precipitation [1]

Monitoring and predicting flood risk is critical in order to provide appropriate flood and environmental management solutions. Flood risk mapping is an important part of land use planning and mitigation techniques.

Analytical Hierarchy Process (AHP) 

The Analytical Hierarchy Process (AHP) model was developed to identify and map areas with high flood risk in Togo. AHP is a multi-criteria decision-making approach that combines several conditioning factors such as drainage density, soil type, slope, precipitation, population density, Euclidean distance to rivers, and land use.

By integrating these diverse datasets, the team generated both hazard maps and vulnerability maps, which together create a comprehensive picture of flood risk.

Hazard map

The hazard map evaluates how environmental factors contribute to the likelihood and intensity of flooding.

Key components include:

  • Drainage density – The total length of streams and channels within a basin divided by the basin's area (drainage density = total channel length / basin area). Higher density indicates a greater probability of water accumulation.
  • Precipitation (Isohyet) – A major driver of flooding, especially during intense rainfall periods.
  • Slope – Areas with steeper slopes experience faster surface runoff, increasing flood danger.
  • Soil type – Soil texture influences how quickly water is absorbed. Clay-heavy soils, for example, absorb water more slowly, increasing runoff and flood susceptibility.
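As a quick sketch, the drainage density formula above translates directly into code; the channel lengths and basin area below are hypothetical values, standing in for a digitized stream network and basin polygon from a GIS.

```python
# Drainage density: total channel length in a basin divided by basin area.
# The segment lengths and area below are illustrative, not real Togo data.

def drainage_density(channel_lengths_km, basin_area_km2):
    """Return drainage density in km of channel per square km of basin."""
    return sum(channel_lengths_km) / basin_area_km2

# Example: three stream segments totalling 42 km in a 120 km^2 basin.
dd = drainage_density([18.5, 12.0, 11.5], 120.0)
print(round(dd, 3))  # 0.35
```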

Vulnerability map

The vulnerability map reflects how exposed and sensitive communities are to flood impacts. It considers factors like:

  • Euclidean distance – Proximity to main river channels or flow paths increases flood exposure.
  • Land use / land cover (LULC) – Vegetation type, cultivation patterns, and land usage help determine environmental resilience.
  • Population density (PD) – Rapidly growing areas face increased risk due to informal or unplanned urbanization.

The Analytical Hierarchy Process uses hierarchical structures to represent a problem and then develops priorities for alternatives based on user judgement (Saaty, 1980). The process consists of the following steps:

  • Break down the problem into its component factors
  • Develop the hierarchy
  • Develop the paired comparison matrix based on subjective judgements
  • Calculate the relative weights of each criterion
  • Check consistency of subjective judgement

The overall AHP workflow consists of:

  • Data collection
  • Data preprocessing
  • AHP modeling

Figure 3: AHP Pipeline [2]

Data Collection 

The following datasets were used to construct the hazard and vulnerability maps:

  • Country boundary shapefile (DIVA GIS)
  • Digital Elevation Model (ALOS PULSAR, ALOS World 3D)
  • Land Use Land Cover (Copernicus Global Land Service)
  • Precipitation data (University of California CHRS)
  • Population density (Facebook Data for Good)
  • Soil maps (FAO)
  • River/stream network (Stanford University)

These datasets formed the foundation for understanding environmental behavior across Togo.

Figure 4: Data collected from various sources [4] [9] [7] [6] [5]

Data Pre-processing

Data preprocessing included generating layers using QGIS/ArcGIS.

  • Slope maps were derived from the DEM
  • Euclidean distance maps were created from the river network
  • Drainage density maps were calculated from the stream network
  • All maps were standardized and reclassified into comparable units

This step ensures all variables can be analyzed consistently in the AHP model.
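As an illustration of the slope derivation, the snippet below computes slope in degrees from a small elevation grid using central differences; in practice QGIS/ArcGIS perform this on the full DEM raster, and the 3×3 patch and 30 m cell size here are assumptions made for the example.

```python
import math

def slope_degrees(dem, row, col, cell_size):
    """Slope at an interior cell via central differences on the elevation grid."""
    dz_dx = (dem[row][col + 1] - dem[row][col - 1]) / (2 * cell_size)
    dz_dy = (dem[row + 1][col] - dem[row - 1][col]) / (2 * cell_size)
    return math.degrees(math.atan(math.hypot(dz_dx, dz_dy)))

# Illustrative 3x3 elevation patch (metres), 30 m cell size:
# elevation rises steadily from west to east.
dem = [
    [100.0, 105.0, 110.0],
    [100.0, 105.0, 110.0],
    [100.0, 105.0, 110.0],
]
print(round(slope_degrees(dem, 1, 1, 30.0), 2))  # 9.46
```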

Figure 5: Maps generated from collected data [2]

AHP modelling

Creating the Hierarchy

In AHP, there are different levels set up as a hierarchy:

  • Level 0: the main objective, which in our case is the flood risk map
  • Level 1: the criteria, namely the hazard map and the vulnerability map
  • Level 2: the elements (parameters) considered within each criterion, whose influence on the criteria we want to measure

Figure 6: AHP hierarchy [10]

The elements under each criterion were selected based on the literature.

Pairwise Comparison Matrix:

Generating pairwise comparison matrix and checking consistency ratio

For each criterion, a pairwise comparison matrix is created. The scores to be used in the matrix are based on the Saaty scale (Saaty 1980) as shown below:

Scale        Meaning
1            Equally important
3            Moderately important
5            Strongly important
7            Very strongly important
9            Extremely important
2, 4, 6, 8   Intermediate values between adjacent scales

For every pair in the hazard comparison matrix, the more important option is assigned a value between 1 (equally important) and 9 (extremely more important), while the other option is assigned the reciprocal of that value. For example, for the pair D (row) and ST (column) we assign a value of 3, while for the pair ST (row) and D (column) we assign 1/3. Applying this operation to each pair gives the matrix:

     D     ST    S     P
D    1     3     1/3   1/5
ST   1/3   1     1/3   1/5
S    3     3     1     1/3
P    5     5     3     1

Note that the values used are based on the literature.

Then, for each row, the eigenvector component Vp is determined as the geometric mean of the row using the formula below:

Vp = (W1 × W2 × … × Wk)^(1/k)

where Vp = eigenvector component, Wk = row elements, k = number of elements

We then get the following:

     D     ST    S     P     Vp
D    1     3     1/3   1/5   0.67
ST   1/3   1     1/3   1/5   0.39
S    3     3     1     1/3   1.32
P    5     5     3     1     2.94

We then calculate the weighting coefficients Cp using the equation below:

Cp = Vp / (Vp1 + … + Vpk)

The sum of the Cp values across all parameters must equal 1. We then get the following:

     D     ST    S     P     Vp    Cp
D    1     3     1/3   1/5   0.67  0.13
ST   1/3   1     1/3   1/5   0.39  0.07
S    3     3     1     1/3   1.32  0.25
P    5     5     3     1     2.94  0.55
Sum                          5.32  1
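The Vp and Cp calculations can be reproduced in a few lines of Python, using the same matrix values; rounding to two decimals matches the tabulated figures.

```python
import math

# Pairwise comparison matrix (rows/columns: D, ST, S, P), values as in the table.
M = [
    [1,   3, 1/3, 1/5],
    [1/3, 1, 1/3, 1/5],
    [3,   3, 1,   1/3],
    [5,   5, 3,   1],
]
k = len(M)

# Vp: geometric mean of each row.
Vp = [math.prod(row) ** (1 / k) for row in M]

# Cp: normalize the eigenvector so the weights sum to 1.
total = sum(Vp)
Cp = [v / total for v in Vp]

print([round(v, 2) for v in Vp])  # [0.67, 0.39, 1.32, 2.94]
print([round(c, 2) for c in Cp])  # [0.13, 0.07, 0.25, 0.55]
```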

Check Consistency:

Now that we have our weights, we need to check if the weights are correct. In other words, we need to check whether the scores we assigned to the pairwise comparison matrix based on our subjective judgement are acceptable.

We create a matrix, call it A3, by multiplying the pairwise comparison matrix (a 4×4 matrix) by the eigenvector column (a 4×1 matrix). We then create another matrix, A4, by dividing each value of A3 by the corresponding eigenvector component. For example, for row D, we divide 2.87/0.67. We get the following matrix:

D 4.29
ST 4.21
S 4.15
P 4.15

We then average the above values to get 4.1975. This value is known as the maximum eigenvalue (λmax).

We then calculate the consistency index (CI) using the formula:

CI = (λmax – k)/(k – 1), where k = number of parameters

CI = (4.1975-4)/(4-1) = 0.066

We then determine the consistency ratio (CR) by the formula:

CR = CI/RI, RI = random index

The random index value is taken from the following table (Saaty, 1980):

Number of parameters   1   2   3      4      5      6      7      8      9      10
RI                     0   0   0.58   0.90   1.12   1.24   1.32   1.41   1.45   1.49

CR = 0.066/0.9 = 0.073

If the value of the Consistency Ratio is less than or equal to 10%, the weights are acceptable. If the value is greater than 10%, we need to revise our subjective judgment.
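The whole consistency check can be verified programmatically. This sketch recomputes the weights for the same matrix, then λmax, CI, and CR, with RI = 0.9 for four parameters.

```python
import math

# Pairwise comparison matrix (rows/columns: D, ST, S, P).
M = [
    [1,   3, 1/3, 1/5],
    [1/3, 1, 1/3, 1/5],
    [3,   3, 1,   1/3],
    [5,   5, 3,   1],
]
k = len(M)

# Weights via row geometric means, normalized (as derived earlier).
Vp = [math.prod(row) ** (1 / k) for row in M]
w = [v / sum(Vp) for v in Vp]

# A3 = M @ w, then A4 = A3 / w element-wise; lambda_max is the mean of A4.
A3 = [sum(M[i][j] * w[j] for j in range(k)) for i in range(k)]
A4 = [A3[i] / w[i] for i in range(k)]
lam_max = sum(A4) / k

CI = (lam_max - k) / (k - 1)
CR = CI / 0.9  # RI = 0.9 for k = 4 (Saaty's table)

print(round(lam_max, 3), round(CI, 3), round(CR, 3))  # 4.197 0.066 0.073
```

Since CR ≈ 0.073 is below the 10% threshold, the subjective judgements are consistent.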

Generating vulnerability and hazard map

AHP Hazard Map: A hazard is a natural or man-made phenomenon that occurs with an intensity capable of causing harm, in this case due to a stream overflowing its banks.

The hazard map identifies all regions at risk of flooding. By combining the conditioning factors, it maps the spatial extent of locations vulnerable to the climatic threats that can induce floods; each factor is assigned a different weight to determine the hazard.

We can calculate the hazard map using the formula:

Hazard index = 0.13*D + 0.07*ST + 0.25*S + 0.55*P
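As a sketch of this weighted overlay, the hazard formula can be applied cell by cell to the reclassified layers; the 2×2 grids and class values (1-5) below are illustrative stand-ins for the real rasters.

```python
# Weighted overlay: hazard index = 0.13*D + 0.07*ST + 0.25*S + 0.55*P,
# applied per cell to reclassified (1-5) raster layers.
# The 2x2 grids below are illustrative stand-ins for full rasters.

WEIGHTS = {"D": 0.13, "ST": 0.07, "S": 0.25, "P": 0.55}

layers = {
    "D":  [[2, 3], [1, 4]],
    "ST": [[1, 2], [3, 3]],
    "S":  [[4, 2], [2, 5]],
    "P":  [[5, 5], [1, 3]],
}

rows, cols = 2, 2
hazard = [
    [sum(WEIGHTS[name] * layers[name][r][c] for name in WEIGHTS) for c in range(cols)]
    for r in range(rows)
]
print([[round(v, 2) for v in row] for row in hazard])  # [[4.08, 3.78], [1.39, 3.63]]
```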

Figure 7: AHP workflow for generating hazard map [2]

AHP vulnerability map

Vulnerability represents the extent of the expected repercussions of a natural phenomenon; exposure is its most important component, because it determines whether people or assets are in the path of a hazard.

Flood vulnerability mapping is the process of determining a given area’s flooding susceptibility and exposure.

Using the same process we used for generating the hazard map, we calculate the weights for the vulnerability map and apply the formula below to obtain it:

Vulnerability index = 0.26*PD + 0.64*LULC + 0.1*ED

Figure 8: AHP workflow for generating vulnerability map [2]

Flood risk map

The flood risk map is produced by combining the hazard and vulnerability maps:

Flood Risk = Hazard Index × Vulnerability Index

This final output gives decision-makers a clear view of where urgent action, planning, and protective measures are most needed.
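A minimal sketch of this combination step, with illustrative index grids and hypothetical classification thresholds (the real cut-offs depend on how the rasters are scaled):

```python
# Flood risk = hazard index * vulnerability index, cell by cell.
# Illustrative 2x2 index grids; real inputs are the AHP output rasters.
hazard        = [[4.1, 2.0], [1.2, 3.5]]
vulnerability = [[3.0, 1.5], [2.0, 4.0]]

risk = [
    [h * v for h, v in zip(h_row, v_row)]
    for h_row, v_row in zip(hazard, vulnerability)
]

def classify(score, low=4.0, high=9.0):
    """Bucket a risk score into the three map classes (thresholds illustrative)."""
    if score < low:
        return "low"
    if score < high:
        return "moderate"
    return "high"

print([[classify(s) for s in row] for row in risk])  # [['high', 'low'], ['low', 'high']]
```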

Figure 9: AHP workflow for generating flood risk map [2]

Figure 10: Hazard, vulnerability and flood risk map [2]

After creating the flood risk maps, training and testing datasets were generated using stratified random sampling to build the machine learning models. Class imbalance was corrected and outliers were removed. The AutoML library MLjar was used to find the best model; MLjar is a state-of-the-art automated machine learning library that creates an end-to-end machine learning pipeline.
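The stratified random sampling step can be sketched in pure Python (the model search itself requires the MLjar package, so only the sampling is shown; the class labels below are illustrative):

```python
import random
from collections import defaultdict

def stratified_split(labels, train_frac=0.8, seed=42):
    """Split sample indices so each class keeps the same train/test proportion."""
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    rng = random.Random(seed)
    train_idx, test_idx = [], []
    for indices in by_class.values():
        rng.shuffle(indices)  # randomize within each stratum
        cut = int(len(indices) * train_frac)
        train_idx.extend(indices[:cut])
        test_idx.extend(indices[cut:])
    return train_idx, test_idx

# Illustrative class labels: 0 = low, 1 = moderate, 2 = high risk.
labels = [0] * 60 + [1] * 25 + [2] * 15
train, test = stratified_split(labels)
print(len(train), len(test))  # 80 20
```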

Figure 11: Feature importance heatmap for the machine learning models [2]

The following machine learning models were built using MLjar library [11]:

  • Linear regression model
  • Decision tree
  • Random forest
  • XGBoost
  • Neural network
  • Ensemble model

Model validation was done using the ROC curve. A receiver operating characteristic (ROC) curve is a graph showing the performance of a classification model at all classification thresholds. It plots two parameters:

  • True Positive Rate
  • False Positive Rate
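As an illustration, the two rates can be computed at a single threshold as follows; the labels and scores are made up, and sweeping the threshold from 0 to 1 traces out the full ROC curve.

```python
def tpr_fpr(scores, labels, threshold):
    """True/false positive rates for binary labels at a given score threshold."""
    tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
    fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
    pos = sum(labels)
    neg = len(labels) - pos
    return tp / pos, fp / neg

# Illustrative flood (1) / no-flood (0) labels and model scores.
labels = [1, 1, 1, 0, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2, 0.6, 0.1]

tpr, fpr = tpr_fpr(scores, labels, 0.5)
print(tpr, fpr)  # 0.75 0.25
```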

Figure 12: ROC curves comparisons of Machine Learning Models [2]

Other flood-conditioning factors that might be considered include altitude, aspect, curvature, stream power index (SPI), topographic wetness index (TWI), sediment transport index (STI), topographic roughness index (TRI), geology, and surface runoff. Confusion matrices were also generated, showing that the ensemble and XGBoost models performed best.

Figure 13: Confusion matrices for the ensemble and XGBoost models [2]

 

Taking training time into account as well, the ensemble model performed best overall.

Comparison between AHP and Machine Learning models:

Figure 14: Flood risk map created using machine learning model [2]

Figure 15: Flood risk map created using AHP model [2]

As the maps above show, the machine learning model classified high flood risk areas more accurately than the AHP model. Although we had few samples for the moderate and high-risk categories, the model is still on par with the AHP technique; the resulting map is smoother and more accurate. Moreover, the machine learning model can predict risk using only half as many features as the AHP model.

Conclusion

While generating the hazard map, the team found that precipitation is the dominant factor. The hazard map shows how prone each region is to flooding: regions in red are more flood-prone because of high precipitation, while regions in green are far less prone owing to low precipitation.

While generating the vulnerability map, the land use/land cover layer was given the highest weight. When the hazard and vulnerability maps were combined into the flood risk map, the dominant factors were precipitation, land use/land cover, and population density.

The study shows that stringent action needs to be taken: proper land use planning and sound drainage and discharge management are necessary to mitigate flood risk.

These risk maps can be further improved by adding more relevant information, such as flow accumulation and lithology. There is a wide spectrum of research opportunities to which AHP modelling could be applied.

Alternatively, the AHP model could be built for a set of target countries, and a machine learning model could then produce risk scores for the surrounding areas without requiring a new AHP model. Applying a machine learning model directly to the AHP inputs reduces the computational steps needed to create a risk score, while expert knowledge can be used to set up a regional AHP model to refine the scoring in areas where the machine learning model does not estimate credible scores.

References

This article is written by Deepali Bidwai, Satyam Suman, Sam Joy.


FAQs

What is AHP, and why is it used for flood risk assessment?
AHP (Analytical Hierarchy Process) is a multi-criteria decision method used to weigh environmental and socio-economic factors to assess flood risk accurately.

How does machine learning improve flood risk assessment?
Machine learning processes large datasets, identifies complex patterns, and generates more accurate flood risk predictions compared to traditional methods.

Which factors contribute most to flooding?
Common factors include precipitation, slope, drainage density, soil type, and land use.

What data is needed for flood risk mapping?
DEM data, rainfall data, soil maps, population density, LULC maps, river networks, and satellite imagery are typically used.

How do AHP and machine learning models compare?
AHP provides expert-guided weights, while ML models offer better scalability, automation, and predictive accuracy.

Which machine learning models work best for flood prediction?
Models such as Random Forest, XGBoost, Neural Networks, and Ensemble models often deliver the highest accuracy.

Can machine learning outperform AHP?
Machine learning models can outperform AHP by detecting high-risk areas more precisely, especially with large and diverse datasets.

How are flood risk maps used in practice?
They guide urban planning, early warning systems, infrastructure development, and disaster preparedness strategies.