AI for Solar Energy Adoption in Sub-Saharan Africa
Discover AI-powered dashboards predicting solar energy use and ROI to boost solar adoption in Sub-Saharan Africa’s energy landscape.
December 4, 2025
12 minute read

Sub-Saharan Africa faces a critical energy deficit. More than 600 million people live without reliable electricity, and demand is rising by 11 percent annually—the fastest rate worldwide. Over the next two decades, power needs are expected to reach 390 TWh, with solar projected to supply roughly 70 percent of that growth. As photovoltaic costs continue to decline, solar energy stands out as the most viable, scalable, and sustainable path to electrify communities where extending the grid is neither practical nor affordable.
To help accelerate this transition, Omdena partnered with Zimbabwe-based NeedEnergy to apply artificial intelligence to solar planning and management. The collaboration resulted in two powerful dashboards: one that supports long-term decision-making by sizing PV systems and estimating financial returns, and another that provides short-term forecasts to help users balance demand and generation. By combining data science, physics-based models, and modern web technologies, the project demonstrates how intelligent tools can unlock cleaner, more reliable power for millions.
Bringing Solar Data and AI Together
Solar energy is a promising renewable resource for a climate‑friendly future that reduces reliance on finite fossil fuels. Yet converting sunshine into usable electricity requires careful planning. In markets like Harare, Zimbabwe, there is little public data on energy consumption, and prospective adopters need robust tools to evaluate whether going off grid makes sense. To overcome these barriers, the Omdena–NeedEnergy team built dashboards that combine meteorological information, consumption data, and open‑source libraries to help users size PV systems and plan around their expected savings.
The PV Sizing Tool: Long‑Term Planning
The first dashboard is a long‑term planning tool for Harare that helps prospective customers estimate how much solar energy their PV installation will produce over the next 20 years, the typical life span of solar panels. In addition to estimating generation, the tool calculates the number of panels needed to meet demand and compares the installation cost with projected savings to assess return on investment. This type of PV sizing application is common in countries with mature solar markets, but it is novel in Zimbabwe because of the limited availability of consumption data.
Data Collection and Wrangling
Building the PV sizing tool starts with assembling the right datasets. The model draws on three primary sources: energy consumption from NeedEnergy’s clients (accessed through their proprietary API); solar irradiance measurements for Harare from Solcast, averaged over 14 years (2007–2021) and saved as a Mean Meteorological Year; and panel and inverter specifications from the PVLIB Python library.
The irradiance record includes three key variables, Diffuse Horizontal Irradiance (DHI), Direct Normal Irradiance (DNI), and Global Horizontal Irradiance (GHI), which are essential for modeling PV performance. (For a primer on solar irradiance terminology, see the introductory materials from the National Renewable Energy Laboratory.) The team merged the historical irradiance record with the client consumption data using the Pandas library and set the date column as the index. After this preparation, the tool offers users two options: estimate potential savings, or estimate the size of a PV array for their home or business.
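As a rough illustration, the merge might look like the sketch below; the file names and the PeriodStart column label are assumptions, since the actual consumption data come from NeedEnergy’s API:
import pandas as pd

# Hypothetical inputs: a Solcast irradiance export and a client consumption record
irradiance = pd.read_csv('harare_irradiance.csv', parse_dates=['PeriodStart'])
consumption = pd.read_csv('client_consumption.csv', parse_dates=['PeriodStart'])

# Align the two records on their timestamps and index the result by date
data_frame = irradiance.merge(consumption, on='PeriodStart', how='inner')
data_frame = data_frame.set_index('PeriodStart').sort_index()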
The interface shown above allows users to choose a panel type, inverter type, price per installed watt, time horizon and number of panels before running the calculations. It is designed to be intuitive for prospective customers who may not have technical expertise.
Modeling Solar Energy Production
Once the input data have been prepared, the tool can model how much electricity a PV system will generate. It has two main purposes: calculating net savings and sizing the PV system. The first estimates the difference between avoided utility bills and the initial investment over the chosen time horizon, given the system cost, the energy consumption profile, and the assumed electricity price. The second computes how many panels of a given type are required to meet demand, based on the consumption history and the average irradiance profile.
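The sizing routine itself is not shown in the article; a minimal sketch of the underlying arithmetic, assuming hourly demand and a per‑panel generation estimate from the PVLIB simulation, might look like this (the function name and inputs are hypothetical):
import math

def estimate_panel_count(hourly_consumption, hourly_generation_per_panel):
    # Size the array so that average generation covers average demand;
    # this simplification ignores storage and intra-day mismatch.
    mean_demand = hourly_consumption.mean()
    mean_generation = hourly_generation_per_panel.mean()
    return math.ceil(mean_demand / mean_generation)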
The dashboard relies on PVLIB to simulate PV system performance. PVLIB is an open‑source package developed at Sandia National Laboratories that provides functions for modeling irradiance, temperature effects, and PV component behavior. Due to limited data availability in Harare, the team made several assumptions: energy demand is periodic, the growth rate of demand is treated as constant, and seasonality is not fully captured because much of NeedEnergy’s API data covers less than one year.
First, the installation parameters such as module type, inverter type and temperature model are retrieved from PVLIB:
import pvlib

# Retrieve the module and inverter databases bundled with PVLIB
sandia_modules = pvlib.pvsystem.retrieve_sam('SandiaMod')
sapm_inverters = pvlib.pvsystem.retrieve_sam('cecinverter')
module = sandia_modules[module_name]
inverter = sapm_inverters[inverter_name]
temperature_model_parameters = pvlib.temperature.TEMPERATURE_MODEL_PARAMETERS['sapm']['open_rack_glass_glass']
system = {'module': module,
          'inverter': inverter,
          'surface_azimuth': 180}
Next, the meteorological context is specified. The tool calculates the sun’s position and determines the air mass and atmospheric pressure at Harare’s latitude (−17.824858) and longitude (31.053028), at an altitude of 1,490 m:
altitude = 1490
latitude = -17.824858
longitude = 31.053028
times = data_frame.index

# Tilt the panels at the site latitude and compute solar position over time
system['surface_tilt'] = latitude
solpos = pvlib.solarposition.get_solarposition(times, latitude, longitude)
dni_extra = pvlib.irradiance.get_extra_radiation(times)
airmass = pvlib.atmosphere.get_relative_airmass(solpos['apparent_zenith'])
pressure = pvlib.atmosphere.alt2pres(altitude)
am_abs = pvlib.atmosphere.get_absolute_airmass(airmass, pressure)
Using these angles and irradiance values, the code computes the effective irradiance on the panel surface and converts it to direct current (DC) and alternating current (AC) power. The parameter number_modules corresponds to the number of installed panels:
import numpy as np

# Angle of incidence of sunlight on the tilted panel surface
aoi = pvlib.irradiance.aoi(system['surface_tilt'], system['surface_azimuth'],
                           solpos['apparent_zenith'], solpos['azimuth'])

# Plane-of-array irradiance from the DNI, GHI and DHI components
total_irrad = pvlib.irradiance.get_total_irradiance(
    system['surface_tilt'], system['surface_azimuth'],
    solpos['apparent_zenith'], solpos['azimuth'],
    data_frame['Dni'], data_frame['Ghi'], data_frame['Dhi'],
    dni_extra=dni_extra, model='haydavies')

# Cell temperature from ambient temperature and wind speed records
tcell = pvlib.temperature.sapm_cell(
    total_irrad['poa_global'], temp_air, wind_speed,
    **temperature_model_parameters)
effective_irradiance = pvlib.pvsystem.sapm_effective_irradiance(
    total_irrad['poa_direct'], total_irrad['poa_diffuse'],
    am_abs, aoi, module)

# DC output per module, then total AC power clipped at zero
dc = pvlib.pvsystem.sapm(effective_irradiance, tcell, module)
ac_power = np.maximum(number_modules *
                      pvlib.inverter.sandia(dc['v_mp'], dc['p_mp'], inverter), 0)
Calculating Net Estimated Savings
Finally, the tool calculates the net savings for a PV installation by comparing the cost of purchasing electricity with the cost of installing panels. It takes as input the merged dataframe of irradiance and consumption, the time horizon, price per watt, number of panels, chosen module and inverter types, and the retail price of electricity per kilowatt‑hour:
# Hourly power offset by the PV system, capped at actual consumption;
# estimated_generation_by_unit is the per-panel output from the PVLIB model
power_savings = np.minimum(data_frame['consumption'],
                           panels_count * estimated_generation_by_unit)
mean_hourly_power_savings_by_day = power_savings.groupby(
    power_savings.index.date).mean()
expected_hourly_power_savings = mean_hourly_power_savings_by_day.mean()
potential_energy_savings = expected_hourly_power_savings * time_horizon * 365 * 24

# Up-front cost versus avoided utility bills over the horizon
# (the /1000 converts watt-hours to kilowatt-hours before applying the tariff)
watts_per_panel = int(panel_type.split('_')[3][0:3])
initial_investment = panels_count * watts_per_panel * price_per_watt
total_invoice_reduction = potential_energy_savings * (price_per_kwh / 1000)
potential_savings = total_invoice_reduction - initial_investment
When users select a time horizon and enter their system details, the dashboard returns interactive plots. The red curve represents the estimated solar generation, while the blue curve represents historical energy consumption. Because the plots are built with the Plotly library, users can zoom in to examine specific days or zoom out to view longer trends.
Another chart depicts the daily variation in energy demand and predicted production. Interactive charts help prospective clients understand whether the chosen panel configuration will meet their needs throughout the year.
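For illustration, a comparison chart of this kind could be assembled with Plotly roughly as follows (the consumption column name and the styling are assumptions):
import plotly.graph_objects as go

fig = go.Figure()
fig.add_trace(go.Scatter(x=data_frame.index, y=data_frame['consumption'],
                         name='Historical consumption', line=dict(color='blue')))
fig.add_trace(go.Scatter(x=data_frame.index, y=ac_power,
                         name='Estimated solar generation', line=dict(color='red')))
fig.update_layout(xaxis_title='Time', yaxis_title='Power (W)')
fig.show()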
The Energy Alert Tool: Short‑Term Forecasting
While the PV sizing tool supports long‑term planning, the second dashboard serves existing solar customers by predicting energy demand and generation over the next 36 hours. This short‑term dashboard provides alerts when the forecasted solar output is likely to fall short of demand so that users can adjust their consumption or switch to backup sources. It was built specifically for Harare, where consumption data were available and where the Solcast API can provide reliable seven‑day irradiance forecasts.
The short‑term dashboard relies on three ingredients:
- historical consumption data from NeedEnergy’s API;
- a seven‑day solar irradiance forecast for Harare from the Solcast API, used to generate hourly predictions;
- panel and inverter specifications (including tilt angle, azimuth angle, and panel count) from the PVLIB Python library.
The dashboard interface allows users to specify these parameters. It then forecasts demand and generation and displays whether the solar array will meet demand over the next week.
Energy Demand Forecasting with LightGBM
Predicting future consumption requires sophisticated modeling. The team chose LightGBM, a fast, memory‑efficient gradient boosting library that performs well on large datasets (see the LightGBM documentation for more information). Because each client has unique consumption patterns, the model trains a separate forecaster for each user and for each of the 36 forecasting hours, as sketched below. This approach avoids the complexity of fitting a single model across all clients and allows the system to capture individual behavior more accurately.
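A simplified sketch of that training loop for a single client, assuming a prepared feature matrix X of lagged consumption and calendar features and one target series per horizon, might look like this (the hyperparameters are illustrative):
import lightgbm as lgb

def train_horizon_models(X, y_by_horizon, horizons=36):
    # Fit one LightGBM regressor per forecasting hour for one client;
    # y_by_horizon[h] holds consumption h hours ahead of each row in X.
    models = {}
    for h in range(1, horizons + 1):
        model = lgb.LGBMRegressor(n_estimators=200, learning_rate=0.05)
        model.fit(X, y_by_horizon[h])
        models[h] = model
    return models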
To reduce overfitting and simplify the model, the team employed a pruning strategy based on SHAP values, which measure how much each input feature contributes to the model’s output. The initial training set includes the previous 72 hours of consumption, the weekday, and the time of day. Two pruning steps are then applied. First, after training an initial LightGBM model, the input features are ranked by their SHAP values and only the 20 most important variables are retained. A helper function sorts features by importance:
def keep_importants(cols, importances, size=20):
    # Rank features by descending importance and keep the top `size`
    important_index = np.argsort(importances)[::-1][:size]
    important_features = cols[important_index]
    return important_features
Second, the remaining variables are re‑evaluated by normalizing their SHAP values relative to the largest contribution. Features contributing less than 5 percent of the maximum importance are removed:
def keep_by_percentage(cols, importances, percentage=0.05):
    # Normalize by the largest contribution and drop features below the cutoff
    largest_importance = np.max(importances)
    normalized_importance = importances / largest_importance
    mask = normalized_importance > percentage
    important_features = cols[mask]
    return important_features
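How the importances feeding these helpers are computed is not shown in the article; a plausible sketch using the shap library’s TreeExplainer, taking the mean absolute SHAP value per feature, follows (the variable names are assumptions):
import numpy as np
import shap

# model: a trained LightGBM regressor; X_train: its training feature matrix
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X_train)      # shape: (n_samples, n_features)
importances = np.abs(shap_values).mean(axis=0)    # mean |SHAP| per feature

# Step 1: keep the 20 highest-ranked features (and their importances)
top_features = keep_importants(X_train.columns, importances, size=20)
top_importances = np.sort(importances)[::-1][:20]

# Step 2: drop features below 5 percent of the largest contribution
final_features = keep_by_percentage(top_features, top_importances, percentage=0.05)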
The pruning process is akin to recursive feature elimination (RFE), with the added benefit that SHAP values aid interpretability. One caveat is that SHAP values may distribute the influence of a single correlated effect across multiple variables, making important effects appear less significant, so care must be taken when interpreting SHAP‑based pruning.
The table below summarizes which features remain at each stage of the pruning process.
| Pruning stage | Features kept |
|---|---|
| Initial set | previous 72‑h consumption; weekday name; time of day |
| After step 1 | twenty most important features by SHAP value |
| After step 2 | features with ≥5 % of normalized SHAP importance |
Solar Energy Production Forecast with PVLIB
Once demand has been forecasted, the next challenge is to estimate how much electricity the PV system will generate over the same horizon. The tool queries the Solcast API through the solcast Python library to retrieve irradiance forecasts at 30‑minute resolution:
import solcast
import pandas as pd

latitude = -17.824858
longitude = 31.053028
API_KEY = 'your-api-key'  # place your Solcast API key here

data = solcast.get_radiation_forecasts(latitude, longitude, API_KEY)
seven_day_forecast = data.forecasts
data_df = pd.DataFrame(seven_day_forecast)
The team then repeats the PVLIB methodology described in the PV sizing section to estimate AC power generation. The time horizon for modelling solar production matches the 36‑hour demand forecast. After calculating both demand and generation, the tool compares the two series and issues alerts when predicted production falls below projected consumption by a predefined percentage.
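The exact alert rule is not spelled out; a minimal sketch of the comparison, assuming pandas Series inputs and a configurable shortfall threshold, could look like this:
def shortfall_alerts(forecast_demand, forecast_generation, threshold=0.10):
    # Flag hours where predicted generation undershoots predicted demand
    # by more than the given fraction (hypothetical threshold parameter)
    deficit = (forecast_demand - forecast_generation) / forecast_demand
    return forecast_demand.index[deficit > threshold]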
In the snapshot above, the historical energy demand is shown in blue, the forecasted demand for the next 36 hours appears in purple, and the predicted solar production (given a chosen number of panels) is shown in red. Alerts notify users when the PV system will not meet demand so that they can plan accordingly.
Building and Deploying the Dashboards
The user interface for both dashboards is built with Streamlit, a Python framework for building data apps, and the applications are deployed on Heroku. Readers interested in deploying similar tools can consult the Streamlit tutorial “Deploying an AutoML Model Using Streamlit” and Navid Mashinchi’s “A quick tutorial on how to deploy your Streamlit app to Heroku”, which explain these steps in detail.
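For orientation, a stripped‑down Streamlit front end collecting the PV‑sizing inputs might look like the sketch below; the widget labels and defaults are illustrative, not the project’s actual interface:
import streamlit as st

st.title('PV Sizing Tool - Harare')

# Sidebar widgets mirror the inputs described earlier in the article
panel_type = st.sidebar.selectbox('Panel type', list(sandia_modules.columns))
inverter_type = st.sidebar.selectbox('Inverter type', list(sapm_inverters.columns))
price_per_watt = st.sidebar.number_input('Price per installed watt (USD)', value=1.0)
time_horizon = st.sidebar.slider('Time horizon (years)', 1, 20, 20)
panels_count = st.sidebar.number_input('Number of panels', value=10, step=1)

if st.sidebar.button('Run calculation'):
    st.plotly_chart(fig)  # fig: the Plotly comparison figure built earlier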
Demo Dashboards
The figure below shows a demo of the dashboards in action. Users can explore how different panel configurations affect long‑term savings and monitor short‑term demand and generation forecasts in near real time.

Demo Dashboard
Conclusion
Machine learning and physics‑based modeling can be powerful allies in the drive toward renewable energy adoption. By combining consumption data, irradiance records, and open‑source libraries such as PVLIB and LightGBM, the Omdena–NeedEnergy project demonstrates that it is possible to build practical tools for both long‑term planning and short‑term operational awareness. These dashboards are especially valuable in regions like Sub‑Saharan Africa, where electrification needs are urgent and reliable data are scarce. Interested readers can explore the dashboards using the link provided by Omdena and NeedEnergy.
You might also like
- Rooftops Classification and Solar Installation Acceleration using Deep Learning
- Increasing Solar Adoption in the Developing World through Machine Learning and Image Segmentation
- Tackling Energy Poverty in Nigeria Through Artificial Intelligence
If you want to expand solar adoption with confidence, reduce planning uncertainty, and improve system reliability, Omdena can co-build AI tools that convert complex data into clear electrification decisions.