
Extraction and Construction of International Wastewater Database Leveraging AI & NLP

Application Deadline: Nov 20th



This is a paid opportunity. In order to be eligible to apply for this project, you need to be part of the Omdena community and have finished at least one AI Innovation Challenge.

You can find our upcoming AI Innovation Challenges at https://omdena.com/projects.

The problem

Globally, wastewater management is a critical environmental and public health concern. The discharge of untreated or inadequately treated industrial wastewater into water bodies can lead to significant ecological damage and health risks. To mitigate these risks, regulatory bodies issue permits to industrial facilities that set limits on pollutants and discharge volumes. However, these permits are often dispersed across various public databases and published as non-standardized PDFs, making it challenging for wastewater solution providers to access and analyze compliance data. This barrier makes it harder for solution providers to identify non-compliant facilities and offer the necessary interventions, which in turn impedes efforts to promote sustainable wastewater management practices.

The lack of a structured, easily accessible database of wastewater permits creates a missed opportunity for solution providers to contribute to environmental protection by helping facilities achieve compliance through innovative wastewater treatment solutions.

Klarifi has a vision to become the leading hub for industrial pollution data, incorporating data from environmental regulators across the world. Klarifi has started incorporating data from the USA, and is now expanding coverage to Europe.

This Omdena-Klarifi project aims to harness the power of Artificial Intelligence (AI) and Natural Language Processing (NLP) to build a comprehensive, structured international wastewater database. The envisioned database will serve as a pivotal tool for wastewater solution providers and other stakeholders in the wastewater ecosystem. By simplifying access to compliance data, the project will promote compliance with environmental regulations and catalyze the adoption of advanced wastewater treatment technologies. Ultimately, this initiative will contribute to the global effort to safeguard water resources, protect ecosystems, and ensure public health and safety.

The project goals

The main goal of this project is to develop a structured, comprehensive database that aggregates and organizes information from publicly available wastewater permits for industrial facilities in Europe. This database will be designed to be easily accessible for wastewater solution providers, enabling them to identify and target potential clients who are in need of wastewater treatment solutions based on their compliance with permit regulations and other factors.

The ultimate objective is to support the improvement of wastewater management practices globally, thereby contributing to environmental sustainability and public health protection.

Project Scope:

  • Understanding of context and setting: To help the team get started, the Klarifi team will provide the following during onboarding:
    • Introduction to the subject matter: Wastewater permits
    • Introduction to the Klarifi data platform
    • A mapping of the relevant data sources for the target countries
    • A list of indicators and other information that would be relevant to extract (such as facility information, nitrogen discharge limits, facility capacity, etc.)
    • Any other support as needed
  • Targeted Data Extraction: Utilize facility identifiers to systematically extract pertinent data from each identified source, ensuring a thorough representation of the wastewater landscape.
  • Intelligent Document Analysis: Implement AI-driven algorithms to automate the reading of complex documents, determining which key variables can be reliably extracted.
  • Rigorous Quality Assurance: Conduct quality testing of the content extracted through automated processes, ensuring accuracy and reliability (it’s preferable to have less information that is more reliable).
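As a rough illustration of the extraction and quality assurance steps above, the following minimal sketch pulls a total nitrogen discharge limit out of permit text and discards the result whenever the document is ambiguous, in line with the preference for less but more reliable information. It is only a sketch under stated assumptions, not the project's actual method: the regular expression stands in for whatever AI-driven document analysis the team ends up using, and the names `NITROGEN_LIMIT`, `ExtractedLimit`, and `extract_nitrogen_limit`, the facility identifier format, and the sample wording are all hypothetical.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Hypothetical pattern: matches phrases such as "Total nitrogen: 10 mg/l"
# or "total nitrogen limit of 15,5 mg/L". Real permits vary widely in
# wording and language, so this is illustrative only.
NITROGEN_LIMIT = re.compile(
    r"total\s+nitrogen\b[^\d\n]{0,40}?(\d+(?:[.,]\d+)?)\s*mg\s*/\s*l",
    re.IGNORECASE,
)


@dataclass
class ExtractedLimit:
    facility_id: str       # identifier used to link the permit to a facility
    value_mg_per_l: float  # extracted limit in mg/l
    source_snippet: str    # matched text, kept for manual review


def extract_nitrogen_limit(facility_id: str, permit_text: str) -> Optional[ExtractedLimit]:
    """Return the nitrogen limit only if the permit states exactly one value."""
    matches = list(NITROGEN_LIMIT.finditer(permit_text))
    # Normalize decimal commas, which are common in European documents.
    values = {float(m.group(1).replace(",", ".")) for m in matches}
    # Quality gate: no match or conflicting values means "not reliable",
    # so nothing is returned rather than a guess.
    if len(values) != 1:
        return None
    return ExtractedLimit(facility_id, values.pop(), matches[0].group(0))
```

Keeping the matched snippet alongside the value lets a reviewer check each extracted figure against the source document during quality testing.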

More details will be shared with the designated team.

Why join? The uniqueness of Omdena Top Talent Projects

Top Talent opportunities come as a natural next step after participating in Omdena’s AI Innovation Challenges.

Everyone in the community is eligible to participate once they have shown the relevant skills based on the merit of involvement in past Omdena challenges and the community.

If you are looking for the next challenge after participating in one or more Omdena AI Innovation Challenges, then we believe you have made the right choice! In a healthy, high-pressure environment, you will be pushed to contribute, learn, and grow even more.

Find more information on how an Omdena Top Talent Program works

First Omdena Project?

Join the Omdena community to make a real-world impact and develop your career

Build a global network and get mentoring support

Earn money through paid gigs and access many more opportunities



Eligibility to join an Omdena Top Talent project

Finished at least one AI Innovation Challenge

A recommendation from an Omdena Core Team Member or Project Owner (PO) is a plus



Skill requirements

Good English

Machine Learning Engineer

Experience working with Machine Learning and/or NLP is a plus.




