AI Has Outgrown Scrum. We Built What Came Next.
AI teams outgrew notebooks and generic workflows. Learn why AI delivery breaks in production and how Umaku was built to fix it.

Over the last few years, AI teams crossed a quiet threshold. We're no longer just experimenting. We are shipping systems that have to run in production, integrate with real products, and serve real users.
But the workflows AI teams rely on were never designed for this reality. This is the story of how we overcame that wall, and why we built Umaku.
Background: Journey of Omdena

Journey from Omdena to Umaku
Phase 1: A collaborative platform for AI model development
Omdena started in 2019 as a collaborative platform to build AI models for social good. Hundreds of contributors explored models and ideas together, collaborating with leading organizations such as WFP, UNHCR, UNICEF, and Save the Children.
That approach worked extremely well, for research.
But as expectations shifted, organizations didn't just want AI models for research. They wanted demonstrable MVPs: systems with real architecture, frontends, and backends.
Phase 2: Building MVPs was the easy part
To deliver MVPs, we formed smaller teams on top of the collaborative platform, drawn from the top 1-2% of Omdena contributors: people who combined data science depth with engineering judgment.
This approach worked well for the next few years, until we had to deliver production-ready, scalable AI products.
Phase 3: Real-World AI Product Development and Deployment

AI Product Development Workflow
We were managing federated, remote AI teams spread across countries, time zones, and skill levels (data scientists, ML engineers, product thinkers, domain experts) at scale. Our clients were asking us to:
- package experiments into services,
- maintain pipelines,
- collaborate across multiple domain teams,
- and ship work that had to survive beyond a notebook.
This transition happened fast for us, often without new tools or processes to support it. We learned firsthand that AI delivery introduces new failure modes that traditional software and data science workflows were never designed to handle:
- Leaky data splits when experiments become shared artifacts (sketched below)
- Inconsistent feature engineering across notebooks as work scales
- Silent shape mismatches that surface only downstream
- Hard-coded paths that break in collaborative or remote runs
- Notebook cells that depend on hidden execution order
These weren't beginner mistakes. These were symptoms of a workflow that hadn't caught up with what AI teams were now expected to deliver.
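To make the first failure mode concrete, here is a minimal sketch (on synthetic data, with illustrative names) of how a leaky split typically sneaks into a shared notebook:

```python
# A minimal sketch of the "leaky data split" failure mode (synthetic data).
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(42)
X = rng.normal(size=(1000, 5))
y = (X[:, 0] > 0).astype(int)

# Leaky: the scaler is fit on ALL rows, so test-set statistics
# contaminate the training features.
X_leaky = StandardScaler().fit_transform(X)
X_train, X_test, y_train, y_test = train_test_split(X_leaky, y, random_state=0)

# Safe: split first, then fit the scaler on training data only.
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
scaler = StandardScaler().fit(X_train)
X_train, X_test = scaler.transform(X_train), scaler.transform(X_test)
```

Both versions run without errors, which is exactly why the leak stays silent until results fail to generalize.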
AI Has Moved On. Software Development Workflows Haven't.

Software Development vs AI Development
We're no longer experimenting in isolation. We're delivering:
- Products and models that need to be deployed and served as microservices
- Pipelines that need to be maintained
- Notebooks that need to evolve into production systems
- Decisions that affect real users, not just benchmarks
But most of our tools still assumed that:
- Code lives in neat scripts
- Reviews are generic
- Tasks and code exist in separate universes
- "AI copilot feedback" can be context-free
That mismatch is where everything starts to break.
The Challenges
1. The Copilot Problem: Feedback Without Context Is Just Noise

Representation of Copilot Reviewing Code
We adopted AI copilots and automated code reviewers, but what we got instead was:
- Noise: Style suggestions on research notebooks.
- Irrelevance: Deployment advice on pre-training experiments.
- Blindness: Feedback that looked smart but didn’t understand the goal.
The Insight: An AI reviewing a PR without knowing the project maturity is like reviewing a book by reading one random paragraph. Technically impressive. Practically useless.
2. Kanban Boards and Code Live in Parallel Universes

Kanban Boards and Code Live in Parallel Universes
Another daily frustration: project management tools don't speak code. Kanban boards know which task is "in progress", who is assigned, and when something is "done", but they don't know:
- Which notebook implements the task
- Which PR actually advances it
- Whether the code aligns with the task's intent
Meanwhile, GitHub knows the code, but has no idea why it exists.
So teams spend time translating:
- task → code
- code → task
- decision → documentation
That translation tax compounds fast in federated teams.
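To give a feel for that tax: teams typically fall back on naming conventions (ticket IDs embedded in commit messages) and then write throwaway scripts to reconstruct the link after the fact. A minimal sketch, assuming a hypothetical "ABC-123"-style ticket prefix:

```python
# A hypothetical sketch of paying the "translation tax": recovering which
# commit advances which ticket from an "ABC-123: message" naming convention.
import re
import subprocess

TICKET_RE = re.compile(r"\b([A-Z]{2,}-\d+)\b")

# One line per commit: "<sha> <subject>"
log = subprocess.run(
    ["git", "log", "--pretty=%H %s"],
    capture_output=True, text=True, check=True,
).stdout

for line in log.splitlines():
    sha, _, subject = line.partition(" ")
    tickets = TICKET_RE.findall(subject)
    print(sha[:8], ", ".join(tickets) if tickets else "no ticket link")
```

The script itself is trivial; the point is that this mapping only exists because someone rebuilt it by hand.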
3. Scrum and Kanban Weren't Built for AI
Scrum and Kanban weren't the enemy. They just weren't built for AI work. Scrum assumes "known-knowns," while AI work is mostly "known-unknowns." You can't "Sprint" toward model accuracy; you can only "Explore."
As AI engineers, we found that most Scrum ceremonies added friction, while only a few elements actually helped:
- clear goals
- visible progress
- shared understanding of what "done" means
4. Jupyter Notebooks: Powerful, Fragile, and Misunderstood

Fragile Jupyter Notebooks
And then there's Jupyter: the backbone of AI work, and one of the hardest artifacts to reason about automatically. We repeatedly ran into:
- inaccurate parsing of notebooks
- missed dependencies between cells
- broken assumptions about execution order
- tools treating notebooks like scripts (they're not)
Most systems either oversimplify notebooks or avoid them altogether. In our experience, the gap between 'The Ticket' and 'The Notebook' is where AI projects go to die.
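One cheap guardrail against the execution-order problem is to check whether a notebook's saved cell order even matches the order the kernel actually ran it in. A minimal sketch over the standard .ipynb JSON layout ("analysis.ipynb" is a placeholder filename):

```python
# A minimal sketch: flag notebooks whose cells were executed out of order,
# since replaying them top to bottom may not reproduce the saved results.
# "analysis.ipynb" is a placeholder filename.
import json

with open("analysis.ipynb") as f:
    nb = json.load(f)

counts = [c.get("execution_count")
          for c in nb["cells"] if c["cell_type"] == "code"]
executed = [c for c in counts if c is not None]

if executed != sorted(executed):
    print("Warning: cells were executed out of order.")
if None in counts:
    print("Warning: some code cells were never run.")
```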
So, What Did We Build for AI Development?

Project Overview in Umaku
Our goal was to keep:
- Focus over velocity: sprint goals that embrace uncertainty instead of fake precision
- Intent over status: tickets that explain why work exists, not just its state
- Progress over ceremony: visibility through real artifacts (code, notebooks, data), not meetings
- Alignment over micromanagement: context that keeps teams moving together without constant syncs
And we dropped the rest. No performative planning. No story-point theater. No boards that look busy but explain nothing. We didn't recreate Scrum boards or Kanban flows; we extracted what genuinely helped us as AI engineers. What we needed was something that understands the project charter, the sprint goal, the intent behind each ticket, and how code and notebooks change against them.
Most Importantly, We Needed Context-Aware, Agentic Feedback
Context-aware agentic feedback means the agent does not operate on artifacts in isolation. It reasons over the project charter, the explicit business objectives, and the current execution phase: model exploration, validation, hardening, or packaging for production. It ingests tickets, ticket comments, design decisions, and historical discussion to reconstruct why the work exists and what constraints shaped it.
Code Quality Feedback in Umaku

Code Snippet Comparison in Umaku

Bug Finder Feedback in Umaku

Overall Agentic Feedback Dashboard in Umaku
This changes the nature of feedback entirely. Instead of flagging patterns blindly, the agent evaluates decisions relative to project goals and the delivery stage. A modeling shortcut during early experimentation is treated differently from the same shortcut during packaging. A hard-coded path is understood as a prototype artifact, or identified as a release-blocking risk, based on context, not heuristics.
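To illustrate the shape of the idea (the names below are ours, for exposition; this is not Umaku's actual API), the difference comes down to what the reviewing agent receives alongside the diff:

```python
# A hypothetical sketch of context-aware review, for illustration only.
# None of these names are Umaku's real API; they show the shape of the idea.
from dataclasses import dataclass, field

@dataclass
class ReviewContext:
    charter: str        # project-level objective
    phase: str          # "exploration", "validation", "hardening", "packaging"
    ticket_intent: str  # why this change exists
    discussion: list[str] = field(default_factory=list)  # prior decisions

def review(diff: str, ctx: ReviewContext) -> str:
    # A context-free reviewer sees only `diff`; a context-aware one can
    # judge the same pattern differently depending on the delivery phase.
    if "hard-coded path" in diff:
        if ctx.phase == "packaging":
            return "Blocker: hard-coded path is a release risk at this stage."
        return f"Note: acceptable prototype shortcut during {ctx.phase}."
    return "No phase-specific risks found."

ctx = ReviewContext(
    charter="Ship a crop-yield prediction service",
    phase="packaging",
    ticket_intent="Containerize the inference pipeline",
)
print(review("adds a hard-coded path to the training data", ctx))
```

The string matching is a stand-in for real analysis; the point is that the verdict is a function of phase and intent, not of the diff alone.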
Most teams today assemble this workflow from disconnected tools: sprint boards to track tasks, ticketing systems to capture intent, bug trackers to log failures, and AI copilots that analyze code without access to any of that context. Each handoff strips meaning away. By the time feedback is generated, the agent knows what changed but not why.
Context is the primary object, not an afterthought. The agent is charter-aware, ticket-aware, and discussion-aware. It understands how decisions evolve over time and how expectations shift as a project moves from research-grade notebooks to production-ready systems.
Umaku: Built From the Inside, Not the Whiteboard
We named our platform Umaku. It comes from the Japanese word umaku (うまく / 上手く / 巧く / 旨く), meaning "skillfully" or "done well". Not generic. Not stylistic. But grounded in the intent of the system being built.
Umaku exists because we lived the pain of:
- managing distributed AI teams
- shipping under ambiguity
- reviewing messy notebooks at scale
- aligning product intent with technical reality
Umaku sits between your task board and your Jupyter notebooks, ensuring the AI agent reviewing your PR knows whether you're in 'research mode' or 'production mode'.
In short, Umaku is a single platform where:
- sprints are designed for exploratory, evolving AI work,
- tickets preserve intent across notebooks, models, and code,
- bugs capture assumptions, data issues, and silent failures, not just errors,
- and the project charter remains visible as work evolves.
We didn't design it in theory. We built it to survive reality.
And now, we're opening it up, because we know we're not the only ones who felt this gap, and because we're moving beyond the era of the 'Experimental Notebook' into the era of the 'AI Product.'
It's time our workflows caught up. Welcome to Umaku.
Please sign up for a trial account at umaku.ai.

