Building ULog: A Deterministic Log Normalization & Classification Pipeline

Problem Statement
AI systems emit many kinds of telemetry: API calls, model interactions, agent/tool traces and computer-vision pipelines. These logs are often inconsistent in structure, naming and meaning. That inconsistency causes:
- missed critical signals (safety, model quality, tool failures),
- noisy or untrustworthy alerts,
- slow incident triage because engineers must manually reconcile raw logs, and
- difficulty answering cross-system questions and audits.
ULog addresses this by creating a common contract and reliable, explainable classification for every event.
Primary objectives
- Reliable detection: Ensure important incidents (safety events, tool errors, model drift) are discovered consistently across services.
- Actionable alerts: Reduce alert noise so on-call teams receive high-value, high-trust notifications.
- Faster triage: Provide normalized, contextual event records so engineers can diagnose and remediate faster.
- Explainability & auditability: Each classified event must include a clear rationale that can be inspected during investigations.
- Predictable onboarding: Make it straightforward for teams to connect new services and get useful classification within days.
- Cross-system visibility: Enable easy queries and dashboards that span multiple AI components without custom per-service logic.
Points to care about
- Determinism: Classification decisions must be reproducible and auditable.
- Versioning: Contracts evolve with clear versioning and migration windows.
- Data protection: Raw event data is preserved for forensics but access is controlled; personally identifiable information must be handled per policy.
- Low latency: Support real-time routing and alerting.
- Operational usability: Dashboards, runbooks and alerts must be clear and actionable.
- Fast adoption: Provide adapter templates and onboarding checklists so teams can integrate quickly.
Solution Concept
Define versioned JSON logging contracts for key event types (API calls, model interactions, agent steps, CV pipelines); implement a streaming ingestion and validation pipeline that normalizes incoming events to those contracts; apply a deterministic rule engine that classifies each event into categories, severity and outcomes; and wire the classified events into alerting, dashboards and audits. The emphasis is on predictable, explainable outputs rather than opaque ML-only decisions.
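To make the contract idea concrete, here is a sketch of what a contract-native event record might look like. The field names and layout are illustrative assumptions, not the final v1 contract:

```python
# Illustrative canonical event record. Every field name here is a
# hypothetical example, not the final v1 contract.
canonical_event = {
    "contract_version": "1.0",
    "event_type": "model_interaction",  # e.g. api_call | model_interaction | agent_step | cv_pipeline
    "event_id": "evt-8f2c",
    "timestamp": "2024-05-01T12:00:00Z",
    "source": {"service": "model-service", "environment": "prod"},
    "payload": {
        "model": "summarizer-v3",
        "latency_ms": 412,
        "input_tokens": 950,
        "output_tokens": 120,
    },
    # Pointer back to the encrypted raw blob, kept for forensics.
    "raw_ref": "s3://raw-events/2024/05/01/evt-8f2c.json.enc",
}
```

Keeping a `raw_ref` alongside the normalized fields is what lets investigations drill from a classified event back to the original, access-controlled data.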
8-week plan
Sprint 1 (Weeks 1–2) — Foundations
- Define canonical event fields and v1 JSON contracts.
- Establish governance policy for schema versioning.
- Provision dev messaging and secure storage.
Sprint 2 (Weeks 3–4) — Ingestion & Adapters
- Build ingestion API that accepts raw events and stores encrypted raw blobs.
- Add schema validation.
- Implement adapters for three representative sources (example: model service, agent runtime, CV pipeline).
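The adapter and validation steps above can be sketched as follows. All names (the source log fields, the required-field set, the adapter function) are hypothetical examples of the pattern, not a prescribed implementation:

```python
# Minimal sketch of adapter + schema validation. Field names are assumptions.
REQUIRED_FIELDS = {"contract_version", "event_type", "event_id",
                   "timestamp", "source", "payload"}

def validate(event: dict) -> list:
    """Return a list of validation errors; an empty list means valid."""
    missing = REQUIRED_FIELDS - event.keys()
    return ["missing field: " + f for f in sorted(missing)]

def adapt_model_service(raw: dict) -> dict:
    """Hypothetical adapter: map a model-service log line to the canonical contract."""
    return {
        "contract_version": "1.0",
        "event_type": "model_interaction",
        "event_id": raw["request_id"],
        "timestamp": raw["ts"],
        "source": {"service": "model-service", "environment": raw.get("env", "prod")},
        "payload": {
            "model": raw["model_name"],
            "latency_ms": raw["duration_ms"],
        },
    }

event = adapt_model_service({
    "request_id": "req-41",
    "ts": "2024-05-01T12:00:00Z",
    "model_name": "summarizer-v3",
    "duration_ms": 412,
})
assert validate(event) == []  # adapted event passes the basic field checks
```

One adapter per source keeps per-service quirks at the edge; everything downstream sees only contract-shaped events.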
Sprint 3 (Weeks 5–6) — Canonicalization & Deterministic Classification
- Normalize adapter outputs into canonical records and compute enrichment fields (latency, token rates, drift indicator).
- Implement a YAML-driven deterministic rule engine that emits category, sub-category, severity and a decision trace.
- Wire alerts for high-severity events to collaboration and on-call channels.
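A deterministic, ordered rule engine with a decision trace could look like the sketch below. The rules are inlined as Python dicts for brevity; in the pipeline they would be loaded from a versioned YAML file, and all rule names and match keys here are illustrative:

```python
# Ordered rules: first match wins, so classification is reproducible.
# In production these would come from a versioned YAML file.
RULES = [
    {"name": "tool_error",
     "when": {"event_type": "agent_step", "payload.status": "error"},
     "category": "tool_failure", "severity": "high"},
    {"name": "slow_model",
     "when": {"event_type": "model_interaction"},
     "min": {"payload.latency_ms": 1000},
     "category": "performance", "severity": "medium"},
]

def get_path(event, dotted):
    """Look up a dotted path like 'payload.status' in a nested dict."""
    cur = event
    for part in dotted.split("."):
        cur = cur.get(part) if isinstance(cur, dict) else None
    return cur

def classify(event, rules=RULES):
    """Return (category, severity, decision_trace) for one event."""
    trace = []
    for rule in rules:
        eq_ok = all(get_path(event, k) == v for k, v in rule.get("when", {}).items())
        min_ok = all((get_path(event, k) or 0) >= v for k, v in rule.get("min", {}).items())
        trace.append({"rule": rule["name"], "matched": eq_ok and min_ok})
        if eq_ok and min_ok:
            return rule["category"], rule["severity"], trace
    return "unclassified", "info", trace

category, severity, trace = classify(
    {"event_type": "model_interaction", "payload": {"latency_ms": 1500}})
# category == "performance", severity == "medium";
# trace records every rule evaluated and whether it matched.
```

Because the trace lists each rule in evaluation order with its outcome, the same trace that routes an alert also serves as the audit rationale.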
Sprint 4 (Weeks 7–8) — Storage, Dashboards, Backfill & Handover
- Persist normalized events in a queryable store; build dashboards for KPIs.
- Backfill recent historical logs to seed dashboards.
- Finalize governance UI, runbooks, onboarding checklist, and a production rollout plan.
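The queryable store and the cross-system queries it enables can be illustrated with SQLite; the production backend is not specified in this plan, and the table layout and category values below are assumptions:

```python
import sqlite3

# SQLite stand-in for the queryable store; schema and data are illustrative.
db = sqlite3.connect(":memory:")
db.execute("""CREATE TABLE events (
    event_id TEXT PRIMARY KEY,
    event_type TEXT, category TEXT, severity TEXT,
    model TEXT, ts TEXT)""")
db.executemany("INSERT INTO events VALUES (?, ?, ?, ?, ?, ?)", [
    ("evt-1", "model_interaction", "hallucination", "high",   "summarizer-v3", "2024-05-02"),
    ("evt-2", "model_interaction", "hallucination", "high",   "summarizer-v3", "2024-05-10"),
    ("evt-3", "model_interaction", "performance",   "medium", "ranker-v1",     "2024-05-11"),
])

# "Hallucinations by model this month" -- one query, no per-service logic.
counts = db.execute(
    "SELECT model, COUNT(*) FROM events "
    "WHERE category = 'hallucination' AND ts >= '2024-05-01' "
    "GROUP BY model").fetchall()
# counts == [("summarizer-v3", 2)]
```

Because every row is contract-shaped, the same query works regardless of which service or adapter produced the event.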
What success looks like
- When an incident occurs, responders receive a concise ticket with: normalized event, concrete category, the failing tool/service, severity, link to raw data, and a short decision trace — enabling effective action within minutes.
- Dashboards provide trend answers (e.g., “hallucinations by model this month”) without custom per-service work.
- Integrating a new telemetry source requires minimal effort using provided adapter templates.
FAQ (short)
Q: Is this going to replace current logging tools?
A: No. The goal is to normalize and classify events as they are produced so existing monitoring and alerting tools can be fed with consistent, actionable data.
Q: Must every service change its logging format?
A: No. Adapters will map existing formats to the canonical contract. Teams are welcome to emit contract-native logs if preferred.
Q: What if a classification is wrong?
A: Every classification includes a decision trace and the rule set is versioned. Rules will be tuned iteratively; incident reviews drive improvements.
First Omdena Project?
- Join the Omdena community to make a real-world impact and develop your career
- Build a global network and get mentoring support
- Earn money through paid gigs and access many more opportunities
Requirements
- Good English
- A very good grasp of computer science and/or mathematics
- (Senior) ML engineer, data engineer, or domain expert (no need for AI expertise)
- Understanding of Machine Learning and/or Data Analysis

