Building Inkrant: AI-Native Education from Zero to Production
March 15, 2026
Every product starts with a frustration. For me, it was watching schools drown in data they couldn't act on.
Student performance records, attendance patterns, behavioral flags — all sitting in spreadsheets and legacy systems, disconnected from the decisions that matter. Teachers were making promotion decisions based on gut feel and incomplete information.
So I started building Inkrant.
The core insight was that education AI doesn't need one massive model. It needs the right model for the right task. Small queries — attendance lookups, quick summaries — go to LLaMA 3.1 8B. It's fast and cheap. Complex reasoning — performance prediction, promotion decisions — routes to LLaMA 3.3 70B. Dynamic routing based on task complexity.
This dual-model architecture reduced our inference costs by 70% compared to routing everything through a large model. But cost wasn't the real win.
The real win was Memory-Augmented Generation. Unlike vanilla RAG that retrieves context per-query, our system builds persistent memory — learned patterns, time-series insights, contextual understanding that compounds over time. The AI doesn't just answer questions. It develops an understanding of each student.
We built 4 core AI services powering 15+ features: student performance prediction, attendance forecasting, promotion decision systems, and more. Each service uses role-based prompt pipelines with context injection and temperature tuning optimized for the specific task.
The infrastructure is provider-agnostic — we can switch between OpenAI, Together AI, and Gemini without code changes. This flexibility has been critical as the LLM landscape evolves.
Inkrant is now a platform I'm proud of. Not because it's perfect, but because it solves a real problem for real schools with real students.