josh.
All Posts
AI Systems10 min read

Building Inkrant: AI-Native Education from Zero to Production

March 15, 2026

Every product starts with a frustration. For me, it was watching schools drown in data they couldn't act on.

Student performance records, attendance patterns, behavioral flags — all sitting in spreadsheets and legacy systems, disconnected from the decisions that matter. Teachers were making promotion decisions based on gut feel and incomplete information.

So I started building Inkrant.

The core insight was that education AI doesn't need one massive model. It needs the right model for the right task. Small queries — attendance lookups, quick summaries — go to LLaMA 3.1 8B. It's fast and cheap. Complex reasoning — performance prediction, promotion decisions — routes to LLaMA 3.3 70B. Dynamic routing based on task complexity.

This dual-model architecture reduced our inference costs by 70% compared to routing everything through a large model. But cost wasn't the real win.

The real win was Memory-Augmented Generation. Unlike vanilla RAG that retrieves context per-query, our system builds persistent memory — learned patterns, time-series insights, contextual understanding that compounds over time. The AI doesn't just answer questions. It develops an understanding of each student.

We built 4 core AI services powering 15+ features: student performance prediction, attendance forecasting, promotion decision systems, and more. Each service uses role-based prompt pipelines with context injection and temperature tuning optimized for the specific task.

The infrastructure is provider-agnostic — we can switch between OpenAI, Together AI, and Gemini without code changes. This flexibility has been critical as the LLM landscape evolves.

Inkrant is now a platform I'm proud of. Not because it's perfect, but because it solves a real problem for real schools with real students.