India's most ambitious AI project is not being built by IIT, DRDO, or a government ministry. It is being built by Krutrim, the AI lab spun out of Ola in 2023, and its second-generation model represents a genuine step toward something that has never existed: a foundation model with first-class support for the full breadth of India's linguistic landscape.
The linguistic challenge
India has 22 officially recognised languages, by commonly cited estimates some 1,600 dialects, and a range of writing systems that includes the Devanagari, Tamil, Telugu, Kannada, Malayalam, Bengali, and Gurmukhi scripts. Building an AI model that handles all of these with genuine competence, rather than the superficial token-level coverage that general-purpose models provide, requires training data and evaluation infrastructure that did not exist when Krutrim started.
Krutrim has spent 30 months building that infrastructure. The company has assembled what it claims is the largest corpus of Indian language text ever compiled for AI training — 5 trillion tokens across 22 languages, including previously undigitised sources from regional newspapers, government records, and literature.
What Krutrim 2 can actually do
On the company's internal benchmarks, Krutrim 2 outperforms GPT-4o on tasks conducted in Hindi, Tamil, Telugu, Kannada, Malayalam, Bengali, and Punjabi. An independent evaluation by researchers at IIT Madras partially corroborates these claims, but only for Tamil and Telugu, where Krutrim 2 shows particular strength in cultural context and idiomatic usage.
The model is multimodal from launch, handling voice input in Indian languages — a critical feature in a country where smartphone penetration far exceeds keyboard literacy.
The commercial strategy
Krutrim is pursuing two parallel tracks. The first is the consumer Krutrim Assistant, which targets Ola's user base of more than 100 million with voice-first AI in regional languages. The second is an enterprise API aimed at banks, insurers, and government agencies that need AI capable of communicating with customers in their native languages.
Why it matters beyond India
Krutrim's approach, which builds around linguistic diversity from the ground up rather than adding it as an afterthought, is a template being watched closely in Indonesia, Nigeria, and Ethiopia. If Krutrim proves the approach is commercially viable, it could become a blueprint for sovereign AI development in the Global South.