Hyderabad: Bengaluru-based startup Sarvam AI has launched Sarvam 1, a new open-source large language model (LLM) trained on 11 languages: Bengali, English, Gujarati, Hindi, Kannada, Marathi, Malayalam, Odia, Punjabi, Tamil, and Telugu.
Sarvam 1 is a 2-billion-parameter model trained on 4 trillion tokens using a custom tokeniser on Nvidia H100 Tensor Core GPUs. It is claimed to be up to four times more efficient than other AI models trained on Indian languages.
To enable multilingual tasks, the Sarvam-2T training corpus includes 20 per cent of datasets in Hindi, English, and programming languages. To deal with the shortage of high-quality training data for Indian languages, Sarvam AI used synthetic data generation methods to build its datasets. Developers can use the Sarvam 1 base model, available on Hugging Face, to build their own AI applications for Indic languages.
In December 2023, Sarvam AI launched the country's first Hindi LLM, OpenHathi, built on Meta AI's Llama models. In August 2024, the startup also launched its first foundational AI model, Sarvam 2B.
Meta also talked about Sarvam AI at its recently held 'Build with AI Summit' in Bengaluru, calling it a startup that is advancing AI for Indic languages and creating Hindi LLMs while operating under limited resources.
Meta Chief AI Scientist in Bengaluru
At the Build with AI Summit, Meta's VP and Chief AI Scientist Yann LeCun talked about India's thriving developer ecosystem and growing talent pool. The event highlighted India's potential to lead AI adoption and integration across sectors, while also mentioning initiatives like the Skill AI Chatbot for AI-driven education in multiple languages.
Meta also talked about the open-source Llama models powering platforms like Arivihan, India's first fully automated online learning platform, which generates personalised lecture scripts and answers to over 100,000 student queries. The company also highlighted AI4Bharat and startups such as Sarvam AI.
AI Hackathon and Llama 3.1 Impact Grants
Meta also organised an AI Hackathon ahead of the Build with AI Summit to boost innovation using Meta Llama's open platform. The hackathon drew over 1,500 registrations, 350+ proposals, and 90+ teams. CurePharmaAI, CivicFix, and evAIssment won, and the all-women team SheBuilds received a special mention. Participants also got the chance to submit their proposals for the Llama Impact Grants, with potential funding of up to $100K regionally and $500K globally.
The Llama 3.1 Impact Grants support startups in developing AI solutions. In 2023, the Indian non-profit Wadhwani AI was globally recognised for using Llama 3 to enhance its AI-enabled oral reading fluency assessment, creating personalised practice modules for students.