Soket AI Labs and Google Cloud Join Forces to Empower Indian Language AI

Preeti Bali / 10:57 am / May 17, 2024

Indian AI research firm Soket AI Labs has partnered with Google Cloud to further develop and distribute its open-source multilingual foundation model, Pragna-1B. Launched in May 2024, Pragna-1B is built for Indian languages and contexts and supports Hindi, Bengali, Gujarati, and English. The collaboration is intended to make AI more accessible and more efficient to build in India.

Built for Performance, Optimized for Efficiency

Pragna-1B prioritizes Indian languages and contexts. It is a decoder-only Transformer model with 1.25 billion parameters and a 2048-token context length. Soket AI Labs presents it as competitive with similar models despite using fewer parameters, which makes it comparatively efficient.

Pragna-1B was pre-trained on a dataset of 150 billion tokens spanning multiple languages, a mix intended to give balanced representation across the supported languages. The model’s tokenizer is also said to handle Indian languages such as Kannada, Gujarati, Tamil, and Urdu better than other tokenizers.

Seamless Integration for Developers

The partnership extends beyond model development. Soket AI Labs and Google Cloud are working together to integrate Pragna-1B into existing technical and marketplace infrastructure. This includes listing the Soket AI Developer Platform and the Pragna series models on the Google Cloud Marketplace and the Google Vertex AI model registry.

This integration gives developers direct access to Google Cloud resources such as Vertex AI and TPUs, which can be used to fine-tune models and scale AI projects efficiently.

Building the Foundation for the Future of AI

The collaboration also covers foundational work for AI development in India, such as training large-scale models and curating high-quality datasets for Indian languages.

Soket AI Labs has created ‘Bhasha,’ a collection of datasets that includes ‘Bhasha-wiki,’ with over 44.1 million articles across six Indian languages, and ‘Bhasha-wiki-indic,’ which focuses on content specific to India.

By building on Google Cloud’s AI infrastructure, these efforts aim to advance AI innovation in India while remaining transparent and cost-effective.

Leaders Share Their Vision

Abhishek Upperwal, founder of Soket AI Labs, emphasized the importance of this collaboration. He stated, “Partnering with Google Cloud allows Pragna-1B to deliver exceptional performance despite having fewer parameters, making it highly efficient and competitive with other models in language processing tasks.”

Bikram Singh Bedi, Vice President and Country Managing Director at Google Cloud India, expressed his excitement about the partnership: “We are thrilled to collaborate with Soket AI Labs to democratize AI innovation in India. Built on Google Cloud, the launch of Pragna-1B marks a significant leap in Indian language technology, offering organizations enhanced scalability and efficiency.”

This collaboration between Soket AI Labs and Google Cloud promises to be a game-changer for AI development in India, paving the way for a more inclusive and powerful future.

