India still has many hurdles to cross for a DeepSeek moment in AI development


A new version of Alibaba’s Qwen 2.5 AI model was unveiled while the world was still in awe of DeepSeek, and it outperformed DeepSeek V3 on a number of important benchmarks.

The back-to-back unveiling of sophisticated AI models by Chinese companies has also raised questions in India: what prevents it from building a foundational AI model of its own?

In a recent news conference, Ashwini Vaishnaw, the minister for information technology, partially addressed this. Over the next eight to ten months, he said, India would have several indigenous foundational AI models built and ready for use. According to Vaishnaw, the Ministry of Electronics and Information Technology has spent the last 18 months working with specialists in large language models (LLMs) and small language models (SLMs) to lay the groundwork for India’s first artificial intelligence model. He also drew attention to compute capacity, a crucial prerequisite for building foundational models.

“We have nearly 15,000 high-end GPUs. Just to give you some context, DeepSeek was trained on 2,000 GPUs while ChatGPT version 4 was trained on 25,000 GPUs. This (procurement of 18,693 GPUs) gives us a very robust compute facility, which will give a boost for creating AI applications, models, distillation and training processes and creating new algorithms,” Vaishnaw said.

Academics and industry professionals responded to the announcement with confidence, but they also noted that difficulties remain. Building foundational models requires several important elements, compute power among them.

According to several experts, DeepSeek casts doubt on the notion that building foundational models necessarily demands enormous financial outlays. It does not, however, eliminate budgetary constraints altogether.

“The model was published by DeepSeek under an MIT license, which permits free reuse and modification. This is one of the few models with a highly permissive license, making it particularly valuable for open-source enthusiasts and developers. Unlike restrictive licenses, the MIT license allows users to build better models on top of DeepSeek’s foundation, fostering innovation and collaboration,” explains Y Kiran Chandra, founder of Swecha and head of Viswam AI, a collaborative centre of excellence run by Swecha and IIIT Hyderabad.

Viswam AI, a centre of excellence, is dedicated to building AI solutions tailored to the needs of the Global South.

However, Chandra notes that it is important to remember that the model cannot be regarded as fully open-source, because its training data has not been released.

“For a project to be truly open-source, all four critical components (algorithms, datasets, model weights, and source code) must be accessible for public review and use,” he added.

Nevertheless, DeepSeek’s publication of an “open-weight” model remains a noteworthy achievement. By making the model weights public, the company has made it possible for researchers and developers to examine, modify, and build upon the model.
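As a concrete illustration of what an open-weight release enables, the short sketch below loads a publicly released checkpoint, inspects its parameter count, and runs it locally. The model id, like the rest of the snippet, is an assumption for illustration; any open-weight checkpoint on the Hugging Face Hub would work the same way.

```python
# Minimal sketch: inspecting and running an open-weight model.
# The model id below is illustrative; substitute any open-weight
# checkpoint from the Hugging Face Hub. Assumes the `transformers`
# and `torch` packages are installed and the machine has enough
# memory for the chosen checkpoint.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "deepseek-ai/DeepSeek-R1-Distill-Llama-8B"  # illustrative

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto")

# Because the weights are public, they can be examined directly...
n_params = sum(p.numel() for p in model.parameters())
print(f"{n_params / 1e9:.1f}B parameters")

# ...or used locally and fine-tuned on new data, which is what a
# permissive license such as MIT explicitly allows.
prompt = "Explain open-weight models in one sentence."
inputs = tokenizer(prompt, return_tensors="pt")
output = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```

This kind of direct access to the weights is exactly what a closed, API-only release would prohibit.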

“This approach demonstrates that high-quality AI models can be developed without relying solely on the brute-force, resource-intensive methods often pursued by organisations in the Global North. Already, the community has begun innovating with the model; for instance, some have extracted the reasoning component and integrated it with other models like LLaMA, showcasing its versatility and potential for hybrid applications,” said Chandra.

DeepSeek’s success has prompted India to ask whether it has become overly dependent on the West. Is it possible to develop an India-specific foundational model? Would such a model resolve the problems of bias and hallucination? The questions go on.

According to Kunal Walia, a partner at Dalberg Advisors, success will hinge on creating an environment that encourages innovation despite these constraints and, above all, on building high-quality datasets tailored to India.

“A stronger focus on research and development is essential, along with collaboration between public and private sectors to mobilise resources for building and training these models. This development serves as a wake-up call, and India will naturally move in this direction,” said Walia.

India needs a more robust AI research ecosystem with better incentives for basic research, according to Professor Balaraman Ravindran, head of IIT Madras’ Centre for Responsible AI and dean of its Wadhwani School of Data Science and AI. China has invested heavily in building a strong AI ecosystem, and India will need to focus on industry-academia collaboration and greater venture capital funding to close the gap.

“We do have high quality research happening in India, but in very few universities, perhaps confined to a few IITs, IISc, and IIITs. But in China they have probably 100 institutes of that level. The amount of investment that some of the top universities in China get for fundamental research is way higher than we can imagine… I think Tsinghua and Peking Universities get more money for research than the entire Indian academia,” added Ravindran.

The Indian government may be waking up to this. For FY25-26, education and literacy received a total of Rs 78,572 crore, the highest allocation ever made to the Department of School Education and Literacy. The government also said it will invest Rs 500 crore to establish an AI Centre of Excellence.

Experts also note that the scarcity of digital-first datasets is the biggest obstacle to building a foundational model suited to Indian use cases. Chandra believes India can readily produce its own datasets through crowdsourcing. Swecha developed Chandamama Kathalu, India’s first Telugu-language SLM. It was followed by the Gonthuka project, which gathered 1.5 million voice samples from 45,000 participants.

India needs an LLM rooted in its cultural heritage, Chandra emphasises; language is an essential carrier of cultural nuance and a condensed form of expression. “To tackle that as a community we have been able to collect close to 50 million tokens. We believe that it is possible to build a large language model with cultural nuances and datasets with 200 million tokens. We trained 30,000 students last summer to get data from nearby villages with cultural nuances,” added Chandra.
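To put those token figures in perspective, here is a minimal sketch of how such a crowd-sourced corpus might be sized: counting tokens in collected text files with an off-the-shelf multilingual tokenizer. The directory path, file layout, and tokenizer choice are illustrative assumptions, not details of the Swecha or Gonthuka pipelines.

```python
# Minimal sketch: estimating the token count of a crowd-sourced text
# corpus against a target, assuming plain-text files under a local
# directory. The path and tokenizer id are hypothetical placeholders.
from pathlib import Path
from transformers import AutoTokenizer

CORPUS_DIR = Path("corpus/telugu")  # hypothetical corpus location
TARGET_TOKENS = 200_000_000         # the 200-million-token threshold Chandra cites

# Any tokenizer with reasonable Telugu coverage would do; mT5's is
# used here only as an example of a multilingual tokenizer.
tokenizer = AutoTokenizer.from_pretrained("google/mt5-small")

total_tokens = 0
for path in CORPUS_DIR.rglob("*.txt"):
    text = path.read_text(encoding="utf-8")
    total_tokens += len(tokenizer.encode(text))

print(f"Collected {total_tokens:,} tokens "
      f"({total_tokens / TARGET_TOKENS:.1%} of the 200M-token target)")
```

By this yardstick, the roughly 50 million tokens the community has gathered so far would sit at about a quarter of the stated 200-million-token goal.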
