The Union government on Thursday (March 6, 2025) launched AI Kosha, a datasets platform that is being touted as a home for non-personal data to assist in developing Artificial Intelligence models and tools. At launch, the platform contains 316 datasets, the bulk of them meant to help create or validate language translation tools for Indian languages.
The IndiaAI Datasets Platform is one of the seven pillars of the IndiaAI Mission, the Union government’s main State-backed AI effort. The Mission has an outlay of ₹10,370 crore, and last month the Centre announced that under its Compute Capacity pillar, startups and academia would get pooled access to Graphics Processing Units (GPUs), which are needed to train and run AI models.
Other than translation, the limited datasets include submissions from Telangana’s own open data initiative, such as health data and 2011 Census data; satellite imagery captured by Indian satellites; and meteorological and pollution data, among others.
More GPUs
Information Technology Minister Ashwini Vaishnaw said while announcing the AI Kosha platform that 14,000 GPUs had been commissioned for shared access, up from around 10,000 announced earlier this year. More GPUs will be added on a quarterly basis, Mr. Vaishnaw said.
The Minister also provided an update on the government-supported effort to create a homegrown foundational AI model, an aim that has gained urgency following the success of DeepSeek, the Chinese firm that was able to train and launch such a model at a fraction of what American firms like OpenAI and Google had to spend. “Now, the team is actually inundated with how to evaluate these applications,” Mr. Vaishnaw said, indicating a high level of interest among startups in building such a foundational model for India.

Government datasets
This is not the first time the Union government has sought to aggregate public data and nudge other entities to make use of it. The government’s Open Government Data platform (data.gov.in) currently hosts over 12,000 datasets provided by different government agencies across India. The government has designated “Chief Data Officers” across different Ministries and departments, encouraging them to provide datasets that can be used by researchers, companies, and other parts of the government.
In 2018, the government constituted a committee to explore the possibility of compelling firms to give startups and the government access to non-personal data, such as traffic data from ride-sharing apps, to help new entrants and inform government policy. The committee, led by Infosys co-founder Kris Gopalakrishnan, submitted its report in 2020. However, the proposals faced pushback from the tech industry, as private players were reluctant to share their data with other parties. The conversation within the government around non-personal data from private firms took place largely before the advent of large language models (LLMs) like ChatGPT.
Published – March 06, 2025 10:47 pm IST