India must invest heavily in Open Source AI
I was invited to speak at the IPCIDE Conference panel titled “The Future of AI: Big Tech Monopoly or Open-Source Technology.” The event was organized by the Indian Council for Research in International Economic Relations (ICRIER). The panel was moderated by Urvashi Aneja, Founder & Executive Director of Digital Futures Lab, and my co-panelists were:
- Pramod Bhasin, Chairperson, ICRIER
- Carl Benedikt Frey, Dieter Schwarz Associate Professor of AI & Work, Oxford Internet Institute, and
- Payal Malik, Visiting Professor, ICRIER
Prior to the event, Urvashi sent us a set of questions. I wrote up my answers in advance and am sharing them here. I will also share a link to the video of the panel when it is released.
Note: I could not cover all the points written here due to limited time.
Q. How might we democratise the AI innovation ecosystem — what might this mean across the tech stack?
The AI tech stack consists of compute hardware, models, training data and services. For India, catching up to global standards in hardware is probably a 10-year effort. We must invest in open hardware initiatives like RISC-V, an open standard instruction set architecture. A little-known fact is that IIT Madras was one of the founding members of the RISC-V consortium.
On models, we have efforts like Sarvam AI that might take 4–5 years to hit the mainstream. On data, the picture is a little more hopeful because efforts like Bhashini are working to create open data sets, translation tools and other resources in Indian languages. The importance of this cannot be overstated because:
(A) most Internet data in India is in English, a language that excludes 90 percent of Indians, and
(B) only the government can create public goods like open data sets that can be used to train models.
The fourth layer of the AI stack is services, which plays to India’s strengths. I think India will be a leader in the deployment of AI-based services. Like fire and the wheel, AI is a general-purpose technology with broad applicability. Therefore, we must identify areas where AI can help address India’s biggest challenges, and create missions to tackle them. For example, India is now the diabetes capital of the world. Can we apply AI to address that? We also have a large number of orphan diseases that drug manufacturers do not address because treatments for them are not commercially viable. Can we use AI to reduce the cost of drug discovery for such diseases, and bring affordable drugs to the market?
Q. What are some of the challenges that India needs to address to grow the AI ecosystem in India?
Given geopolitical realities, and the risk of technology denial regimes, we have to invest heavily in open source models for the long run. We also have to build India-specific datasets and invest heavily in capacity building. In countries like the US, 50–60 years of research has led to a seminal moment like the release of ChatGPT, often described as one of the first general-purpose systems to pass the Turing test. China has invested for almost three decades to reach where it is in the field of AI. A book titled “The Hundred-Year Marathon” documents how the Chinese have invested heavily in the technologies of the future in their effort to overtake the US as a superpower. We need a quantum leap in R&D investments if we want to be a player in the hardware and models layers of the AI stack, and we need to invest heavily in creating locally relevant data sets as public goods. We will do very well in AI services, but we need to recognize that services represent the smallest segment of the AI value chain. Therefore, we need a long-term plan to move up the AI value chain.
Q. Will they be enough to address the dominance of Big Tech players in this space?
Big Tech in AI has become big because of an audacious land grab. These companies have taken advantage of a regulatory vacuum to train their models on publicly available data sets. Without such exploitation, their valuations would be a fraction of what they are today. India must move to a future where consented data flows are used to train models, and where the data used to train models is traceable and accountable. This will create better foundations of legal certainty and ensure that value can flow back to the data creators. We will need technologies like Verifiable Credentials to ensure trusted data flows, and to curb the menace of fake news.
We also need to clamp down on the patenting of AI methods in India, which would otherwise spawn a lot of legal challenges. Section 3(k) of the Indian Patents Act says that “a mathematical or business method or a computer programme per se or algorithms” are not eligible for patents. AI patents are essentially patents on algorithms and should not be allowed. Patenting requires a lot of business sophistication and capital. Therefore, patents can become a tool for Big Tech to prevent startups from flourishing. We also need to urgently ramp up research in AI governance to ensure that society as a whole benefits from AI.
Q. Is there a risk of replacing global Big Tech with domestic Big Tech — or perhaps that’s the objective?
India is one of the most unequal countries in the world. In 2021, the top 1% held 22% of the national income while the bottom 50% held just 13%, according to the World Inequality Report 2022. If there is an implicit strategy to support national Big Tech companies, it will only exacerbate income inequality in India. We must use open source, open data, capacity building, and judicious AI governance to ensure that the benefits of AI are widely dispersed.