How Open-Source AI is Setting New Standards in AI Development


Towards the end of January, the tech-heavy Nasdaq sank nearly three percent in the opening hours of trading, one of its steepest single-day declines in the past two years. The sharp drop was triggered by the explosive rise of the Chinese artificial intelligence startup DeepSeek, which drew all eyes after releasing its AI models DeepSeek-V3 and DeepSeek-R1, a reasoning model. In a short span of time, DeepSeek's app overtook ChatGPT to become the most downloaded app on the App Store. Unlike the proprietary models used by the tech giants, DeepSeek's models are publicly available and can be deployed on any system. The question now is how open-source AI will impact AI development overall.

How DeepSeek Made a Breakthrough
The tech and AI industries across the world were taken aback by DeepSeek's explosive growth, which also sent shockwaves through the stock market and left larger companies rethinking their AI plans.
Because DeepSeek's models rely on far fewer GPUs, the company depends less on expensive Nvidia chips, threatening one of Nvidia's key revenue streams. Nvidia's market value fell by $593 billion after DeepSeek's AI models were released, stoking investor fears that the AI industry's demand for GPUs could shrink.

OpenAI has perhaps taken the biggest hit. DeepSeek-R1 leans on strong natural language processing and coding, while OpenAI's GPT-4o combines advanced multimodal capabilities (text, vision, and audio), keeping it a competitive option for many AI applications. Moreover, DeepSeek-R1 was reportedly built for only about $6 million, whereas OpenAI's GPT-4 is thought to have cost roughly $100 million to train. OpenAI responded by offering o3-mini at no cost, and it remains unclear whether that move is a loss or a cost-effective strategy.

Furthermore, DeepSeek presents an alternative in the AI landscape, adding pressure on Google's Gemini models, which still face criticism and performance concerns.

Is Open-Source Better than Closed-Source AI?
The Open Source Initiative, the globally recognized authority on open-source standards, states that an open-source AI system gives users complete autonomy to use, study, modify, and share the model as they see fit. It's similar to giving a chef an endless supply of ingredients to work with in the kitchen.
The exact opposite is closed-source AI. The source code of proprietary models is kept under wraps, restricting outside development or modification. Proponents of closed AI argue that keeping the code private improves security, prevents data exploitation, and helps protect user privacy.

DeepSeek's sudden rise to fame has, however, set off a new trend of companies switching to open-source alternatives, which they see as flexible and affordable, whereas proprietary AI demands significant investment in data collection and processing capacity. What grabbed the industry's attention is DeepSeek-R1's reported training cost of roughly $6 million, a fraction of the billions of dollars that OpenAI, Google, and Anthropic have collectively spent on their state-of-the-art models.

Is an AI Pricing War Fast Approaching?
OpenAI's CEO, Sam Altman, even complimented DeepSeek's innovation on X, writing: “DeepSeek’s R1 is an impressive model, particularly in terms of what they’ve achieved at such a low cost.”
Given that AI companies' valuations are based on their capacity to train proprietary models, this shift may have dire consequences. Following DeepSeek's disruption, companies might have a hard time securing investment if they cannot offer something extraordinary beyond their LLMs.

Winning Points of Open-Source AI Models
When source code is openly accessible, developers in any part of the world can work in tandem to enhance it. This speeds up innovation, since the open-source community's pooled wisdom propels advancement much faster than it would under a closed-source paradigm.

Open-source AI models also facilitate greater collaboration and integration by making advanced AI capabilities broadly available. Take DeepSeek's technology itself: it is accessible to developers, researchers, and companies of all sizes because it is released as an open model, which eliminates cost obstacles, as the brief sketch below illustrates. Since the code can be independently examined for flaws and ethical concerns, open-source models are, by nature, more transparent, which supports ethical technology deployment and fosters user trust. The open-source community also offers reliable feature improvements, troubleshooting, and assistance.
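As a minimal sketch of what that accessibility looks like in practice, the Python snippet below downloads an openly released model and runs it locally with the widely used Hugging Face transformers library. The specific model ID (a small distilled DeepSeek-R1 variant) and the generation settings are illustrative assumptions, not an official DeepSeek recipe.

```python
# Minimal sketch: running an openly released model locally with Hugging Face
# transformers. The model ID and settings are illustrative assumptions.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B"  # assumed small open-weights variant

# Download the tokenizer and weights from the public hub; no API key or licence fee is needed.
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto")

# Build a prompt using the model's chat template.
messages = [{"role": "user", "content": "Explain open-source AI in one sentence."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

# Generate a reply and print only the newly generated tokens.
outputs = model.generate(inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```

Because the weights are downloadable, the same few lines work on a laptop, a university cluster, or a company server, which is precisely the cost and access advantage the open-source camp points to.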

Drawbacks
Like other approaches, the open-source model is not without drawbacks, including the possibility of malicious actors exploiting security loopholes and the need to safeguard intellectual property. It is at the model training stage, however, that things become complex for open-source AI developers. Training AI models requires large volumes of data, yet few open-source AI initiatives or companies make this data available to the general public. The dearth of publicly accessible training data creates two major issues. First, it can be challenging for developers to alter and retrain a model: even though the model's source code can be freely modified, developers cannot retrain their modified model on the original dataset because they do not have access to the training data.

Second, even when a model's source code is accessible, a lack of transparency regarding the training data can make it difficult to understand how the model functions.

Beyond that, it is often difficult to strike a balance between the advantages of open collaboration and the need for security and intellectual property protection.

There is also a chance that a project becomes fragmented when many developers contribute, which can cause compatibility problems between versions. Keeping the community consistent and interoperable calls for explicit rules and standards.

Given these obstacles, developers and organizations interested in open-source AI will most likely have to make do for the time being with pre-trained models from companies like Meta and Mistral. Because such models cannot be retrained by outsiders on their original data, their transparency and flexibility are limited, which raises issues uncommon in other kinds of open-source software.

The scenario could be different if bigger companies invest in developing open-source datasets designed specifically for AI training and supply the processing power required to carry out that training. For now, though, access to open-source AI models does not offer all the benefits of other forms of open-source software, owing to the difficulties involved in model training.


