Open source AI is transforming technology, with initiatives like DeepSeek influencing the financial sector. Red Hat emphasizes practical solutions over the pursuit of artificial general intelligence, confronting challenges like data opacity and the complexities of AI training data. Leaders Richard Fontana and Chris Wright advocate for transparency and community collaboration while cautioning against overextending the definition of openness. Red Hat aims to create a sustainable, accessible framework for AI development, aligning with evolving standards in the open source community.
Transforming the Landscape of Open Source AI
Open source artificial intelligence (AI) is revolutionizing our understanding of this technology. A prime example is DeepSeek, the groundbreaking Chinese open source initiative whose model releases have shaken financial markets. Red Hat, a leading name in the Linux domain, is acutely aware of the potential that open source coupled with AI holds.
Red Hat’s practical approach to open source AI highlights its long-standing commitment to addressing the intricate challenges posed by contemporary AI systems. Instead of chasing the elusive dream of artificial general intelligence (AGI), Red Hat focuses on balancing the practical demands of businesses with the capabilities that AI can currently deliver.
Navigating the Challenges of Data Opacity
However, Red Hat recognizes the complexity surrounding the term “open source AI.” At the Linux Foundation Members Summit in November 2024, Richard Fontana, Red Hat’s chief commercial counsel, emphasized that while traditional open source software is built on transparent source code, AI introduces hurdles related to the obscurity of training data and model weights.
During a panel discussion, Fontana remarked, “What is the analogy [with source code] for AI? It’s not clear. Some believe that training data should be open, but that’s quite impractical for large language models [LLMs]. This indicates that the concept of open source AI may be more of an idealistic vision at this point.”
This uncertainty is visible in models that are marketed as “open source” yet come with restrictive licenses. Fontana criticizes such trends, pointing out that many of these licenses unfairly limit certain business sectors or groups while still claiming to promote openness.
One of the key hurdles lies in reconciling transparency with competitive and legal realities. Although Red Hat promotes openness, Fontana cautions against strict definitions that mandate full disclosure of training data, as revealing detailed training information could expose model creators to legal challenges in the current environment. The fair use of publicly available data further complicates expectations around transparency.
Chris Wright, Red Hat’s CTO, advocates for practical steps toward reproducibility, promoting open models like Granite LLM and tools such as InstructLab, which facilitate community-driven fine-tuning. Wright notes, “InstructLab empowers anyone to enhance the models, fostering true collaboration in AI. This is the same way open source triumphed in software, and now we are applying it to AI.”
Wright draws parallels between this initiative and Red Hat’s legacy with Linux: “Just as Linux standardized computing infrastructure, RHEL AI lays the groundwork for open, flexible, and hybrid enterprise AI by design.”
Red Hat envisions an AI development framework that embodies the collaborative ethos of open source software. "Models should be regarded as open source artifacts. Knowledge sharing is at the core of Red Hat's mission, which helps prevent vendor lock-in and guarantees that AI serves the broader community," asserts Wright.
Nonetheless, this task is not without its challenges. Wright acknowledges that “AI, especially large language models that power generative AI, cannot be perceived in the same manner as open source software. Unlike software, AI models primarily consist of numerical parameters, or model weights, that dictate how a model interprets inputs and connects various data points. These weights are derived from a rigorous process involving extensive, meticulously curated training data.”
Despite the differences, Wright adds, "In some ways, AI models function similarly to code. It is tempting to equate training data with a model's source code, but training data alone does not fulfill that role. Most enhancements and adjustments to AI models within the community do not require access to original training data; rather, they stem from modifications to model weights or fine-tuning processes, which can also enhance performance. The ability to implement these improvements depends on the weights being published under open source licenses."
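Wright's point, that community improvements typically start from published weights rather than the original corpus, can be illustrated with a deliberately tiny sketch: take the "pretrained" parameters of a toy linear model and fine-tune them on a contributor's own small dataset, never touching whatever data produced the starting weights. Everything here (the model, the numbers, the data) is hypothetical and purely illustrative; it is not Granite or InstructLab internals.

```python
# Toy illustration of fine-tuning published weights without the original
# training data. The "pretrained" values stand in for open-licensed weights
# a contributor might download; the data and model are invented for this sketch.

def predict(w, b, x):
    """A one-parameter-pair 'model': y = w * x + b."""
    return w * x + b

def fine_tune(w, b, data, lr=0.05, epochs=500):
    """Plain SGD on new (x, y) pairs only -- the original corpus is never needed."""
    for _ in range(epochs):
        for x, y in data:
            err = predict(w, b, x) - y
            w -= lr * err * x   # gradient of squared error w.r.t. w
            b -= lr * err       # gradient of squared error w.r.t. b
    return w, b

# Weights "as published" under an open license.
pretrained_w, pretrained_b = 2.0, 0.5

# A contributor's small, domain-specific dataset (here, points on y = 3x - 1).
new_data = [(0.0, -1.0), (1.0, 2.0), (2.0, 5.0), (-1.0, -4.0)]

w, b = fine_tune(pretrained_w, pretrained_b, new_data)
print(round(w, 2), round(b, 2))  # weights have moved toward the new task
```

The sketch shows the asymmetry Wright describes: the starting weights are enough to adapt the model, while the data behind them stays opaque; scaling this idea up to LLM weights is exactly what community fine-tuning workflows rely on.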
However, Fontana urges caution against overextending the definition of openness, advocating for minimum standards rather than lofty ideals. “The open source definition (OSD) has been effective because it establishes a baseline, not a maximum. AI definitions should prioritize clarity in licenses, avoiding burdensome transparency requirements for developers.”
This philosophy aligns with the Open Source AI Definition (OSAID) 1.0 from the Open Source Initiative (OSI), although Red Hat has not yet endorsed the document. Wright explains, “Our perspective has been focused on what makes open source AI practical and accessible for a wide array of communities, organizations, and providers.” He concludes, “The future of AI is open, but it is a journey. We will tackle transparency, sustainability, and trust, project by project.”
Fontana’s careful approach underlines this vision, emphasizing that open source AI must navigate competitive and legal landscapes. The community should refine definitions gradually instead of imposing ideals on a developing technology.
The OSI concurs, noting that OSAID 1.0 is merely the first iteration, with plans for further revisions already in progress. In the meantime, Red Hat remains committed to shaping the future of open AI by bridging gaps between developer and business communities while addressing the complex ethics of transparency in AI.