For some reason, we keep thinking history won’t repeat itself. We expect perfect open source licenses and that everyone will geek out on peace, love, and freely shared code. Yet we’re 26 years into the term open source and, wildly popular though open source software absolutely is, the vast majority of the world’s software may contain open source but isn’t itself licensed as open source. Developers seem fine with this, including those active on OSI mailing lists. Most of them use a plethora of closed hardware and software. As a result, despite the groundswell of support for “open source AI” (whatever that means), we are very much on track for mostly proprietary licensing to dominate AI, just as the proprietary model has dominated software.
Do we care?
It’s nice to think we do, or that we should, but the history of software is an ongoing blend of proprietary and open source. It seems to work. There’s little reason to think AI will be any different—or that it should be any different.
Here we go again
Steven Vaughan-Nichols wrote a great synopsis of open source and AI. As he puts it, “Defining open source AI is a messy issue that has yet to be settled.” I’ll go one further: It’s not going to be settled. Not soon. Not ever.
I explored this idea a few weeks ago. “While the OSI and others are trying to committee their way to an updated Open Source Definition (OSD),” I suggested, “powerful participants like Meta are releasing industry-defining models, calling them ‘open source,’ and not remotely caring when some vocally chastise them for affixing a label that doesn’t seem to fit the OSD.” Despite earnest pleas for everyone to open source their AI models, “basically none of today’s models are ‘open source’ in the way we’ve traditionally considered the term.”
We don’t have to be happy about that, but we should get used to it. Open source has never been bigger, and yet it’s still a relative rounding error in terms of the software we use every day. Most software that we use is not licensed as open source, even if it has open source inside. Open source is an essential ingredient, for sure, but it’s rarely the finished product.
Given the likelihood that AI will increasingly permeate the software and systems we depend on, it’s fair but unrealistic to want those AI models to be open source. Vaughan-Nichols blames “top AI vendors [that] are unwilling to commit to open sourcing their programs and data sets,” suggesting that “businesses hope to gild their programs with open source’s positive connotations of transparency, collaboration, and innovation.” Maybe? Or maybe they don’t have the luxury of giving away all their code because that turns out to be really bad business. I know some like to lazily gesture at Red Hat as some classic example of what business success looks like, but it’s actually a terrible example when compared to Meta, AWS, etc. As Hugging Face’s Sasha Luccioni said at the United Nations OSPOs for Good Conference, “You can’t really expect all companies to be 100% open source as the open source license defines it. You can’t expect companies just to give up everything that they’re making money off of and do so in a way they’re comfortable with.”
Maybe we’d like reality to be different, but after decades of open source and proprietary software living comfortably together, why would we expect AI to be any different?
Just as with cloud and with on-premises software before that, most AI software will not be open source. Now, as then, most developers simply won’t care, because most developers care more about going to their kids’ soccer games after work than about existential open source issues. For years we’ve fixated open source conversations on the wrong things, and younger developers have mostly tuned them out. But whether young or old, developers care about getting stuff done. They care about the cost, speed, and performance gains of Mistral’s latest model, and not so much about its non-open source license. Ditto OpenAI, Meta’s Llama, etc.
All of which is not to say that open source doesn’t matter for AI. It’s one thing that matters, not the only thing. When we obsess about open source licensing, we lose sight of the tens of millions of developers who just need software to help them get their jobs done with a minimum of fuss.