The Software Freedom Conservancy (SFC), a non-profit community of open source advocates, today announced its withdrawal from GitHub in a scathing blog post urging members and supporters to rebuke the platform once and for all.
Up front: The SFC’s problem with GitHub stems from allegations that Microsoft and OpenAI trained an AI system called Copilot on data published under open source licenses.
Open source code is not like a donation box where you can just take whatever you want and use it any way you want.
It’s more like photography. Even if a photographer doesn’t charge you for using their images, you’re still ethically and legally obligated to give credit where it’s due.
According to a blog post on the SFC site, Copilot doesn’t give that credit when it uses other people’s code snippets:
This refers to long-standing problems with GitHub and the central reason why we should give up GitHub together. We’ve seen with Copilot, with GitHub’s core hosting service, and in almost every area of effort, GitHub’s behavior is significantly worse than their peers. We do not believe that Amazon, Atlassian, GitLab or any other for-profit hoster are perfect actors. However, a relative comparison of GitHub’s behavior with that of its peers shows that GitHub’s behavior is much worse.
Background: GitHub is the world’s de facto repository for open source code. It’s like a combination of YouTube, Twitter, and Reddit, but for programmers and the code they produce.
Of course, there are other options. But switching from one code repository ecosystem to another isn’t the same as trading Instagram for TikTok.
Microsoft acquired GitHub in 2018 for more than $7 billion.
In the time since, Microsoft leveraged its position as OpenAI’s primary benefactor in a concerted effort to build Copilot.
And the only way to access Copilot is through a special invite from Microsoft or a paid subscription.
The SFC and other open source proponents are angry that Microsoft and OpenAI are essentially monetizing other people’s code and depriving those who use that code of their ability to give proper credit.
In other words, Microsoft takes people’s work, takes credit, and sells it to others through algorithms.
A solution: Kill Copilot. Alternatively, Microsoft and OpenAI could build a time machine, go back in time, and label each individual data point in Copilot’s database, so that a second model could be built that would apply appropriate credit to each output.
But it’s always easier to exploit a Wild West regulatory environment and take advantage of people than to care about the ethics of the products and services you offer.
Neural Mind: When it comes to solid examples of AI making people’s lives easier, GitHub’s Copilot tops the list. It takes some annoying things that can cost developers hours of work and makes them as easy as pressing a button or typing a few lines of text.
And there’s a bit of a precedent here. GPT-3 and Dall-E use databases of human-generated media to generate new outputs.
But there is an important difference between those generators and Copilot. Drawing a duck in the style of Monet or asking GPT-3 to tell you a story about a happy dog is one thing.
Surfacing code snippets line by line from files in a database isn’t coding in someone else’s style; it’s using someone else’s code.
It’s probably a little more nuanced than that. There is, of course, sometimes more than one way to solve a coding problem. And coding is often as much art as science.
But just because you can take a photo of a sunset with your iPhone doesn’t mean you can steal someone else’s sunset photo, call it your own, and sell it to other people.
At the end of the day, it doesn’t matter. Copilot is a hit. The dev community seems to absolutely love it. It has received more positive press than any amount of naysaying is likely to offset.
It doesn’t matter what it will do to the open source community in the end. Who needs open source repositories when you can just work for free to make money for Microsoft?
The best part is that you don’t have a choice. There is no opt-in or opt-out. Microsoft and OpenAI have your data, and nothing is stopping them from doing what they want with it. Resistance is futile.