Yeah, that’s basically it.
But I think what’s getting overlooked in this conversation is that it probably doesn’t matter whether it’s AI or not. Either new content is derivative or it isn’t. That’s true whether you wrote it or an AI wrote it.
If I created a web app that took samples from songs created by Metallica, Britney Spears, Backstreet Boys, Snoop Dogg, Slayer, Eminem, Mozart, Beethoven, and hundreds of other different musicians, and allowed users to mix all these samples together into new songs, without getting a license to use these samples, the RIAA would sue the pants off of me faster than you could say “unlicensed reproduction.”
The RIAA is indeed a litigious organization, and they tend to use their phalanx of lawyers to bully anyone who does anything creative or new into submission.
But sampling is generally considered fair use.
And if the algorithm you used actually listened to tens of thousands of hours of music, and fed existing patterns into a system that creates new patterns, well, you’d be doing the same thing anyone who goes from listening to music to writing music does. The first song ever written by humans was probably plagiarized from a bird.
It wouldn’t matter, because derivative works require permission. But I don’t think anyone’s really made a compelling case that OpenAI is actually making directly derivative work.
The stronger argument is that LLMs are making transformative work, which is normally fair use, but should still require some form of compensation given the scale of it.
Her lawsuit doesn’t say that. It says,
"when ChatGPT is prompted, ChatGPT generates summaries of Plaintiffs' copyrighted works—something only possible if ChatGPT was trained on Plaintiffs' copyrighted works"
That’s an absurd claim. ChatGPT has surely read hundreds, perhaps thousands of reviews of her book. It can summarize it just like I can summarize Othello, even though I’ve never seen the play.
I haven't been able to reproduce that, and so far I haven't seen any compelling screenshots of it either. Usually it just generates text, but that text doesn't actually match the book.
If you say "AI read my book and output a similar story, you owe me money," then how is that different from "Joe read my book and wrote a similar story, you owe me money"?
You're bounded by the limits of your flesh. AI is not. The $12 you spent buying a book at Barnes & Noble was based on the economy of scarcity that your human abilities constrain you to.
It’s hard to say that the value proposition is the same for human vs AI.
A better comparison would probably be sampling. Sampling is fair use in most of the world, though there are mixed judgments. I think most reasonable people would consider the output of ChatGPT to be transformative use, which is considered fair use.
No, it isn’t. There are enumerated rights a copyright grants the holder a monopoly over. They are reproduction, derivative works, public performances, public displays, distribution, and digital transmission.
Commercial vs non-commercial has nothing to do with it, nor does field of endeavor. And aside from the granted monopoly, no other rights are granted. A copyright does not let you decide how your work is used once sold.
I don’t know where you guys get these ideas.
The published summary is open to fair use by web crawlers. That was settled in Perfect 10 v. Amazon.
Derivative and transformative are quite different though.
I very much agree.
The thing is, copyright isn’t really well-suited to the task, because copyright concerns itself with who gets to, well, make copies. Training an AI model isn’t really making a copy of that work. It’s transformative.
Should there be some kind of new model of remuneration for creators? Probably. But it should be a compulsory licensing model.
If I gave a worker a pirated link to several books and scientific papers in the field, and asked them to synthesize an overview/summary of what they read and publish it, I’d get my ass sued. I have to buy the books and the scientific papers.
Well, if OpenAI knowingly used pirated work, that’s one thing. It seems pretty unlikely and certainly hasn’t been proven anywhere.
Of course, they could have done so unknowingly. For example, if John C Pirate published the transcripts of every movie since 1980 on his website, and OpenAI merely crawled his website (in the same way Google does), it’s hard to make the case that they’re really at fault any more than Google would be.
Yes. I do. And I’m right.
There is already a business model for compensating authors: it is called buying the book. If the AI trainers are pirating books, then yeah - sue them.
That’s part of the allegation, but it’s unsubstantiated. It isn’t entirely coherent.
"…these companies who have been using copyrighted material - without compensating the content creators - to train their AIs."
That wouldn’t be copyright infringement.
It isn’t infringement to use a copyrighted work for whatever purpose you please. What’s infringement is reproducing it.
Maybe you don’t care, but the OSI definition does.
In fairness, they didn’t release anything open at all.
You meet them online, but they’re a vocal minority. Especially when a smaller phone means a smaller battery and worse camera system, two of the consistently top priorities for consumers.