Yeah, that’s basically it.
But I think what’s getting overlooked in this conversation is that it probably doesn’t matter whether it’s AI or not. Either new content is derivative or it isn’t. That’s true whether you wrote it or an AI wrote it.
If I created a web app that took samples from songs created by Metallica, Britney Spears, Backstreet Boys, Snoop Dogg, Slayer, Eminem, Mozart, Beethoven, and hundreds of other different musicians, and allowed users to mix all these samples together into new songs, without getting a license to use these samples, the RIAA would sue the pants off of me faster than you could say “unlicensed reproduction.”
The RIAA is indeed a litigious organization, and they tend to use their phalanx of lawyers to bully anyone who does anything creative or new into submission.
But sampling is generally considered fair use.
And if the algorithm you used actually listened to tens of thousands of hours of music, and fed existing patterns into a system that creates new patterns, well, you’d be doing the same thing anyone who goes from listening to music to writing music does. The first song ever written by humans was probably plagiarized from a bird.
It wouldn’t matter, because derivative works require permission. But I don’t think anyone’s really made a compelling case that OpenAI is actually making directly derivative work.
The stronger argument is that LLMs are making transformative work, which is normally fair use, but should still require some form of compensation given the scale of it.
Her lawsuit doesn’t say that. It says,
"when ChatGPT is prompted, ChatGPT generates summaries of Plaintiffs' copyrighted works—something only possible if ChatGPT was trained on Plaintiffs' copyrighted works"
That’s an absurd claim. ChatGPT has surely read hundreds, perhaps thousands of reviews of her book. It can summarize it just like I can summarize Othello, even though I’ve never seen the play.
I haven't been able to reproduce that, and so far I haven't seen any compelling screenshots of it either. Usually it just generates text, but that text doesn't actually match the book.
If you say "AI read my book and output a similar story, you owe me money," then how is that different from "Joe read my book and wrote a similar story, you owe me money"?
You're bounded by the limits of your flesh. AI is not. The $12 you spent buying a book at Barnes & Noble was based on the economy of scarcity that your human abilities constrain you to.
It’s hard to say that the value proposition is the same for human vs AI.
A better comparison would probably be sampling. Sampling is fair use in most of the world, though there are mixed judgments. I think most reasonable people would consider the output of ChatGPT to be transformative use, which is considered fair use.
No, it isn’t. There are enumerated rights a copyright grants the holder a monopoly over. They are reproduction, derivative works, public performances, public displays, distribution, and digital transmission.
Commercial vs non-commercial has nothing to do with it, nor does field of endeavor. And aside from the granted monopoly, no other rights are granted. A copyright does not let you decide how your work is used once sold.
I don’t know where you guys get these ideas.
The published summary is open to fair use by web crawlers. That was settled in Perfect 10 v. Amazon.
Derivative and transformative are quite different though.
I very much agree.
The thing is, copyright isn’t really well-suited to the task, because copyright concerns itself with who gets to, well, make copies. Training an AI model isn’t really making a copy of that work. It’s transformative.
Should there be some kind of new model of remuneration for creators? Probably. But it should be a compulsory licensing model.
If I gave a worker a pirated link to several books and scientific papers in the field, and asked them to synthesize an overview/summary of what they read and publish it, I’d get my ass sued. I have to buy the books and the scientific papers.
Well, if OpenAI knowingly used pirated work, that’s one thing. It seems pretty unlikely and certainly hasn’t been proven anywhere.
Of course, they could have done so unknowingly. For example, if John C Pirate published the transcripts of every movie since 1980 on his website, and OpenAI merely crawled his website (in the same way Google does), it’s hard to make the case that they’re really at fault any more than Google would be.
Yes. I do. And I’m right.
There is already a business model for compensating authors: it is called buying the book. If the AI trainers are pirating books, then yeah - sue them.
That’s part of the allegation, but it’s unsubstantiated. It isn’t entirely coherent.
"…these companies who have been using copyrighted material - without compensating the content creators - to train their AIs."
That wouldn’t be copyright infringement.
It isn’t infringement to use a copyrighted work for whatever purpose you please. What’s infringement is reproducing it.
Maybe you don’t care, but the OSI definition does.
In fairness, they didn’t release anything open at all.
You meet them online, but they’re a vocal minority. Especially when a smaller phone means a smaller battery and worse camera system, two of the consistently top priorities for consumers.