Google says AI systems should be able to mine publishers’ work unless companies opt out, turning copyright law on its head

0x815@feddit.de · 1 year ago

nous@programming.dev · 1 year ago

Someone getting sued does not mean they are wrong or that they lost the case. Each case needs to look at the works in question and decide whether that particular case violates copyright. Lots of things are taken into account here, and even if small elements have been used or are similar, that does not automatically win the case.

There is also a difference between a specific implementation and the overall feature in question. For instance, APIs are not copyrightable, nor are chords in music, nor what something does overall. Only specific implementations are copyrightable.

The same can apply to AI - if it generates a work that would violate copyright had a human made it, then it infringes - if not, then it does not. But AI raises a different problem: scale. There is only a limited amount of work that a human can do. An AI can produce vastly more content - enough that a case-by-case evaluation of infringement might not be viable. If that becomes the case, then AI works might need to be treated differently from human-created works - or maybe the rules need to cover how the models are created and how they can use copyrighted works. The current laws were never designed with the speed at which AI can work in mind.

Boinketh@lemm.ee · 1 year ago

If an AI has been trained on copyrighted material and can be shown to be capable of reproducing something close enough to said material, would that be infringement already or not? If you use a paid service like Midjourney to generate copyrighted content, the company is essentially selling you access to copyrighted content they lack the rights to.

nous@programming.dev · 1 year ago

What do you mean by infringement already? Do you mean it automatically infringes copyright for all its output just because it might create something similar to a copyrighted work? Or do you mean that if it does create a copyrighted work, that work is infringing on a copyright? Your wording is vague here.

can be shown to be capable of reproducing something close enough to said material

I don’t think that is a good benchmark for forbidding AI generation of content. If you create a random image generator that has no inputs and is truly random, then it is capable of generating something similar to a copyrighted work by pure chance. Even if that chance is very low, you could generate enough images to show it can create something similar to copyrighted works.

What happens if you create one that is trained only on public domain images or works properly licensed? Its output is still partially random and could still generate an image similar to some other copyrighted work outside of its training set by pure chance.

I would argue that both of these should be allowed. They are not doing anything obviously wrong even if they could be used to reproduce copyrighted works. Just like you could use Photoshop - or a paintbrush - to reproduce a copyrighted work.

But then, what if you take some other AI that is trained on all sorts of data, copyrighted or not, and feed its output through a checker that compares it to the training set (and maybe more copyrighted content) and rejects/regenerates work until it is known not to infringe on copyrighted work? That would make the chances of it ever producing a copyrighted work far lower than for the programs above. Should that be allowed? It is using copyrighted work much like an artist would, and you could argue that any copyrighted work it does produce was by pure accident, as there are intentional steps to mitigate that.
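To make that checker idea concrete, here is a rough sketch of the reject/regenerate loop. Everything in it is an assumption for illustration only: `generate_checked`, `similarity`, the 0.8 threshold and the use of plain string similarity are all made up for this example. A real system would need a far better similarity measure, and passing a check like this says nothing about whether a court would find infringement.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Crude stand-in for a real similarity measure (0.0 to 1.0)."""
    return SequenceMatcher(None, a, b).ratio()


def generate_checked(generate, protected_works, threshold=0.8, max_attempts=100):
    """Call generate() until its output is not too close to any protected work.

    generate        -- hypothetical zero-argument function returning a new work
    protected_works -- works the output is compared against
    threshold       -- similarity at or above which a candidate is rejected
    """
    for _ in range(max_attempts):
        candidate = generate()
        # Accept only if the candidate stays below the threshold for every work.
        if all(similarity(candidate, work) < threshold for work in protected_works):
            return candidate
    raise RuntimeError("no acceptable output within max_attempts")
```

The loop can only lower the chance of a close copy, and only relative to the works it is given to compare against. It cannot rule out similarity to something outside that set.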

If you use a paid service like Midjourney to generate copyrighted content, the company is essentially selling you access to copyrighted content they lack the rights to.

As far as I understand the laws involved, yeah, I would expect that to infringe on some copyright holder’s work, and Midjourney would likely be liable for damages. Just like hiring an artist to create some work: if they decide to copy some copyrighted work, that artist would also be liable for damages.

And you also have to consider another side of things - if you can effectively stop AI from training on most works, you will effectively stunt its usefulness. That could make all efforts in regulated nations useless, which can result in the technology just moving to places that are much more open with it and where authors of the copyrighted work will have far less control over things. IMO the cat is out of the bag with AI-generated content now and we will not get it back in. So the best we can do is ensure the right people get compensated for their works. Push too hard in the wrong direction (either way) and there is a real chance they never will.

I don’t really have the solutions to many of these problems - but I do think it is worth talking about, and I don’t think that outright bans (or actions leading to an effective ban) on this tech are the correct way to go.

Boinketh@lemm.ee · edited 1 year ago

To be clear, my position is that copyright law should be loosened, not tightened. I know that it’s unreasonable and infeasible to limit AI like that, both for practical and competitive reasons.

When I said that it could be shown to generate copyrighted content, I didn’t mean it had a chance, I meant showing actual examples of it doing so. I also think that it should be allowed to do that, but so should everyone else. In my opinion, derivative works should almost always be allowed unless they can be proven to cause significant harm to the original creator.