Father, Hacker (Information Security Professional), Open Source Software Developer, Inventor, and 3D printing enthusiast

  • 6 Posts
  • 222 Comments
Joined 2 years ago
Cake day: June 23rd, 2023

  • ~~stolen~~ copied creations

    When something is stolen, the person who originally held it no longer has it. In other words, stealing covers physical things.

    Copying is what you’re talking about and this isn’t some pointless pedantic distinction. It’s an actual, real distinction that matters from both a legal/policy standpoint and an ethical one.

    Stop calling copying stealing! This battle was won by everyday people (Internet geeks) against Hollywood and the music industry in the early 2000s. Don’t take it away from us. Let’s not go back to the “you wouldn’t download a car” world.



  • My argument is that the LLM is just a tool. It’s up to the person that used that tool to check for copyright infringement. Not the maker of the tool.

    Big-company LLMs were trained on hundreds of millions of books, and they use an algorithm built on that training. To say that their output is somehow a derivative of hundreds of millions of works is true! But then how do you decide how much to pay each author for any given output? Remember, they don’t have to pay for the input; only distribution matters.

    My argument is that it is far too diluted to matter: far too many books were used to train it.

    If you train an AI with Stephen King’s works and nothing else, then yeah: maybe you have a copyright argument to make when you distribute the output of that LLM. But even then, probably not, because the output won’t be identical, just similar. You can’t copyright a style.

    Having said that, with the right prompt it would be easy to use that Stephen King LLM to violate his copyright. The point I’m making is that until someone actually does use such a prompt no copyright violation has occurred. Even then, until it is distributed publicly it really isn’t anything of consequence.


  • If we’re going pie in the sky I would want to see any models built on work they didn’t obtain permission for to be shut down.

    I’m going to ask the tough question: Why?

    Search engines work because they can download and store everyone’s copyrighted works without permission. If you take away that ability, we’d all lose the ability to search the Internet.

    Copyright law lets you download whatever TF you want. It isn’t until you distribute said copyrighted material that you violate copyright law.

    Before generative AI, Google screwed around internally with all those copyrighted works in dozens of different ways. They never asked permission from any of those copyright holders.

    Why is that OK but doing the same with generative AI is not? I mean, really think about it! I’m not being ridiculous here, this is a serious distinction.

    If OpenAI did all the same downloading of copyrighted content as Google and screwed around with it internally to train AI then never released a service to the public would that be different?

    If I’m an artist who makes paintings and someone pays me to copy someone else’s copyrighted work, that’s on me to make sure I don’t do that. It’s not really the problem of the person who hired me, unless they distribute the work.

    However, if I use a copier to copy a book and then start selling or giving away those copies, that’s my problem: I’ve violated copyright law. But is it Xerox’s problem? Did they do anything wrong by making a device that can copy books?

    If you believe that it’s not Xerox’s problem then you’re on the side of the AI companies. Because those companies that make LLMs available to the public aren’t actually distributing copyrighted works. They are, however, providing a tool that can do that (sort of). Just like a copier.

    If you paid someone to study a million books and write a novel in the style of some other author you have not violated any law. The same is true if you hire an artist to copy another artist’s style. So why is it illegal if an AI does it? Why is it wrong?

    My argument is that there’s absolutely nothing illegal about it. They’re clearly not distributing copyrighted works. Not intentionally, anyway. That’s on the user. If someone constructs a prompt with the intention of copying something as closely as possible… To me, that is no different than walking up to a copier with a book. You’re using a general-purpose tool specifically to do something that’s potentially illegal.

    So the real question is this: Do we treat generative AI like a copier or do we treat it like an artist?

    If you’re just angry that AI is taking people’s jobs say that! Don’t beat around the bush with nonsense arguments about using works without permission… Because that’s how search engines (and many other things) work. When it comes to using copyrighted works, not everything requires consent.


  • I just wrote a novel (finished first draft yesterday). There’s no way I can afford professional audiobook voice actors—especially for a hobby project.

    What I was planning on doing was handling the audiobook on my own—using an AI voice changer for all the different characters.

    That’s where I think AI voices can shine: If someone can act they can use a voice changer to handle more characters and introduce a great variety of different styles of speech while retaining the careful pauses and dramatic elements (e.g. a voice cracking during an emotional scene) that you’d get from regular voice acting.

    I’m not saying I will be able to pull that off but surely it will be better than just telling Amazon’s AI, “Hey, go read my book.”







  • If you hired someone to copy Ghibli’s style, then fed that into an AI as training data, it would completely negate your entire argument.

    It is not illegal for an artist to copy someone else’s style. They can’t copy another artist’s work—that’s a derivative—but copying their style is perfectly legal. You can’t copyright a style.

    All of that is irrelevant, however. The argument is that training an AI with anything is somehow a violation of copyright. It is not. It is absolutely 100% not a violation of copyright to do that!

    Copyright is all about distribution rights. Anyone can download whatever TF they want and they’re not violating anyone’s copyright. It’s the entity that sent the person the copyrighted material that violated the law. Therefore, Meta, OpenAI, et al can host enormous libraries of copyrighted data in their data centers and use that to train their AI. It’s not illegal at all.

    When some AI model produces a work that’s so similar to an original that anyone would recognize it (“yeah, that’s from Spirited Away”), then yes: they violated Ghibli’s copyright.

    If the model produces an image of some random person in the style of Studio Ghibli that is not violating anyone’s copyright. It is not illegal nor is it immoral. No one is deprived of anything in such a transaction.


  • Riskable@programming.dev to Technology@lemmy.world · *Permanently Deleted* · 1 month ago

    I think your understanding of generative AI is incorrect. It’s not just “logic and RNG”…

    If it runs on a computer, it’s literally “just logic and RNG”. It’s all transistors, memory, and an RNG.

    The data used to train an AI model is copyrighted. Virtually nothing created in the past 100 years exists without copyright; even public-domain works were under copyright at some point.

    if any of the training data is copyrighted, then attribution must be given, or at the very least permission to use this data must be given by the current copyright holder.

    This is not correct. Every artist ever has been trained with copyrighted works, yet they don’t have to recite every single picture they’ve seen or book they’ve ever read whenever they produce something.


  • Riskable@programming.dev to Technology@lemmy.world · *Permanently Deleted* · 1 month ago

    I’m still not getting it. What does generative AI have to do with attribution? Like, at all.

    I can train a model on a billion pictures from open, free sources that were specifically donated for that purpose and it’ll be able to generate realistic pictures of those things with infinite variation. Every time it generates an image it’s just using logic and RNG to come up with options.

    Do we attribute the images to the RNG god or something? It doesn’t make sense that attribution comes into play here.
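To make the “logic and RNG” point concrete, here’s a minimal sketch of what a generative model’s sampling step boils down to. The token names and probabilities are made up for illustration (this is not any real model’s API): deterministic logic produces a probability distribution, and a random number generator picks from it.

```python
import random

def sample_next(options, weights, rng):
    """Pick one option according to its model-assigned probability."""
    return rng.choices(options, weights=weights, k=1)[0]

# Seed the RNG: same seed, same "creative" output every time.
rng = random.Random(42)

tokens = []
for _ in range(5):
    # Toy stand-in for a model's output distribution at each step:
    # three candidate tokens with fixed probabilities.
    tokens.append(sample_next(["sky", "sea", "sun"], [0.5, 0.3, 0.2], rng))

print(tokens)
```

Note that with a fixed seed the “creative” output is perfectly reproducible, which is the point: there’s no author hiding in the loop, just weighted dice.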