GenAI tools ‘could not exist’ if firms are made to pay copyright

  • satanmat@lemmy.world
    English
    arrow-up
    15
    arrow-down
    1
    ·
    10 months ago

    I’m just trying to think about how refined AI would be if it could only use public domain data.

    ChatGPT channels Jane Austen and Shakespeare.

    • kromem@lemmy.world
      English
      arrow-up
      5
      arrow-down
      1
      ·
      10 months ago

      That’s not really how it would work.

      If you want that outcome, it’s better to train on as massive a data set as possible initially (which does regress towards the mean but also manages to pick up remarkable capabilities and relationships around abstract concepts), and then use fine-tuning to bias it back towards an exceptional result.

      If you only trained it on those works, it would suck at pretty much everything except specifically completing those specific works with those specific characters. It wouldn’t model what the concerns of a prince in general were, but instead model that a prince either wants to murder his mother (Macbeth) or fuck her (Oedipus).

    • wewbull@feddit.uk
      English
      arrow-up
      4
      arrow-down
      1
      ·
      10 months ago

      That’s how it should be, but public domain has been crippled by Disney and co.