• Mnemnosyne@sh.itjust.works
    5 months ago

    Thank you, I understand better now. So in theory, if one of the other search engines chose not to have its crawler identify itself, that crawler would be harder to block.

    • tb_@lemmy.world
      5 months ago

      This is where you get into the whole web scraping debate that also comes up with LLM “datasets”.

      If you, as a website host, detect a ton of requests coming from a single IP address, you can block that address. Scrapers can get around this by spreading their requests across different IP addresses, but there are ways to detect that too!
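
      To make the per-IP part concrete, here is a rough sketch of how a host might count requests per address in a sliding time window and block any address that goes over a limit. The class name `IPRateLimiter`, the thresholds, and the example addresses are all made up for illustration; real sites usually do this in a reverse proxy or firewall, not in application code like this.

```python
from collections import defaultdict, deque


class IPRateLimiter:
    """Hypothetical sliding-window limiter: block an IP that exceeds
    max_requests within window_seconds. Thresholds are illustrative."""

    def __init__(self, max_requests=100, window_seconds=60.0):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._hits = defaultdict(deque)  # ip -> timestamps of recent requests
        self.blocked = set()             # addresses blocked so far

    def allow(self, ip, now):
        """Record a request from `ip` at time `now` (seconds).
        Return True if it should be served, False if it is blocked."""
        if ip in self.blocked:
            return False
        hits = self._hits[ip]
        # Forget requests that have fallen out of the time window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        hits.append(now)
        if len(hits) > self.max_requests:
            self.blocked.add(ip)
            return False
        return True


limiter = IPRateLimiter(max_requests=3, window_seconds=10.0)
# Four requests within a few seconds from one address: the fourth is refused.
burst = [limiter.allow("203.0.113.5", t) for t in (0.0, 1.0, 2.0, 3.0)]
# The same number of requests spread out over time stays under the limit.
spread = [limiter.allow("198.51.100.7", t) for t in (0.0, 20.0, 40.0, 60.0)]
```

      This only catches the single-address case. Requests spread over many addresses each stay under the limit, which is why hosts fall back on other signals to spot them.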

      I’m not sure if Reddit would try to sue Microsoft or DDG if they started serving results anyway through such methods. I don’t believe it is explicitly disallowed.
      But if you were hoping to deal in any way with Reddit in the future I doubt a move like this would get you in their good graces.

      All that is to say: I won’t visit Reddit at all anymore now that their results won’t even show up when I search for something. This is a terrible move and will likely fracture the internet even more, as other websites may look to replicate this additional source of revenue.