

I don’t know if it’s just my age/experience or some kind of innate “horse sense,” but I tend to do alright with detecting shit responses, whether they be human trolls or an LLM that is lying through its virtual teeth. I don’t see that as bad news; I see it as understanding the limitations of the system. Perhaps with a reasonable prompt an LLM can be more honest about when it’s hallucinating?
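For what it’s worth, here’s a rough sketch of what “a reasonable prompt” might look like in practice: a system prompt that nudges the model to flag uncertainty instead of guessing. It happens to use the OpenAI Python client, but the model name and prompt wording are just placeholders; any chat-style API would work the same way.

```python
# Sketch: a system prompt that asks the model to admit doubt rather than guess.
# The library choice, model name, and prompt text are illustrative, not prescriptive.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

HONESTY_PROMPT = (
    "Answer the user's question. If you are not confident the answer is correct, "
    "say so explicitly and explain what you are unsure about. Do not guess or invent sources."
)

def ask_with_uncertainty(question: str, model: str = "gpt-4o-mini") -> str:
    """Ask a question with a system prompt that encourages the model to admit doubt."""
    response = client.chat.completions.create(
        model=model,  # placeholder model name
        messages=[
            {"role": "system", "content": HONESTY_PROMPT},
            {"role": "user", "content": question},
        ],
    )
    return response.choices[0].message.content

print(ask_with_uncertainty("Who won the 1904 Olympic marathon, and how?"))
```

No guarantee the model actually obeys it, of course, but in my experience it at least lowers the rate of confidently wrong answers.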
I agree – that’s why I’m chalking it up to some kind of healthy sense of skepticism when it comes to trusting authoritative-sounding answers at face value. e.g. “ok, that sounds plausible, let’s see if we can find supporting information on this answer elsewhere, or maybe ask the same question a different way to see whether the new answer(s) line up.”
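That “ask it a few different ways” check is easy enough to script. Below is a minimal sketch of the idea; the rephrasings, the model name, and the naive exact-match comparison are all illustrative, and real answers usually need a fuzzier comparison (or a human eyeball) than string equality.

```python
# Sketch: ask the same underlying question several ways and see if the answers agree.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def ask(question: str, model: str = "gpt-4o-mini") -> str:
    """Ask one question and return the model's short answer."""
    response = client.chat.completions.create(
        model=model,  # placeholder model name
        messages=[{"role": "user", "content": question + " Answer in one short sentence."}],
    )
    return response.choices[0].message.content.strip()

def cross_check(rephrasings: list[str]) -> None:
    """Ask the same question phrased different ways and compare the answers."""
    answers = [ask(q) for q in rephrasings]
    for q, a in zip(rephrasings, answers):
        print(f"Q: {q}\nA: {a}\n")
    # Crude agreement check: do the normalized answers all match exactly?
    if len({a.lower() for a in answers}) == 1:
        print("Answers agree -- still worth confirming against an outside source.")
    else:
        print("Answers diverge -- treat the claim with extra skepticism.")

cross_check([
    "What year was the transistor invented?",
    "In which year did Bell Labs first demonstrate a working transistor?",
    "When was the first transistor built?",
])
```

Agreement isn’t proof, since the model can be consistently wrong, but divergence is a cheap red flag that it’s time to go look for an outside source.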
Interesting – I still see them largely as black boxes, so reading about how people smarter than me describe the processes is fascinating.