• ReallyActuallyFrankenstein@lemmynsfw.com
    link
    fedilink
    English
    arrow-up
    9
    arrow-down
    1
    ·
    10 hours ago

    I get that it’s usually just a dunk on AI, but it is also still a valid demonstration that AI has pretty severe and unpredictable gaps in functionality, in addition to failing to properly indicate confidence (or lack thereof).

    People who understand that it’s a glorified autocomplete will know how to disregard or prompt around some of these gaps, but this remains a litmus test because it succinctly shows you cannot trust an LLM response even in many “easy” cases.