On Monday, a developer using the popular AI-powered code editor Cursor noticed something strange: Switching between machines instantly logged them out, breaking a common workflow for programmers who use multiple devices. When the user contacted Cursor support, an agent named "Sam" told them it was expected behavior under a new policy. But no such policy existed, and Sam was a bot. The AI model made the policy up, sparking a wave of complaints and cancellation threats documented on Hacker News and Reddit.
This marks the latest instance of AI confabulations (also called "hallucinations") causing potential business damage. Confabulations are a type of "creative gap-filling" response in which AI models invent plausible-sounding but false information. Instead of admitting uncertainty, AI models often prioritize creating plausible, confident responses, even when that means manufacturing information from scratch.
For companies deploying these systems in customer-facing roles without human oversight, the consequences can be immediate and costly: frustrated customers, damaged trust, and, in Cursor's case, potentially canceled subscriptions.
How It Unfolded
The incident began when a Reddit user named BrokenToasterOven noticed that while swapping between a desktop, a laptop, and a remote dev box, Cursor sessions were unexpectedly terminated.
"Logging into Cursor on one machine immediately invalidates the session on any other machine," BrokenToasterOven wrote in a message that was later deleted by r/cursor moderators. "This is a significant UX regression."
Confused and frustrated, the user wrote an email to Cursor support and quickly received a reply from Sam: "Cursor is designed to work with one device per subscription as a core security feature," read the email reply. The response sounded definitive and official, and the user did not suspect that Sam was not human.
After the initial Reddit post, users took the message as official confirmation of an actual policy change, one that broke habits essential to many programmers' daily routines. "Multi-device workflows are table stakes for devs," wrote one user.
Shortly afterward, several users publicly announced their subscription cancellations on Reddit, citing the non-existent policy as their reason. "I literally just cancelled my sub," wrote the original Reddit poster, adding that their workplace was now "purging it completely." Others joined in: "Yep, I'm canceling as well, this is asinine." Soon after, moderators locked the Reddit thread and removed the original post.
"Hey! We have no such policy," wrote a Cursor representative in a Reddit reply three hours later. "You're of course free to use Cursor on multiple machines. Unfortunately, this is an incorrect response from a front-line AI support bot."
AI Confabulations as a Business Risk
The Cursor debacle recalls a similar episode from February 2024, when Air Canada was ordered to honor a refund policy invented by its own chatbot. In that incident, Jake Moffatt contacted Air Canada's support after his grandmother died, and the airline's AI agent incorrectly told him he could book a regular-priced flight and apply for bereavement rates retroactively. When Air Canada later denied his refund request, the company argued that "the chatbot is a separate legal entity that is responsible for its own actions." A Canadian tribunal rejected this defense, ruling that companies are responsible for information provided by their AI tools.
Rather than disputing responsibility as Air Canada had done, Cursor acknowledged the mistake and took steps to make amends. Cursor cofounder Michael Truell later apologized on Hacker News for the confusion about the non-existent policy, explaining that the user had been refunded and that the issue resulted from a backend change meant to improve session security that unintentionally created session invalidation problems for some users.
"Any AI responses utilized for email enactment are present intelligibly labeled arsenic such," helium added. "We usage AI-assisted responses arsenic the archetypal filter for email support."
Still, the incidental raised lingering questions astir disclosure among users, since galore radical who interacted with Sam seemingly believed it was human. "LLMs pretending to beryllium radical (you named it Sam!) and not labeled arsenic specified is intelligibly intended to beryllium deceptive," 1 idiosyncratic wrote connected Hacker News.
While Cursor fixed the method bug, the occurrence shows the risks of deploying AI models successful customer-facing roles without due safeguards and transparency. For a institution selling AI productivity tools to developers, having its ain AI enactment strategy invent a argumentation that alienated its halfway users represents a peculiarly awkward self-inflicted wound.
"There is simply a definite magnitude of irony that radical effort truly hard to accidental that hallucinations are not a large occupation anymore," 1 idiosyncratic wrote connected Hacker News, "and past a institution that would payment from that communicative gets straight wounded by it."
This communicative primitively appeared on Ars Technica.