this post was submitted on 24 Oct 2025
201 points (100.0% liked)


Relatively new arXiv preprint that got featured in Nature News; I slightly adjusted the title to be less technical. The study was conducted using aggregated online Q&A... one of the funnier sources being 2,000 popular questions from r/AmITheAsshole that were rated YTA by the most upvoted response. The study seems robust, and they even ran trials with several hundred real human participants.

A separate preprint measured sycophancy across various LLMs in a math-competition context (https://arxiv.org/pdf/2510.04721); apparently GPT-5 was the least sycophantic (+29.0) and DeepSeek-V3.1 the most (+70.2).

The Nature News report (which I find a bit too biased towards the researchers): https://www.nature.com/articles/d41586-025-03390-0

Scubus@sh.itjust.works 2 points 1 day ago

That was my original frustration with the LLMs. It was pointless to use them to bounce ideas off of, because they would assume you were right when you were just stating a theory. I don't actively use any of the LLMs, but I believe Google uses Gemini, and it seems more willing to tell me I'm wrong these days. I was using it to better my understanding of superconductors and bouncing some theories off of it, and it seemed very determined to stick to proven physics. More specifically, I was trying to run some theories on emulating Cooper pairs at higher temperatures, and it was having none of it. Definitely an improvement over how they used to be.