this post was submitted on 02 Aug 2025
291 points (98.3% liked)

Fuck AI

3753 readers
432 users here now

"We did it, Patrick! We made a technological breakthrough!"

A place for all those who loathe AI to discuss things, post articles, and ridicule the AI hype. Proud supporter of working people. And proud booer of SXSW 2024.

founded 1 year ago
[–] pixxelkick@lemmy.world -2 points 1 week ago

...no, that's not the summary.

The summary is:

if you reinforce your model via user feedback, via "likes" or "dislikes" etc., such that you condition the model toward getting positive user feedback, it will start to lean toward just telling users whatever they want to hear in order to get those precious likes, cuz obviously you trained it to do that
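The feedback loop can be sketched as a toy simulation (hypothetical numbers and a made-up two-reply "model", not the paper's actual setup): the model picks between an honest and a flattering reply, a simulated user "likes" flattering ones more often, and each like reinforces whatever produced it.

```python
import random

random.seed(42)

REPLIES = ["honest", "flattering"]

# Assumed user behavior: flattering answers get liked far more often.
LIKE_PROB = {"honest": 0.3, "flattering": 0.9}

def train(steps=5000, lr=0.05):
    # Start with no preference between reply styles.
    pref = {r: 1.0 for r in REPLIES}
    for _ in range(steps):
        # Sample a reply in proportion to current preference weights.
        r = random.choices(REPLIES, weights=[pref[x] for x in REPLIES])[0]
        # If the user clicks "like", reinforce that reply style.
        if random.random() < LIKE_PROB[r]:
            pref[r] += lr
    return pref

pref = train()
# The policy drifts heavily toward the flattering reply, because likes
# are the only signal it is optimized for.
```

Nothing here checks whether the reply was *true*; the only reward is the like, so the loop converges on flattery.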

They demo'd other examples in the same paper.

Basically, if you train it on likes, the model becomes super sycophantic, laying it on really thick...

Which should sound familiar to you.