BigMuffN69

joined 3 months ago
[–] BigMuffN69@awful.systems 7 points 1 day ago

I'm reporting live from the Portland war front, brothers. We are running dangerously low on our femboy furry rations. I fear we will not sustain our goon sesh through the winter months 😔

[–] BigMuffN69@awful.systems 13 points 2 days ago (2 children)

Last week on, "Please do not build the torment nexus"

No, it never works out in the movies, I mean somehow these poor delusional humans convince themselves that they can control the kill bots...

[–] BigMuffN69@awful.systems 7 points 5 days ago (2 children)

It almost surely has been done, but dare I ask where one can find trashy romance a la “pegged by the basilisk”

[–] BigMuffN69@awful.systems 12 points 6 days ago (5 children)

Gork is this true?

[–] BigMuffN69@awful.systems 9 points 1 week ago (3 children)

You'd think he would maybe, idk, search around to see if this was a known formula before making such a bombastic statement…

[–] BigMuffN69@awful.systems 14 points 1 week ago

Oh god, he unironically recommends reading the sequences wtf 🤢🤮

[–] BigMuffN69@awful.systems 10 points 1 week ago* (last edited 1 week ago)

Great response^

I think Julian is going to be mildly surprised that METR’s chart keeps going up and yet has relatively little effect on the majority of SWE roles.

At the same time, he did create AlphaZero so he has a big old noggin! I wonder, after his success at Go, was he swept up in the mania that we would quickly translate that success to create super duper AI?

[–] BigMuffN69@awful.systems 10 points 1 week ago* (last edited 1 week ago) (1 children)

Links to the METR tasks w/ massive error bars at 50% level lmaou.

Someone in the comments rightly points out the comparison with covid isn’t apt. With covid, the underlying mechanism itself caused the exponential spread.

With LLMs the exponential trend is being caused by exponentially spending money and a healthy dose of targeting benchmarks, which is why people are calling the top. The money literally doesn’t exist for this shit to go on so you can create your 50% accurate mechanical turk.

Edit: idk the more I think about this the more it irks me. Like if I was allowed to pick and choose benchmarks that agree with my biases I would post something like this…

… and claim model performance is actually getting worse over time.

https://xcancel.com/sayashk/status/1966144670561612202#m

[–] BigMuffN69@awful.systems 10 points 1 week ago (5 children)

https://scottaaronson.blog/?p=9183

Quantum scoot is quantum spooked 😱 after GPT-5 manages to solve a subproblem for him (after multiple attempts), thanks the powers that be for his tenure!

… even though GPT-5 probably generates the answer via websearch

[–] BigMuffN69@awful.systems 4 points 2 weeks ago

Nice result, not too shocking after IMO performance. A friend of mine told me that this particular competition is highly time constrained for human competitors, i.e., questions aren’t impossibly difficult per se, but some are time sinks that you simply avoid to get points elsewhere. (5 hours on 12 Qs is tight…)

So when you are competing against a data center using a nuclear reactor vs 3 humans running on broccoli, the claims of superhuman performance definitely require an * attached to them.

[–] BigMuffN69@awful.systems 5 points 2 weeks ago

X and coinbase on this list lmfaou what a joke

[–] BigMuffN69@awful.systems 9 points 3 weeks ago

Hecate left no crumbs
