I have two theories on how the modelfarmers (I like that slang, it seems more fitting than "devs" or "programmers") approached this...
1. Like you theorized, they noticed people doing lots of logic tests, including twists on standard logic tests that the LLMs were failing hard on, so they "generated" (i.e. paid temp workers to write) a bunch of twists on standard logic tests. And here we are, with it able to solve a twist on the duck puzzle, but not really better in general.
2. There has been a lot of talk of synthetically generated data sets (since they've already robbed the internet of all the text they could). Simple logic puzzles could actually be procedurally generated, including the notation diz noted. The modelfarmers have over-generalized the "bitter lesson" (or maybe they're just lazy/uninspired/looking for a simple solution they can tell the VCs and business majors) and think just some more data, a deeper network, more parameters, and more training will solve anything. So you get the buggy attempt at logic notation from training on synthetically generated logic notation. (Which still doesn't quite work, lol.)
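To make the "procedurally generated" point concrete, here is a toy sketch of what generating puzzles plus notation could look like. It is purely illustrative and assumes nothing about what any lab actually does: the puzzle format (chains of "X -> Y" implications) and the names `make_puzzle`, `solve`, and `render` are all made up for this example.

```python
import random

def make_puzzle(rng, n_vars=5, n_rules=4):
    """Build one implication puzzle: rules (X, Y) meaning X -> Y, a fact, a query."""
    names = [chr(ord("A") + i) for i in range(n_vars)]
    rules = set()
    while len(rules) < n_rules:
        a, b = rng.sample(names, 2)  # two distinct variables
        rules.add((a, b))
    fact, query = rng.sample(names, 2)
    return sorted(rules), fact, query

def solve(rules, fact, query):
    """Forward-chain from the known fact; True if the query is derivable."""
    known = {fact}
    changed = True
    while changed:
        changed = False
        for a, b in rules:
            if a in known and b not in known:
                known.add(b)
                changed = True
    return query in known

def render(rules, fact, query):
    """Write the puzzle out in a simple arrow notation."""
    notation = ", ".join(f"{a} -> {b}" for a, b in rules)
    return f"Given {notation} and {fact}, does {query} follow?"

if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so runs are repeatable
    for _ in range(3):
        rules, fact, query = make_puzzle(rng)
        print(render(rules, fact, query), "->", solve(rules, fact, query))
```

Cranking out millions of these with correct labels is trivial, which is sort of the point: a generator like this gives you endless instances of one narrow pattern, not the ability to reason about puzzles in general.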
I don't think either of these approaches will actually work for letting LLMs solve logic puzzles in general. They will just solve individual cases (for approach 1) and make the hallucinations more convincing (for approach 2). For all their talk of reaching AGI... the approaches the modelfarmers are taking suggest a mindset of just reaching the next benchmark (to win more VC money, and maybe market share?) and not of creating anything genuinely reliable, much less "AGI". (I'm actually on the far optimistic end of sneerclub in that I think something useful might be invented that outlasts the coming AI winter... but if the modelfarmers just keep scaling and throwing more data at the problem, I doubt they'll even manage that much.)
You don't think the coming crash is going to drive compute costs down? I think the VC money for training runs drying up could drive down costs substantially... but maybe the crash hits other aspects of the supply chain and the cost of GPUs and compute goes back up.
Yeah this shit grates so much. Copyright is so often a tool of capital to extract rent from other people's labor.