If you train one of the larger models on these specific problems (i.e., DM for D&D problems), it will probably surprise you. The larger models are great at generic text production, but when fine-tuned to emulate a specific person or task they're surprisingly good.
mitthrowaway2|1 year ago
fluoridation|1 year ago
dartos|1 year ago
But they still fail at things like puzzles.