nickreese | 3 months ago
Especially in extraction tasks. This appears as inventing data or rationalizing around clear roadblocks.
My biggest hack so far is giving them an out named "edge_case" and telling them it is REALLY helpful if they identify edge cases. Simply renaming "fail_closed" or "dead_end" options to "edge_case", with helpful wording, makes Qwen models adhere to their prompts more closely.
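A rough sketch of what that can look like, in Python. Everything here is made up for illustration (the `OUTCOMES` table, the `build_prompt` and `handle_reply` helpers, the JSON reply shape); it is not any particular library's API, and the wording is just one guess at "helpful":

```python
import json

# Hypothetical outcome options offered to the model. The escape hatch is
# named "edge_case" (not "fail_closed" or "dead_end") and framed as useful.
OUTCOMES = {
    "ok": "All requested fields were found in the text.",
    "edge_case": (
        "The text does not cleanly fit the task. Flagging this is REALLY "
        "helpful: describe what is unusual instead of guessing."
    ),
}


def build_prompt(task: str, text: str) -> str:
    """Assemble an extraction prompt that lists the allowed outcomes."""
    options = "\n".join(f'- "{name}": {desc}' for name, desc in OUTCOMES.items())
    return (
        f"{task}\n\n"
        'Reply with JSON: {"outcome": <one option below>, "data": {...}, "note": "..."}\n'
        f"Options:\n{options}\n\n"
        f"Text:\n{text}"
    )


def handle_reply(raw: str) -> dict:
    """Parse the model's JSON reply and route edge cases to review."""
    reply = json.loads(raw)
    outcome = reply.get("outcome")
    if outcome not in OUTCOMES:
        raise ValueError(f"unexpected outcome: {outcome!r}")
    if outcome == "edge_case":
        # Never keep extracted data from a flagged reply.
        return {"needs_review": True, "note": reply.get("note", ""), "data": None}
    return {"needs_review": False, "note": "", "data": reply.get("data", {})}
```

The point is only that the "I can't do this cleanly" path gets a name and description the model is happy to pick, and that the caller treats it as a normal, expected result rather than an error.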
It feels like there are 100s of these small hacks that people have to have discovered... why isn't there a centralized place where people are recording these learnings?
rotexo|3 months ago
nickreese|3 months ago
alach11|3 months ago
nickreese|3 months ago
There's a lot of optimization left there, but the systems are pinned most of the time, so I'm not focused on that at the moment. The GPUs are the issue, not the queuing.