This is something we've been grappling with on my team. Many of the researchers in the org want to try all these reasoning techniques to increase performance, and my team keeps pushing back that we don't actually need that extra performance; we just want to decrease latency and cost.
iinnPP|1 year ago