Cr8 | 1 month ago

unfortunately disabling temperature / switching to greedy sampling doesn't necessarily make most LLM inference engines _fully_ deterministic - parallelism and batching can cause floating point error to accumulate differently from run to run. it's possible to make them deterministic, but that does come with a perf hit
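the root cause is that floating point addition isn't associative, so summing the same values in a different order (as a parallel or batched reduction will) can produce a slightly different result. a minimal sketch, with a hypothetical `pairwise_sum` standing in for a parallel tree reduction:

```python
# Floating point addition is not associative: combining the same
# partial sums in a different order can change the final result.
vals = [0.1] * 10

# Sequential left-to-right sum.
left_to_right = 0.0
for v in vals:
    left_to_right += v

def pairwise_sum(xs):
    """Tree-shaped reduction, like a parallel reduction might perform."""
    if len(xs) == 1:
        return xs[0]
    mid = len(xs) // 2
    return pairwise_sum(xs[:mid]) + pairwise_sum(xs[mid:])

tree = pairwise_sum(vals)
# The two orderings disagree in the last bit:
print(left_to_right == tree)  # False
```

in an inference engine the same effect shows up in matmul and attention reductions, where the summation order depends on kernel choice, batch composition, and scheduling.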

some providers _do_ let you set the temperature, including to "zero", but most will not take the perf hit to offer true determinism
