For large contexts (up to 100K tokens in some cases), we found that GPT-5:

a) has worse instruction following and doesn't adhere to the system prompt
b) produces very long answers, which resulted in a bad UX
c) has a 125K context window, so extreme cases resulted in an error
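On point c), one mitigation is budgeting tokens before the call rather than letting the request fail. A minimal sketch, assuming a rough ~4 characters/token heuristic (a real implementation should use the model's own tokenizer) and a hypothetical reserve for the response; `MAX_CONTEXT_TOKENS` and `RESPONSE_BUDGET` are illustrative values, not API constants:

```python
# Guard against context-window overflow before sending a request.
# Assumes ~4 characters per token, a common rough heuristic for
# English text; swap in the model's tokenizer for accurate counts.

MAX_CONTEXT_TOKENS = 125_000   # window size cited above
RESPONSE_BUDGET = 4_000        # assumed tokens reserved for the reply

def estimate_tokens(text: str) -> int:
    """Rough token estimate: ~4 characters per token."""
    return max(1, len(text) // 4)

def truncate_to_budget(prompt: str, context: str) -> str:
    """Trim retrieved context so prompt + context fit the window."""
    budget = MAX_CONTEXT_TOKENS - RESPONSE_BUDGET - estimate_tokens(prompt)
    allowed_chars = budget * 4
    if len(context) <= allowed_chars:
        return context
    return context[:allowed_chars]

if __name__ == "__main__":
    ctx = "x" * 1_000_000  # ~250K estimated tokens, well over the window
    trimmed = truncate_to_budget("Summarize:", ctx)
    total = estimate_tokens("Summarize:") + estimate_tokens(trimmed)
    print(total <= MAX_CONTEXT_TOKENS - RESPONSE_BUDGET)  # True
```

Truncating from the tail is the crudest policy; dropping the oldest retrieved chunks or re-ranking by relevance usually degrades answers less.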