thomashop | 1 year ago
- Informally benchmarked against 4 specific competitors: Gemini, OpenAI, o3, and Claude
- Identified two concrete features: URL content ingestion and integrated search
- Noted specific limitations: search engine occasionally misses key resources
- Provided a real-world test case: consulting business analysis where it found new opportunities other models missed
joaohaas | 1 year ago
- Informal Benchmarks: I'm sorry, what? He mentions 'It’s picking up on nuances—and even uncovering entirely new angles—that other models have overlooked' and 'identified an entirely new sphere of possibility that I hadn’t seen nor had any of the other top models'. Not only is that complete horseshit by itself, it doesn't benchmark in any way, shape, or form against the mentioned competitors. It's the exact stuff I'd expect out of an LLM.
- Real-World Test Case: As mentioned above, complete horseshit.
- 2 Concrete Features: Yes, I mentioned URLs in the input. I didn't count 'Integrated Search' (which I'm assuming means searching the web for up-to-date data) because AFAIK it's already more or less a staple in LLM products, and his only remark about it is that it is 'solid but misses sometimes'.