top | item 44217822 (no title) mormegil | 8 months ago Did the testing prompt for LLMs include a clause forbidding the use of any tools? If not, why are you adding it here? discuss order hn newest simonw|8 months ago The way I run the pelican on a bicycle benchmark is to use this exact prompt: Generate an SVG of a pelican riding a bicycle And execute it via the model's API with all default settings, not via their user-facing interface.Currently none of the model APIs enable tools unless you ask them to, so this method excludes the use of additional tools. diggan|8 months ago The models that are being put under the "Pelican" testing don't use a GUI to create SVGs (either via "tools" or anything else), they're all Text Generation models so they exclusively use text for creating the graphics.There are 31 posts listed under "pelican-riding-a-bicycle" in case you wanna inspect the methodology even closer: https://simonwillison.net/tags/pelican-riding-a-bicycle/
simonw|8 months ago The way I run the pelican on a bicycle benchmark is to use this exact prompt: Generate an SVG of a pelican riding a bicycle And execute it via the model's API with all default settings, not via their user-facing interface.Currently none of the model APIs enable tools unless you ask them to, so this method excludes the use of additional tools.
diggan|8 months ago The models that are being put under the "Pelican" testing don't use a GUI to create SVGs (either via "tools" or anything else), they're all Text Generation models so they exclusively use text for creating the graphics.There are 31 posts listed under "pelican-riding-a-bicycle" in case you wanna inspect the methodology even closer: https://simonwillison.net/tags/pelican-riding-a-bicycle/
simonw|8 months ago
Currently none of the model APIs enable tools unless you ask them to, so this method excludes the use of additional tools.
diggan|8 months ago
There are 31 posts listed under "pelican-riding-a-bicycle" in case you wanna inspect the methodology even closer: https://simonwillison.net/tags/pelican-riding-a-bicycle/