item 46207639 (no title)

willahmad | 2 months ago
I think this benchmark could be slightly misleading for assessing a coding model. But still a very good result. Yes, SVG is code, but not in the sense of an executable with verifiable inputs and outputs.

jstummbillig | 2 months ago
I love that we are earnestly contemplating the merits of the pelican benchmark. What a timeline.

andrepd | 2 months ago
It's not even halfway up the list of inane things of the AI hype cycle.

hdjrudni | 2 months ago
But it does have a verifiable output, no more or less than HTML+CSS. Not sure what you mean by "input" -- it's not a function that takes in parameters if that's what you're getting at, but not every app does.