top | item 46329828

tsazan | 2 months ago

A CSV is a dump of facts. CommerceTXT is a layer of intent and logic. If you give an AI a giant CSV of your whole inventory, you blow the context window before the conversation even starts. If you serve a CSV per product, you still pay for headers and commas without getting any behavioral control.

Our spec handles this via @SEMANTIC_LOGIC and @BRAND_VOICE. It’s about how the AI represents your brand, not just the raw numbers.

Regarding bs4: mapping HTML to a thousand different store layouts is exactly what we are trying to escape. That is the 'fragility tax'. We are proposing a deterministic fast-lane that bypasses the need for custom scrapers for every single store.

You don't want the AI to 'guess' your data. You want it to 'know' your data.
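
For a rough picture of what consuming such directives could look like, here is a minimal sketch. The directive names @BRAND_VOICE and @SEMANTIC_LOGIC are the ones mentioned above; everything else (the "Key: Value" line syntax, the field names, the sample values, and the `parse_directives` helper) is assumed for illustration and is not taken from the actual spec.

```python
from collections import defaultdict

# Hypothetical sample. Only the two directive names come from the comment
# above; the "Key: Value" syntax and every field below are assumptions.
SAMPLE = """\
@BRAND_VOICE
Tone: friendly
Avoid: hype

@SEMANTIC_LOGIC
Rule: never quote a price without a currency
Rule: recommend in-stock items first
"""


def parse_directives(text: str) -> dict:
    """Group 'Key: Value' lines under the most recent '@DIRECTIVE' header."""
    sections = defaultdict(list)
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue  # blank lines only separate sections
        if line.startswith("@"):
            current = line[1:]
            sections[current]  # register the section even if it stays empty
            continue
        if current is None or ":" not in line:
            # Fail loudly instead of guessing what a malformed line means.
            raise ValueError(f"unexpected line: {raw!r}")
        key, value = line.split(":", 1)
        sections[current].append((key.strip(), value.strip()))
    return dict(sections)


parsed = parse_directives(SAMPLE)
for name, entries in parsed.items():
    print(name, entries)
```

The point of the sketch is the failure mode: a malformed line raises an error rather than being interpreted, which is the "know, don't guess" behavior argued for above.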

dehugger|2 months ago

the entire point of the system I described is that it never needs to load that data into context.

AI is excellent at mapping from one format to another.

I use this method to great effect.

tsazan|2 months ago

The mapping approach assumes the web is static. In reality, you're building a 'maintenance debt' machine. For every 1,000 stores, you need 1,000 AI-generated mappings that break whenever a dev changes a CSS class.

CommerceTXT isn't just about extraction; it's about contract-based delivery. We are moving from 'Guessing through Scraping' to 'Knowing through Protocol'. You're optimizing the process of scraping; we are eliminating the need for it.
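
To make the breakage concrete, here is a small sketch using only Python's standard-library `html.parser`. The HTML snippets, the class names `price` and `price-v2`, and the `extract_by_class` helper are all made up for illustration; a real scraper (bs4 or otherwise) would be more elaborate, but it depends on the markup in the same way.

```python
from html.parser import HTMLParser


class ClassTextExtractor(HTMLParser):
    """Collect the text of the first element whose class list contains `target`.

    Sketch only: it does not handle void elements such as <br> nested
    inside the matched element.
    """

    def __init__(self, target):
        super().__init__()
        self.target = target
        self.depth = 0      # > 0 while we are inside the matched element
        self.text = None    # set once the matched element closes
        self._buf = []

    def handle_starttag(self, tag, attrs):
        if self.depth:
            self.depth += 1
        elif self.text is None:
            classes = (dict(attrs).get("class") or "").split()
            if self.target in classes:
                self.depth = 1

    def handle_endtag(self, tag):
        if self.depth:
            self.depth -= 1
            if self.depth == 0:
                self.text = "".join(self._buf).strip()

    def handle_data(self, data):
        if self.depth:
            self._buf.append(data)


def extract_by_class(html, css_class):
    """Return the text of the first element with `css_class`, or None."""
    parser = ClassTextExtractor(css_class)
    parser.feed(html)
    parser.close()
    return parser.text


# Hypothetical store markup before and after a front-end refactor.
BEFORE = '<div class="product"><span class="price">$19.99</span></div>'
AFTER = '<div class="product"><span class="price-v2">$19.99</span></div>'

print(extract_by_class(BEFORE, "price"))  # finds the price
print(extract_by_class(AFTER, "price"))   # finds nothing after the rename
```

Nothing about the product changed between the two snippets, yet the mapping silently returns nothing after the rename. Multiply that by every store and every redesign.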

IgorPartola|2 months ago

Meh. I would rather just have the ability to query any given product catalog in a machine-readable way. Any tool or protocol specifically designed for an LLM to consume is, in my opinion, a design smell. We should instead design proper APIs and protocols usable by all kinds of programs, and the LLMs can adapt.

You are also solving a business problem with a technical solution. Shopify recently announced that they will open up their entire catalog via an easy-to-use API to a select few enterprise partners. Amazon is doing a similar thing. This is because they do not want you and me to have the ability to programmatically query their catalog. They want to extract money out of specific partners who are trying to enshittify AI chat apps by throwing tons of ads in there. The big movers in the industry could easily have adopted a similar standard already, but they are deliberately choosing not to. On top of the technical issues other commenters are pointing out, I don’t see why this should be in use at all.

tsazan|2 months ago

You’ve identified the exact tension we are navigating.

I support platforms like Shopify and Wix because they empower 80% of independent merchants to exist online. But I oppose their move toward 'enterprise-only' data silos. When Shopify gates their catalog API for a few select partners, they aren't protecting the merchant. They are protecting their own rent-seeking position.

CommerceTXT is a way for a merchant on any platform to say: 'My data is mine, and I want it to be discoverable by any agent, not just the ones who paid the platform's entry fee'.

Regarding 'design smell': Every major shift in computing has required specialized protocols. We didn't use Gopher for the web, and we shouldn't use 2010-era REST APIs for 2025-era LLMs. Models have unique constraints (token costs and hallucination risks) that traditional APIs simply weren't built to handle.

We aren't building for the gatekeepers. We are building for the open commons.