Some day in the near future, the marketing department will wonder why so many people were curious about all of their products after browsing Organic Ground Beef[0]
I’ve always held on to my grocery receipts for the last 8 years. I took pictures of them but when I moved a box got water damaged so now I only have like the last 3ish years.
Is there any open source software that I can use to transfer these receipts into a useful csv?
I have an idea for a few interesting data visualizations, as I’d often buy the same things every week. My grocery bill went from about $70 to $150 without much change in what I buy, as far as I can tell.
I have attempted this, and the biggest issue is that receipts sometimes use codes that are hard to understand. The codes also change from store to store.
If you're lucky, you won't need to go to a grocery store and determine what a code means, you will only need to map the code to an actual item you bought.
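A minimal sketch of that code-to-item mapping, keyed by store because the same item can carry different codes at different stores. The store names, codes, and item names here are made up for illustration.

```python
# Hypothetical lookup table: (store, receipt code) -> human-readable item.
CODE_MAP = {
    ("store_a", "ORG BNNA"): "Organic bananas",
    ("store_b", "BNNA ORG LB"): "Organic bananas",
    ("store_a", "GRND BF 85"): "Ground beef",
}


def resolve(store, code):
    """Return the item name for a receipt code, or None if it is unmapped."""
    return CODE_MAP.get((store, code.strip().upper()))


def unmapped(store, codes):
    """List the codes that still need a manual lookup."""
    return [c for c in codes if resolve(store, c) is None]


todo = unmapped("store_a", ["org bnna", "XYZ 123"])
```

Anything left in `todo` is a code you would still have to work out by hand.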
I do the same with WholeFoods receipts. The pipeline is:
1. Scan to FTP dir to TIFF
2. Nightly job submits image(s) from the dir to Veryfi (their free tier is enough and they look like the best for receipt OCR)
3. Save that raw JSON, enhanced JSON (fix occasional mis-attribution for discounts, calculate unit price), and CSV. Filename is a purchase date - correctly extracted by Veryfi.
4. Render with bash + gnuplot.
5. TODO: store into some DB and render with Grafana or something.
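Step 3 above could look roughly like the sketch below. The receipt dictionary is a made-up shape for illustration, not Veryfi's actual response schema, and the field names (`line_items`, `discount`, and so on) are assumptions.

```python
import csv
import io

FIELDS = ["date", "description", "quantity", "net_total", "unit_price"]


def enhance(receipt):
    """Apply per-item discounts and add a unit price to each line item."""
    rows = []
    for item in receipt["line_items"]:
        net = item["total"] - item.get("discount", 0.0)
        qty = item.get("quantity") or 1  # treat a missing quantity as 1
        rows.append({
            "date": receipt["date"],
            "description": item["description"],
            "quantity": qty,
            "net_total": round(net, 2),
            "unit_price": round(net / qty, 2),
        })
    return rows


def to_csv(rows):
    """Serialize the enhanced rows to CSV text."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


# Hypothetical OCR output for one receipt.
receipt = {
    "date": "2024-02-10",
    "line_items": [
        {"description": "ORG BANANAS", "quantity": 2, "total": 1.58},
        {"description": "GROUND BEEF", "quantity": 1, "total": 7.99, "discount": 1.00},
    ],
}
rows = enhance(receipt)
csv_text = to_csv(rows)
```

Writing `csv_text` to a file named after `receipt["date"]` would match the filename convention described in step 3.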
Given that these prices are going into a database, I was hoping that you could click on an item and get its pricing over time (not a referral link to the Trader Joe's website).
Open Food Facts recently launched Open Prices (https://prices.openfoodfacts.org/). It's currently crowd-sourced rather than crawled automatically, but prices are localized in space and time, which could lead to interesting results.
This will lead to an open database of food product prices.
It's a neat idea, but I think you need some automation to make it useful over a long period of time. There's a website to track gas prices, and they just change too much to keep updated.
Localized pricing just adds noise to the data. Intermittent updates, combined with the possibility of input error, also create issues.
Retailers that utilize price zones typically have the baseline price that drives the prices for each price zone (e.g. set prices in California to be 10% more than baseline). Getting the baseline price in an automated way is the ideal solution.
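As a rough illustration of the price-zone idea: each zone's shelf price is the baseline times a zone multiplier, so an observed zone price can be mapped back to an approximate baseline. The zone names and the 10% figure follow the example in the comment; everything else is an assumption, and real retailers may round or override prices differently.

```python
from decimal import Decimal, ROUND_HALF_UP

# Hypothetical zone multipliers; "california" uses the 10% example above.
ZONE_MULTIPLIERS = {
    "baseline": Decimal("1.00"),
    "california": Decimal("1.10"),
}

CENT = Decimal("0.01")


def zone_price(baseline, zone):
    """Price shown in a zone, derived from the baseline price."""
    price = Decimal(baseline) * ZONE_MULTIPLIERS[zone]
    return price.quantize(CENT, rounding=ROUND_HALF_UP)


def implied_baseline(observed, zone):
    """Approximate baseline price recovered from an observed zone price."""
    price = Decimal(observed) / ZONE_MULTIPLIERS[zone]
    return price.quantize(CENT, rounding=ROUND_HALF_UP)


ca_price = zone_price("3.99", "california")
base = implied_baseline("4.39", "california")
```

Because of rounding to the cent, the recovered baseline is only an estimate, which is one reason to collect the baseline directly when it is available.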
Pleasantly surprised to see about as many price reductions as price increases. Also funny that they marked up the price of roses ahead of Valentine's Day!
This caught me completely by surprise. I never would have imagined that a grocery chain, even Trader Joe's, would have a publicly-accessible API endpoint for its current product catalog.
I wonder how the author even found the endpoint! Is this at risk of Trader Joe's noticing and moving to require an auth token? I can't imagine the openness being intentional, at least outside that brief period of techno-optimism in the late 2000s where it seemed like everyone was offering data sources to build into your Twitter bot.
Possible secondary application: could this GraphQL endpoint potentially be used to determine when a product is being discontinued?
Is this for one store or all stores? It's commendable that you posted your code, but a minimal README would be appreciated.
Looking through your code, I see that the default store is Chicago South Loop (701). This would be helpful information to include on the website displaying the results.
Yes, you make a good point. Although I suspect there may be regional differences in price, I haven't yet run the diff on that. Should be simple enough for me to allow the user to select their regional store location.
If you don't mind sharing, how did you find their API? I don't understand GraphQL that well and I've been trying to play with https://www.traderjoes.com/api/graphql to no avail. Cool project, GitHub star achieved.
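For anyone else new to GraphQL: a request is typically an HTTP POST whose JSON body carries a `query` string and a `variables` object. The sketch below only builds such a request and does not send it. The query itself is hypothetical; the operation, field, and argument names are invented and are not taken from the site's real schema, which you would need to read from the project's source or your browser's network tab.

```python
import json
import urllib.request

ENDPOINT = "https://www.traderjoes.com/api/graphql"  # URL quoted in the comment

# Hypothetical query: every name in it is an assumption for illustration.
QUERY = """
query Products($storeCode: String!, $page: Int!) {
  products(storeCode: $storeCode, page: $page) {
    sku
    name
    price
  }
}
"""


def build_request(store_code, page=1):
    """Build (but do not send) a GraphQL POST request."""
    body = json.dumps({
        "query": QUERY,
        "variables": {"storeCode": store_code, "page": page},
    }).encode("utf-8")
    return urllib.request.Request(
        ENDPOINT,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_request("701")  # 701 is the store code mentioned in the thread
payload = json.loads(request.data)
```

Passing `request` to `urllib.request.urlopen` would send it; with the made-up query above the server would most likely return a GraphQL error rather than data.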
YES! thank you for doing this, I have been curious about some of the stuff I buy often and I felt like over the last year or two, things have climbed in prices a lot compared to the normal inflation price hikes.
Open nodes at stores could help BLS & commercial orgs doing regular price data gathering. I work on supplier pricing dynamics and for commoditized products freeing your pricing data is powerful.
This is cool, but with the apparently random ordering it's just about impossible to find anything. Add sort buttons? A description of why this ordering is used?
This is treading close to HN's "please no political battle" rule, but I'll try to keep this as neutral as possible.
You are suggesting these are stores people should avoid for labor-unfriendly practices. Assuming that people are aligned with you on that value set, I don't think a sarcastic comment without actual recommendations, on a post about a tangentially related project, is going to move any needles.
And so, assuming the mods let us live: What are your recommended alternatives to TJs? TJs is, for better or worse, one of the few grocery store options with both decent quality and not entirely ludicrous prices. This project makes it even more enticing, because you can finally look up prices directly and see price history.
If there's another store that offers all that, I think a lot of people would be interested, independent of "why".
Didn't work at Trader Joes, but at another grocery chain for a while as a store buyer. I had essentially no control over pricing unless we had a bunch of backstock we had to move quick to avoid expiration. In those cases we had some level of store-level autonomy to "price to move." That being said, it was heavily tracked and if anyone was doing it too much I'm sure there would be consequences of some sort.
Besides that, we'd get updates from corporate with a list of new price tags we'd print out any time they changed something (100% with regional fluctuation baked in, but not at a store level).
I think there are enough grocery apps that it is better to show price data than not. Though I'm certainly basing that on how I shop rather than a general economics / game theory POV.
I regularly use the Ralphs app to cross shop while I'm at Trader Joes. They are only 2 minutes walking apart, so I normally start at TJs. However, sometimes I end up at Ralphs first and now having this data it could lead to an unplanned trip to TJs to save a few bucks.
scarletphoenix|2 years ago
[0]: https://github.com/cmoog/traderjoes/blob/ea2da58a84d3a04e28f...
cmoog|2 years ago
m0rbz|2 years ago
[deleted]
azemetre|2 years ago
Would be cool to put it out in the public.
graphe|2 years ago
Nextcloud also has OCR. You can use a scanner with either.
Avoid touching the receipts. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5453537/
m_0x|2 years ago
DreamGen|2 years ago
Or use JSON mode with the API.
transitionnel|2 years ago
Huge user base right off the bat.
Neileeso44|2 years ago
Edit: formatting
alberto180|2 years ago
[deleted]
Ir0nMan|2 years ago
Y_Y|2 years ago
cmoog|2 years ago
pphysch|2 years ago
ChuckMcM|2 years ago
bobchadwick|2 years ago
rexf|2 years ago
(I shop their stores, but I don't use their website at all.)
manuelleduc|2 years ago
matthewbauer|2 years ago
internet101010|2 years ago
bthallplz|2 years ago
spondylosaurus|2 years ago
brianwawok|2 years ago
samschooler|2 years ago
orev|2 years ago
Simple supply/demand.
swyx|2 years ago
one of the few notable production gql users?
mortenjorck|2 years ago
inferiorhuman|2 years ago
orange_county|2 years ago
Also won’t prices differ by location? So many questions.
cmoog|2 years ago
Discussion of regional price differences in other comments.
ElijahLynn|2 years ago
marsissippi|2 years ago
https://www.julyp.com/shared-widget/018d8a17-fc96-72b3-806f-...
clumsysmurf|2 years ago
rnadomvirlabe|2 years ago
crazygringo|2 years ago
So that shouldn't be an issue.
cmoog|2 years ago
cnees|2 years ago
jon_adler|2 years ago
dgrin91|2 years ago
cmoog|2 years ago
xur17|2 years ago
It would be neat to be able to click on a product and see a price history graph as well (since it seems like you should have this data your db).
cmoog|2 years ago
pazimzadeh|2 years ago
But where are the cornichons?
packjc|2 years ago
nostromo|2 years ago
sva_|2 years ago
jenningsjason|2 years ago
https://www.harpercollinsleadership.com/9781400225422/becomi...
matthewbauer|2 years ago
_ea1k|2 years ago
cmoog|2 years ago
RSHEPP|2 years ago
e40|2 years ago
bordercases|2 years ago
muhammadusman|2 years ago
faramarz|2 years ago
Really cool, and gives you POV like you’d get with Uber Surge.
Is this how retailers hedge and make up for losses elsewhere?
I wish we could also know tracked wholesale price.
ctrlGsysop|2 years ago
dtgriscom|2 years ago
camhart|2 years ago
borbtactics|2 years ago
erikig|2 years ago
__mharrison__|2 years ago
unknown|2 years ago
[deleted]
rconti|2 years ago
crazygringo|2 years ago
cmoog|2 years ago
unknown|2 years ago
[deleted]
thaumasiotes|2 years ago
unknown|2 years ago
[deleted]
moxplod|2 years ago
wizerno|2 years ago
max4c|2 years ago
chx|2 years ago
Yes, please, continue to shop at Trader Joe's and subscribe to Spotify. Please.
groby_b|2 years ago
timcobb|2 years ago
cebu|2 years ago
sharkweek|2 years ago
hmcq6|2 years ago
tony_cannistra|2 years ago
obmelvin|2 years ago
gigatexal|2 years ago
yoyopa|2 years ago
mandeepj|2 years ago
lom|2 years ago
ElijahLynn|2 years ago
Now do all the grocery stores!
traderjoes1fan|2 years ago
[deleted]
blast|2 years ago
inamberclad|2 years ago