I just list all of the plants sold at any of the stores I currently integrate with. I use a scientific taxonomic name database to make sure that the binomial names (genus, species) are legit and not misspelled, and all of my listings are anchored to that. I match products to binomial names through a process that uses a really complex regex, plus some manual labeling. Experimenting with using more ML here.
Price/availability data is updated on a nightly basis. Probably want to increase that frequency, though no one has complained about stale data yet, so not high priority.
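For anyone curious what the regex-plus-taxonomy matching could look like, here is a minimal sketch, not the actual pipeline. The `KNOWN_BINOMIALS` set is a hypothetical stand-in for the taxonomic name database, and the pattern (capitalized genus followed by a lowercase species epithet) is an assumption that is far simpler than the real regex.

```python
import re

# Hypothetical stand-in for a taxonomic name database: accepted binomials, lowercased.
KNOWN_BINOMIALS = {
    "monstera deliciosa",
    "ficus lyrata",
    "epipremnum aureum",
}

# Assumed candidate pattern: capitalized genus, then a lowercase species epithet.
BINOMIAL_RE = re.compile(r"\b([A-Z][a-z]+)\s+([a-z]{2,})\b")


def match_binomial(title):
    """Return the first binomial in a product title that the name set accepts, else None."""
    for genus, species in BINOMIAL_RE.findall(title):
        candidate = f"{genus} {species}".lower()
        # Only anchor a listing to names the taxonomy source knows about.
        if candidate in KNOWN_BINOMIALS:
            return candidate.capitalize()
    return None


if __name__ == "__main__":
    for title in ["Large Ficus lyrata in 10in Pot", "Fiddle Leaf Fig", "Mystery plant"]:
        print(title, "->", match_binomial(title))
```

Titles that only use a common name (like "Fiddle Leaf Fig") fall through to `None`, which is presumably where manual labeling or ML would pick up.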
Getting some affiliate relationships set up right now! Already have two added in the past week.
That extra / is mysterious... I will dive into that after work.
vidyesh|1 year ago
I am quite surprised that Shopify doesn't have hotlink protection for images!
>I match products to binomial names through a process that uses a really complex regex, plus some manual labeling. Experimenting with using more ML here.
That's what I was wondering. I am building something similar, region-specific, for books, and sometimes the names are just a little off, partial, or alternate names. I am currently doing a string comparison to match at least 80-90% of the words in the title, which works okay for now. So thank you for the ideas.
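Roughly, the word-overlap comparison described above could be sketched like this. The function names and the tokenization (lowercase, alphanumeric runs only) are assumptions for illustration, and 0.8 is just the lower end of the 80-90% range.

```python
import re


def title_words(title):
    """Lowercase a title and split it into a set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", title.lower()))


def word_overlap(query, candidate):
    """Fraction of the query title's words that also appear in the candidate title."""
    query_words = title_words(query)
    if not query_words:
        return 0.0
    return len(query_words & title_words(candidate)) / len(query_words)


def is_same_book(query, candidate, threshold=0.8):
    """Treat two titles as the same book when enough of the query's words match."""
    return word_overlap(query, candidate) >= threshold


if __name__ == "__main__":
    print(is_same_book("The Pragmatic Programmer",
                       "The Pragmatic Programmer: 20th Anniversary Edition"))
```

Note that the score is asymmetric: a short query inside a long candidate scores high, but not the other way around, so which title you treat as the query matters.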
Your product update frequency is very interesting. I always thought scraping for price aggregation meant one has to make sure it's very frequently updated. My approach is a bit different: it only scrapes on search, so it's not really scraping all the sites. Not the best approach, but it's scary to me to scrape complete websites and that much data lol. I am currently not using a db either, just scraping and caching that specific item for 30 mins, which now that I think about it is a bad idea if I want to make this a scalable project. I should start using a database indeed.
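The scrape-on-search setup with a 30 minute cache could look something like the sketch below. `TTLCache` and the `fetch` callback are hypothetical names, and the whole thing lives in process memory, so entries vanish on restart and are not shared between workers, which is part of why a database is the better long-term fit.

```python
import time


class TTLCache:
    """Tiny in-memory cache that re-fetches an item once its entry is older than the TTL."""

    def __init__(self, ttl_seconds=30 * 60, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock  # injectable so expiry can be tested without sleeping
        self._store = {}    # key -> (time stored, value)

    def get_or_fetch(self, key, fetch):
        """Return the cached value for key, calling fetch(key) only when missing or stale."""
        now = self.clock()
        entry = self._store.get(key)
        if entry is not None and now - entry[0] < self.ttl:
            return entry[1]
        value = fetch(key)
        self._store[key] = (now, value)
        return value


if __name__ == "__main__":
    cache = TTLCache()

    def fake_scrape(query):
        # Placeholder for the real scraper; returns made-up data for illustration.
        return {"query": query, "price": None}

    print(cache.get_or_fetch("some book title", fake_scrape))
```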
Some feedback on the UI/UX: instead of having 'All plants' selected on the homepage, it would be nice to have a smaller grid of plants from each type/tag on the homepage itself. Selecting any of the tags would work the same as now, but the homepage would have more to explore, because currently it's just overwhelming to do anything on the homepage. I am just looking in specific tags or just searching.
Edit: This is a great resource for adding more info about pet friendly plants to the listed plants. https://www.aspca.org/pet-care/animal-poison-control/toxic-a...
ryebread777|1 year ago
One other tip - many sites have APIs that will give you their product data. You may need to contact them about getting access. Or it may be publicly available. But that is better than scraping if it is possible.