Show HN: I scrape Steam data every month and it's yours to download for free
161 points | csmets | 1 year ago | gginsights.io
To download the raw scraped data you need to become a paid member, but you don't really need it unless you want to finesse a table of data for a particular need. The cost is mostly just an incentive to help me pay the bills for running the website.
The available CSV files contain large amounts of data, covering everything from tags, genres, and pricing to wishlists and estimated revenue. It's what the AI is reading from.
Hope you find it useful :-)
[+] [-] Apreche|1 year ago|reply
[+] [-] noirscape|1 year ago|reply
It means that SteamDB, while extraordinarily useful for casual prodding at what's stored on Valve's servers, isn't very good if you want to run data analysis or something like that on the metadata of Steam games at scale.
Not sure if it's legal to charge for the raw scrape when OP doesn't seem to be affiliated with Valve, but that's not up to me to figure out.
[0]: https://steamdb.info/faq/
[+] [-] m00dy|1 year ago|reply
[+] [-] nickthegreek|1 year ago|reply
[+] [-] xerox13ster|1 year ago|reply
[+] [-] kmfrk|1 year ago|reply
[+] [-] ghfhghg|1 year ago|reply
Might be good to clarify in the FAQ because the people I know who would pay for this are not the most techy types.
[+] [-] ddxv|1 year ago|reply
[+] [-] lolinder|1 year ago|reply
Generally it's polite to avoid scraping if you can help it, so I'd start by considering whether OP is already providing what you are looking for.
[+] [-] netruk44|1 year ago|reply
It definitely won't fetch all the data that this person does though. It only fetches the current list of games on Steam, their store page information and some reviews for the game.
The code quality probably isn't amazing, but it might give you an idea of how to get started with your own scraper.
https://github.com/Netruk44/steam-embedding-search/blob/main...
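For anyone wondering what "store page information and some reviews" might look like in practice, here is a minimal sketch. It is not the code from the linked repository. It assumes the storefront's `appdetails` endpoint (which, as far as I know, is not officially documented) and the `appreviews` endpoint with its `query_summary` fields; the helper names are made up for illustration.

```python
import json
import urllib.parse
import urllib.request

STORE = "https://store.steampowered.com"


def appdetails_url(appid: int) -> str:
    """URL for one app's store page metadata (assumed, unofficial endpoint)."""
    return f"{STORE}/api/appdetails?" + urllib.parse.urlencode({"appids": appid})


def reviews_url(appid: int, num_per_page: int = 20) -> str:
    """URL for one page of an app's user reviews, returned as JSON."""
    query = urllib.parse.urlencode({"json": 1, "num_per_page": num_per_page})
    return f"{STORE}/appreviews/{appid}?{query}"


def fetch_json(url: str, timeout: float = 10.0) -> dict:
    """Download a URL and decode the JSON body. Rate-limit calls to this."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


def summarize_reviews(payload: dict) -> dict:
    """Reduce an appreviews response to counts and a positive ratio."""
    summary = payload.get("query_summary", {})
    positive = summary.get("total_positive", 0)
    negative = summary.get("total_negative", 0)
    total = positive + negative
    return {
        "positive": positive,
        "negative": negative,
        # None when there are no reviews, to avoid dividing by zero.
        "positive_ratio": positive / total if total else None,
    }
```

Calling `summarize_reviews(fetch_json(reviews_url(281990)))` would hit the live store, so sleep between requests and cache what you download.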
[+] [-] DrammBA|1 year ago|reply
I found this explanation from SteamDB that points to the various projects and libraries they use to gather all the data they have. It's not a how-to, but it has very useful info.
[+] [-] z3c0|1 year ago|reply
From the Terms of Service (emphasis mine):
6. Restrictions on Use
You agree not to:
Do you intend to delineate the data provided by the service from "the Service" itself? It seems most fair that data received via Fair Use remains in that arena, pun fully intended. That aside, it's an intriguing dataset nonetheless, but I'd prefer to see a sample of the data before signing up.
[+] [-] csmets|1 year ago|reply
[+] [-] akudha|1 year ago|reply
I am not sure what is considered derivative work and what isn’t.
[+] [-] JadoJodo|1 year ago|reply
[+] [-] stared|1 year ago|reply
Also, for deeper insight than sales volumes (e.g., game design, general trends, demographics, types of players), such things would be crucial.
and
[+] [-] Ksudijaan|1 year ago|reply
The biggest advantage SteamDB has is its ton of historical data. That data is not retrievable from the Steam Network, so the only way to have historical data is to have started collecting early.
My website has been defunct for a year now, but I've kept the scraper running. I now have 7 years of historical data in my database.
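Since history can't be fetched after the fact, the usual trick is to store one dated row per game on every scraper run. A minimal sketch of that idea using SQLite follows. The table layout and column names are assumptions for illustration, not the schema of the database described above.

```python
import sqlite3
from datetime import date


def open_db(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the snapshot database. The schema is hypothetical."""
    conn = sqlite3.connect(path)
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS snapshots (
            appid         INTEGER NOT NULL,
            snapshot_date TEXT    NOT NULL,
            price_cents   INTEGER,
            review_count  INTEGER,
            PRIMARY KEY (appid, snapshot_date)
        )
        """
    )
    return conn


def record_snapshot(conn, appid, price_cents, review_count, day=None):
    """Store one row per app per day; re-running on the same day overwrites."""
    day = day or date.today()
    conn.execute(
        "INSERT OR REPLACE INTO snapshots VALUES (?, ?, ?, ?)",
        (appid, day.isoformat(), price_cents, review_count),
    )
    conn.commit()


def history(conn, appid):
    """Return every stored snapshot for one app, oldest first."""
    return conn.execute(
        "SELECT snapshot_date, price_cents, review_count "
        "FROM snapshots WHERE appid = ? ORDER BY snapshot_date",
        (appid,),
    ).fetchall()
```

Keying on `(appid, snapshot_date)` keeps a rerun from duplicating rows, and ISO dates sort correctly as plain text.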
[+] [-] eamsen|1 year ago|reply
[+] [-] bdd8f1df777b|1 year ago|reply
[+] [-] aranw|1 year ago|reply
[+] [-] somenameforme|1 year ago|reply
[+] [-] giancarlostoro|1 year ago|reply
This is kind of the only way I use AI really: to summarize things and extract details, then review the raw sources to make sure the LLM isn't misleading me. I find myself using this approach instead of Googling for things since Google crippled their search the last few years. It feels like every year it's harder to find things with Google. I miss 2007 Google...
[+] [-] dewey|1 year ago|reply
[+] [-] bitbasher|1 year ago|reply
[+] [-] bloomingkales|1 year ago|reply
Do you think Steam reviews are coordinated?
[+] [-] bluefirebrand|1 year ago|reply
For anything from a small indie game to a huge AAA title, you can bet that the creators got their friends and family to post some nice reviews early, just to give it that positive bump.
[+] [-] shagie|1 year ago|reply
Yes. It's not even a question. Steam flags outliers too.
https://store.steampowered.com/app/281990/Stellaris/
It got review bombed starting on Feb 14th because a different game that the company makes (HOI4) released DLC that upset the sensibilities of part of that player base. ( https://old.reddit.com/r/Stellaris/comments/1iqzih8/why_is_s... )
---
There are Steam review bots for Discord ( https://www.codecks.io/steam-bot/ ), which also encourage people who are members of a game's Discord to leave a (positive) review.
---
It's a certainty that reviews are coordinated through a number of different means.
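If you have per-day review counts from a scrape, a crude way to spot a review bomb yourself is to look for days where the negative share jumps well above the game's typical level. This is only a sketch of that heuristic, not how Steam flags outliers; the thresholds and the sample numbers in the usage note are invented.

```python
from statistics import median


def negative_share(positive: int, negative: int) -> float:
    """Fraction of a day's reviews that are negative (0.0 if no reviews)."""
    total = positive + negative
    return negative / total if total else 0.0


def flag_spikes(daily, min_reviews: int = 20, jump: float = 0.3):
    """Return the labels of days whose negative share is unusually high.

    daily: list of (label, positive_count, negative_count) tuples.
    A day is flagged when it has at least min_reviews reviews and its
    negative share exceeds the median share by at least jump.
    Both thresholds are arbitrary assumptions; tune them on real data.
    """
    shares = [negative_share(p, n) for _, p, n in daily]
    baseline = median(shares)
    return [
        label
        for (label, p, n), share in zip(daily, shares)
        if p + n >= min_reviews and share - baseline >= jump
    ]
```

For example, `flag_spikes([("d1", 90, 10), ("d2", 85, 15), ("d3", 20, 180), ("d4", 88, 12)])` flags only `"d3"`. The `min_reviews` floor keeps a quiet day with a handful of angry reviews from counting as a spike.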
[+] [-] ryanisnan|1 year ago|reply
[+] [-] happyopossum|1 year ago|reply
Does not line up with
> To download the raw scraped data you need to become a paid member
Sooo, clickbait or just plain dishonest?
[+] [-] antasvara|1 year ago|reply
So I guess it depends if you consider the CSV as fundamentally different from the raw data in a way that makes this clickbait.
[+] [-] voodooEntity|1 year ago|reply
[+] [-] thot_experiment|1 year ago|reply
If I have to pay to download the data how is it mine to download for free?
[+] [-] appleaday1|1 year ago|reply
[+] [-] endre|1 year ago|reply