pw | 4 days ago

HHS released a massive dataset of every Medicaid payment to every provider in the US: 227 million rows covering $1.09 trillion in spending across 617,000 billing providers. The data was released explicitly to crowdsource fraud detection.

The raw data is a 2.9 GB Parquet file. I built MedicaidSpending.org to make it searchable and browsable.

You can search by provider name or NPI, browse by state/city/specialty, and see individual provider pages with monthly spending trends, billing code breakdowns, and automated billing flags for statistical outliers.
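The post does not say how the billing flags are computed, so the following is only a minimal sketch of one common way to flag statistical outliers: a modified z-score based on the median and the median absolute deviation (MAD). The function names, the 3.5 threshold, and the sample totals are all assumptions for illustration, not the site's actual rule or data.

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// median returns the median of xs without modifying the caller's slice.
func median(xs []float64) float64 {
	s := append([]float64(nil), xs...)
	sort.Float64s(s)
	n := len(s)
	if n == 0 {
		return math.NaN()
	}
	if n%2 == 1 {
		return s[n/2]
	}
	return (s[n/2-1] + s[n/2]) / 2
}

// flagOutliers returns the indexes of totals whose modified z-score is
// above threshold. Hypothetical helper: the real flagging logic behind
// the site is not described in the post.
func flagOutliers(totals []float64, threshold float64) []int {
	med := median(totals)
	dev := make([]float64, len(totals))
	for i, v := range totals {
		dev[i] = math.Abs(v - med)
	}
	mad := median(dev)
	if mad == 0 || math.IsNaN(mad) {
		return nil // no spread (or no data), so nothing can be ranked as unusual
	}
	var flagged []int
	for i, v := range totals {
		// 0.6745 rescales MAD so the score is comparable to a normal z-score.
		z := 0.6745 * (v - med) / mad
		if z > threshold { // only unusually HIGH billing is flagged
			flagged = append(flagged, i)
		}
	}
	return flagged
}

func main() {
	// Made-up totals for six providers billing the same code.
	totals := []float64{120000, 95000, 143000, 110000, 101000, 9800000}
	fmt.Println(flagOutliers(totals, 3.5)) // prints [5]
}
```

Median and MAD are used here instead of mean and standard deviation because a single very large biller would otherwise inflate the baseline it is being compared against.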

Some of the patterns are striking. Brooklyn alone accounts for $31.8 billion in personal care services (code T1019), more than most states spend on all Medicaid combined. Some authorized officials control hundreds of billing entities. Early analysts scanning just 0.16% of providers flagged $90 billion in likely fraudulent payments.

Technical details:

- Go single binary, ~15 MB
- 3.3 GB SQLite database (read-only, pre-aggregated from the 227M rows using DuckDB)
- 900,000+ indexable pages generated from 13 templates
- No JavaScript framework: server-rendered HTML, Chart.js for one chart per provider page
- Runs on a single VPS behind Caddy
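To illustrate the "server-rendered HTML from a handful of templates" approach, here is a minimal sketch using only Go's standard library (`html/template` and `net/http`). The `providerPage` type, the template markup, the `npi` query parameter, and the `lookup` callback are all hypothetical; the site's real templates and routing are not shown in the post, and a real handler would presumably read from the SQLite database instead.

```go
package main

import (
	"bytes"
	"fmt"
	"html/template"
	"net/http"
)

// providerPage holds the fields one provider page needs (assumed shape).
type providerPage struct {
	NPI       string
	Name      string
	TotalPaid float64
}

// One template parsed at startup and reused for every provider.
var providerTmpl = template.Must(template.New("provider").Parse(`<h1>{{.Name}}</h1>
<p>NPI {{.NPI}}: ${{printf "%.2f" .TotalPaid}} paid</p>
`))

// renderProvider executes the template into a string. html/template
// escapes field values, so provider names cannot inject markup.
func renderProvider(p providerPage) (string, error) {
	var buf bytes.Buffer
	if err := providerTmpl.Execute(&buf, p); err != nil {
		return "", err
	}
	return buf.String(), nil
}

// providerHandler serves a page for ?npi=...; lookup stands in for a
// read-only database query and is an assumption of this sketch.
func providerHandler(lookup func(npi string) (providerPage, bool)) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		p, ok := lookup(r.URL.Query().Get("npi"))
		if !ok {
			http.NotFound(w, r)
			return
		}
		page, err := renderProvider(p)
		if err != nil {
			http.Error(w, "render failed", http.StatusInternalServerError)
			return
		}
		w.Header().Set("Content-Type", "text/html; charset=utf-8")
		fmt.Fprint(w, page)
	}
}

func main() {
	// Render one made-up provider to stdout instead of starting a server.
	page, err := renderProvider(providerPage{
		NPI: "1234567890", Name: "Example & Sons Home Care", TotalPaid: 1500.5,
	})
	if err != nil {
		panic(err)
	}
	fmt.Print(page)
}
```

Rendering into a buffer before writing the response means a template error can still be turned into a clean 500 instead of a half-written page.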

Data sources: HHS Medicaid Provider Spending dataset, NPPES provider registry, HCPCS code descriptions, OIG exclusion list, NUCC taxonomy codes.

All public data, no login required.

floxy|4 days ago

Thanks for doing this. I really like the idea of open/transparent government.