
The Unitedstates Project

166 points| _pius | 11 years ago |theunitedstates.io | reply

33 comments

[+] audiodude|11 years ago|reply
Speaking as someone who contributed a few scrapers to the inspectors general project (https://github.com/unitedstates/inspectors-general), I think this is a great and worthwhile effort. It's actually not that hard to contribute a scraper if you know a little Python (and it may be a good way to learn a little Python if you don't).

One thing that my friend who works in Open Data has told me is that it's important for websites like this to exist, to be able to point non-technical people at them and say "SEE. THIS is why you can't just publish everything as a PDF".

[+] the_watcher|11 years ago|reply
Wow. If this can aggregate some of the secondary sources that are the main source of Lexis/Westlaw's power, it would be fantastic.
[+] mikecb|11 years ago|reply
The main source of their power is every single court case, and more importantly, tracking which ones interact with each other and how, such as being overturned.

Catching up to Westlaw/Lexis would require not only access to PACER, but also a good algorithm and a huge staff.
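To make "tracking which ones interact with each other and how" concrete, here is a minimal sketch of a toy citator in Python. Everything in it is hypothetical and for illustration only: the `CitationGraph` class, the treatment labels ("overturned", "followed"), and the case names are assumptions, not how Westlaw or Lexis actually model this.

```python
from collections import defaultdict


class CitationGraph:
    """Toy citator (hypothetical): records how later cases treat earlier ones."""

    def __init__(self):
        # Maps a cited case to a list of (citing case, treatment) pairs.
        self._treatments = defaultdict(list)

    def add_treatment(self, citing, cited, treatment):
        """Record that `citing` treated `cited` in the given way."""
        self._treatments[cited].append((citing, treatment))

    def treatments_of(self, case):
        """Return every recorded treatment of `case`, in insertion order."""
        return list(self._treatments.get(case, []))

    def is_good_law(self, case):
        """In this sketch, a case is good law unless some case overturned it."""
        return all(t != "overturned" for _, t in self.treatments_of(case))


graph = CitationGraph()
graph.add_treatment("Case C", "Case A", "overturned")  # placeholder case names
graph.add_treatment("Case B", "Case A", "followed")
print(graph.is_good_law("Case A"))
print(graph.is_good_law("Case B"))
```

The data structure is the easy part. The expensive part is deciding which treatment label applies to each citation, which is where the algorithm and the staff come in.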

[+] dj-wonk|11 years ago|reply
Although many folks from the Sunlight Foundation support the project, it has relatively decentralized control:

> This is an unusual, and occasionally chaotic, model for an open data project. the /unitedstates project is a neutral space; GitHub's permissions system allows many of us to share the keys, so no one person or institution controls it. What this means is that while we all benefit from each other's work, no one is dependent or "downstream" from anyone else. It's a shared commons in the public domain.

From http://sunlightfoundation.com/blog/2013/08/20/a-modern-appro...

[+] le_meta|11 years ago|reply
Just remember to mirror it locally.
[+] mattste|11 years ago|reply
Awesome list of resources! I'm currently working on a text-based Twilio app that gives users simplified updates on how their Senator/Representative votes on major legislation. Further down the line I'd like to tie in direct communication with Senators/Reps where they give a statement on why they voted the way they did, updates on when they're in their local offices, etc.
[+] declan|11 years ago|reply
May I suggest you include committee votes where you can?

I've done a bunch of technology voters guides for Wired and CNET by crawling House/Senate records (what a pain) and that's one thing I always thought would be useful. Not enough attention is paid to them, and many bills don't get to the floor. There were plenty of SOPA committee votes on amendments, but the legislation never made it to the floor.

[+] ljd|11 years ago|reply
We use the GovTrack APIs and some from the Sunlight Foundation for http://PlaceAVote.com. They are pretty awesome and well written.
[+] bloometal|11 years ago|reply
Doesn't www.enigma.io do this?
[+] dj-wonk|11 years ago|reply
Both have to do with open data, but otherwise, there are significant differences.

The GitHub @unitedstates project is an open, relatively decentralized directory of tools and data related to the United States. Based on the organizations involved in its birth, I'd say its ethos is, broadly, about civic-minded issues. The tools mentioned vary and have different user experiences.

Enigma is a login-required, commercial offering (with a free option, at least for the time being) providing a web application interface to public data, worldwide. It is, at its core, a search engine that lets you drill down into data rows from a common user interface. Its ethos seems to be "find the data you are looking for, whatever your purpose: academic research, business analysis, civics, etc."

[+] HistoryInAction|11 years ago|reply
Excellent leadership from sinak!
[+] sinak|11 years ago|reply
Thanks Craig, but I was barely involved: all the credit should go to the Sunlight Foundation and their partners (GovTrack, NY Times), who started the project and did the painstaking work to build the datasets over the course of 2 years.

I helped with a tiny tiny piece (the contact-congress repo), and even that was worked on for months before by the folks at Sunlight (in particular Dan Drinkard and Eric Mill).

[+] atonse|11 years ago|reply
This is awesome - though I'm amused that a site that so clearly represents US data is hosted on an overseas domain... (.io = British Indian Ocean Territory).

Edit: All snark aside though, this really is awesome. I can imagine all kinds of useful things that could come out of this sort of structured data, including just interesting information (like demographic patterns of various politicians, etc).

[+] stormbrew|11 years ago|reply
Most of the people on the BIOT islands are American, though. The natives were expelled to build a US military base.
[+] rwallace|11 years ago|reply
Didn't the .io domain get repurposed for general use, which is why so many projects and companies have started using it lately?
[+] dba7dba|11 years ago|reply
It's surprising that, in this day and age, no easily obtainable digital data on US demographics is available.

Yeah, it would be really awesome to just click a few times to see the makeup of a politician's district.

Hope the project does well.