Speaking as someone who contributed a few scrapers to the inspectors general project (https://github.com/unitedstates/inspectors-general), I think this is a great and worthwhile effort. It's actually not that hard to contribute a scraper if you know a little Python (and it may be a good way to learn a little Python if you don't).
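For a sense of scale, here is a minimal sketch of the core of a scraper: turning a listing page into structured records. It is not code from the inspectors-general repo and does not follow that project's conventions. The sample HTML, the `ReportLinkParser` class, the `scrape_reports` function, and the record fields are all made up for illustration, and it uses only the Python standard library.

```python
from html.parser import HTMLParser

# Made-up listing page; a real scraper would download this from an agency site.
SAMPLE_HTML = """
<ul>
  <li><a href="/reports/audit-2014-01.pdf">Audit of Grant Oversight</a></li>
  <li><a href="/reports/inspection-2014-02.pdf">Inspection of Field Offices</a></li>
  <li><a href="/about.html">About the office</a></li>
</ul>
"""


class ReportLinkParser(HTMLParser):
    """Collects the URL and title of every link that points to a PDF."""

    def __init__(self):
        super().__init__()
        self.reports = []
        self._href = None  # href of the <a> tag currently open, if any
        self._text = []    # text fragments seen inside that tag

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            # Keep only links to PDFs; skip navigation links.
            if self._href.lower().endswith(".pdf"):
                title = "".join(self._text).strip()
                self.reports.append({"url": self._href, "title": title})
            self._href = None


def scrape_reports(html):
    """Return a list of {"url", "title"} dicts for the PDF links in html."""
    parser = ReportLinkParser()
    parser.feed(html)
    return parser.reports


reports = scrape_reports(SAMPLE_HTML)
```

Most of the real work is in the parts this skips: fetching pages, handling each site's quirks, and pulling out dates and report IDs.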
One thing that my friend who works in Open Data has told me is that it's important for websites like this to exist, to be able to point non-technical people at them and say "SEE. THIS is why you can't just publish everything as a PDF".
The main source of their power is having every single court case and, more importantly, tracking which cases interact with each other and how, such as one being overturned by another.
This needs not only access to PACER, but also a good algorithm and a huge staff to catch up to Westlaw/Lexis.
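To make the data side concrete, here is a toy sketch of tracking how cases treat each other. It assumes the hard part (reading opinions and labeling each citation) is already done. The case names, the treatment labels, and the `build_treatment_index` and `is_still_good_law` functions are hypothetical, not how Westlaw or Lexis actually work.

```python
from collections import defaultdict

# Hypothetical records: (citing case, cited case, treatment).
CITATIONS = [
    ("Case C", "Case A", "followed"),
    ("Case D", "Case A", "overturned"),
    ("Case D", "Case B", "followed"),
]


def build_treatment_index(citations):
    """Map each cited case to the list of (citing case, treatment) pairs."""
    index = defaultdict(list)
    for citing, cited, treatment in citations:
        index[cited].append((citing, treatment))
    return index


def is_still_good_law(case, index):
    """Return False if any recorded citation overturned the case."""
    return all(treatment != "overturned" for _, treatment in index.get(case, []))


index = build_treatment_index(CITATIONS)
```

With this sample data, `is_still_good_law("Case A", index)` is False because Case D overturned it, while Case B is still fine. Labeling every citation correctly is where the algorithm and the staff come in.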
Although many folks from the Sunlight Foundation support the project, it has relatively decentralized control:
> This is an unusual, and occasionally chaotic, model for an open data project. The /unitedstates project is a neutral space; GitHub's permissions system allows many of us to share the keys, so no one person or institution controls it. What this means is that while we all benefit from each other's work, no one is dependent or "downstream" from anyone else. It's a shared commons in the public domain.
Awesome list of resources! I'm currently working on a text-based Twilio app that gives people simple updates on how their Senator/Representative votes on major legislation. Further down the line I'd like to tie in direct communication with Senators/Reps, where they give a statement on why they voted the way they did, updates on when they're in their local offices, etc.
May I suggest you include committee votes where you can?
I've done a bunch of technology voter guides for Wired and CNET by crawling House/Senate records (what a pain), and that's one thing I always thought would be useful. Not enough attention is paid to committee votes, and many bills don't get to the floor. There were plenty of SOPA committee votes on amendments, but the legislation never made it to the floor.
Both have to do with open data, but otherwise, there are significant differences.
The GitHub @unitedstates project is an open, relatively decentralized directory for finding tools and data related to the United States. Based on the organizations involved in its birth, I'd say its ethos is, broadly, about civic-minded issues. The tools mentioned vary and have different user experiences.
Enigma is a login-required, commercial offering (with a free option, at least for the time being) providing a web application interface to public data, worldwide. It is, at its core, a search engine that lets you drill down into data rows from a common user interface. Its ethos seems to be "find the data you are looking for, whatever your purpose: academic research, business analysis, civics, etc."
Thanks Craig, but I was barely involved: all the credit should go to the Sunlight Foundation and their partners (GovTrack, NY Times), who started the project and did the painstaking work to build the datasets over the course of two years.
I helped with a tiny tiny piece (the contact-congress repo), and even that had been worked on for months beforehand by the folks at Sunlight (in particular Dan Drinkard and Eric Mill).
This is awesome - though I'm amused that a site that so clearly represents US data is hosted on an overseas domain... (.io = British Indian Ocean Territory).
Edit: All snark aside though, this really is awesome. I can imagine all kinds of useful things that could come out of this sort of structured data, including just interesting information (like demographic patterns of various politicians, etc).
audiodude | 11 years ago
the_watcher | 11 years ago
mikecb | 11 years ago
yen223 | 11 years ago
http://www.sinarproject.org/
lucaspiller | 11 years ago
http://data.gov.uk/
http://alphagov.github.io/
klunger | 11 years ago
http://www.ssb.no/en
dj-wonk | 11 years ago
> This is an unusual, and occasionally chaotic, model for an open data project. The /unitedstates project is a neutral space; GitHub's permissions system allows many of us to share the keys, so no one person or institution controls it. What this means is that while we all benefit from each other's work, no one is dependent or "downstream" from anyone else. It's a shared commons in the public domain.
From http://sunlightfoundation.com/blog/2013/08/20/a-modern-appro...
le_meta | 11 years ago
mattste | 11 years ago
declan | 11 years ago
ljd | 11 years ago
dfc | 11 years ago
konklone | 11 years ago
bloometal | 11 years ago
dj-wonk | 11 years ago
HistoryInAction | 11 years ago
sinak | 11 years ago
atonse | 11 years ago
stormbrew | 11 years ago
rwallace | 11 years ago
dba7dba | 11 years ago
Yeah, it would be really awesome to just click a few times to see the makeup of a politician's district.
Hope the project does well.