In a previous, much crappier professional life, I worked as a programmer for the government. Most DC-area geeks do, in fact, although the work is usually so secret and/or boring that you don't hear much about it. But the federal government spends a staggering amount on IT. The only thing more astounding than the scale of the enterprise is how little direct good it does for the public.
I'll be the first to admit that not every project is a good candidate for release into citizens' hands. But there's a lot of code and data that could and should be released, and it isn't.
Now that I work on the advocacy side of things, I know that one prime example is the difficulty involved in helping a user find their congresswoman. Matching a zip code to a congressional district is a pretty obvious and simple capability that the government could make available to developers for very little cost. This would presumably facilitate conversations between constituents and representatives — if you believe in representative democracy, it's pretty hard to say that this would be anything other than a good thing.
But instead, developers usually have to buy this information from a vendor, for hundreds or even thousands of dollars. To me, it seems obvious that this information ought to be free.
Fortunately, from my time in the belly of the beast I know that the government actually does make this information available... at least, sort of. There's a collection of webpages on house.gov that provide the necessary data for a given zip code, but they relay it in a thoroughly unusable form. If you're a developer, the obvious answer is to write some scripts to chew through that output, turning it into easily digestible SQL. Then you repeat the process for every zip code that you need to match to congressional districts. As part of another project, that's exactly what I did. I figured that other people might find it useful. At the very least, the price is right.
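For the curious, here's roughly what that kind of script boils down to. This is a minimal Python sketch and not the scripts in the archive: the sample markup, the regex, the function names and the `zip_district` table are all made up for illustration, since the actual house.gov page format isn't reproduced in this post. It also skips the fetching step entirely and just parses a canned page.

```python
import re
import sqlite3

# Hypothetical page fragment. The real house.gov markup is an assumption here.
SAMPLE_PAGE = """
<html><body>
<p>Your representative: <a href="/smith">Jane Smith</a></p>
<p>District: <b>Virginia 8th</b></p>
</body></html>
"""

# Matches e.g. "District: <b>Virginia 8th</b>" or "District: <b>New York 14th</b>".
DISTRICT_RE = re.compile(
    r"District:\s*<b>([A-Za-z ]+?)\s+(\d+)(?:st|nd|rd|th)</b>"
)


def parse_district(html):
    """Return (state, district_number), or None if the pattern is absent."""
    match = DISTRICT_RE.search(html)
    if match is None:
        return None
    return match.group(1), int(match.group(2))


def make_db():
    """Create an in-memory database with one zip-to-district table."""
    conn = sqlite3.connect(":memory:")
    # A zip code can straddle more than one district, so the key is the
    # whole (zip, state, district) triple rather than the zip alone.
    conn.execute(
        "CREATE TABLE zip_district ("
        "zip TEXT, state TEXT, district INTEGER, "
        "PRIMARY KEY (zip, state, district))"
    )
    return conn


def store(conn, zip_code, html):
    """Parse one page and insert its row; return False if nothing parsed."""
    parsed = parse_district(html)
    if parsed is None:
        return False
    state, district = parsed
    conn.execute(
        "INSERT OR REPLACE INTO zip_district (zip, state, district) "
        "VALUES (?, ?, ?)",
        (zip_code, state, district),
    )
    return True
```

Calling `store(conn, "22201", SAMPLE_PAGE)` on a connection from `make_db()` adds one row. A real run would loop that over every zip code you care about, with a polite delay between requests.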
So! If you might find this database useful, have at it. You've got two options: first, you can download the scripts and use them to recreate the database. But the house.gov people probably wouldn't like that very much, and I'd hate to have them shut the door on this valuable data. Besides, it takes several hours to spider the necessary information.
The other option is to visit our charmingly ad-hoc BitTorrent tracker and download the whole database. That archive includes the scripts, too, so that you can rebuild the database when the spiritual descendants of Tom DeLay inevitably gerrymander us further into oblivion.
Enjoy! Now if we could just get the postal service to loosen their restrictions on zip+4 matching...
UPDATE: As pointed out in comments, it's slightly silly of me to have offered this via a torrent. Besides, the initial traffic has now died down. So: if you need the larger database file, you can download it here.
UPDATE 2: Since I'm still getting occasional emails about this entry, I thought I'd add a note about where everything stands. Unfortunately this data is for the 109th Congress, and the 110th altered the site format in a way that broke the screen-scraping scripts. So this database is outdated and probably won't be much use to you.
But there's good news! The Sunlight Foundation is now offering this functionality via a free API. You can find information about it here. Good luck!
