Show HN: New search engine and free-FOIA-by-fax-via-web for US veteran records
118 points | Asparagirl | 1 year ago | birls.org
Back in September 2017, our organization made a Freedom of Information Act (FOIA) request to the US Department of Veterans Affairs (the VA) asking for a copy of a database they maintain called "BIRLS", which stands for the Beneficiary Identification Records Locator Subsystem. While it's not exactly an index of every single post-Civil-War veteran of every branch of the US military, it's possibly the closest thing that exists to it.
BIRLS is a database that indexes all the known-to-the-VA-in-or-after-the-1970s *veterans' benefits claims files*, also called C-Files or sometimes XC-Files. Older veterans' claims files have been moved to the National Archives (NARA), such as the famous Civil War pension files. But 95% of the later benefits claim files, from the late nineteenth century up to today, are still held at the VA, in their warehouses, and still haven't been sent to NARA.
And even if you know these files exist, the VA really doesn't make it easy to get them. The Veterans Benefits Administration (VBA) group within the VA only seems to accept FOIA requests for copies of C-Files by fax (!) and also seems to have made up a whole new rule whereby you have to have an actual wet ink signature on your FOIA request, not just a typed letter.
Well, seven years and one very successful FOIA lawsuit in SDNY against the VA later, we at Reclaim The Records are very proud to announce the acquisition and first-ever free public release of the BIRLS database, AND that we built a new website to make the data freely and easily searchable AND that we even built a free FOIA-by-FAX-API system (with a signature widget, to get around the dumb new not-FOIA rules!) built into our website's search results, that makes it much, much easier for people to finally get these files out of the VA warehouses and into your mailbox. :-)
We also added the ability to do searches through the data for soundalike names, abbreviated names, common nicknames, wildcards, searches by date of birth or death, or ranges of birth and death years, or search by SSN, or by branch(es) of services, or by gender...
For a lot more information about our FOIA lawsuit against the VA for the database, including copies of our court papers and the SDNY judge's order:
https://mailchi.mp/reclaimtherecords/the-birls-database-goes...
As for the tech stuff, actually building the website, the search engine, and its FOIAing capability...well, it has been a pretty fun project to build.
The BIRLS dataset was eventually provided to us by the VA (several years after we originally asked for it...) as a large zip file which, when decompressed via the command line, yielded the hilarious file name of *Redacted_Full.csv*. I then loaded the cleaned CSV data into a MySQL database, and used a modified version of the Apache Solr search engine to index the data. That makes it searchable by soundalike names (using Beider-Morse Phonetic Matching), nicknames (using Solr's synonyms feature), and partial names (using wildcards). Dates were converted to ISO 8601 format to enable both exact-date and date-range searches, alongside various other search criteria.
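As a rough illustration of the date handling described above (a sketch, not the site's actual code): the snippet below normalizes a raw date string to ISO 8601 and builds a Solr range-query clause from two dates. The input format `MM/DD/YYYY` and the field name `birth_date` are assumptions made for the example; the real column layout of the CSV and the real Solr schema are not described in this post.

```python
from datetime import date, datetime


def to_iso8601(raw: str, fmt: str = "%m/%d/%Y") -> str:
    """Convert a raw date string to ISO 8601 (YYYY-MM-DD).

    The default input format is an assumption for illustration only.
    """
    return datetime.strptime(raw.strip(), fmt).date().isoformat()


def solr_date_range(field: str, start: date, end: date) -> str:
    """Build an inclusive Solr range clause covering start..end.

    Solr date fields expect full ISO 8601 timestamps in UTC, so the
    range runs from the first second of `start` to the last second of `end`.
    """
    return (
        f"{field}:[{start.isoformat()}T00:00:00Z"
        f" TO {end.isoformat()}T23:59:59Z]"
    )


def year_range(field: str, first_year: int, last_year: int) -> str:
    """Range clause for whole years, e.g. births between 1920 and 1925."""
    return solr_date_range(field, date(first_year, 1, 1), date(last_year, 12, 31))


if __name__ == "__main__":
    print(to_iso8601("03/07/1921"))
    print(year_range("birth_date", 1920, 1925))
```

An exact-date search is then just a range whose start and end are the same day, which is why storing every date in one sortable format covers both kinds of query.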
The front-end of the website is built with Nuxt and hosted on Digital Ocean's App Platform, with backups of the FOIA request data on the cloud storage service Wasabi. The fax interface for submitting FOIA requests is powered by the Notifyre API. We use Mailchimp to send e-mail newsletters, and their product Mandrill for programmatic e-mail sending. We use Sentry for error monitoring, Better Stack for server logging, and TinyBird to collect FOIA submission analytics.
Enjoy!
ldoughty|1 year ago
1) May want to auto-magically handle input for things like apostrophes. E.g. "O'Hare"... It looks like somewhere in the process this data was not preserved/saved/sent, but people will probably try to search with it. Might also want to handle the accent marks and what not too
2) The terms & conditions for Step 3, the checkbox at the bottom doesn't have enough contrast when checked. I do not have a disability, and I still found it very faint. Someone with a disability would likely have a lot of trouble (not to mention, it requires scrolling to the bottom to check it in the first place, which isn't awesome for accessibility)
3) I appreciate the warning on the terms and conditions about seeing things you might not want to see. A good reminder for those that might not want to tarnish a memory of someone... Reminds me of the DNA tests for Christmas, or learning about Punnett Squares and genetics, sometimes you might not want to go looking :-)
owenmarshall|1 year ago
I'd echo this. I found that to be exceptionally well-written and helped me understand the records I'd receive were unlikely to be the records I was interested in, so I cancelled at that point.
Your abandon rate at that step could make for interesting reading!
Asparagirl|1 year ago
So we try our best to help a user find the veteran even with the dirty data we have. For example, there is code here (using a common NPM package) to convert a user’s potential typed accent marks to a non-accented version of the same letter. In compound surnames we will also break up the surname on a space or a hyphen and search both parts, but not if a surname part is three letters or fewer. It’s imperfect but we have to work with the data we’ve got and can’t and shouldn’t normalize or clean the underlying file.
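The two steps described above can be sketched roughly as follows. This is a Python stand-in, not the site's code (which uses an NPM package): accents are stripped with Unicode decomposition, and a compound surname is split on spaces and hyphens. One detail is an assumption here: the full surname is always searched, and a part of three letters or fewer is simply not searched on its own. The function names are hypothetical.

```python
import re
import unicodedata

# Parts of three letters or fewer are not searched on their own.
MIN_PART_LENGTH = 4


def strip_accents(text: str) -> str:
    """Replace accented letters with their unaccented base letters."""
    decomposed = unicodedata.normalize("NFD", text)
    # After NFD, an accent is a separate combining character; drop those.
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def surname_search_terms(surname: str) -> list[str]:
    """Return the surname plus any long-enough parts of a compound surname."""
    cleaned = strip_accents(surname).strip()
    terms = [cleaned]
    parts = [p for p in re.split(r"[\s-]+", cleaned) if p]
    if len(parts) > 1:
        for part in parts:
            if len(part) >= MIN_PART_LENGTH and part not in terms:
                terms.append(part)
    return terms


if __name__ == "__main__":
    print(surname_search_terms("García-Márquez"))
    print(surname_search_terms("De la Cruz"))
```

Letters that have no Unicode decomposition, such as "ø", pass through this sketch unchanged, so a real implementation would need an explicit mapping for them.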
unknown|1 year ago
[deleted]
wtfssn38|1 year ago
Asparagirl|1 year ago
ldoughty|1 year ago
That said, the _typical_ things an SSN is used for would not be terribly useful for someone that's been dead >2-4 years... Automated checks should flag e.g. credit applications as being for a dead person :-)
patwolf|1 year ago
On one hand, if this works then I'll be happy to have the information I otherwise wouldn't have. But on the other hand, all these processes, no matter how convoluted, exist for a reason. It feels weird bypassing those.
Asparagirl|1 year ago
And even now, the “processes” to get the records, as defined by a 58+-year-old law (FOIA) are not really being followed. An agency refusing to process any FOIA requests except by fax (!) is insane, in this day and age. But more specifically, it’s against the law. A letter AND an e-mail are supposed to work. Hence our use of a fax API on this website…
Furthermore, the “requirement” that a FOIA requester must hand-sign the paperwork is absolutely made up by this agency. Hence our signature widget on this website…
Point being, if they’re going to shamelessly ignore or misinterpret the federal law, we are going to just jump through those hoops and say no, we want the files, please do your jobs.
Suppafly|1 year ago
My state has a process for claiming unclaimed funds that banks and such report to the government and that is what is keeping me from claiming some funds my grandmother has listed on the site. It's not even clear to me what constitutes 'next of kin' legally, presumably it'd be one of her kids, but it's not like we have laws designating the oldest male heir and then on down the list.
mattw2121|1 year ago
necovek|1 year ago
OTOH, if you have really successfully worked to make this database public domain and do publish it somewhere (and you did, as I can see at https://archive.org/details/BIRLS_database), this wouldn't be of much help against any malicious actors out there.
But really, it seems the burden is on VA if there are non-deceased persons in the database since they have done a bad job of maintaining the data, and they would be liable for any leakage of information (unless Reclaim the Records was aware of any in particular). Even so, RTR might have put themselves out on the fence for some lawsuits against them too.
Asparagirl|1 year ago
fergbrain|1 year ago
Reminds me a bit of muckrock.com as well.
Asparagirl|1 year ago
ungreased0675|1 year ago
tivert|1 year ago
Sounds like genealogy, and a small fraction of the documents in a veteran's file would probably be very helpful in fleshing out some basic details of their military service (especially given a fire destroyed many of the original copies of those documents).
The actual medical records part seems inappropriate, though.
> Why is it good for the public to have access to detailed records of individual, recently deceased veterans? Isn’t this a gold mine for scammers? Is this project LDS affiliated?
It seems like it's a gap in FOIA. These records should be available, just not to everyone in the whole world (at least not before, say, 60 years after the veteran's death). It seems legitimate that an appropriately-close family member should be able to request them (similar to restrictions in requesting birth certificates).
archerjax|1 year ago
hoppyhoppy2|1 year ago
I agree with you re: living people's SSNs, though.
Bjartr|1 year ago
> these materials were largely unknown and inaccessible to historians, journalists, and genealogists
I think it would be worthwhile to lead with that and include a little more detail too.
If there isn't a clear motivation, people will assume the worst.
draftsman|1 year ago
Suppafly|1 year ago
>If there isn't a clear motivation, people will assume the worst.
This is just a weird assumption.
neilv|1 year ago
* Intent is to sell the data, or otherwise "monetize" it, in the techbro sense.
* "Shell" effort of a specific company that wants the data.
* Shell effort of an organized crime group.
* Shell effort of a foreign intelligence agency, or terrorist group.
A while ago, there was a different project that made other US records, which were already reasonably accessible to US citizens and journalists, easily available to foreign adversaries, such as for espionage profiling and blackmail. When that project was promoted on HN, I caught the promoter seeming to use a sockpuppet account in the comments (accidentally using the wrong account to respond to themself), which I found additionally suspicious.
Even when a project is fully honest and with good intentions, we also have to consider the risks of likely other consumers of the data, which include all the possibilities above.
redeux|1 year ago
Veterans aren’t politicians, and they don’t deserve to have their lives put on display like this without their permission. Some vets signed up because they wanted to serve their country, some because they were running from people or poverty, but they were all just ordinary people trying to eke out a living.
I believe people, good people just trying to do their thing will be hurt by this information and that’s unfair. It’s just another example of people using veterans as pawns to achieve their ends.
What are the ends in this case? I couldn’t tell you. I do believe this will have a chilling effect on veterans seeking help from the VA at a time when they need it more and more.
Towaway69|1 year ago
Having obtained my father’s military records, I can definitely say that I’m glad these weren’t online, searchable. Via those records I learnt much concerning my father and I’m glad I obtained them.
To get those records, I had to prove I was his son, that my father was indeed deceased and that he hadn’t said/written anything to prevent me from having his records. That should be the minimum (IMHO) for obtaining such records. This wasn’t in the USA though.
jsjohnst|1 year ago
_DeadFred_|1 year ago
tompagenet2|1 year ago
flippyhead|1 year ago
tivert|1 year ago
Asparagirl|1 year ago
However, these particular files (benefits claims files, or C-Files) are a different type of file and never burned. Better yet, they often have some parts of the veteran’s OMPF that were copied *into* the C-File, to establish eligibility for those benefits — copies that were made before the fire! In other words, these files could serve as partial backups…
greentxt|1 year ago
Asparagirl|1 year ago
WarOnPrivacy|1 year ago
Might a less vitriolic response be more productive here? Given the altruism that leads this endeavor, strong negativity might best be delayed until a more holistic understanding is gained.
For example, I've been accessing these same records thru Ancestry for some time - along with millions of other Ancestry users. If these records were a realistic vector for actual meaningful harm, the evidence should have manifested some time ago.
Broadly speaking, if we want privacy efforts to help in tangible ways, it's important to limit restrictions to where they do provable good. The alternative is restrictions that apply only to us - and not those with motivation (financial, power) to use private data for their own ends.
bovermyer|1 year ago
tantalor|1 year ago
Ha, the only device I have with DVD is a PS5, this should be fun.
asacrowflies|1 year ago
Only jarheads seem to think the parental tone of "you don't know what freedom is" actually works.... Maybe because they have been thru boot camp idk.
tivert|1 year ago
As someone who's never been in the military and isn't even acquainted with that many people who have been, I think I should give you a heads-up that you are being an entitled asshole.
I think most people would be mad if they found out some of the most private and personal records about them (https://news.ycombinator.com/item?id=42685002) would be made public to anyone who'd care to request them. It's a pretty terrible violation of privacy. Try to think about how you'd feel if something like that was going to happen to you (say, your complete browsing history would be made public, because I'm sure you have one and you almost certainly want to keep it private).