Show HN: New search engine and free-FOIA-by-fax-via-web for US veteran records

118 points | Asparagirl | 1 year ago | birls.org

Hi HN. I'm the president and founder of a small non-profit called Reclaim The Records that identifies historical and genealogical materials and data sets held by government agencies, archives, and libraries -- and then returns them to the public domain, for free public use.

Back in September 2017, our organization made a Freedom of Information Act (FOIA) request to the US Department of Veterans Affairs (the VA) asking for a copy of a database they maintain called "BIRLS", which stands for the Beneficiary Identification Records Locator Subsystem. While it's not exactly an index of every single post-Civil-War veteran of every branch of the US military, it's possibly the closest thing that exists to it.

BIRLS is a database that indexes all the known-to-the-VA-in-or-after-the-1970s *veterans' benefits claims files*, also called C-Files or sometimes XC-Files. Older veterans' claims files have been moved to the National Archives (NARA), such as the famous Civil War pension files. But 95% of the later benefits claim files, from the late nineteenth century up to today, are still held at the VA, in their warehouses, and still haven't been sent to NARA.

And even if you know these files exist, the VA really doesn't make it easy to get them. The Veterans Benefits Administration (VBA) group within the VA only seems to accept FOIA requests for copies of C-Files by fax (!) and also seems to have made up a whole new rule whereby you have to have an actual wet ink signature on your FOIA request, not just a typed letter.

Well, seven years and one very successful FOIA lawsuit in SDNY against the VA later, we at Reclaim The Records are very proud to announce the acquisition and first-ever free public release of the BIRLS database. We also built a new website to make the data freely and easily searchable, AND we even added a free FOIA-by-FAX-API system (with a signature widget, to get around the dumb new not-FOIA rules!) to our website's search results, which makes it much, much easier for people to finally get these files out of the VA warehouses and into their mailboxes. :-)

We also added the ability to do searches through the data for soundalike names, abbreviated names, common nicknames, wildcards, searches by date of birth or death, or ranges of birth and death years, or search by SSN, or by branch(es) of service, or by gender...

For a lot more information about our FOIA lawsuit against the VA for the database, including copies of our court papers and the SDNY judge's order:

https://mailchi.mp/reclaimtherecords/the-birls-database-goes...

As for the tech stuff, actually building the website, the search engine, and its FOIAing capability...well, it has been a pretty fun project to build.

The BIRLS dataset was eventually provided to us by the VA (several years after we originally asked for it...) as a large zip file which, when decompressed via the command line, yielded the hilarious file name of *Redacted_Full.csv*. I then loaded the cleaned CSV data into a MySQL database, and then used a modified version of the Apache Solr search engine to index the data, so that it could become searchable by soundalike names (using Beider-Morse Phonetic Matching), nicknames (using Solr's synonyms feature), partial names (using wildcards), with dates converted to ISO 8601 format to enable both exact date and date range searches, and various other search criteria.
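
As a rough illustration of the date handling described above (not the site's actual code), here is a minimal Python sketch. It assumes the raw CSV stores dates as MM/DD/YYYY, which is a guess, and it uses a hypothetical Solr field name, `birth_date`. The output format and the `field:[start TO end]` range syntax are standard Solr.

```python
from datetime import datetime


def to_solr_date(raw: str, input_format: str = "%m/%d/%Y") -> str:
    """Convert a raw date string into the ISO 8601 UTC form Solr date fields expect.

    The default input_format (MM/DD/YYYY) is an assumption about the CSV.
    """
    parsed = datetime.strptime(raw.strip(), input_format)
    return parsed.strftime("%Y-%m-%dT00:00:00Z")


def year_range_query(field: str, start_year: int, end_year: int) -> str:
    """Build an inclusive Solr range query that covers whole calendar years."""
    return f"{field}:[{start_year}-01-01T00:00:00Z TO {end_year}-12-31T23:59:59Z]"


# Hypothetical usage: "birth_date" is an illustrative field name.
print(to_solr_date("07/04/1923"))
print(year_range_query("birth_date", 1920, 1925))
```

Once every date is stored in one sortable format, an exact-date search and a year-range search become the same kind of query with different bounds.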

The front-end of the website is built with Nuxt and hosted on Digital Ocean's App Platform, with backups of the FOIA request data on the cloud storage service Wasabi. The fax interface for submitting FOIA requests is powered by the Notifyre API. We use Mailchimp to send e-mail newsletters, and their product Mandrill for programmatic e-mail sending. We use Sentry for error monitoring, Better Stack for server logging, and TinyBird to collect FOIA submission analytics.

Enjoy!

72 comments

ldoughty|1 year ago

Neat. Submitted a request for my grandfather's records. Some comments:

1) May want to auto-magically handle input for things like apostrophes. E.g. "O'Hare"... It looks like somewhere in the process this data was not preserved/saved/sent, but people will probably try to search with it. Might also want to handle the accent marks and whatnot too.

2) The terms & conditions for Step 3, the checkbox at the bottom doesn't have enough contrast when checked. I do not have a disability, and I still found it very faint. Someone with a disability would likely have a lot of trouble (not to mention, it requires scrolling to the bottom to check it in the first place, which isn't awesome for accessibility)

3) I appreciate the warning on the terms and conditions about seeing things you might not want to see. A good reminder for those that might not want to tarnish a memory of someone... Reminds me of the DNA tests for Christmas, or learning about Punnett Squares and genetics, sometimes you might not want to go looking :-)

owenmarshall|1 year ago

> 3) I appreciate the warning on the terms and conditions about seeing things you might not want to see.

I'd echo this. I found that to be exceptionally well-written and helped me understand the records I'd receive were unlikely to be the records I was interested in, so I cancelled at that point.

Your abandon rate at that step could make for interesting reading!

Asparagirl|1 year ago

Thanks. The original data set, as provided by the VA, has all sorts of data errors and oddities in it. The major ones involving surnames include the inconsistent use of apostrophes in names like O’BRIEN, often written as O BRIEN, and/or vice versa — or the inconsistent formatting of MC and MAC names like MCMAHON as MC MAHON, and/or vice versa. There are also some names where the VA includes an errant dash, not meant to be a hyphen, and other mistakes, as well.

So we try our best to help a user find the veteran even with the dirty data we have. For example, there is code here (using a common NPM package) to convert a user’s potential typed accent marks to a non-accented version of the same letter. In compound surnames we will also break up the surname on a space or a hyphen and search both parts, but not if a surname part is three letters or fewer. It’s imperfect but we have to work with the data we’ve got and can’t and shouldn’t normalize or clean the underlying file.
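
A minimal sketch of the two query-side steps described above, written in Python for illustration rather than with the NPM package the site uses. Uppercasing the input and the function names are assumptions, and the "three letters or fewer" rule is read here as skipping only the short part rather than abandoning the split.

```python
import re
import unicodedata


def strip_accents(text: str) -> str:
    """Replace accented letters with their unaccented base letters."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def surname_search_terms(surname: str, min_part_len: int = 4) -> list[str]:
    """Return the folded surname, plus each compound part long enough to search.

    Parts of three letters or fewer are skipped (min_part_len=4).
    Uppercasing is an assumption, based on the uppercase names quoted from the data.
    """
    folded = strip_accents(surname).upper().strip()
    terms = [folded]
    parts = [p for p in re.split(r"[ \-]+", folded) if p]
    if len(parts) > 1:
        for part in parts:
            if len(part) >= min_part_len and part not in terms:
                terms.append(part)
    return terms


print(surname_search_terms("García-López"))
print(surname_search_terms("Mc Mahon"))
```

Only the user's query is transformed; the underlying records are left untouched, which matches the point about not normalizing the source file.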

wtfssn38|1 year ago

Are you aware that the API appears to be publishing the SSN of each individual? Although I’m aware most SSNs are leaked in one breach or another, I still thought it was customary in the U.S. to attempt to keep such information somewhat protected.

Asparagirl|1 year ago

Yes, that’s on purpose. SSNs of *deceased* people are public, not private. They are never reused. They are available under FOIA from other sources as well, such as the SSDMF.

ldoughty|1 year ago

I agree it probably isn't a great idea to publish it... I'm guessing some malicious actor could find a way to use this information to fiddle with the remaining benefits their family might be receiving...

That said, the _typical_ things an SSN is used for would not be terribly useful for someone that's been dead >2-4 years... Automated checks should flag e.g. credit applications as being for a dead person :-)

patwolf|1 year ago

I previously tried getting military records for my deceased grandfather. From what I can recall, it was complicated by the fact that I wasn't "next of kin", which would limit the data I'd have access to. Even my parent, who was next of kin, would have needed additional paperwork as proof, e.g. a death/marriage certificate for my grandmother.

On one hand, if this works then I'll be happy to have the information I otherwise wouldn't have. But on the other hand, all these processes, no matter how convoluted, exist for a reason. It feels weird bypassing those.

Asparagirl|1 year ago

The process for getting these very particular records (C-Files, as opposed to something like an OMPF or other better known military records) has been horrendously broken for years. They have been almost completely inaccessible from this specific agency (the VBA, inside the VA) for their entire existence. Only 5% of the files have been turned over to NARA, even for records that are very old.

And even now, the “processes” to get the records, as defined by a 58+-year-old law (FOIA) are not really being followed. An agency refusing to process any FOIA requests except by fax (!) is insane, in this day and age. But more specifically, it’s against the law. A letter AND an e-mail are supposed to work. Hence our use of a fax API on this website…

Furthermore, the “requirement” that a FOIA requester must hand-sign the paperwork is absolutely made up by this agency. Hence our signature widget on this website…

Point being, if they’re going to shamelessly ignore or misinterpret the federal law, we are going to just jump through those hoops and say no, we want the files, please do your jobs.

Suppafly|1 year ago

>From what I can recall, it was complicated by the fact that I wasn't "next of kin", which would limit the data I'd have access to.

My state has a process for claiming unclaimed funds that banks and such report to the government and that is what is keeping me from claiming some funds my grandmother has listed on the site. It's not even clear to me what constitutes 'next of kin' legally, presumably it'd be one of her kids, but it's not like we have laws designating the oldest male heir and then on down the list.

mattw2121|1 year ago

As someone trying to piece together family history, after most of my family has died, I really appreciate this. Any and all efforts to make records available helps with clues. Building an accurate family history is a process of "one more document". This effort is definitely helpful to me. I've already utilized your service to submit a request for my grandfather's records. I'll be spending time searching for other relatives as well. Thanks!

necovek|1 year ago

I believe you should work to limit exposure of sensitive information like SSN: while it's ok to allow search by an exact SSN, you should probably not display it unless the requestor already knows what it is.

OTOH, if you have really successfully worked to make this database public domain and do publish it somewhere (and you did, as I can see at https://archive.org/details/BIRLS_database), this wouldn't be of much help against any malicious actors out there.

But really, it seems the burden is on VA if there are non-deceased persons in the database since they have done a bad job of maintaining the data, and they would be liable for any leakage of information (unless Reclaim the Records was aware of any in particular). Even so, RTR might have put themselves out on the fence for some lawsuits against them too.

Asparagirl|1 year ago

The VA worked to confirm that everyone in this dataset is deceased, in order to satisfy the judge’s order, and produced an internal document about how they did it — which we then FOIAed and posted online too. (It’s up on the site, next to the legal paperwork.) The veterans and their SSNs are believed to have been deceased prior to mid-2020, checked by the VA’s internal datasets as well as public data sets such as the SSDMF. And SSNs of deceased people are *not private*, since they are never reused. The Social Security Administration also makes copies of all deceased peoples’ original SS-5 applications available to the public under FOIA.

fergbrain|1 year ago

Thanks for your efforts in liberating this data so that Ancestry.com isn’t the only one with it!

Reminds me a bit of muckrock.com as well.

Asparagirl|1 year ago

We love MuckRock! And we made the original FOIA request to the VA for this dataset via MuckRock’s platform. You can see the actual screenshot in the “Reclaiming These Records” legal papers section. They also get a shoutout in our colophon for indirectly inspiring the FOIA-by-fax-via-web method, although I believe their site uses e-mail, including interfacing with agency FOIA portals when possible.

ungreased0675|1 year ago

This site feels icky and I’m not quite sure how to articulate why. What is the purpose of this service? Why is it good for the public to have access to detailed records of individual, recently deceased veterans? Isn’t this a gold mine for scammers? Is this project LDS affiliated?

tivert|1 year ago

> This site feels icky and I’m not quite sure how to articulate why. What is the purpose of this service?

Sounds like genealogy, and a small fraction of the documents in a veteran's file would probably be very helpful in fleshing out some basic details of their military service (especially given a fire destroyed many of the original copies of those documents).

The actual medical records part seems inappropriate, though.

> Why is it good for the public to have access to detailed records of individual, recently deceased veterans? Isn’t this a gold mine for scammers? Is this project LDS affiliated?

It seems like it's a gap in FOIA. These records should be available, just not to everyone in the whole world (at least not before, say, 60 years after the veteran's death). It seems legitimate that an appropriately-close family member should be able to request them (similar to restrictions in requesting birth certificates).

archerjax|1 year ago

My father, grandfathers and all my uncles are here. All deceased. I’m failing to find a use case for social security numbers to be present here. Or any of it to be honest.

hoppyhoppy2|1 year ago

Dead people's SSNs are published on a regular basis by the Social Security Administration (the "Death Master File" or "Social Security Death Index"). Once you're dead it's not really private information.

I agree with you re: living people's SSNs, though.

Bjartr|1 year ago

Spent a few minutes trying to find an answer to "why do this?" beyond just implying that it should be done, and the most I was able to find was one sentence buried amongst paragraphs and paragraphs of "what" and "how".

> these materials were largely unknown and inaccessible to historians, journalists, and genealogists

I think it would be worthwhile to lead with that and include a little more detail too.

If there isn't a clear motivation, people will assume the worst.

draftsman|1 year ago

I think it’s critically important to mention that the VA provided all this data to Ancestry.com years ago. According to the newsletter OP linked, Ancestry.com charges $300/year for access to this data. This unfairness is what prompted the lawsuit and ultimate release of data.

Suppafly|1 year ago

I think it's pretty obvious why this material should be available.

>If there isn't a clear motivation, people will assume the worst.

This is just a weird assumption.

neilv|1 year ago

This project might be entirely well-intentioned, but some possibilities to be careful of, with this kind of effort:

* Intent is to sell the data, or otherwise "monetize" it, in the techbro sense.

* "Shell" effort of a specific company that wants the data.

* Shell effort of an organized crime group.

* Shell effort of a foreign intelligence agency, or terrorist group.

A while ago, there was a different project, which had the effect of making different US records, which were already reasonably accessible to US citizens and journalists, easily available to foreign adversaries, such as for espionage profiling and blackmail. When that project was promoted on HN, I caught the promoter seeming to use a sockpuppet account in the comments (accidentally using the wrong account to respond to themself), which I found additionally suspicious.

Even when a project is fully honest and with good intentions, we also have to consider the risks of likely other consumers of the data, which include all the possibilities above.

redeux|1 year ago

As a veteran, I am aghast that this exists. What an invasion of people’s lives. To me this is far worse than when I was notified that a foreign adversary had stolen my military records, because at least they’re not publishing it on the web for all to see.

Veterans aren’t politicians, and they don’t deserve to have their lives put on display like this without their permission. Some vets signed up because they wanted to serve their country, some because they were running from people or poverty, but they were all just ordinary people trying to eke out a living.

I believe people, good people just trying to do their thing will be hurt by this information and that’s unfair. It’s just another example of people using veterans as pawns to achieve their ends.

What are the ends in this case? I couldn’t tell you. I do believe this will have a chilling effect on veterans seeking help from the VA at a time when they need it more and more.

Towaway69|1 year ago

As a person, not a Veteran, I totally agree. This data should not be public domain. Definitely only accessible to family members (if at all).

Having obtained my father’s military records, I can definitely say that I’m glad these weren’t online, searchable. Via those records I learnt much concerning my father and I’m glad I obtained them.

To get those records, I had to prove I was his son, that my father was indeed deceased and that he hadn’t said/written anything to prevent me from having his records. That should be the minimum (IMHO) for obtaining such records. This wasn’t in the USA though.

jsjohnst|1 year ago

I’m also a veteran and immediately had the same thought. Thank you for thoughtfully replying, I probably would’ve been more aggressively toned had I not read your reply first.

_DeadFred_|1 year ago

If you give the government tools for hiding information, the government will abuse those tools. This is what freedom looks like. I have family members in this dataset, and they would be fine with it because they would pick freedom and this is what freedom has to look like.

tompagenet2|1 year ago

I think the people in this are deceased? Does that change your view (genuinely asking)?

flippyhead|1 year ago

I love the included legal timeline. Strong work!

tivert|1 year ago

Somewhat related: there was a fire in 1973 that destroyed the military records of a large fraction of former military personnel at the time: https://en.wikipedia.org/wiki/National_Personnel_Records_Cen....

Asparagirl|1 year ago

Yes, a terrible fire, although there are efforts ongoing to restore some of those files, even reading the data from charred papers and edges with newer technology.

However, these particular files (benefits claims files, or C-Files) are a different type of file and never burned. Better yet, they often have some parts of the veteran’s OMPF that were copied *into* the C-File, to establish eligibility for those benefits — copies that were made before the fire! In other words, these files could serve as partial backups…

greentxt|1 year ago

Really hope op and crew are sued to oblivion.

Asparagirl|1 year ago

Actually, “op and crew” were the Plaintiffs (well, really the Petitioners, to be pedantic) who already sued the VA for this database in federal court (SDNY), and won that multi-year lawsuit, and even won our attorneys’ fees too. If you had checked our website, you’d see we even posted the court papers online for free, from both sides, and the judge’s order in our favor, of course.

WarOnPrivacy|1 year ago

> Really hope op and crew are sued to oblivion.

Might a less vitriolic response be more productive here? Given the altruism that leads this endeavor, strong negativity might best be delayed until a more holistic understanding is gained.

For example, I've been accessing these same records thru Ancestry for some time - along with millions of other Ancestry users. If these records were a realistic vector for actual meaningful harm, the evidence should have manifested some time ago.

Broadly speaking, if we want privacy efforts to help in tangible ways, it's important to limit restrictions to where they do provable good. The alternative is restrictions that apply only to us - and not those with motivation (financial, power) to use private data for their own ends.

bovermyer|1 year ago

Why? I'm curious. What's wrong with making information on deceased service members searchable?

tantalor|1 year ago

If and when the VA locates the file, scans it, and redacts it, you should expect to receive a DVD in the mail containing the newly-digitized images of the materials several weeks or months after your initial acknowledgement letter has arrived.

Ha, the only device I have with DVD is a PS5, this should be fun.

asacrowflies|1 year ago

I need to get a copy of the database asap just for my data archiving neurosis lol. The amount of vitriolic comments here from entitled "service" members is astounding and makes me wish to safeguard it personally. I mean we have people in the comments calling people "extremist" and bringing up founding fathers bullshit ....

Only jarheads seem to think the parental tone of "you don't know what freedom is" actually works.... Maybe because they have been thru boot camp idk.

tivert|1 year ago

> The amount of vitriolic comments here from entitled "service" members is astounding and makes me wish to safeguard it personally.

As someone who's never been in the military and isn't even acquainted with that many people who have been, I think I should give you a heads-up that you are being an entitled asshole.

I think most people would be mad if they found out some of the most private and personal records about them (https://news.ycombinator.com/item?id=42685002) would be made public to anyone who'd care to request them. It's a pretty terrible violation of privacy. Try to think about how you'd feel if something like that was going to happen to you (say, your complete browsing history would be made public, because I'm sure you have one and you almost certainly want to keep it private).