top | item 43468618

jandinter | 11 months ago

I maintain a repository with all German legal acts which is up to date:

https://github.com/jandinter/gesetze-im-internet

I scrape the official website (https://www.gesetze-im-internet.de) once a week. The repository contains the "official" XML files. Unfortunately, their format is focused more on presentation than on the logical structure of the legal acts (see the DTD: https://www.gesetze-im-internet.de/dtd/1.01/gii-norm.dtd).

Some time ago, someone from the digital service of Germany reached out and asked about my use case. Maybe there will be an official version of a "Git law" repo someday...

Bewelge|11 months ago

Very cool! I came across your project last year while building https://digebu.de .

I wanted to build an "IDE-inspired" law reader. It has selection highlighting, and you can open references within the same window. It scrapes gesetze-im-internet.de daily, converts the XML to JSON and builds static HTML pages, hosted on GitHub Pages. The entire build process for the 6000+ pages takes 5-10 minutes. It uses less than 20% of the Actions minutes that come with GitHub Pro.
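For anyone curious what the XML-to-JSON step could look like, here is a minimal sketch using only the Python standard library. It is not the actual digebu.de pipeline. The function names and the sample document are made up for illustration, and the sample does not follow the real gii-norm DTD.

```python
import json
import xml.etree.ElementTree as ET


def element_to_dict(elem):
    """Convert an element into a plain dict with tag, attributes, text and children."""
    node = {"tag": elem.tag}
    if elem.attrib:
        node["attributes"] = dict(elem.attrib)
    text = (elem.text or "").strip()
    if text:
        node["text"] = text
    # Text that follows the closing tag (mixed content) is kept as "tail".
    tail = (elem.tail or "").strip()
    if tail:
        node["tail"] = tail
    children = [element_to_dict(child) for child in elem]
    if children:
        node["children"] = children
    return node


def xml_to_json(xml_string):
    """Parse an XML string and return a pretty-printed JSON string."""
    root = ET.fromstring(xml_string)
    return json.dumps(element_to_dict(root), ensure_ascii=False, indent=2)


# Hypothetical sample, NOT the real gii-norm structure.
SAMPLE = '<law><title>Example Act</title><section id="1">First paragraph.</section></law>'

if __name__ == "__main__":
    print(xml_to_json(SAMPLE))
```

A generic converter like this keeps every element, so a real reader would probably map the DTD's elements to a more specific structure before rendering HTML.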

It was a really fun rabbit hole to go down.

What I found most fascinating is that there doesn't seem to be an official version of the German law. The state just publishes official announcements like "Law X will be changed as follows", "Law X will be removed" or "Law X will be added". So the official version of the German law really is something akin to a git tree. AFAIK, all consolidated versions are created by private entities.

I did a test by picking a law at random, finding the first time it was published and then applying all the changes from subsequent years. Turns out all available versions (gesetze-im-internet, dejure.org, buzer.de) had at least a couple of small mistakes. I found that quite fascinating (and a little scary).

It's also funny how often laws are referenced that don't even exist anymore. The collection of laws really is as tidy as you would imagine an 80-year-old system, where the maintainers change every 5 years, to be.

NoMoreNicksLeft|11 months ago

Has git ever made the necessary updates so that you can put proper datestamps on the 80-year-old laws? Last I checked, nothing prior to the Unix epoch could be put into git.

hulium|11 months ago

> Turns out all available versions (gesetze-im-internet, dejure.org, buzer.de) had at least a couple of small mistakes.

Can you say more about what these small mistakes were? Would they affect the interpretation of the law?

couscouspie|11 months ago

What did you tell him about your use case?

I'm asking because I don't agree with the underlying assumption that a use case is needed. I consider the value of transparency and public information for a democratic society to be self-evident.

jraph|11 months ago

The question might not have been about transparency, but more about the choice of publishing it as a git repository, or whether there are actual tools based on the git repository. Arguably, the git repository is unusable for the majority of people, so it cannot be an answer to transparency in itself, though some user-friendly tools based on it might be.

I'm also interested in the response btw :-)

jandinter|11 months ago

I just want to archive the "official" XML files since the "official" website does not provide an archive. For that reason, I also don't change the XML files: The spec is available and everyone can build their own transform (to JSON, XML, whatever) based on their particular needs.

nicbou|11 months ago

They are working on an official API to replace Gesetze im Internet. It should be out in the next few weeks according to its developer.

tapia|11 months ago

Nice work. Maybe you could do some preprocessing of the XML data, so that you actually have a diff of the content and not the whole XML block.

jandinter|11 months ago

I thought about it, but decided against pre-processing: The repo is meant to be an archive, and the XML spec can be looked up. If I were to introduce a new structure by pre-processing the files, I think that might be a plus for reading, but not for archiving. Whoever has a concrete use case (the "Digebu" website above looks great!) can write their own pre-processor for that use case.
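As a rough illustration of such a pre-processor, the sketch below strips the markup from two versions of a file and diffs only the text content, which is roughly what the suggestion above asks for. It is an assumption about how one might do it, not something the repository provides. The function names and the two sample documents are hypothetical and do not follow the gii-norm DTD.

```python
import difflib
import xml.etree.ElementTree as ET


def text_lines(xml_string):
    """Return the non-empty text fragments of an XML document, one per list entry."""
    root = ET.fromstring(xml_string)
    return [frag.strip() for frag in root.itertext() if frag.strip()]


def content_diff(old_xml, new_xml):
    """Unified diff of the text content only; markup and attributes are ignored."""
    return list(
        difflib.unified_diff(
            text_lines(old_xml),
            text_lines(new_xml),
            fromfile="old",
            tofile="new",
            lineterm="",
        )
    )


# Hypothetical sample versions of the same document.
OLD = '<law><section id="1"><p>The fee is 10 euros.</p><p>It is due monthly.</p></section></law>'
NEW = '<law><section id="1"><p class="x">The fee is 12 euros.</p><p>It is due monthly.</p></section></law>'

if __name__ == "__main__":
    print("\n".join(content_diff(OLD, NEW)))
```

Because the archived XML stays untouched, a tool like this can run on top of any two commits of the repository without changing what is stored.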