

duairc | 7 years ago

> Writing my own docx parser? Sure, that will be one mythical man century of work.

I'm not sure that's actually a great example. I had a project a while ago where I needed to extract certain information from Word and Excel files, and it was less work to just write my own parser (it's just XML in a ZIP file) that got exactly the information I needed than to figure out all the complexity of using a full-blown docx/xlsx parser. It ended up being 100 lines of Haskell, and half of that was imports.

https://gist.githubusercontent.com/duairc/db3e99a7808668e84e...

Edit: The docx part of it is only 10 lines of code.
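
For readers who don't want to open the gist: the "XML in a ZIP file" approach can be sketched roughly as follows. This is not the Haskell code linked above, just a minimal Python illustration using only the standard library. It assumes the main document part lives at `word/document.xml` and that all you want is the plain text of each paragraph. The helper names `docx_paragraphs` and `make_sample_docx` are made up for this example.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML main namespace, used by the elements in word/document.xml.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def docx_paragraphs(source):
    """Return the plain text of each paragraph (w:p) in a .docx file.

    `source` may be a file path or a binary file-like object.
    Hypothetical helper: it ignores styles, tables, headers, footnotes, etc.
    """
    with zipfile.ZipFile(source) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))

    paragraphs = []
    for p in root.iter(f"{{{W_NS}}}p"):
        # A paragraph's text is split across runs; join every w:t inside it.
        text = "".join(t.text or "" for t in p.iter(f"{{{W_NS}}}t"))
        paragraphs.append(text)
    return paragraphs


def make_sample_docx(texts):
    """Build a tiny in-memory archive containing only word/document.xml.

    For demonstration only: Word would not accept this as a complete .docx,
    but it is enough to exercise docx_paragraphs.
    """
    body = "".join(f"<w:p><w:r><w:t>{t}</w:t></w:r></w:p>" for t in texts)
    xml = f'<w:document xmlns:w="{W_NS}"><w:body>{body}</w:body></w:document>'
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as archive:
        archive.writestr("word/document.xml", xml)
    buf.seek(0)
    return buf


if __name__ == "__main__":
    for line in docx_paragraphs(make_sample_docx(["Hello", "World"])):
        print(line)
```

The point is the same one made above: if paragraph text is all you need, the whole job is "open the ZIP, parse one XML part, collect a couple of tags". Anything beyond that (nested text boxes, tracked changes, round-tripping edits) is where a full library starts to earn its keep.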


antt|7 years ago

There's a slight difference between extracting a few tags from an XML file and building a manipulable AST of it.

TeMPOraL|7 years ago

There is, but if your problem requires just the former, it's faster and better to build it yourself than to pull in a heavy third-party dependency (of which you'll use 1% anyway).

ai_ia|7 years ago

That's a pretty neat trick.