CaptainJack | 1 year ago

I've used beancount extensively and spent many hours on it a few years ago. I built importers that parse bank PDFs (in the UK, Plaid doesn't work, and I'd rather keep all the original statement PDFs anyway).

Probably built 10+ importers, plus some plugins to do automated transaction annotations.

I have not made any updates for many years now, because:

- Downloading statements is still a pain; I have to manually go through all the websites. Banks are bad at making statements available, and worse at making it possible to automate the download.
- The root of the issue is actually that beancount is too slow. Any change/update takes ages. Python is both a blessing (it makes it easy to add plugins, importers, etc.) and a curse (it is way slower than some other languages).

I believe the creator of beancount has started working on v3 with a mix of C++/python, relying on protobufs, a C++ core for parsing, etc. AFAIK, that is not production-ready yet.

chrislloyd|1 year ago

I have a very similar setup but with HLedger[1]. A "do-nothing"[2] script helps me download statements: it opens the bank websites, waits for the manual import, and finally checks balances. That makes it a lot less repetitive and error-prone. Or at least, I catch the errors faster.
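For anyone who hasn't seen the pattern, here is a rough sketch of what a do-nothing script along those lines could look like. This is not my actual script: the account list, the URL, and the idea of passing the ledger balances in as a dict are all assumptions made for illustration (in practice the ledger side would come from an HLedger report).

```python
import webbrowser
from decimal import Decimal

# Hypothetical account list: the name and URL are placeholders.
ACCOUNTS = [
    {"name": "Example Bank current account", "url": "https://bank.example/login"},
]


def balances_match(ledger_balance: Decimal, statement_balance: Decimal) -> bool:
    """Return True when the ledger agrees with the statement exactly."""
    return ledger_balance == statement_balance


def run(accounts, ledger_balances, ask=input, open_url=webbrowser.open):
    """Open each bank site, wait for the manual download, then check the balance.

    Returns the names of the accounts whose balances do not match.
    `ask` and `open_url` are injectable so the flow can be exercised
    without a terminal or a browser.
    """
    mismatches = []
    for account in accounts:
        open_url(account["url"])
        ask(f"Download the statement for {account['name']}, then press Enter: ")
        stated = Decimal(ask("Closing balance shown on the statement: "))
        if not balances_match(ledger_balances[account["name"]], stated):
            mismatches.append(account["name"])
    return mismatches
```

Nothing is automated at first; each manual step just becomes a prompt, and individual steps can be replaced with real automation later.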

I've found HLedger and Shake to be fast enough to process almost a decade of finances. Dmitry Astapov has an extremely well-produced tutorial workflow[3].

How have you managed the PDF parsing? Mine has become a bit of a mess dealing with slight variations in formatting as they change over time. I've been considering using LLMs but have been nervous about quality.

[1]: https://hledger.org
[2]: https://blog.danslimmon.com/2019/07/15/do-nothing-scripting-...
[3]: https://github.com/adept/full-fledged-hledger

Karrot_Kream|1 year ago

Why not spot-check your PDF LLM outputs? I always make sure my accounts balance by hand anyway, though occasionally it's really painful, especially if it's a missing Venmo transaction. It's rare that I need to comb through my accounts to account for some money, but when I do it's really time-consuming.

faustlast|1 year ago

I also use ledger/hledger to process a decade of finances. I reconcile once a year when doing taxes. I have multiple python scripts orchestrated with org-mode to generate reports/plots. I run them in separate processes since they are independent, which makes it fast enough (seconds).
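The separate-process part is the piece that buys the speed, and it could be sketched roughly like this. This is not the org-mode setup itself; the two commands are stand-ins that only print a label, where real ones would be the report and plot scripts.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def run_script(args):
    """Run one command in its own process and capture what it prints."""
    done = subprocess.run(args, capture_output=True, text=True)
    return args, done.returncode, done.stdout


def run_all(commands, max_workers=4):
    """Launch independent commands concurrently; results keep input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(run_script, commands))


# Stand-ins for real report scripts (hypothetical): each one just prints a label.
COMMANDS = [
    [sys.executable, "-c", "print('income report')"],
    [sys.executable, "-c", "print('expenses plot')"],
]
```

Threads are enough here because each one only waits on a child process; the actual work happens in the subprocesses, so the total time is roughly that of the slowest script.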

What is Shake?

FredPret|1 year ago

I suspect Python isn't the limiting factor here - it's the file format. You can end up with huge interconnected text files that have to be fully parsed on every change.

If you have 1e5 to 1e6 lines of transactions, I think a SQLite database would be a huge step forward. If you have much more than that, you probably need an ERP system.

Of course the text files make it ~easy to enter transactions, but maybe there's an elegant way to use those for ingestion only; that does make the system much more complicated to use. That might not be a problem for the kind of person using plain-text accounting over the course of years though.
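To make the "text for ingestion only" idea concrete, here is a minimal sketch, assuming a made-up one-line-per-posting format (`date | payee | account | amount currency`) rather than real beancount or ledger syntax. The table layout and function names are likewise just illustrative.

```python
import sqlite3
from decimal import Decimal

SCHEMA = """
CREATE TABLE IF NOT EXISTS postings (
    id INTEGER PRIMARY KEY,
    date TEXT NOT NULL,             -- ISO 8601, so string order is date order
    payee TEXT NOT NULL,
    account TEXT NOT NULL,
    amount_minor INTEGER NOT NULL,  -- integer pence/cents, no float rounding
    currency TEXT NOT NULL
);
CREATE INDEX IF NOT EXISTS postings_account_date ON postings (account, date);
"""


def parse_line(line):
    """Parse one line of the assumed 'date | payee | account | amount currency' format."""
    date, payee, account, amount = (part.strip() for part in line.split("|"))
    number, currency = amount.split()
    return date, payee, account, int(Decimal(number) * 100), currency


def ingest(conn, lines):
    """Create the table if needed and insert every non-blank line as a posting."""
    conn.executescript(SCHEMA)
    rows = [parse_line(line) for line in lines if line.strip()]
    with conn:  # commit on success, roll back on error
        conn.executemany(
            "INSERT INTO postings (date, payee, account, amount_minor, currency) "
            "VALUES (?, ?, ?, ?, ?)",
            rows,
        )


def balance(conn, account):
    """Return the account balance as a Decimal in major units."""
    (total,) = conn.execute(
        "SELECT COALESCE(SUM(amount_minor), 0) FROM postings WHERE account = ?",
        (account,),
    ).fetchone()
    return Decimal(total) / 100
```

With the index in place, a balance query touches only the rows for one account instead of re-parsing every file, which is the whole point of moving the data out of text after ingestion.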

mtlynch|1 year ago

v3 is out now and v2 is officially deprecated:

https://groups.google.com/g/beancount/c/iTdRuvZnE4E

I found the migration pretty confusing and haven't found good documentation on how to go from v2 to v3.

The best I've found is this unofficial write-up from an experienced Beancount user:

https://sgoel.dev/posts/moving-from-beancount-2x-to-3x/

CER10TY|1 year ago

As far as I can tell this is without the planned C++ rewrite though, and the documentation at https://beancount.github.io/ still says to use v2.

Is there a point in migrating already?

diftraku|1 year ago

I'd be really curious about how hard programmatic access to your own personal banking data might be in the PSD2 era.

I can link my secondary bank account to my main bank's app so I can see the balance in one place, but the catch is that I need to refresh this authorization through the app every 90 days.

Ideally, you'd just use your banking credentials to authorise the API access and pull data through that. What this requires in practice, I have no idea, but it probably involves a bit of bureaucracy.

Nextgrid|1 year ago

Some modern banks (Monzo, Starling, etc) give the account holder (read-only) access to their API.

If yours doesn't, you can try using one of the open banking providers such as TrueLayer, Plaid, Nordigen (seems to have been acquired by GoCardless: https://gocardless.com/bank-account-data/), etc. Most have a free/dev tier that nevertheless allows connections to real accounts and might be enough for personal use.

Finally, screen-scraping is potentially an option. One of the few benefits of shifting everything to SPAs is that you generally have clean JSON APIs under the hood that are easier to interface with than "conventional" screen-scraping involving parsing HTML.

jazzyjackson|1 year ago

I ran into this annoyance recently while setting up new accounting software: the access my bank provides covers the last 6 months only, so I still had to go and export a CSV, then rejigger the column names and date format, in order to reimport the first 8 months of 2024.
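The rejiggering step is small enough to script. A sketch of the idea, where the bank's header names, the target header names, and the day/month/year input format are all assumptions for the example:

```python
import csv
import io
from datetime import datetime

# Hypothetical mapping from the bank's headers to the ones the software expects.
COLUMN_MAP = {"Transaction Date": "date", "Description": "payee", "Amount": "amount"}


def convert(source, dest, in_format="%d/%m/%Y", out_format="%Y-%m-%d"):
    """Copy rows from source to dest, renaming columns and rewriting dates."""
    reader = csv.DictReader(source)
    writer = csv.DictWriter(
        dest, fieldnames=list(COLUMN_MAP.values()), lineterminator="\n"
    )
    writer.writeheader()
    for row in reader:
        out = {new: row[old].strip() for old, new in COLUMN_MAP.items()}
        out["date"] = datetime.strptime(out["date"], in_format).strftime(out_format)
        writer.writerow(out)


def convert_text(text, **kwargs):
    """Convenience wrapper that converts CSV text held in a string."""
    source, dest = io.StringIO(text), io.StringIO()
    convert(source, dest, **kwargs)
    return dest.getvalue()
```

`strptime` raises on a date that doesn't fit the expected format, which is useful here: a silent day/month swap is worse than a crash.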

My thought for tracking new transactions without a third party is to just set up email alerts, so I get a notification on every charge, deposit, etc., and then set up a cron job to read new emails and update my books.
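The parsing half of that cron job might look something like the sketch below. The alert wording ("A charge of GBP 4.50 at Coffee Shop.") is invented, since every bank phrases these differently, and the code assumes a single-part plain-text email. Fetching the messages (IMAP or otherwise) is left out.

```python
import re
from decimal import Decimal
from email import message_from_string
from email.utils import parsedate_to_datetime

# Assumed alert wording; a real bank's emails will differ.
ALERT = re.compile(
    r"(?P<kind>charge|deposit) of (?P<currency>[A-Z]{3}) "
    r"(?P<amount>\d+\.\d{2}) at (?P<payee>.+?)\.?$",
    re.IGNORECASE | re.MULTILINE,
)


def parse_alert(raw_email):
    """Return a transaction dict for an alert email, or None if nothing matches."""
    message = message_from_string(raw_email)
    body = message.get_payload()  # a str for single-part messages
    match = ALERT.search(body)
    if match is None:
        return None
    amount = Decimal(match["amount"])
    if match["kind"].lower() == "charge":
        amount = -amount  # charges reduce the balance
    return {
        "date": parsedate_to_datetime(message["Date"]).date().isoformat(),
        "payee": match["payee"].strip(),
        "amount": amount,
        "currency": match["currency"],
    }
```

Returning `None` for unrecognised emails rather than guessing means a change in the bank's wording shows up as missing transactions at the next balance check, not as wrong ones.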

BeetleB|1 year ago

> Downloading statements is still a pain, have to manually go through all websites.

Have you considered using Playwright?

I used aider[0] recently to log into my work's payslips and download all the relevant payslips into JSON format (with values encrypted). It took about 3 hours, but that's mostly because of my lack of knowledge of good CSS selectors.

jxjnskkzxxhx|1 year ago

Banks in the UK allow you to export transactions in many formats. Log in, pick a time range, download in OFX format. Why is this a pain?

erikerikson|1 year ago

It makes it about them, not about you. I don't care which banks and other financial providers I use. I care about managing my funds in a way that is efficient and healthy for my life. The banks I use are simply service providers, a subclass of service providers across all the dimensions of my life. They have regulations they must abide by, but in so doing they attempt to force me to think and act in those terms, and I think they're poor.

cranky908canuck|1 year ago

"banks allowing export of transactions" is only the start.

I deal with two banks for credit cards.

One (call it "Blue Bank") allows me to download a statement. I filter out a couple of things (payments mostly), check that it matches the paper statement balance, and post it. About 15 minutes start to finish.

The other (call it "Orange Bank") allows me to download a "statement". I filter out a couple of things (payments mostly), check my previous month's transactions to see which ones at the beginning of the file actually go in the current billing period (not already paid), stare at the last transactions to see which ones actually were posted to the current billing period (not after the cutoff), run the script to check the total (nope, doesn't match) then do that a couple of times until it matches. The time they changed the meaning of the "credit" column from "just confirming this is a credit" to "it's a credit, you need to flip the sign" it was 45 minutes.

But hey, it's all CSV!
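For the curious, the filter-and-total check, including the "credit column changed meaning" problem, can be sketched as below. The column names (`description`, `amount`, `credit`), the `Y`/`N` flag, and the "PAYMENT" prefix are made up for the example and are not what either bank actually exports.

```python
import csv
import io
from decimal import Decimal


def signed_amount(row, credit_means_flip):
    """Return the amount with charges positive and credits negative."""
    amount = Decimal(row["amount"])
    is_credit = row["credit"].strip().upper() == "Y"
    if is_credit and credit_means_flip:
        return -amount  # export lists credits as positive numbers
    return amount  # export already carries the sign


def statement_total(csv_text, credit_means_flip, skip_prefixes=("PAYMENT",)):
    """Sum a billing period, leaving out rows such as card payments."""
    total = Decimal("0")
    for row in csv.DictReader(io.StringIO(csv_text)):
        if row["description"].upper().startswith(skip_prefixes):
            continue
        total += signed_amount(row, credit_means_flip)
    return total
```

Keeping the sign convention as an explicit parameter means a format change is a one-flag fix once spotted, and comparing `statement_total` against the paper statement balance is what spots it.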

BeetleB|1 year ago

Multiple bank accounts and multiple credit cards. Also, figuring out the time range for each bank.