Loading Chase.com transactions into Ledger
20 Nov 2016
When trying to migrate away from Mint to Ledger, one of the major obstacles is the work of manually entering every 3 dollar coffee shop transaction. Ledger does provide some help in this area through ledger xact, yet as of today we enter 25 transactions per week into Ledger, and a large chunk of them comes from our Chase credit cards. Automation seems to be the easiest way to make this a lot less painful.
Headless Selenium to the rescue.
Selenium commonly runs by driving an actual, visible browser. This, however, makes windows pop up on my machine, which I don’t care for, so early on I migrated to a headless implementation, in this case PhantomJS. This makes development a little trickier because you can’t observe what Selenium does and where it gets stuck, but screenshotting mostly alleviates this problem.
The language of choice for this was Ruby, for no other reason than me being pretty familiar with it and it offering Selenium bindings as well as a reasonable amount of example code. Curiously, Chase seems to require the user-agent to look non-automated, so this is the first hurdle to take care of while automating.
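A minimal sketch of how such a driver could be set up. It assumes a selenium-webdriver release from that era that still ships PhantomJS support, and a phantomjs binary on the PATH. The helper names and the example user-agent string are illustrative, not taken from the original script.

```ruby
# Example user-agent only; substitute the string your own browser sends.
USER_AGENT = "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 " \
             "(KHTML, like Gecko) Chrome/54.0.2840.71 Safari/537.36"

# PhantomJS page settings are passed as capabilities under this key prefix.
def phantomjs_settings(user_agent)
  { "phantomjs.page.settings.userAgent" => user_agent }
end

# Builds a headless driver (hypothetical helper). The gem is required lazily
# so that the settings helper above can be used without Selenium installed.
# Assumption: a selenium-webdriver version that still bundles the PhantomJS
# driver; later releases removed it.
def build_driver(user_agent = USER_AGENT)
  require "selenium-webdriver"
  caps = Selenium::WebDriver::Remote::Capabilities.phantomjs(
    phantomjs_settings(user_agent)
  )
  Selenium::WebDriver.for(:phantomjs, desired_capabilities: caps)
end
```

Calling `build_driver` returns a driver whose requests carry the overridden user-agent instead of the PhantomJS default.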
I mostly copied the user-agent string from my actual browser and that seemed to be enough. Since PhantomJS is a full browser, it takes care of all cookie handling and other features that might be required to get the Chase website to work.
Login & Frames :-(
What’s not trivial is realizing that the Chase website actually uses frames. Frames require special handling in Selenium: elements inside a frame can’t be found until the frame around them has been entered.
For Chase, we first navigate to a URL displaying the login box.
Now we enter the frame the login inputs are located in and fill them.
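A rough sketch of these two steps, written against any Selenium driver object. The login URL is passed in by the caller, and every element id below is a hypothetical placeholder, since the real ones are not given here.

```ruby
# Hypothetical locators; the real frame and field ids must be looked up
# on the actual login page.
LOGIN_FRAME_ID    = "login-frame"
USER_FIELD_ID     = "username"
PASSWORD_FIELD_ID = "password"
SUBMIT_BUTTON_ID  = "submit"

# Navigate to the page that displays the login box.
def open_login_page(driver, login_url)
  driver.navigate.to(login_url)
end

# Enter the login frame, fill the form, optionally screenshot, submit,
# then return to the top-level document.
def login(driver, username, password, debug: false)
  frame = driver.find_element(id: LOGIN_FRAME_ID)
  driver.switch_to.frame(frame)
  driver.find_element(id: USER_FIELD_ID).send_keys(username)
  driver.find_element(id: PASSWORD_FIELD_ID).send_keys(password)
  driver.save_screenshot("login.png") if debug
  driver.find_element(id: SUBMIT_BUTTON_ID).click
  driver.switch_to.default_content
end
```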
The code above enters the frame, fills the form, takes a screenshot if in debug mode and then logs us in. The last line exits the frame again as the remainder of the interaction happens outside of frames.
After successfully submitting the login, Chase may ask for a second authentication factor. The code on GitHub handles this case by asking for the OTP token that Chase emails to the address stored in their database; here I’ll omit it for brevity.
Account discovery and CSV download.
Next we have to wait until the accounts overview is loaded, extract all account numbers and download a CSV file with the latest transactions.
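The waiting and discovery part could look roughly like the sketch below. The CSS selector and attribute name are hypothetical stand-ins for whatever the accounts page really exposes, and `wait_until` is a small hand-rolled polling helper rather than part of Selenium. The CSV download itself is left out because it depends on page details not described here.

```ruby
# Hypothetical locator and attribute for the account entries.
ACCOUNT_SELECTOR     = ".account-tile"
ACCOUNT_ID_ATTRIBUTE = "data-account-id"

# Poll the block until it returns a truthy value or the timeout expires.
def wait_until(timeout: 30, interval: 0.5)
  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + timeout
  loop do
    result = yield
    return result if result
    if Process.clock_gettime(Process::CLOCK_MONOTONIC) >= deadline
      raise "timed out after #{timeout}s"
    end
    sleep interval
  end
end

# Wait for the accounts overview, then collect the distinct account ids.
def discover_accounts(driver, timeout: 30)
  tiles = wait_until(timeout: timeout) do
    found = driver.find_elements(css: ACCOUNT_SELECTOR)
    found unless found.empty?
  end
  tiles.map { |tile| tile.attribute(ACCOUNT_ID_ATTRIBUTE) }.compact.uniq
end
```

If the overview never appears, `wait_until` raises, which is a good moment to take a screenshot and see what the page is showing instead.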
This is pretty much it. Now we have a CSV file with all transactions for every account accessible through the supplied Chase login. This can easily be translated to Ledger, either through its import methods or through Reckon.
I use the following Reckon command to convert Chase CSV to Ledger:
reckon --ignore-columns 1,2 -l <Ledger file> --contains-header --unattended --account Liabilities:Chase -f <CSV file>
and this works quite well for me.
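To run that conversion for every downloaded file, the command can be wrapped in a few lines of Ruby. This is only a sketch: the helper names are made up for illustration, and it assumes the reckon executable is on the PATH. The flags are exactly the ones from the command above.

```ruby
# Build the reckon invocation as an argument list, avoiding shell quoting.
def reckon_command(csv_file, ledger_file, account: "Liabilities:Chase")
  [
    "reckon",
    "--ignore-columns", "1,2",
    "-l", ledger_file,
    "--contains-header",
    "--unattended",
    "--account", account,
    "-f", csv_file
  ]
end

# Run reckon once per CSV file and return the files that failed to convert.
def import_csvs(csv_files, ledger_file)
  csv_files.reject do |csv|
    system(*reckon_command(csv, ledger_file))
  end
end
```

Passing the arguments as a list means file names with spaces need no extra escaping.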
How often does it break?
In the last 4 months, I did not have to change any of the code. I did, however, once have to enable screenshotting to notice that Chase was serving a notice which had to be acknowledged before the accounts page. After manually doing that, the script worked again without modification. I have scripts for 2 other banks and I have not modified any of them. Bank websites don’t seem to change so much, it turns out.
The code for all this is here.