The problem is data gets into these mega-excels through all sorts of funky routes... and I really do mean funky :)
1. PowerQuery: this is defined statically in the notebook so is detectable by Pyoneer. But I don't know a ton about the integration in Python here. I imagine this is doable.
2. Manual data entry: Pyoneer can't detect this from a static Excel file, really - what's the difference between the static Excel sheet and data updates ever time? Oftentimes, users with a lot of manual data entry to "automate this in Python" by turning an Excel file into like a form. Generating a proper web app out of the Excel file would be pretty sweet!
3. Database output copied - aka, copy in a table. This one is sometimes pretty crazy - I've seen Excel workbooks that have SQL queries just copied and pasted into a random cell in the notebook, so you can copy that and run it on some archaic SQL server. And then copy the output back in...
4. Macros: runs an API call, or an SQL query, or pulls (and then formats) data from another Excel sheet. Then put it in the right place. This then requires translating Macros - which are a whole programming language of their own. This is actually pretty high-priority for us right now, based on early feedback from developers who are in the thick of it with big Excel files.
6. Custom plugins. Big finance shops build/buy plugins that pull in data all the time! We haven't really started investigating how to handle these.
5. Other workbooks: at large banks, there's an additional dependency graph of workbooks that rely on eachother across the org. It's epic. There's a single workbook that defines all market holidays, that's used for all excel files that do performance reporting. And then these performance reports feed into other Excel's (by way of direct references, but also by way of copy and pasting, but also by way of uploading/downloading through a database). Support multiple Excel files at once is something we'll have to tackle eventually!
So... there's a lot to do here. We're really early - so we're focused on two primary things right now:
1. Solving the most pressing pain points first. Hence the early launch so we can talk to more folks and prioritize better. I've got a reasonable idea since I've done so much of this work myself, but every finance shop does things different...
2. Leaving good TODOs when we can't translate something. Currently, we can't translate pivot tables or complex formulas -- but we generate TODOs for these so you can go back and fill them in with the Python skills you have (and maybe the help of ChatGPT).
We're aiming to just give you a Python script. So if we don't translate the data pull how you want... you can just edit the notebook :)
narush|1 year ago
The problem is data gets into these mega-excels through all sorts of funky routes... and I really do mean funky :)
1. PowerQuery: this is defined statically in the notebook so is detectable by Pyoneer. But I don't know a ton about the integration in Python here. I imagine this is doable.
2. Manual data entry: Pyoneer can't detect this from a static Excel file, really - what's the difference between the static Excel sheet and data updates ever time? Oftentimes, users with a lot of manual data entry to "automate this in Python" by turning an Excel file into like a form. Generating a proper web app out of the Excel file would be pretty sweet!
3. Database output copied - aka, copy in a table. This one is sometimes pretty crazy - I've seen Excel workbooks that have SQL queries just copied and pasted into a random cell in the notebook, so you can copy that and run it on some archaic SQL server. And then copy the output back in...
4. Macros: runs an API call, or an SQL query, or pulls (and then formats) data from another Excel sheet. Then put it in the right place. This then requires translating Macros - which are a whole programming language of their own. This is actually pretty high-priority for us right now, based on early feedback from developers who are in the thick of it with big Excel files.
6. Custom plugins. Big finance shops build/buy plugins that pull in data all the time! We haven't really started investigating how to handle these.
5. Other workbooks: at large banks, there's an additional dependency graph of workbooks that rely on eachother across the org. It's epic. There's a single workbook that defines all market holidays, that's used for all excel files that do performance reporting. And then these performance reports feed into other Excel's (by way of direct references, but also by way of copy and pasting, but also by way of uploading/downloading through a database). Support multiple Excel files at once is something we'll have to tackle eventually!
So... there's a lot to do here. We're really early - so we're focused on two primary things right now:
1. Solving the most pressing pain points first. Hence the early launch so we can talk to more folks and prioritize better. I've got a reasonable idea since I've done so much of this work myself, but every finance shop does things different...
2. Leaving good TODOs when we can't translate something. Currently, we can't translate pivot tables or complex formulas -- but we generate TODOs for these so you can go back and fill them in with the Python skills you have (and maybe the help of ChatGPT).
We're aiming to just give you a Python script. So if we don't translate the data pull how you want... you can just edit the notebook :)