item 32605090


tcpekin | 3 years ago

The way I get around this is to start an IPython interpreter and run .py files with `run -i file1.py`. This loads everything into the interpreter's memory, and then I can run file2.py with the actual analysis and iterate on it until I'm happy. In the end, you can keep the files separate, or combine them into one file that runs your whole analysis top to bottom. As long as you keep the IPython session open, everything remains in memory, just like in a notebook. The autoreload magic also works if you set it to the correct option, so if you are working on a library/package, it will automatically be reloaded when necessary.
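A minimal sketch of the idea behind `run -i`: each script runs in the interpreter's shared namespace, so a second script can see variables the first one created. The file contents below are hypothetical, and the shared namespace is emulated here with a plain dict and `exec` so the sketch runs outside IPython.

```python
# In IPython you'd type:  %run -i file1.py   then   %run -i file2.py
# Here we emulate the persistent interactive namespace with a dict.

file1 = """
import math
data = [math.sqrt(x) for x in range(5)]   # the expensive "load" step
"""

file2 = """
result = sum(data)                         # analysis step reuses `data`
"""

ns = {}              # stands in for IPython's interactive namespace
exec(file1, ns)      # like: %run -i file1.py
exec(file2, ns)      # like: %run -i file2.py  (sees `data` from file1)
print(round(ns["result"], 3))
```

The point is that `file2` never re-runs the load in `file1`; it just picks up `data` from the still-live session.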

discuss

order

immibis | 3 years ago

Isn't this literally just the Jupyter workflow but in a command line?

tcpekin | 3 years ago

Whoops, sorry for the late response - somewhat, yes. You can configure a Jupyter workflow to work like this, but I don't find that I, or many other people, do, as it takes more discipline not to hop around between cells.

One of the main differences for me is that the .py file is run in its entirety (outside of if/else blocks that guard data loading). That usually corresponds to multiple cells of a Jupyter notebook that one would need to Ctrl+Enter through, and missing one would cause a problem.
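The guarded-loading pattern mentioned above can be sketched like this: the expensive load runs only on the first `run -i`, and subsequent runs reuse the variable already sitting in the session. The data itself is a stand-in; in practice the `except` branch would call whatever loader you use.

```python
# Skip the expensive load if `data` already exists in the session.
# On the first `%run -i`, the NameError branch fires; afterwards it doesn't.
try:
    data  # already defined from a previous run of this script?
except NameError:
    data = list(range(1000))  # stand-in for an expensive load, e.g. reading a file

# Everything below re-runs every time, operating on the cached `data`.
total = sum(data)
print(total)
```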

The second is how you can decouple the code from the terminal. It's a personal pet peeve that Jupyter notebooks jump around as you run through cells - I don't want to scroll all over just to reset some variables to their original values. It's really nice to run a whole .py script and see the output side by side, even when the script is much longer than my screen: I can keep it open at the important part in VSCode, change some intermediate step, and let all of the ad hoc plotting code stay at the bottom.

Finally, the biggest difference for me is how figures behave - the way I have it set up, they open in their own windows and remain interactive (I can zoom/pan). I know you can do this in Jupyter as well, but that workflow really emphasizes inline plotting with non-interactive plots, especially when it comes to sharing them. With a .py script and the IPython command line, I can open five figures, tile them however I'd like, and then refer to them by name/number in my script, so I can clear and overwrite them at will, and they don't close or move around. This makes comparisons very easy - for example, seeing how changing a parameter affects the rest of my analysis.
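Referring to figures by name works because `plt.figure(label)` returns the existing figure with that label instead of creating a new one, so re-running a script redraws into the same windows. The figure names here are made up, and the `Agg` backend is selected only so the sketch runs headless; with an interactive backend (e.g. `%matplotlib qt` in IPython) these would be real windows that stay open and tiled between runs.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend for this sketch; use qt/tk interactively
import matplotlib.pyplot as plt

fig = plt.figure("raw data")   # reuses the window labeled "raw data" if it exists
fig.clf()                      # clear it, then redraw from scratch
plt.plot([0, 1, 4, 9])

fig2 = plt.figure("fit")       # a second, independently placed window
fig2.clf()
plt.plot([0, 1, 2, 3])

print(plt.get_figlabels())     # the named figures currently open
```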

One last thing - this setup is admittedly more like Matlab... whoops, but I think that workflow is much more ergonomic than a notebook. For sharing with other people, though, I usually just copy and paste the various parts of my scripts into a notebook, as that is the de facto standard.