

adament | 2 years ago

But would you version it by storing it as output in an ipynb file, where it is overwritten if you rerun that cell? I would store the data in a versioned database or as separate data files in the repository (possibly stored in git-lfs). And I would store results of the analysis as data files / image files / whatever else, NOT as ephemeral outputs in an ipynb file. But I am pretty far down the “ipynb files are for local use only” path.


tnecniv|2 years ago

Yeah, if your analysis takes hours to run, you should really split up the number-crunching code and the result analysis / visualization. Not only does it make version control of the code easier, you can save the output in an organized, labeled manner (time-stamped, etc.) and, if you lose power or the kernel crashes, you don’t need to rerun the lengthy analysis when you want to make a change further down the pipeline.
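
A minimal sketch of that split, using only the Python standard library. Everything here is hypothetical and only for illustration: `crunch_numbers` stands in for the slow step, and the `results` directory and the `run_<timestamp>.json` naming are assumptions, not anything prescribed in this thread.

```python
import json
from datetime import datetime, timezone
from pathlib import Path


def crunch_numbers(n):
    """Hypothetical stand-in for the slow number-crunching step."""
    return {"n": n, "sum_of_squares": sum(i * i for i in range(n))}


def save_result(result, out_dir="results"):
    """Write the result to a new time-stamped JSON file and return its path."""
    out_path = Path(out_dir)
    out_path.mkdir(parents=True, exist_ok=True)
    # UTC timestamp with microseconds, so file names sort chronologically.
    stamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%S%fZ")
    target = out_path / f"run_{stamp}.json"
    target.write_text(json.dumps(result, indent=2))
    return target


def load_latest(out_dir="results"):
    """Load the most recent saved result without rerunning the computation."""
    files = sorted(Path(out_dir).glob("run_*.json"))
    if not files:
        raise FileNotFoundError(f"no saved runs in {out_dir}")
    return json.loads(files[-1].read_text())
```

The slow script would call `save_result(crunch_numbers(...))` once, and the plotting code (notebook or not) would only ever call `load_latest()`, so a crashed kernel costs you the plot and not the hours of compute.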

Zandikar|2 years ago

It was an extreme example to drive home the point that one is "human scale time" and the other is "computer scale time". People are reading far, far too much into my choice of hours specifically there.