item 29301439

HMH | 4 years ago

Containers would probably work fine for building, but I'm pretty sure running them is going to be a massive pain. Typically you need GPU pass-through, and you'll want to run your software on multiple nodes, for example using MPI, which I don't think will be straightforward, especially if you have to explain all of this to a scientist who just wants to run their software.

But perhaps someone can refute this?
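A partial refutation: GPU pass-through and containerized MPI are both reasonably well-trodden now. A sketch of the usual invocations, assuming an NVIDIA node with the container toolkit installed; the image name `myapp.sif` and binary `./my_solver` are purely illustrative:

```shell
# Docker with the NVIDIA Container Toolkit: --gpus exposes host GPUs,
# but requires nvidia-container-toolkit installed on the host.
docker run --rm --gpus all nvidia/cuda:12.2.0-base-ubuntu22.04 nvidia-smi

# Apptainer (formerly Singularity), the usual choice on HPC clusters:
# --nv bind-mounts the host NVIDIA driver stack into the container.
apptainer exec --nv myapp.sif ./my_solver

# Multi-node MPI typically uses the "hybrid" model: the host's mpirun
# launches one container per rank, so the host and container MPI
# libraries need to be ABI-compatible.
mpirun -np 64 apptainer exec --nv myapp.sif ./my_solver
```

The usability point stands, though: the ABI-compatibility requirement between host and container MPI is exactly the kind of detail that is hard to explain to a scientist who just wants to run their code.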


cameron_b | 4 years ago

I think there are two parts in play here: build process and deployment.

From what I’ve been reading at OpenHPC about Slurm and friends, you wouldn’t normally waste deployment time compiling on each node. You could, but it’d be inefficient to have 1000 nodes all compiling at once instead of mounting your share and pulling down the executable, which can be built in any of your favorite build environments. I’ve also seen some recent material about writing in Python and using DSL compile toolchains to produce an executable that’s much faster than dropping Python on the cluster. That’s something you’d set up in a Docker environment or some GitHub Actions setup that I don’t understand.
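The build-once, run-everywhere pattern described above can be sketched as a Slurm batch script. This is a minimal illustration, not from the original thread; the shared-filesystem paths and binary name are assumptions:

```shell
#!/bin/bash
#SBATCH --job-name=solver
#SBATCH --nodes=1000
#SBATCH --ntasks-per-node=1
#SBATCH --time=01:00:00

# The binary was compiled once elsewhere (a build node, CI, or a
# container image) and lives on a shared filesystem mounted on
# every compute node. Path is hypothetical.
BIN=/shared/apps/solver/bin/solver

# srun launches one copy per task across all allocated nodes; no
# per-node compilation happens at deployment time.
srun "$BIN" --input /shared/data/input.dat
```

The design point is that the compile step is decoupled from the scheduler entirely: the cluster only ever sees a finished executable on the share.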