item 40147836 (no title)
jfkfif | 1 year ago
The problem is multinode runs that communicate through the network.
freeone3000 | 1 year ago
Multinode runs don’t communicate through the network in a DGX configuration. NVLink allows for RDMA over direct InfiniBand. No need for network here.
tomoyoirl | 1 year ago
InfiniBand is a network too… But even if we set that aside, you’ll get access to your data over a network connection, because these are expensive nodes running batch jobs with finite disk space, not personal workstations.
josh-sematic | 1 year ago
Yes, which is especially important for training. Getting good GPU interconnect can be really important for training large models.