leeab | 10 months ago
The main three are:
- GPU runtime stats from nvidia-smi
- Running pods from Kube state
- Node data & events from Kube state
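To make the first source concrete, here is a rough sketch of how GPU stats could be pulled from `nvidia-smi` using its CSV query mode. This is only an illustration, not how our collector actually works: the function names, the chosen fields, and the output dict keys are all made up for the example.

```python
import csv
import io
import subprocess

# Fields passed to nvidia-smi's --query-gpu option (chosen for illustration).
FIELDS = ["index", "name", "utilization.gpu", "memory.used", "memory.total"]


def _to_int(value):
    """Return an int, or None when nvidia-smi reports a non-numeric value such as [N/A]."""
    try:
        return int(value)
    except ValueError:
        return None


def parse_gpu_stats(csv_text):
    """Parse `--format=csv,noheader,nounits` output into a list of dicts (hypothetical helper)."""
    rows = []
    reader = csv.reader(io.StringIO(csv_text), skipinitialspace=True)
    for record in reader:
        if len(record) != len(FIELDS):
            continue  # skip blank or malformed lines
        index, name, util, used, total = (field.strip() for field in record)
        rows.append({
            "index": _to_int(index),
            "name": name,
            "utilization_pct": _to_int(util),
            "memory_used_mib": _to_int(used),
            "memory_total_mib": _to_int(total),
        })
    return rows


def query_gpu_stats():
    """Run nvidia-smi on the local node and return parsed stats (hypothetical helper)."""
    result = subprocess.run(
        ["nvidia-smi",
         "--query-gpu=" + ",".join(FIELDS),
         "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    )
    return parse_gpu_stats(result.stdout)
```

Calling `query_gpu_stats()` on a node with the NVIDIA driver installed returns one dict per GPU. The parsing is kept separate from the subprocess call so it can be exercised on captured output without a GPU.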
We have several screens that show similar information but are intended for different roles. For example, the Workloads screen is mainly for researchers to monitor their workloads from creation to completion. The Reports screen mainly shows cost data grouped by team, project, etc.