Tahoma
Tahoma is a high-performance computing (HPC) cluster of 184 Intel “Cascade Lake” nodes running at a clock speed of 3.1 GHz.
The system has an aggregate of 98 TB of memory, 10 PB of global storage in a BeeGFS file system, and an aggregate of 536 TB of local disk. Its peak performance is 1,015 teraflops.
Nodes
Each compute node has two 18-core sockets, for a total of 36 Intel® Xeon® Gold 6254 cores per node, and 384 GB of memory, which yields 10 2/3 GB of memory per core.
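The per-core figure follows directly from the node specification. A minimal sketch of the arithmetic, using only the numbers quoted above (the variable names are illustrative, not part of any Tahoma tooling):

```python
from fractions import Fraction

# Figures taken from the node description above.
SOCKETS_PER_NODE = 2
CORES_PER_SOCKET = 18
MEMORY_PER_NODE_GB = 384

cores_per_node = SOCKETS_PER_NODE * CORES_PER_SOCKET

# Use an exact fraction so the "10 2/3 GB" figure is not lost to rounding.
memory_per_core_gb = Fraction(MEMORY_PER_NODE_GB, cores_per_node)

print(f"cores per node: {cores_per_node}")
print(f"memory per core: {memory_per_core_gb} GB "
      f"(about {float(memory_per_core_gb):.2f} GB)")
```

When sizing a job, this is the amount of memory each process can use if all 36 cores of a node are occupied; requesting fewer processes per node leaves proportionally more memory for each.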
Accelerators
Tahoma has 24 ML/AI nodes, each with two NVIDIA Tesla V100 32 GB GPUs (48 GPUs total). The GPUs are attached via a PCI Express bus (PCIe x16) to the node's Xeon® cores and its 1.5 TB of memory.
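The aggregate figures can be cross-checked against the per-node numbers. The sketch below rests on two assumptions that the text does not state explicitly: that the 24 ML/AI nodes are part of the 184-node total (leaving 160 nodes with 384 GB each), and that "1.5 TB" means 1536 GB per ML/AI node.

```python
# Figures quoted in the text.
TOTAL_NODES = 184
GPU_NODES = 24
GPUS_PER_GPU_NODE = 2
GPU_MEMORY_GB = 32
STANDARD_NODE_MEMORY_GB = 384

# Assumption: 1.5 TB per ML/AI node is 1536 GB (1.5 x 1024).
GPU_NODE_MEMORY_GB = 1536

# Assumption: the ML/AI nodes are counted within the 184 nodes.
standard_nodes = TOTAL_NODES - GPU_NODES

total_gpus = GPU_NODES * GPUS_PER_GPU_NODE
total_gpu_memory_gb = total_gpus * GPU_MEMORY_GB
total_host_memory_gb = (standard_nodes * STANDARD_NODE_MEMORY_GB
                        + GPU_NODES * GPU_NODE_MEMORY_GB)

print(f"GPUs: {total_gpus}")
print(f"GPU memory: {total_gpu_memory_gb} GB")
print(f"host memory: {total_host_memory_gb} GB "
      f"(about {total_host_memory_gb / 1000:.0f} TB)")
```

Under these assumptions the host memory comes to 98,304 GB, which is consistent with the 98 TB aggregate quoted for the system.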
Network
The 184 nodes are connected by an HDR InfiniBand fabric built from 7 40-port Mellanox MQM8790 switches. Four top-level switches each have five 200 gigabit/second links to each of 7 leaf switches. Each leaf switch is directly connected to up to 40 nodes at 100 gigabit/second.
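To see what this topology means for bandwidth, the link counts above can be turned into per-leaf uplink and downlink totals. This is a rough sketch based only on the figures in the paragraph; it ignores protocol overhead and routing, and the variable names are illustrative.

```python
# Figures quoted in the text.
TOP_SWITCHES = 4
LEAF_SWITCHES = 7
LINKS_PER_TOP_LEAF_PAIR = 5
UPLINK_GBPS = 200
MAX_NODES_PER_LEAF = 40
NODE_LINK_GBPS = 100
PORTS_PER_SWITCH = 40
TOTAL_NODES = 184

# Each leaf has 5 links to each of the 4 top-level switches.
uplinks_per_leaf = TOP_SWITCHES * LINKS_PER_TOP_LEAF_PAIR
uplink_gbps_per_leaf = uplinks_per_leaf * UPLINK_GBPS

# Worst case: a fully populated leaf with 40 nodes at 100 Gb/s each.
downlink_gbps_per_leaf = MAX_NODES_PER_LEAF * NODE_LINK_GBPS

# A ratio of 1.0 means node bandwidth is matched by uplink bandwidth.
oversubscription = downlink_gbps_per_leaf / uplink_gbps_per_leaf

# Ports consumed on each top-level switch, and total node capacity.
ports_used_per_top_switch = LINKS_PER_TOP_LEAF_PAIR * LEAF_SWITCHES
max_attachable_nodes = LEAF_SWITCHES * MAX_NODES_PER_LEAF

print(f"uplink per leaf: {uplink_gbps_per_leaf} Gb/s")
print(f"downlink per leaf: {downlink_gbps_per_leaf} Gb/s")
print(f"oversubscription ratio: {oversubscription}")
print(f"top-level ports used: {ports_used_per_top_switch} of {PORTS_PER_SWITCH}")
print(f"node capacity: {max_attachable_nodes} (installed: {TOTAL_NODES})")
```

With these numbers, a full leaf has 4,000 Gb/s toward its nodes and 4,000 Gb/s toward the top-level switches, so the described fabric is not oversubscribed at the leaf level.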
Current Status
From internal connections, the status of Tahoma and its file systems, along with many other metrics for the compute resources, is available in our Grafana dashboard: