Cluster Resources

Storrs HPC Cluster

The Storrs HPC cluster currently comprises over 11,000 cores spread among 400 nodes, with two CPUs per node. The nodes include four generations of Intel CPUs: Sandy Bridge, Ivy Bridge, Haswell, and Skylake, as well as the latest AMD Epyc CPUs. We also have a small number of Intel Broadwell nodes and Intel Phi nodes (see table below).

High-speed parallel storage is provided: 1000TB of scratch storage and 100TB of persistent storage (expanding to 200TB this summer).
Over 400TB of slower archive storage is also available.

We have four types of GPU nodes available, listed in the table below.

The Red Hat RHEL7 operating system runs on the Skylake nodes and newer. The remaining nodes will be upgraded from RHEL6 to RHEL7 soon. The Slurm scheduler manages jobs. Network traffic between nodes travels over Ethernet at 1Gb or 10Gb per second, and file data travels over Infiniband at 56Gb or 100Gb per second, depending on the node. The nodes are each connected via our Infiniband network to over 1000TB of parallel storage, managed by GPFS.
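Since jobs are managed by Slurm, work is typically submitted as a batch script via `sbatch`. A minimal sketch is shown below; the resource values are illustrative assumptions, not the cluster's actual defaults or limits:

```shell
#!/bin/bash
# Minimal Slurm batch script; submit with: sbatch hello.sh
# NOTE: the resource values below are illustrative assumptions.
#SBATCH --job-name=hello
#SBATCH --ntasks=1
#SBATCH --time=00:05:00
#SBATCH --mem=1G

# The #SBATCH lines above are shell comments, so this script also
# runs directly under bash; Slurm reads them only at submission time.
echo "Hello from $(hostname)"
```

Because `#SBATCH` directives are ordinary comments to the shell, the same script can be tested locally before submission.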

The Storrs HPC cluster is supported by three full-time staff and three or more student workers. Scientific applications are installed as needed; to date over 200 have been made available.

Node Details

CPU                 CPU Speed  Cores per Node  Memory per Node  Number of Nodes  Total Cores  Network Speed  Infiniband Speed
Intel Sandy Bridge  1.20GHz    16              64GB             40               640          1Gb            56Gb
Intel Ivy Bridge    2.80GHz    20              128GB            32               640          10Gb           56Gb
Intel Haswell       2.59GHz    24              128GB            184              4416         10Gb           56Gb
Intel Broadwell     2.20GHz    44              256GB            4                176          10Gb           56Gb
Intel Phi           1.0GHz     256             192GB            2                512          1Gb            56Gb
Intel Skylake       2.70GHz    36              192GB            90               3240         10Gb           100Gb
AMD Epyc            2.35GHz    64              256GB            38               2432         10Gb           100Gb

GPU Node Details

GPU Type                     CPU Type  Number of Nodes  GPUs per Node  Total GPUs
NVidia Tesla K40m            Haswell   2                2              4
NVidia Tesla V100            Skylake   9                1 to 3         18
NVidia GeForce GTX 1080 Ti   Skylake   12               1 to 3         37
NVidia GeForce RTX 2080 Ti   Skylake   5                8              40
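GPUs are allocated through Slurm's generic resource (GRES) mechanism. A hedged sketch of a GPU job script follows; the partition name `gpu` is an assumption, not necessarily this cluster's configuration:

```shell
#!/bin/bash
# Sketch of a Slurm GPU job; partition name is an assumption.
#SBATCH --job-name=gpu-test
#SBATCH --partition=gpu        # assumed partition name
#SBATCH --gres=gpu:1           # request one GPU of any type
#SBATCH --time=01:00:00

# Slurm exposes the allocated devices via CUDA_VISIBLE_DEVICES;
# outside an allocation the variable is unset, so default to "none".
echo "GPUs assigned: ${CUDA_VISIBLE_DEVICES:-none}"
```

Requesting a specific model is usually done with a typed GRES string such as `--gres=gpu:v100:1`, where the type name depends on how the site configured Slurm.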

The Condo Model

In the Condo Model, researchers who fund nodes get priority access. However, if their priority job queue becomes idle, unprivileged jobs may run instead. Once started, unprivileged jobs can run for up to twelve hours before they are stopped. So although a priority job could wait up to twelve hours to start, most priority jobs typically wait less than an hour. Furthermore, if priority users keep their job queue full, their jobs will not wait at all.
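In practice, the twelve-hour cap means an unprivileged job's requested walltime should not exceed that limit. A sketch under assumptions (the partition name `general` is hypothetical, not the cluster's documented configuration):

```shell
#!/bin/bash
# Unprivileged job sketch under the condo model.
# The partition name below is an assumption for illustration.
#SBATCH --partition=general
#SBATCH --time=12:00:00        # at most twelve hours, per the condo model
#SBATCH --ntasks=1

# Long-running work should checkpoint so it can resume if the
# twelve-hour window expires before the job finishes.
echo "Job started at $(date)"
```

Priority (condo) users submit to their own partition and are not subject to this limit.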

You can read more about the “Condo Model” on our Wiki page.


Last updated May 26, 2020