Node Reference Guide

This page provides technical specifications and configuration options for compute nodes on AAC clusters.

Cluster node specifications

MI325X nodes

CPU:
  • Dual-socket AMD EPYC processors
  • 256 CPU cores per node (128 cores per socket)

Memory:
  • System memory: varies by node configuration (check with sinfo or scontrol show node)

GPU:
  • 8× AMD Instinct MI325X accelerators per node
  • GPU memory: 256 GB HBM3E per GPU

Interconnect:
  • High-speed fabric for multi-node jobs

Operating System:
  • Ubuntu 22.04

MI355X nodes

CPU:
  • Dual-socket AMD EPYC processors
  • 256 CPU cores per node (128 cores per socket)

Memory:
  • System memory: varies by node configuration (check with sinfo or scontrol show node)

GPU:
  • 8× AMD Instinct MI355X accelerators per node
  • GPU memory: 288 GB HBM3E per GPU

Interconnect:
  • High-speed fabric for multi-node jobs

Operating System:
  • Ubuntu 22.04
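
Because system memory varies by node configuration, it can help to read the per-node values directly from Slurm. The following is a minimal Python sketch, not an official AAC tool. It assumes it runs on a login node where sinfo is on the PATH, and it uses the standard sinfo format fields %N (node name), %c (CPUs), %m (memory in MB), and %G (generic resources). The function names are illustrative only.

```python
import subprocess


def parse_node_specs(sinfo_output):
    """Parse lines of 'name|cpus|memory_mb|gres' into a dict keyed by node name."""
    specs = {}
    for line in sinfo_output.splitlines():
        line = line.strip()
        if not line:
            continue
        name, cpus, memory_mb, gres = line.split("|", 3)
        specs[name] = {
            # sinfo may append '+' when grouped values differ; strip it defensively.
            "cpus": int(cpus.rstrip("+")),
            "memory_gib": int(memory_mb.rstrip("+")) / 1024,
            "gres": gres,
        }
    return specs


def query_node_specs():
    """Run sinfo and return parsed specs (requires a working Slurm client)."""
    result = subprocess.run(
        ["sinfo", "-N", "-h", "-o", "%N|%c|%m|%G"],
        check=True,
        capture_output=True,
        text=True,
    )
    return parse_node_specs(result.stdout)
```

Calling query_node_specs() returns one entry per node. A node that belongs to several partitions is listed more than once by sinfo -N, and the dictionary keeps a single entry for it. For the full record of one node, scontrol show node followed by the node name prints the same information in more detail.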

GPU partitioning and NUMA modes

MI325X and MI355X nodes support multiple GPU partitioning modes (SPX, DPX, QPX, CPX) and, on MI325X, configurable NUMA modes (NPS1, NPS2, NPS4). MI355X has a fixed NUMA configuration.

For the full reference (mode descriptions, Slurm --constraint= examples, NUMA tables, verification commands, and use-case recommendations), see GPU partitioning modes.

Performance considerations

  • Node availability: Some partitioning and NUMA modes may not be available on every node. Use sinfo -N -o "%N %f %t" to check each node's configured features and state before submitting jobs.
  • Partitioning and NUMA tuning: See GPU partitioning modes for guidance on choosing SPX/DPX/QPX/CPX and NPS1/NPS2/NPS4.
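
The availability check described above can be scripted. The sketch below filters the output of sinfo -N -o "%N %f %t" for nodes that advertise a required set of features and are in an acceptable state. The feature names used in the usage note (cpx, nps4) and the node names are hypothetical placeholders; the actual feature strings configured on AAC clusters are listed in GPU partitioning modes and in the sinfo output itself.

```python
def nodes_with_features(sinfo_output, required, states=("idle", "mix")):
    """Return node names that have every required feature and an accepted state.

    Expects the text produced by: sinfo -N -o "%N %f %t"
    Flagged states such as "idle*" (node not responding) are not matched,
    because the comparison against `states` is exact.
    """
    required = set(required)
    matches = []
    for line in sinfo_output.splitlines():
        fields = line.split()
        # Skip blank or malformed lines and the header row.
        if len(fields) != 3 or fields[0] == "NODELIST":
            continue
        name, features, state = fields
        available = set(features.split(","))
        if required <= available and state in states and name not in matches:
            matches.append(name)
    return matches
```

For example, passing the captured sinfo text together with ["cpx", "nps4"] (assumed feature names) returns the idle or partially allocated nodes that advertise both features. The states argument can be widened, for instance to include "alloc", when the goal is to see which nodes are configured for a mode rather than which are free right now.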