Node Reference Guide
This page provides technical specifications and configuration options for compute nodes on AAC clusters.
Cluster node specifications
MI325X nodes
CPU:
- Dual-socket AMD EPYC processors
- 256 CPU cores per node (128 cores per socket)
Memory:
- System memory: varies by node configuration (check with sinfo or scontrol show node)
GPU:
- 8× AMD Instinct MI325X accelerators per node
- GPU memory: 256 GB HBM3E per GPU
Interconnect:
- High-speed fabric for multi-node jobs
Operating System:
- Ubuntu 22.04
MI355X nodes
CPU:
- Dual-socket AMD EPYC processors
- 256 CPU cores per node (128 cores per socket)
Memory:
- System memory: varies by node configuration (check with sinfo or scontrol show node)
GPU:
- 8× AMD Instinct MI355X accelerators per node
- GPU memory: 288 GB HBM3E per GPU
Interconnect:
- High-speed fabric for multi-node jobs
Operating System:
- Ubuntu 22.04
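Because system memory varies by node, it can help to read the values straight from Slurm. The following is a minimal sketch, not an official AAC tool: it parses the key=value fields printed by scontrol show node and reports CPU and memory figures. The helper names (parse_scontrol_node, summarize, show_node) and the sample text are assumptions made for illustration, and the sample values are invented rather than taken from a real AAC node.

```python
import re
import subprocess

# Hypothetical excerpt of `scontrol show node <name>` output; values are made up.
SAMPLE = """NodeName=node01 Arch=x86_64 CoresPerSocket=128
   CPUAlloc=0 CPUTot=256 CPULoad=0.12
   Gres=gpu:8
   RealMemory=2097152 AllocMem=0 Sockets=2 Boards=1
   State=IDLE ThreadsPerCore=1
"""


def parse_scontrol_node(text):
    """Collect the key=value tokens from scontrol output into a dict.

    Values containing spaces (for example OS= or Reason=) are cut at the
    first space, which is acceptable for the numeric fields used here.
    """
    return dict(re.findall(r"(\w+)=(\S+)", text))


def summarize(fields):
    """Return the CPU and memory figures for one node."""
    return {
        "cpus": int(fields["CPUTot"]),
        "sockets": int(fields["Sockets"]),
        "cores_per_socket": int(fields["CoresPerSocket"]),
        # Slurm reports RealMemory in megabytes; convert to gigabytes.
        "memory_gb": round(int(fields["RealMemory"]) / 1024, 1),
    }


def show_node(name):
    """Run scontrol for a live node (requires a Slurm login or compute node)."""
    result = subprocess.run(
        ["scontrol", "show", "node", name],
        capture_output=True, text=True, check=True,
    )
    return summarize(parse_scontrol_node(result.stdout))


if __name__ == "__main__":
    print(summarize(parse_scontrol_node(SAMPLE)))
```

On a cluster, show_node("<nodename>") would return the same summary for a real node. Note that CPUTot counts hardware threads, so it can exceed the core count on nodes with SMT enabled.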
GPU partitioning and NUMA modes
MI325X and MI355X nodes support multiple GPU partitioning modes (SPX, DPX, QPX, CPX) and, on MI325X, configurable NUMA modes (NPS1, NPS2, NPS4). MI355X has a fixed NUMA configuration.
For the full reference (mode descriptions, Slurm --constraint= examples, NUMA tables, verification commands, and use-case recommendations), see GPU partitioning modes.
Performance considerations
- Node availability: Some partitioning and NUMA modes may not be available on every node. Run sinfo -N -o "%N %f %t" to check each node's configured features before submitting jobs.
- Partitioning and NUMA tuning: See GPU partitioning modes for guidance on choosing SPX/DPX/QPX/CPX and NPS1/NPS2/NPS4.
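The feature check described above can be scripted. This is a minimal sketch that parses the output of sinfo -N -o "%N %f %t" and lists the nodes that advertise a given set of features and are in a usable state. The function names and the sample listing are assumptions for illustration. In particular, the feature strings (spx, cpx, nps1, nps4) are hypothetical placeholders, so check the real sinfo output for the names configured on AAC.

```python
import subprocess

SINFO_CMD = ["sinfo", "-N", "-o", "%N %f %t"]

# Hypothetical listing; node names and feature strings are placeholders.
SAMPLE = """NODELIST AVAIL_FEATURES STATE
node01 spx,nps1 idle
node02 cpx,nps4 alloc
node03 cpx,nps1 mix
node04 (null) idle*
"""


def parse_sinfo(text):
    """Map node name -> (set of features, state) from sinfo output."""
    nodes = {}
    for line in text.strip().splitlines():
        parts = line.split()
        # Skip the header row and anything that is not a 3-column record.
        if len(parts) != 3 or parts[0] == "NODELIST":
            continue
        name, feats, state = parts
        # sinfo prints "(null)" when a node has no features configured.
        features = set() if feats == "(null)" else set(feats.split(","))
        # Drop the one-character flags sinfo may append to a state (e.g. "idle*").
        nodes[name] = (features, state.rstrip("*~#!%$@^-+"))
    return nodes


def nodes_with(nodes, required, states=("idle", "mix")):
    """Return sorted node names that have every required feature."""
    return sorted(
        name
        for name, (features, state) in nodes.items()
        if set(required) <= features and state in states
    )


def live_nodes():
    """Query Slurm directly (requires sinfo on PATH)."""
    out = subprocess.run(SINFO_CMD, capture_output=True, text=True, check=True)
    return parse_sinfo(out.stdout)


if __name__ == "__main__":
    print(nodes_with(parse_sinfo(SAMPLE), {"cpx"}))
```

Because sinfo -N prints one row per node and partition, a node that belongs to several partitions appears more than once; storing the rows in a dict keeps a single entry per node. Features found this way are the values a job would request with Slurm's --constraint= option.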
Related documentation
- AAC Slurm Cluster User Guide - General cluster usage and Slurm commands
- Storage and Shared Filesystems - Filesystem layout and best practices
- Prerequisites - Access requirements and common software