Release Notes - Version 2.0.0
Release Date: June 4, 2026
🎉 Highlights
Version 2.0.0 is a major release that completes the Okta SSO integration, adds powerful new dashboards for administrators and leadership, and enforces stricter resource governance across the platform. Users can now sign in through an external Okta identity provider with automatic team assignment, and administrators gain real-time operational visibility through a new home page dashboard. GPU capacity is now surfaced live in queue views, and SSH access to interactive workloads is available for power users. Complementary services also see meaningful improvements: Plexus Satellite restores Singularity connectivity, AAC Builder completes a full security hardening cycle, and the Analytics service becomes more resilient to InfluxDB outages.
🚀 New features
🔐 External Okta login
Users can now sign in using an external Okta identity provider through a dedicated login button on the sign-in page. The authentication flow uses a secure server-side code exchange, keeping tokens out of the browser. LDAP group membership is used to automatically assign Okta users to the correct teams, eliminating manual onboarding steps. Name parsing handles the "Last, First" format common in enterprise directories.
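As an illustration of the "Last, First" handling described above, here is a minimal sketch. The function name `parse_display_name` and the fallback for "First Last" input are assumptions made for this example; the platform's actual parsing rules may differ.

```python
def parse_display_name(raw: str) -> tuple[str, str]:
    """Return (first_name, last_name) from a directory display name."""
    raw = raw.strip()
    if "," in raw:
        # "Last, First" form: split on the first comma only.
        last, first = raw.split(",", 1)
        return first.strip(), last.strip()
    # Assumed fallback: "First Last", where everything after the
    # first space is treated as the last name.
    first, _, last = raw.partition(" ")
    return first.strip(), last.strip()


print(parse_display_name("Doe, Jane"))  # ('Jane', 'Doe')
print(parse_display_name("Jane Doe"))   # ('Jane', 'Doe')
```

Splitting on the first comma only keeps suffixes such as "Doe, Jane, PhD" from raising an unpacking error, although the suffix then ends up in the first-name field; a real implementation would likely need extra rules for such cases.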
📊 Executive and operational dashboards
Two new dashboards are available in this release.
- Admin operational dashboard — Administrators now have a dedicated dashboard on the platform home page showing real-time activity across teams, users, invitations, and applications.
- Executive BI dashboard (phase 1) — A new business intelligence summary view displays GPU-hours metrics and workload activity, giving leadership actionable data without requiring access to raw platform APIs.
A dedicated executive summary API endpoint backs these dashboards, providing GPU-hours metrics that can also be consumed by external integrations.
⚡ Live GPU capacity in queue views
Queue resource views now surface real-time free and allocated accelerator capacity, sourced from Prometheus across both Slurm and Kubernetes clusters. Administrators and users can see at a glance how much GPU capacity is available before submitting a workload.
🖥️ SSH access to interactive workloads
AAC users can now SSH directly into interactive workload sessions, enabling more efficient debugging and management of running jobs without leaving the terminal.
🔒 Per-team GPU enforcement
GPU usage limits are now enforced at job-creation time. When a team has reached its GPU allocation, new jobs are blocked immediately rather than failing later, giving users clear and timely feedback.
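The idea of rejecting a job up front can be sketched as follows. Everything here is hypothetical: the `TeamGpuUsage` fields, the `GpuQuotaExceeded` exception, and the rule that a request is refused when it exceeds the remaining allocation are assumptions for illustration, not the platform's actual data model or policy.

```python
from dataclasses import dataclass


class GpuQuotaExceeded(Exception):
    """Raised when a job would push a team past its GPU allocation."""


@dataclass
class TeamGpuUsage:
    limit: int   # hypothetical: maximum GPUs the team may hold
    in_use: int  # hypothetical: GPUs held by the team's active jobs


def check_gpu_quota(usage: TeamGpuUsage, requested: int) -> None:
    """Reject the request at creation time if it does not fit."""
    remaining = usage.limit - usage.in_use
    if requested > remaining:
        raise GpuQuotaExceeded(
            f"requested {requested} GPU(s), but only {max(remaining, 0)} remain "
            f"of the team limit of {usage.limit}"
        )


# A team at its limit is refused before anything reaches the scheduler.
try:
    check_gpu_quota(TeamGpuUsage(limit=8, in_use=8), requested=1)
except GpuQuotaExceeded as exc:
    print(f"Job rejected: {exc}")
```

Running the check before the job is persisted or queued is what lets the user see the reason immediately instead of discovering a failed job later.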
🔑 SSH public key management
Users can now view their SSH public key on the user overview page and edit it directly in the account settings, without requiring administrator intervention.
🔧 Improvements
- Interactive port reliability — Interactive Connect is disabled when a port is not yet ready, and not-ready ports are sorted to the end of the port selector. This prevents failed connection attempts and reduces user confusion.
- Prometheus fields on clusters — Cluster views, creation, and editing screens now include Prometheus configuration fields, enabling administrators to wire up monitoring at cluster setup time.
- User storage quota in team limits — The user storage quota field is now visible in team queue limit views, giving administrators a complete picture of per-team resource constraints.
- Maximum accelerators per team — Queue limit views now show the maximum accelerators allowed per team alongside other quota information.
- LDAP group field on teams — The team form now includes an LDAP group field, making it easier to configure LDAP-driven team membership.
- Cluster detail stats for all users — The cluster detail page now calls the stats API for all users, not just admins, improving visibility.
- Node count consistency — The `num_nodes` field handling was unified across Slurm and Kubernetes job flows, eliminating inconsistencies in distributed workload submissions.
- OpenAPI typed DTOs — API interactions for clusters, teams, queues, application libraries, application families, and file folder creation were migrated to typed OpenAPI-generated DTOs, reducing runtime errors and improving type safety throughout the platform.
- Angular 20 upgrade — The frontend framework was upgraded from Angular 19 to Angular 20, keeping the platform on a supported and current release.
- Django 5.2.14 upgrade — The backend framework was upgraded to Django 5.2.14, incorporating the latest security patches.
🐛 Bug fixes
- Fixed Prometheus password not being sent when creating or editing a Kubernetes cluster
- Fixed stuck persistent volume claims (PVCs) that could leave workloads in an unrecoverable state
- Fixed presigned S3 download signature failures affecting regional buckets
- Fixed WebSocket closed errors in the ws-server message dispatcher
- Fixed race condition causing `TransitionNotAllowed` and `DatabaseError` exceptions during asynchronous invite sending
- Fixed app bootstrap sequence to redirect to sign-in instead of showing a timeout after a 401 response on `users/me`
- Fixed dialog padding at the UI library level, removing per-component margin workarounds
- Fixed Singularity wizard incorrectly showing the upload definition file option when Virgo Builder is disabled
- Fixed per-task gevent timeouts and disabled ports on cluster outage, preventing stale connections from hanging
- Fixed Okta name parsing for "Last, First" format
🔒 Security
- RSA key removal — The Plexus RSA private and public key pair was removed from the codebase, eliminating a sensitive credential exposure risk
- Django 5.2.14 — Backend framework upgraded to incorporate the latest security patches
Plexus Satellite
This release resolves a connectivity issue affecting Singularity workloads and improves diagnostic logging in satellite-managed environments.
- Singularity URL fix — The Singularity endpoint URL was corrected, restoring connectivity for satellite environments running Singularity workloads
- Improved logging — Additional log output assists operators in diagnosing satellite operations
- Minor code quality improvements applied based on peer review
AAC Builder
This release completes a full security hardening and code quality cycle for the AAC Builder service.
- SonarQube remediation — All identified issues across low, medium, and high complexity levels were resolved
- Secret key externalized — The `SECRET_KEY` was moved out of the codebase into external configuration, eliminating a hardcoded credential risk
- Code restructuring — Business logic was extracted from `tasks.py` into a dedicated `services.py` module, improving testability and maintainability
- Django 5.2.14 upgrade — Framework updated to the latest stable release for security and compatibility
- Python 3.10 standardized — The target Python version is now consistent across all project components
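For the secret key item above, one common way to externalize a Django `SECRET_KEY` is to read it from an environment variable. The notes do not say which mechanism AAC Builder uses, so the helper `load_secret_key` and the choice of an environment variable are assumptions for illustration only.

```python
import os


def load_secret_key(env_var: str = "SECRET_KEY") -> str:
    """Read the secret key from the environment and fail fast if it is missing."""
    value = os.environ.get(env_var, "").strip()
    if not value:
        # Refusing to start is safer than falling back to a built-in default.
        raise RuntimeError(f"{env_var} is not set; refusing to start without it")
    return value
```

In a Django settings module this would be used as `SECRET_KEY = load_secret_key()`, so that no key material is committed to the repository.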
Analytics
This release improves the resilience of the Analytics service when the InfluxDB backend is unavailable.
- Improved connection handling — The service now degrades gracefully when InfluxDB is unreachable, preventing unhandled errors from propagating
- Enhanced exception reporting — InfluxDB-related exceptions now produce richer diagnostic output for easier troubleshooting
- nginx buffer tuning — nginx configuration updated to increase buffer sizes for larger payloads
- Deprecated method replaced — `datetime.utcnow()` replaced with `datetime.now(timezone.utc)`, aligning the codebase with current Python best practices
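The practical difference behind the `datetime` change is that `datetime.utcnow()` returns a naive value with no timezone attached, while `datetime.now(timezone.utc)` returns an aware one. A short sketch follows; the wrapper name `utc_now` is hypothetical and not taken from the Analytics codebase.

```python
from datetime import datetime, timezone


def utc_now() -> datetime:
    """Timezone-aware replacement for the deprecated datetime.utcnow()."""
    return datetime.now(timezone.utc)


aware = utc_now()
# Stripping tzinfo yields the same kind of naive value utcnow() used to return.
naive = aware.replace(tzinfo=None)

print(aware.tzinfo)                          # UTC
print(naive.tzinfo)                          # None
print(aware.isoformat().endswith("+00:00"))  # True
```

Because ordering comparisons between aware and naive datetimes raise `TypeError`, any stored or parsed timestamps compared against the new values also need to be timezone-aware.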
Documentation updates
The bare-metal documentation section was comprehensively updated: outdated and deprecated commands were replaced, software versions were brought up to date, and missing details were added across all bare-metal guides.