
Release Notes - Version 1.17.0

Release Date: March 12, 2026

🎉 Highlights

This release expands multi-node and distributed workload capabilities, fixes resource allocation visibility, and hardens the platform with automated security scanning. Users running distributed AI/ML training jobs gain better out-of-the-box container support, including InfiniBand networking and configurable shared memory. Administrators benefit from more accurate queue listings, correct email link generation, and a streamlined user onboarding flow with team assignment during invitations. The removal of two deprecated internal modules and the introduction of continuous SonarQube scanning make the platform leaner and safer.


🚀 New features

🖧 Multi-node and distributed workload support

The platform now ships everything needed to run distributed, multi-node workloads inside containers without manual setup.

What's included:

  • InfiniBand (IB) networking and multi-node coordination scripts are now bundled directly inside container jobs
  • Container jobs support configuring shared memory (shm) size, benefiting deep learning training workloads that require large amounts of inter-process shared memory
  • Multi-node jobs can be fully tested and validated end-to-end through the CI pipeline, giving higher confidence that distributed runs will work correctly in production
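Because data loaders in deep learning frameworks can fail when shared memory is too small, it may help to confirm the configured size from inside a running container job. The following is a minimal sketch, not part of the platform: it assumes a Linux container where shared memory is mounted at /dev/shm, and the function names are hypothetical.

```python
import shutil

GIB = 1024 ** 3


def shm_capacity_bytes(path: str = "/dev/shm") -> int:
    """Return the total capacity, in bytes, of the filesystem mounted at `path`.

    Assumption: on Linux, shared memory is a tmpfs mounted at /dev/shm, so its
    total size reflects the shm size configured for the container.
    """
    return shutil.disk_usage(path).total


def has_enough_shm(required_bytes: int, path: str = "/dev/shm") -> bool:
    """Return True if the mount at `path` is at least `required_bytes` large."""
    return shm_capacity_bytes(path) >= required_bytes
```

A training script could call `has_enough_shm(8 * GIB)` at startup (the 8 GiB figure is only an example) and exit early with a clear message instead of failing partway through a run.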

👥 User management: team assignment during invitations

When inviting new users to the platform, you can now optionally assign them to a specific team directly during the invitation process, eliminating the need for additional steps after invitation acceptance.

What's included:

  • Team selection field added to the invitation form
  • Faster user onboarding and simplified team management
  • Reduced administrative overhead for team administrators

🔒 Security scanning

Continuous automated security scanning using SonarQube has been introduced for the codebase. Every code change is now checked for common security vulnerabilities and code quality issues before it reaches production.

What's included:

  • SonarQube scanning runs automatically on every code change
  • SonarScanner updated to bundle its own Java Runtime Environment, removing an external dependency and making scans more reliable and self-contained

🔧 Improvements

  • Queue visibility: Queues on nodes that have no GPU or other accelerator, or whose accelerator type is undefined, now appear correctly as allocatable, so CPU-only and unclassified nodes are no longer incorrectly hidden
  • Accelerator detection: GPU auto-detection logic now recognizes detection markers only at the start of a configuration line, preventing false positives from similarly named strings and improving hardware reporting accuracy
  • Email links: Notification emails that link back to the platform now contain correct, fully qualified URLs when the platform is deployed in proof-of-concept (PoC) environments, preventing broken links
  • Satellite configuration: The satellite (proxy) web server now reads its configuration directly from the central Django settings, eliminating duplicated configuration and reducing the risk of environment-specific mismatches
  • Dataset interface: Removed outdated pricing fields from the datasets API and management interface, simplifying the data model and reducing confusion with deprecated fields
  • Invitations API documentation: Improved request/response descriptions for the Invitations API, making it easier for integrators and administrators to understand how to invite users correctly
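The accelerator detection change above can be illustrated with a regular expression anchored to the start of each line. This is only a sketch of the general technique: the marker names are hypothetical, since these notes do not list the real detection markers, and the platform's actual implementation may differ.

```python
import re

# Hypothetical marker names; the platform's real detection markers are not
# listed in these release notes.
MARKERS = ("gpu", "accelerator")

# With re.MULTILINE, ^ matches at the start of every line, so a marker is
# recognized only when it is the first token on a configuration line.
# The trailing \b rejects longer names that merely begin with a marker.
_MARKER_AT_LINE_START = re.compile(
    r"^(?:%s)\b" % "|".join(re.escape(m) for m in MARKERS),
    re.MULTILINE,
)


def has_accelerator_marker(config_text: str) -> bool:
    """Return True if any line of `config_text` starts with a known marker."""
    return _MARKER_AT_LINE_START.search(config_text) is not None
```

With this pattern, a line such as `gpu=a100` is detected, while `my_gpu_label=x` or `comment=no gpu here` is not, because the marker does not appear at the start of the line.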

🐛 Bug fixes

  • Fixed archive team and archive user endpoints to use HTTP POST instead of GET, following REST best practices and preventing accidental archiving through browser pre-fetching or link previews
  • Fixed Docker-based multi-node test reliability, reducing false failures when validating distributed container jobs
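For integrators, the archive endpoint fix means that scripts which previously issued GET requests need to switch to POST. The sketch below shows one way to build such a request with the Python standard library. The URL path and function name are hypothetical, authentication headers are omitted, and the actual routes should be taken from the platform's API documentation.

```python
import urllib.request


def build_archive_request(base_url: str, team_id: int) -> urllib.request.Request:
    """Build (but do not send) a POST request to archive a team.

    Assumption: the path "/teams/<id>/archive/" is illustrative only and is
    not taken from the platform's API documentation.
    """
    url = f"{base_url.rstrip('/')}/teams/{team_id}/archive/"
    # An explicit method="POST" with an empty body keeps the call from being
    # treated as a safe, repeatable GET by proxies or link previewers.
    return urllib.request.Request(url, data=b"", method="POST")
```

The returned object can be passed to `urllib.request.urlopen` once the required authentication headers have been added.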

🧪 Test suite refactoring

Test suites for the following modules were thoroughly refactored and modernized: Apps, Clusters, Files, Jobs, Queues, AWS adapter, Prometheus adapter, Resource Manager adapter, SSH adapter, and Units adapter. Database migrations during CI test runs are now significantly faster, reducing the time developers wait for test feedback.


🗑️ Removals

  • The legacy Alerts Django application has been fully removed. Its functionality has been replaced; removal reduces surface area for bugs and simplifies maintenance.
  • The legacy Notifications Django application has also been fully removed for the same reasons.
