Docker Applications FAQ

Best practices

Add main logic to the run script

Put the core of your logic in the run script. Avoid relying on prerun and postrun for main behavior.

Do not use latest-version containers

The latest tag points to whatever image was published most recently, so your application can break when the image changes. Pin the image to a specific version.
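A minimal sketch of how an image reference could be checked for pinning before it is added to an application. The function name is_pinned_image and the example.com image names are hypothetical and only serve as illustration; the check treats a missing tag the same as latest, because Docker falls back to latest when no tag is given.

```shell
# Return 0 if the image reference has an explicit tag other than "latest"
# or a digest; return 1 otherwise. (Hypothetical helper, not part of AAC.)
is_pinned_image() {
  ref=$1
  # A digest reference (name@sha256:...) always identifies one exact image.
  case $ref in
    *@sha256:*) return 0 ;;
  esac
  # Keep only the last path component so that a registry port
  # (host:5000/name) is not mistaken for a tag.
  name=${ref##*/}
  case $name in
    *:latest) return 1 ;;  # explicit latest tag
    *:*)      return 0 ;;  # explicit tag other than latest
    *)        return 1 ;;  # no tag: Docker defaults to latest
  esac
}

# Example with hypothetical image names.
for image in "registry.example.com:5000/team/app:1.4.2" "registry.example.com:5000/team/app"; do
  if is_pinned_image "$image"; then
    echo "pinned: $image"
  else
    echo "not pinned: $image"
  fi
done
```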

Use a subdirectory per application in /home/aac

Use separate subdirectories for each application so runs stay clean. Do not mix files from different apps in your home directory.
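One possible way to follow this in a run script is to create a dedicated directory per application and work only inside it. In the sketch below, AAC_HOME and app_workdir are hypothetical names chosen for illustration; only the /home/aac default comes from this page.

```shell
# Base directory; defaults to /home/aac. AAC_HOME is a hypothetical
# override that makes the helper easy to try outside the cluster.
AAC_HOME=${AAC_HOME:-/home/aac}

# Create the per-application directory if needed and print its path.
app_workdir() {
  dir="$AAC_HOME/$1"
  mkdir -p "$dir" && printf '%s\n' "$dir"
}
```

A run script could then start with a line such as cd "$(app_workdir my-app)" so that every file the application writes stays under /home/aac/my-app.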

Know how to run the container before adding it

You should know how to deploy and run the container before adding it. If the container has no Kubernetes documentation, run and test it locally first, then configure the scripts and variables for AMD Accelerator Cloud (AAC).

Application finishes when the run script finishes

The application runs until the container run script completes.

Troubleshooting

SSH connection does not appear

To expose SSH for a container, the application must:

  • Run the SSH daemon inside the container.
  • Configure a port named SSH with value 22.

The workload may need to pull the image, so the SSH port can appear after a delay.

Files not found

If you see "file not found" errors, the application or container is likely expecting a file from your home directory. Attach those files as input files when you run the workload. They will be in /home/aac on the cluster until you delete them.

Example error:

[prerun]: tar (child): demo-fastdata.tar.gz: Cannot open: No such file or directory
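A guard at the top of the prerun script can make this kind of failure easier to read. The sketch below is an assumption about how such a check might look; require_files is a hypothetical helper name, not an AAC feature.

```shell
# Report every missing file on stderr and return 1 if any is absent;
# return 0 when all listed files exist. (Hypothetical helper.)
require_files() {
  missing=0
  for f in "$@"; do
    if [ ! -f "$f" ]; then
      echo "missing input file: $f" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

For the error above, the prerun script could call require_files /home/aac/demo-fastdata.tar.gz || exit 1 before running tar, so the log names the file that still has to be attached as an input file.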

Containers depending on other containers

If containers depend on each other, use the order attribute to control startup order.

Memory requirements

If you hit memory issues, set the container memory requirement in the container configuration. Memory is in megabytes.
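Because the value is entered in megabytes, a requirement that you think of in gigabytes has to be converted first. The helper below is a small sketch that assumes 1 GiB corresponds to 1024 MB; gib_to_mb is a hypothetical name, and the exact unit convention AAC applies is not stated on this page.

```shell
# Convert a whole number of GiB to megabytes, assuming 1 GiB = 1024 MB.
gib_to_mb() {
  echo $(( $1 * 1024 ))
}

gib_to_mb 16   # prints 16384
```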

Container needs the public IP of another container

Deploy two separate applications:

  • Configure the backend application with its containers and expose the required endpoint.
  • Configure the frontend with the IP and endpoint of the backend.
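As an illustration of the second step, the frontend run script could build the backend address from variables set on the frontend application. BACKEND_IP, BACKEND_PORT, and backend_url are hypothetical names, and the http scheme is an assumption; use whatever your backend exposes.

```shell
# Print the backend base URL, or fail with a message when the
# (hypothetical) BACKEND_IP / BACKEND_PORT variables are not set.
backend_url() {
  if [ -z "${BACKEND_IP:-}" ] || [ -z "${BACKEND_PORT:-}" ]; then
    echo "BACKEND_IP and BACKEND_PORT must be set" >&2
    return 1
  fi
  printf 'http://%s:%s\n' "$BACKEND_IP" "$BACKEND_PORT"
}
```

Failing early when the variables are missing avoids starting a frontend that cannot reach its backend.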

Interactive workloads or services finish immediately

The container run script must be a long-running process. If the run script starts a detached process and then returns, the execution manager stops the application. To keep the container alive, append a command that blocks, such as && tail -f /dev/null.
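An alternative to blocking forever with tail is to keep the run script alive only while the service process exists, so the application also finishes when the service stops. This is a sketch of that different technique, not the method described above; wait_while_running is a hypothetical helper, and the sleep command stands in for a real service.

```shell
# Block until the process with the given PID no longer exists.
# kill -0 sends no signal; it only checks whether the PID is alive.
wait_while_running() {
  while kill -0 "$1" 2>/dev/null; do
    sleep 1
  done
}

# Stand-in for a service running in the background. A real run script
# would start the actual service and obtain its PID, for example from
# a PID file written by the service.
sleep 1 &
service_pid=$!

wait_while_running "$service_pid"
echo "service exited, run script ends"
```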

Kubernetes: manager pods fail

Manager pods have names such as stream-copy, create-scripts, read-analytics, retrieve-files, read-stdout, read-stderr. If they fail, common causes are:

  • The pod fails without displaying a reason or displaying any HTTP error. This points to an issue with the target cluster.
  • The pod shows a timeout error after 60 seconds. This can happen because the cluster cannot create the pod:
    • Cluster resources are insufficient.
    • The cluster has other issues (network, permissions).