Launch TensorFlow Docker Application

This guide shows how to run the TensorFlow Docker application. Sign in to AAC if you have not already.

Select application

Click Applications, then select TensorFlow. In the Select An Application pop-up, choose the TensorFlow version with container type docker.

Note

This example uses TensorFlow 2-10 ROCm 5-4-1 with the docker container type.

Select TensorFlow

Select an application

New workload

Click New Workload at the top right.

Select team

If you have more than one team, select one in the pop-up and click Launch. Click Next to continue.

Selected team

Select input files

Select resources

Set the number of GPUs (maximum 8) and the maximum allowed runtime, then click Next. Select the cluster and queue for the job, for example 1CN128C8G2H_2IB_MI210_SLES15 (Pre-emptible) - AAC Plano, then click Next.

Containers configuration

Queues

Review workload submission

Review the configuration and click Run Workload.

Containers resources

Queue

Click Run Workload

When the queue is available, the status changes to Running. Click the running workload to open it.

Workloads on demand

Use the SYSLOG, STDOUT, and STDERR tabs to view logs and output. A token appears in STDOUT, highlighted in yellow. Copy the token; you need it to log in to JupyterLab.

Syslog tab

TensorFlow STDOUT tab
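If you save the STDOUT text to a file, the token can also be pulled out programmatically instead of copied by hand. The following is a minimal sketch, assuming the token appears in a Jupyter-style URL of the form `...?token=<hex string>`. The exact log format on AAC is not confirmed here, and `extract_token` and the sample line are hypothetical names and data used only for illustration.

```python
import re

# Assumption: the token is printed as part of a Jupyter-style URL such as
# "http://host:8888/lab?token=<hex string>". Adjust the pattern if the
# STDOUT tab shows a different format.
TOKEN_PATTERN = re.compile(r"[?&]token=([0-9a-fA-F]+)")


def extract_token(stdout_text):
    """Return the first token found in the log text, or None if absent."""
    match = TOKEN_PATTERN.search(stdout_text)
    return match.group(1) if match else None


# Made-up sample line, for illustration only.
sample = "    http://127.0.0.1:8888/lab?token=0123abcd4567ef89"
print(extract_token(sample))
```

If the function returns None, the pattern probably does not match the actual log format, and copying the token from the STDOUT tab remains the reliable route.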

Interactive endpoints

When interactive endpoints are enabled, click Connect to open ML Studio (JupyterLab).

Paste the token in the Password or token field and click Login.

Jupyter page

You can use JupyterLab for Python development.

Jupyter notebook page
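Before starting longer work in a notebook, it can help to confirm that TensorFlow sees the GPUs requested for the workload. The sketch below uses `tf.config.list_physical_devices("GPU")` from the TensorFlow 2.x API. The helper name `visible_gpu_count` is hypothetical, and the import is guarded so the cell also runs where TensorFlow is not installed.

```python
def visible_gpu_count():
    """Return the number of GPUs TensorFlow can see, or None if
    TensorFlow is not importable in this environment."""
    try:
        import tensorflow as tf
    except ImportError:
        return None
    return len(tf.config.list_physical_devices("GPU"))


count = visible_gpu_count()
if count is None:
    print("TensorFlow is not installed in this environment.")
else:
    print(f"TensorFlow sees {count} GPU(s).")
```

If the reported count is lower than the number of GPUs selected for the workload, check the resource settings before running the benchmark.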

Click Terminal to open a terminal. To run the benchmark, enter:

python3 /root/benchmarks/scripts/tf_cnn_benchmarks/tf_cnn_benchmarks.py --model=resnet50 --num_gpus=8 --batch_size=256 --num_batches=100 --print_training_accuracy=True --variable_update=parameter_server --local_parameter_device=gpu

Service terminal output
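The command above requests 8 GPUs through `--num_gpus=8`. If the workload was launched with fewer GPUs, that flag presumably needs to match. The following sketch rebuilds the same command string with a configurable GPU count. `build_benchmark_command` is a hypothetical helper, and the 1 to 8 range check simply mirrors the limit stated in the resources step.

```python
import shlex

# Script path and flags are taken from the command in this guide.
SCRIPT = "/root/benchmarks/scripts/tf_cnn_benchmarks/tf_cnn_benchmarks.py"
MAX_GPUS = 8  # limit stated in the "Select resources" step


def build_benchmark_command(num_gpus, batch_size=256, num_batches=100):
    """Return the benchmark command line as a single shell-safe string."""
    if not 1 <= num_gpus <= MAX_GPUS:
        raise ValueError(f"num_gpus must be between 1 and {MAX_GPUS}")
    args = [
        "python3",
        SCRIPT,
        "--model=resnet50",
        f"--num_gpus={num_gpus}",
        f"--batch_size={batch_size}",
        f"--num_batches={num_batches}",
        "--print_training_accuracy=True",
        "--variable_update=parameter_server",
        "--local_parameter_device=gpu",
    ]
    return shlex.join(args)


# Print a 4-GPU variant to paste into the terminal.
print(build_benchmark_command(4))
```

Calling `build_benchmark_command(8)` reproduces the command shown in this guide unchanged.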

Collect performance metrics as needed.

Total images
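When comparing several runs, it may be convenient to read the throughput figure from saved terminal output rather than from the screen. The sketch below assumes the benchmark prints a summary line of the form `total images/sec: <number>`. The function name `total_images_per_sec` is hypothetical and the sample text is made up, not a measured result.

```python
import re

# Assumption: the benchmark output ends with a summary line such as
# "total images/sec: <number>".
RATE_PATTERN = re.compile(r"total images/sec:\s*([0-9]+(?:\.[0-9]+)?)")


def total_images_per_sec(log_text):
    """Return the last reported total images/sec as a float, or None."""
    matches = RATE_PATTERN.findall(log_text)
    return float(matches[-1]) if matches else None


# Made-up sample output, for illustration only (not a real measurement).
sample_log = """\
----------------------------------------------------------------
total images/sec: 1234.56
----------------------------------------------------------------
"""
print(total_images_per_sec(sample_log))
```

Taking the last match means that a file containing several concatenated runs yields the figure from the most recent one.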

When you are done, close JupyterLab.

Finish workload

Click Finish Workload.

TensorFlow overview tab

Download logs

After the workload finishes, open the STDOUT tab and click Download Logs to download the logs.
