careless examples

Here we provide a sample bash script for running the examples accompanying the careless preprint.

These examples have been tested against SBGrid careless release version 0.2.7.

CUDA SDK is required

Careless GPU support requires the CUDA Toolkit SDK to be installed; we have tested with version 11.2. The executables in the Toolkit are not redistributable, so we do not include them with our CUDA libraries.
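As a quick sanity check before running the examples, the following sketch reports which CUDA Toolkit release is on your PATH. It assumes that nvcc is installed with the Toolkit and that its version banner contains the text "release X.Y"; the helper name cuda_release is hypothetical and not part of careless or SBGrid.

```shell
#!/usr/bin/env bash

# Hypothetical helper: read `nvcc --version` output on stdin and print
# the first "release X.Y" number found in it.
cuda_release() {
    sed -n 's/.*release \([0-9][0-9]*\.[0-9][0-9]*\).*/\1/p' | head -n 1
}

# Report the Toolkit release if nvcc is available on PATH.
if command -v nvcc > /dev/null 2>&1; then
    echo "CUDA Toolkit release: $(nvcc --version | cuda_release)"
else
    echo "nvcc not found on PATH; is the CUDA Toolkit installed?" >&2
fi
```

If the reported release differs from 11.2, the examples may still run, but that combination is not the one tested here.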

libdevice not found

If your CUDA installation is in a non-standard directory, you may need to set the XLA_FLAGS environment variable to pass a custom value of --xla_gpu_cuda_data_dir. See the script below for an example. An error message stating "libdevice not found" indicates that this flag needs to be set.
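Before setting the flag, it can help to confirm that the directory you plan to pass really contains libdevice. The sketch below assumes the usual CUDA Toolkit layout, in which the bitcode files live under nvvm/libdevice inside the CUDA root. The function names has_libdevice and xla_flags_for are hypothetical helpers written for this illustration.

```shell
#!/usr/bin/env bash

# Hypothetical helper: succeed if the given directory contains
# nvvm/libdevice/libdevice*.bc (assumed standard CUDA Toolkit layout).
has_libdevice() {
    local cuda_dir="$1"
    compgen -G "${cuda_dir}/nvvm/libdevice/libdevice*.bc" > /dev/null
}

# Hypothetical helper: print the XLA_FLAGS value for a CUDA root,
# or fail with a message on stderr if libdevice is not there.
xla_flags_for() {
    local cuda_dir="$1"
    if has_libdevice "${cuda_dir}"; then
        printf -- '--xla_gpu_cuda_data_dir=%s\n' "${cuda_dir}"
    else
        echo "libdevice not found under ${cuda_dir}/nvvm/libdevice" >&2
        return 1
    fi
}
```

A possible use, with my_cuda_dir set to your local CUDA root, is `export XLA_FLAGS="$(xla_flags_for "${my_cuda_dir}")"`. If the helper fails, the directory is probably not the CUDA root that XLA expects.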

#!/usr/bin/env bash

# SBGrid 'careless' title - run examples
# Args: none
# James Vincent  -
# Jan 20, 2023

# Sample SLURM submission

#SBATCH --partition=mghpcc-gpu
#SBATCH --gres=gpu:NVIDIA_A40:1
#SBATCH --time=03:00:00
#SBATCH --job-name=jjv5-101051_careless

# Load cuda
module load  cuda/11.2

# Set my_cuda_dir to the root of your local CUDA install before this line
export XLA_FLAGS="--xla_gpu_cuda_data_dir=${my_cuda_dir}"

# Start SBGrid environment
source /programs/sbgrid.shrc
export CARELESS_X=0.2.7

# Get 'careless' examples 
curl -kLO

# pyp example - approx 14min on NVIDIA_A40
cd careless-examples-main/pyp
time ./
cd ../..

# little_careless example - approx 1.5min on NVIDIA_A40
cd careless-examples-main/little_careless
time python.careless
cd ../..

# hewl_ssad example - approx 20min on NVIDIA_A40
cd careless-examples-main/hewl_ssad
time ./
cd ../..

# thermolysin_xfel example - approx 33min on NVIDIA_A40
cd careless-examples-main/thermolysin_xfel
time ./
cd ../..

# careless_zero example - approx 1min on NVIDIA_A40
cd careless-examples-main/careless_zero
time python.careless ./
cd ../..


# Verify we have a working GPU
echo -e "hostname: $HOSTNAME \n\n"
echo -e "nvidia-smi output \n\n"
nvidia-smi

# Verify TF works:
echo -e "\n\n Testing TF with python.careless:  "
python.careless -c "import tensorflow as tf; print(tf.reduce_sum(tf.random.normal([1000, 1000])))"

# Verify GPU:
echo -e "\n\n Testing GPU with python.careless:  "
python.careless -c "import tensorflow as tf; print(tf.config.list_physical_devices('GPU'))"

timing results

The Apple M1 result is from a standard Mac build of careless, not an M1-optimized build.