https://github.com/rs-station/careless-examples
Here we provide a sample bash script for running the examples accompanying the careless preprint.
These examples have been tested against SBGrid careless release version 0.2.7.
CUDA SDK is required
Careless GPU support requires that the CUDA Toolkit SDK be installed. We have tested with version 11.2. The executables provided in this package are not redistributable, so we do not include them with our CUDA libraries.
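One way to confirm which CUDA Toolkit release is visible before submitting a job is to parse the banner printed by `nvcc --version`. The following is a minimal sketch, not part of SBGrid or careless: the function name `cuda_release_from_nvcc` is hypothetical, and it assumes the banner contains the usual `release X.Y` wording.

```shell
#!/usr/bin/env bash
# Hypothetical helper (an assumption, not an SBGrid tool): read
# `nvcc --version`-style text on stdin and print the first "X.Y" release found.
cuda_release_from_nvcc() {
    sed -n 's/.*release \([0-9][0-9]*\.[0-9][0-9]*\).*/\1/p' | head -n 1
}

# Usage, assuming nvcc is on PATH (for example after loading a CUDA module).
if command -v nvcc >/dev/null 2>&1; then
    echo "CUDA Toolkit release: $(nvcc --version | cuda_release_from_nvcc)"
else
    echo "nvcc not found on PATH; the CUDA Toolkit may not be installed or loaded"
fi
```

If the reported release differs from the tested version 11.2, that is worth noting when reporting problems, although other releases may also work.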
libdevice not found
If your CUDA installation is in a non-standard directory, you may need to set the XLA_FLAGS environment variable to pass a custom value of --xla_gpu_cuda_data_dir. See the script below for an example. An error message stating "libdevice not found" indicates that the custom flag needs to be set.
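Before setting the flag, it can help to check that the chosen directory really contains the libdevice bitcode. The sketch below assumes the standard CUDA Toolkit layout, in which the files live under `nvvm/libdevice` inside the install directory. The function name `has_libdevice` and the default path `/usr/local/cuda` are illustrative assumptions, not SBGrid conventions.

```shell
#!/usr/bin/env bash
# Hypothetical helper (an assumption, not an SBGrid tool): succeed if the
# given directory holds libdevice bitcode under nvvm/libdevice.
has_libdevice() {
    local f
    for f in "$1"/nvvm/libdevice/libdevice*.bc; do
        # An unmatched glob stays literal, so -e fails and we fall through.
        [ -e "$f" ] && return 0
    done
    return 1
}

# my_cuda_dir is a placeholder; point it at your local CUDA install.
my_cuda_dir="${my_cuda_dir:-/usr/local/cuda}"
if has_libdevice "${my_cuda_dir}"; then
    export XLA_FLAGS="--xla_gpu_cuda_data_dir=${my_cuda_dir}"
    echo "XLA_FLAGS=${XLA_FLAGS}"
else
    echo "No libdevice*.bc found under ${my_cuda_dir}/nvvm/libdevice" >&2
fi
```

Checking first avoids exporting a flag that points at a directory without libdevice, which would leave the original error in place.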
#!/usr/bin/env bash
# SBGrid 'careless' title - run examples
# Args: none
#
# James Vincent - biogrids.org
# vincent@hkl.hms.harvard.edu
# Jan 20, 2023
# Sample SLURM submission
#SBATCH --partition=mghpcc-gpu
#SBATCH --gres=gpu:NVIDIA_A40:1
#SBATCH --time=03:00:00
#SBATCH --job-name=jjv5-101051_careless
#SBATCH --mail-type=BEGIN,END,FAIL
# Load cuda
module load cuda/11.2
# Set my_cuda_dir to your local install
my_cuda_dir="/programs/local/cuda/11.2"
export XLA_FLAGS="--xla_gpu_cuda_data_dir=${my_cuda_dir}"
# Start SBGrid environment
source /programs/sbgrid.shrc
export CARELESS_X=0.2.7
# Get 'careless' examples
curl -kLO https://github.com/rs-station/careless-examples/archive/main.zip
unzip main.zip
# pyp example - approx 14min on NVIDIA_A40
cd careless-examples-main/pyp
time ./merge.sh
cd ../..
# little_careless example - approx 1.5min on NVIDIA_A40
cd careless-examples-main/little_careless
time python.careless model.py
cd ../..
# hewl_ssad example - approx 20min on NVIDIA_A40
cd careless-examples-main/hewl_ssad
time ./merge.sh
cd ../..
# thermolysin_xfel example - approx 33min on NVIDIA_A40
cd careless-examples-main/thermolysin_xfel
time ./merge.sh
cd ../..
# careless_zero example - approx 1min on NVIDIA_A40
cd careless-examples-main/careless_zero
time python.careless ./careless_zero.py
cd ../..
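# The script stops here. The diagnostic commands below do not run unless this 'exit' is removed or moved.
exit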
# Verify we have a working GPU
echo -e "hostname: $HOSTNAME \n\n"
echo -e "nvidia-smi output \n\n"
nvidia-smi
# Verify TF works:
echo -e "\n\n Testing TF with python.careless: "
python.careless -c "import tensorflow as tf; print(tf.reduce_sum(tf.random.normal([1000, 1000])))"
# Verify GPU:
echo -e "\n\n Testing GPU with python.careless: "
python.careless -c "import tensorflow as tf; print(tf.config.list_physical_devices('GPU'))"
The Apple M1 result is from a standard Mac build of careless, not an M1-optimized build.