
RoseTTAFold-All-Atom Example

Below we provide a sample bash script, a FASTA file renamed from one of the RoseTTAFold-All-Atom (RFAA) examples, and a YAML config file.

You must run RFAA outside of the installation directory.

To do this, provide your own YAML config file(s) and tell RFAA where to find them. The example below uses Hydra's --config-dir option for this.

These environment variables must be set:

ROSETTAFOLD_DB_PATH - the directory containing the RoseTTAFold databases. The directory should be laid out like this example:

/programs/local/rosettafold
├── bfd
│   ├── bfd_metaclust_clu_complete_id30_c90_final_seq.sorted_opt_a3m.ffdata
│   ├── bfd_metaclust_clu_complete_id30_c90_final_seq.sorted_opt_a3m.ffindex
│   ├── bfd_metaclust_clu_complete_id30_c90_final_seq.sorted_opt_cs219.ffdata
│   ├── bfd_metaclust_clu_complete_id30_c90_final_seq.sorted_opt_cs219.ffindex
│   ├── bfd_metaclust_clu_complete_id30_c90_final_seq.sorted_opt_hhm.ffdata
│   └── bfd_metaclust_clu_complete_id30_c90_final_seq.sorted_opt_hhm.ffindex
├── pdb100_2021Mar03
│   ├── LICENSE
│   ├── pdb100_2021Mar03_a3m.ffdata
│   ├── pdb100_2021Mar03_a3m.ffindex
│   ├── pdb100_2021Mar03_cs219.ffdata
│   ├── pdb100_2021Mar03_cs219.ffindex
│   ├── pdb100_2021Mar03_hhm.ffdata
│   ├── pdb100_2021Mar03_hhm.ffindex
│   ├── pdb100_2021Mar03_pdb.ffdata
│   └── pdb100_2021Mar03_pdb.ffindex
└── UniRef30_2020_06
    ├── UniRef30_2020_06_a3m.ffdata
    ├── UniRef30_2020_06_a3m.ffindex
    ├── UniRef30_2020_06_cs219.ffdata
    ├── UniRef30_2020_06_cs219.ffindex
    ├── UniRef30_2020_06_hhm.ffdata
    ├── UniRef30_2020_06_hhm.ffindex
    ├── UniRef30_2020_06_hhsuite.tar.gz
    └── UniRef30_2020_06.md5sums
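
As a quick sanity check before submitting a job, the layout above can be verified from the shell. This is a minimal sketch: the function name check_rfaa_db_layout is hypothetical (it is not part of RFAA or SBGrid), and it only checks that the three top-level subdirectories shown in the tree exist, not that every file inside them is complete.

```shell
# Hypothetical helper (not part of RFAA or SBGrid): verify that the three
# database subdirectories shown in the tree above exist under a given root.
check_rfaa_db_layout() {
  local root="$1"
  local missing=0
  local sub
  for sub in bfd pdb100_2021Mar03 UniRef30_2020_06; do
    if [ ! -d "${root}/${sub}" ]; then
      echo "missing: ${root}/${sub}" >&2
      missing=1
    fi
  done
  # Exit status 0 if all three directories were found, 1 otherwise.
  return "${missing}"
}
```

A typical call would be check_rfaa_db_layout "${ROSETTAFOLD_DB_PATH}" || exit 1, placed after the variable has been exported.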

ROSETTAFOLDAA_X - the version of RoseTTAFold-All-Atom to use from the SBGrid installation.

RFAA_DIR - the directory where RoseTTAFold-All-Atom is installed under SBGrid. In most cases the value set in the script below will be correct.
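
The three variables described above must be exported before RFAA is called. The following is a small sketch of a guard that fails early if any of them is missing. The function name require_rfaa_env is hypothetical, and the check relies on printenv, so it only sees variables that have been exported.

```shell
# Hypothetical guard (not part of RFAA or SBGrid): report any of the three
# required variables that is unset or empty in the exported environment.
require_rfaa_env() {
  local name
  local status=0
  for name in ROSETTAFOLD_DB_PATH ROSETTAFOLDAA_X RFAA_DIR; do
    # printenv prints nothing for a variable that is not exported.
    if [ -z "$(printenv "${name}")" ]; then
      echo "environment variable ${name} is not set" >&2
      status=1
    fi
  done
  return "${status}"
}
```

Calling require_rfaa_env || exit 1 just before the inference command stops the job immediately instead of letting it fail later with a less obvious error.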

The example script expects two files in the current directory:

sbgrid_example.yaml

and

sbgrid_example.fasta

We purposely renamed these example files to make sure the run uses our own files and not the example files that ship with RFAA.

sbgrid_example.yaml is the combined protein.yaml and base.yaml from the RFAA config/inference examples directory.

sbgrid_example.fasta contains the 7U7W_1 protein from that same example.
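
Because the job expects both files in the working directory, a short pre-flight check can catch a missing file or a wrong directory before any GPU time is used. This is a sketch only: check_rfaa_inputs is a hypothetical name, and the FASTA test simply looks for a leading '>' header line rather than validating the sequence.

```shell
# Hypothetical pre-flight check (not part of RFAA or SBGrid): confirm that
# both input files exist and are non-empty in the given directory
# (default: the current directory).
check_rfaa_inputs() {
  local dir="${1:-.}"
  local f
  for f in sbgrid_example.yaml sbgrid_example.fasta; do
    if [ ! -s "${dir}/${f}" ]; then
      echo "missing or empty: ${dir}/${f}" >&2
      return 1
    fi
  done
  # A FASTA file is expected to begin with a '>' header line.
  if ! head -n 1 "${dir}/sbgrid_example.fasta" | grep -q '^>'; then
    echo "sbgrid_example.fasta does not start with a FASTA header" >&2
    return 1
  fi
}
```

Run it as check_rfaa_inputs || exit 1 from the directory that holds the two files.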

#!/usr/bin/env bash

## SBGrid RosettaFold-All-Atom example
##
## help@sbgrid.org
## May 13, 2024


# SLURM
#SBATCH --mem=64G
#SBATCH -t 2:00:00
#SBATCH -p gpu_quad
#SBATCH --gres=gpu:1


# Start SBGrid environment
source /programs/sbgrid.shrc

# Set version of RFAA
export ROSETTAFOLDAA_X=bf21483

# Set critical env variables
export ROSETTAFOLD_DB_PATH=/n/shared_db/RoseTTAFold
export RFAA_DIR=/programs/x86_64-linux/rosettafoldaa/${ROSETTAFOLDAA_X}/RoseTTAFold-All-Atom

# Sanity check: print the version of the Python bundled with RFAA
python.rfaa --version

# Run inference, overriding Hydra parameters from the YAML file on the command line.
# The expansions are quoted so that paths containing spaces are passed intact.
python.rfaa -m rf2aa.run_inference \
  --config-name sbgrid_example --config-dir ./ \
  checkpoint_path="${RFAA_DIR}/RFAA_paper_weights.pt" \
  database_params.command="${RFAA_DIR}/make_msa.sh" \
  database_params.hhdb="${ROSETTAFOLD_DB_PATH}/pdb100_2021Mar03/pdb100_2021Mar03"