98e8d9d676304739804a2bafc4c2858d8367a89f
examples/alphafold2.md
... | ... | @@ -1,8 +1,18 @@ |
1 | 1 | |
2 | 2 | ## ALPHAFOLD2 |
3 | - |
|
3 | +<!-- TOC --> |
|
4 | + |
|
5 | +- [ALPHAFOLD2](#alphafold2) |
|
6 | + - [Preparing to run Alphafold](#preparing-to-run-alphafold) |
|
7 | + - [Using the default run_alphafold.sh wrapper](#using-the-default-run_alphafoldsh-wrapper) |
|
8 | + - [Creating your own run_alphafold script](#creating-your-own-run_alphafold-script) |
|
9 | + - [Running the python script run_alphafold.py directly](#running-the-python-script-run_alphafoldpy-directly) |
|
10 | + - [Examples:](#examples) |
|
11 | + - [Web portal](#web-portal) |
|
12 | + - [Known issues](#known-issues) |
|
13 | + |
|
14 | +<!-- /TOC --> |
|
4 | 15 | ### Preparing to run Alphafold |
5 | - |
|
6 | 16 | The ALPHAFOLD2 source is an implementation of the inference pipeline of AlphaFold v2.0, which uses a completely new model that was entered in CASP14. This is not a production application per se, but a reference implementation that is capable of producing structures from a single amino acid sequence. |
7 | 17 | |
8 | 18 | The SBGrid installation of AlphaFold2 does not require Docker to run, but does require a relatively recent NVIDIA GPU and an updated driver. |
... | ... | @@ -59,6 +69,7 @@ The database directory shouuld look like this : |
59 | 69 | └── uniref90.fasta |
60 | 70 | ``` |
61 | 71 | |
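Before starting a long run, it can help to confirm that the database directory contains the entries the run script will ask for. The sketch below is one possible pre-flight check. The function name `check_alphafold_db` is hypothetical, and the list of expected paths is taken from the flags in run_alphafold_template.sh rather than from an official manifest, so adjust it if your layout differs.

```shell
#!/bin/bash
# Hypothetical pre-flight check (not part of the SBGrid installation).
# Reports any path that run_alphafold_template.sh expects but that is
# missing under the given database directory.
check_alphafold_db() {
  local data_dir="$1" missing=0 path
  # Paths relative to the database directory, copied from the template's flags.
  # uniclust30, bfd, and pdb70 are database prefixes, so only their
  # containing directories are checked here.
  local expected=(
    "uniref90/uniref90.fasta"
    "mgnify/mgy_clusters.fa"
    "uniclust30/uniclust30_2018_08"
    "bfd"
    "pdb70"
    "pdb_mmcif/mmcif_files"
    "pdb_mmcif/obsolete.dat"
  )
  for path in "${expected[@]}"; do
    if [ ! -e "${data_dir}/${path}" ]; then
      echo "missing: ${data_dir}/${path}" >&2
      missing=1
    fi
  done
  return "${missing}"
}

# Check ALPHAFOLD_DB if it is set, otherwise the default location.
if check_alphafold_db "${ALPHAFOLD_DB:-/programs/local/alphafold}"; then
  echo "Database layout looks complete."
else
  echo "Database layout is incomplete; see the messages above."
fi
```

The check only tests that each path exists; it does not verify that the downloads are complete or uncorrupted.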
72 | +### Using the default run_alphafold.sh wrapper |
|
62 | 73 | Once the databases are in place, AlphaFold can be run with the wrapper script run_alphafold.sh. The default location for the databases is `/programs/local/alphafold`, but it can be changed by setting the ALPHAFOLD_DB environment variable. For example: |
63 | 74 | |
64 | 75 | ``` |
... | ... | @@ -78,10 +89,27 @@ To use the run script, specify the path to the fasta file and an output directy |
78 | 89 | run_alphafold.sh <path to fasta file> <path to an output directory> |
79 | 90 | ``` |
80 | 91 | |
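The way the wrapper chooses a database directory can be sketched as follows. This mirrors the logic in run_alphafold_template.sh: ALPHAFOLD_DB is used only when it is set and points to an existing directory, and the default location is used otherwise. The function name `resolve_alphafold_db` is an assumption made for illustration.

```shell
#!/bin/bash
# Sketch of the database lookup, mirroring run_alphafold_template.sh.
# resolve_alphafold_db is a hypothetical name, not an installed command.
resolve_alphafold_db() {
  local default_dir="/programs/local/alphafold"
  # Prefer ALPHAFOLD_DB, but only if it is non-empty and is a directory.
  if [ -n "${ALPHAFOLD_DB:-}" ] && [ -d "${ALPHAFOLD_DB}" ]; then
    echo "${ALPHAFOLD_DB}"
  else
    echo "${default_dir}"
  fi
}

echo "Using databases in $(resolve_alphafold_db)"
```

Note that an ALPHAFOLD_DB value that does not point to a directory is ignored rather than treated as an error, so a typo in the variable falls back to the default location.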
92 | +### Creating your own run_alphafold script |
|
93 | + |
|
94 | +You can use our run_alphafold script template, linked below, as a starting point for your own run script. |
|
95 | + |
|
96 | +[run_alphafold_template.sh](run_alphafold_template.sh) |
|
97 | + |
|
98 | +### Running the python script run_alphafold.py directly |
|
81 | 99 | run_alphafold.sh is a convenience wrapper that shortens the command-line arguments required by run_alphafold.py. You can also call run_alphafold.py directly; it requires every parameter to be set explicitly, but provides greater flexibility. Pass --helpshort or --helpfull to see help on flags. |
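To give a sense of how many parameters a direct call involves, the sketch below assembles the same flags that run_alphafold_template.sh passes and prints the resulting command, one argument per line, instead of running it. The function name `build_alphafold_cmd` and the dry-run approach are assumptions for illustration; the flag names, values, and paths are copied from the template.

```shell
#!/bin/bash
# Hypothetical dry-run helper: prints the run_alphafold.py command line
# without executing it. Flags mirror run_alphafold_template.sh.
build_alphafold_cmd() {
  local data_dir="$1" input_fasta="$2" output_dir="$3"
  local cmd=(
    /programs/x86_64-linux/alphafold/2.0.0/bin.capsules/run_alphafold.py
    --model_names="model_1,model_2,model_3,model_4,model_5"
    --data_dir="${data_dir}"
    --uniref90_database_path="${data_dir}/uniref90/uniref90.fasta"
    --mgnify_database_path="${data_dir}/mgnify/mgy_clusters.fa"
    --uniclust30_database_path="${data_dir}/uniclust30/uniclust30_2018_08/uniclust30_2018_08"
    --bfd_database_path="${data_dir}/bfd/bfd_metaclust_clu_complete_id30_c90_final_seq.sorted_opt"
    --pdb70_database_path="${data_dir}/pdb70/pdb70"
    --template_mmcif_dir="${data_dir}/pdb_mmcif/mmcif_files"
    --obsolete_pdbs_path="${data_dir}/pdb_mmcif/obsolete.dat"
    --max_template_date="2020-05-14"
    --preset="full_dbs"
    --output_dir="${output_dir}"
    --fasta_paths="${input_fasta}"
  )
  # One argument per line keeps the output easy to review.
  printf '%s\n' "${cmd[@]}"
}

build_alphafold_cmd /programs/local/alphafold \
  /programs/x86_64-linux/alphafold/2.0.0/alphafold/data/T1050.fasta \
  /tmp/alphafold
```

Building the command as a bash array keeps each flag a single word even if a path contains spaces, which is one reason to prefer it over a long backslash-continued string when adapting the template.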
82 | 100 | |
101 | +### Examples: |
|
102 | + |
|
103 | +We include reference sequences from CASP14 in the installation. |
|
104 | +This command should run successfully: |
|
105 | + |
|
106 | +``` |
|
107 | +run_alphafold.sh /programs/x86_64-linux/alphafold/2.0.0/alphafold/data/T1050.fasta |
|
108 | +``` |
|
83 | 109 | |
110 | +### Web portal |
|
84 | 111 | AlphaFold can also be run through a web portal. See |
85 | 112 | https://colab.research.google.com/github/deepmind/alphafold/blob/main/notebooks/AlphaFold.ipynb . |
86 | 113 | |
114 | +### Known issues |
|
87 | 115 | This version may not run on some newer GPUs that require CUDA 11.1 or later. An update is coming that should correct this. |
... | ... | \ No newline at end of file |
examples/run_alphafold_template.sh
... | ... | @@ -0,0 +1,57 @@ |
1 | +#!/bin/bash |
|
2 | +# Jason Key 1fac4ac 2021-07-23 08:33:47 -0400 |
|
3 | +# Copyright © 2021 SBGrid Consortium. All rights reserved. |
|
4 | +# |
|
5 | +# wrapper script for alphafold |
|
6 | + |
|
7 | +USAGE="$(basename "$0"): A wrapper script for running Alphafold jobs in the SBGrid Software installation \n\n |
|
8 | +Usage: $(basename "$0") [fasta file] [path to output directory] \n\n" |
|
9 | + |
|
10 | +if [ $# -eq 0 ]; then |
|
11 | + echo -e "$USAGE" |
|
12 | + exit 1 |
|
13 | +fi |
|
14 | + |
|
15 | +## To set single GPU |
|
16 | +# export CUDA_VISIBLE_DEVICES=0 |
|
17 | + |
|
18 | +data_dir="/programs/local/alphafold/" |
|
19 | +if [ -n "${ALPHAFOLD_DB}" ] && [ -d "${ALPHAFOLD_DB}" ] ; then |
|
20 | + data_dir=${ALPHAFOLD_DB} |
|
21 | +fi |
|
22 | + |
|
23 | +if [ ! -d "${data_dir}" ] ; then |
|
24 | + echo "${data_dir} is not a directory. Exiting... " |
|
25 | + exit 1 |
|
26 | +fi |
|
27 | + |
|
28 | +echo "Using databases in ${data_dir}" |
|
29 | + |
|
30 | +input_fasta=$1 |
|
31 | + |
|
32 | +if [ ! -f "${input_fasta}" ] ; then |
|
33 | + echo "${input_fasta} is not a file. Exiting... " |
|
34 | + exit 1 |
|
35 | +fi |
|
36 | + |
|
37 | +output_dir="/tmp/alphafold" |
|
38 | +if [ -n "$2" ] ; then |
|
39 | + output_dir=$2 |
|
40 | +fi |
|
41 | + |
|
42 | +mkdir -p "${output_dir}" |
|
43 | + |
|
44 | +/programs/x86_64-linux/alphafold/2.0.0/bin.capsules/run_alphafold.py \ |
|
45 | +--model_names="model_1,model_2,model_3,model_4,model_5" \ |
|
46 | +--data_dir="${data_dir}" \ |
|
47 | +--uniref90_database_path="${data_dir}/uniref90/uniref90.fasta" \ |
|
48 | +--mgnify_database_path="${data_dir}/mgnify/mgy_clusters.fa" \ |
|
49 | +--uniclust30_database_path="${data_dir}/uniclust30/uniclust30_2018_08/uniclust30_2018_08" \ |
|
50 | +--bfd_database_path="${data_dir}/bfd/bfd_metaclust_clu_complete_id30_c90_final_seq.sorted_opt" \ |
|
51 | +--pdb70_database_path="${data_dir}/pdb70/pdb70" \ |
|
52 | +--template_mmcif_dir="${data_dir}/pdb_mmcif/mmcif_files" \ |
|
53 | +--obsolete_pdbs_path="${data_dir}/pdb_mmcif/obsolete.dat" \ |
|
54 | +--max_template_date="2020-05-14" \ |
|
55 | +--preset="full_dbs" \ |
|
56 | +--output_dir="${output_dir}" \ |
|
57 | +--fasta_paths="${input_fasta}" |