d9e2c66f883abd114c6cb3998930357ec96229ad
examples.md
... | ... | @@ -10,3 +10,4 @@ The following pages provide usage info and examples for select applications in t |
10 | 10 | - [PHENIX - phenix.rosetta_refine](examples/phenix.rosetta_refine) |
11 | 11 | - [SCIPION - running tutorials without write privileges](examples/running_scipion_tutorials) |
12 | 12 | - [DIALS - Version control in DIALS](examples/DIALS_version_control) |
13 | +- [ALPHAFOLD2](examples/alphafold2) |
|
... | ... | \ No newline at end of file |
examples/alphafold2.md
... | ... | @@ -0,0 +1,87 @@ |
1 | + |
|
2 | +## ALPHAFOLD2 |
|
3 | + |
|
4 | +### Preparing to run AlphaFold |
|
5 | + |
|
6 | +The ALPHAFOLD2 source is an implementation of the inference pipeline of AlphaFold v2.0, which uses a completely new model that was entered in CASP14. This is not a production application per se, but a reference implementation that is capable of producing structures from a single amino acid sequence. |
|
7 | + |
|
8 | +The SBGrid installation of AlphaFold2 does not require Docker to run, but it does require a relatively recent NVIDIA GPU and an up-to-date driver. |
|
9 | + |
|
10 | +AlphaFold requires a set of (large) genetic databases that must be downloaded separately. See https://github.com/deepmind/alphafold#genetic-databases for more information. |
|
11 | + |
|
12 | +These databases can be downloaded with the included download script and the aria2c program, both of which are available in the SBGrid collection. Note that the databases are large (> 2 TB) and may take a significant amount of time to download. |
|
13 | + |
|
14 | +``` |
|
15 | +/programs/x86_64-linux/alphafold/2.0.0/alphafold/scripts/download_all_data.sh <destination path> |
|
16 | +``` |
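Because the download is so large, it may be worth confirming that the destination filesystem has enough room before starting. The following is a minimal sketch, not part of AlphaFold or the SBGrid installation: `has_free_space` is a hypothetical helper, and the 2 TiB threshold is an assumption derived from the "> 2 TB" figure above.

```shell
#!/usr/bin/env bash
# Sketch: check free space before starting the database download.
# has_free_space is a hypothetical helper, not part of AlphaFold or SBGrid.

# Return 0 if the filesystem holding $1 has at least $2 KiB available.
has_free_space() {
    local dir="$1" need_kb="$2" avail_kb
    # df -Pk prints one line per filesystem; column 4 is the available
    # space in 1024-byte blocks.
    avail_kb=$(df -Pk "$dir" | awk 'NR==2 {print $4}')
    [ -n "$avail_kb" ] && [ "$avail_kb" -ge "$need_kb" ]
}

# 2 TiB expressed in KiB. The text only says "> 2 TB", so treat this as
# a lower bound (assumption).
NEED_KB=$((2 * 1024 * 1024 * 1024))
DEST="${1:-.}"   # destination path; defaults to the current directory

if has_free_space "$DEST" "$NEED_KB"; then
    echo "OK: at least 2 TiB free in $DEST"
else
    echo "WARNING: less than 2 TiB free in $DEST"
fi
```

Run it with the intended destination path as the only argument, then start `download_all_data.sh` if the check passes.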
|
17 | + |
|
18 | +The database directory should look like this: |
|
19 | + |
|
20 | +``` |
|
21 | +├── bfd |
|
22 | +│ ├── bfd_metaclust_clu_complete_id30_c90_final_seq.sorted_opt_a3m.ffdata |
|
23 | +│ ├── bfd_metaclust_clu_complete_id30_c90_final_seq.sorted_opt_a3m.ffindex |
|
24 | +│ ├── bfd_metaclust_clu_complete_id30_c90_final_seq.sorted_opt_cs219.ffdata |
|
25 | +│ ├── bfd_metaclust_clu_complete_id30_c90_final_seq.sorted_opt_cs219.ffindex |
|
26 | +│ ├── bfd_metaclust_clu_complete_id30_c90_final_seq.sorted_opt_hhm.ffdata |
|
27 | +│ └── bfd_metaclust_clu_complete_id30_c90_final_seq.sorted_opt_hhm.ffindex |
|
28 | +├── mgnify |
|
29 | +│ └── mgy_clusters.fa |
|
30 | +├── params |
|
31 | +│ ├── LICENSE |
|
32 | +│ ├── params_model_1.npz |
|
33 | +│ ├── params_model_1_ptm.npz |
|
34 | +│ ├── params_model_2.npz |
|
35 | +│ ├── params_model_2_ptm.npz |
|
36 | +│ ├── params_model_3.npz |
|
37 | +│ ├── params_model_3_ptm.npz |
|
38 | +│ ├── params_model_4.npz |
|
39 | +│ ├── params_model_4_ptm.npz |
|
40 | +│ ├── params_model_5.npz |
|
41 | +│ └── params_model_5_ptm.npz |
|
42 | +├── pdb70 |
|
43 | +│ ├── md5sum |
|
44 | +│ ├── pdb70_a3m.ffdata |
|
45 | +│ ├── pdb70_a3m.ffindex |
|
46 | +│ ├── pdb70_clu.tsv |
|
47 | +│ ├── pdb70_cs219.ffdata |
|
48 | +│ ├── pdb70_cs219.ffindex |
|
49 | +│ ├── pdb70_hhm.ffdata |
|
50 | +│ ├── pdb70_hhm.ffindex |
|
51 | +│ └── pdb_filter.dat |
|
52 | +├── pdb_mmcif |
|
53 | +│ ├── mmcif_files |
|
54 | +│ ├── obsolete.dat |
|
55 | +│ └── raw |
|
56 | +├── uniclust30 |
|
57 | +│ └── uniclust30_2018_08 |
|
58 | +└── uniref90 |
|
59 | + └── uniref90.fasta |
|
60 | +``` |
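A quick way to compare a finished download against the listing above is to test for each top-level directory. This is only a sketch under stated assumptions: `missing_db_dirs` is a hypothetical helper, and it checks the seven top-level directory names shown above without validating the files inside them.

```shell
#!/usr/bin/env bash
# Sketch: confirm the top-level database directories from the listing exist.
# missing_db_dirs is a hypothetical helper, not part of AlphaFold or SBGrid.

# Print each expected subdirectory that is missing under $1.
# Returns non-zero if any is missing.
missing_db_dirs() {
    local root="$1" status=0 sub
    for sub in bfd mgnify params pdb70 pdb_mmcif uniclust30 uniref90; do
        if [ ! -d "$root/$sub" ]; then
            echo "$sub"
            status=1
        fi
    done
    return "$status"
}

DB_ROOT="${1:-.}"   # database directory; defaults to the current directory

if missing=$(missing_db_dirs "$DB_ROOT"); then
    echo "All expected database directories found in $DB_ROOT"
else
    # $missing is left unquoted on purpose so the names print on one line.
    echo "Missing under $DB_ROOT:" $missing
fi
```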
|
61 | + |
|
62 | +Once the databases are in place, AlphaFold can be run with the wrapper script run_alphafold.sh. The default location for the databases is `/programs/local/alphafold`, but this can be changed by setting the ALPHAFOLD_DB environment variable. For example: |
|
63 | + |
|
64 | +``` |
|
65 | +export ALPHAFOLD_DB="/tmp/databases" |
|
66 | +``` |
|
67 | + |
|
68 | +sets `/tmp/databases` as the database location for the run script in bash. |
|
69 | +tcsh users would use: |
|
70 | + |
|
71 | +``` |
|
72 | +setenv ALPHAFOLD_DB "/tmp/databases" |
|
73 | +``` |
|
74 | + |
|
75 | +To use the run script, specify the path to the FASTA file and an output directory like so: |
|
76 | + |
|
77 | +``` |
|
78 | +run_alphafold.sh <path to fasta file> <path to an output directory> |
|
79 | +``` |
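Since a prediction can take a long time, a small pre-flight check on the inputs may save a failed run. The sketch below is illustrative only: `looks_like_fasta` is a hypothetical helper that assumes the first character of the file is the `>` of a FASTA header, and creating the output directory up front is an assumption, since the wrapper may handle that itself.

```shell
#!/usr/bin/env bash
# Sketch: sanity-check the inputs before calling run_alphafold.sh.
# looks_like_fasta is a hypothetical helper, not part of AlphaFold or SBGrid.

# Return 0 if $1 is readable and its first character is '>'.
looks_like_fasta() {
    [ -r "$1" ] && [ "$(head -c 1 "$1")" = ">" ]
}

# Only act when called with exactly two arguments:
#   <path to fasta file> <path to an output directory>
if [ "$#" -eq 2 ]; then
    fasta="$1"
    outdir="$2"
    if looks_like_fasta "$fasta"; then
        mkdir -p "$outdir"   # assumption: harmless if the wrapper creates it too
        run_alphafold.sh "$fasta" "$outdir"
    else
        echo "Not a readable FASTA file: $fasta" >&2
    fi
fi
```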
|
80 | + |
|
81 | +run_alphafold.sh is a convenience wrapper that shortens the command-line arguments required by run_alphafold.py. The run_alphafold.py script is also available; it requires all parameters to be set explicitly but provides greater flexibility. Pass --helpshort or --helpfull to see help on flags. |
|
82 | + |
|
83 | + |
|
84 | +AlphaFold can also be run through a web portal (a Google Colab notebook). See |
|
85 | +https://colab.research.google.com/github/deepmind/alphafold/blob/main/notebooks/AlphaFold.ipynb . |
|
86 | + |
|
87 | +Known issues: This version may not run on some newer GPUs that require CUDA 11.1 or later. An update is coming that should correct this. |
|
... | ... | \ No newline at end of file |