manual.txt

!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
NOTE: ATTRACT 2.0 is still UNDER DEVELOPMENT 
This manual is still in the DRAFT stage and 
 will undergo a serious overhaul before the final release
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

TODO: --cdie has distances capped at 50 A!
  rr1 = 1.0d0/sqrt(r2)-1.0/50.0
  r2a = rr2 - (1.0/50.0)*(1.0/50.0)
  
--only-flex
TODO: grid restraints (sym.cpp)
TODO: 1/(rmsd+1) weighting of MC ensemble switching
NEW: gravity 4, ghost, --fast option for randsearch
NEW: cdie, potshape, all-atom, alphabet, proxlim=0
NEW: --grid <index 1> <index 2>
NEW: --ensemble option: --ensemble <ligand> <PDB list file>
NEW: 2 extra floats in the parameter file: start and end of the switch
 function; if they are zero, then no switching 
NEW --shm switch for make-grid
NEW: make-grid omp version:
     timed on 1AVXA 
      normal: 199 seconds
      omp: 58 seconds (4 threads)
      omp-torque: 104 seconds (4 threads)
      omp-torque + shm: 94 seconds (4 threads)
NEW: --sym 
Basic usage:

attract.inp is gone. It's all command line options now. Starting structure generation is de-coupled from the docking. To perform a standard docking protocol, first run the systsearch (or randsearch) tool to write a DOF file, run ATTRACT on that DOF file, run ATTRACT again on the output of the first ATTRACT, etc. You can even connect them with Unix pipes if you want.
Just type "attract" without any arguments to get a basic idea of its command line usage.
Keep in mind that the receptor is not kept fixed by default!

The output structures are printed on stdout. Diagnostic messages are (should be) printed on stderr. Pipe them as needed. 

Coding issue: In a terminal there is no problem, but piped messages are not always printed in the order you expect. This is because C and Fortran have their independent printing mechanisms.

********************************************************************************
A short word on docking mechanics
********************************************************************************

The new ATTRACT version can be used in "classical" mode, i.e. without grids or anything. 
The code concerning these calculations (nonbon8.f, minfor.f) has been largely untouched, which has two advantages. First, everyone currently has their private ATTRACT versions. It should be easy to port the changes to the new version.
Second, the new ATTRACT will give exactly the same docking results as the old version, but within certain limits.
Docking is inherently chaotic: minimal perturbances in coordinates will give rise to macroscopic differences in docking results in a fraction of the cases. So, keep the following in mind:
- A single invocation of ATTRACT will give exactly the same results as a one-stage attract.inp, if the CPU architecture is the same, and if you fix the receptor.
- Multiple ATTRACT invocations will not give the same results as a multi-stage attract.inp, because all DOFs are written to disk in between docking stages, causing all DOFs and coordinates to be rounded with .3 - .8 digits precision.
- The old multi-body ATTRACT version will differ in the same way. There is an additional source of rounding differences here, caused by the fact that the new ATTRACT calculates the energies and forces as the sum of pairwise receptor-ligand interactions, with the ligand always rotated into the coordinate frame of the receptor.

********************************************************************************
The DOF file format (structures.dat)
********************************************************************************

This file format is both the input and the output format of ATTRACT. The file format is as follows:
- The first section contains zero or more comment lines, marked by ##. Currently, in the ATTRACT output, this contains the command line parameters  that where received by ATTRACT.
- The next section starts with "#pivot auto" or a manual pivot definition. This specifies the rotation pivot of each molecule. In case of "#pivot auto", it is automatically determined as the average of all atoms. Manual specification goes as follows:
#pivot 1 12.345 6.789 10.123
#pivot 2 22.345 -6.789 -10.123
and so on for each molecule
- Then comes the section with the centering convention. It contains two lines:
#centered receptor: false 
#centered ligand: false
or
#centered receptor: true
#centered ligands: true
or a combination of those.

If a molecule is centered, all translation coordinates are of the pivot center relative to the world origin. If it is not centered, all translation coordinates are relative to the original PDB coordinates.
Example: (0,0,0) for both receptor and ligand. When they are not centered, this represents the original position in the PDBs. When they are centered, it means that the pivot point of both receptor and ligand are on top of each other in the global origin.

- The final section contains the actual DOF values of each structure. For each structure, there comes first a line #X, where X is the number of the structure. (i.e. #1, #2, #3). Then there comes a comment section of zero or more lines  of per-structure comments, starting with ##. Finally, there comes a DOF line for each molecule (i.e. 2 DOF lines for a standard two-body docking, N lines for a N-body docking, 1<=N<=100)
A line contains at least 6 numbers: three for the rotation (phi-ssi-rot, in radians), three for the translation (see above, in A) and then the values for the normal modes. The number of normal modes can be different for each molecule (0-20 modes per molecule).
Note: per-structure comments starting with ### in the input file are treated as special by ATTRACT in the sense that they are printed out again unchanged in the output file. All other comments are discarded.

DOF files generated by systsearch or randsearch conform to the format above.

********************************************************************************
The PDB file format
********************************************************************************

You supply either two PDB files or one PDB file to ATTRACT. If it is one PDB, the first line must contain the number of molecules in the file, and then the molecules separated by TER (like the old multi-body version).
The PDB itself is in standard ATTRACT reduced format. They are generated with reduce as usual. The atom type field can be 0-99. 

- Residue type translation
Residue type 0 is the dummy atom type
By default, residue type 99 is translated to 0. Residue type 32 is also translated to 0 but only if fewer than 32 atom types have been specified in the parameter file.
You can specify a custom translation table for residue types. All forcefield calculations on atoms will be after this translation table has been applied, except for the EM module, which receives the untranslated atom types.
A  translation table consists of a text file containing two integer columns per line (1-99).
TODO: specifying a custom translation table is not yet implemented 

- Residue type address space
I propose the following: residue type 1-31 are ATTRACT protein reduced atom types, 32-74 are reserved for non-protein reduced atom types, residue 75-99 are reserved for full-atom/heavy atom types.

NEW: generic reduce (allatom) that can also deal with RNA, full atom, ...

********************************************************************************
Atom type parameter file format
********************************************************************************

All parameter files must have one additional line with two integers P and X at the beginning of the file.
P is the potential shape, which must be 8 or 12 (8 is standard ATTRACT, 12 is standard Lennard-Jones). X is the number of atomtypes.
The rest of the file must contain parameters for X atomtypes in standard ATTRACT format, as usual.

Testing issue: Piotr and I got a preliminary version working for RNA, but the new one still needs to be tested

NEW: full atom force field

********************************************************************************
Command line options
********************************************************************************

All command line options start with --
All options must be specified after structures.dat, the parameter file and the PDBs.

Coding issue: see parse_options.cpp for complete list of options, and ministate.cpp for the defaults.
At some point, we may want to auto-generate this from some configuration data model.

********************************************************************************
Sampling options
********************************************************************************

The following list contains the command line options for sampling

--fix-receptor (default: off!): keep the receptor protein's translation and orientation fixed
--only-rot : use only rotational degrees of freedom
--only-trans
--only-flex: use only imodes as degrees of freedom
--vmax <value> (default: 100)
  The maximum number of minimization steps, from attract.inp
--mc Use Monte Carlo (MC) sampling instead of energy minimization
--mcmax <value> : used in pure MC mode to indicate the total number of MC steps.
--rcut <value> (default: 1500)
  The distance squared pairlist cutoff from attract.inp  
  Orthogonality issue: Grids do not use an rcut, it should give similar results to an infinite rcut
  TODO/Orthogonality issue: The new version is not yet binary compatible with Piotr's version, because it uses rcut1 in pairgen.f while Piotr uses rcut. 
  
********************************************************************************
Tools
********************************************************************************

Some of them are in bin/ but most of them are in tools/, depending on whether they use the ATTRACT Fortran/C++ codebase or not.
For the exact command line syntax: just invoke the program without any arguments, this usually will tell you the exact command line syntax.

- collect
Collect takes now as input a DOF file + one or more PDBs.
There is a difference in the result whether you supply one or two PDBs.
If you supply one PDB, it must be in reduced format, with the first line containing the number of structures (see the PDB section above). The structures are printed one after another.
If you supply two or more PDBs, they can be in any format you like (reduced, full-atom, ...). The structures will be separated with MODEL/ENDMDL lines.
You can also provide a --modes <mode file> option, to read in and apply normal 
modes in the same way as ATTRACT does. See the Normal modes section for more details.
Collect has some sanity checks that you provide the correct number of structures and normal modes.

- center
A simple program that reads in a PDB, computes its center of mass and subtracts it from all coordinates. The coordinates are printed on stdout, the center of mass at stderr, pipe it as needed.

- fix_receptor
This will read in a DOF file and will fix the receptor (i.e. setting all its non-mode DOFs to zero). This is done by rotating all ligands (all molecules beyond the first) into the coordinate frame of the receptor.

- deredundant
This will read in a DOF file and remove all duplicate structures.
Since it does not read in actual PDB files, this is a bit empirical: the structure is considered as a sphere with a radius of gyration of 30 A.
Structures with a (ligand) RMSD of more than 0.05 A are considered different.
It will add a comment to the written structures to indicate where it came from.
Coding issue: the code is a bit outdated, but it should still work
Coding issue: there is also a Python version in tools/
NOTE: It assumes that the receptor is fixed
NOTE: For best results, sort the structures first by energy
NOTE: Normal modes are not taken into account
NOTE: This is not a clustering algorithm, use a different program for that

- sort.py
This will sort the structures by energy
It will add a comment to the written structures to indicate where it came from.

- split.py
Takes as input a DOF file, a pattern and a number X. It will split the DOF file into X files pattern-1, pattern-2, ..., pattern-X. This is highly useful for parallellization. 
It will add a comment to the written structures to indicate where it came from.

- join.py
Reverse of split.py. Takes as input a pattern. It will look for all files starting with pattern and join them back into a single DOF file, written at stdout.
It utilizes the comment line written by split.py to deduce where every structure should go. No other assumptions are made, you are free to scramble the structures and files in any way you want. If there are duplicate or missing structures, it will complain. 

- top
top <DOF file> <number of structures X>. Prints out the first X structures of a DOF file.

- fill-energies.py
This will take a DOF file and another file that contains "Energy: " lines. It will add the energy lines as comments into the DOF file and print it out, replacing any existing energy commments.  
The other file can be another DOF file but it can also be the output of attract --score. In this manner, you can re-score your docking results without re-minimization.

- fill.py
This will take a DOF file and another DOF file and add all comments (e.g. energy, seed) of the second file to the first. This is useful in combination with tools that don't preserve comments such as fix_receptor.

- monte.py / metro.py

Monte.py takes as input a DOF file and applies a random perturbation to the DOFs: rotation, translation and normal modes.
Metro.py takes as input two DOF files. They must have the same number of structures. The second DOF file is considered "new". For every structure, it selects the old or the new structure according to a Metropolis criterion. The selected structures are printed out.

Together, you can use them to do a Monte-Carlo + minimization protocol. Taking initial docking solutions, you run monte, then ATTRACT, then metro on the results + the initial docking solutions. You can repeat this as many times as you wish.
NEW: the monte.py DOF steps, command line
TODO: metro.py currently accepts only new structures that have a lower energy, i.e. the temperature is 0 K. I will make it parameterizable from command line.
Coding issue/TODO: monte.py uses the system time as seed. Perhaps it would be better to use the structure's seed, so that the result becomes deterministic (on the same CPU architecture at least).

- randsearch.py 
Specify the number of bodies, the number of desired structures, and optionally a seed number. It will generate starting structures of N bodies with random orientation and placement. The pivot points are guaranteed to be equidistant on a unit sphere of 75 A radius (i.e. 150 A from each other for 2 bodies): for more than two bodies, a steepest-descent energy minimization is performed to achieve this.

NOTE: This is the default starting structure sampling protocol in HADDOCK
NOTE: This is currently the only generic way to get starting structures for multi-body docking.
Coding issue: For more than two bodies, the energy minimization can be slow. You are recommended to install the psyco module or use pypy instead of vanilla Python. I will port it to C one of these days.
TODO: The randomization process is not completely random for rotation, I will implement a procedure that better samples the unit sphere.

TODO: organize the tools properly, adapt the Makefile.

********************************************************************************
Old tools
********************************************************************************

Those I didn't touch, ask Martin how they work :-)

translate	#Find starting points arround receptor, to place the center of ligand
rotam
modes
modesca
compare
viewe
rmsca

********************************************************************************
Normal modes
********************************************************************************
--modes <modes file>

There is a single modes file that describes all normal modes for all structures
Any molecule can have any number of modes (maximum number has to be set in bin/max.h)
See bin/read_hm.cpp for a description of the file format

As before, normal modes experience a force towards zero, a fourth order function. Force constants are given in the file as before. 
The DOF file format however enables a starting structure with non-zero starting values for the mode DOF.

Note: normal modes are for small scale deformations, up to an angstrom or two. For larger deformations, I suggest to generate multiple coordinate models with wide-apart mode coordinates, assign each DOF structure to the closest coordinate model, and subtract the DOF coordinate of that model. After doing the docking separately with each model and its associated structures, you can add the model DOF coordinate again.

Coding issue/TODO: Perhaps it would be better to switch to per-molecule normal mode files, specified in the same way as grids (e.g. --modes 1 hm1.dat --modes 2 hm2.dat ...)

*******************************************************************************
iATTRACT
*******************************************************************************
iATTRACT works with multi-body docking and ensemble docking or proteins and nucleic acids.
It employs the atomistic OPLS force field (allatom/allatom.par). Structures must be converted 
to OPLS format with tool allatom/aareduce.py using the --dumppatch option.
The protocols/iattract.py script requires the following parameters in addition to a normal ATTRACT docking

--infinite : includes all residues up to a distance of 50 A in the pairlist calculation of the nonbonded forces

--name <namingscheme>
Naming scheme for the iattract output files. Caution: if iattract files with this naming scheme are already present
these will be used instead of generating them on-the-fly. Can be also be used as an option
in bin/collect, bin/irmsd.py, bin/lrmsd.py tools.

--icut <value> 
Cutoff for on-the-fly detection of flexible interface residues. Default: 3.0
Use 5.0 for peptide-protein refinement.

Internal option for attract binary, index modes:
--imodes <modes file>
The index modes contain a list of atom numbers for the atoms that will be treated as fully flexible.
Different proteins are separated by -1.
In general, every structure has its own index file, the naming scheme is usually flexm-iattract1.dat etc.


Orthogonality issue: Index modes can be used together with normal modes to include flexibility on different scales.
TODO: test using global modes and index modes together


********************************************************************************
Restraints
********************************************************************************
--rest <restraints file>

Restraints can be intermolecular but also intramolecular (preserving secondary structure). It can do most of the things that HADDOCK can do.

You can repeat the "--rest" option in case of multiple restraint files, the restraints will be taken together.

See restraints.txt for more information on the file format.

- The air.py tool

This is a convenience tool to generate restraints files from HADDOCK-style active and passive residues. It takes eight files, run air.py without any argument to get its precise command line syntax. The file formats are as follows:
- residue list: a list of residues, one integer per line
- mapping file: a list of mappings, two columns per line. The first column is the residue number (integer). The second column is the residue identifier (usually integer but not necessarily) in the PDB. Active and passive residues in the residue list are first mapped and then converted to a selection of atoms.
NOTE: a mapping file is generated and printed out by reduce.
Example: 
active residue list: 20
mapping file: 20 19B
PDB: atom 321-326 are Ser 19B
=> active residue 20 is translated into the selection 321-326

You can give a ninth parameter to indicate the chance that a restraint is not used. Default is 0.5 (HADDOCK default). Unlike HADDOCK, you can adjust this parameter for every restraint separately. 

- The tbl2attract.py tool
Convenience tool for converting tbl restraints files to ATTRACT restraint format. Run without options in command line to get help.

********************************************************************************
Gravity
********************************************************************************
--gravity <gravity mode>

Defines attractive forces on pivot points. It depends on the gravity mode:
1: defines an attractive force on every pivot point towards the global origin
2: defines an attractive force between the receptor pivot and each ligand pivot
3: defines an attractive force between all pivots

--rstk <force constant> (default 0.0015)
The gravity potential is a harmonic restraint function. With this parameter, you can set the force constant. 

********************************************************************************
Multi-copy
********************************************************************************
  Multi-copy sidechains/loops should work like before. 
  Testing issue: Multi-copy sidechains/loops have not been tested by me.  
  Coding issue: Multi-copy sidechains/loops have their private energy evaluation function (in select.f)
  Orthogonality issue: Multi copy sidechains + multi-body docking. There will be a different copy for each interaction with another molecule.
  Orthogonality issue: Multi copy sidechains + restrains will give some inaccuracies (as far as the restraints are concerned, the sidechains are independent).
  Orthogonality issue: Multi copy sidechains + normal modes may give some weird results.
  
********************************************************************************
Grids
********************************************************************************
The new ATTRACT can use a combination of a neighbour and a potential grid to get a significant speedup. 
The potential grid energy is stored at the voxels, and the actual potential grid energy of an atom is computed by linear interpolation between eight voxels.
The neighbour energy of an atom  is computed at run time, using the neighbour list stored at the nearest voxel.

The following things should be kept in mind:

- The neighbour grid and potential grid are in one file
- Every molecule has its own grid file
- Grids, especially torque grids, can take a lot of memory!
- For two body docking, only the receptor needs to have a grid if it is fixed. If you don't fix the receptor, you need grids for both molecules, or a torque grid for one of the molecules.
- ATTRACT will always fall back to non-grid docking if grids are not available. For two-body docking, if you forget to fix the receptor, this means slow docking! 
- For multi-body docking, everything is viewed as a sum of pairwise interactions. Every pairwise interaction will be grid-accellerated if two grids or one torque grid are available for that pair, or if one of them is a fixed receptor for which a grid is available. Else, it will fall back to "classical" mode for that pair. 


NOTE: The following section will contain NOTE sections like this, containing instructions to change advanced grid settings. These advanced settings are hard-coded in the grid-generating programs. You can change the settings and recompile. However, the *ATTRACT* program does not expect particular grid settings: you don't need to recompile ATTRACT or anything to deal with a grid with a different voxel size or other settings (ATTRACT *will* recompile by itself, but only because the code to compute a grid and to read in a grid are in the same file, and ATTRACT uses that file).

1. Calculation of grids

A. First, you need to calculate an interior map. Voxels marked as interior will have their energy set to infinite and their gradients to zero.
Use the tool calc_interior (in bin/):
  calc_interior file.pdb file-interior.vol 
For best results, you should use a NON-reduced PDB as input.
The interior map will be in Situs EM format (.vol). You can visualize it as follows:
  bin/vol2ccp4 file-interior.vol file-interior.ccp4 
  pymol file-interior.ccp4

NOTE: At this point, your voxel size (default: 0.9 A) will have been determined: subsequent tools will check that the voxel size is the same as in the .vol file. To change the voxel size, adapt "gridspacing" in makegrid.h and recompile everything. 
calc_interior creates a 10.8 A box around your protein (the edge is at least 10.8 A from any atom). It then sets all voxels within 7 A of any atom as "interior" and then shrinks the interior inward by 23 voxels. All of these settings can also be adapted in makegrid.h

B. The next step is the actual calculation of the grid. Use make-grid for this.
Invoke make-grid without arguments to get the command line syntax.

You must supply a reduced PDB file and the interior map computed above. The last argument must be the name for the grid file. Two additional arguments describe the distance thresholds:
- The plateau distance is the border between the potential grid energy and the neighbour energy. The neighbour energy at and beyond the plateau distance is zero. The neighbour energy under the plateau distance is the standard energy minus the energy at the plateau distance. The potential grid energy at and beyond the plateau distance is equal to the standard energy. The potential grid energy under the plateau distance is equal to the energy at the plateau distance. 
The same goes for the gradients.
- The neighbour distance is the distance threshold for which neighbour atoms are stored on the grid. At runtime, if an atom is nearest to this voxel, the distance of each stored atom is computed. If it is above the plateau distance, it is ignored, else the energy is computed.

NOTE: There is actually a hidden option "rigid" in pairenergy.f, which is off by default. If it is on, it will ignore all neighbour atoms that were between the plateau distance and the neighbour distance when the grid was built. It will not compute their distance and just assume that they are beyond the plateau distance. When there are no normal modes and the voxel size would be very small, this would be a reasonable assumption, leading to some speed gain. However, with 0.9 A I found that these atoms often are within plateau distance and that their contribution is non-neglegible.

NOTE: There is an additional run-time approximation to gain speed by pre-tabulation. See the pre-tabulation section of the grid for more details.

NOTE: To calculate the potential grid, an extremely large cutoff of 50 A is used. This setting is called "distcutoff" in makegrid.h.

NOTE: make-grid uses the voxel box generated by calc_interior as the voxels for which to compute the potentials and neighbours. It will extend this box with 32 voxels of *double* size (1.8 A) for which *only* a potential grid is computed. This value of 32 is hard-coded in makegrid.h.

Coding issue: Grids have their private energy evaluation function (in nonbon.h). This function is used both at grid generation (for the potential grid energy) and at runtime (for the neighbour energy). In other words, you cannot just modify nonbon8.f and expect grids to work properly with your new energy: you have to modify nonbon.h as well.
 
Orthogonality issue: Piotr says that the very long cutoff used by grids doesn't work well with RNA. Perhaps a vdW switching function can be implemented.

2. Torque grids

There are torque versions for several binaries: attract-torque and make-grid-torque  (and shm-grid-torque; see memory management below). These versions generate/use torque grids instead of normal grids. Torque grids have exactly the same neighbour grid, but their potential grid differs: in addition to the energy and gradient they have a 3x3 torque matrix stored, used to calculate the torques on the receptor. This makes torque grids three times as large, but also ensures that a single evaluation computes both receptor and ligand forces.
No special options need to be specified for torque grids.

   Orthogonality issue: Torque grids will give (small) inaccuracies if the receptor has normal modes. This is because normal mode forces will be computed using the neighbour forces only, and potential grid forces will be neglected. I expect this to be a generally safe approximation, but testing is needed to be sure!

3. Grid usage options

Grids are specified for a molecule as follows:

--grid <molecule> <grid file>
molecule 1 is the receptor, 2 is the first ligand, 3 the second ligand, etc.

--gridmode <1 or 2>
gridmode is 1 by default for attract, meaning that two grids are necessary to grid-accelerate an interaction pair, and that two evaluations are made: one to calculate the energy and the ligand forces, and a second evaluation, with ligand and receptor reversed, to calculate the receptor forces. If the receptor is fixed, only one grid is necessary and only one evaluation is made. 

gridmode is 2 by default for attract-torque, meaning that only one grid is necessary to grid-accelerate an interaction pair, and that only one evaluation is made to compute both receptor and ligand forces.

For attract, if you change the gridmode to 2, the speed will double but there will be considerable inaccuracies in the receptor forces (only the neighbour part will be there).


3. Memory management

There is the possibility to load a large portion of a grid into shared memory (shm; /dev/shm on Ubuntu). This has great memory advantages with parallellization, when four or more attract processes can load the same grid(s) and still use not much more memory than a single attract process.

The shm-grid tool takes a grid and creates two shared memory segments (one for the potentials and one for the neighbours), generating a grid header file that holds a reference to these segments. A grid header file can be loaded with --grid as if it where a full grid file, but it is much much smaller. The header file will be valid only as long as the shared memory segments exist: these segments are be deleted at reboot or when the shm-clean tool is run, making the header file invalid. This will result in an error message.

4. Pre-tabulation (prox)

There is a final speed optimization to compute energies at medium distances. A small (2000 elements per A**2) table converts distance**2 into the nearest distance**-2. Then, for every distance**-2 and every atom type-atom type pair, a large prox table contains the neighbour energy (and gradient), i.e. the standard energy minus the standard energy at plateau distance. 
This pre-tabulation is only used for distances shorter than the plateaudistance and longer than some distance limit proxlim. Distances shorter than that are quite rare (< 6 A: less than 10 % in fully docked structures) and require too much memory for accurate tabulation.

The prox limits can be changed with the following command line options:

--proxlim <distance> (default 36). The distance**2 above which prox pre-tabulation is used. 
--proxmax <distance> (default 200) The maximum distance**2 for prox pre-tabulation. Must be at least as large as the largest plateau-distance**2 of any grid (this is validated)
--proxmaxtype <number of types> (default 31). The number of atom types for which a prox table must be generated (atom type 1 up to proxmaxtype)

Prox tables can be quite big. They are also loaded in shared memory; they are 
automatically deleted when all processes that use them are finished. The prox settings are coded into the segment name. Starting an additional attract process will re-use the prox settings of the other process. however,if you use different prox settings, a new prox table segment will be created.
In case of a crash, you can run shm-clean and it will remove the prox files as well; however, if your prox settings are non-standard, this will not happen and you will have to delete the file in /dev/shm yourself.

Testing issue: it should be extensively tested to what extent prox tables give a speed increase, and if it is really worth it, or that the memory is better used  e.g. for torque grids.

5. Grid alphabets
  
When building a grid, you can supply an alphabet of atom types. The grid will only be built for those atom types. When you use the grid for docking, it will be checked that the partners only contain those atom types.
  NEW: you cannot supply a custom alphabet yet
  TODO: checking against the alphabets is not yet implemented

********************************************************************************
Pure MC mode 
********************************************************************************

In this mode, no energy minimization is performed. Instead, moves are made and accepted/rejected according to a Metropolis Monte Carlo sampling algorithm.

--mc
Enables pure MC.

--mctemp <temperature in RT> (default: 3.5)
Adjusts the Metropolis temperature.

--mcscalerot <scaling in radians> (default: 0.05)
--mcscalecenter <scaling in A> (default: 0.1)
--mcscalemode <scaling in mode A> (default: 3)
Adjusts the step size of the rotation, translation and normal modes, respectively.

--mcensprob <probability> (default: 0.05)
Adjusts the probability to switch between ensemble copies in an MC step

Orthogonality issue: this is fully compatible with grids, normal modes, restraints and all other features.
Orthogonality issue: mode coordinates are not in A but such that the sum of squares of all displacements is 1 A. Therefore, you probably want large steps for modes and increase them for larger proteins.
Orthogonality issue: also in pure MC you can use vmax to control the maximum number of steps.
Testing issue: computation looks fine but proper parameters need to be investigated.


*******************************************************************************
Scoring and trajectory modes
********************************************************************************
--score
In this mode, no minimization is performed, but the energies and gradients of every structure is printed.
Orthogonality issue: this mode is not compatible with pure MC (it would be meaningless anyway)

--traj
In this mode, all input structures beyond the first are ignored. For the first structure, the structure is printed in DOF format after each minimization step. Use this together with collect to generate a trajectory movie of your minimization.

Orthogonality issue: this mode works also with pure MC, in which case the structure is printed after each accepted move.

*******************************************************************************
EM grids
*******************************************************************************
--em <em.inp file>

TODO: I will document this further once it is completely ready :-)

*******************************************************************************
SAXS
*******************************************************************************
Most SAXS related stuff can be found in tools/saxs
Check out the examples/attractsaxs folder to find out how ATTRACT-SAXS works.


********************************************************************************
Benchmarking tools
********************************************************************************

NEW: lrmsd program 
NEW: irmsd program
NEW: fnat program

********************************************************************************
Modified amino acids
********************************************************************************
ATTRACT now supports docking with a limited number of modified amino acids during
coarse-grained docking and atomistic refinement.
Note: for docking with modified amino acids all polar hydrogen atoms must be present
in the input PDB (PDB2PQR cannot deal with them). Use AddH in Chimera or another tool like CNS
to add hydrogens and select the option "PDB contains all polar hydrogens" in the GUI.
Also note that modified amino acids have to be labeled as ATOM so assure that HETATM
entries are changed to ATOM in the input PDB file. Otherwise, these residues will be ignored
or treated as non-covalently bound cofactors...

The following modified amino acids are supported:
HYP hydroxyproline
SEP phosphoserine
TPO phosphothreonine
TYP phosphotyrosine
TYS sulfotyrosine
NEP phosphonohistidine
CSP phosphocysteine
ALY acetyllysine
MLZ monomethyllysine
MLY dimethyllysine
M3L trimethyllysine

TODO: add naming convention to the web-interface
TODO: add hydrogen building by CNS
TODO: make a phosphate atom type in the future?