From 7987ac0000b2a85fb72f47d99fec3321d1374d21 Mon Sep 17 00:00:00 2001 From: Gareth Tribello <gareth.tribello@gmail.com> Date: Sat, 15 Dec 2018 00:18:15 +0000 Subject: [PATCH] Fixed spelling in CHANGES/v2.n.md where n=0,1,2,3,4 and in tutorials --- CHANGES/v2.0.md | 10 +- CHANGES/v2.1.md | 44 ++--- CHANGES/v2.2.md | 18 +- CHANGES/v2.3.md | 22 +-- CHANGES/v2.4.md | 14 +- user-doc/tutorials/a-trieste-1.txt | 20 +-- user-doc/tutorials/a-trieste-2.txt | 132 +++++++-------- user-doc/tutorials/a-trieste-3.txt | 48 +++--- user-doc/tutorials/a-trieste-4.txt | 46 +++--- user-doc/tutorials/a-trieste-5.txt | 14 +- user-doc/tutorials/a-trieste-6.txt | 10 +- user-doc/tutorials/belfast-1.txt | 38 ++--- user-doc/tutorials/belfast-2.txt | 80 ++++----- user-doc/tutorials/belfast-3.txt | 40 ++--- user-doc/tutorials/belfast-4.txt | 18 +- user-doc/tutorials/belfast-5.txt | 40 ++--- user-doc/tutorials/belfast-6.txt | 66 ++++---- user-doc/tutorials/belfast-7.txt | 10 +- user-doc/tutorials/belfast-8.txt | 46 +++--- user-doc/tutorials/belfast-9a.txt | 6 +- user-doc/tutorials/belfast-9b.txt | 32 ++-- user-doc/tutorials/cambridge.txt | 58 +++---- user-doc/tutorials/cineca.txt | 50 +++--- user-doc/tutorials/hrex.txt | 4 +- user-doc/tutorials/julich-devel.txt | 16 +- user-doc/tutorials/lugano-1.txt | 92 +++++------ user-doc/tutorials/lugano-2.txt | 156 +++++++++--------- user-doc/tutorials/moving.txt | 6 +- user-doc/tutorials/munster.txt | 60 +++---- user-doc/tutorials/others/isdb-1.txt | 22 +-- .../others/ves-lugano2017-01-metad.txt | 20 +-- .../others/ves-lugano2017-02-ves1.txt | 34 ++-- .../others/ves-lugano2017-03-ves2.txt | 6 +- .../others/ves-lugano2017-04-kinetics.txt | 6 +- 34 files changed, 640 insertions(+), 644 deletions(-) diff --git a/CHANGES/v2.0.md b/CHANGES/v2.0.md index 71edda97a..1356a40ce 100644 --- a/CHANGES/v2.0.md +++ b/CHANGES/v2.0.md @@ -65,7 +65,7 @@ For users: - Several small fixes in documentation and log file. 
For developers: -- Added possibility to setup replica exchange from MD codes in fortran (commands "GREX setMPIFIntercomm" and "GREX setMPIFIntracomm"). +- Added possibility to set up replica exchange from MD codes in Fortran (commands "GREX setMPIFIntercomm" and "GREX setMPIFIntracomm"). - cmd("setStopFlag") should now be called after PLUMED initialization. - Several small fixes in documentation. @@ -89,7 +89,7 @@ For users: - Several small fixes in documentation and log file. For developers: -- Fixed makefile dependencies in some auxiliary files in src/lib (*cmake and *inc). +- Fixed Makefile dependencies in some auxiliary files in src/lib (*cmake and *inc). - Changed way modules are linked in src/. E.g. src/colvar/tools/ is not anymore a symlink to src/colvar but a real directory. (Notice that this introduces a regression: when using plumed as an external library @@ -121,7 +121,7 @@ For developers: declared as static. - Small fix in user-doc compilation, so that if plumed is not found the sourceme.sh file is sourced -- Fixed non-ansi syntax in a few points and a non-important memory leakage. +- Fixed non-ANSI syntax in a few points and an unimportant memory leak. - Split cltools/Driver.cpp to make parallel compilation faster. Version 2.0.4 (Sep 15, 2014) @@ -143,8 +143,8 @@ Version 2.0.5 (Dec 15, 2014) \plumednotmaintained For users: -- Fixed a bug in replica exchange with different Hamiltonians (either lamdba-dynamics - or plumed XX-hrex branch) possibly occuring when using charge or mass dependent +- Fixed a bug in replica exchange with different Hamiltonians (either lambda-dynamics + or plumed XX-hrex branch) possibly occurring when using charge or mass dependent variables. - Fixed a bug in analysis (e.g. \ref HISTOGRAM) leading to wrong accumulation of statistics when running a replica exchange simulation.
diff --git a/CHANGES/v2.1.md b/CHANGES/v2.1.md index 3ed147c78..88f5db722 100644 --- a/CHANGES/v2.1.md +++ b/CHANGES/v2.1.md @@ -24,7 +24,7 @@ character now has a special usage in component names. - The command SPHERE has been replaced by \ref UWALLS. - New configuration system based on autoconf (use ./configure from root directory). Optional packages are detected at compile time and correctly - enabled or disabled. An internal version of lapack and blas will be used + enabled or disabled. An internal version of LAPACK and BLAS will be used if these libraries are not installed. - New actions: - \ref SPRINT topological collective variables. @@ -62,17 +62,17 @@ character now has a special usage in component names. additional option for KERNEL=DISCRETE to accumulate standard histograms. - \ref sum_hills : added options --spacing (alternative to --bin to set grid spacing) and --setmintozero to translate the minimum of the output files to zero. - - \ref CONTACTMAP : parallelised and added weights. -- New features in MD patches (require repatch): + - \ref CONTACTMAP : parallelized and added weights. +- New features in MD patches (require re-patch): - New patch for Gromacs 5.0 - Gromacs 4.6.X patch updated to 4.6.7 - Gromacs 4.6.7 supports \ref COMMITTOR analysis; can be now be used to perform energy minimization; now passes temperature to PLUMED (this allows temperature to be omitted in some actions, namely \ref METAD and analysis actions). . - Notice that if you use runtime binding it is not compulsory to repatch, + Notice that if you use runtime binding it is not compulsory to re-patch, and that all combinations should work correctly - (new/old PLUMED with repatched/non-repatched MD code). + (new/old PLUMED with re-patched/non-re-patched MD code). - Other new features: - \ref driver can now read trajectories in many formats using VMD molfile plugin (requires VMD plugins to be compiled and installed). 
In case VMD plugins are not installed, @@ -84,18 +84,18 @@ character now has a special usage in component names. - Multicolvars with neighbor lists now work correctly in replica exchange simulations. - Improved multicolvar neighbor lists. - Optimizations: - - Root-mean-square devations with align weights different from displace weights + - Root-mean-square deviations with align weights different from displace weights are now considerably faster. This will affect \ref RMSD calculations plus other variables based on RMSD. - - \ref WHOLEMOLECULES is slighlty faster. - - \ref COORDINATION is slighlty faster when NN and MM are even and D_0=0. + - \ref WHOLEMOLECULES is slightly faster. + - \ref COORDINATION is slightly faster when NN and MM are even and D_0=0. - Atom scattering with domain decomposition is slightly faster. - Link cells are now exploited in some multicolvars. - Derivatives are not calculated unless they are specifically required, because for instance you are adding a bias. - Documentation: - All tutorial material from the recent plumed meeting in Belfast is now in the manual - - Improvements to documentation, including lists of referenceable quantities outputted by each action + - Improvements to documentation, including lists of quantities that are output by each action that can be referenced - Manual has been re-organized following suggestions received at the plumed meeting. - An experimental PDF version of the manual is now provided (a link can be found in the documentation homepage). @@ -105,13 +105,13 @@ Changes from version 2.0 which are relevant for developers: (this is needed for regtests to work properly). - Improved class Communicator. Many operations can now be done directly on Vectors, Tensors, std::vector and PLMD::Matrix. - Modified class RMSD. -- Patches for GPL codes (QuantumEspresso and Gromacs) now also include +- Patches for GPL codes (Quantum Espresso and Gromacs) now also include original code so as to simplify their modification. 
- Fixed dependencies among actions such that it is now possible (and reliable) to use MPI calls inside Action::prepare() - colvar/CoordinationBase.cpp has been changed to make it faster. If you devised a class which inherits from here, consider that CoordinationBase::pairing now needs _squared_ distance instead of distance -- It is possible to run "make install" from subdirectories (e.g. from src/colvar) +- It is possible to run "make install" from sub-directories (e.g. from src/colvar) - There is a small script which disables/enables all optional modules (make mod-light/mod-heavy/mod-reset) - Added "-q" option to plumed patch - You can now create new metrics to measure distances from a reference configurations. If you do so such @@ -121,7 +121,7 @@ Changes from version 2.0 which are relevant for developers: - Updated script that generated header files so that they properly show years. Notice that the script should new be run from within a git repository -This list is likely incompleted, if you are developing in PLUMED you are encouraged to follow changes on github. +This list is likely incomplete, if you are developing in PLUMED you are encouraged to follow changes on github. Version 2.1.1 (Dec 15, 2014) ---------------------------------------------- @@ -144,7 +144,7 @@ For users: - Fixed a linking issue with ALMOST, where bz2 was always used to link ALMOST to PLUMED even if it is not compulsory to build ALMOST. - Fixed a wrong include in the GMX5 patch. -- \ref FUNCPATHMSD can now be used together with \ref CONTACTMAP to define pathways in contactmaps space +- \ref FUNCPATHMSD can now be used together with \ref CONTACTMAP to define pathways in contact-map space - Configuration is more verbose, a warning is given if a default option cannot be enabled and an error is given if an option explicitly enabled cannot be enabled. 
- Compilation is less verbose (use "make VERBOSE=1" to have old behavior) @@ -159,15 +159,15 @@ Version 2.1.2 (Mar 16, 2015) ---------------------------------------------- For users: -- Added two new short tutorials to the manual (\ref cambridge and \ref munster). +- Added two new short tutorials to the manual ( \ref cambridge and \ref munster ). - Fixed a severe bug on \ref DRMSD - cutoff values were ignored by PLUMED. Notice that this bug was introduced in 2.1.0, so that it should not affect the 2.0.x series. - Fixed a bug affecting LAMMPS patch used with a single processor. Notice that - the fix is inside PLUMED, thus it does not necessarily requires repatching. + the fix is inside PLUMED, thus it does not necessarily require re-patching. - Sander patch now works with multiple replica (no replica exchange yet). It also contains some fix from J. Swails. - GMX5 patch was not working for bias-exchange like cases -- Patching system now checks for the availabity of shared/static/runtime version of plumed before +- Patching system now checks for the availability of shared/static/runtime version of plumed before patching - Configure now check better if compiler flag are accepted by the compiler. This makes configure on bluegene more robust. @@ -182,7 +182,7 @@ For users: - Fixed bug in \ref GHOST : virial is now computed correctly. - Fixed a serious bug in virial communicated from plumed to gromacs, for both gromacs versions 4.6 and 5.0. See \issue{132}. - This fix requires gromacs to be repatched and could be very important if you run biased simulations in the NPT ensemble. + This fix requires gromacs to be re-patched and could be very important if you run biased simulations in the NPT ensemble. - Fixed a bug in the virial computed with \ref FIT_TO_TEMPLATE when the reference pdb had center non located at the origin.
- Fixed a bug in the the forces computed with \ref FIT_TO_TEMPLATE when used in combination with \ref COM, \ref CENTER, or \ref GHOST - Fixed a bug that could lead plumed to be stuck with domain decomposition in some extreme case (one domain with all atoms, other domains empty). @@ -219,10 +219,10 @@ Version 2.1.4 (Oct 13, 2015) For users: - Fixed NAMD patch. Masses and charges were not passed correctly, thus resulting in wrong \ref COM or \ref CENTER with MASS. - This fix required repatching NAMD. + This fix required re-patching NAMD. Notice that this bug was present also in v2.0 but in a different form. More information here (\issue{162}), including a workaround that allows masses to be fixed - without repatching. + without re-patching. - When installing with PLUMED_LIBSUFFIX an underscore is used as separator instead of a dash. E.g. `make install PLUMED_LIBSUFFIX=2.1` will result in an executable named `plumed_v2.1`. This fix a potential problem (see \ref Installation). @@ -233,10 +233,10 @@ For users: processors per replica. - Small change in numerical accuracy of lattice reduction. Should be more robust when running with highly optimizing compilers. -- Fixed a bug in normalisation of kernel functions. This affects \ref HISTOGRAM - If these actions were used with previous versions of the code care should be taken when analysing the +- Fixed a bug in normalization of kernel functions. This affects \ref HISTOGRAM. + If these actions were used with previous versions of the code, care should be taken when analyzing the results. -- Fixed a bug in derivatives of kernel functions with non-diagonal covariances. This affects the +- Fixed a bug in derivatives of kernel functions with non-diagonal covariance matrices.
This affects the derivatives output by \ref sum_hills Version 2.1.5 (Jan 18, 2016) diff --git a/CHANGES/v2.2.md b/CHANGES/v2.2.md index eda83de6c..baba1943a 100644 --- a/CHANGES/v2.2.md +++ b/CHANGES/v2.2.md @@ -28,7 +28,7 @@ Changes from version 2.1 which are relevant for users: DESTDIR and other standard autoconf directories (e.g. bindir) are completely supported. Additionally, everything should work properly also when directory names include spaces (\issue{157}). Finally, compiler is not invoked on install unless path are explicitly changed (\issue{107}). -- Related to installation refactoring, upon install a previusly installed plumed is not removed. +- Related to installation refactoring, upon install a previously installed PLUMED is not removed. This is to avoid data loss if prefix variable is not properly set - Several changes have been made in the Makefile.conf that makes it not compatible with those packaged with plumed 2.0/2.1. Please use ./configure to generate a new configuration file. @@ -39,9 +39,9 @@ Changes from version 2.1 which are relevant for users: the switching function goes to zero. Users should always set this parameter when using a switching function in order to achieve optimal performance. - DHENERGY option is no longer possible within \ref DISTANCES. You can still calculate the DHENERGY colvar by using \ref DHENERGY -- Reweighting in the manner described in \cite Tiwary_jp504920s is now possible using a combination of the \ref METAD and \ref HISTOGRAM actions. The relevent keywords in \ref METAD are REWEIGHTING_NGRID and REWEIGHTING_NHILLS. The \f$c(t)\f$ and the appropriate weight to apply to the configurations are given by the values labeled rct and rbias. +- Reweighting in the manner described in \cite Tiwary_jp504920s is now possible using a combination of the \ref METAD and \ref HISTOGRAM actions. The relevant keywords in \ref METAD are REWEIGHTING_NGRID and REWEIGHTING_NHILLS. 
The \f$c(t)\f$ and the appropriate weight to apply to the configurations are given by the values labeled rct and rbias. - News in configure and install: - - ./configure now allows external blas to be used with internal lapack. This is done automatically if only blas are available, + - ./configure now allows external BLAS to be used with internal LAPACK. This is done automatically if only BLAS are available, and can be enforced with --disable-external-lapack. - ./configure supports --program-prefix, --program-suffix, and --program-transform-name. - make install supports DESTDIR and prefix. @@ -55,8 +55,8 @@ Changes from version 2.1 which are relevant for users: - \ref MFILTER_LESS filter multicolvar by the value of the colvar - \ref MFILTER_MORE - \ref MFILTER_BETWEEN - - \ref PCARMSD pca collective variables using OPTIMAL rmsd measure - - \ref PCAVARS pca collective variables using any one of the measures in reference + - \ref PCARMSD PCA collective variables using OPTIMAL RMSD measure + - \ref PCAVARS PCA collective variables using any one of the measures in reference - \ref GRADIENT can be used to calculate the gradient of a quantity. Used to drive nucleation - \ref CAVITY - \ref PUCKERING implemented for 5-membered rings (thanks to Alejandro Gil-Ley). @@ -74,8 +74,8 @@ Changes from version 2.1 which are relevant for users: - \ref MOLINFO now supports many more special names for rna and dna (thanks to Alejandro Gil-Ley). - VMEAN and VSUM allow one to calculate the sum of a set of vectors calculated by VectorMultiColvar. Note these can also be used in tandem with \ref AROUND or \ref MFILTER_MORE to calculate the average vector within a particular - part of the cell or the average vector amonst those that have a magnitude greater than some tolerance - - New way of calculating the minimum value in multicolvars (ALT_MIN).
This is less succetible to overflow for certain + part of the cell or the average vector among those that have a magnitude greater than some tolerance + - New way of calculating the minimum value in multicolvars (ALT_MIN). This is less susceptible to overflow for certain values of \f$\beta\f$. - New keywords for calculating the LOWEST and HIGHEST colvar calculated by a multicolvar - Added components to \ref DIPOLE (\issue{160}). @@ -84,7 +84,7 @@ Changes from version 2.1 which are relevant for users: For developers: -- In order to be able to use openmp parallelism within multcolvar, secondarystructure, manyrestraints and crystallisation +- In order to be able to use OpenMP parallelism within multicolvar, secondarystructure, manyrestraints and crystallisation we had to make some substantial changes to the code that underlies these routines that is contained within vesselbase. In particular we needed to get rid of the derivatives and buffer private variables in the class ActionWithVessel. As a consequence the derivatives calculated in the various performTask methods are stored in an object of type MultiValue. Within multicolvar @@ -102,7 +102,7 @@ For users: - PLUMED now reports an error when using \ref HISTOGRAM with UNNORMALIZED without USE_ALL_DATA. See \issue{175} - Fixed a bug in configure together with --enable-almost. The check for lbz2 library was not working properly. - Fixed a bug in install procedure that was introducing an error in linking with CP2K. -- Fixed a bug that sometimes was preventing the printing of a usefull error message. +- Fixed a bug that sometimes was preventing the printing of a useful error message. For developers: - Vector and Tensor now support direct output with `<<`.
diff --git a/CHANGES/v2.3.md b/CHANGES/v2.3.md index d6e37d217..7bc6c9831 100644 --- a/CHANGES/v2.3.md +++ b/CHANGES/v2.3.md @@ -71,7 +71,7 @@ Changes from version 2.2 which are relevant for users: - \ref RESET_CELL - \ref JCOUPLING - \ref ERMSD -- New features in MD patches (require repatch): +- New features in MD patches (require re-patch): - Patch for amber 14 now passes charges with appropriate units (fixes \issue{165}). Notice that the patch is still backward compatible with older PLUMED version, but the charges will only be passed when using PLUMED 2.3 or later. @@ -92,19 +92,19 @@ Changes from version 2.2 which are relevant for users: - \ref driver now allows --trajectory-stride to be set to zero when reading with --ixtc/--itrr. In this case, step number is read from the trajectory file. - \ref METAD and \ref PBMETAD can now be restarted from a GRID - Added keywords TARGET and DAMPFACTOR in \ref METAD - - When using \ref METAD with file-based multple walkers and parallel jobs (i.e. mpirun) extra suffix is not added (thanks to Marco De La Pierre). + - When using \ref METAD with file-based multiple walkers and parallel jobs (i.e. mpirun) extra suffix is not added (thanks to Marco De La Pierre). - \ref ENSEMBLE added keywords for weighted averages, and calculation of higher momenta - \ref MOLINFO now allows single atoms to be picked by name. - \ref FIT_TO_TEMPLATE now supports optimal alignment. - \ref CONSTANT added the possibility of storing more values as components with or without derivatives - \ref PUCKERING now supports 6 membered rings. - - Extended checkpoint infrastracture, now \ref METAD and \ref PBMETAD will write GRIDS also on checkpoint step (only the GROMACS patch + - Extended checkpoint infrastructure, now \ref METAD and \ref PBMETAD will write GRIDS also on checkpoint step (only the GROMACS patch is currently using the checkpointing interface) - Other features: - Added a plumed-config command line tool. 
Can be used to inspect configuration also when cross compiling. - Added a `--mpi` option to `plumed`, symmetric to `--no-mpi`. Currently, it has no effect (MPI is initialized by default when available). - PLUMED now generate a VIM syntax file, see \ref VimSyntax - - The backward cycle is now parallelised in MPI/OpenMP in case many collective variables are used. + - The backward cycle is now parallelized in MPI/OpenMP in case many collective variables are used. - GSL library is now searched by default during `./configure`. - Tutorials have been (partially) updated to reflect some of the changes in the syntax - Parser now reports errors when passing numbers that cannot be parsed instead of silently replacing their default value. See \issue{104}. @@ -114,7 +114,7 @@ Changes from version 2.2 which are relevant for users: For developers: - IMPORTANT: BIAS can now be BIASED as well, this changes can lead to some incompatibility: now the "bias" component is always defined automatically - by the constructure as a componentWithDerivatives, derivatives are automaticcaly obtained by forces. The main change is that you don't have to define + by the constructor of Bias as a componentWithDerivatives, derivatives are automatically obtained by forces. The main change is that you don't have to define the bias component anymore in your constructor and that you can use setBias(value) to set the value of the bias component in calculate. - Added new strings for plumed cmd: setMDMassUnits, setMDChargeUnits, readInputLine, performCalcNoUpdate, update and doCheckPoint. - Easier to add actions with multiple arguments @@ -128,10 +128,10 @@ For developers: ## Version 2.3.1 (Mar 31, 2017) - Fix to FIT_TO_TEMPLATE as in 2.2.5. Notice that in 2.3.0 also the case with TYPE=OPTIMAL was affected. This is fixed now. 
-- small change in \ref CS2BACKBONE to symmetrise the ring current contribution with respect to ring rotations (also faster) +- small change in \ref CS2BACKBONE to symmetrize the ring current contribution with respect to ring rotations (also faster) - fixed `plumed-config` that was not working. - log file points to the `config.txt` files to allow users to check which features were available in that compiled version. -- `make clean` in root dir now also cleans `vim` subdirectory. +- `make clean` in root dir now also cleans `vim` sub-directory. - Updated gromacs patch to version 2016.3 For developers: @@ -148,10 +148,10 @@ See branch \branch{v2.3} on git repository. - Fixed bug in if condition in \ref PCAVARS so that you can run with only one eigenvector defined in input - Fixed bug with timers in \ref sum_hills \issue{194}. - Fixed bug when using \ref MOVINGRESTRAINT with periodic variables such as \ref TORSION \issue{225}. -- Fixed bug in \ref HBOND_MATRIX that used to apear when you used DONORS and ACCEPTORS with same numbers of atoms +- Fixed bug in \ref HBOND_MATRIX that used to appear when you used DONORS and ACCEPTORS with same numbers of atoms - Fixed bug in \ref DISTANCES that appears when using BETWEEN and link cells. - Prevented users from causing segfaults by storing derivatives without LOWMEM flag. In these cases PLUMED crashes with meaningful errors. 
-- Fixed bug in \ref HISTOGRAM that causes nans when using KERNEL=DISCRETE option +- Fixed bug in \ref HISTOGRAM that causes NaNs when using KERNEL=DISCRETE option - Fixed a bug in the parser related to braces, see \issue{229} - Fixed a bug that appeared when using \ref Q3, \ref Q4 and \ref Q6 with LOWEST or HIGHEST flag - Fixed a bug that appears when you use \ref MFILTER_LESS as input to \ref COORDINATIONNUMBER with SPECIESA and SPECIESB flags @@ -219,7 +219,7 @@ For developers: ## Version 2.3.5 (Mar 2, 2018) For users: -- Fixed `plumed partial_tempering` to agree with GROMACS conventions for the choice of dihedrals (see \issue{337}). +- Fixed `plumed partial_tempering` to agree with GROMACS conventions for the choice of dihedral angles (see \issue{337}). Should be irrelevant for the vast majority of cases. - Fixed small bug in regexp parser - the part outside the parentheses was just ignored. @@ -265,7 +265,7 @@ For users: For developers: - Small fix in LDFLAGS when enabling coverage. - Fixed order of flags in tests for static linking done by configure (see \issue{407}). -- Fixed the way paths are hardcoded so as to facilitate conda packaging (see \issue{416}). +- Fixed the way paths are hard-coded so as to facilitate conda packaging (see \issue{416}). */ diff --git a/CHANGES/v2.4.md b/CHANGES/v2.4.md index ae03f3e0e..5e92dd979 100644 --- a/CHANGES/v2.4.md +++ b/CHANGES/v2.4.md @@ -16,7 +16,7 @@ Changes from version 2.3 which are relevant for users: Since the number of c++11 features that we use is limited, older compilers might work as well. - The meaning of BIASFACTOR=1 in \ref METAD has been modified and can now be used to indicate unbiased simulations. Non-well-tempered metadynamics is BIASFACTOR=-1, which is the new default value. - Notice that this has an implication on the biasfactor written in the HILLS file when doing + Notice that this has an implication on the bias factor written in the HILLS file when doing non-well-tempered metadynamics. 
- Due to a change in \ref COMMITTOR, the format of its output file has been slightly changed. - \ref HISTOGRAM : When using weights default is now to output histogram divided by number of frames from which data was taken. In addition the @@ -104,8 +104,8 @@ Changes from version 2.3 which are relevant for users: - Sharing coordinates and applying force is now faster (in some cases these can result in much better scaling of the performances in parallel). - \ref COMMITTOR : new flag to use committor to keep track of the visited basins without stopping the simulation - \ref PBMETAD : multiple walkers using files (thanks to Marco De La Pierre). - - \ref PBMETAD : adaptive gaussians - - \ref PBMETAD : default names for GRID and FILE (usefull with many collective variables) + - \ref PBMETAD : adaptive Gaussian kernels + - \ref PBMETAD : default names for GRID and FILE (useful with many collective variables) - \ref METAD : BIASFACTOR=1 is allowed and performs unbiased sampling. HILLS file can be used to recover free energy also in this case. - \ref METAD : a RECT option is available that allows setting an array of bias factors, one for each replica. @@ -142,7 +142,7 @@ Changes from version 2.3 which are relevant for developers: - C++ exceptions are enabled by default. - A large number of loops have been changed to use the `auto` keyword in order to improve code readability. - Stack trace is not written upon error anymore, unless environment variable `PLUMED_STACK_TRACE` is set at runtime. - - Fixed a potential bug using single precision system blas on a mac (notice that currently plumed only uses + - Fixed a potential bug using single precision system BLAS on a mac (notice that currently plumed only uses double precision, so it is harmless). - Added `--enable-rpath` option for autoconf (off by default). - Files related to changelog are now stored as `.md` files. 
This makes @@ -156,7 +156,7 @@ Changes from version 2.3 which are relevant for developers: - Store `MPIEXEC` variable at configure time and use it later for running regtests. Notice that in case `MPIEXEC` is not specified regtests will be run using the command stored in env var `PLUMED_MPIRUN` or, if this is also not defined, using `mpirun`. - - Added canonical makefile targets `check` and `installcheck`. Notice that `check` runs checks with + - Added canonical Makefile targets `check` and `installcheck`. Notice that `check` runs checks with non-installed plumed whereas `installcheck` uses the installed one, including its correct program name if it was personalized (e.g. with suffixes). Notice that this modifies the previously available `check` target. @@ -169,8 +169,8 @@ For users: (see \issue{343}). Results might depend on the exact architecture and on how aggressive is the compiler. The bug is a consequence of some erroneous SIMD directives introduced in 2.4.0, so it does not affect PLUMED 2.3.x. - Resolved a problem with \ref CS2BACKBONE and glycine atom names. - Module VES: Fixed a bug with basis functions that have a constant function different from 1 (e.g. scaled version of the Legendre basis functions, \ref BF_LEGENDRE) that was causing a time-dependent shift in the bias potential. - - Module VES: In optimizers (\ref OPT_AVERAGED_SGD and \ref OPT_DUMMY) the output of quantities related to the instantaneous gradients are now off by default as these quantities are generally not useful for normal users, their output can instead by re-enabled by using the MONITOR_INSTANTANEOUS_GRADIENT keyword. Also added an keyword MONITOR_AVERAGE_GRADIENT that allows to monitor the averged gradient and output quantities related to it. - - \ref RMSD variable and other collective variables using reference PDBs now crash when zero weights are passed (see \issue{247}). 
+ - Module VES: In optimizers (\ref OPT_AVERAGED_SGD and \ref OPT_DUMMY) the output of quantities related to the instantaneous gradients is now off by default as these quantities are generally not useful for normal users; their output can instead be re-enabled by using the MONITOR_INSTANTANEOUS_GRADIENT keyword. Also added a keyword MONITOR_AVERAGE_GRADIENT that allows one to monitor the averaged gradient and output quantities related to it. + - \ref RMSD variable and other collective variables using reference PDB files now crash when zero weights are passed (see \issue{247}). - Using \ref COM with \ref driver without passing masses now triggers an error instead of reporting NaNs (see \issue{251}). For developers: diff --git a/user-doc/tutorials/a-trieste-1.txt b/user-doc/tutorials/a-trieste-1.txt index 0af1e1d23..c9e3b1b16 100644 --- a/user-doc/tutorials/a-trieste-1.txt +++ b/user-doc/tutorials/a-trieste-1.txt @@ -10,7 +10,7 @@ simple collective variable and we will use them to analyze existing trajectories Once this tutorial is completed students will be able to: -- Write a simple PLUMED input file and use it with the PLUMED \ref driver to analyse a trajectory. +- Write a simple PLUMED input file and use it with the PLUMED \ref driver to analyze a trajectory. - Use the \ref GROUP keyword to make the input file compact and easy to read and to quickly build complex atom groups. - Print collective variables such as distances (\ref DISTANCE), torsional angles (\ref TORSION), gyration radius (\ref GYRATION), @@ -111,7 +111,7 @@ PRINT ARG=phi STRIDE=100 FILE=COLVAR2 \note If you are a VIM user, you might find convenient configuring PLUMED syntax files, see \ref VimSyntax. Syntax highlighting is particularly useful for beginners since it allows you to identify simple mistakes without the need to run PLUMED. In addition, VIM has a full dictionary of available - keywords and can help you autocompleting your commands.
+ keywords and can help you by autocompleting your commands. In the input file above, each line defines a so-called action. An action could either compute a distance, or the center between two or more atoms, or print some value on a file. Each action supports a number of keywords, @@ -161,11 +161,11 @@ Analyze the `traj-whole.xtc` trajectory and produce a colvar file with the follo - The torsional angle (\ref TORSION) corresponding to the glycosidic chi angle \f$\chi\f$ of the first nucleotide. Since this nucleotide is a purine (guanine), the proper atoms to compute the torsion are O4' C1 N9 C4. Find their serial number in the `ref.pdb` file or learn how to select a special angle reading the \ref MOLINFO documentation. -- The total number of contacts (\ref COORDINATION) between all RNA atoms and all water oxygens. +- The total number of contacts (\ref COORDINATION) between all RNA atoms and all water oxygen atoms. For \ref COORDINATION, set reference distance `R_0` to 2.5 A (be careful with units!!). - Try to be smart in selecting the water oxygens without listing all of them explicitly. + Try to be smart in selecting the water oxygen atoms without listing all of them explicitly. - Same as before but against water hydrogen. - Also in this case you should be smart to select water hydrogens. Documentation of \ref GROUP might help. + Also in this case you should be smart to select water hydrogen atoms. Documentation of \ref GROUP might help. - Distance between the Mg ion and the geometric center of the RNA duplex (use \ref CENTER and \ref DISTANCE). Notice that some of the atom selections can be made in a easier manner by using the \ref MOLINFO keyword with @@ -216,7 +216,7 @@ Once your `plumed.dat` file is complete, you can use it with the following comma \endverbatim Scroll in your terminal to read the PLUMED log. As you can see, -PLUMED gives a lot of feedbacks about the input that he is reading.
There's the
+PLUMED gives a lot of feedback about the input that it is reading. There's the
place where you can check if PLUMED understood correctly your input.

The command above will create a file `COLVAR` like this one:
@@ -353,7 +353,7 @@ be used in the following cases:
- To dump snapshots of our molecule conditioned to some value of some collective variable (see \ref UPDATE_IF).
- To dump coordinates of atoms that have been moved by PLUMED.

-The last point is perhaps the most surpising one. Some of the PLUMED actions can indeed move the stored atoms to
+The last point is perhaps the most surprising one. Some of the PLUMED actions can indeed move the stored atoms to
positions better suitable for the calculation of collective variables.

The previous exercise was done on a trajectory where the RNA was already whole. For the next exercise you will use the
@@ -368,10 +368,10 @@ This is not a problem during MD because of periodic boundary conditions. However
to analyze this trajectory. In addition, some collective variables that you might want to compute could
require the molecules to be whole (an example of such variables is \ref RMSD).

-You might think that there are alternative programs that can be used to reconstruct PBCs correctly
+You might think that there are alternative programs that can be used to reconstruct the molecules that are broken by the periodic boundary correctly
in your trajectory *before* analyzing it. However, you should keep in mind that if you need to
compute CVs on the fly to add a bias potential on those (as we will do in the next tutorials) you will
-have to learn how to reconstruct PBCs within PLUMED. If you know alternative tools that can reconstruct PBCs,
+have to learn how to reconstruct the molecules that are broken by the periodic boundary within PLUMED.
If you know alternative tools that can reconstruct the molecules that are broken by the periodic boundary, it is a good idea to also use them and compare the result with PLUMED. @@ -557,7 +557,7 @@ As you can see, the generating vectors of the periodic lattice *before* fitting On the other hand, *after* fitting these vectors change so as to keep RNA correctly aligned to its reference structure. Later on you will learn how to add a bias potential on a give collective variable. In principle, you could -also add a \ref RESTRAINT ro the `cell_after.*` variables of the last example. This would allow you to +also add a \ref RESTRAINT to the `cell_after.*` variables of the last example. This would allow you to force your molecule to a specific orientation. \endhidden diff --git a/user-doc/tutorials/a-trieste-2.txt b/user-doc/tutorials/a-trieste-2.txt index 6a0b51abd..51e4bc0d5 100644 --- a/user-doc/tutorials/a-trieste-2.txt +++ b/user-doc/tutorials/a-trieste-2.txt @@ -3,7 +3,7 @@ \section trieste-2-aim Introduction -The aim of this tutorial is for you to understand how we analyse trajectory data by calculating ensemble averages. One key point we try to +The aim of this tutorial is for you to understand how we analyze trajectory data by calculating ensemble averages. One key point we try to make in this tutorial is that you must <b> always </b> calculate averaged quantities from our simulation trajectories. Consequently, in order to understand the analysis that is done when we perform molecular dynamics and Monte Carlo simulations, we will need to understand some things about probability and statistics. In fact, the things that we need to understand about probability and statistics are going to be the main subject of this @@ -14,9 +14,9 @@ model data instead. Once this tutorial is completed students will: -- Be able to explain the role played by the central limit theorem in many of the analyses we perform on simulation trajectories. 
+- Be able to explain the role played by the central limit theorem in many of the analysis methods that we perform on simulation trajectories. - Be able to use PLUMED to calculate ensemble averages and histograms using the keywords \ref AVERAGE and \ref HISTOGRAM. -- Be able to use PLUMED to perform block analyses of trajectory data using the keywords \ref AVERAGE and \ref HISTOGRAM. +- Be able to use PLUMED to perform block analysis of trajectory data using the keywords \ref AVERAGE and \ref HISTOGRAM. - Be able to use PLUMED to calculate unbiased ensemble averages and histograms from biased trajectories by using the keywords \ref AVERAGE, \ref HISTOGRAM and \ref REWEIGHT_BIAS. - Be able to explain how block analysis can be used to detect problems with error bar underestimation in correlated data. @@ -24,13 +24,13 @@ Once this tutorial is completed students will: Let's begin by thinking about what is actually output by a simulation trajectory. You probably know by now that molecular dynamics gives us a trajectory that provides us with information on how the positions and velocities of the all the atoms in the system change with time. Furthermore, you should by now -understand that we can use PLUMED to reduce the ammount of information contained in each of the trajectory frames to a single number or a low-dimensional vector +understand that we can use PLUMED to reduce the amount of information contained in each of the trajectory frames to a single number or a low-dimensional vector by calculating a collective variable or a set of collective variables. As we saw in \ref trieste-1 this process of lowering the dimensionality of the data contained in a simulation trajectory was vital as we understand very little by watching the motions of the hundreds of atoms that the system we are simulating contains. 
If we monitor the appropriate key variables, however, we can determine whether a chemical reaction has taken place, whether the system has undergone a
phase transition to a more ordered form or if a protein has folded. Even so, if all we do is monitor the values that the collective
variables take during the simulation our result can only ever be qualitative and we can only say that we observed this process to take place under these particular conditions on one particular occasion.
-Obviously, we would like to be able to do more - we would like to be able to make quantiative predictions based on the outcomes of our simulations.
+Obviously, we would like to be able to do more - we would like to be able to make quantitative predictions based on the outcomes of our simulations.

The first step in moving towards making these quantitative predictions involves rethinking what precisely our simulation trajectory has provided us with. We know from
rudimentary statistical mechanics that when we perform a molecular dynamics simulation in the NVT ensemble the values of the collective variables
@@ -48,12 +48,12 @@ to the integral in the numerator. The fact that we know what distribution the
\f$X\f$-values obtained from our trajectory frames are taken from is not particularly helpful as it is impossible to calculate the integrals in the expression above
analytically. The fact that we know that our trajectory frames represent samples from a distribution is helpful, however, because of a
-result know as the Central Limit Thoerem.
This theorem states the following: \f[ \lim_{n \rightarrow \infty} P\left( \frac{\frac{S_n}{n} - \langle Y \rangle}{ \sqrt{\frac{\langle(\delta Y)^2\rangle}{n}} } \le z \right) = \Phi(z) \f] In this expression \f$\Phi(z)\f$ is the cumulative probability distribution function for a standard normal distribution and \f$S_n\f$ is a sum of \f$n\f$ independent samples -from a probability distribution - in our case the probability distribution that we introduced in the previous equation. \f$\langle Y \rangle\f$ is the ensemble average +from a probability distribution - in our case the probability distribution that we introduced in the previous equation. \f$\langleY\rangle\f$ is the ensemble average for the quantity \f$Y(x)\f$, which, in our case, we can calculate as follows: \f[ \langle Y \rangle = \frac{ \int \textrm{d}x \textrm{d}p Y(x) e^{-\frac{H(x,p)}{k_B T}} }{ \int \textrm{d}x\textrm{d}p e^{-\frac{H(x,p)}{k_B T}} } @@ -67,7 +67,7 @@ quantity is calculated using: The statement of the theorem provided above is compact way of stating the formal theorem. This is undoubtedly pleasing to mathematicians but to understand why this idea is so important in the context of molecular simulation it is useful to think about this in a slightly less formal way. The central limit theorem essentially tells us that if we take the set of random variables we extract from a simulation trajectory, which we know represent samples from the complicated distribution in the first equation above, -and we add all these random variables together and divide by \f$n\f$ we can think of the final number we obtain as a random variable that is taken from a Gaussian distribution. In other words, the probabiilty density function for the sample mean, \f$\frac{S_n}{n}\f$, that we get from our trajectory frames is given by: +and we add all these random variables together and divide by \f$n\f$ we can think of the final number we obtain as a random variable that is taken from a Gaussian distribution. 
In other words, the probability density function for the sample mean, \f$\frac{S_n}{n}\f$, that we get from our trajectory frames is given by:
\f[
P(S_n) = \frac{1}{\sqrt{\frac{2\pi \langle(\delta Y)^2\rangle}{n}}} \exp\left( - \frac{\left(\frac{S_n}{n} - \langle Y \rangle\right)^2}{ 2\frac{\langle(\delta Y)^2\rangle}{n} } \right)
\f]
@@ -87,8 +87,8 @@ better understanding of the central limit theorem, confidence limits and error b

\subsection trieste-2-average Calculating an ensemble average

-As discussed in the introduction we are going to be using model data in this exercise so we must begin by generating some model data to analyse
-using PLUMED. The following short python script will generate 10000 (pseudo) random variables between 0 and 1 from a uniform disribution in a format that
+As discussed in the introduction we are going to be using model data in this exercise so we must begin by generating some model data to analyze
+using PLUMED. The following short python script will generate 10000 (pseudo) random variables between 0 and 1 from a uniform distribution in a format that
PLUMED can understand:

\code{.py}
@@ -102,15 +102,15 @@
for i in range(0,10001):

Copy the contents of the box above to a plain text file called generate_data.py, save the file and then execute the script within it by running:

\verbatim
-> python generate_data.py > mydata
+> python generate_data.py > my_data
\endverbatim

-This will generate a file called mydata that contains 10001 uniform random variables. PLUMED will ignore the first number in the colvar file as it assumes
+This will generate a file called my_data that contains 10001 uniform random variables. PLUMED will ignore the first number in the colvar file as it assumes
this is the initial configuration you provided for the trajectory. The sample mean will thus be calculated from 10000 random variables.
Plot this data now using gnuplot and the command:

\verbatim
-gnuplot> p 'mydata' u 1:2 w p
+gnuplot> p 'my_data' u 1:2 w p
\endverbatim

The probability distribution from which we generated these random variables is considerably simpler than the probability distribution that we would typically
@@ -125,11 +125,11 @@ These are:
\end{aligned}
\f]

-Lets now try estimating these quantieis by calculating \f$\frac{S_n}{n}\f$ from the points we generated and exploting the central limit theorem.
+Let's now try estimating these quantities by calculating \f$\frac{S_n}{n}\f$ from the points we generated and exploiting the central limit theorem.
We can do this calculation by writing a PLUMED input file that reads:

\plumedfile
-data: READ FILE=mydata VALUES=rand
+data: READ FILE=my_data VALUES=rand
d2: MATHEVAL ARG=data VAR=a FUNC=a*a PERIODIC=NO
av: AVERAGE ARG=data STRIDE=1
av2: AVERAGE ARG=d2 STRIDE=1
@@ -142,14 +142,14 @@ If you copy this input to a file called plumed.dat you can then run the calculat
> plumed driver --noatoms
\endverbatim

-When the calculation is finished you should have a file called colvar that contains the estimate of the ensemble averages \f$\langle X \rangle\f$ and \f$\langle X^2 \rangle\f$.
+When the calculation is finished you should have a file called colvar that contains the estimate of the ensemble averages \f$\langle X\rangle\f$ and \f$\langle X^2\rangle\f$.
To be clear, the quantities output in this file are:

\f[
\langle X \rangle = \frac{1}{N} \sum_{i=1}^N X_i \qquad \textrm{and} \qquad \langle X^2 \rangle = \frac{1}{N} \sum_{i=1}^N X_i^2
\f]

-We can calculate the flucutations, \f$(\delta X)^2\f$, from them using:
+We can calculate the fluctuations, \f$(\delta X)^2\f$, from them using:

\f[
(\delta X)^2 = \langle X^2 \rangle - \langle X \rangle^2
\f]

is reasonable but not perfect.
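The averages that PLUMED accumulates here can be cross-checked with a few lines of plain Python. The sketch below is only illustrative: it draws its own uniform samples instead of reading the my_data file, and all variable names are ours rather than anything PLUMED defines.

```python
import random

# Generate 10000 uniform samples, mimicking what generate_data.py writes
random.seed(7)
samples = [random.uniform(0.0, 1.0) for _ in range(10000)]

# Sample estimates of <X> and <X^2>, the two averages accumulated above
n = len(samples)
mean_x = sum(samples) / n
mean_x2 = sum(x * x for x in samples) / n

# Fluctuations computed from the two ensemble averages
fluct = mean_x2 - mean_x ** 2

print(mean_x, mean_x2, fluct)
```

The three printed numbers should be close to the exact values for the uniform distribution, 1/2, 1/3 and 1/12, with the same imperfect agreement noted in the text.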
\subsection trieste-2-histo Calculating a histogram

-We can use what we have learnt about calculating an ensemble average to calculate an estimate for the probability density function or probability mass function
+We can use what we have learned about calculating an ensemble average to calculate an estimate for the probability density function or probability mass function
for our random variable. The theory behind what we do here is explained in this video http://gtribello.github.io/mathNET/histogram-video.html

To do such a calculation with PLUMED on the random variables we generated from the uniform distribution in the previous section we would use an input like the one below:

\plumedfile
-data: READ FILE=mydata VALUES=rand
+data: READ FILE=my_data VALUES=rand
hh: HISTOGRAM ARG=data STRIDE=1 GRID_MIN=0 GRID_MAX=1.0 GRID_BIN=20 KERNEL=DISCRETE
-DUMPGRID GRID=hh FILE=myhist.dat
+DUMPGRID GRID=hh FILE=my_histogram.dat
\endplumedfile

Once again you can run this calculation by using the command:
@@ -181,7 +181,7 @@
Once this calculation is completed you can then plot the output using gnuplot and the command:

\verbatim
-gnuplot> p 'myhist.dat' u 1:2 w l
+gnuplot> p 'my_histogram.dat' u 1:2 w l
\endverbatim

In the previous section we compared the estimates we got for the ensemble average with the exact analytical values, which we could determine
@@ -195,9 +195,9 @@
Also notice that the probability estimate for the first and last points in our h

As discussed in previous sections the sample mean of the CV values from our trajectory frames is a random variable and the central limit theorem tells us something about
the <b> distribution </b> from which this random variable is drawn. The fact that we know what distribution the sample mean is drawn from is what allows
us to <b> estimate </b> the ensemble average and the fluctuations.
The fact that this recipe is only estimating the ensemble average is critical and -this realisation should always be at the forefront of our minds whenever we analyse our simulation data. The point, once again, is that the sample mean for CV values +this realization should always be at the forefront of our minds whenever we analyze our simulation data. The point, once again, is that the sample mean for CV values from the simulation is random. As is explained in this video http://gtribello.github.io/mathNET/central-limit-theorem-video.html, however, we can use the -central limit theorem to calculate a range, \f$\{\langle X \rangle-\epsilon,\langle X \rangle+\epsilon\}\f$, that the sample mean for the CV values, \f$\frac{S_n}{n}\f$, +central limit theorem to calculate a range, \f$\{\langle X\rangle-\epsilon,\langle X\rangle+\epsilon\}\f$, that the sample mean for the CV values, \f$\frac{S_n}{n}\f$, will fall into with a probability \f$p_c\f$ using: \f[ \epsilon = \sqrt{ \frac{\langle ( \delta X )^2 \rangle }{n} } \Phi^{-1}\left( \frac{p_c + 1}{2} \right) @@ -206,7 +206,7 @@ Here \f$\Phi^{-1}\f$ is the inverse of the cumulative probability distribution f this range gets smaller as the number of samples from which you calculate the mean, \f$n\f$, increases. As is shown in the figure below, however, the rate at which this range narrows in size is relatively small. -\anchor triest-2-confidence +\anchor trieste-2-confidence \image html trieste-2-confidence.png "A figure showing how the width of the 90% confidence limit changes with number of samples. Two lines are shown - one at plus epsilon and one at minus epsilon. This figure shows clearly that simply averaging over more random variables will not get us markedly closer to the true value of the ensemble average, which is 0 for this particular example. For this random variable the average fluctuation is equal to one." 
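If you want to evaluate the expression for \f$\epsilon\f$ yourself, the Python standard library is enough. This is an illustrative sketch, not part of PLUMED: `NormalDist.inv_cdf` (available from Python 3.8) plays the role of \f$\Phi^{-1}\f$, and the function name is ours.

```python
import math
from statistics import NormalDist

def confidence_half_width(fluct, n, p_c=0.90):
    """epsilon = sqrt(fluct / n) * Phi^{-1}((p_c + 1) / 2)"""
    return math.sqrt(fluct / n) * NormalDist().inv_cdf((p_c + 1.0) / 2.0)

# For uniform random variables the fluctuations are 1/12; notice how slowly
# the 90% confidence range narrows as the number of samples grows
for n in (100, 500, 10000):
    print(n, confidence_half_width(1.0 / 12.0, n))
```

Because \f$\epsilon\f$ shrinks only as \f$1/\sqrt{n}\f$, the printed half-widths reproduce the slow narrowing shown in the figure above.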
This graph hopefully illustrates to you that an estimate of the ensemble average that is taken over 500 trajectory frames is not guaranteed @@ -218,7 +218,7 @@ various parts of ``the trajectory" are all consistent. We can perform a block averaging on the data we generated at the start of the first exercise above by using PLUMED and the input file below: \plumedfile -data: READ FILE=mydata VALUES=rand +data: READ FILE=my_data VALUES=rand av: AVERAGE ARG=data STRIDE=1 CLEAR=1000 PRINT ARG=av STRIDE=1000 FILE=colvar \endplumedfile @@ -229,7 +229,7 @@ Once again this calculation can be executed by running the command: > plumed driver --noatoms \endverbatim -The key differences between this input and the ones that we have seen previously is the CLEAR keyword on the line AVERAGE. The intruction CLEAR=1000 tells PLUMED +The key differences between this input and the ones that we have seen previously is the CLEAR keyword on the line AVERAGE. The instruction CLEAR=1000 tells PLUMED that the data that has been accumulated for averaging should be reset to zero (cleared) every 1000 steps. If you look at the subsequent PRINT command in the above input you see the instruction STRIDE=1000. The above input thus accumulates an average over 1000 trajectory frames. This average is then output to the file colvar on step 1000 and the accumulated data is immediately cleared after printing so that a new average over the next 1000 steps can be accumulated. We can plot the averages @@ -241,15 +241,15 @@ gnuplot> p 'colvar' u 1:2 w l If you try this now you should see that all the average values that were calculated are relatively consistent but that there are differences between them. <b> Try to calculate the size of \f$\epsilon\f$ for a 90 % confidence limit around this random variable using the formula that was provided above. How many of the averages that -you exptracted using PLUMED lie within this range? 
Is this behavior inline with your expectations based on your understanding of what a confidence limit is? </b> +you extracted using PLUMED lie within this range? Is this behavior inline with your expectations based on your understanding of what a confidence limit is? </b> We can also perform block averaging when we estimate histograms using PLUMED. The following input will calculate these block averaged histograms for the data we generated at the start of this exercise using PLUMED. \plumedfile -data: READ FILE=mydata VALUES=rand +data: READ FILE=my_data VALUES=rand hh: HISTOGRAM ARG=data STRIDE=1 GRID_MIN=0 GRID_MAX=1.0 GRID_BIN=20 KERNEL=DISCRETE CLEAR=1000 -DUMPGRID GRID=hh FILE=myhist.dat STRIDE=1000 +DUMPGRID GRID=hh FILE=my_histogram.dat STRIDE=1000 \endplumedfile Notice that the input here has the same structure as the input for the \ref AVERAGE. Once again we have a CLEAR=1000 keyword that tells PLUMED that the data that has been @@ -257,21 +257,21 @@ accumulated for calculating the histogram should be deleted every 1000 steps. I these blocks of trajectory data separately. The difference between the output from this input and the output from the input above is that in this case we have multiple output files. 
In particular, the input above should give you 10 output files which will be called: -- analysis.0.myhist.dat = analysis of data in first 1000 frames -- analysis.1.myhist.dat = analysis of data in second 1000 frames -- analysis.2.myhist.dat = analysis of data in third 1000 frames -- analysis.3.myhist.dat = analysis of data in fourth 1000 frames -- analysis.4.myhist.dat = analysis of data in fifth 1000 frames -- analysis.5.myhist.dat = analysis of data in sixth 1000 frames -- analysis.6.myhist.dat = analysis of data in seventh 1000 frames -- analysis.7.myhist.dat = analysis of data in eigth 1000 frames -- analysis.8.myhist.dat = analysis of data in ninth 1000 frames -- myhist.dat = analysis of data in tenth 1000 frames +- analysis.0.my_histogram.dat = analysis of data in first 1000 frames +- analysis.1.my_histogram.dat = analysis of data in second 1000 frames +- analysis.2.my_histogram.dat = analysis of data in third 1000 frames +- analysis.3.my_histogram.dat = analysis of data in fourth 1000 frames +- analysis.4.my_histogram.dat = analysis of data in fifth 1000 frames +- analysis.5.my_histogram.dat = analysis of data in sixth 1000 frames +- analysis.6.my_histogram.dat = analysis of data in seventh 1000 frames +- analysis.7.my_histogram.dat = analysis of data in eighth 1000 frames +- analysis.8.my_histogram.dat = analysis of data in ninth 1000 frames +- my_histogram.dat = analysis of data in tenth 1000 frames We can plot all of these histograms using gnuplot and the command: \verbatim -gnuplot> p 'analysis.0.myhist.dat' u 1:2 w l, 'analysis.1.myhist.dat' u 1:2 w l, 'analysis.2.myhist.dat' u 1:2 w l, 'analysis.3.myhist.dat' u 1:2 w l, 'analysis.4.myhist.dat' u 1:2 w l, 'analysis.5.myhist.dat' u 1:2 w l, 'analysis.6.myhist.dat' u 1:2 w l, 'analysis.7.myhist.dat' u 1:2 w l, 'analysis.8.myhist.dat' u 1:2 w l, 'myhist.dat' u 1:2 w l +gnuplot> p 'analysis.0.my_histogram.dat' u 1:2 w l, 'analysis.1.my_histogram.dat' u 1:2 w l, 'analysis.2.my_histogram.dat' u 1:2 w l, 
'analysis.3.my_histogram.dat' u 1:2 w l, 'analysis.4.my_histogram.dat' u 1:2 w l, 'analysis.5.my_histogram.dat' u 1:2 w l, 'analysis.6.my_histogram.dat' u 1:2 w l, 'analysis.7.my_histogram.dat' u 1:2 w l, 'analysis.8.my_histogram.dat' u 1:2 w l, 'my_histogram.dat' u 1:2 w l \endverbatim Performing a comparison between the results from each of these blocks of data is more involved than the analysis we performed when comparing the ensemble averages as we have @@ -280,7 +280,7 @@ fraction of the many averages we have calculated here lie within the confidence \subsection trieste-2-biases Problem II: Dealing with rare events and simulation biases -As is discussed in many of the tutorials that are provided here one of PLUMED's primary purposes is to use simulation biases to resolve the rare events problem. +As is discussed in many of the tutorials that are provided here one of the primary purposes of PLUMED is to use simulation biases to resolve the rare events problem. When we use these technique we modify the Hamiltonian, \f$H(x,p)\f$, and add to it some bias, \f$V(x)\f$. The modified Hamiltonian thus becomes: \f[ H'(x,p) = H(x,p) + V(x) @@ -323,8 +323,8 @@ Copy the script above to a file called gen-normal.py and then execute the python \endverbatim <b> -Use what you have learnt from the exercises above to estimate the ensemble average from these generated data points using PLUMED. If you want -you can use block averaging but it doenst matter if you do it by just considering all the data in the input file. What would you expect the ensemble average +Use what you have learned from the exercises above to estimate the ensemble average from these generated data points using PLUMED. If you want +you can use block averaging but it does not matter if you do it by just considering all the data in the input file. What would you expect the ensemble average to be and how does the estimate you get from PLUMED compare with the true value? 
</b> @@ -342,7 +342,7 @@ performing here with the random variables this question perhaps feels somewhat a this question is essential as our ability to extract the true, unbiased distribution from a simulation run with a bias relies on being able to perform this sort of reweighting. -The true value of the ensemble average, which we estimated when we analysed the data generated from the Gaussin using the python script above is given by: +The true value of the ensemble average, which we estimated when we analyzed the data generated from the Gaussian using the python script above is given by: \f[ \langle X \rangle = \frac{ \int_0^1 x e^{-V(x)} \textrm{d}x }{\int_0^1 e^{-V(x)} \textrm{d}x } \qquad \textrm{where} \qquad @@ -389,13 +389,13 @@ data: READ FILE=mynormal VALUES=rand IGNORE_FORCES mm: RESTRAINT ARG=data AT=0.6 KAPPA=33.333 rw: REWEIGHT_BIAS TEMP=1 hh: HISTOGRAM ARG=data STRIDE=1 GRID_MIN=0 GRID_MAX=1.0 GRID_BIN=20 KERNEL=DISCRETE LOGWEIGHTS=rw -DUMPGRID GRID=hh FILE=myhist.dat +DUMPGRID GRID=hh FILE=my_histogram.dat \endplumedfile Plot the histogram that you obtain from this calculation using the command: \verbatim -gnuplot> p 'myhist.dat' w l +gnuplot> p 'my_histogram.dat' w l \endverbatim <b> @@ -406,11 +406,11 @@ have looked like and what functional form would our simulation bias have taken? \subsection triete-2-correlation Problem III: Dealing with correlated variables As a good scientist you have probably noticed that we have failed to provide error bars around our estimates for the ensemble average in previous sections. Furthermore, -you may have found that rather odd given that I showed you how to estimate the variance in the data in the very first exericse. This ignoring of the error bars has been +you may have found that rather odd given that I showed you how to estimate the variance in the data in the very first exercise. 
This ignoring of the error bars has been deliberate, however, as there is a further problem with the data that comes from molecular dynamics trajectories that we have to learn to deal with and that will affect our -ability to esimtate the error bars. This problem is connected to the fact that <b>the central limit theorem only holds when we take a sum of uncorrelated and identical +ability to estimate the error bars. This problem is connected to the fact that <b>the central limit theorem only holds when we take a sum of uncorrelated and identical random variables.</b> Random variables that are generated from simulation trajectories will contain correlations simply because the system will most likely not diffuse -from one edge of CV space to the other during a single timestep. In other words, the random CV value that we calcuate from the \f$(i+1)\f$th frame of the trajectory +from one edge of CV space to the other during a single timestep. In other words, the random CV value that we calculate from the \f$(i+1)\f$th frame of the trajectory will be similar to the value obtained from the \f$i\f$th trajectory frame. This problem can be resolved, however, and to show how we will thus use the following python script to generate some correlated model data: @@ -442,7 +442,7 @@ At present it is not possible to calculate this function using PLUMED. 
If we use samples that were taken from the distribution in the python script above we would find that the autocorrelation functions for these
random variables look something like the figures shown below:

-\anchor triest-2-autocorrelation
+\anchor trieste-2-autocorrelation
\image html trieste-2-autocorrelation.png "The autocorrelation functions for the data that was generated by taking samples from a uniform distribution (left panel) and the autocorrelation function for the correlated data that was generated using the python script in this section (right panel)"

To understand what these functions are telling us let's deal with the samples from the uniform distribution first. The autocorrelation function in this case has a
@@ -452,11 +452,11 @@ independent. If we now look at the autocorrelation function for the distributio
slowly decays to zero. There are, therefore, correlations between the random variables that we generate that we must account for when we perform our
analysis.

We account for the correlations in the data when we do our analysis by performing a block analysis. To understand what precisely this involves we are going to perform
-the analysis with PLUMED and explain the results that we get at each stage of the process. We wil begin by analysing the data we generated by sampling from the uniform
+the analysis with PLUMED and explain the results that we get at each stage of the process.
We will begin by analyzing the data we generated by sampling from the uniform distribution using the following PLUMED input: \plumedfile -data: READ FILE=mydata VALUES=rand +data: READ FILE=my_data VALUES=rand av5: AVERAGE ARG=data STRIDE=1 CLEAR=5 PRINT ARG=av5 FILE=colvar5 STRIDE=5 av10: AVERAGE ARG=data STRIDE=1 CLEAR=10 @@ -493,7 +493,7 @@ Copy the input above to a file called plumed.dat and run the calculation using t > plumed driver --noatoms \endverbatim -This calculation should output 14 colvar files, which should then be further analysed using the following python script: +This calculation should output 14 colvar files, which should then be further analyzed using the following python script: \code{.py} import numpy as np @@ -522,24 +522,24 @@ for i in range(1,15): Copy this script to a file called block-average-script.py and then execute the contents using the command: \verbatim -> python block-average-script.py > myaverages.dat +> python block-average-script.py > my_averages.dat \endverbatim -This will output a single file called myaverages.dat that can be plotted using gnuplot and the command: +This will output a single file called my_averages.dat that can be plotted using gnuplot and the command: \verbatim -gnuplot> p 'myaverages.dat' u 1:2:3 w e, 'myaverages.dat' w l +gnuplot> p 'my_averages.dat' u 1:2:3 w e, 'my_averages.dat' w l \endverbatim The final result should be a graph like that shown in the left panel of the figure below. The figure on the right shows what you should obtain when you repeat the same analysis on the correlated data that we generated using the python script in this section. -\anchor triest-2-block-average -\image html trieste-2-block-averages.png "The ensemble average and an estimate of the associated error bars calculated for different lengths of block average. The left panel shows the output obtained when the uncorrelated samples taken from uniform distribution are analysed in this way. 
The right panel shows the output obtained when the correlated samples that are generated using the python script in this section are analysed." +\anchor trieste-2-block-average +\image html trieste-2-block-averages.png "The ensemble average and an estimate of the associated error bars calculated for different lengths of block average. The left panel shows the output obtained when the uncorrelated samples taken from uniform distribution are analyzed in this way. The right panel shows the output obtained when the correlated samples that are generated using the python script in this section are analyzed." <b> The output that you obtain from these two calculations is explained in the video at: http://gtribello.github.io/mathNET/block_averaging_video.html -Before wathcing this explanation though take some time to note down the differences between the two graphs above. Try to look through the scripts +Before watching this explanation though take some time to note down the differences between the two graphs above. Try to look through the scripts above and to understand what is being done in the PLUMED inputs and the python scripts above so as to work out what the data in these two figures are telling you. </b> @@ -593,7 +593,7 @@ random translational moves of up to 0.1 units. An autocorrelation function that using data generated using this script is shown below. You can clearly see from this figure that there are correlations between the adjacent data points in the time series and that we have to do block averaging as a result. 
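The essence of the block-analysis procedure used throughout this section can be condensed into a short standalone sketch. The function and variable names below are ours, and the correlated model data comes from a simple autoregressive update rather than the Monte Carlo script above.

```python
import math
import random

def block_average_error(data, block_size):
    """Average the data in blocks, then estimate the error on the overall
    mean from the scatter of the block averages."""
    nblocks = len(data) // block_size
    averages = [sum(data[i * block_size:(i + 1) * block_size]) / block_size
                for i in range(nblocks)]
    mean = sum(averages) / nblocks
    var = sum((a - mean) ** 2 for a in averages) / (nblocks - 1)
    return mean, math.sqrt(var / nblocks)

# Correlated model data (our own stand-in for the scripts in this section)
random.seed(3)
data, x = [], 0.0
for _ in range(10000):
    x = 0.95 * x + random.gauss(0.0, 1.0)
    data.append(x)

# With correlated data, short blocks underestimate the error bar
for size in (5, 50, 500):
    mean, error = block_average_error(data, size)
    print(size, mean, error)
```

The printed error estimate grows with the block length until the blocks become effectively independent - exactly the underestimation effect that the block-averaging exercises above are designed to expose.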
-\anchor triest-2-mc-autocorr +\anchor trieste-2-mc-autocorrelation \image html trieste-2-mc-autocorrelation.png "The autocorrelation function for the data that was generated using the Monte Carlo sampler in the script above" We can use the following PLUMED input to calculate block averages for the unbiased histogram from the data we generated: @@ -604,7 +604,7 @@ data: READ FILE=mcdata VALUES=rand IGNORE_FORCES mm: RESTRAINT ARG=data AT=0.6 KAPPA=33.333 rw: REWEIGHT_BIAS TEMP=1 hh: HISTOGRAM ARG=data STRIDE=1 GRID_MIN=0 GRID_MAX=1.0 GRID_BIN=20 KERNEL=DISCRETE LOGWEIGHTS=rw CLEAR=500 -DUMPGRID GRID=hh FILE=myhist.dat STRIDE=500 +DUMPGRID GRID=hh FILE=my_histogram.dat STRIDE=500 \endplumedfile Notice that this input instructs PLUMED to calculate block averages for the histogram from each set of 500 consecutive frames in the trajectory. @@ -626,7 +626,7 @@ import numpy as np # Here are some numbers you will need to change if you run this script on grids generated in different contexts nquantities = 1 # Number of quanities that have been evaluated on the grid grid_dimension = 1 # Number of collective variables that you provided using the ARG keyword -filename = "myhist.dat" # The name you specified the data to output to in the DUMPGRID command +filename = "my_histogram.dat" # The name you specified the data to output to in the DUMPGRID command # Function to read in histogram data and normalization def readhistogram( fname ) : @@ -677,7 +677,7 @@ variance = ( norm /(norm-(norm2/norm)) ) * variance # And lastly divide by number of grids and square root to get an error bar for each grid point ngrid = 1 + len( glob.glob( "analysis.*." 
+ filename ) ) errors = np.sqrt( variance / ngrid ) -# Calcualte average error over grid and output in header +# Calculate average error over grid and output in header for i in range(0,nquantities) : mean_error = sum(errors[i,:]) / len(errors[i,:]) print("# Average error for " + str(i+1) + "th averaged function on grid equals ", mean_error ) @@ -719,7 +719,7 @@ error around the average in \ref triete-2-correlation. A plot of the average er is shown below. This figure demonstrates that a block average length of 500 is certainly long enough to correct for the problems due to the correlations in the values produced at different times: -\anchor triest-2-ebdata +\anchor trieste-2-errors-data \image html trieste-2-histogram-errors.png "The size of the error bars calculated with different lengths of block average. From this figure it is clear that taking block averages over 500 data points allows us to account for the correlations in the data and extract reasonable error bars." <b> @@ -729,11 +729,11 @@ If you have sufficient time try to see if you can reproduce this plot using the \section trieste-2-extensions Extensions The exercises in the previous sections have shown you how we can calculate ensemble averages from trajectory data. You should have seen that performing block averaging -is vital as this technique allows us to deal with the artefacts that we get because of the correlations in the data that we obtain from simulation trajectories. +is vital as this technique allows us to deal with the artifacts that we get because of the correlations in the data that we obtain from simulation trajectories. What we have seen is that when this technique is used correctly we get reasonable error bars. If block averaging is not performed, however, we can underestimate the errors in our data and thus express a false confidence in the reliability of our simulation results. -The next step for you will be to use this technique when analysing the data from your own simulations. 
+The next step for you will be to use this technique when analyzing the data from your own simulations. */ diff --git a/user-doc/tutorials/a-trieste-3.txt b/user-doc/tutorials/a-trieste-3.txt index 8586df344..dac69051b 100644 --- a/user-doc/tutorials/a-trieste-3.txt +++ b/user-doc/tutorials/a-trieste-3.txt @@ -9,7 +9,7 @@ The aim of this tutorial is to introduce the users to the use of constant biases - Apply a restraint on a simulations over one or more collective variables - Understand the effect of a restraint on the acquired statistics -- Perform a simple unbiasing of a restrained simulation +- Perform a simple un-biasing of a restrained simulation - Add an external potential in the form of an analytical or numerical function \section trieste-3-resources Resources @@ -26,7 +26,7 @@ of 2.4-only features, thus should also work with version 2.3. \section trieste-3-intro Introduction -PLUMED can calculate conformational properties of a system a posteriori as well as on-the-fly. This information can be use to manipulate a simulation on-the-fly. This means adding energy terms in addition to those of the original Hamiltonian. This additional energy terms are usually refered as \ref Bias. In the following we will see how to apply a constant bias potential with PLUMED. It is preferable to run each exercise in a separate folder. +PLUMED can calculate conformational properties of a system a posteriori as well as on-the-fly. This information can be used to manipulate a simulation on-the-fly. This means adding energy terms in addition to those of the original Hamiltonian. These additional energy terms are usually referred to as \ref Bias. In the following we will see how to apply a constant bias potential with PLUMED. It is preferable to run each exercise in a separate folder. \hidden{Summary of theory} \subsection trieste-3-theory-biased-sampling Biased sampling @@ -104,9 +104,9 @@ than frames that has been explored with a less disfavoring bias potential.
\endhidden -We will make use of two toy models: the first is a water dimer, i.e. two molecules of water in vacuo, that we will use to compare the effect of a constant bias on the equilibrium properties of the system that in this case can be readily computed. The second toy model is alanine dipeptide in vacuo. This system is more challanging to characterise with a standard MD simulation and we will see how we can use an interative approach to to build a constant bias that will help in flattening the underlying free energy surface and thus sped up the sampling. +We will make use of two toy models: the first is a water dimer, i.e. two molecules of water in vacuo, that we will use to compare the effect of a constant bias on the equilibrium properties of the system that in this case can be readily computed. The second toy model is alanine dipeptide in vacuo. This system is more challenging to characterize with a standard MD simulation and we will see how we can use an iterative approach to build a constant bias that will help in flattening the underlying free energy surface and thus speed up the sampling. -\note Create a folder for each exercise and use subfolders if you want to run the same simulation with multiple choices for the parameters +\note Create a folder for each exercise and use sub-folders if you want to run the same simulation with multiple choices for the parameters \section trieste-3-ex-1 Exercise 1: converged histogram of the water dimer relative distance @@ -121,7 +121,7 @@ First of all let's start to learn something about the water dimer system by runn In this way we have a 25ns long trajectory that we can use to have a first look at the behavior of the system. Is the sampling of the relative distance between the two water molecules converged? -Use plumed driver to analyse the trajectory and evaluate the quality of the sampling. +Use plumed driver to analyze the trajectory and evaluate the quality of the sampling.
Here you can find a sample `plumed.dat` file that you can use as a template. Whenever you see an highlighted \highlight{FILL} string, this is a string that you should replace. \plumedfile # vim:ft=plumed #compute the distance between the two oxygens d: DISTANCE ATOMS=1,4 #accumulate block histograms hh: HISTOGRAM ARG=d STRIDE=10 GRID_MIN=0 GRID_MAX=4.0 GRID_BIN=200 KERNEL=DISCRETE CLEAR=10000 #and dump them -DUMPGRID GRID=hh FILE=myhist.dat STRIDE=10000 +DUMPGRID GRID=hh FILE=my_histogram.dat STRIDE=10000 # Print the collective variable. PRINT ARG=d STRIDE=10 FILE=distance.dat \endplumedfile @@ -152,8 +152,8 @@ Notice the peak at 0.9 nm, this is the effect of using cut-off for the calculati \section trieste-3-ex-2 Exercise 2: Apply a linear restraint on the same collective variable Now we will try to apply a linear restraint on the relative distance and compare the resulting distribution. The new sampling will reflect the effect of the bias. -Be carefull about the statistics: in the simulation of exercise 1 you were postprocessing a trajectory of 125000 frames accumulating one frame every ten in an histogram and clearing -the histogram after 10000 steps. As a result you had 12 blocks in the form of 11 analysis.* files and a final block named myhist.dat. In the following try to accumulate on the fly +Be careful about the statistics: in the simulation of exercise 1 you were post-processing a trajectory of 125000 frames accumulating one frame every ten in a histogram and clearing +the histogram after 10000 steps. As a result you had 12 blocks in the form of 11 analysis.* files and a final block named my_histogram.dat. In the following try to accumulate on the fly the same amount of statistics. Look into the .mdp file to see how often frames are written in a trajectory. If you write too many analysis.* files (i.e. 100 files plumed will fail with an error).
@@ -173,7 +173,7 @@ lr: RESTRAINT ARG=d KAPPA=0 AT=0 SLOPE=2.5 PRINT __FILL__ \endplumedfile -In a new folder we can run this new simulation this time biasing and analysing the simulation on-the-fly. +In a new folder we can run this new simulation this time biasing and analyzing the simulation on-the-fly. \verbatim > gmx mdrun -s dimer.tpr -plumed plumed.dat @@ -182,9 +182,9 @@ In a new folder we can run this new simulation this time biasing and analysing t The histogram should look different. The effect of a constant bias is that of systematically changing the probability of each conformation by a factor -\f$ \exp(+V_{bias}/k_{B}T) \f$. This means that it is easely possible to recover the unbias distribution at least in the regions of the conformational space that have been througly sampled. In practice the statistical weight of each frame is not 1 anymore but is given by the exponential of the bias. +\f$ \exp(+V_{bias}/k_{B}T) \f$. This means that it is easily possible to recover the unbiased distribution at least in the regions of the conformational space that have been thoroughly sampled. In practice the statistical weight of each frame is not 1 anymore but is given by the exponential of the bias. -In order to recover the unbiased distribution we can post process the simulation using plumed driver to recalculate the bias felt by each frame and store this information to analyse any property. Furthermore plumed can also automatically use the bias to reweight the accumulated histogram. +In order to recover the unbiased distribution we can post-process the simulation using plumed driver to recalculate the bias felt by each frame and store this information to analyze any property. Furthermore plumed can also automatically use the bias to reweight the accumulated histogram.
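The reweighting rule quoted in the tutorial text, a statistical weight of \f$ \exp(+V_{bias}/k_{B}T) \f$ per frame, can be sketched numerically in a few lines. This is an illustration of ours with synthetic data (PLUMED's REWEIGHT_BIAS performs the equivalent operation internally); the linear toy bias and all parameter values are our own choices:

```python
import numpy as np

def reweighted_histogram(cv, bias, kbt, bins, cv_range):
    """Weight each frame by exp(+V_bias/kBT), shifted by the maximum bias
    for numerical stability, and accumulate a normalized weighted histogram."""
    bias = np.asarray(bias, dtype=float)
    w = np.exp((bias - bias.max()) / kbt)
    hist, edges = np.histogram(cv, bins=bins, range=cv_range, weights=w)
    return hist / w.sum(), edges

# toy data: uniform samples of a distance, with a linear bias V = 2.5*d
# that disfavors large distances; reweighting restores weight to large d
rng = np.random.default_rng(0)
d = rng.uniform(0.0, 4.0, 5000)
v = 2.5 * d
h, edges = reweighted_histogram(d, v, kbt=2.494, bins=20, cv_range=(0.0, 4.0))
```

Shifting by the maximum bias before exponentiating is the same trick used in the umbrella-sampling-like post-processing shown later in these tutorials: it avoids overflow without changing the relative weights.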
\plumedfile # vim:ft=plumed @@ -206,13 +206,13 @@ DUMPGRID __FILL__ PRINT ARG=*.* FILE=COLVAR STRIDE=1 \endplumedfile -Be carefull again about the difference in the way statistics is accumulated on-the-fly or for post processising. This is not critical for the result but is important in order to have comparable histograms, that is histograms with comparable noise. Remember to give different names to the new histogram otherwise the one obtained before will be overwritten. +Be careful again about the difference in the way statistics is accumulated on-the-fly or for post processing. This is not critical for the result but is important in order to have comparable histograms, that is histograms with comparable noise. Remember to give different names to the new histogram otherwise the one obtained before will be overwritten. \verbatim > plumed driver --mf_xtc traj_comp.xtc --plumed plumed.dat \endverbatim -\note To run block analysis of both sets of histograms you need to edit the python script because the file name is hardcoded. +\note To run block analysis of both sets of histograms you need to edit the python script because the file name is hard coded. \verbatim > python3 do_block_histo.py > histo-biased.dat @@ -223,7 +223,7 @@ Now the resulting histogram should be comparable to the reference one. \section trieste-3-ex-3 Exercise 3: Apply a quadratic restraint on the same collective variable -Do you expect a different behaviour? This time we can write the plumed input file in such a way to compare directly the biased and unbiased histograms. +Do you expect a different behavior? This time we can write the plumed input file in such a way to compare directly the biased and unbiased histograms. 
\plumedfile # vim:ft=plumed d: DISTANCE ATOMS=1,4 @@ -234,7 +234,7 @@ lr: RESTRAINT ARG=d KAPPA=10 AT=0.5 #accumulate the biased histogram hh: HISTOGRAM ARG=d STRIDE=500 GRID_MIN=0 GRID_MAX=4.0 GRID_BIN=200 KERNEL=DISCRETE CLEAR=500000 #dumpit -DUMPGRID GRID=hh FILE=myhist.dat STRIDE=500000 +DUMPGRID GRID=hh FILE=my_histogram.dat STRIDE=500000 #calculate the weights from the constant bias as: REWEIGHT_BIAS TEMP=298 #accumulate the unbiased histogram @@ -247,7 +247,7 @@ PRINT ARG=d,lr.bias FILE=distance.dat STRIDE=50 The comparison of the two histograms with the former will show the effect of the weak quadratic bias on the simulation. -\note To run block analysis of both sets of histograms you need to edit the python script because the file name is hardcoded. +\note To run block analysis of both sets of histograms you need to edit the python script because the file name is hard coded. \verbatim > python3 do_block_histo.py > histo-biased.dat @@ -255,7 +255,7 @@ The comparison of the two histograms with the former will show the effect of the \endverbatim \section trieste-3-ex-4 Exercise 4: Apply an upper wall on the distance. -In the above cases we have always applied weak biases. Sometimes biases are usefull to prevent the system in reaching some region of the conformational space. In this case instead of using \ref RESTRAINT , we can make use of lower or upper restraints, e.g. \ref LOWER_WALLS and \ref UPPER_WALLS. +In the above cases we have always applied weak biases. Sometimes biases are useful to prevent the system from reaching some region of the conformational space. In this case instead of using \ref RESTRAINT , we can make use of lower or upper restraints, e.g. \ref LOWER_WALLS and \ref UPPER_WALLS. What happen to the histogram when we use walls? @@ -283,12 +283,12 @@ Run it.
> gmx mdrun -s dimer.tpr -plumed plumed.dat \endverbatim -If we have not sampled a region througly enough it is not possible to estimate the histogram in that region even using reweighting (reweighting is not magic!). +If we have not sampled a region thoroughly enough it is not possible to estimate the histogram in that region even using reweighting (reweighting is not magic!). \section trieste-3-ex-5 Exercise 5: Evaluate the free energy and use it as an external restraint -The main issue in sampling rare events is that importance sampling algorithms spend more time in low energy regions and if two low energy regions are separated by a high energy one is unlikely for the sampling algorithm to cross the high energy region and reach the other low energy one. From this point of view an algorithm based on random sampling will work better in crossing the barrier. A particularly efficient sampling can be obtained if one would know the underlying free energy and thus use that to bias the sampling and make the sampling probability uniform in the regions of relavent interest. -In this exercise we will make use of the free-energy estimate along the distance collective variable to bias the sampling of the same collective variable in the dimer simulation. To do so we will make use of a table potential applied using the \ref Bias \ref EXTERNAL. We first need to get a smooth estimate of the free-energy from our fist reference simulations, we will do this by accumulating a histogram with kernel functions, that is continuos function centered at the value of the accumulated point and added accumulated on the discrete represattion of the histogram, see <a href="https://en.wikipedia.org/wiki/Kernel_density_estimation"> Kernel density estimation </a>. 
+The main issue in sampling rare events is that importance sampling algorithms spend more time in low energy regions and if two low energy regions are separated by a high energy one it is unlikely for the sampling algorithm to cross the high energy region and reach the other low energy one. From this point of view an algorithm based on random sampling will work better in crossing the barrier. A particularly efficient sampling can be obtained if one knew the underlying free energy and thus could use that to bias the sampling and make the sampling probability uniform in the regions of relevant interest. +In this exercise we will make use of the free-energy estimate along the distance collective variable to bias the sampling of the same collective variable in the dimer simulation. To do so we will make use of a table potential applied using the \ref Bias \ref EXTERNAL. We first need to get a smooth estimate of the free-energy from our first reference simulations, we will do this by accumulating a histogram with kernel functions, that is, continuous functions centered at the value of the accumulated point and accumulated on the discrete representation of the histogram, see <a href="https://en.wikipedia.org/wiki/Kernel_density_estimation"> Kernel density estimation </a>. \plumedfile # vim:ft=plumed @@ -376,7 +376,7 @@ How do the biased and unbiased histograms look like? In the following we will ap \section trieste-3-ex-6 Exercise 6: Preliminary run with Alanine dipeptide -Alanine dipeptide is characterised by multiple minima separated by relatively high free energy barriers. Here we will explore the conformational space of +Alanine dipeptide is characterized by multiple minima separated by relatively high free energy barriers.
Here we will explore the conformational space of alanine dipeptide using a standard MD simulation, then instead of using the free energy as an external potential we will try to fit the potential using gnuplot and add a bias using an analytical function of a collective variable with \ref MATHEVAL and \ref BIASVALUE . @@ -398,10 +398,10 @@ PRINT ARG=phi,psi FILE=colvar.dat STRIDE=10 \endplumedfile from the colvar file it is clear that we can quickly explore two minima but that the region for positive phi is not accessible. Instead of using the opposite -of the free energy as a table potential here we introduce the function \ref MATHEVAL that allows definining complex analytical functions of collective variables, -and the bias \ref BIASVALUE that allows using any continuos function of the positions as a bias. +of the free energy as a table potential here we introduce the function \ref MATHEVAL that allows defining complex analytical functions of collective variables, +and the bias \ref BIASVALUE that allows using any continuous function of the positions as a bias. -So first we need to fit the opposite of the free energy as a function of phi in the region explored with a periodic function, becasue of the gaussian like look +So first we need to fit the opposite of the free energy as a function of phi in the region explored with a periodic function, because of the gaussian like look of the minima we can fit it using <a href="https://en.wikipedia.org/wiki/Von_Mises_distribution"> the von Mises distribution</a>. In gnuplot \verbatim diff --git a/user-doc/tutorials/a-trieste-4.txt b/user-doc/tutorials/a-trieste-4.txt index 590925323..e9c624bea 100644 --- a/user-doc/tutorials/a-trieste-4.txt +++ b/user-doc/tutorials/a-trieste-4.txt @@ -26,7 +26,7 @@ The \tarball{trieste-4} for this project contains the following files: This tutorial has been tested on a pre-release version of version 2.4. 
However, it should not take advantage of 2.4-only features, thus should also work with version 2.3. -\note We suggest to run the three exercises in three separate directories. For Exercise 3, you will need the output of the first two exercizes, so don't delete it! +\note We suggest to run the three exercises in three separate directories. For Exercise 3, you will need the output of the first two exercises, so don't delete it! \section trieste-4-intro Introduction @@ -41,8 +41,8 @@ Here you can find a brief recap of the metadynamics theory. \hidden{Summary of theory} In metadynamics, an external history-dependent bias potential is constructed in the space of -a few selected degrees of freedom \f$ \vec{s}({q}) \f$, generally called collective variables (CVs) \cite metad. -This potential is built as a sum of Gaussians deposited along the trajectory in the CVs space: +a few selected degrees of freedom \f$ \vec{s}({q})\f$, generally called collective variables (CVs) \cite metad. +This potential is built as a sum of Gaussian kernels deposited along the trajectory in the CVs space: \f[ V(\vec{s},t) = \sum_{ k \tau < t} W(k \tau) @@ -52,7 +52,7 @@ V(\vec{s},t) = \sum_{ k \tau < t} W(k \tau) \f] where \f$ \tau \f$ is the Gaussian deposition stride, -\f$ \sigma_i \f$ the width of the Gaussian for the ith CV, and \f$ W(k \tau) \f$ the +\f$ \sigma_i \f$ the width of the Gaussian for the \f$i\f$th CV, and \f$ W(k \tau) \f$ the height of the Gaussian. The effect of the metadynamics bias potential is to push the system away from local minima into visiting new regions of the phase space. Furthermore, in the long time limit, the bias potential converges to minus the free energy as a function of the CVs: @@ -61,7 +61,7 @@ time limit, the bias potential converges to minus the free energy as a function V(\vec{s},t\rightarrow \infty) = -F(\vec{s}) + C. 
\f] -In standard metadynamics, Gaussians of constant height are added for the entire course of a +In standard metadynamics, Gaussian kernels of constant height are added for the entire course of a simulation. As a result, the system is eventually pushed to explore high free-energy regions and the estimate of the free energy calculated from the bias potential oscillates around the real value. @@ -85,16 +85,16 @@ where \f$ T \f$ is the temperature of the system. In the long time limit, the CVs thus sample an ensemble at a temperature \f$ T+\Delta T \f$ which is higher than the system temperature \f$ T \f$. The parameter \f$ \Delta T \f$ can be chosen to regulate the extent of free-energy exploration: - \f$ \Delta T = 0\f$ corresponds to standard MD, \f$ \Delta T \rightarrow \infty \f$ to standard + \f$ \Delta T = 0\f$ corresponds to standard MD, \f$ \Delta T \rightarrow\infty\f$ to standard metadynamics. In well-tempered metadynamics literature and in PLUMED, you will often encounter -the term "biasfactor" which is the ratio between the temperature of the CVs (\f$ T+\Delta T \f$) +the term "bias factor" which is the ratio between the temperature of the CVs (\f$ T+\Delta T \f$) and the system temperature (\f$ T \f$): \f[ \gamma = \frac{T+\Delta T}{T}. \f] -The biasfactor should thus be carefully chosen in order for the relevant free-energy barriers to be crossed +The bias factor should thus be carefully chosen in order for the relevant free-energy barriers to be crossed efficiently in the time scale of the simulation. 
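For readers who want to see the well-tempered rescaling in action, here is a small numerical sketch of ours (arbitrary grid, parameters, and trajectory, not taken from the tutorial): each new Gaussian's height is scaled by \f$ \exp(-V(s,t)/k_{B}\Delta T) \f$ with \f$ \Delta T = (\gamma - 1) T \f$, so the bias grows ever more slowly where it is already large.

```python
import numpy as np

def wt_metad_bias(cv_traj, grid, w0, sigma, kbt, bias_factor):
    """Accumulate a 1D well-tempered metadynamics bias on a grid.
    Each new Gaussian height is w0 * exp(-V(s)/(kB*DeltaT)),
    with DeltaT = (bias_factor - 1) * T."""
    kb_dt = kbt * (bias_factor - 1.0)
    v = np.zeros_like(grid)
    heights = []
    for s in cv_traj:
        # interpolate the current bias at the CV value to rescale the height
        height = w0 * np.exp(-np.interp(s, grid, v) / kb_dt)
        heights.append(height)
        v += height * np.exp(-0.5 * ((grid - s) / sigma) ** 2)
    return v, heights

# arbitrary illustrative parameters (kbt in kJ/mol at ~300 K)
grid = np.linspace(-np.pi, np.pi, 100)
rng = np.random.default_rng(2)
bias, heights = wt_metad_bias(rng.uniform(-1.0, 1.0, 200), grid,
                              w0=1.2, sigma=0.3, kbt=2.494, bias_factor=6.0)
```

Depositing repeatedly in the same region makes the decreasing-height behavior, visible in the HILLS file of a real run, easy to reproduce.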
Additional information can be found in the several review papers on metadynamics @@ -119,7 +119,7 @@ It is conventional use to characterize the two states in terms of Ramachandran d \subsection trieste-4-ex-1a Exercise 1a: setup and run -In this excercise we will setup and perform a well-tempered metadynamics run using the backbone dihedral \f$ \phi \f$ +In this exercise we will set up and perform a well-tempered metadynamics run using the backbone dihedral \f$ \phi \f$ as collective variable. During the calculation, we will also monitor the behavior of the other backbone dihedral \f$ \psi \f$. Here you can find a sample `plumed.dat` file that you can use as a template. @@ -133,7 +133,7 @@ psi: TORSION ATOMS=__FILL__ # Activate well-tempered metadynamics in phi metad: __FILL__ ARG=__FILL__ ... -# Deposit a Gaussian every 500 time steps, with initial height equal to 1.2 kJoule/mol +# Deposit a Gaussian every 500 time steps, with initial height equal to 1.2 kJ/mol PACE=500 HEIGHT=1.2 # the bias factor should be wisely chosen BIASFACTOR=__FILL__ @@ -151,7 +151,7 @@ The syntax for the command \ref METAD is simple. The directive is followed by a keyword ARG followed by the labels of the CVs on which the metadynamics potential will act. The keyword PACE determines the stride of Gaussian deposition in number of time steps, -while the keyword HEIGHT specifies the height of the Gaussian in kJoule/mol. For each CVs, one has +while the keyword HEIGHT specifies the height of the Gaussian in kJ/mol. For each CV, one has to specify the width of the Gaussian by using the keyword SIGMA. Gaussian will be written to the file indicated by the keyword FILE. @@ -186,7 +186,7 @@ by the metadynamics bias potential to visit the other local minimum. As the simu the bias potential fills the underlying free-energy landscape, and the system is able to diffuse in the entire phase space. -The HILLS file contains a list of the Gaussians deposited along the simulation.
+The HILLS file contains a list of the Gaussian kernels deposited along the simulation. If we give a look at the header of this file, we can find relevant information about its content: \verbatim @@ -197,7 +197,7 @@ If we give a look at the header of this file, we can find relevant information a \endverbatim The line starting with FIELDS tells us what is displayed in the various columns of the HILLS file: -the simulation time, the instantaneous value of \f$ \phi \f$, the Gaussian width and height, and the biasfactor. +the simulation time, the instantaneous value of \f$ \phi \f$, the Gaussian width and height, and the bias factor. We can use the HILLS file to visualize the decrease of the Gaussian height during the simulation, according to the well-tempered recipe: @@ -205,14 +205,14 @@ according to the well-tempered recipe: \image html munster-metad-phihills.png "Time evolution of the Gaussian height." If we look carefully at the scale of the y-axis, we will notice that in the beginning the value -of the Gaussian height is higher than the initial height specified in the input file, which should be 1.2 kJoule/mol. -In fact, this column reports the height of the Gaussian rescaled by the pre-factor that +of the Gaussian height is higher than the initial height specified in the input file, which should be 1.2 kJ/mol. +In fact, this column reports the height of the Gaussian scaled by the pre-factor that in well-tempered metadynamics relates the bias potential to the free energy. \subsection trieste-4-ex-1b Exercise 1b: estimating the free energy One can estimate the free energy as a function of the metadynamics CVs directly from the metadynamics -bias potential. In order to do so, the utility \ref sum_hills should be used to sum the Gaussians +bias potential. In order to do so, the utility \ref sum_hills should be used to sum the Gaussian kernels deposited during the simulation and stored in the HILLS file. 
To calculate the free energy as a function of \f$ \phi \f$, it is sufficient to use the following command line: @@ -239,7 +239,7 @@ The result should look like this: To assess the convergence of a metadynamics simulation, one can calculate the estimate of the free energy as a function of simulation time. At convergence, the reconstructed profiles should be similar. -The option \-\-stride should be used to give an estimate of the free energy every N Gaussians deposited, and +The option \-\-stride should be used to give an estimate of the free energy every N Gaussian kernels deposited, and the option \-\-mintozero can be used to align the profiles by setting the global minimum to zero. If we use the following command line: @@ -247,11 +247,11 @@ If we use the following command line: plumed sum_hills --hills HILLS --stride 100 --mintozero \endverbatim -one free energy is calculated every 100 Gaussians deposited, and the global minimum is set to zero in all profiles. +one free energy is calculated every 100 Gaussian kernels deposited, and the global minimum is set to zero in all profiles. The resulting plot should look like the following: \anchor trieste-4-metad-phifest-fig -\image html munster-metad-phifest.png "Estimates of the free energy as a function of the dihedral phi calculated every 100 Gaussians deposited." +\image html munster-metad-phifest.png "Estimates of the free energy as a function of the dihedral phi calculated every 100 Gaussian kernels deposited." These two qualitative observations: - the system is diffusing efficiently in the collective variable space (Figure \ref trieste-4-phi-fig) @@ -306,7 +306,7 @@ The procedure can be used to estimate the error in the free-energy as a function collective variable(s) used in the metadynamics simulation, or for any other function of the coordinates of the system. 
-First, we will calculate the "unbiasing" weights associated to each conformation sampled +First, we will calculate the "un-biasing" weights associated to each conformation sampled during the metadynamics run. In order to calculate these weights, we can use either of these two approaches: @@ -347,7 +347,7 @@ FILE=HILLS GRID_MIN=-pi GRID_MAX=pi PRINT ARG=__FILL__ FILE=COLVAR STRIDE=1 \endplumedfile -Once your `plumed.dat` file is complete, you can use the \ref driver utility to back-calculated the quantites +Once your `plumed.dat` file is complete, you can use the \ref driver utility to back-calculate the quantities needed for the error calculation \verbatim plumed driver --plumed plumed.dat --mf_xtc traj_comp.xtc @@ -370,7 +370,7 @@ The COLVAR file produced by \ref driver should look like this: Please check your `plumed.dat` file if your output looks different! Once the final bias has been evaluated on the entire metadynamics simulations, we can -easily calculate the "unbiasing weights" using the umbrella-sampling-like approach: +easily calculate the "un-biasing weights" using the umbrella-sampling-like approach: \verbatim # find maximum value of bias @@ -404,7 +404,7 @@ for i in `seq 1 10 1000`; do python3 do_block_fes.py phi.weight 1 -3.141593 3.01 \endverbatim For each value of block length `N`, you will obtain a separate `fes.N.dat` file, containing the value -of the \f$ \phi \f$ variable on a grid, the average free-energy, and the associated error (in Kjoule/mol): +of the \f$ \phi \f$ variable on a grid, the average free-energy, and the associated error (in kJ/mol): \verbatim -3.141593 23.184653 0.080659 diff --git a/user-doc/tutorials/a-trieste-5.txt b/user-doc/tutorials/a-trieste-5.txt index d1bec5d24..445f3b94f 100644 --- a/user-doc/tutorials/a-trieste-5.txt +++ b/user-doc/tutorials/a-trieste-5.txt @@ -88,8 +88,8 @@ all the files read by PLUMED, with the exception of PDB files where the suffix i The last comment implies that if you need to process your
trajectories with _different_ input files you might do it creating files named `plumed.0.dat`, `plumed.1.dat`, and -`plumed.2.dat`. This is correct, and this was the ony way to do multi-replica -simulatons with different input files up to PLUMED 2.3. However, +`plumed.2.dat`. This is correct, and this was the only way to do multi-replica +simulations with different input files up to PLUMED 2.3. However, in the following we will see how to obtain the same effect with a new special command that has been introduced in PLUMED 2.4. @@ -194,7 +194,7 @@ RESTRAINT ... ... \endplumedfile -In short, whenever there are keywords that should vary across replicas, you should set them usign the `@replicas:` keyword. +In short, whenever there are keywords that should vary across replicas, you should set them using the `@replicas:` keyword. As mentioned above, you can always use the old syntax with separate input file, and this is recommended when the number of keywords that are different is large. @@ -373,7 +373,7 @@ The actual numbers might very depending on the length of your simulation. Now you can use the provided python script to analyze the `ALLBIAS` file. `SCRIPTS/wham.py` should be provided three arguments: name of the file with bias potentials, -number of replicas (that is the number of columns) and value of kbT (2.5 in kj/mol). +number of replicas (that is the number of columns) and value of k\f$_B\f$T (2.5 in kJ/mol). \verbatim # use the python script to compute weights @@ -422,12 +422,12 @@ values close to these ones that tells you that the first minimum has a population of roughly 97%. \note -This approach can be used to reweight any replica exchange simulation, irrespectively of how +This approach can be used to reweight any replica exchange simulation, irrespective of how different were the employed bias potentials. This includes bias-exchange metadynamics \cite piana when using well-tempered metadynamics \cite Barducci:2008.
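The weight calculation that `SCRIPTS/wham.py` performs on the `ALLBIAS` file can be summarized with a minimal binless-WHAM iteration. The sketch below is ours, written under the simplifying assumption of equal numbers of frames per replica; the actual script distributed with the tutorial may differ in details:

```python
import numpy as np

def wham_weights(bias, kbt, n_iter=200):
    """bias[i][j]: bias potential of replica j evaluated on frame i (kJ/mol).
    Returns one normalized statistical weight per frame (binless WHAM,
    equal frame counts per replica assumed)."""
    b = np.asarray(bias, dtype=float) / kbt        # dimensionless biases
    f = np.zeros(b.shape[1])                       # free energy per replica
    logw = np.zeros(b.shape[0])
    for _ in range(n_iter):
        # frame weights given the current replica free energies
        logw = -np.logaddexp.reduce(f - b, axis=1)
        logw -= np.logaddexp.reduce(logw)          # normalize in log space
        # update replica free energies from the weighted frames
        f = -np.logaddexp.reduce(logw[:, None] - b, axis=0)
    return np.exp(logw)

# sanity check: if every replica feels the same bias on every frame,
# all frames must end up with the same weight
w = wham_weights(np.ones((100, 3)), kbt=2.5)
```

Working in log space with `np.logaddexp.reduce` plays the same role as the bias-maximum subtraction in the shell commands above: it keeps the exponentials from overflowing.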
Notice that for non-well-tempered bias exchange metadynamics
one can use a very similar approach based on binning that is described in \cite marinelli-trpc09.
-In addition, the approach illustrated here can be used direcly in
+In addition, the approach illustrated here can be used directly in
bias-exchange metadynamics simulations with additional restraints which are different in different replicas
(as in \cite cunha2017unraveling) as well as other flavors of biased replica exchange simulations
(e.g. \cite curuksu2009enhanced \cite gil2015enhanced).
@@ -459,7 +459,7 @@ Compute as before the relative population of the two minima.
\endverbatim
Which is the result? What can you learn from this example?
-If you did things correclty, here you should find a population close to 2/3 for the first minimum and 1/3 for the second one.
+If you did things correctly, here you should find a population close to 2/3 for the first minimum and 1/3 for the second one.
Notice that this population just reflects the way we initialized the simulations (2 of them in one minimum and 1 in the other minimum)
and does not take into account which is the relative stability of the two minima according to the real potential energy function.
In other words, we are just obtaining the relative population that we used to initialize the simulation.
diff --git a/user-doc/tutorials/a-trieste-6.txt b/user-doc/tutorials/a-trieste-6.txt
index b5066a532..2ef9d4dd7 100644
--- a/user-doc/tutorials/a-trieste-6.txt
+++ b/user-doc/tutorials/a-trieste-6.txt
@@ -38,7 +38,7 @@ The exercise are of increasing difficulties, inputs are partially provided for t
\section trieste-6-ex-1 Exercise 1: analysis of the BARD1 complex simulation
-The BARD1 complex is a heterodimer composed by two domains of 112 and 117 residues each.
+The BARD1 complex is a hetero-dimer composed of two domains of 112 and 117 residues each.
The system is represented at coarse-grained level using the MARTINI force field.
\anchor trieste-6-bard1
@@ -64,7 +64,7 @@ However, we encourage all the users to experiment at least with the following CV
1) \ref RMSD and/or \ref DRMSD from the native state
The following line can be added to the `plumed.dat` file to calculate the \ref RMSD on the beads
-representing the backbone aminoacids:
+representing the backbone amino acids:
\plumedfile
rmsd: RMSD REFERENCE=bard1_rmsd.pdb TYPE=OPTIMAL
@@ -112,7 +112,7 @@ dih: TORSION ATOMS=__FILL__
\section trieste-6-ex-2 Exercise 2: analysis of the cmyc-urea simulation
-Cmyc is a small disordered peptide made of 11 aminoacid. In solution, cmyc adopts
+Cmyc is a small disordered peptide made of 11 amino acids. In solution, cmyc adopts
a variety of different, but equally populated conformations.
In the TARBALL of this exercise, we provide a long MD simulation of cmyc
in presence of a single molecule of urea.
@@ -127,7 +127,7 @@ The users are expected to:
- characterize the conformational ensemble of cmyc by calculating free-energies as a function of different CVs.
- calculate the fraction of bound and unbound molecules of urea by defining suitable CVs to measure the position of urea relative to cmyc.
-- find the cmyc aminoacids that bind urea the most and the least.
+- find the cmyc amino acids that bind urea the most and the least.
- calculate the ensemble averages of different experimental CVs.
\warning Be careful that the original trajectory might be broken by PBC!
@@ -155,7 +155,7 @@ urea: GROUP ATOMS=192,193,194,197
mindist: DISTANCES GROUPA=cmyc GROUPB=urea MIN={BETA=50.}
\endplumedfile
-For estimating the cmyc aminoacid that bind the most and the least urea, we leave the users the choice of the best
+For estimating the cmyc amino acids that bind urea the most and the least, we leave the users the choice of the best
strategy.
For the calculation of ensemble averages of experimental CVs, we suggest to use:
diff --git a/user-doc/tutorials/belfast-1.txt b/user-doc/tutorials/belfast-1.txt
index 482f11a60..77d71bad1 100644
--- a/user-doc/tutorials/belfast-1.txt
+++ b/user-doc/tutorials/belfast-1.txt
@@ -4,7 +4,7 @@
\section belfast-1-aims Aims
The aim of this tutorial is to introduce the users to the plumed syntax. We will go through the writing of
-simple collective variable and we will use them to analyse a trajectory in terms of probabilty distributions
+simple collective variables and we will use them to analyze a trajectory in terms of probability distributions
and free energy.
\section belfast-1-lo Learning Outcomes
@@ -12,14 +12,14 @@ and free energy.
Once this tutorial is completed students will:
- Know how to write a simple plumed input file
-- Know how to analyse a trajectory using plumed
+- Know how to analyze a trajectory using plumed
\section belfast-1-resources Resources
The <a href="tutorial-resources/belfast-1.tar.gz" download="belfast-1.tar.gz"> tarball </a> for this project contains the following files:
- trajectory-short.xyz : a (short) trajectory for a 16 residue protein in xyz format. All calculations with plumed driver use this trajectory.
-- template.pdb : a single frame from the trajectory that can be used in conjuction with the \ref MOLINFO command
+- template.pdb : a single frame from the trajectory that can be used in conjunction with the \ref MOLINFO command
\section belfast-1-instructions Instructions
@@ -31,14 +31,14 @@ plumed --help
\endverbatim
some of the listed options report about the plumed available functionalities, other can be used to tell plumed to do something:
-from analysing a trajectory to patch the source code of a MD code and so on.
+from analyzing a trajectory to patching the source code of an MD code and so on.
All the commands have further options that can
be seen using plumed command --help, i.e.:
\verbatim
plumed driver --help
\endverbatim
-In the following we are going to see how to write an input file for plumed2 that can be used to analyse a trajectory.
+In the following we are going to see how to write an input file for plumed2 that can be used to analyze a trajectory.
\subsection belfast-1-units A note on units
By default the PLUMED inputs and outputs quantities in the following units:
@@ -54,7 +54,7 @@ If you want to change these units you can do this using the \ref UNITS keyword.
A typical input file for PLUMED input is composed by specification of one or more CVs, the printout frequency and a termination line.
Comments are denoted with a # and the termination of the input for PLUMED is marked with the keyword ENDPLUMED. Whatever it follows is ignored by PLUMED. You can introduce blank lines. They are not interpreted by PLUMED.
-In the following input we will analyse the \ref DISTANCE between the two terminal carbons of a 16 residues peptide, and we will \ref PRINT the results
+In the following input we will analyze the \ref DISTANCE between the two terminal carbons of a 16-residue peptide, and we will \ref PRINT the results
in file named COLVAR.
\verbatim
@@ -69,7 +69,7 @@ ENDPLUMED
here I can write what I want it won't be read.
\endverbatim
-Now we can use this simple input file to analyse the trajectory included in the RESOURCES:
+Now we can use this simple input file to analyze the trajectory included in the RESOURCES:
\verbatim
plumed driver --plumed plumed.dat --ixyz trajectory-short.xyz --length-units 0.1
@@ -139,8 +139,8 @@ ranges of atoms can be defined with a stride which can also be negative:
3. ATOMS=from,to:by (i.e.: 251-256:2)
4.
ATOMS=to,from:-by (i.e.: 256-251:-2)
-Now by plotting the content of the COLVAR file we can compare the behaviour in this trajectory of both the terminal carbons
-as well as of the centre of masses of the terminal residues.
+Now by plotting the content of the COLVAR file we can compare the behavior in this trajectory of both the terminal carbons
+as well as of the centers of mass of the terminal residues.
\verbatim
gnuplot
@@ -153,7 +153,7 @@ vmd template.pdb trajectory-short.xyz
\endverbatim
Virtual atoms can be used in place of standard atoms everywhere an atom can be given as input, they can also be used together
-with standard atoms. So for example we can analyse the \ref TORSION angle for a set of Virtual and Standard atoms:
+with standard atoms. So for example we can analyze the \ref TORSION angle for a set of Virtual and Standard atoms:
\verbatim
first: CENTER ATOMS=1-6
@@ -166,7 +166,7 @@ ENDPLUMED
\endverbatim
The above CV don't look smart to learn something about the system we are looking at. In principle CV are used to reduce the complexity
-of a system by looking at a small number of properties that could be enough to rationalise its behaviour.
+of a system by looking at a small number of properties that could be enough to rationalize its behavior.
Now try to write a collective variable that measures the Radius of Gyration of the system: \ref GYRATION.
@@ -185,7 +185,7 @@ Now 'ca' is not a virtual atom but a simple list of atoms.
Sometimes it can be useful to calculate properties of many similar collective variables at the same time, for example
one can be interested in calculating the properties of the distances between a group of atoms, or properties linked to
the distribution of the dihedral angles of a chain and so on. In PLUMED2 this kind of collective variables fall under
the name of MULTICOLVAR (cf. \ref mcolv.)
-Here we are going to analyse the distances between CA carbons along the chain:
+Here we are going to analyze the distances between CA carbons along the chain:
\verbatim
ca: GROUP ATOMS=9,16,31,55,69,90,102,114,124,138,160,174,194,208,224,238
@@ -202,7 +202,7 @@ we have defined four collective variables that are calculated using the distance
collective variables are stored as components of the defined action 'dd': dd.mean, dd.min, dd.max, dd.moment-2.
-The infrastracture of multicolvar has been used to develop many PLUMED2 collective variables as
+The infrastructure of multicolvar has been used to develop many PLUMED2 collective variables as
for example the set of Secondary Structure CVs (\ref ANTIBETARMSD, \ref PARABETARMSD and \ref ALPHARMSD).
\verbatim
@@ -228,7 +228,7 @@ with a probability
\f]
where \f$ q \f$ are the microscopic coordinates and \f$ k_B \f$ is the Boltzmann constant.
-It is possible to analyse the above probabilty as a function of one or more collective variable \f$ s(q)\f$:
+It is possible to analyze the above probability as a function of one or more collective variables \f$ s(q)\f$:
\f[
P(s)\propto \int dq e^{-\frac{U(q)}{kb_BT}} \delta(s-s(q))
\f]
@@ -246,7 +246,7 @@ system along that CV. Estimating the probability distribution of the conformatio
'sampling'. In order to estimate a probability distribution one needs to make \ref HISTOGRAM from the calculated CVs.
-PLUMED2 includes the possibility of histogramming data both on the fly as well as a posteriori as we
+PLUMED2 includes the possibility of constructing a histogram from data both on the fly as well as a posteriori as we
are going to do now.
\verbatim
@@ -279,8 +279,8 @@ in PLUMED.
The above input tells PLUMED to accumulate the two collective variables on a GRID. In addition the probability can
be converted to a free-energy using the \ref CONVERT_TO_FES method and then use \ref DUMPGRID to write it.
-Histograms can be accumulated in a smoother way by using a KERNEL function, a kernel is a normalised function,
-for example a normalised gaussian is the default kernel in PLUMED, that is added to the histogram centered in the position of the data.
+Histograms can be accumulated in a smoother way by using a KERNEL function, a kernel is a normalized function,
+for example a normalized gaussian is the default kernel in PLUMED, that is added to the histogram centered in the position of the data.
Estimating a probability density using kernels can in principle give more accurate results, on the other hand
in addition to the choice of the binning one has to choose a parameter that is the WIDTH of the kernel function.
As a rule of thumb: the grid spacing should be smaller (i.e. one half or less) than the BANDWIDTH
@@ -301,8 +301,8 @@ DUMPGRID GRID=hh FILE=histo
ENDPLUMED
\endverbatim
-If you have time less at the end of the session read the manual and look for alternative collective variables to analyse the
-trajectory. Furthemore try to play with the \ref HISTOGRAM parameters to see the effect of using KERNEL in analysing data.
+If you have time left at the end of the session read the manual and look for alternative collective variables to analyze the
+trajectory. Furthermore try to play with the \ref HISTOGRAM parameters to see the effect of using KERNEL in analyzing data.
*/
diff --git a/user-doc/tutorials/belfast-2.txt b/user-doc/tutorials/belfast-2.txt
index 40e6aff5f..08bcb7967 100644
--- a/user-doc/tutorials/belfast-2.txt
+++ b/user-doc/tutorials/belfast-2.txt
@@ -26,17 +26,17 @@ This is a complex transition and it is hard to tell which is the order parameter
\image html belfast-2-cdk.png "CDK2 conformational change, PDB code 2C5X and 2C5Y. The upper loop is radically different. Which torsion or distance is important?"
One could identify a distance that can distinguish open from close but
-- the plasicity of the loop is such that the same distance can correspond to an almost closed loop and almost open loop. One would like to completely divide these two situations with
+- the plasticity of the loop is such that the same distance can correspond to an almost closed loop and almost open loop. One would like to completely divide these two situations with
something which is discriminating what intuitively one would think as open and closed
-- the transition state is an important point where one would like to see a certain crucial interaction forming/breaking so to better explain what is really happening. If you capture then hypothetically you would be able to say what is dictating the rate of this conformational transition. A generic distance is a very hybrid measure that is unlikely to capture a salt-bridge formation and a concerted change of many dihedral change or desolvation contribution which are happening while the transition is happening. All these things are potentially important in such transition but none of them is explaining the whole thing.
+- the transition state is an important point where one would like to see a certain crucial interaction forming/breaking so to better explain what is really happening. If you capture it, then hypothetically you would be able to say what is dictating the rate of this conformational transition. A generic distance is a very hybrid measure that is unlikely to capture a salt-bridge formation and a concerted change of many dihedral angles or the change in solvation state that are happening while the transition is happening. All these things are potentially important in such transition but none of them is explaining the whole thing.
So basically in these cases you have to deal with an intrinsic multidimensional collective variable where you would need many dimensions.
-How would you visualize a 10 dimensional CV where you use many distances, coordinations and dihedrals (ouch, they're periodic too!) ?
+How would you visualize a 10 dimensional CV where you use many distances, coordination numbers and dihedral angles (ouch, they're periodic too!) ?
-Another typical case is the docking of a small molecule in a protein cleft or gorge, which is the mechanism of drug action. This involves specific conformational transition from both the small molecule and the protein as the small molecule approaches the protein cavity. This also might imply a specific desolvation pattern.
+Another typical case is the docking of a small molecule in a protein cleft or gorge, which is the mechanism of drug action. This involves specific conformational transition from both the small molecule and the protein as the small molecule approaches the protein cavity. This also might imply a specific change in the behavior when the solvation state changes.
-Other typical examples are chemical reactions. Nucleophiloic attacks typically happen with a role from the solvent (see some nice paper from Nobel-prize winner Arieh Warshel)
-and sizeable geometric distortions of the neighboring groups.
+Other typical examples are chemical reactions. Nucleophilic attacks typically happen with a role from the solvent (see some nice papers from Nobel-prize winner Arieh Warshel)
+and sizable geometric distortions of the neighboring groups.
\section belfast-2-pcvs-general Path collective variables
In a nutshell, your reaction might be very complex and happening in many degree
\anchor belfast-2-ab-fig
\image html belfast-2-ab.png "Given the reactant and the product, tell if another state is closer to any of the two with a continuous parameter"
-For example, in Fig. \ref belfast-2-ab-fig, you see a typical chemical reaction (hydrolysis of methylphosphate) with the two end-points denoted by A and B.
-If you are given a third point, just by looking at it, you might find that this is more resemblant to the reactant than the product, so, hypothetically, if you would intuitively +For example, in Fig. \ref belfast-2-ab-fig, you see a typical chemical reaction (hydrolysis of methyl phosphate) with the two end-points denoted by A and B. +If you are given a third point, just by looking at it, you might find that this resembles the reactant more than the product, so, hypothetically, if you would intuitively give a parameter that would be 1 for a configuration in the A state and 2 for a configuration in the B state, you probably would give it something like 1.3, right? Path collective variables are the extension to this concept in the case you have many conformation that describe your path, and therefore, instead of an index that goes from 1 to 2 you have an index that goes from 1 to \f$N\f$ , where \f$N\f$ is the number of conformation that you use in input to describe your path. @@ -62,11 +62,11 @@ S(X)=\frac{\sum_{i=1}^{N} i\ \exp^{-\lambda \vert X-X_i \vert }}{ \sum_{i=1}^{N} where in \ref belfast-2-s-eq the \f$ \vert X-X_i \vert \f$ represents a distance between one configuration \f$ X \f$ which is analyzed and another from the set that compose the path \f$ X_i \f$. The parameter \f$ \lambda \f$ is a positive value that is tuned in a way explained later. -here are a number of things to note to make you think that this is exactely what you want. +here are a number of things to note to make you think that this is exactly what you want. - The negative exponential function is something that is 1 whenever the value at the exponent is zero, and is progressively smaller when the value is larger than zero (trivially, the case with the value at the exponent larger than zero never occurs since lambda is a positive quantity and the distance is by definition positive). 
- Whenever you sit exactly on a specific images \f$ X_j \f$ then all the other terms in the sum disappear (if \f$ \lambda \f$ is large enough) and only the value \f$ j \f$ survives returning exactely \f$ S(X)=j \f$.
-In order to provide a value which is continuous, the parameter \f$ \lambda \f$ should be correclty tuned.
+In order to provide a value which is continuous, the parameter \f$ \lambda \f$ should be correctly tuned.
As a rule of thumb I use the following formula
\f[
@@ -82,9 +82,9 @@ So, at this point it is better to understand what is meant with "distance" since
The way we refer here is by using mean square deviation after optimal alignment. This means that at each step in which the analysis is performed, a number N of optimal alignments is performed. Namely what is calculated is \f$ \vert X-X_i \vert = d(X,X_i)^2 \f$ where \f$ d(X,X_i) \f$ is the RMSD as defined in what you see here \ref RMSD.
-Using the MSD instead of RMSD is sometimes more convenient and more stable (you do not have a denominator that gies to zero in the derivatives when biasing.
+Using the MSD instead of RMSD is sometimes more convenient and more stable (you do not have a denominator that goes to zero in the derivatives when biasing).
-Anyway this is a matter of choice. Potentially one could equally employ other metrics like a set of coordinations (this was done in the past), and then you would avoid the problem of rototranslations (well, which is not a problem since you have it already in plumed) but for some applications that might become appealing.
+Anyway this is a matter of choice. Potentially one could equally employ other metrics like a set of coordination numbers (this was done in the past), and then you would avoid the problem of rototranslations (well, which is not a problem since you have it already in plumed) but for some applications that might become appealing.
So in path collective variables (and in all the dimensional reduction based collective variables) the problem is
converted from the side of choosing the collective variable in choosing the right way to calculate distances, also called "metrics".
The discussion of this issue is well beyond the topic of this tutorial, so we can move forward in how to tell plumed
to calculate the progress along the path whenever the MSD after optimal alignment is used as distance.
@@ -96,7 +96,7 @@ PRINT ARG=p1.sss,p1.zzz STRIDE=100 FILE=colvar FMT=%8.4f
Note that reference contains a set of PDB, appended one after the other, with a END field. Note that there is no need to place all the atoms of the system in the PDB reference file you provide. Just put the atoms that you think might be needed. You can leave out big domains, solvent and ions if you think that is not important for your use.
-Additionally, note that the measure units of LAMBDA are in the units of the code. In gromacs they are in nm^2 while NAMD is Ang^2.
+Additionally, note that the measure units of LAMBDA are in the units of the code. In gromacs they are in nm^2 while in NAMD they are in Angstrom^2.
\ref PATHMSD produces two arguments that can be printed or used in other ActionWithArguments. One is the progress along the path of \ref belfast-2-s-eq, the other is the distance from the closest point along the path, which is denoted with the zzz component. This is defined as
@@ -124,19 +124,19 @@ This case is exemplified in Fig. \ref belfast-2-ab-sz-nowhere-fig
\section belfast-2-pcvs-topo A note on the path topology
A truly important point is that if you get a trajectory from some form of accelerated dynamics (e.g. simply by heating) this cannot simply be converted into a path.
-Since it is likely that your trajectory is going stochastically back and forth (not in the case of SMD or US, discussed later), your trajectory might be not topologically
+Since it is likely that your trajectory is going back and forth randomly (not in the case of SMD or US, discussed later), your trajectory might not be topologically
suitable. To understand that, suppose you simply collect a reactive trajectory of 100 ps into the reference path you give to the \ref PATHMSD and simply you assign the frame of 1 ps to index 1 (first frame occurring in the reference file provided to \ref PATHMSD), the frame of 2 ps to index 2 and so on : it might be that you have two points which are really similar but one is associated to step,
-say 5 and the other is associated with frame 12. When you analyse the same trajectory, when you are sitting on any of those points then the
-calculation of S will be an average like \f$ S(X)=(5+12)/2=8.5 \f$ which is an average of the two indexes and is completely misleading sinec it let you think that you are moving
+say 5 and the other is associated with frame 12. When you analyze the same trajectory, when you are sitting on any of those points then the
+calculation of S will be an average like \f$ S(X)=(5+12)/2=8.5 \f$ which is an average of the two indexes and is completely misleading since it lets you think that you are moving
between point 8 and 9, which is not the case.
So this evidences that your reference frames should be "topologically consecutive". This means that frame 1 should be the closest to frame 2 and all the other frames should be farther apart. Similarly frame 2 should be equally close (in an \ref RMSD sense) to 1 and 3 while all the others should be farther apart. Same for frame 3: this should be closest to frame 2 and 4 and farther apart from all the others and so on.
This is equivalent to calculate an "RMSD matrix" which can be conveniently done in vmd (this is a good exercise for a number of reasons) with RMSD Trajectory tools, by choosing different reference system along the set of reference frames.
\anchor belfast-2-good-matrix-fig
-\image html belfast-2-good-matrix.png "A good matrix for path variables has a gullwing shape, which means that each frame is closest to its neighbor and all the other are far apart"
+\image html belfast-2-good-matrix.png "A good matrix for path variables has a gull-wing shape, which means that each frame is closest to its neighbor and all the others are far apart"
-This is shown in Fig. \ref belfast-2-good-matrix-fig where the matrix has a typical gullwing shape.
+This is shown in Fig. \ref belfast-2-good-matrix-fig where the matrix has a typical gull-wing shape.
On the contrary, whenever you extract the frames from a pdb that you produced via free MD or some biased methods (SMD or Targeted MD for example) then your frame-to-frame distance is rather inhomogeneous and looks something like
@@ -153,9 +153,9 @@ the crucial thing is the distance between neighbors.
As a matter of fact, this is not much important in the analysis but is truly crucial in the bias. When biasing a simulation, you generally want to introduce a force that push your system somewhere. In particular, when you add a bias which is based on a path collective variable, most likely you want that your system goes back and forth along your path.
-The bias is generally applied by an additional term in the hamiltonian, this can be a spring term for Umbrella Sampling, a Gaussian function for Metadynamics or whatever term which
+The bias is generally applied by an additional term in the Hamiltonian, this can be a spring term for Umbrella Sampling, a Gaussian function for Metadynamics or whatever term which
is a function of the collective variable \f$ s \f$.
-Therefore the Hamiltonian \f$ H (X) \f$ where \f$ X \f$ is the point of in the configurational phase space where your system is takes the following form
+Therefore the Hamiltonian \f$ H (X) \f$ where \f$ X \f$ is the point in the phase space where your system is, takes the following form
\f[
H'(X)=H(X)+U(S(X))
\f]
@@ -183,13 +183,13 @@ It is interesting to note that whenever the \f$ \lambda \f$ is too small the for
A very common question that comes is the following: "I have my reaction or a model of it. how many frames do I need to properly define a path collective variable?"
This is a very important point that requires a bit of thinking. It all depends on the limiting scale in your reaction. For example, if in your process you have a torsion, as the smallest event that
-you want to capture with path collective variable, then it is important that you mimick that torsion in the path and that this does not contain simply the initial and final point but also some intermediate. Similarly, if you have a concerted bond breaking, it might be that all takes place in the range of an Angstrom or so. In this case you should have intermediate frames that cover the sub-Angstrom scale. If you have both in the same path, then the smallest dominates and you have to mimick also the torsion with sub-Angstrom accuracy.
+you want to capture with a path collective variable, then it is important that you mimic that torsion in the path and that this does not contain simply the initial and final point but also some intermediate. Similarly, if you have a concerted bond breaking, it might be that all takes place in the range of an Angstrom or so. In this case you should have intermediate frames that cover the sub-Angstrom scale. If you have both in the same path, then the smallest dominates and you have to mimic also the torsion with sub-Angstrom accuracy.
\section belfast-2-pcvs-tricks Some tricks of the trade: the neighbors list.
If it happens that you have a very complex and detailed path to use, say that it contains 100 frames with 200 atoms each, then the calculation of a 100 alignment is required every time
-you need the CV. This can be quite expensive but you can use a trick. If your trajectory is continuous and you are sure that your trajectory does not show jumps where your system suddedly move from the reactant to the product, then you can use a so-called neighbor list. The plumed input shown before then becomes
+you need the CV. This can be quite expensive but you can use a trick. If your trajectory is continuous and you are sure that your trajectory does not show jumps where your system suddenly moves from the reactant to the product, then you can use a so-called neighbor list. The plumed input shown before then becomes
\verbatim
p1: PATHMSD REFERENCE=all.pdb LAMBDA=50.0 NEIGH_STRIDE=100 NEIGH_SIZE=10
@@ -210,13 +210,13 @@ In Fig. \ref belfast-2-ala-fig its structure is shown.
\anchor belfast-2-ala-fig
\image html belfast-2-ala.png "The molecule of the day: alanine dipeptide."
-The two main metastable states are called \f$ C_7eq \f$ and \f$ C_7ax \f$.
+The two main metastable states are called \f$ C_7eq\f$ and \f$ C_7ax \f$.
\anchor belfast-2-transition-fig
\image html belfast-2-transition.png "Two metastable states of alanine dipeptide are characterized by their Ramachandran dihedral angles."
Here metastable states are intended as states which have a relatively low free energy compared to adjacent conformation. At this stage it is not really useful
-to know what is the free energy, just think in term of internal energy. This is almost the same for such a small system whith so few degrees of freedom.
+to know what is the free energy, just think in terms of internal energy. This is almost the same for such a small system with so few degrees of freedom.
It is conventional use to show the two states in terms of Ramachandran dihedral angles, which are denoted with \f$ \Phi \f$ and \f$ \Psi \f$ in Fig. \ref belfast-2-transition-fig .
@@ -228,13 +228,13 @@ It is conventional use to show the two states in terms of Ramachandran dihedral
Now as a simple example, I want to show you that plotting some free dynamics trajectories shoot from the saddle point, you get a different plot in the path collective variables if you use the right path or if you use the wrong path.
-In Fig. \ref belfast-2-good-bad-path-fig I show you two example of possible path that join the \f$ C_7eq \f$ and \f$ C_7ax \f$ metastable states in alanine dipeptide.
+In Fig. \ref belfast-2-good-bad-path-fig I show you two examples of possible paths that join the \f$ C_7eq\f$ and \f$ C_7ax \f$ metastable states in alanine dipeptide.
You might clearly expect that real (rare) trajectories that move from one basin to the other would rather move along the black line than on the red line.
\anchor belfast-2-good-bad-path-fig
\image html belfast-2-good-bad-path.png "Example of good and bad path: one sits confortably in the minimum free energy path connecting the two metastable states, while the other makes a direct connection via high energy states"
-So, in this example we do a sort of "commmittor analysis" where we start shooting a number of free molecular dynamics from the saddle point located at \f$ \Phi=0 \f$ and \f$ \Psi=-1 \f$ and we want to see which way do they go. Intuitively, by assigning random velocities every time we should find part of the trajectories that move woward \f$ C_7eq \f$ and part that move towards \f$ C_7ax \f$.
+So, in this example we do a sort of "committor analysis" where we start shooting a number of free molecular dynamics from the saddle point located at \f$ \Phi=0 \f$ and \f$ \Psi=-1 \f$ and we want to see which way do they go.
Intuitively, by assigning random velocities every time we should find part of the trajectories that move toward \f$ C_7eq\f$ and part that move towards \f$ C_7ax \f$.
I provided you with two directories, each containing a bash script script.sh whose core (it is a bit more complicated in practice...) consists in:
@@ -259,7 +259,7 @@ done
\endverbatim
This runs 50 short MD runs (few hundreds steps) and just saves the colvar file into a labeled colvar file.
-In each mdrun plumed is used to plot the collective variables and it is something that reads like the follwing:
+In each mdrun plumed is used to plot the collective variables and it is something that reads like the following:
\verbatim
# Phi
t1: TORSION ATOMS=5,7,9,15
@@ -274,7 +274,7 @@ PRINT ARG=* STRIDE=2 FILE=colvar FMT=%12.8f
\endverbatim
where I just want to plot \f$ \Phi \f$ , \f$ \Psi \f$ and the two path collective variables.
-Note that each path has a different LAMBDA parameters. Here the Ramachandran angles are plotted so you can realize which path is the system walking in a more confortable projection. This is of course fine in such a small system but whenever you have to deal with larger systems and control hundreds of CVs at the same time, I think that path collective variables produce a more intuituve description for what you want to do.
+Note that each path has a different LAMBDA parameter. Here the Ramachandran angles are plotted so you can realize which path is the system walking in a more comfortable projection. This is of course fine in such a small system but whenever you have to deal with larger systems and control hundreds of CVs at the same time, I think that path collective variables produce a more intuitive description for what you want to do.
If you run the script simply with
@@ -302,7 +302,7 @@ In first column you have the time, then t1 (\f$ \Phi \f$) , then t2 ( \f$ \Psi \
Note that the action PATHMSD calculates both the progress along the path (p1.sss) and the distance from it (p1.zzz) . In PLUMED jargon, these are called "components". So a single Action (a line in plumed input) can calculate many components at the same time. This is not always the case: sometimes by default you have one component but specific flags may enable more components to be calculated (see \ref DISTANCE for example).
-Note that the header (all the part of a colvar file that contains \# as first character) is telling already what it inside the file and eventually also tells you if a variable is contained in boundaries (for example torsions, are periodic and their codomain is defined in -Pi and Pi ).
+Note that the header (all the part of a colvar file that contains \# as first character) already tells you what is inside the file and also tells you whether a variable is bounded (for example torsion angles are periodic and their co-domain is defined between -Pi and Pi).
At the end of the script, you also have two additional scripts. One is named script_rama.gplt and the other is named script_path.gplt. They contain some gnuplot commands that are very handy to visualize all the colvar files without making you load them one by one, which would be a pain.
@@ -332,15 +332,15 @@ Differently, if you now do
gnuplot> load "script_path_right.gplt"
\endverbatim
-You see that the path, compared to what happened before, run much closer to small distance from the path. This means that the provided path is highlt resemblant and representative of what hapens in the reactive path.
+You see that the system, compared to what happened before, runs much closer to the path, at small distance from it. This means that the provided path is right and more representative of what happens in the reactive path.
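The "which basin did each shot end up in" bookkeeping can be automated. Below is a minimal Python sketch, assuming colvar files matching colvar_* with \f$ \Phi \f$ in the second column and a dividing surface at \f$ \Phi=0 \f$ (the file pattern, column index and cut value are illustrative assumptions, not part of the tutorial scripts):

```python
import glob

def classify_endpoints(pattern="colvar_*", phi_col=1, cut=0.0):
    """Count how many shot trajectories end up in each basin, judged by
    the final Phi value stored in each colvar file (illustrative cut)."""
    counts = {"C7eq": 0, "C7ax": 0}
    for fname in sorted(glob.glob(pattern)):
        last = None
        with open(fname) as f:
            for line in f:
                # skip header/comment lines starting with '#'
                if line.strip() and not line.startswith("#"):
                    last = line.split()
        if last is None:
            continue
        # C7eq sits at negative Phi, C7ax at positive Phi in this sketch
        phi = float(last[phi_col])
        counts["C7eq" if phi < cut else "C7ax"] += 1
    return counts
```

Run over the 50 labeled colvar files, this gives a rough estimate of the committor at the shooting point.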
Note that going at high distances can be also beneficial. It might help you to explore alternative paths that you have not explored before. But beware: the farther you get from the path, the more states are accessible, in a similar way as the surface of a sphere increases with its radius. The surface ramps up much faster than the radius, therefore you have a lot of states there. This also means high entropy, so many systems actually tend to drift to high distances while, on the contrary, zero distance is never reached in practice (a zero-entropy system is impossible to reach at finite temperature). So you can see by yourself that this can be a big can of worms. In particular, my experience with path collective variables and biological systems tells me that most of the time it is hopeless to go to high distances to find new paths in many cases (for example, in folding), while it is worthwhile whenever you think that the paths are not too many (alternative routes in chemical reactions or enzymatic catalysis).
\section belfast-2-pcvs-format How to format my input?
Very often it is asked how to format a set of pdb files to be suitably used with path collective variables. Here are some tricks.
-- When you dump the files with vmd or (for gromacs users, using trjcat), the pdb you obtain is reindexed from 1. This is also the case
-when you select a subensemble of atoms of the path (e.g. the heavy atoms only or the backbone atoms). This is rather unfortunate and you have to
+- When you dump the files with vmd or (for gromacs users, using trjcat), the pdb you obtain is re-indexed from 1. This is also the case
+when you select a sub-ensemble of atoms of the path (e.g. the heavy atoms only or the backbone atoms). This is rather unfortunate and you have to
fix it somehow. My preference is to dump the whole pdb but water (when I do not need it) and use some awk script to select the atoms I am interested in.
- Pay attention to the last two columns. These are occupancy and beta.
With the first (occupancy) you set the atoms which are used to perform the alignment. The atoms which have zero occupancy will not be used in the alignment. The second column is beta and controls which atoms are used for the calculation of the distance after having
@@ -348,17 +348,17 @@ performed the alignment on the set of atoms which have nonzero occupancy column.
distance with respect to another set. This is handy in case of protein-ligand. You set the alignment of the protein and you calculate the distance based on the ligand and the part of the protein which is in contact with the ligand. This is done for example in <a href="http://pubs.acs.org/doi/abs/10.1021/jp911689r">this article</a>.
- <a href="http://www.multiscalelab.org/utilities/PlumedGUI">Plumed-GUI</a> (version \> 2.0) provides the <em>Structure->Build reference structure...</em> function to generate inputs that conform to the above rules from within VMD.
- Note that all the atoms contained in the REFERENCE must be the same. You cannot have a variable number of atoms in each pdb contained in the reference.
-- The reference is composed as a set of concatenated PDBs that are interrupted by a TER/END/ENDMDL card. Both HETATM and ATOM cards denote the atoms of the set.
+- The reference is composed of a set of concatenated PDB files that are interrupted by a TER/END/ENDMDL card. Both HETATM and ATOM cards denote the atoms of the set.
- Include in the reference frames only the needed atoms. For example, if you have a methyl group involved in a conformational transition, it might be that you do not want to include the hydrogen atoms of the methyl since these rotate fast and probably they do not play any relevant role.
\section belfast-2-pcvs-metad-on-path Fast forward: metadynamics on the path
-This section is actually set a bit foward but I included here for completeness now. It is recommended to be read after you have an introduction on Metadynamics and to well-tempered Metadynamics in particular.
+This section is actually set a bit forward but I included it here for completeness now. It is recommended to be read after you have an introduction on Metadynamics and to well-tempered Metadynamics in particular.
Here I want to show a couple of concepts together.
- Path collective variables can be used for exploring alternative routes. It is effective in highly structured molecules, while it is tricky on complex molecules whenever you have many competing routes
-- Path collective variables suffer from problems at the endpoints (as the higly popular coordinates \ref COORDINATION for example) that can be cured with flexible hills and an appropriate reweighting procedure within the well-tempered Metadynamics scheme.
+- Path collective variables suffer from problems at the endpoints (as the highly popular coordinates \ref COORDINATION for example) that can be cured with flexible hills and an appropriate reweighting procedure within the well-tempered Metadynamics scheme.
-Let's go to the last problem. All comes from the derivative \ref belfast-2-sder-eq. Whenever you have a point of phase space which is similar to one of the endpoint than one of the points in the center then you get a \f$ s(X) \f$ which is 1 or N (where N is the number of frames composing the path collective variable). In this case that exponential will dominate the others and you are left with a constant (since the derivative of RMSD is a constant since it is linear in space). This means that, no matter what happens here, you have small force. Additionally you have small motion in the CV space. You can move a lot in configuration space but if the closest point is one of the endpoint, your CV value will always be one of the endpoint itself. So, if you use a fixed width of your CV which you retrieve from a middle point in your path, this is not suitable at all at the endpoints where your CV flucutates much less.
On the contrary if you pick the hills width by making a free dynamics on the end states you might pick some sigmas that are smaller than what you might use in the middle of the path. This might give a rough free energy profile and definitely more time to converge.
+Let's go to the last problem. All comes from the derivative \ref belfast-2-sder-eq. Whenever you have a point of phase space which is more similar to one of the endpoints than to one of the points in the center then you get a \f$ s(X) \f$ which is 1 or N (where N is the number of frames composing the path collective variable). In this case that exponential will dominate the others and you are left with a constant (since RMSD is linear in space, its derivative is a constant). This means that, no matter what happens here, you have a small force. Additionally you have small motion in the CV space. You can move a lot in configuration space but if the closest point is one of the endpoints, your CV value will always be that of the endpoint itself. So, if you use a fixed width of your CV which you retrieve from a middle point in your path, this is not suitable at all at the endpoints where your CV fluctuates much less. On the contrary if you pick the hills width by making a free dynamics on the end states you might pick some sigmas that are smaller than what you might use in the middle of the path. This might give a rough free energy profile and definitely more time to converge.
A possible solution is to use the adaptive gaussian width scheme. In this scheme you adapt the hills to their fluctuation in time. You find more details in \cite Branduardi:2012dl Additionally you also adopt a non-spherical shape taking into account variable correlations. So in this scheme you do not have to fix one sigma per variable, but just the time in which you calculate this correlation (another possibility is to calculate it from the compression of phase space but it will not be covered here).
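The saturation at the endpoints can be seen directly from the definition of the two path components. Below is a minimal numerical sketch in pure Python with illustrative distance values (the real CV uses RMSD distances from the REFERENCE frames; this is not PLUMED's implementation):

```python
import math

def path_cv(distances, lam):
    """Progress (sss) and distance (zzz) components of a path CV:
    sss = sum_i i * exp(-lam * d_i) / sum_i exp(-lam * d_i)
    zzz = -log(sum_i exp(-lam * d_i)) / lam
    where d_i is the distance from reference frame i (i = 1..N)
    and lam plays the role of the LAMBDA keyword."""
    w = [math.exp(-lam * d) for d in distances]
    norm = sum(w)
    sss = sum((i + 1) * wi for i, wi in enumerate(w)) / norm
    zzz = -math.log(norm) / lam
    return sss, zzz

# Close to the first endpoint the first exponential dominates, so sss
# saturates near 1 no matter how the configuration moves around it.
print(path_cv([0.05, 1.0, 2.0, 3.0], lam=5.0))
print(path_cv([0.02, 1.0, 2.0, 3.0], lam=5.0))
```

The two printed progress values are nearly identical even though the configuration moved, which is exactly the small-gradient problem described above.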
The input of metadynamics might become something like this
@@ -391,7 +391,7 @@ that plots the hills in the progress along the path and the distance from the
There are a number of things to observe: first that the path explores high distance since the metadynamics is working also in the distance from the path thus accessing the paths that were not
-explored before, namely the one that goes from the upper left corner of the ramachandran plot and the one that passes through the lower left corner. So in this way one can also explore other paths. Additionally you can see that the hills are changing size rather considerably. This helps the system to travel faster since at each time you use something that has a nonzero gradient and your forces act on your system in an effective way. Another point is that you can see broad hills covering places which you have not visited in practice. For example you see that hills extend so wide to cover point that have negative progress along the path, which is impossible by using the present definition of the progress along the path.
+explored before, namely the one that goes from the upper left corner of the Ramachandran plot and the one that passes through the lower left corner. So in this way one can also explore other paths. Additionally you can see that the hills are changing size rather considerably. This helps the system to travel faster since at each time you use something that has a nonzero gradient and your forces act on your system in an effective way. Another point is that you can see broad hills covering places which you have not visited in practice. For example you see that hills extend so wide as to cover points that have negative progress along the path, which is impossible by using the present definition of the progress along the path.
This introduces a problem in calculating the free energy. You actually have to correct for the points that you visited in reality.
You can actually use \ref sum_hills to this purpose in a two-step procedure.
@@ -401,13 +401,13 @@ First you calculate the negative bias on a given range:
pd@plumed:~> plumed sum_hills --hills HILLS --negbias --outfile negative_bias.dat --bin 100,100 --min -5,-0.005 --max 25,0.05
\endverbatim
-and then calculate the correction. You can use the same hills file for that purpose. The initial transient time should not matter if your simulation is long enough to see many recrossing and, secondly, you should check that the hills heigh in the welltempered are small compared to the beginning.
+and then calculate the correction. You can use the same hills file for that purpose. The initial transient time should not matter if your simulation is long enough to see many recrossings and, secondly, you should check that the hills height in the well-tempered run is small compared to the beginning.
\verbatim
pd@plumed:~> plumed sum_hills --histo HILLS --bin 100,100 --min -5,-0.005 --max 25,0.05 --kt 2.5 --sigma 0.5,0.001 --outhisto correction.dat
\endverbatim
-Note that in the correction you should assign a sigma, that is a "trust radius" in which you think that whenever you have a point somewhere, there in the neighborhood you assign a bin (it is done with Gaussian in reality, but this does not matter much). This is the important point that compenstates for the issues you might encounter putting excessive large hills in places that you have not visited.
+Note that in the correction you should assign a sigma, that is a "trust radius" within which you think that whenever you have a point somewhere, you assign a bin in its neighborhood (it is done with Gaussians in reality, but this does not matter much). This is the important point that compensates for the issues you might encounter putting excessively large hills in places that you have not visited.
It is nice to have a look at the correction and compare it with the hills in the same range.
\verbatim
@@ -444,7 +444,7 @@ gnuplot> set cntrp lev incr -140,6,-30
\image html belfast-2-metadpath-free.png "Final free energy obtained by the sum of the negative bias and the correction. Isolines are every 6 kJ/mol."
So now we can comment a bit on the free energy surface obtained and note that there is a free energy path that connects the two endpoints and runs very close to zero distance from the path.
-This means that our input path is actually resemblant of what is really happening in the system. Additionally you can see that there are many comparable routes different from the straight path. Can you make a sense of it just by looking at the free energy on the Ramachandran plot?
+This means that our input path resembles what is really happening in the system. Additionally you can see that there are many comparable routes different from the straight path. Can you make sense of it just by looking at the free energy on the Ramachandran plot?
*/
diff --git a/user-doc/tutorials/belfast-3.txt b/user-doc/tutorials/belfast-3.txt
index 15728cb9f..85ad5438f 100644
--- a/user-doc/tutorials/belfast-3.txt
+++ b/user-doc/tutorials/belfast-3.txt
@@ -4,7 +4,7 @@
\section belfast-3-aims Aims
The aim of this tutorial is to consolidate the material that was covered during
-\ref belfast-1 and \ref belfast-2 on analysing trajectories using collective variables
+\ref belfast-1 and \ref belfast-2 on analyzing trajectories using collective variables
and path collective variables. We will then build on this material by showing how
you can use the multidimensional scaling algorithm to automate the process of finding collective variables.
@@ -24,11 +24,11 @@ The <a href="tutorial-resources/belfast-3.tar.gz" download="belfast-3.tar.gz"> t
- trajectory-short.xyz : a (short) trajectory for a 16 residue protein in xyz format. All calculations with plumed driver use this trajectory.
- trajectory-short.pdb : the same trajectory in pdb format, this can be loaded with VMD
-- template.pdb : a single frame from the trajectory that can be used in conjuction with the \ref MOLINFO command
+- template.pdb : a single frame from the trajectory that can be used in conjunction with the \ref MOLINFO command
\section belfast-3-instructions Instructions
-\subsection belfast-3-vis-traj Visualising the trajectory
+\subsection belfast-3-vis-traj Visualizing the trajectory
The aim of this tutorial is to understand the data contained in the trajectory called trajectory-short.pdb. This file contains some frames from a simulation of a 16 residue protein. As a start point then let's load this trajectory with vmd and have a look
@@ -65,7 +65,7 @@ is this trajectory sampling from? What can we tell from looking at this traject
If your answers to the questions at the end of the above paragraph are I don't know that is good. We can tell very little by just looking at a trajectory. In fact the whole point of the tutorials I referenced at the start of this one was to find ways of analyzing trajectories precisely
-so that we are not put in this position of staring at trajetories mystified!
+so that we are not put in this position of staring at trajectories mystified!
\subsection belfast-3-gismo Installing GISMO
@@ -78,12 +78,12 @@ to open GISMO can be found under Extensions>Analysis>GISMO.
\subsection belfast-3-cvs Finding collective variables
-Right so lets come up with some CVs to analyse this trajectory with. As some of you may know we can understand the conformation of proteins
-by looking at the Ramachandran angles. For those of you who don't know here is a Wikkepedia article:
+Right, so let's come up with some CVs to analyze this trajectory with. As some of you may know we can understand the conformation of proteins
+by looking at the Ramachandran angles.
For those of you who don't know here is a Wikipedia article: http://en.wikipedia.org/wiki/Ramachandran_plot -Our protein has 32 ramachandran angles. We'll come back to that. For the time being pick out a couple that you think might be useful +Our protein has 32 Ramachandran angles. We'll come back to that. For the time being pick out a couple that you think might be useful and construct a plumed input to calculate them and print them to a file. You will need to use the \ref TORSION and \ref PRINT commands in order to do this. Once you have created your plumed input use driver to calculate the torsional angles using the following command: @@ -91,8 +91,8 @@ in order to do this. Once you have created your plumed input use driver to calc plumed driver --plumed plumed.dat --ixyz trajectory-short.xyz \endverbatim -If you have done this correctly you should have an output file containing your torsional angles. We can use vmd+GISMO to visualise -the relationship between the ramachandran angles and the atomic configurations. To do this first load the trajectory in VMD: +If you have done this correctly you should have an output file containing your torsional angles. We can use vmd+GISMO to visualize +the relationship between the Ramachandran angles and the atomic configurations. To do this first load the trajectory in VMD: \verbatim vmd trajectory-short.pdb @@ -107,7 +107,7 @@ What can you conclude from this exercise. Do the CV values of the various frame Do points in different clusters correspond to structures that look the same or different? Are there similar looking structures clustered together or are they always far apart? What can we conclude about the various basins in the free energy landscape that have been explored in this trajectory? How many are there? Would your estimate be the same if you tried the above estimate -with a different pair of ramachandran angles? +with a different pair of Ramachandran angles? 
\subsection belfast-3-dim-red Dimensionality reduction @@ -115,17 +115,17 @@ What we have done in most of the other exercises here is seek out a function tha dimensional vector, where \f$N\f$ is the number of atoms. This function then outputs a single number - the value of the collective variable - that tells us where in a low dimensional space we should project that configuration. Problems can arise because this collective-variable function is many-to-one. As you have hopefully seen in the previous exercise markedly different configurations of the protein can actually have quite -similar values of a particular ramachandran angle. +similar values of a particular Ramachandran angle. -We are going to spend the rest of this session introducing an alternative approach to this bussiness of finding collective variables. In this alternative +We are going to spend the rest of this session introducing an alternative approach to this business of finding collective variables. In this alternative approach we are going to stop trying to seek out a function that can take any configuration of the atoms (any \f$3N\f$-dimensional vector) and find its low dimensional projection on the collective variable axis. Instead we are going to take a set of configurations of the atoms (a set of \f$3N\f$-dimensional vectors of atom positions) and try to find a sensible set of projections for these configurations. We already touched on this idea earlier when we looked at paths. Our assumption, when we introduced this idea, was that we could find an ordered set of high-dimensional configurations that represented the -transtion pathway the system takes as it crossed a barrier and changed between two particularly interesting configurations. Lets say we have a path defined by +transition pathway the system takes as it crossed a barrier and changed between two particularly interesting configurations. 
Let's say we have a path defined by
four reference configurations - this implies that to travel between the configurations at the start and the end of this path you have to pass through
configuration 1, then configuration 2, then configuration 3 and then configuration 4. This ordering means that the numbers 1 through 4 constitute sensible
-projections of these high-dimensional configurations. The numbers 1 through 4 all lie on a single cartesian axis - a low-dimensional space.
+projections of these high-dimensional configurations. The numbers 1 through 4 all lie on a single Cartesian axis - a low-dimensional space.
The problem when it comes to applying this idea to the data that we have in the trajectory-short trajectory is that we have no information on the "order" of
these points. We have not been told that this trajectory represents the transition between two interesting points in phase space and thus we cannot apply the
@@ -150,28 +150,28 @@ You should then run this calculation using the following command:
plumed driver --ixyz trajectory-short.xyz --plumed plumed.dat
\endverbatim
-This should generate an output file called rmsd-embed. You should now be able to use VMD+GISMO to visualise this output.
+This should generate an output file called rmsd-embed. You should now be able to use VMD+GISMO to visualize this output.
Do the CV values of the various frames appear in clusters in the plane?
Do points in different clusters correspond to structures that look the same or different? Are there similar looking structures clustered
together or are they always far apart? What can we conclude about the various basins in the free energy landscape that have been explored
in this trajectory? How many are there? Do you think this gives you a fuller picture of the
-trajectory than the ones you obtained by considering ramachandran angles?
+trajectory than the ones you obtained by considering Ramachandran angles?
\section belfast-3-extensions Extensions
-As discussed in the previous section this approach to trajectory analysis works by calcalating distances between pairs of atomic configurations.
+As discussed in the previous section this approach to trajectory analysis works by calculating distances between pairs of atomic configurations.
Projections corresponding to these configurations are then generated in the low dimensional space in a way that tries to preserve these
pairwise distances. There are, however, an infinite number of ways of calculating the distance between two high-dimensional configurations.
In the previous section we used the RMSD distance but you could equally use the DRMSD distance. You could even try calculating a large number
of collective variables for each of the high-dimensional points and seeing how much these all changed. You can use these different types of
distances with the \ref CLASSICAL_MDS action that was introduced in the previous section. If you have time left at the end of the session read the manual
-for the \ref CLASSICAL_MDS action and see if you can calculate an MDS projection using an alternative defintion of the distances between configurations.
-Some suggestions to try in order of increasing difficulty: DRMSD, how much the full set of 32 ramachandran angles change and the change in the
+for the \ref CLASSICAL_MDS action and see if you can calculate an MDS projection using an alternative definition of the distances between configurations.
+Some suggestions to try in order of increasing difficulty: DRMSD, how much the full set of 32 Ramachandran angles change and the change in the
contact map
\section belfast-3-further Further Reading
-There is a growing community of people using these ideas to analyse trajectory data. Some start points for investigating their work in more detail
+There is a growing community of people using these ideas to analyze trajectory data.
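The distance-preserving projection behind this analysis can be sketched in a few lines. Below is a minimal classical-MDS example in Python with NumPy; it reproduces the standard double-centering trick that the \ref CLASSICAL_MDS action is based on, not PLUMED's actual implementation:

```python
import numpy as np

def classical_mds(D, k=2):
    """Project n points into k dimensions so that pairwise Euclidean
    distances approximate the input n x n distance matrix D."""
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n   # centering matrix
    B = -0.5 * J @ (D ** 2) @ J           # double-centered Gram matrix
    vals, vecs = np.linalg.eigh(B)        # eigenvalues in ascending order
    idx = np.argsort(vals)[::-1][:k]      # keep the k largest
    scale = np.sqrt(np.clip(vals[idx], 0.0, None))
    return vecs[:, idx] * scale
```

For points that truly live in a low-dimensional Euclidean space the projection is exact up to rotation and reflection; for RMSD-type distances between configurations it is only an approximation.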
Some start points for investigating their work in more detail
are:
- http://epfl-cosmo.github.io/sketchmap/index.html?page=main
diff --git a/user-doc/tutorials/belfast-4.txt b/user-doc/tutorials/belfast-4.txt
index 3531956e6..4c900ca50 100644
--- a/user-doc/tutorials/belfast-4.txt
+++ b/user-doc/tutorials/belfast-4.txt
@@ -106,7 +106,7 @@ the conformations corresponding to the transition state.
Imagine that you know since the beginning the shape of the free-energy landscape
\f$ F(s) \f$ as a function of one CV \f$ s \f$. If you perform a molecular dynamics
simulation using a bias potential which is exactly equal to \f$ -F(s) \f$,
-the biased free-energy landscape will be flat and barrierless.
+the biased free-energy landscape will be flat and barrier-less.
This potential acts as an "umbrella" that helps you to safely cross
the transition state in spite of its high free energy.
@@ -145,7 +145,7 @@ where
\f[
Z_i=\sum_{q}e^{-\left(U(q)+V_i(q)\right)}
\f]
-The likelyhood for the observation of a sequence of snapshots \f$q_i(t)\f$ (where \f$i\f$ is
+The likelihood for the observation of a sequence of snapshots \f$q_i(t)\f$ (where \f$i\f$ is
the index of the trajectory and \f$t\f$ is time) is just the product of the probability
of each of the snapshots. We use here the minus-logarithm of the likelihood (so that
the product is converted to a sum) that can be written as
@@ -158,17 +158,17 @@ the product is converted to a sum) that can be written as
+\log Z_i \right)
\f]
-One can then maximize the likelyhood by setting \f$\frac{\delta \mathcal{L}}{\delta P({\bf s})}=0\f$.
+One can then maximize the likelihood by setting \f$\frac{\delta\mathcal{L}}{\delta P({\bf s})}=0\f$.
After some boring algebra the following expression can be obtained
\f[
0=\sum_{i}\int dt\left(-\frac{\delta_{{\bf s}_{i}(t),{\bf s}}}{P({\bf s})}+\frac{e^{-\frac{V_{i}({\bf s})}{k_{B}T}}}{Z_{i}}\right)
\f]
In this equation we aim at finding \f$P(s)\f$.
However, also the list of normalization factors \f$Z_i\f$ is unknown, and they should
-be found selfconsistently. Thus one can find the solution as
+be found self-consistently. Thus one can find the solution as
\f[
P({\bf s})\propto \frac{N({\bf s})}{\sum_i\int dt\frac{e^{-\frac{V_{i}({\bf s})}{k_{B}T}}}{Z_{i}} }
\f]
-where \f$Z\f$ is selfconsistently determined as
+where \f$Z\f$ is self-consistently determined as
\f[
Z_i\propto\int ds' P({\bf s}') e^{-\frac{V_i({\bf s}')}{k_BT}}
\f]
@@ -185,9 +185,9 @@ Z_i\propto\frac{\sum_j \int dt e^{-\beta V_i({\bf s}_j(t))} w_j(t)}{
\sum_j \int dt w_j(t)}
\f]
-This allows to straighforwardly compute averages related
+This allows one to straightforwardly compute averages related
to other, non-biased degrees of freedom, and it is thus a bit more flexible.
-It is sufficient to precompute this factors \f$w\f$ and use them to weight
+It is sufficient to pre-compute these factors \f$w\f$ and use them to weight
every single frame in the trajectory.
\section belfast-4-learning-outcomes Learning Outcomes
@@ -316,7 +316,7 @@ This can be done e.g. with this shell script
awk '{if($3>1 && $3<2)a++; else b++;}END{print a/(a+b)}'
\endverbatim
Notice that we here considered only the last 80000 frames in the average. Look at the time series for \f$\psi \f$
-and guess why. Also notice that the script is removing the initial comments. After this trivial preprocessing,
+and guess why. Also notice that the script is removing the initial comments. After this trivial pre-processing,
the script is just counting how many times the third column (\f$ \psi \f$) lies between
1 and 2 and how many times it doesn't. At the end it prints the number of times the variable
is between 1 and 2 divided by the total count. The result should be something around 0.40. Now try to do it on
@@ -336,7 +336,7 @@ Repeat this calculation for different values of AT.
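The self-consistent equations for the frame weights and the \f$Z_i\f$ above can be iterated numerically. Below is a minimal Python sketch, assuming you have already tabulated \f$V_i({\bf s}_j(t))\f$ for every window \f$i\f$ at every pooled frame; it omits per-window frame counts and any convergence test, so it is an illustration of the iteration, not production WHAM code:

```python
import math

def wham_weights(bias, beta=1.0, n_iter=200):
    """Self-consistent frame weights w(t) from umbrella biases.

    bias[i][t] is the bias of window i evaluated at pooled frame t.
    Iterates  w(t) = 1 / sum_i exp(-beta*V_i(t)) / Z_i
    and       Z_i  = sum_t w(t) * exp(-beta*V_i(t))
    as in the text, then normalizes the weights."""
    n_win, n_frames = len(bias), len(bias[0])
    z = [1.0] * n_win
    for _ in range(n_iter):
        w = [1.0 / sum(math.exp(-beta * bias[i][t]) / z[i]
                       for i in range(n_win))
             for t in range(n_frames)]
        norm = sum(w)
        w = [x / norm for x in w]
        z = [sum(w[t] * math.exp(-beta * bias[i][t]) for t in range(n_frames))
             for i in range(n_win)]
    return w
```

With the weights in hand, any unbiased average is just a weighted average over the pooled frames, which is exactly the flexibility mentioned in the text.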
Does the result depend on AT
\subsection belfast-4-fes A free-energy landscape
One can also count the probability of an angle to be in a precise
-bin. The logarithm of this quantity, in kbT units, is the free-energy
+bin. The logarithm of this quantity, in \f$k_B T\f$ units, is the free energy
associated with that bin. There are several ways to compute histograms,
either with PLUMED or with external programs. Here I decided to use awk.
diff --git a/user-doc/tutorials/belfast-5.txt b/user-doc/tutorials/belfast-5.txt
index 2b5b477b6..13f7acbfa 100644
--- a/user-doc/tutorials/belfast-5.txt
+++ b/user-doc/tutorials/belfast-5.txt
@@ -4,7 +4,7 @@
In plumed you can bring a system in a specific state in a collective variable
by means of the \ref MOVINGRESTRAINT directive. This directive
is very flexible and allows for a programmed series
-of draggings and can be used also to sample multiple events
+of moving restraints. This technique can also be used to sample multiple events
within a single simulation. Here I will explain the concepts of it and show some examples
\section belfast-5-resources Resources
@@ -29,18 +29,18 @@ H_\lambda(X,t)&=&H(X)+U_\lambda(X,t)\\
\f}
This means that if the \f$ k \f$ is tight enough the system will follow closely
the center of the moving harmonic spring. But be careful: if the spring constant is too hard, your equations of motion will not keep up since they are tuned
-to the fastest motion in your system so if you artificially introduce a higher freqeuncy in your system you'll screw up the dynamics.
+to the fastest motion in your system, so if you artificially introduce a higher frequency in your system you'll screw up the dynamics.
The same is true for the pulling speed \f$ v \f$. As a matter of fact
I never encountered the case where I had to lower the time step and I could all the time be happy just by making a softer spring constant or a slower steering speed. Generally, integrators of equations of motion like velocity-Verlet are very tolerant.
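The time-dependent restraint \f$ U_\lambda(X,t) \f$ can be sketched explicitly. Below is a minimal Python illustration of a harmonic restraint whose center and spring constant are linearly interpolated between a list of (STEP, AT, KAPPA) milestones, in the spirit of the \ref MOVINGRESTRAINT directive (the function names and the example schedule are mine, not PLUMED's):

```python
import bisect

def restraint_schedule(step, steps, at, kappa):
    """Center and spring constant at a given MD step, obtained by linear
    interpolation between consecutive (STEPx, ATx, KAPPAx) milestones."""
    if step <= steps[0]:
        return at[0], kappa[0]
    if step >= steps[-1]:
        return at[-1], kappa[-1]
    j = bisect.bisect_right(steps, step)
    f = (step - steps[j - 1]) / float(steps[j] - steps[j - 1])
    return (at[j - 1] + f * (at[j] - at[j - 1]),
            kappa[j - 1] + f * (kappa[j] - kappa[j - 1]))

def restraint_energy(s, step, steps, at, kappa):
    """U(s, t) = 0.5 * k(t) * (s - s0(t))**2 for the moving restraint."""
    s0, k = restraint_schedule(step, steps, at, kappa)
    return 0.5 * k * (s - s0) ** 2
```

For example, with steps=[0, 1000, 2000, 3000], at=[-1, -1, 1, 1] and kappa=[0, 100, 100, 0] the spring is first stiffened while sitting at \f$ \Phi=-1 \f$, then dragged to \f$ \Phi=1 \f$, then released, which is the preparation strategy described for alanine dipeptide.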
Note that one can also make the spring constant depend on time and this, as we will see later in the examples is particularly useful to prepare your state.
-In simulations, it is more convenient to adopt a situation where you specify only the starting point, the final point of cvs and the time in which you want to cover
+In simulations, it is more convenient to adopt a situation where you specify only the starting point, the final point of collective variables and the time in which you want to cover
the transition. That's why the plumed input is done in such a way. For example, let's go back to the alanine dipeptide example encountered in \ref belfast-2-ala.
-Let's say that now we want to steer from \f$ C_7eq \f$ to \f$ C_7ax \f$.
+Let's say that now we want to steer from \f$C_7eq \f$ to \f$ C_7ax \f$.
If you think about it, just dragging the \f$ \Phi \f$ dihedral angle from a value of -1 to a value of 1 should be enough to reach the state of interest. Additionally, it might be important to you not to stress the system too much, so you might want first to increase the \f$ k \f$ so to lock the system in \f$ \Phi =-1 \f$, then move it gently to \f$ \Phi =1 \f$ and then release again your spring constant so to end up in an equilibrated and
@@ -64,9 +64,9 @@ restraint: ...
PRINT STRIDE=10 ARG=* FILE=COLVAR
\endverbatim
-Please note the syntax of \ref MOVINGRESTRAINT : You need one (or more) argument(s) and a set of steps denote by ATX, STEPX, KAPPAX where X is a incremental starting from 0
-that assign the center and the harness of the spring at step STEPX. What happens in between is a linear interpolation of the AT and KAPPA parameters. If those
-are identical in two consecutive steps then nothing is happening to that parameter.
So if you put the same KAPPA and AT in two different STEPs then this will give you +Please note the syntax of \ref MOVINGRESTRAINT : You need one (or more) argument(s) and a set of steps denoted by AT\f$x\f$, STEP\f$x\f$, KAPPA\f$x\f$ where \f$x\f$ is an incremental index starting from 0 +that assigns the center and the stiffness of the spring at step STEP\f$x\f$. What happens in between is a linear interpolation of the AT and KAPPA parameters. If those +are identical in two consecutive steps then nothing happens to that parameter. So if you put the same KAPPA and AT value in two different STEP keywords then this will give you an umbrella potential placed in the same point between the two time intervals defined by STEP. Note that you need to run a bit more than 6000 steps because after this your system has no more restraints so the actual thermalization period starts here. @@ -100,11 +100,11 @@ Note that the actual force on an atom of the system involved in a CV is instead \f} This is important because in CVs that have a derivative that changes significantly with space then you might have regions in which no force is exerted while in others you -might have an enormous force on it. Typically this is the case of sigmoids that are used in coordination numbers in which, in the tails, they are basically flat as a function +might have an enormous force on it. Typically this is the case of sigmoid functions that are used in coordination numbers in which, in the tails, they are basically flat as a function of particle positions. Additionally note that this happens on any force-based enhanced-sampling algorithm so keep it in mind. Very often people miss this aspect and complain either that a CV or an enhanced-sampling method does not work. This is another good reason to use a tight spring constant so as to compensate for the lack of derivative from the CV. -The other argument in colvar is restraint.phi_cntr which is the center of the harmonic potential.
This is a useful quantity so you may know how close the system is follwing the +The other argument in colvar is restraint.phi_cntr which is the center of the harmonic potential. This is a useful quantity so you may know how close the system is following the center of the harmonic potential (more on this below). The last parameter is restraint.phi_work. The work is defined as @@ -146,9 +146,9 @@ Another couple of interesting thing that you can check is \section belfast-5-path Moving on a more complex path -Very often it is useful to use this movingrestraint to make a fancier schedule by using nice properties of \ref MOVINGRESTRAINT. For example you can plan a schedule to drive multiple +Very often it is useful to use this moving restraint to make a fancier schedule by using nice properties of \ref MOVINGRESTRAINT. For example you can plan a schedule to drive multiple CVs at the same time in specific points of the phase space and also to stop for a while in specific points using a fixed harmonic potential. This can be handy in case of an umbrella sampling run -where you might want to explore a 1-dimensional landscape by acquiring some statistics in one point and then moving to the next to acquire more statistics. With \ref MOVINGRESTRAINT you can do it in only one file. To give an example of such capabilities, let's say that we want to move from \f$ C7eq \f$ vertically toward \f$ \Phi =-1.5 ; \Psi=-1.3 \f$, stop by for a while (e.g. to acquire a statistics that you might need for an umbrella sampling), then moving toward \f$ \Phi =1.3 ; \Psi=-1.3 \f$ which roughly corresponds to \f$ C7ax \f$. +where you might want to explore a 1-dimensional landscape by acquiring some statistics in one point and then moving to the next to acquire more statistics. With \ref MOVINGRESTRAINT you can do it in only one file. To give an example of such capabilities, let's say that we want to move from \f$C7eq \f$ vertically toward \f$ \Phi =-1.5 ; \Psi=-1.3 \f$, stop by for a while (e.g.
to acquire the statistics that you might need for an umbrella sampling), then move toward \f$ \Phi =1.3 ; \Psi=-1.3 \f$ which roughly corresponds to \f$ C7ax \f$. This can be programmed conveniently with \ref MOVINGRESTRAINT by adopting the following schedule @@ -171,10 +171,10 @@ restraint: ... PRINT STRIDE=10 ARG=* FILE=COLVAR \endverbatim -Note that by adding two arguments for movingrestraint, now I am allowed to put two values (separated by comma, as usual for multiple values in PLUMED) -and correspondingly two KAPPA values. One for each variable. Please note that no space must be used bewtween the arguments! This is a very common fault in writing the inputs. +Note that by adding two arguments for the moving restraint, I am now allowed to put two values (separated by comma, as usual for multiple values in PLUMED) +and correspondingly two KAPPA values. One for each variable. Please note that no space must be used between the arguments! This is a very common fault in writing the inputs. -By plotting the instataneous value of the variables and the value of the center of the harmonic potentials we can inspect the pathways that we make the system +By plotting the instantaneous value of the variables and the value of the center of the harmonic potentials we can inspect the pathways that we make the system walk on the Ramachandran plot. (How to do this? Have a look at the header of the COLVAR file to plot the right fields) \anchor belfast-5-doublesteer.png @@ -186,14 +186,14 @@ Plot of the double steering schedule using \ref MOVINGRESTRAINT \section belfast-5-work Why work is important? -The work as we have seen is the cumulative change of the hamiltonian in time. So it is connected with the change in energy of your system while you move it around. +The work as we have seen is the cumulative change of the Hamiltonian in time. So it is connected with the change in energy of your system while you move it around.
It also has a more important connection with the free energy via the Jarzynski equation which reads \f[ \Delta F=-\beta^{-1}\ln \langle e^{-\beta W} \rangle \f] This is important and says that potentially you can calculate the free energy even by driving your system very fast and out of equilibrium between two states of your interest. Unfortunately this is in practice not possible since the accurate calculation of the quantity \f$ \langle e^{-\beta W} \rangle \f$ has a huge associated error bar since it involves the -average of a noisy quantity (the work) being exponentiated. +average of a noisy quantity (the work) that appears in the exponential. So, before going wild with SMD, I want to work through a small exercise on how tricky that is even for the smallest system. Now we run, say, 30 SMD runs and we calculate the free energy difference by using the Jarzynski equality and see how this differs from the average. @@ -222,7 +222,7 @@ What you see is something like in Fig. \image html belfast-5-jarz.png "Plot of the works produced" There are a number of interesting facts here. First you see that different starting points may end with very different work values. Not bad, one would say, since the Jarzynski equality -is saying that we can make an average and everything will be ok. +is saying that we can make an average and everything will be OK. So now calculate the average work by using the following bash one-liner: \verbatim @@ -276,13 +276,13 @@ PRINT STRIDE=10 ARG=* FILE=COLVAR Note that \ref RMSD should be provided with a reference structure in pdb format and can contain part of the system but the second column (the index) must reflect that of the full pdb so that PLUMED knows specifically which atom to drag where. The \ref MOVINGRESTRAINT bias potential here acts on the rmsd, and the other two variables (phi and psi) are untouched.
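To see concretely why the exponential average in the Jarzynski equality is so noisy, here is a small Python sketch (synthetic Gaussian work values, purely illustrative; this is not data from the tutorial) comparing the Jarzynski free-energy estimate with the plain average work:

```python
import math
import random

def jarzynski_estimate(works, beta=1.0):
    """Free-energy estimate -1/beta * ln < exp(-beta W) > from a list of works."""
    avg = sum(math.exp(-beta * w) for w in works) / len(works)
    return -math.log(avg) / beta

random.seed(1)
# Synthetic work values: mean 10 kT, spread 2 kT (illustrative numbers only).
works = [random.gauss(10.0, 2.0) for _ in range(30)]

mean_work = sum(works) / len(works)
dF = jarzynski_estimate(works)
# The exponential average is dominated by the few lowest-work runs,
# so with only 30 samples the estimate carries a large statistical error.
```

By Jensen's inequality the estimate always lies below the mean work, but its value is effectively set by the rare low-work trajectories, which is the error-bar problem discussed above.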
Notice that whereas the force from the restraint -should be applied at every step (thus rmsd is computed at every step) the two torsions +should be applied at every step (thus rmsd is computed at every step) the two torsion angles are computed only every 10 steps. PLUMED automatically detects which variables are used at every step, leading to better performance when complicated and computationally -expensive variables are monitored - this is not the case here, since the two torsions +expensive variables are monitored - this is not the case here, since the two torsion angles are very fast to compute. Note that here the work always increases with time and never gets lower which is somewhat surprising -if you tink that we are moving in another metastable state. One would expect this to bend and give a +if you think that we are moving in another metastable state. One would expect this to bend and give a signal of approaching a minimum like before. Nevertheless consider what we are doing: we are constraining the system in one specific conformation and this is completely unnatural for a system at 300 kelvin so, even for this small system adopting a specific conformation in which all the heavy atoms are in a precise position is rather unrealistic. This means that this state is a high free-energy state. diff --git a/user-doc/tutorials/belfast-6.txt b/user-doc/tutorials/belfast-6.txt index edfdffc83..4ccfa23ef 100644 --- a/user-doc/tutorials/belfast-6.txt +++ b/user-doc/tutorials/belfast-6.txt @@ -11,8 +11,8 @@ detect issues related to a bad choice of collective variables. \section belfast-6-theory Summary of theory In metadynamics, an external history-dependent bias potential is constructed in the space of
-This potential is built as a sum of Gaussians deposited along the trajectory in the CVs space: +a few selected degrees of freedom \f$\vec{s}({q}) \f$, generally called collective variables (CVs) \cite metad. +This potential is built as a sum of Gaussian kernels deposited along the trajectory in the CVs space: \f[ V(\vec{s},t) = \sum_{ k \tau < t} W(k \tau) @@ -22,7 +22,7 @@ V(\vec{s},t) = \sum_{ k \tau < t} W(k \tau) \f] where \f$ \tau \f$ is the Gaussian deposition stride, -\f$ \sigma_i \f$ the width of the Gaussian for the ith CV, and \f$ W(k \tau) \f$ the +\f$ \sigma_i \f$ the width of the Gaussian for the \f$i\f$th CV, and \f$ W(k \tau) \f$ the height of the Gaussian. The effect of the metadynamics bias potential is to push the system away from local minima into visiting new regions of the phase space. Furthermore, in the long time limit, the bias potential converges to minus the free energy as a function of the CVs: @@ -31,7 +31,7 @@ time limit, the bias potential converges to minus the free energy as a function V(\vec{s},t\rightarrow \infty) = -F(\vec{s}) + C. \f] -In standard metadynamics, Gaussians of constant height are added for the entire course of a +In standard metadynamics, Gaussian kernels of constant height are added for the entire course of a simulation. As a result, the system is eventually pushed to explore high free-energy regions and the estimate of the free energy calculated from the bias potential oscillates around the real value. @@ -55,16 +55,16 @@ where \f$ T \f$ is the temperature of the system. In the long time limit, the CVs thus sample an ensemble at a temperature \f$ T+\Delta T \f$ which is higher than the system temperature \f$ T \f$. 
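The sum of Gaussian kernels defining \f$ V(\vec{s},t) \f$ can be sketched in a few lines of Python (one CV only; the height, width and visited points are invented for illustration, and this is not PLUMED code):

```python
import math

def metad_bias(s, centers, height=1.2, sigma=0.35):
    """Metadynamics bias at CV value s: sum of deposited 1D Gaussian kernels."""
    return sum(height * math.exp(-(s - sk) ** 2 / (2.0 * sigma ** 2))
               for sk in centers)

# Pretend the CV lingered around s = -1 during the first few depositions.
visited = [-1.0, -0.95, -1.05, -1.0]

# The bias is largest where the system has already been...
b_here = metad_bias(-1.0, visited)
# ...and essentially zero far away, which is what pushes the system
# out of the local minimum toward unexplored regions.
b_far = metad_bias(2.0, visited)
```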
The parameter \f$ \Delta T \f$ can be chosen to regulate the extent of free-energy exploration: - \f$ \Delta T = 0\f$ corresponds to standard molecular dynamics, \f$ \Delta T \rightarrow \infty \f$ to standard + \f$ \Delta T = 0\f$ corresponds to standard molecular dynamics, \f$ \Delta T \rightarrow\infty\f$ to standard metadynamics. In well-tempered metadynamics literature and in PLUMED, you will often encounter -the term "biasfactor" which is the ratio between the temperature of the CVs (\f$ T+\Delta T \f$) +the term "bias factor" which is the ratio between the temperature of the CVs (\f$ T+\Delta T \f$) and the system temperature (\f$ T \f$): \f[ \gamma = \frac{T+\Delta T}{T}. \f] -The biasfactor should thus be carefully chosen in order for the relevant free-energy barriers to be crossed +The bias factor should thus be carefully chosen in order for the relevant free-energy barriers to be crossed efficiently in the time scale of the simulation. Additional information can be found in the several review papers on metadynamics @@ -87,10 +87,10 @@ Once this tutorial is completed students will know how to: The <a href="tutorial-resources/belfast-6.tar.gz" download="belfast-6.tar.gz"> tarball </a> for this project contains the following directories: - TOPO: it contains the gromacs topology and configuration files to simulate alanine dipeptide in vacuum -- Exercise_1: run a metadynamics simulation with 2 CVs, dihedrals phi and psi, and analyze the output +- Exercise_1: run a metadynamics simulation with 2 CVs, dihedral angles phi and psi, and analyze the output - Exercise_2: restart a metadynamics simulation - Exercise_3: calculate free energies from a metadynamics simulation and monitor convergence -- Exercise_4: run a well-tempered metadynamics simulation with 2 CVs, dihedrals phi and psi +- Exercise_4: run a well-tempered metadynamics simulation with 2 CVs, dihedral angles phi and psi - Exercise_5: run a well-tempered metadynamics simulation with 1 CV, dihedral psi 
\section belfast-6-instructions Instructions @@ -112,7 +112,7 @@ psi: TORSION ATOMS=7,9,15,17 # # Activate metadynamics in phi and psi # depositing a Gaussian every 500 time steps, -# with height equal to 1.2 kJoule/mol, +# with height equal to 1.2 kJ/mol, # and width 0.35 rad for both CVs. # metad: METAD ARG=phi,psi PACE=500 HEIGHT=1.2 SIGMA=0.35,0.35 FILE=HILLS @@ -127,7 +127,7 @@ The syntax for the command \ref METAD is simple. The directive is followed by a keyword ARG followed by the labels of the CVs on which the metadynamics potential will act. The keyword PACE determines the stride of Gaussian deposition in number of time steps, -while the keyword HEIGHT specifies the height of the Gaussian in kJoule/mol. For each CVs, one has +while the keyword HEIGHT specifies the height of the Gaussian in kJ/mol. For each CV, one has to specify the width of the Gaussian by using the keyword SIGMA. Gaussians will be written to the file indicated by the keyword FILE. @@ -141,7 +141,7 @@ gmx_mpi mdrun -plumed During the metadynamics simulation, PLUMED will create two files, named COLVAR and HILLS. The COLVAR file contains all the information specified by the PRINT command, in this case the value of the CVs every 10 steps of simulation, along with the current value of the metadynamics -bias potential. The HILLS file contains a list of the Gaussians deposited along the simulation. +bias potential. The HILLS file contains a list of the Gaussian kernels deposited along the simulation. If we have a look at the header of this file, we can find relevant information about its content: \verbatim @@ -155,7 +155,7 @@ If we give a look at the header of this file, we can find relevant information a The line starting with FIELDS tells us what is displayed in the various columns of the HILLS file: the time of the simulation, the value of phi and psi, the width of the Gaussian in phi and psi, -the height of the Gaussian, and the biasfactor.
+the height of the Gaussian, and the bias factor. This quantity is relevant only for well-tempered metadynamics simulation (see \ref belfast-6-exercise-4) and it is equal to 1 in standard metadynamics simulations. @@ -173,7 +173,7 @@ entire phase space. If we use the PLUMED input file described above, the expense of a metadynamics simulation increases with the length of the simulation as one has to evaluate -the values of a larger and larger number of Gaussians at every step. To avoid this issue you can +the values of a larger and larger number of Gaussian kernels at every step. To avoid this issue you can store the bias on a grid. In order to use grids, we have to add some additional information to the line of the \ref METAD directive, as follows. @@ -243,13 +243,13 @@ PRINT STRIDE=10 ARG=phi,psi,metad.bias FILE=COLVAR \endverbatim (see \ref RESTART, \ref TORSION, \ref METAD, and \ref PRINT). -In this way, PLUMED will read the old Gaussians from the HILLS file and append the new information +In this way, PLUMED will read the old Gaussian kernels from the HILLS file and append the new information to both COLVAR and HILLS files. \subsection belfast-6-exercise-3 Exercise 3. Calculate free-energies and monitor convergence One can estimate the free energy as a function of the metadynamics CVs directly from the metadynamics -bias potential. In order to do so, the utility \ref sum_hills should be used to sum the Gaussians +bias potential. In order to do so, the utility \ref sum_hills should be used to sum the Gaussian kernels deposited during the simulation and stored in the HILLS file. To calculate the two-dimensional free energy as a function of phi and psi, it is sufficient to use the @@ -286,7 +286,7 @@ The result should look like this: To assess the convergence of a metadynamics simulation, one can calculate the estimate of the free energy as a function of simulation time. At convergence, the reconstructed profiles should be similar, apart from a constant offset. 
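Both the grid trick and the sum_hills analysis boil down to accumulating the deposited Gaussians on a mesh and changing the sign. Here is a minimal Python sketch (not the actual sum_hills implementation; the hill positions and parameters are invented) that also mimics the effect of the --mintozero shift:

```python
import math

def fes_on_grid(hill_centers, smin=-math.pi, smax=math.pi, nbins=100,
                height=1.2, sigma=0.35):
    """Accumulate Gaussians on a grid; return grid and -bias with min set to 0."""
    grid = [smin + (smax - smin) * i / (nbins - 1) for i in range(nbins)]
    bias = [sum(height * math.exp(-(s - c) ** 2 / (2.0 * sigma ** 2))
                for c in hill_centers) for s in grid]
    fes = [-b for b in bias]          # free-energy estimate is minus the bias
    shift = min(fes)                  # align so the global minimum is at zero
    return grid, [f - shift for f in fes]

# Hills deposited mostly in one basin around s = -1 (illustrative data).
grid, fes = fes_on_grid([-1.0, -0.9, -1.1, -1.0, -0.95])

# The estimated free-energy minimum sits where most bias accumulated.
s_min = grid[fes.index(min(fes))]
```

Storing the bias on such a grid is also why the cost per step stays constant instead of growing with the number of deposited hills.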
-The option --stride should be used to give an estimate of the free energy every N Gaussians deposited, and +The option --stride should be used to give an estimate of the free energy every N Gaussian kernels deposited, and the option --mintozero can be used to align the profiles by setting the global minimum to zero. If we use the following command line: @@ -294,11 +294,11 @@ If we use the following command line: plumed sum_hills --hills HILLS --idw phi --kt 2.5 --stride 500 --mintozero \endverbatim -one free energy is calculated every 500 Gaussians deposited, and the global minimum is set to zero in all profiles. +one free energy is calculated every 500 Gaussian kernels deposited, and the global minimum is set to zero in all profiles. The resulting plot should look like the following: \anchor belfast-6-phifest-fig -\image html belfast-6-phifest.png "Estimates of the free energy as a function of the dihedral phi calculated every 500 Gaussians deposited along a 5ns-long metadynamics simulation using 2 CVs." +\image html belfast-6-phifest.png "Estimates of the free energy as a function of the dihedral phi calculated every 500 Gaussian kernels deposited along a 5ns-long metadynamics simulation using 2 CVs." To assess the convergence of the simulation more quantitatively, we can calculate the free-energy difference between the two local minima in the one-dimensional free energy along phi as a function of simulation time. @@ -323,8 +323,8 @@ This analysis, along with the observation of the diffusive behavior in the CVs s In this exercise, we will run a well-tempered metadynamics simulation on alanine dipeptide in vacuum, using as CVs the two backbone dihedral angles phi and psi. To activate well-tempered metadynamics, we need to add two keywords to the line of \ref METAD, which -specifies the biasfactor and temperature of the simulation. For the first example, -we will try a biasfactor equal to 6. 
Here how the PLUMED input file (plumed.dat) should look like: +specifies the bias factor and temperature of the simulation. For the first example, +we will try a bias factor equal to 6. Here is how the PLUMED input file (plumed.dat) should look: \verbatim # set up two variables for Phi and Psi dihedral angles @@ -336,7 +336,7 @@ psi: TORSION ATOMS=7,9,15,17 # with height equal to 1.2 kJoule/mol, # and width 0.35 rad for both CVs. # Well-tempered metadynamics is activated, -# and the biasfactor is set to 6.0 +# and the bias factor is set to 6.0 # metad: METAD ARG=phi,psi PACE=500 HEIGHT=1.2 SIGMA=0.35,0.35 FILE=HILLS BIASFACTOR=6.0 TEMP=300.0 @@ -348,7 +348,7 @@ PRINT STRIDE=10 ARG=phi,psi,metad.bias FILE=COLVAR After running the simulation using the instructions described above, we can have a look at the HILLS file. At variance with standard metadynamics, the last two columns of the HILLS file report more -useful information. The last column contains the value of the biasfactor used, while +useful information. The last column contains the value of the bias factor used, while the last but one contains the height of the Gaussian, which is rescaled during the simulation following the well-tempered recipe.
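The well-tempered rescaling just mentioned damps each new hill by the bias already accumulated at the current CV value, \f$ W(k\tau)=W_0\, e^{-V(s,k\tau)/k_B\Delta T} \f$. A small Python sketch (all numbers illustrative):

```python
import math

kBT = 2.5                      # k_B T in kJ/mol at roughly 300 K
gamma = 6.0                    # bias factor (T + DeltaT) / T
kB_dT = (gamma - 1.0) * kBT    # k_B * DeltaT for this bias factor

def next_height(w0, v_here):
    """Well-tempered Gaussian height, damped by the bias V already present."""
    return w0 * math.exp(-v_here / kB_dT)

# No bias deposited yet: the full initial height is used.
h_start = next_height(1.2, 0.0)
# After 10 kJ/mol of bias has piled up locally, new hills are much smaller.
h_late = next_height(1.2, 10.0)
```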
\anchor belfast-6-wtb6-fig -\image html belfast-6-wtb6.png "Time evolution of the CVs and Gaussian height during the first 2.5 ns of a well-tempered metadynamics simulation with biasfactor equal to 6." +\image html belfast-6-wtb6.png "Time evolution of the CVs and Gaussian height during the first 2.5 ns of a well-tempered metadynamics simulation with bias factor equal to 6." The system is initialized in one of the local minimum where it starts accumulating bias. As the simulation progresses and the bias added grows, the Gaussian height is progressively reduced. @@ -384,21 +384,21 @@ explore a new region of the phase space. As soon as this happens, the Gaussian h to the initial value and starts to decrease again. In the long time, the Gaussian height becomes smaller and smaller while the system diffuses in the entire CVs space. -We can now try a different biasfactor and see the effect on the simulation. -If we choose a biasfactor equal to 1.5, we can notice a faster decrease of the Gaussian height +We can now try a different bias factor and see the effect on the simulation. +If we choose a bias factor equal to 1.5, we can notice a faster decrease of the Gaussian height with simulation time, as expected by the well-tempered recipe. We will also conclude from the plot below that -this biasfactor is not large enough to allow for the system to escape from the initial local minimum +this bias factor is not large enough to allow for the system to escape from the initial local minimum in the time scale of this simulation. \anchor belfast-6-wtb15-fig -\image html belfast-6-wtb15.png "Time evolution of the CVs and Gaussian height in a 5 ns long well-tempered metadynamics simulation with biasfactor equal to 1.5." +\image html belfast-6-wtb15.png "Time evolution of the CVs and Gaussian height in a 5 ns long well-tempered metadynamics simulation with bias factor equal to 1.5." 
Following the procedure described for standard metadynamics in the previous example, we can estimate the free energy as a function of time and monitor the convergence of the simulations using the analyze_FES.sh script. We will do this for the simulation in which -the biasfactor was set to 6.0. In this case we will notice that the oscillations +the bias factor was set to 6.0. In this case we will notice that the oscillations observed in standard metadynamics are here damped, and the bias potential converges more -smoothly to the underlying free-energy landscape, provided that the biasfactor is +smoothly to the underlying free-energy landscape, provided that the bias factor is sufficiently high for the free-energy barriers of the system under study to be crossed. \anchor belfast-6-wtdifft-fig @@ -421,7 +421,7 @@ psi: TORSION ATOMS=7,9,15,17 # with height equal to 1.2 kJoule/mol, # and width 0.35 rad. # Well-tempered metadynamics is activated, -# and the biasfactor is set to 10.0 +# and the bias factor is set to 10.0 # metad: METAD ARG=psi PACE=500 HEIGHT=1.2 SIGMA=0.35 FILE=HILLS BIASFACTOR=10.0 TEMP=300.0 @@ -453,7 +453,7 @@ of freedom phi from one free-energy basin to another. The dynamics of phi is not so we need to equilibrate this degree of freedom, i.e. to observe multiple transitions from the two basins, before declaring convergence of our simulation. Or alternatively we can add phi to the set of CVs. This example demonstrates that to declare convergence of a well-tempered -metadynamics simulation it is necessary but not sufficient to observe: 1) Gaussians with very small height, 2) +metadynamics simulation it is necessary but not sufficient to observe: 1) Gaussian kernels with very small height, 2) a diffusive behavior in the CV space (as in the first 3 ns of this example). What we should do is to repeat the simulation multiple times starting from different initial conformations.
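The free-energy difference between two minima of a one-dimensional profile, used earlier to monitor convergence, can be computed by Boltzmann-integrating each basin separately, \f$ \Delta F = -k_B T \ln (Z_A/Z_B) \f$. A Python sketch (the toy double-well profile and the basin split point are invented; this is not the analyze_FES.sh script):

```python
import math

def basin_free_energy_diff(grid, fes, split, kBT=2.5):
    """dF = F_A - F_B from Boltzmann sums over the two sides of 'split'."""
    zA = sum(math.exp(-f / kBT) for s, f in zip(grid, fes) if s < split)
    zB = sum(math.exp(-f / kBT) for s, f in zip(grid, fes) if s >= split)
    return -kBT * (math.log(zA) - math.log(zB))

# Toy double-well profile: left minimum at 0, right minimum at 5 kJ/mol.
grid = [i * 0.1 - 3.0 for i in range(61)]
fes = [min((s + 2.0) ** 2 * 4.0, (s - 2.0) ** 2 * 4.0 + 5.0) for s in grid]

dF = basin_free_energy_diff(grid, fes, split=0.0)
# The deeper left basin has the lower free energy, so dF is negative,
# close to the -5 kJ/mol offset between the two wells.
```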
If in all simulations, we observe a diffusive behavior in the biased CV when the Gaussian height is very small, diff --git a/user-doc/tutorials/belfast-7.txt b/user-doc/tutorials/belfast-7.txt index ef3bf26fa..b1bf081ce 100644 --- a/user-doc/tutorials/belfast-7.txt +++ b/user-doc/tutorials/belfast-7.txt @@ -53,7 +53,7 @@ must be modified in order to account for the presence of a bias potential: \frac{1}{k_B T_j} \left [ V_G^{j}(s(R_j),t) - V_G^{j}(s(R_i),t) \right ], \f] -where \f$ V_G^{i} \f$ and \f$ V_G^{j} \f$ are the bias potentials acting on the i-th and j-th replicas, respectively. +where \f$ V_G^{i} \f$ and \f$ V_G^{j} \f$ are the bias potentials acting on the \f$i\f$th and \f$j\f$th replicas, respectively. PTMetaD is particularly effective because it compensates for some of the weaknesses of each method alone. The effect of neglecting a slow degree of freedom in the choice of the metadynamics CVs is alleviated by PT, @@ -67,7 +67,7 @@ well-tempered metadynamics CVs. The well-tempered metadynamics bias leads to the sampling of a well-defined distribution called Well-Tempered Ensemble (WTE) \cite Bonomi:2009p17935. In this ensemble, the average energy remains close to the canonical value -but its fluctuations are enhanced in a tunable way, thus improving sampling. +but its fluctuations are enhanced in a way that can be tuned, thus improving sampling. In the so-called PTMetaD-WTE scheme \cite ct300297t, each replica diffusion in temperature space is enhanced by the increased energy fluctuations at all temperatures. @@ -89,7 +89,7 @@ The <a href="tutorial-resources/belfast-7.tar.gz" download="belfast-7.tar.gz"> t - Exercise_3: run a PTMetaD simulation - Exercise_4: run a PT, PT-WTE and PTMetaD-WTE simulations -Each directory contains a TOPO subdirectory where topology and configuration files for Gromacs are stored. +Each directory contains a TOPO sub-directory where topology and configuration files for Gromacs are stored. 
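The modified exchange rule above can be written down directly. Below is an illustrative Python sketch of the acceptance probability for swapping two biased replicas (all energies, temperatures and bias values are made up; this is not the GROMACS/PLUMED implementation):

```python
import math

kB = 0.008314  # Boltzmann constant in kJ/(mol K)

def ptmetad_accept(T_i, T_j, U_i, U_j, Vi_i, Vi_j, Vj_j, Vj_i):
    """Acceptance probability for swapping replicas i and j.

    U_x  : potential energy of the configuration in replica x
    Vi_i : bias of replica i evaluated at the CVs of configuration i,
           and analogously for the other cross terms.
    """
    bi, bj = 1.0 / (kB * T_i), 1.0 / (kB * T_j)
    delta = ((bi - bj) * (U_i - U_j)      # plain parallel-tempering term
             + bi * (Vi_i - Vi_j)         # bias of replica i, both configs
             + bj * (Vj_j - Vj_i))        # bias of replica j, both configs
    return min(1.0, math.exp(delta))

# With all bias terms zero the rule reduces to plain parallel tempering.
p_pt = ptmetad_accept(300.0, 305.0, -100.0, -90.0, 0.0, 0.0, 0.0, 0.0)
```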
\section belfast-7-instructions Instructions @@ -122,8 +122,8 @@ To submit this simulation with Gromacs, we need the following command line. mpirun -np 2 gmx_mpi mdrun -s TOPO/topol -plumed -multi 2 -replex 100 \endverbatim -This command will execute two MPI processess in parallel, using the topology files -topol0.tpr and topol1.tpr stored in the TOPO subdirectory. These two binary files +This command will execute two MPI processes in parallel, using the topology files +topol0.tpr and topol1.tpr stored in the TOPO sub-directory. These two binary files have been created using the usual Gromacs procedure (see Gromacs manual for further details) and setting the temperature of the two simulations at 300K and 305K in the configuration files. An exchange between the configurations of the two simulations will be attempted every 100 steps. diff --git a/user-doc/tutorials/belfast-8.txt b/user-doc/tutorials/belfast-8.txt index 42ec57130..b713aa289 100644 --- a/user-doc/tutorials/belfast-8.txt +++ b/user-doc/tutorials/belfast-8.txt @@ -4,7 +4,7 @@ \section belfast-8-aim Aims The aim of this tutorial is to introduce the users to the use of Bias-Exchange Metadynamics. We will go through the writing of -the input files for BEMETA for a simple case of three peptide and we will use METAGUI to to analyse them. We will compare +the input files for BEMETA for a simple case of three peptides and we will use METAGUI to analyze them. We will compare the results of WT-BEMETA and STANDARD-BEMETA with four independent runs on the four Collective Variables. Finally we will use a simplified version of BEMETA that is Multiple Walkers Metadynamics. @@ -13,7 +13,7 @@ use a simplified version of BEMETA that is Multiple Walkers Metadynamics.
Once this tutorial is completed students will: - Know how to run a Bias-Exchange simulation using PLUMED and GROMACS -- Know how to analyse the results of BEMETA with the help of METAGUI +- Know how to analyze the results of BEMETA with the help of METAGUI - Know how to run a Multiple Walker simulation \section belfast-8-res Resources @@ -90,9 +90,9 @@ PRINT ARG=cv1,cv2,cv3,cv4 STRIDE=1000 FILE=COLVAR \endverbatim NOTE: -1. in COLVAR we \ref PRINT only the four collective variables, always in the same order in such a way that COLVARs are compatible +1. in COLVAR we \ref PRINT only the four collective variables, always in the same order in such a way that COLVAR files are compatible with METAGUI -2. if you want to print additional information, like the \ref METAD bias it is possibile to use additional \ref PRINT keyword +2. if you want to print additional information, like the \ref METAD bias, it is possible to use an additional \ref PRINT keyword \verbatim PRINT ARG=cv1,be.bias STRIDE=xxx FILE=BIAS @@ -114,11 +114,11 @@ The same simulation can be run using WELLTEMPERED metadynamics. \subsection belfast-8-bxcon Convergence of the Simulations In the resources for this tutorial you can find the results for a 40ns long Well-Tempered Bias Exchange simulation. First of all we can try to assess the convergence of the simulations by looking at the profiles. In the "convergence" folder there is a script that -calculates the free energy from the HILLS.0 file at incresing simulation lengths (i.e. every more 0.8 ns of simulation). The +calculates the free energy from the HILLS.0 file at increasing simulation lengths (i.e. every additional 0.8 ns of simulation). The scripts also generate two measures of the evolution of the profiles in time: 1. time-diff.HILLS.0: where the average deviation between two successive profiles is stored -2.
KL.HILLS.0: where it is stored the average deviation between profiles correctly weigheted for the free energy of the profiles themselves (Symmetrized Kullback-Lieber divergence) +2. KL.HILLS.0: where the average deviation between profiles, correctly weighted by the free energy of the profiles themselves, is stored (symmetrized Kullback-Leibler divergence) From both plots one can deduce that after 8 ns the profiles don't change significantly thus suggesting that averaging over the range 8-40ns should result in an accurate profile (we will test this using metagui). Another test is that of looking at the fluctuations of the profiles in a time window instead of looking @@ -129,15 +129,15 @@ at successive profiles: \subsection belfast-8-metagui Bias-Exchange Analysis with METAGUI In principle Bias-Exchange Metadynamics can give as a result only N 1D free energy profiles. But the information -contained in all the replicas can be used to recover multidensional free energy surfaces in >=N dimensions. A simple +contained in all the replicas can be used to recover multidimensional free energy surfaces in >=N dimensions. A simple way to perform this analysis is to use METAGUI. METAGUI performs the following operations: -1. Clusterizes the trajectories on a multidimensional GRID defined by at least the biased coordinates. +1. Clusters the frames in the trajectories on a multidimensional GRID defined by at least the biased coordinates. 2. By using the 1D free energy profiles and the clustering assigns a free energy to each cluster using a WHAM procedure. 3. Lets the user visualize the clusters. 4. Approximates the kinetics among clusters. -METAGUI (Biarnes et. al) is a plugin for VMD that implements the approch developed by Marinelli et. al 2009. It can be +METAGUI (Biarnes et al.) is a plugin for VMD that implements the approach developed by Marinelli et al. 2009. It can be downloaded from the PLUMED website.
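The symmetrized Kullback-Leibler measure used by the convergence script can be sketched as follows (plain Python; the two toy profiles are invented and stand in for successive free-energy estimates on the same grid):

```python
import math

def fes_to_prob(fes, kBT=2.5):
    """Boltzmann-weight a free-energy profile and normalize it."""
    w = [math.exp(-f / kBT) for f in fes]
    z = sum(w)
    return [x / z for x in w]

def sym_kl(fes1, fes2, kBT=2.5):
    """Symmetrized KL divergence between two profiles on the same grid."""
    p, q = fes_to_prob(fes1, kBT), fes_to_prob(fes2, kBT)
    kl_pq = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))
    kl_qp = sum(qi * math.log(qi / pi) for pi, qi in zip(p, q))
    return 0.5 * (kl_pq + kl_qp)

profile_a = [0.0, 2.0, 8.0, 2.5, 0.5]   # toy FES values in kJ/mol
profile_b = [0.1, 2.1, 7.5, 2.4, 0.6]   # a later, slightly different estimate

# Identical profiles give zero; nearby profiles give a small positive number.
d_same = sym_kl(profile_a, profile_a)
d_near = sym_kl(profile_a, profile_b)
```

Because the comparison is made between Boltzmann weights, differences in the low-free-energy regions count far more than differences on top of the barriers, which is the "correctly weighted" deviation described above.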
In order for the colvar and hills file to be compatible with METAGUI their header must be formatted as following: @@ -151,9 +151,9 @@ COLVAR.#: \endverbatim NOTE: -1. the COLVAR.# files should contain ALL the collective variables employed (all those biased in at least one replica plus those additionaly analysed). They MUST be named cv1 ... cvN. -2. the COLVAR.# files must be synchronised with the trajectories, this means that for each frame in the trajectory at time t there must be a line in each colvar at time t and viceversa. The best option is usually to analyse the trajectories a posteriori using plumed driver. -3. a keyword #! ACTIVE NBIASEDCV BIASEDCV LABEL is needed, where NBIASEDCV is the number of biased cv in that replica (not overall), BIASEDCV is the index of the biased cv in that replica (i.e. 1 for the first replica and so on); LABEL is a letter that identify the replica (usually is simply A for the first, B for the second and so on) this is usufull if two replicas are biasing the same collective variable: +1. the COLVAR.# files should contain ALL the collective variables employed (all those biased in at least one replica plus those additionally analyzed). They MUST be named cv1 ... cvN. +2. the COLVAR.# files must be synchronized with the trajectories; this means that for each frame in the trajectory at time t there must be a line in each colvar at time t and vice versa. The best option is usually to analyze the trajectories a posteriori using plumed driver. +3. a keyword #! ACTIVE NBIASEDCV BIASEDCV LABEL is needed, where NBIASEDCV is the number of biased collective variables in that replica (not overall), BIASEDCV is the index of the biased collective variable in that replica (i.e. 
1 for the first replica and so on); LABEL is a letter that identifies the replica (usually simply A for the first, B for the second and so on); this is useful if two replicas are biasing the same collective variable: \verbatim COLVAR.0: @@ -192,7 +192,7 @@ HILLS.#: \endverbatim The above notes hold for the HILLS files as well. In the folder metagui the script check_for_metagui.sh checks if the header of your file is compatible -with METAGUI, but remember that this is not enough! Synchronisation of COLVAR and trajectory files is also needed. HILLS files can be written with a +with METAGUI, but remember that this is not enough! Synchronization of COLVAR and trajectory files is also needed. HILLS files can be written with a different frequency but times must be consistent. NOTE: It is important to copy HILLS files in the metagui folder. @@ -241,27 +241,27 @@ Now let's start with the analysis: 3. In the left section of the interface "load all" the trajectories 4. Find the Microstates -In order to visualise the microstate it is convenient to align all the structures using the VMD RMSD Trajectory tool that can be found +In order to visualize the microstates it is convenient to align all the structures using the VMD RMSD Trajectory tool that can be found in Extensions->Analysis. -One or more microstates can be visualised by selecting them and clicking show. +One or more microstates can be visualized by selecting them and clicking show. You can sort the microstates using the column name tabs, for example by clicking on size the microstates will be ordered from the larger to the smaller. If you look at the largest one it is possible to observe that by using the four selected collective variables the backbone -conformation of the peptide is well defined while the sidechains can populate different rotameric states. +conformation of the peptide is well defined while the side chains can populate different rotameric states. 
The equilibrium time in the analysis panel should be such that by averaging over the two halves of the remind of the simulation the profiles are the same (i.e the profile averaged between Teq and Teq+(Ttot-Teq)/2 should be the same of that averaged from Teq+(Ttot-Teq)/2 and Ttot). By clicking on COMPUTE FREE ENERGIES, the program will first generate the 1D free energy profiles from the HILLS files and then run the WHAM analysis on the microstates. Once the analysis is done it is possible to visually check the convergence of the 1D profiles one by -one by clicking on the K bottons next to the HILLS.# files. The BLUE and the RED profiles are the two profiles just defined, while the GREEN +one by clicking on the K buttons next to the HILLS.# files. The BLUE and the RED profiles are the two profiles just defined, while the GREEN is the average of the two. Now it is possible for example to sort the microstates as a function of the free energy and save them by dumping the structures for further analysis. \anchor belfast-8-mg1-fig \image html belfast-8-mg1.png "METAGUI Interface after loading metagui.input file" -If you look in the metagui folder you will see a lot of files, some of them can be very usefull: +If you look in the metagui folder you will see a lot of files, some of which can be very useful: metagui/MICROSTATES: is the content of the microstates list table metagui/WHAM_RUN/VG_HILLS.#: are the opposite of the free energies calculated from the hills files @@ -284,7 +284,7 @@ and edit the script to select the two columns of MICROSTATES on which show the i \subsection belfast-8-mw Multiple Walker Metadynamics -Multiple Walker metadynamics is the simplest way to parallelise a MetaD calculation: multiple simulation of the same system +Multiple Walker metadynamics is the simplest way to parallelize a metadynamics calculation: multiple simulations of the same system are run in parallel using metadynamics on the same set of collective variables.
The deposited bias is shared among the replicas in such a way that the history dependent potential depends on the whole history. @@ -317,8 +317,8 @@ mpirun -np 4 gmx_mpi mdrun -s topol -plumed plumed -multi 4 >& log & \endverbatim alternatively Multiple Walkers can be run as independent simulations sharing via the file system the biasing potential, -this is usefull because it provides a parallelisation that does not need a parallel code. In this case the walkers read with -a given frequency the gaussians deposited by the others and add them to their own \ref METAD. +this is useful because it provides a parallelization that does not need a parallel code. In this case the walkers read with +a given frequency the Gaussian kernels deposited by the others and add them to their own \ref METAD. \section belfast-8-refer Reference @@ -326,8 +326,8 @@ This tutorial is freely inspired to the work of Biarnes et al. More materials can be found in -1. Marinelli, F., Pietrucci, F., Laio, A. & Piana, S. A kinetic model of trp-cage folding from multiple biased molecular dynamics simulations. PLoS Comput. Biol. 5, e1000452 (2009). -2. Biarnés, X., Pietrucci, F., Marinelli, F. & Laio, A. METAGUI. A VMD interface for analyzing metadynamics and molecular dynamics simulations. Comput. Phys. Commun. 183, 203–211 (2012). +1. Marinelli, F., Pietrucci, F., Laio, A. & Piana, S. A kinetic model of Trp-cage folding from multiple biased molecular dynamics simulations. PLoS Comput. Biol. 5, e1000452 (2009). +2. Biarnes, X., Pietrucci, F., Marinelli, F. & Laio, A. METAGUI. A VMD interface for analyzing metadynamics and molecular dynamics simulations. Comput. Phys. Commun. 183, 203–211 (2012). 3. Baftizadeh, F., Cossio, P., Pietrucci, F. & Laio, A. Protein folding and ligand-enzyme binding from bias-exchange metadynamics simulations. Current Physical Chemistry 2, 79–91 (2012). 4. Granata, D., Camilloni, C., Vendruscolo, M. & Laio, A. 
Characterization of the free-energy landscapes of proteins by NMR-guided metadynamics. Proc. Natl. Acad. Sci. U.S.A. 110, 6817–6822 (2013). 5. Raiteri, P., Laio, A., Gervasio, F. L., Micheletti, C. & Parrinello, M. Efficient reconstruction of complex free energy landscapes by multiple walkers metadynamics. J. Phys. Chem. B 110, 3533–3539 (2006). diff --git a/user-doc/tutorials/belfast-9a.txt b/user-doc/tutorials/belfast-9a.txt index 7283914dd..b8bd22db5 100644 --- a/user-doc/tutorials/belfast-9a.txt +++ b/user-doc/tutorials/belfast-9a.txt @@ -28,7 +28,7 @@ In the former tutorials it has been often discussed the possibility of measuring some kind of state for a system, i.e. \ref belfast-5. An alternative possibility is to use as a reference a set of experimental data that represent a state and measure the current deviation from the set. In plumed there are currently implemented the following NMR experimental observables: Chemical Shifts (only for proteins) \ref CS2BACKBONE, \ref NOE distances, \ref JCOUPLING, \ref PRE intensities, -and Residual Dipolar couplings/pseudocontact shifts \ref RDC. +and Residual Dipolar couplings/pseudo-contact shifts \ref RDC. In the following we will write the \ref CS2BACKBONE collective variable similar to the one used in Granata et al. (2013) (while the collective variable is still proportional to the square sum of the deviation of the calculated and experimental chemical shifts divided by a typical @@ -71,7 +71,7 @@ same time, instead the restraint is applied over an AVERAGED COLLECTIVE VARIABLE independent simulations of the same system in the same conditions. In this way the is not a single replica that must be in agreement with the experimental data but they should be in agreement on average. 
It has been shown that this approach is equivalent to solving the problem of finding a modified -version of the force field that will reproduce the provided set of experimental data withouth any additional assumption on the +version of the force field that will reproduce the provided set of experimental data without any additional assumption on the data themselves. The second example included in the resources show how the amber force field can be improved in the case of protein domain GB3 using @@ -93,7 +93,7 @@ ENDPLUMED \endverbatim with respect to the case in which chemical shifts are used to define a standard collective variable, in this case \ref CS2BACKBONE -is a collective variable with multiple components, that are all the backcalculated chemical shifts, plus all the relative experimental +is a collective variable with multiple components, that are all the back calculated chemical shifts, plus all the relative experimental values. The keyword function \ref ENSEMBLE tells plumed to calculate the average of the arguments over the replicas (i.e. 4 replicas) and the function \ref STATS compare the averaged back calculated chemical shifts with the experimental values and calculates the sum of the squared deviation. diff --git a/user-doc/tutorials/belfast-9b.txt b/user-doc/tutorials/belfast-9b.txt index 3d1e45e52..061793669 100644 --- a/user-doc/tutorials/belfast-9b.txt +++ b/user-doc/tutorials/belfast-9b.txt @@ -9,7 +9,7 @@ we will look at the transition from solid to liquid as this is easier to study u The CVs we will introduce can, and have, been used to study the transition from liquid to solid. More information can be found in the further reading section. -<b> You will need to ensure that you have compiled PLUMED with the crystalisation module enabled in order to complete this tutorial. </b> +<b> You will need to ensure that you have compiled PLUMED with the crystallization module enabled in order to complete this tutorial. 
</b> To learn how to activate this module consult the information in the manual about the \ref mymodules. \subsection belfast-10-lo Learning Outcomes @@ -26,7 +26,7 @@ know how to calculate such quantities using plumed. The <a href="tutorial-resources/belfast-9b.tar.gz" download="belfast-9b.tar.gz"> tarball </a> for this project contains the following files: - in : An input file for the simplemd code that forms part of plumed -- input.xyz : A configuration file for Lennard-Jones solid with an fcc solid structure +- input.xyz : A configuration file for Lennard-Jones solid with an FCC solid structure \section belfast-10-ins Instructions @@ -36,7 +36,7 @@ For this tutorial we will be using the MD code that is part of plumed - simplemd do the simulations of <a href="http://en.wikipedia.org/wiki/Lennard-Jones_potential"> Lennard-Jones </a> that we require here but not much else. We will thus start this tutorial by doing some simulations with this code. You should have two files from the tarball, the first is called input.xyz and is basically an -xyz file containing the posisitions of the atoms. The second meanwhile is called in and is the input to simplemd. +xyz file containing the positions of the atoms. The second meanwhile is called in and is the input to simplemd. If you open the file the contents should look something like this: \verbatim @@ -58,14 +58,14 @@ One thing that might be a little strange is the units. Simplemd works with Lenn which is why the temperature in the above file is apparently so low. Use simplemd to run 50000 step calculations at 0.2, 0.8 and 1.2 \f$\frac{k_B T}{\epsilon}\f$. (N.B. You will need -an empty file called plumed.dat in order to run these calculations.) Visualise the trajectory output by each of your simulations +an empty file called plumed.dat in order to run these calculations.) Visualize the trajectory output by each of your simulations using VMD. 
Please be aware that simplemd does not wrap the atoms into the cell box automatically. If you are at the tutorial we have resolved this problem by making it so that if you press w when you load all the atoms they will be wrapped into the box. At what temperatures did the simulations melt? What then is the melting point of the Lennard-Jones potential at this density? \subsection belfast-10-cvs Coordination Numbers -At the end of the previous section you were able to make very sophisticated judgements about whether or not the arrangment of atoms +At the end of the previous section you were able to make very sophisticated judgements about whether or not the arrangement of atoms in your system was solid-like or liquid-like by simply looking at the configuration. The aim in the rest of this tutorial is to see if we can derive collective variables that are able to make an equally sophisticated judgement. For our first attempt lets use a CV which we have encountered elsewhere, the \ref COORDINATIONNUMBER. Rerun the calculations from the previous section but at variance @@ -77,17 +77,17 @@ cc: COORDINATIONNUMBER SPECIES=1-108 SWITCH={RATIONAL D_0=1.3 R_0=0.2 D_MAX=3.0} PRINT ARG=cc.* FILE=colvar STRIDE=100 \endverbatim -Rerun the simpled simulations described at the end of the previous section. Is the average coordination number good at -differentiaing between solid and liquid configurations? Given your knowledge of physics/chemistry is this result surprising? +Rerun the simplemd simulations described at the end of the previous section. Is the average coordination number good at +differentiating between solid and liquid configurations? Given your knowledge of physics/chemistry is this result surprising? 
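For reference, the RATIONAL switching function appearing in the COORDINATIONNUMBER input in this hunk has, with PLUMED's default exponents NN=6 and MM=12 (an assumption, since the input does not set them), the form \f$ s(r) = \frac{1-((r-d_0)/r_0)^6}{1-((r-d_0)/r_0)^{12}} \f$. An illustrative Python sketch, not code from the tutorial resources:

```python
def rational_switch(r, d0=1.3, r0=0.2, nn=6, mm=12):
    """PLUMED-style RATIONAL switching function (default exponents assumed).
    Returns ~1 for r <= d0 and decays smoothly towards 0 beyond."""
    x = (r - d0) / r0
    if x <= 0.0:
        return 1.0                 # inside the contact distance: full count
    if abs(x - 1.0) < 1e-12:
        return float(nn) / mm      # removable singularity at x = 1
    return (1.0 - x ** nn) / (1.0 - x ** mm)

print(rational_switch(1.3))  # 1.0: a neighbour exactly at D_0 counts fully
```

Summing this function over all pairs is what turns a discontinuous neighbour count into the differentiable coordination number the bias needs.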
-\subsection belfast-10-steinhardt Steinhard parameter +\subsection belfast-10-steinhardt Steinhardt parameter The solid and liquid phases of a material are both relatively dense so the result at the end of the last section - the fact that the coordination number is not particularly good at differentiating between them - should not be that much of a surprise. As you -will have learnt early on in your scientific education when solids melt the atoms rearrange themselves in a much less ordered fashion. +will have learned early on in your scientific education when solids melt the atoms rearrange themselves in a much less ordered fashion. The bonds between them do not necessarily break it is just that, whereas in a the solid the bonds were all at pretty well defined angles to each other, in a liquid the spread of bond angles is considerably larger. To detect the transition from solid to liquid what we need then -is a coordinate that is able to recognise whether or not the geometry in the coordination spheres around each of the atoms +is a coordinate that is able to recognize whether or not the geometry in the coordination spheres around each of the atoms in the system are the same or different. If these geometries are the same the system is in a solid-like configuration, whereas if they are different the system is liquid-like. The Steinhardt parameters \ref Q3, \ref Q4 and \ref Q6 are coordinates that allow us to examine the coordination spheres of atoms in precisely this way. The way in which these coordinates are able to do this is explained in the @@ -107,15 +107,15 @@ coordination sphere of a particular atom are defined using a continuous switchin \subsection belfast-10-lvsg Local versus Global At the end of the previous section you showed that the average Q6 parameter for a system of atoms is able to tell you whether or not that collection -of atoms is in a solid-like or liquid-like configuration. 
In this section we will now ask the question - can the Steinhardt parameter always, unambiously +of atoms is in a solid-like or liquid-like configuration. In this section we will now ask the question - can the Steinhardt parameter always, unambiguously tell us whether particular tagged atoms are in a solid-like region of the material or whether they are in a liquid-like region of the material? This is -an important question that might come up if we are looking at nucleation of a solid from the melt. Our question in this context reads - how do we unambigously +an important question that might come up if we are looking at nucleation of a solid from the melt. Our question in this context reads - how do we unambiguously identify those atoms that are in the crystalline nucleus? A similar question would also come up when studying an interface between the solid and liquid phases. In this guise we would be asking about the extent of interface; namely, how far from the interface do we have to go before we can start thinking of the atoms as just another atom in the solid/liquid phase? With these questions in mind our aim is to look at the distribution of values for the Q6 parameters of atoms in our system of Lennard-Jones have. If Q6 is able -to unambigously tell us whether or not an atom is in a solid-like pose or a liquid-like pose then there should be no overlap in the distributions obtained for the +to unambiguously tell us whether or not an atom is in a solid-like pose or a liquid-like pose then there should be no overlap in the distributions obtained for the solid and liquid phases. If there is overlap, however, then we cannot use these coordinates for the applications described in the previous paragraph. 
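The overlap test described above can be made quantitative once the two Q6 histograms have been collected, e.g. with the overlapping coefficient (the sum of bin-wise minima of the normalized distributions). This is an illustrative Python sketch under that assumption, not part of the tutorial resources:

```python
def histogram_overlap(h1, h2):
    """Overlap coefficient of two histograms on the same bins:
    1.0 for identical distributions, 0.0 for disjoint ones."""
    n1, n2 = sum(h1), sum(h2)
    return sum(min(a / n1, b / n2) for a, b in zip(h1, h2))

# Disjoint histograms (e.g. clearly separated solid-like and
# liquid-like Q6 values) give zero overlap
print(histogram_overlap([5, 5, 0, 0], [0, 0, 5, 5]))  # 0.0
```

A value near zero would mean Q6 alone can classify atoms unambiguously; a sizable overlap means it cannot, which motivates the local Steinhardt parameters discussed next.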
To calculate these distributions you will need to run two simulations - one at 0.2 \f$\frac{k_B T}{\epsilon}\f$ and one at 1.2 \f$\frac{k_Bt}{\epsilon}\f$ using now the following PLUMED input file: @@ -134,7 +134,7 @@ Hopefully you saw that the distribution of Q6 parameters for the solid and liqui surprising - as you go from solid to liquid the distribution of the geometries of the coordination spheres widens. The system is now able to arrange the atoms in the coordination spheres in a much wider variety of different poses. Importantly, however, the fact that an atom is in a liquid does not preclude it from having a very-ordered, solid-like coordination environment. As such in the liquid state we will find the occasional atom with a high value -for the Q6 parameter. Consequently, an ordred coordination environment does not always differentiate solid-like atoms from liquid-like atoms. The major +for the Q6 parameter. Consequently, an ordered coordination environment does not always differentiate solid-like atoms from liquid-like atoms. The major difference is in the liquid the ordered atoms will always be isolated. That is to say in the liquid atoms with an ordered coordination will always be surrounded by atoms with a disordered coordination sphere. By contrast in the solid each ordered atom will be surrounded by further ordered atoms. This observation forms the basis of the local Steinhardt parameters and the locally averaged Steinhardt parameters that are explained in this @@ -151,8 +151,8 @@ DUMPGRID GRID=hh FILE=histo STRIDE=50000 Do the distributions of \ref LOCAL_Q6 parameter for the solid and liquid phases overlap? -Lectner and Dellago have designed an alternative to the \ref LOCAL_Q6 parameter that is based on taking the \ref LOCAL_AVERAGE of the Q6 parameter arround a central atom. This -quantity can be calcualted using plumed. If you have time try to use the manual to work out how. 
+Lechner and Dellago have designed an alternative to the \ref LOCAL_Q6 parameter that is based on taking the \ref LOCAL_AVERAGE of the Q6 parameter around a central atom. This +quantity can be calculated using plumed. If you have time try to use the manual to work out how. \section belfast-10-further Further Reading diff --git a/user-doc/tutorials/cambridge.txt b/user-doc/tutorials/cambridge.txt index 24602afec..e7dc3fd95 100644 --- a/user-doc/tutorials/cambridge.txt +++ b/user-doc/tutorials/cambridge.txt @@ -46,8 +46,8 @@ The <a href="tutorial-resources/cambridge-1.tar.gz" download="cambridge-1.tar.gz \subsection cambridge-exercise-1-0 Summary of theory In metadynamics, an external history-dependent bias potential is constructed in the space of -a few selected degrees of freedom \f$ \vec{s}({q}) \f$, generally called collective variables (CVs) \cite metad. -This potential is built as a sum of Gaussians deposited along the trajectory in the CVs space: +a few selected degrees of freedom \f$\vec{s}({q}) \f$, generally called collective variables (CVs) \cite metad. +This potential is built as a sum of Gaussian kernels deposited along the trajectory in the CVs space: \f[ V(\vec{s},t) = \sum_{ k \tau < t} W(k \tau) @@ -57,7 +57,7 @@ V(\vec{s},t) = \sum_{ k \tau < t} W(k \tau) \f] where \f$ \tau \f$ is the Gaussian deposition stride, -\f$ \sigma_i \f$ the width of the Gaussian for the ith CV, and \f$ W(k \tau) \f$ the +\f$ \sigma_i \f$ the width of the Gaussian for the \f$i\f$th CV, and \f$ W(k \tau) \f$ the height of the Gaussian. The effect of the metadynamics bias potential is to push the system away from local minima into visiting new regions of the phase space. Furthermore, in the long time limit, the bias potential converges to minus the free energy as a function of the CVs: @@ -66,7 +66,7 @@ time limit, the bias potential converges to minus the free energy as a function V(\vec{s},t\rightarrow \infty) = -F(\vec{s}) + C. 
\f] -In standard metadynamics, Gaussians of constant height are added for the entire course of a +In standard metadynamics, Gaussian kernels of constant height are added for the entire course of a simulation. As a result, the system is eventually pushed to explore high free-energy regions and the estimate of the free energy calculated from the bias potential oscillates around the real value. @@ -90,22 +90,22 @@ where \f$ T \f$ is the temperature of the system. In the long time limit, the CVs thus sample an ensemble at a temperature \f$ T+\Delta T \f$ which is higher than the system temperature \f$ T \f$. The parameter \f$ \Delta T \f$ can be chosen to regulate the extent of free-energy exploration: - \f$ \Delta T = 0\f$ corresponds to standard molecular dynamics, \f$ \Delta T \rightarrow \infty \f$ to standard + \f$ \Delta T = 0\f$ corresponds to standard molecular dynamics, \f$ \Delta T \rightarrow\infty\f$ to standard metadynamics. In well-tempered metadynamics literature and in PLUMED, you will often encounter -the term "biasfactor" which is the ratio between the temperature of the CVs (\f$ T+\Delta T \f$) +the term "bias factor" which is the ratio between the temperature of the CVs (\f$ T+\Delta T \f$) and the system temperature (\f$ T \f$): \f[ \gamma = \frac{T+\Delta T}{T}. \f] -The biasfactor should thus be carefully chosen in order for the relevant free-energy barriers to be crossed +The bias factor should thus be carefully chosen in order for the relevant free-energy barriers to be crossed efficiently in the time scale of the simulation. Additional information can be found in the several review papers on metadynamics \cite gerv-laio09review \cite WCMS:WCMS31 \cite WCMS:WCMS1103. 
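The two relations introduced in this hunk — the bias factor \f$ \gamma = (T+\Delta T)/T \f$ and the well-tempered rescaling of the deposited Gaussian height \f$ W = W_0\, e^{-V(\vec{s},t)/k_B \Delta T} \f$ — can be checked numerically. The snippet below is an illustrative Python sketch (the 300 K temperature and the other numbers are made-up examples, not values prescribed by the tutorial):

```python
import math

def bias_factor(temp, delta_temp):
    # gamma = (T + dT) / T: gamma = 1 is plain MD,
    # gamma -> infinity recovers standard metadynamics
    return (temp + delta_temp) / temp

def wt_height(w0, v_bias, kb_delta_t):
    # Well-tempered recipe: the deposited height decays exponentially
    # with the bias already accumulated at the current CV value
    return w0 * math.exp(-v_bias / kb_delta_t)

print(bias_factor(300.0, 1500.0))  # 6.0, the bias factor used in this exercise
print(wt_height(1.2, 0.0, 5.0))    # 1.2: full height before any bias is laid down
```

This is why the height column of the HILLS file shrinks as a well-tempered run progresses.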
-\subsection cambridge-exercise-1-1 Setup, run, and analyse a well-tempered metadynamics simulation +\subsection cambridge-exercise-1-1 Setup, run, and analyze a well-tempered metadynamics simulation In this exercise, we will run a well-tempered metadynamics simulation on alanine dipeptide in vacuum, using as CVs the two backbone dihedral angles phi and psi. @@ -119,7 +119,7 @@ psi: TORSION ATOMS=7,9,15,17 # # Activate metadynamics in phi and psi # depositing a Gaussian every 500 time steps, -# with height equal to 1.2 kJoule/mol, +# with height equal to 1.2 kJ/mol, # and width 0.35 rad for both CVs. # Well-tempered metadynamics is activated, # and the biasfactor is set to 6.0 @@ -136,7 +136,7 @@ The syntax for the command \ref METAD is simple. The directive is followed by a keyword ARG followed by the labels of the CVs on which the metadynamics potential will act. The keyword PACE determines the stride of Gaussian deposition in number of time steps, -while the keyword HEIGHT specifies the height of the Gaussian in kJoule/mol. +while the keyword HEIGHT specifies the height of the Gaussian in kJ/mol. To run well-tempered metadynamics, we need two additional keywords BIASFACTOR and TEMP, which specify how fast the bias potential is decreasing with time and the temperature of the simulation, respectively. For each CVs, one has to specify the width of the Gaussian by using the keyword SIGMA. Gaussian will be written @@ -152,10 +152,10 @@ gmx_mpi mdrun -plumed During the metadynamics simulation, PLUMED will create two files, named COLVAR and HILLS. The COLVAR file contains all the information specified by the PRINT command, in this case the value of the CVs every 10 steps of simulation, along with the current value of the metadynamics -bias potential. The HILLS file contains a list of the Gaussians deposited along the simulation. +bias potential. The HILLS file contains a list of the Gaussian kernels deposited along the simulation. 
If we give a look at the header of this file, we can find relevant information about its content. -The last column contains the value of the biasfactor used, while -the last but one the height of the Gaussian, which is rescaled during the simulation following +The last column contains the value of the bias factor used, while +the last but one contains the height of the Gaussian, which is rescaled during the simulation following the well-tempered recipe. \verbatim @@ -172,15 +172,15 @@ the well-tempered recipe. \endverbatim If we carefully look at the height column, we will notice that in the beginning the value -reported is higher than the initial height specified in the input file, which should be 1.2 kJoule/mol. -In fact, this column reports the height of the Gaussian rescaled by the pre-factor that +reported is higher than the initial height specified in the input file, which should be 1.2 kJ/mol. +In fact, this column reports the height of the Gaussian scaled by the pre-factor that in well-tempered metadynamics relates the bias potential to the free energy. In this way, when -we will use \ref sum_hills, the sum of the Gaussians deposited will directly provide the free-energy, +we use \ref sum_hills, the sum of the Gaussian kernels deposited will directly provide the free-energy, without further rescaling needed. We can plot the time evolution of the CVs along with the height of the Gaussian. -\image html belfast-6-wtb6.png "Time evolution of the CVs and Gaussian height during the first 2.5 ns of a well-tempered metadynamics simulation with biasfactor equal to 6." +\image html belfast-6-wtb6.png "Time evolution of the CVs and Gaussian height during the first 2.5 ns of a well-tempered metadynamics simulation with bias factor equal to 6." The system is initialized in one of the local minimum where it starts accumulating bias. As the simulation progresses and the bias added grows, the Gaussian height is progressively reduced. 
@@ -192,7 +192,7 @@ becomes smaller and smaller while the system diffuses in the entire CVs space. \subsection cambridge-exercise-1-2 Calculate free-energies and monitor convergence One can estimate the free energy as a function of the metadynamics CVs directly from the metadynamics -bias potential. In order to do so, the utility \ref sum_hills should be used to sum the Gaussians +bias potential. In order to do so, the utility \ref sum_hills should be used to sum the Gaussian kernels deposited during the simulation and stored in the HILLS file. To calculate the two-dimensional free energy as a function of phi and psi, it is sufficient to use the @@ -226,7 +226,7 @@ The result should look like this: To assess the convergence of a metadynamics simulation, one can calculate the estimate of the free energy as a function of simulation time. At convergence, the reconstructed profiles should be similar. -The option --stride should be used to give an estimate of the free energy every N Gaussians deposited, and +The option --stride should be used to give an estimate of the free energy every N Gaussian kernels deposited, and the option --mintozero can be used to align the profiles by setting the global minimum to zero. If we use the following command line: @@ -234,7 +234,7 @@ If we use the following command line: plumed sum_hills --hills HILLS --idw phi --kt 2.5 --stride 500 --mintozero \endverbatim -one free energy is calculated every 500 Gaussians deposited, and the global minimum is set to zero in all profiles. +one free energy is calculated every 500 Gaussian kernels deposited, and the global minimum is set to zero in all profiles. Try to visualize the different estimates of the free energy as a function of time. 
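A quantitative companion to comparing the time-sliced profiles produced by --stride is the free-energy difference between two regions of a 1D profile, obtained by Boltzmann-integrating each region: \f$ \Delta F = -k_B T \ln (Z_A/Z_B) \f$. The Python sketch below assumes a tabulated profile and an arbitrary dividing point (both made up for illustration):

```python
import math

def basin_delta_f(grid, profile, split, kt=2.5):
    """Free-energy difference (same units as 'profile', e.g. kJ/mol)
    between the basins s < split and s >= split of a 1D profile."""
    z_a = sum(math.exp(-f / kt) for s, f in zip(grid, profile) if s < split)
    z_b = sum(math.exp(-f / kt) for s, f in zip(grid, profile) if s >= split)
    return -kt * math.log(z_a / z_b)

# A flat profile over two equally sized basins gives a zero difference
grid = [-2.0, -1.0, 1.0, 2.0]
print(basin_delta_f(grid, [0.0, 0.0, 0.0, 0.0], 0.0))
```

When this difference stops drifting between successive --stride estimates, the simulation can be considered converged.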
To assess the convergence of the simulation more quantitatively, we can calculate the free-energy difference between the two @@ -270,7 +270,7 @@ This limitation can be severe when studying complex transformations or reactions more than say three relevant CVs can be identified. Alternative solutions employ metadynamics together with other enhanced sampling techniques (i.e. Parallel -Tempering or more generally Hamiltionan Replica Exchange). In particular in Bias-Exchange metadynamics the +Tempering or more generally Hamiltonian Replica Exchange). In particular in Bias-Exchange metadynamics the problem of sampling a multi-dimensional free-energy surface is recast in that of sampling many one-dimensional free-energies. In this approach a relatively large number N of CVs is chosen to describe the possible transformations of the system (e.g., to study the conformations of a peptide one may consider all the dihedral angles between amino acids). @@ -294,10 +294,10 @@ and \f$ V_{G}^{a(b)}\left(x,t\right) \f$ is the metadynamics potential acting on Each trajectory evolves through the high dimensional free energy landscape in the space of the CVs sequentially biased by different metadynamics potentials acting on one CV at each time. The results of the simulation are N one-dimensional projections of the free energy. -\subsection cambridge-exercise-2-1 Setup, run, and analyse a well-tempered bias-exchange metadynamics simulation +\subsection cambridge-exercise-2-1 Setup, run, and analyze a well-tempered bias-exchange metadynamics simulation -In the following example, a bias-exchange simulation is performed on Alanine di-peptide. A typical input file for BE-WTMetaD is the following: +In the following example, a bias-exchange simulation is performed on Alanine di-peptide. 
A typical input file for bias-exchange well-tempered metadynamics is the following: - a common input file in which all the collective variables are defined: @@ -371,8 +371,8 @@ variables with respect to a free energy surface obtained by sampling in two dime The main features are that all the replicas can sample all the relevant free energy minima even if their collective variable is not meant to sample them, only the replicas with a collective variable meant to pass a barrier can sample that transition, so high energy regions are sampled -only locally. This can seem a weakness in the case of alanine dipeptide but is the real strength of BE-MetaD, indeed by not knowing anything -of a system it is possible to choose as many collective variables as possible and even if most of them are not particularly usefull a posteriori, +only locally. This can seem a weakness in the case of alanine dipeptide but is the real strength of bias-exchange metadynamics: indeed, without knowing anything +about a system it is possible to choose as many collective variables as possible, and even if most of them are not particularly useful a posteriori, they all gain from the good ones and nothing is lost. \subsection cambridge-exercise-2-2 Calculate free-energies and monitor convergence @@ -398,7 +398,7 @@ The result should look like this (compared with the one obtained before): As above we can assess the convergence of a metadynamics simulation by calculating the estimate of the free energy as a function of simulation time. At convergence, the reconstructed profiles should be similar. -The option --stride should be used to give an estimate of the free energy every N Gaussians deposited, and +The option --stride should be used to give an estimate of the free energy every N Gaussian kernels deposited, and the option --mintozero can be used to align the profiles by setting the global minimum to zero. 
If we use the following command line:
@@ -527,13 +527,13 @@ RESTRAINT ARG=ed1.d1 KAPPA=100000 AT=0.656214

Step 2: Setup the Bias-Exchange simulations

We will run Bias-Exchange simulations using four CVs, two of them have already been chosen (see plumed-common.dat) while the others should be selected by you.
-Additionally Metadynamics parameters \ref METAD for the two selected cvs should be added in the plumed.dat.0 and plumed.dat.1 files. In particular while the
-PACE, HEIGHT and BIASFACTOR can be kept the same of those defined for the preselected CVs, the SIGMA should be half of the fluctuations of the
+Additionally, metadynamics parameters \ref METAD for the two selected collective variables should be added in the plumed.dat.0 and plumed.dat.1 files. In particular, while the
+PACE, HEIGHT and BIASFACTOR can be kept the same as those defined for the selected CVs, the SIGMA should be half of the fluctuations of the
chosen CV in an unbiased run (>1 ns).

Step 3: Run

-The simulation should be run until convergence of the four monodimensional free energies, this will tipically take more than 200 ns per replica.
+The simulation should be run until convergence of the four one-dimensional free energies; this will typically take more than 200 ns per replica.

Step 4: Analysis

diff --git a/user-doc/tutorials/cineca.txt b/user-doc/tutorials/cineca.txt
index 976c96c85..1811558ab 100644
--- a/user-doc/tutorials/cineca.txt
+++ b/user-doc/tutorials/cineca.txt
@@ -1,5 +1,5 @@
/**
-\page cineca Cineca tutorial
+\page cineca CINECA tutorial

\authors Max Bonomi, stealing most of the material from other tutorials. Alejandro Gil Ley is acknowledged for beta-testing this tutorial.
@@ -35,7 +35,7 @@ We can start by downloading the tarball and uncompressing it:
\endverbatim

\warning Exercises should be run inside the newly-created cineca directory,
-by creating new subdirectories, one per exercise.
+by creating new sub-directories, one per exercise. 
\section cineca-toymodel Alanine dipeptide: our toy model

@@ -93,7 +93,7 @@ GROMACS can be run (interactively) using the following command:
\verbatim
> gmx_mpi mdrun -s ../SETUP/topolA.tpr -nsteps 10000 -x traj.xtc
\endverbatim
-The nsteps flags can be used to change the number of timesteps and topolA.tpr is the name of the tpr file.
+The nsteps flag can be used to change the number of time steps and topolA.tpr is the name of the tpr file.
While running, GROMACS will produce a md.log file, with log information, and a traj.xtc file, with
a binary trajectory. The trajectory can be visualized with VMD using:
\verbatim
@@ -182,7 +182,7 @@ be printed in the output file you should give more information to the driver, e.

(see \ref driver)

In this case we inform the driver that the `traj.xtc` file was produced in a run with a timestep of 0.002 ps and
-saving a snapshop every 1000 timesteps.
+saving a snapshot every 1000 time steps.

You might want to analyze a different collective variable, such as the gyration radius.
The gyration radius tells how extended is a molecule in space.
@@ -202,7 +202,7 @@ PRINT ARG=phi,psi,gyr FILE=analyze
\endverbatim

(see \ref TORSION, \ref GYRATION, \ref GROUP, and \ref PRINT)

-Now try to compute the time serie of the gyration radius.
+Now try to compute the time series of the gyration radius.

\section cineca-biasing Biasing collective variables

@@ -222,8 +222,8 @@ all the biasing methods available in PLUMED can be found at the \ref Bias page.

\hidden{Summary of theory}
\subsubsection cineca-biasing-me-theory Summary of theory
In metadynamics, an external history-dependent bias potential is constructed in the space of
-a few selected degrees of freedom \f$ \vec{s}({q}) \f$, generally called collective variables (CVs) \cite metad.
-This potential is built as a sum of Gaussians deposited along the trajectory in the CVs space:
+a few selected degrees of freedom \f$ \vec{s}({q})\f$, generally called collective variables (CVs) \cite metad. 
+This potential is built as a sum of Gaussian kernels deposited along the trajectory in the CVs space: \f[ V(\vec{s},t) = \sum_{ k \tau < t} W(k \tau) @@ -233,7 +233,7 @@ V(\vec{s},t) = \sum_{ k \tau < t} W(k \tau) \f] where \f$ \tau \f$ is the Gaussian deposition stride, -\f$ \sigma_i \f$ the width of the Gaussian for the ith CV, and \f$ W(k \tau) \f$ the +\f$ \sigma_i \f$ the width of the Gaussian for the \f$i\f$th CV, and \f$ W(k \tau) \f$ the height of the Gaussian. The effect of the metadynamics bias potential is to push the system away from local minima into visiting new regions of the phase space. Furthermore, in the long time limit, the bias potential converges to minus the free energy as a function of the CVs: @@ -242,7 +242,7 @@ time limit, the bias potential converges to minus the free energy as a function V(\vec{s},t\rightarrow \infty) = -F(\vec{s}) + C. \f] -In standard metadynamics, Gaussians of constant height are added for the entire course of a +In standard metadynamics, Gaussian kernels of constant height are added for the entire course of a simulation. As a result, the system is eventually pushed to explore high free-energy regions and the estimate of the free energy calculated from the bias potential oscillates around the real value. @@ -266,16 +266,16 @@ where \f$ T \f$ is the temperature of the system. In the long time limit, the CVs thus sample an ensemble at a temperature \f$ T+\Delta T \f$ which is higher than the system temperature \f$ T \f$. The parameter \f$ \Delta T \f$ can be chosen to regulate the extent of free-energy exploration: - \f$ \Delta T = 0\f$ corresponds to standard MD, \f$ \Delta T \rightarrow \infty \f$ to standard + \f$ \Delta T = 0\f$ corresponds to standard MD, \f$ \Delta T \rightarrow\infty\f$ to standard metadynamics. 
In well-tempered metadynamics literature and in PLUMED, you will often encounter
-the term "biasfactor" which is the ratio between the temperature of the CVs (\f$ T+\Delta T \f$)
+the term "bias factor", which is the ratio between the temperature of the CVs (\f$ T+\Delta T \f$)
and the system temperature (\f$ T \f$):

\f[
\gamma = \frac{T+\Delta T}{T}.
\f]

-The biasfactor should thus be carefully chosen in order for the relevant free-energy barriers to be crossed
+The bias factor should thus be carefully chosen in order for the relevant free-energy barriers to be crossed
efficiently in the time scale of the simulation.

Additional information can be found in the several review papers on metadynamics
@@ -303,7 +303,7 @@ psi: TORSION ATOMS=7,9,15,17
#
# Activate well-tempered metadynamics in phi depositing
# a Gaussian every 500 time steps, with initial height equal
-# to 1.2 kJoule/mol, biasfactor equal to 10.0, and width to 0.35 rad
+# to 1.2 kJ/mol, bias factor equal to 10.0, and width to 0.35 rad

METAD ...
LABEL=metad
@@ -329,7 +329,7 @@ The syntax for the command \ref METAD is simple.
The directive is followed by a keyword ARG followed by the labels of the CVs
on which the metadynamics potential will act.
The keyword PACE determines the stride of Gaussian deposition in number of time steps,
-while the keyword HEIGHT specifies the height of the Gaussian in kJoule/mol. For each CVs, one has
+while the keyword HEIGHT specifies the height of the Gaussian in kJ/mol. For each CV, one has
to specified the width of the Gaussian by using the keyword SIGMA. Gaussian will be written
to the file indicated by the keyword FILE.
@@ -362,7 +362,7 @@ by the metadynamics bias potential to visit the other local minimum. As the simu
the bias potential fills the underlying free-energy landscape, and the system is able to diffuse
in the entire phase space.

-The HILLS file contains a list of the Gaussians deposited along the simulation. 
+The HILLS file contains a list of the Gaussian kernels deposited along the simulation. If we give a look at the header of this file, we can find relevant information about its content: \verbatim @@ -376,7 +376,7 @@ If we give a look at the header of this file, we can find relevant information a The line starting with FIELDS tells us what is displayed in the various columns of the HILLS file: the time of the simulation, the value of phi and psi, the width of the Gaussian in phi and psi, -the height of the Gaussian, and the biasfactor. +the height of the Gaussian, and the bias factor. We can use the HILLS file to visualize the decrease of the Gaussian height during the simulation, according to the well-tempered recipe: @@ -384,14 +384,14 @@ according to the well-tempered recipe: \image html munster-metad-phihills.png "Time evolution of the Gaussian height." If we look carefully at the scale of the y-axis, we will notice that in the beginning the value -of the Gaussian height is higher than the initial height specified in the input file, which should be 1.2 kJoule/mol. -In fact, this column reports the height of the Gaussian rescaled by the pre-factor that +of the Gaussian height is higher than the initial height specified in the input file, which should be 1.2 kJ/mol. +In fact, this column reports the height of the Gaussian scaled by the pre-factor that in well-tempered metadynamics relates the bias potential to the free energy. In this way, when -we will use \ref sum_hills, the sum of the Gaussians deposited will directly provide the free-energy, +we will use \ref sum_hills, the sum of the Gaussian kernels deposited will directly provide the free-energy, without further rescaling needed (see below). One can estimate the free energy as a function of the metadynamics CVs directly from the metadynamics -bias potential. In order to do so, the utility \ref sum_hills should be used to sum the Gaussians +bias potential. 
In order to do so, the utility \ref sum_hills should be used to sum the Gaussian kernels deposited during the simulation and stored in the HILLS file. To calculate the free energy as a function of phi, it is sufficient to use the following command line: @@ -418,7 +418,7 @@ The result should look like this: To assess the convergence of a metadynamics simulation, one can calculate the estimate of the free energy as a function of simulation time. At convergence, the reconstructed profiles should be similar. -The option --stride should be used to give an estimate of the free energy every N Gaussians deposited, and +The option --stride should be used to give an estimate of the free energy every N Gaussian kernels deposited, and the option --mintozero can be used to align the profiles by setting the global minimum to zero. If we use the following command line: @@ -426,11 +426,11 @@ If we use the following command line: plumed sum_hills --hills HILLS --stride 100 --mintozero \endverbatim -one free energy is calculated every 100 Gaussians deposited, and the global minimum is set to zero in all profiles. +one free energy is calculated every 100 Gaussian kernels deposited, and the global minimum is set to zero in all profiles. The resulting plot should look like the following: \anchor cineca-metad-phifest-fig -\image html munster-metad-phifest.png "Estimates of the free energy as a function of the dihedral phi calculated every 100 Gaussians deposited." +\image html munster-metad-phifest.png "Estimates of the free energy as a function of the dihedral phi calculated every 100 Gaussian kernels deposited." To assess the convergence of the simulation more quantitatively, we can calculate the free-energy difference between the two local minima of the free energy along phi as a function of simulation time. 
@@ -485,7 +485,7 @@ psi: TORSION ATOMS=7,9,15,17 # # Activate well-tempered metadynamics in psi depositing # a Gaussian every 500 time steps, with initial height equal -# to 1.2 kJoule/mol, biasfactor equal to 10.0, and width to 0.35 rad +# to 1.2 kJ/mol, bias factor equal to 10.0, and width to 0.35 rad METAD ... LABEL=metad @@ -661,7 +661,7 @@ PRINT ARG=phi,psi,res.bias Notice that here we are printing a quantity named `res.bias`. We do this because \ref RESTRAINT does not define a single value (that here would be theoretically named `res`) but a structure with several components. All biasing methods (including \ref METAD) do so, as well as many collective variables (see e.g. \ref DISTANCE used with COMPONENTS keyword). -Printing the bias allows one to know how much a given snapshop was penalized. +Printing the bias allows one to know how much a given snapshot was penalized. Also notice that PLUMED understands numbers in the form `{number}pi`. This is convenient when using torsions, since they are expressed in radians. diff --git a/user-doc/tutorials/hrex.txt b/user-doc/tutorials/hrex.txt index 71f28719d..0ccdbd17c 100644 --- a/user-doc/tutorials/hrex.txt +++ b/user-doc/tutorials/hrex.txt @@ -62,7 +62,7 @@ plumed partial_tempering $scale < processed.top > topol$i.top Warnings: - It's not very robust and there might be force-field dependent issues! -- It certainly does not work with charmm cmap +- It certainly does not work with CHARMM cmap Suggested tests: 1. Compare partial_tempering with scale=1.0 to non-scaled force field. E.g. @@ -137,7 +137,7 @@ is not neutral. There should be no problem in this. When used with pbc, GROMACS will add a compensating background. Suggested check: -- Try with several identical force fields (hardcode the same lambda for all replicas in the +- Try with several identical force fields (hard code the same lambda for all replicas in the script above) and different seed/starting point. 
Acceptance should be 1.0
Notice that when you run with GPUs acceptance could be different from 1.0.

diff --git a/user-doc/tutorials/julich-devel.txt b/user-doc/tutorials/julich-devel.txt
index 0be75ae92..ada8d540c 100644
--- a/user-doc/tutorials/julich-devel.txt
+++ b/user-doc/tutorials/julich-devel.txt
@@ -5,7 +5,7 @@
The aim of this tutorial is to introduce users to the tutorials in the developer manual of PLUMED.
We will learn a little bit about the structure of the code and how this structure
-can be visualised using the tree diagrams in the developer manual. You will then learn how
+can be visualized using the tree diagrams in the developer manual. You will then learn how
to implement a new collective variable in a way that uses as much of the functionality that
is already within the code and that thus avoids the duplication of code. Finally, you will learn
how to write regression tests and how these can be used for continuous integration.
@@ -65,7 +65,7 @@ manual, which can be found <a href="../../developer-doc/html/index.html"> here <
you can get to the front page of the developer manual by clicking the USER/DEVELOPER logo
in the top right hand corner of any page of the user manual.

-One of PLUMED's nicest features for the developer is that all the code and documentation for any new PLUMED command
+One of the nicest features of PLUMED for the developer is that all the code and documentation for any new PLUMED command
all appears together in a single file. When you come to implement a new feature it is thus
relatively unlikely that you will have to modify any of the files you downloaded. Adding a new
feature is simply a matter of adding one further cpp file containing the new method and the documentation for this
method. We are able to achieve this by exploiting
@@ -74,7 +74,7 @@ calculate biases thus inherit from a single base class called Action. You can r
<a href="../../developer-doc/html/class_p_l_m_d_1_1_action.html"> here </a>. 
Notice also that this page also shows how all the various classes within the code inherit from this single base class. It is perhaps worth spending a little while browsing through the various branches of this tree and understanding how the classes -at each level become increasingly specialised and thus fit for particular purposes. +at each level become increasingly specialized and thus fit for particular purposes. Lets say you want to implement a new collective variable in PLUMED. One way to start this task would be to write a new class that inherits from the <a href="../../developer-doc/html/class_p_l_m_d_1_1_colvar.html">PLMD::Colvar</a> base class. @@ -84,7 +84,7 @@ and follow various links on the various subject pages you will eventually get to for implementing a new collective variable. If you look through the manual you can find similar pages that provide you with instructions for implementing new analysis methods, functions and so on. Our suggestion when you implement something new would thus be to find some similar functionality in the code, look at how it is implemented and to look -in the developer manual at the descriptions of the classes that have been inheritted. This process is what we will +in the developer manual at the descriptions of the classes that have been inherited. This process is what we will try and take you through in this short tutorial. \section julich-devel-instruct Instructions @@ -117,7 +117,7 @@ you should see the output that was discussed above. In setting up your input files you may find it useful to watch our <a href="http://www.youtube.com/watch?v=PxJP16qNCYs"> introductory </a> video and <a href="http://www.youtube.com/watch?v=iDvZmbWE5ps"> this video </a> on some of the -features that are available in the code. Obviously, the user manual will be indespensible as well. +features that are available in the code. Obviously, the user manual will be indispensable as well. 
\subsection new-cv Implementing a new collective variable

@@ -154,7 +154,7 @@ std::numeric_limits<double>::max()
Write an input file that computes \f$s\f$ and outputs its value to a file called colvar and that computes both
the analytical and numerical derivatives of \f$s\f$ and outputs this information to a file called deriv. When outputting numbers to colvar and deriv
-you should output numbers to four and three decimal places repsectively. You are going to run this calculation
+you should output numbers to four and three decimal places respectively. You are going to run this calculation
in the rt-second-shell directory that you downloaded with the tarball for this exercise so you should create this
input file within that directory. When you are content with your input run the calculation by typing make.
If you have done everything correctly you should see the output that was discussed above.
@@ -176,8 +176,8 @@ tested and faster than anything that you will write so spend your time learning
- <B> Test your code </B> Notice that I set up directories that you could use to test your code for this tutorial. Testing should always be part of your development workflow and coming up with ways to test your code is hard, which is why we often don't do it enough. Also don't develop stuff in a vacuum get in touch with people who are using the code. You need to test your code on systems that people actually want to simulate and not
-just on dummy problems. In addition, you should learn to use profiling tools such as valgrind and instruments on the mac so that you can find
-memory leaks and bottlenecks in your code. When you start doing this properly you will realise that you cannot possibly properly maintain all the code
+just on dummy problems. In addition, you should learn to use profiling tools such as Valgrind and Instruments on the Mac so that you can find
+memory leaks and bottlenecks in your code. 
When you start doing this properly you will realize that you cannot possibly properly maintain all the code that you have written and you will see clearly that you that you need to write less code. - <B> Write documentation for your code </B> Undocumented software is useless. You need to explain how to use your code and to fill your manual with diff --git a/user-doc/tutorials/lugano-1.txt b/user-doc/tutorials/lugano-1.txt index d12007baf..693cf9196 100644 --- a/user-doc/tutorials/lugano-1.txt +++ b/user-doc/tutorials/lugano-1.txt @@ -4,7 +4,7 @@ \section lugano-1-aims Aims The aim of this tutorial is to introduce you to the PLUMED syntax. We will go through the writing of input files to calculate -and print simple collective variables. We will then discuss how we can use PLUMED to analyse a trajectory by calculating ensemble +and print simple collective variables. We will then discuss how we can use PLUMED to analyze a trajectory by calculating ensemble averages, histograms and free energy surfaces. \section lugano-1-lo Learning Outcomes @@ -20,16 +20,16 @@ Once this tutorial is completed students will: The <a href="tutorial-resources/lugano-1.tar.gz" download="lugano-1.tar.gz"> tarball </a> for this project contains the following files: -- trajectory-short.xyz : a (short) trajectory for a 16 residue protein in xyz format. All the calculations with plumed driver that will be performed during this tuturial will use this trajectory. -- template.pdb : a single frame from the trajectory that can be used in conjuction with the \ref MOLINFO command +- trajectory-short.xyz : a (short) trajectory for a 16 residue protein in xyz format. All the calculations with plumed driver that will be performed during this tutorial will use this trajectory. 
+- template.pdb : a single frame from the trajectory that can be used in conjunction with the \ref MOLINFO command
- in : An input file for the simplemd code that forms part of PLUMED
-- input.xyz : A configuration file for Lennard-Jones solid with an fcc solid structure
+- input.xyz : A configuration file for a Lennard-Jones solid with an FCC structure

\section lugano-1-instructions Instructions

PLUMED2 is a library that can be incorporated into many MD codes by adding a relatively simple and well documented interface.
-Once it is incorporated you can use PLUMED2 to perform a variety of different analyses on the fly and to bias
-the sampling in the molecular dynamics engine. PLUMED2 can also, however, be used as a standalone code for analysing trajectories.
+Once it is incorporated you can use PLUMED2 to perform a variety of different analyses on the fly and to bias
+the sampling in the molecular dynamics engine. PLUMED2 can also, however, be used as a standalone code for analyzing trajectories.
If you are using the code in this way you can, once PLUMED2 is compiled, run the plumed executable by issuing the command:

\verbatim
plumed
@@ -44,7 +44,7 @@ plumed --help

What is output when this command is run is a list of the tasks you can use PLUMED2 to perform. There are commands that allow you
to patch an MD code, commands that allow you to run molecular dynamics and commands that allow you to build the manual. In this
-tutorial we will mostly be using PLUMED2 to analyse trajectories, however. As such most of the calculations we will perform will be performed
+tutorial we will mostly be using PLUMED2 to analyze trajectories, however. As such most of the calculations we will perform will be performed
using the driver tool. Let's look at the options we can issue to plumed driver by issuing the following command:

\verbatim
@@ -73,7 +73,7 @@ to \ref PRINT these CVs and a termination line. 
Comments are denoted with a # and the termination of the input for PLUMED is marked with the keyword ENDPLUMED. Any words that follow the ENDPLUMED
line are ignored by PLUMED. You can also introduce blank lines as these will not be interpreted by PLUMED.

-The following input can be used analyse the \ref DISTANCE between the two terminal carbons of a 16 residues peptide. The \ref PRINT
+The following input can be used to analyze the \ref DISTANCE between the two terminal carbons of a 16-residue peptide. The \ref PRINT
command after the \ref DISTANCE command ensures that the results (i.e. the distances calculated) are printed into the file named COLVAR.

\verbatim
@@ -88,7 +88,7 @@ ENDPLUMED
here I can write what I want it won't be read.
\endverbatim

-Let's use this simple input file to analyse the trajectory included in the RESOURCES. To do so copy the input above into a file called
+Let's use this simple input file to analyze the trajectory included in the RESOURCES. To do so copy the input above into a file called
plumed.dat and then issue the command below:

\verbatim
@@ -107,7 +107,7 @@ find that the first two lines read:
\endverbatim

The first line of the COLVAR files tells you what values are in each of the columns. The remaining lines then tell you the values these quantities took
-in the various trajectory frames. We can plot this data using gnuplot (or our favourite graphing program) by issuing the following commands:
+in the various trajectory frames. We can plot this data using gnuplot (or our favorite graphing program) by issuing the following commands:

\verbatim
gnuplot
@@ -192,7 +192,7 @@ ENDPLUMED

Make a PLUMED input containing the above input and execute it on the trajectory that you downloaded at the start of the exercise by making use
of the commands in section \ref lugano-1-introinput

-Before we turn to analysing what is output from this calculation there are a few things to note about this input file. 
Firstly, I should describe what this file +Before we turn to analyzing what is output from this calculation there are a few things to note about this input file. Firstly, I should describe what this file instructs PLUMED to do. It tells PLUMED to: 1. calculate the position of the Virtual Atom 'first' as the \ref CENTER of atoms from 1 to 6; @@ -211,7 +211,7 @@ Notice also that ranges of atoms can be defined with a stride which can also be 3. ATOMS=from,to:by (i.e.: 251-256:2) 4. ATOMS=to,from:-by (i.e.: 256-251:-2) -Lets now return to the business of analysing what was output by the calculation. Lets begin by looking at the contents of the COLVAR file that was output. +Lets now return to the business of analyzing what was output by the calculation. Lets begin by looking at the contents of the COLVAR file that was output. When you do so you should find that the first few lines of this file read: \verbatim @@ -222,7 +222,7 @@ When you do so you should find that the first few lines of this file read: Notice that at variance with the file that was output in the previous section we now have three columns of numbers in the output file. Given the \ref PRINT command that we used in the input to this calculation though this new behavior makes a lot of sense. -Lets now plot contents of the COLVAR file so we can compare the behaviour of the distance between the two terminal carbons and the distance between the +Lets now plot contents of the COLVAR file so we can compare the behavior of the distance between the two terminal carbons and the distance between the centers of the mass of the two terminal residues in this trajectory (these two distances are what the above input is calculating). To plot this data issue the following commands: @@ -238,7 +238,7 @@ we are right. To look at the trajectory issue the following commands: vmd template.pdb trajectory-short.xyz \endverbatim -Lets summarise what we have learnt from these sections thus far. 
We have seen that: +Lets summarize what we have learned from these sections thus far. We have seen that: - PLUMED provides a simple syntax that can be used to calculate the \ref DISTANCE between any pair of atoms in our system. - PLUMED also allows us to calculate the positions of virtual atom (e.g. \ref CENTER) and that we can calculate the \ref DISTANCE between these quantities. @@ -246,13 +246,13 @@ Lets summarise what we have learnt from these sections thus far. We have seen t Now, obviously, PLUMED can do much more than calculate the distances between pairs of atoms as we will start to see that in the following sections. -\subsection lugano-1-torsions Calculating torsions +\subsection lugano-1-torsions Calculating torsion angles In the previous sections we have seen how we can use PLUMED to calculate distances and how by plotting these distances we can begin to simplify the high dimensional data contained in a trajectory. Obviously, calculating a \ref DISTANCE is not always the best way to simplify the information contained -in a trajectory and we often find we have to work with other more-complex quantities. PLUMED thus started as a libraray that was used to gather all the various -implementations people had written for different collective variables (CVs) that people had used to "analyse" trajectories over the years (analyse is in -inverted commas here because, as you will see if you continue to use PLUMED, we use CVs to do much more than simply analyse trajectories). +in a trajectory and we often find we have to work with other more-complex quantities. PLUMED thus started as a library that was used to gather all the various +implementations people had written for different collective variables (CVs) that people had used to "analyze" trajectories over the years (analyze is in +inverted commas here because, as you will see if you continue to use PLUMED, we use CVs to do much more than simply analyze trajectories). 
Now we will not have time to go over all the quantities that can be calculated in this tutorial. Once you understand the basic principles, however, you should be able to use the manual to work out how to calculate other quantities of interest. With this in mind then lets learn how to calculate a \ref TORSION. As with \ref DISTANCE the atoms that we specify in our \ref TORSION command can be real or virtual. In the example below two @@ -268,13 +268,13 @@ PRINT ARG=cvtor STRIDE=1 FILE=COLVAR ENDPLUMED \endverbatim -Copy this input to a PLUMED input file and use it to analyse the trajectory you downloaded at the start of this exercise by using the commands -descibred in section \ref lugano-1-introinput then plot the CV output using gnuplot. +Copy this input to a PLUMED input file and use it to analyze the trajectory you downloaded at the start of this exercise by using the commands +described in section \ref lugano-1-introinput then plot the CV output using gnuplot. As you can hopefully see calculating \ref TORSION values and other CVs is no more difficult than calculating \ref DISTANCE values. In fact it is -easier as generally when you calculate the torsions of a protein you often wish to calculate particular, named torsions (i.e. the \f$\phi\f$ and \f$\psi\f$ +easier as generally when you calculate the torsion angles of a protein you often wish to calculate particular, named torsion angles (i.e. the \f$\phi\f$ and \f$\psi\f$ angles). The \ref MOLINFO command makes it particularly easy to do this. For instance suppose that you want to calculate and print the \f$\phi\f$ angle -in the 6th residue of the protein and the \f$\psi\f$ angle in the 8th residue of the protein. You can do so using the following input: +in the sixth residue of the protein and the \f$\psi\f$ angle in the eighth residue of the protein. 
You can do so using the following input:

\verbatim
MOLINFO STRUCTURE=template.pdb
psi8: TORSION ATOMS=@psi-8
PRINT ARG=phi6,psi8 FILE=colvar
\endverbatim

-Copy this input to a PLUMED input file and use it to analyse the trajectory you downloaded at the start of this exercise by using the commands
-descibred in section \ref lugano-1-introinput then plot the CV output using gnuplot. Notice that you will need the template.pdb file you downloaded
+Copy this input to a PLUMED input file and use it to analyze the trajectory you downloaded at the start of this exercise by using the commands
+described in section \ref lugano-1-introinput then plot the CV output using gnuplot. Notice that you will need the template.pdb file you downloaded
at the start of this exercise in order for this to run.

\subsection lugano-1-gyration An exercise with the radius of gyration

-Lets now see if you can use everything you have learnt to setup a PLUMED input file of your own. What I would like you to do is to write
+Lets now see if you can use everything you have learned to set up a PLUMED input file of your own. What I would like you to do is to write
a PLUMED input file that measures the Radius of Gyration \ref GYRATION for the configurations in each of the frames in the trajectory
that you downloaded at the start of this exercise. This radius of gyration will be calculated using the positions of all the atoms in that trajectory.
@@ -305,9 +305,9 @@ Now 'ca' is not a virtual atom but a simple list of atoms.

\subsection lugano-1-multicol Coordination numbers

-In the previous sections we have learnt how PLUMED can be used to calculate simple functions of atomic positions such as the
+In the previous sections we have learned how PLUMED can be used to calculate simple functions of atomic positions such as the
\ref DISTANCE between pairs of atoms. 
As discussed here (https://www.youtube.com/watch?v=iDvZmbWE5ps) many of the more complicated -collective variables that we use to analyse simulations can be calculated by computing these simple quantities multiple times for different +collective variables that we use to analyze simulations can be calculated by computing these simple quantities multiple times for different sets of atoms. That is to say we can calculate many more complicated collective variables using: \f[ @@ -319,7 +319,7 @@ a scalar. The sum then runs over the different sets of atoms from which the qua so lets make it more concrete by considering an example. Suppose that we want to calculate the coordination number of atom \f$k\f$. What we need to do is: -1. We need to calculate the distance between atom \f$k\f$ and every atom in the system that is not atom \f$k\f$. This will be the set of sets of atoms that we have to perform the sum above on. Furthermore, the function \f$f\f$ in the above will be pythagoras theorem, which is the function we use to calculate the distance between a pair of atoms. +1. We need to calculate the distance between atom \f$k\f$ and every atom in the system that is not atom \f$k\f$. This will be the set of sets of atoms that we have to perform the sum above on. Furthermore, the function \f$f\f$ in the above will be Pythagoras theorem, which is the function we use to calculate the distance between a pair of atoms. 2. We need to transform the distances calculated by a switching function (\f$f\f$ in the above) that is equal to one if the distance is less than or equal to the typical length of a chemical bond and zero otherwise. Lets thus use PLUMED to calculate the coordination number of atom 9. We can do this as follows: @@ -329,7 +329,7 @@ d1: DISTANCES GROUPA=9 GROUPB=1-8,10-256 LESS_THAN={RATIONAL D_0=0.16 R_0=0.01 D PRINT ARG=d1.* FILE=colvar \endverbatim -Copy this input file to a PLUMED input file. 
Before using it to analyse the trajectory that you downloaded at the start of the exercise using the commands +Copy this input file to a PLUMED input file. Before using it to analyze the trajectory that you downloaded at the start of the exercise using the commands described in section \ref lugano-1-introinput try to guess what value this coordination number will take. Hint: what element is atom 9? Now see if you can adjust the above input to calculate the coordination number of atom 5. What is the coordination number of this atom and why does it take @@ -351,10 +351,10 @@ PRINT ARG=dd.mean,dd.min,dd.max,dd.moment-2 STRIDE=1 FILE=COLVAR ENDPLUMED \endverbatim -Try to copy this input now and to use it to analyse the trajectory you downloaded at the start of the exercise using the commands +Try to copy this input now and to use it to analyze the trajectory you downloaded at the start of the exercise using the commands described in section \ref lugano-1-introinput. -Multicolvar is not just for \ref DISTANCES though. The infrastracture of multicolvar has been used to develop many PLUMED2 collective variables. +Multicolvar is not just for \ref DISTANCES though. The infrastructure of multicolvar has been used to develop many PLUMED2 collective variables. One interesting example is the set of Secondary Structure CVs (\ref ANTIBETARMSD, \ref PARABETARMSD and \ref ALPHARMSD). You can use the input below to calculate the degree of anti-beta secondary structure in each of the trajectory frames by copying this input to a PLUMED input file and by exploiting the commands to run driver that were described in section \ref lugano-1-introinput. @@ -373,7 +373,7 @@ We can do a large number of other things with multicolvar. 
If you are intereste \subsection lugano-1-averagesintro Understanding the need for ensemble averages -In the previous sections we have learnt how we can use PLUMED to calculate collective variables from simulation trajectories and have seen how, +In the previous sections we have learned how we can use PLUMED to calculate collective variables from simulation trajectories and have seen how, by plotting how these collective variables change as a function of time, we can get a handle on what happens during the trajectory. Generally this level of analysis is not sufficient for us to publish a paper about the simulations we have run, however. In this section we are going to run some molecular dynamics simulations in order to understand why. @@ -408,7 +408,7 @@ positions of all other atoms were identical. In spite of these similarities, ho are very different. This is important as it tells us that the time series of CV values that we get from a single MD simulation is (in and of itself) not particularly valuable as we would have got a completely different time series of values if we had run the calculation with only a slightly different input structure. We should, therefore, <b> think of the trajectory output from a molecular dynamics simulations as a time series of random variables </b> -and analyse it accordingly. +and analyze it accordingly. The justification for thinking of a trajectory as a time series of random variables is based on statistical mechanics which tells us that a system with a constant number of atoms and a constant volume evolving at a fixed temperature has a probability: @@ -446,7 +446,7 @@ were added together. This works because of results known as the law of large nu - http://gtribello.github.io/mathNET/LAW_OF_LARGE_NUMBERS.html - http://gtribello.github.io/mathNET/CENTRAL_LIMIT_THEOREM.html -Alternatively, we can justify this way of analysing our trajectory by thinking of the set of sampled frames as random variables generated from Markov chain. 
+Alternatively, we can justify this way of analyzing our trajectory by thinking of the set of sampled frames as random variables generated from a Markov chain.

These mathematical objects are discussed here:

- http://gtribello.github.io/mathNET/MARKOV_PROPERTY.html
@@ -454,7 +454,7 @@ These mathematical objects are discussed here:
- http://gtribello.github.io/mathNET/TRANSIENT_RECURRENT_STATES.html
- http://gtribello.github.io/mathNET/STATIONARY_DIST_MARKOV.html

-Regardlessly, however, and as we will see in the following section we can estimate ensemble averages of collective variables using:
+Regardless, however, and as we will see in the following section we can estimate ensemble averages of collective variables using:

\f[
\langle A \rangle \approx \frac{1}{T} \sum_{t} A[q(t)]
@@ -462,7 +462,7 @@ Regardlessly, however, and as we will see in the following section we can estima
where the sum runs over the set of \f$T\f$ microstates, \f$q(t)\f$, in our trajectory and where \f$A[q(t)]\f$ is the function that is used to calculate
the value of the collective variable from the positions of the atoms in the microstate. When we do so we find that the values of the ensemble averages from different
-trajectories are reasonably consisitent and certainly much more consisitent than the set of instantaneous CV values.
+trajectories are reasonably consistent and certainly much more consistent than the set of instantaneous CV values.

\subsection lugano-1-averages Calculating ensemble averages using PLUMED

@@ -529,8 +529,8 @@ trajectories. 
\verbatim d1: DISTANCE ATOMS=2,3 -hh: HISTOGRAM ARG=d1 BANDWIDTH=0.05 GRID_MIN=0 GRID_MAX=4.0 GRID_BIN=200 STRIDE=10 -DUMPGRID GRID=hh FILE=histo STRIDE=25000 +histogram: HISTOGRAM ARG=d1 BANDWIDTH=0.05 GRID_MIN=0 GRID_MAX=4.0 GRID_BIN=200 STRIDE=10 +DUMPGRID GRID=histogram FILE=histo STRIDE=25000 \endverbatim You can plot the histogram output from this simulation using: @@ -549,19 +549,19 @@ that tells us how often data should be added to to the accumulated averages. Fu with its own STRIDE keyword (\ref PRINT for \ref AVERAGE and \ref DUMPGRID for \ref HISTOGRAM) that tells us how frequently the accumulated averages should be output to human readable files. -A substantial difference between these two input files is the object the label hh refers to. In all previous examples the label of an action has referred to a -scalar quantity that was calculated by that action. In the example above, however, the label hh refers to the function accumulated on a grid that is evaluated +A substantial difference between these two input files is the object the label histogram refers to. In all previous examples the label of an action has referred to a +scalar quantity that was calculated by that action. In the example above, however, the label histogram refers to the function accumulated on a grid that is evaluated by the \ref HISTOGRAM command. This is why we cannot print this quantity using \ref PRINT - it is not a simple scalar valued function anymore. As you become more familiar with PLUMED you will find that these labels can refer to a range of different types of mathematical object. -Lets now see if we can bring together everything we have learnt in this tutorial in order to analyse the protein trajectory that was downloaded at the start of +Lets now see if we can bring together everything we have learned in this tutorial in order to analyze the protein trajectory that was downloaded at the start of the exercise. 
\subsection lugano-1-analysis A histogram for the protein trajectory

We are going to calculate the \ref HISTOGRAM from our protein trajectory as a function of two different collective variables: \ref ANTIBETARMSD
and the average distance between the ca atoms of our protein backbone. The input that allows us to
-claculate perform this analysis is shown below:
+perform this analysis is shown below:

\verbatim
# Read in protein structure template
@@ -575,7 +575,7 @@ PRINT ARG=abeta.lessthan,dd.mean,dd.min,dd.max,dd.moment-2 STRIDE=1 FILE=COLVAR
# Accumulate histogram of collective variables
hh: HISTOGRAM ARG=abeta.lessthan,dd.mean KERNEL=DISCRETE GRID_MIN=0,0.8 GRID_MAX=4,1.2 GRID_BIN=40,40
# Output histogram - N.B. when we are running with driver we can not provide a STRIDE.
-# The histogram will then only be output once all the trajectory frames have been read in and analysed
+# The histogram will then only be output once all the trajectory frames have been read in and analyzed
DUMPGRID GRID=hh FILE=histo
\endverbatim

@@ -588,7 +588,7 @@ set pm3d map
sp 'histo' w pm3d
\endverbatim

-If you do so though, you will probably find that the structure in the histogram, \f$H(s)\f$, is a bit difficult to visualise because the probability changes
+If you do so though, you will probably find that the structure in the histogram, \f$H(s)\f$, is a bit difficult to visualize because the probability changes
from point to point by multiple orders of magnitude. This is why we often convert the histogram, \f$H(s)\f$, to a free energy surface, \f$F(s)\f$,
using:

@@ -611,7 +611,7 @@ PRINT ARG=abeta.lessthan,dd.mean,dd.min,dd.max,dd.moment-2 STRIDE=1 FILE=COLVAR
hh: HISTOGRAM ARG=abeta.lessthan,dd.mean KERNEL=DISCRETE GRID_MIN=0,0.8 GRID_MAX=4,1.2 GRID_BIN=40,40
fes: CONVERT_TO_FES GRID=hh TEMP=300
# Output free energy - N.B. when we are running with driver we can not provide a STRIDE. 
-# The histogram will then only be output once all the trajectory frames have been read in and analysed +# The histogram will then only be output once all the trajectory frames have been read in and analyzed DUMPGRID GRID=fes FILE=fes.dat \endverbatim @@ -623,7 +623,7 @@ If you have worked through all of this tutorial make sure that you have understo in section \ref lugano-1-lo means and that you can use PLUMED to perform all these tasks. In terms of further work you should investigate issues related to the convergence of estimates of ensemble averages such as block averaging. You might like to investigate how long your simulations have to be in order to obtain reliable estimates of the ensemble average for a collective variable and reliable estimates for the free energy as a function of a collective variable. -Alternatively, you might like to explore other collective variables that could be used to analyse the protein trajectory that you have worked on in this tutorial. +Alternatively, you might like to explore other collective variables that could be used to analyze the protein trajectory that you have worked on in this tutorial. */ diff --git a/user-doc/tutorials/lugano-2.txt b/user-doc/tutorials/lugano-2.txt index d96e6d4fe..7c492082e 100644 --- a/user-doc/tutorials/lugano-2.txt +++ b/user-doc/tutorials/lugano-2.txt @@ -15,15 +15,15 @@ cases - and no I have absolutely no idea as to how to create a collective variab These answers are interesting as they cut to the very heart of what is interesting about biomolecular systems. In fact this difficulty is one of the reasons why such systems are so widely studied. If you think for a moment about solid state systems any transition usually involves a substantial change in symmetry. Low energy configurations are usually high symmetry while higher energy configurations have a low symmetry. 
This makes it easy to design collective variables -to study solid state transitions - you simply measure the degree of symmetry in the system (see \ref belfast-10). In biomolecular systems by constrast the +to study solid state transitions - you simply measure the degree of symmetry in the system (see \ref belfast-10). In biomolecular systems by contrast the symmetry does not change substantially during a folding transition. The unfolded state has a low symmetry but the folded state also has a low symmetry, which is part of the reason that it is so difficult to find the folded state from the amino acid sequence alone. With all this in mind the purpose of this tutorial is to learn about how we can design collective variables that can be used to study transitions between different states of these low-symmetry systems. In particular, we are going to learn how we can design collective variables that describe how far we have progressed along some pathway between two configurations with relatively low symmetry. We will in most of this -tutorial study how these coordinates work in a two-dimensional space as this will allow us to visualise what we are doing. Hopefully, however, -you will be able to use what you learn from this tutorial to generalise these ideas so that you can use \ref PATH and \ref PCAVARS in +tutorial study how these coordinates work in a two-dimensional space as this will allow us to visualize what we are doing. Hopefully, however, +you will be able to use what you learn from this tutorial to generalize these ideas so that you can use \ref PATH and \ref PCAVARS in higher-dimensional spaces. \section lugano-2-lo Learning Outcomes @@ -40,11 +40,11 @@ Once this tutorial is completed students will: The <a href="tutorial-resources/lugano-2.tar.gz" download="lugano-2.tar.gz"> tarball </a> for this project contains the following files: -- transformation.pdb : a trajectory that shows the transition between the C7ax and C7eq conformers of alanine dipeptide. 
-- pca-reference.pdb : a file that gives the start and end points of the vector that connects the C7ax and C7eq conformers. This file contains the positions of the atoms in these two structures. -- PCA-isocommittor : a directory containing the files required to run isocommitor simulations that monitor the values of \f$\phi\f$, \f$\psi\f$ and the PCA coordinate. -- PATH-isocommittor : a directory containing the files required to run isocommitor simulations that monitor the values of the \ref PATH collective variable \f$S(X)\f$. -- 2CV-isocommittor : a directory containing the files required to run isocommitor simulations that monitor the values of the \ref PATH collective variables \f$S(X)\f$ and \f$Z(X)\f$ +- transformation.pdb : a trajectory that shows the transition between the \f$C_7ax\f$ and \f$C_7eq\f$ conformers of alanine dipeptide. +- pca-reference.pdb : a file that gives the start and end points of the vector that connects the \f$C_7ax\f$ and \f$C_7eq\f$ conformers. This file contains the positions of the atoms in these two structures. +- PCA-isocommittor : a directory containing the files required to run isocommittor simulations that monitor the values of \f$\phi\f$, \f$\psi\f$ and the PCA coordinate. +- PATH-isocommittor : a directory containing the files required to run isocommittor simulations that monitor the values of the \ref PATH collective variable \f$S(X)\f$. +- 2CV-isocommittor : a directory containing the files required to run isocommittor simulations that monitor the values of the \ref PATH collective variables \f$S(X)\f$ and \f$Z(X)\f$ \section lugano-2-instructions Instructions @@ -54,9 +54,9 @@ between the two conformers of this molecule shown below: \anchor lugano-2-transition-fig \image html belfast-2-transition.png "Two metastable states of alanine dipeptide are characterized by their Ramachandran dihedral angles." -Alanine dipeptide is a rather well-studied biomolecule (in fact it is an overstudied molecule!). 
It is well known that you can
-understand the interconversion of the two conformers shown above by looking at the free energy surface as a function of the \f$\phi\f$ and \f$\psi\f$
-ramachandran angles as shown below:
+Alanine dipeptide is a rather well-studied biomolecule (in fact it is an overstudied molecule!). It is well known that you can
+understand the interconversion of the two conformers shown above by looking at the free energy surface as a function of the \f$\phi\f$ and \f$\psi\f$
+Ramachandran angles as shown below:

\anchor lugano-2-rama-fig
\image html belfast-2-rama.png "The Free energy landscape of alanine dipeptide in Ramachandran angles in the CHARMM27 force field."

@@ -70,13 +70,13 @@ Consider the free energy surface shown in figure \ref lugano-2-rama-fig. It is
\f$\psi\f$ (\f$y\f$-axis) angle of the molecule can be used to distinguish between the two configurations shown in figure
\ref lugano-2-transition-fig. Having said that, however, given the shape of landscape and the associated thermal
fluctuations we would expect to see in the values of these angles during a typical simulation, it seems likely that \f$\phi\f$ will do a better job
-at distinguishing between the two configurations. \f$\psi\f$ would most likely be a bad coordinate as when the molecule is in the C7eq
+at distinguishing between the two configurations. \f$\psi\f$ would most likely be a bad coordinate as when the molecule is in the \f$C_7eq\f$
configuration the \f$\psi\f$ angle can fluctuate to any value with only a very small energetic cost. If we only had information on how the \f$\psi\f$
-angle changed during a simulation we would thus struggle to distinguish a transition to the C7ax configuration from a thermal fluctuations.
-Unsurprinsgly then it has been shown that metadynamics simulations that use just the \f$\phi\f$ angle as a collective variable can effectively
-drive the system from the C7ax configuration to the C7eq configuration. 
We will not repeat these calculations here but will instead use driver to determine
-how the values of \f$\phi\f$ and \f$\psi\f$ change as we move between along the transition path that connects the C7ax configuration to the
-C7eq state.
+angle changed during a simulation we would thus struggle to distinguish a transition to the \f$C_7ax\f$ configuration from thermal fluctuations.
+It has been shown that metadynamics simulations that use just the \f$\phi\f$ angle as a collective variable can effectively
+drive the system from the \f$C_7ax\f$ configuration to the \f$C_7eq\f$ configuration. We will not repeat these calculations here but will instead use driver to determine
+how the values of \f$\phi\f$ and \f$\psi\f$ change as we move along the transition path that connects the \f$C_7ax\f$ configuration to the
+\f$C_7eq\f$ state.

\verbatim
t1: TORSION ATOMS=2,4,6,9
@@ -84,7 +84,7 @@ t2: TORSION ATOMS=4,6,9,11
PRINT ARG=t1,t2 FILE=colvar
\endverbatim

-Lets run this now on trajectory that describes the transition from the C7ax configuration to the C7eq configuration. To run this calculation copy the input
+Let's run this now on the trajectory that describes the transition from the \f$C_7ax\f$ configuration to the \f$C_7eq\f$ configuration. To run this calculation copy the input
file above into a file called plumed.dat and run the command below:

\verbatim
@@ -92,7 +92,7 @@ plumed driver --mf_pdb transformation.pdb
\endverbatim

Try plotting each of the two torsional angles in this file against time in order to get an idea of how good a job each one of these coordinates at distinguishing between
-the various configurations along the pathway connecting the C7ax and C7eq configurations. What you will see is that in both cases the CV does not increase/decrease
+the various configurations along the pathway connecting the \f$C_7ax\f$ and \f$C_7eq\f$ configurations. 
What you will see is that in both cases the CV does not increase/decrease monotonically as the transition progresses. We can perhaps come up with a better coordinate that incorporates changes in both \f$\phi\f$ and \f$\psi\f$ by using the coordinate illustrated in the figure below. @@ -113,9 +113,9 @@ Try calculating the values of the above collective variables for each of the con command that was given earlier. Notice that what we are using here are some well known results on the dot product of two vectors here. Essentially if the values of the -ramachandran angles in the C7eq configuration are \f$(\phi_1,\psi_1)\f$ and the ramachandran angles in the C7ax configuration are +Ramachandran angles in the \f$C_7eq\f$ configuration are \f$(\phi_1,\psi_1)\f$ and the Ramachandran angles in the \f$C_7ax\f$ configuration are \f$(\phi_2,\psi_2)\f$. If our instantaneous configuration is \f$(\phi_3,\psi_3)\f$ we can thus calculate the following projection -on the vector connecting the C7eq state to the C7ax state: +on the vector connecting the \f$C_7eq\f$ state to the \f$C_7ax\f$ state: \f[ s = (\phi_2 - \phi_1).(\phi_3 - \phi_1) + (\psi_2 - \psi_1).(\psi_3 - \psi_1) @@ -130,7 +130,7 @@ connecting the point \f$(\phi_1,\psi_1)\f$ to \f$(\phi_3,\psi_3)\f$. If we call \f] where \f$| \mathbf{v}_1 |\f$ and \f$| \mathbf{v}_2 |\f$ are the magnitudes of our two vectors and where \f$\alpha\f$ is the angle between -the two of them. Elementary trignometry thus tells us that if \f$\mathbf{v}_1\f$ is a unit vector (i.e. if it has magnitude 1) the dot product +the two of them. Elementary trigonometry thus tells us that if \f$\mathbf{v}_1\f$ is a unit vector (i.e. if it has magnitude 1) the dot product is thus equal to the projection of the vector \f$\mathbf{v}_2\f$ on \f$\mathbf{v}_2\f$ as shown in figure \ref pca-figure. This is an useful idea. 
In fact it is the basis of the \ref PCAVARS collective variable that is implemented in PLUMED so we can (almost) calculate @@ -139,11 +139,11 @@ the projection on this vector by using the input shown below: \verbatim t1: TORSION ATOMS=2,4,6,9 t2: TORSION ATOMS=4,6,9,11 -pca: PCAVARS REFERENCE=angle-pca-reference.pdb TYPE=EUCLIDEAN -PRINT ARG=t1,t2,pca.* FILE=colvar +p: PCAVARS REFERENCE=angle-pca-reference.pdb TYPE=EUCLIDEAN +PRINT ARG=t1,t2,p.* FILE=colvar \endverbatim -We cannot, however, do this in practise (we also shouldn't really use the previous input either) as the \ref TORSION angles that we use +We cannot, however, do this in practice (we also shouldn't really use the previous input either) as the \ref TORSION angles that we use to define our vectors here are periodic. In this next section we will thus look at how we can avoid this problem of periodicity by working in a higher dimensional space. @@ -151,10 +151,10 @@ in a higher dimensional space. In the previous section I showed how we can use the projection of a displacement on a vector as a collective variable. I demonstrated this in a two dimensional space as this makes it easy to visualize the vectors involved. We are not forced to work with two dimensional vectors, -however. We can instead find the vector that connects the C7eq and C7ax states in some higher dimensional space and project our current +however. We can instead find the vector that connects the \f$C_7eq\f$ and \f$C_7ax\f$ states in some higher dimensional space and project our current coordinate on that particular vector. In fact we can even define this vector in the space of the coordinates of the atoms. 
In other words, -if the 3\f$N\f$ coordinate of atomic positions is \f$\mathbf{x}^{(1)}\f$ for the C7eq configuration and \f$\mathbf{x}^{(2)}\f$ for the C7ax -configuration and if the instanenous configuration of the atoms is \f$\mathbf{x}^{(3)}\f$ we can use the following as a CV: +if the 3\f$N\f$ coordinate of atomic positions is \f$\mathbf{x}^{(1)}\f$ for the \f$C_7eq\f$ configuration and \f$\mathbf{x}^{(2)}\f$ for the \f$C_7ax\f$ +configuration and if the instantaneous configuration of the atoms is \f$\mathbf{x}^{(3)}\f$ we can use the following as a CV: \f[ s = \sum_{i=1}^{3N} (x^{(2)}_i - x^{(1)}_i ) (x^{(3)}_i - x^{(1)}_i ) \f] @@ -164,11 +164,11 @@ where the sum here runs over the \f$3N\f$-dimensional vector that defines the po \verbatim t1: TORSION ATOMS=2,4,6,9 t2: TORSION ATOMS=4,6,9,11 -pca: PCAVARS REFERENCE=pca-reference.pdb TYPE=OPTIMAL -PRINT ARG=t1,t2,pca.* FILE=colvar +p: PCAVARS REFERENCE=pca-reference.pdb TYPE=OPTIMAL +PRINT ARG=t1,t2,p.* FILE=colvar \endverbatim -Use this input to analyse the set of configurations that are in the transformation.pdb file. +Use this input to analyze the set of configurations that are in the transformation.pdb file. Let's now look further at the caveat that I alluded to before we ran the calculations. I stated that we are only calculating: \f[ @@ -176,24 +176,24 @@ s = \sum_{i=1}^{3N} (x^{(2)}_i - x^{(1)}_i ) (x^{(3)}_i - x^{(1)}_i ) \f] in a manner of speaking. The point is that we would not want to calculate exactly this quantity because the vectors of displacements that are calculated in this way includes both rotational and translational motion. 
This is a problem as the majority of the change
-in moving from the C7ax configuration shown in figure \ref lugano-2-transition-fig to the C7eq configuration shown in figure
+in moving from the \f$C_7ax\f$ configuration shown in figure \ref lugano-2-transition-fig to the \f$C_7eq\f$ configuration shown in figure
\ref lugano-2-transition-fig comes from the translation of all the atoms. To put this another way if I had, in figure \ref lugano-2-transition-fig,
-shown two images of the C7ax configuration side by side the displacement in the positions of the atoms in those two structures would be
+shown two images of the \f$C_7ax\f$ configuration side by side the displacement in the positions of the atoms in those two structures would be
similar to the displacement of the atoms in \ref lugano-2-transition-fig as as the majority of the displacement in the vector of atomic
positions comes about because I have translated all the atoms in the molecule rightwards by a fixed amount.

I can, however, remove these translational displacements from consideration when calculating these vectors. In addition,
-I can also remove any displacements due rotation in the frame of reference of the molecule. If you are interested in how this is done in practise you can
+I can also remove any displacements due to rotation in the frame of reference of the molecule. If you are interested in how this is done in practice you can
read about it on the manual page about the \ref RMSD collective variable. For what concerns us here, however, all we need to know is that when
we use the OPTIMAL metric we are calculating a vector which tells us how far the atoms have been displaced in moving from structure A to structure B
in a way that excludes any displacements due to translation of the center of mass of the molecule or any displacements that occur due to rotation
-of the cartesian frame.
+of the Cartesian frame. 
-\subsection lugano-2-iso The isocommitor surface
+\subsection lugano-2-iso The isocommittor surface

In the previous sections I have been rather loose when talking about better and worse collective variables in that I have not been clear in my distinction between
what makes a collective variable good and what makes a collective variable bad. In this section I thus want to discuss one method that we can use
-to judge the quality of a collective variable. This method involves calculating the so called isocommitor. The essential notion behind this techinique is
-that there will be a saddle point between the two states of interest (the C7ax and C7eq configurations in our alanine dipeptide example). If the free energy
+to judge the quality of a collective variable. This method involves calculating the so-called isocommittor. The essential notion behind this technique is
+that there will be a saddle point between the two states of interest (the \f$C_7ax\f$ and \f$C_7eq\f$ configurations in our alanine dipeptide example). If the free energy
is plotted as a function of a good collective variable the location of this dividing surface - the saddle point - will appear as a maximum. Lets suppose
that we now start a simulation with the system balanced precariously on this maximum. The system will, very-rapidly, fall off the maximum and move
@@ -203,18 +203,18 @@ words if the position of maximum in the low-dimensional free energy surface tell
not represent the location of the transition state well then there will be an imbalance between the number of trajectories that move rightwards and the number
that move leftwards.

-We can think of this business of the isocomittor one further way, however. If in the vicinity of the transition state between the two basins the collective variable
+We can think of this business of the isocommittor one further way, however. 
If in the vicinity of the transition state between the two basins the collective variable
is perpendicular to the surface separating these two states half of the trajectories that start from this configuration will move to the left in CV space while
the other half will move to the right. If the dividing surface and the CV are not perpendicular in the vicinity of the transition state, however, there will
be an imbalance between the number of trajectories that move rightwards and the number of trajectories that move leftwards. We can thus determine the goodness
of a collective variable by shooting trajectories from what we believe is the transition state and examining the number of trajectories that move into the
left and right basins.

-Lets make all this a bit more concrete by looking at how we might calculate isocomittors by using some of the collective variables we have introduced in this
+Let's make all this a bit more concrete by looking at how we might calculate the isocommittor by using some of the collective variables we have introduced in this
exercise. You will need to go into the directory in the tar ball that you downloaded that is called PCA-isocommittor. In this directory you will find a number
of files that will serve as input to gromacs 5 and PLUMED. You thus need to ensure that you have an installed version of gromacs 5 patched with PLUMED
on your computer in order to perform this exercise. In addition to these input files you will find a bash script called script.sh, which we are going to use in order
-to set of a large number of molecular dynamics simualtions.
+to set off a large number of molecular dynamics simulations. 
If you open script.sh you will find the following lines near the top: \verbatim GROMACS_BIN=/Users/gareth/MD_code/gromacs-5.1.1/build/bin @@ -229,9 +229,9 @@ modification though you can run the calculation by issuing the following command ./script.sh \endverbatim -This command submits 50 molecular dynamics simulations that all start from a configuration that lies between the C7eq and C7ax configurations of alanine -dipeptide. In addition, this command also generates some scripts that allow us to visualise how the \f$\phi\f$, \f$\psi\f$ and PCA coordinates that we -introduced in the previous sections change duing each of these 50 simulations. If you load gnuplot and issue the command: +This command submits 50 molecular dynamics simulations that all start from a configuration that lies between the \f$C_7eq\f$ and \f$C_7ax\f$ configurations of alanine +dipeptide. In addition, this command also generates some scripts that allow us to visualize how the \f$\phi\f$, \f$\psi\f$ and PCA coordinates that we +introduced in the previous sections change during each of these 50 simulations. If you load gnuplot and issue the command: \verbatim load "script_psi.gplt" @@ -249,10 +249,10 @@ will show how how \f$\phi\f$ changes during the course of the 50 simulations and load "script_pca.gplt" \endverbatim -gives you the information on the pca coordinates. If you look at the \f$\psi\f$ and pca data first you can see clearly that it is very difficult to distinguish -the configurations that moved to the C7eq basin from the trajectories that moved to the C7ax basin. By contrast if you look at the data on the \f$\phi\f$ angles -you can indeed use these plots to distinguish the trajectories that moved to C7eq from the trajectories that moved to C7ax. It is abundantly clear, however, -that the number of trajectories that moved to C7eq is not equal to the number of trajectories that moved to C7ax. This CV, therefore, is not capturing +gives you the information on the PCA coordinates. 
If you look at the \f$\psi\f$ and PCA data first you can see clearly that it is very difficult to distinguish +the configurations that moved to the \f$C_7eq\f$ basin from the trajectories that moved to the \f$C_7ax\f$ basin. By contrast if you look at the data on the \f$\phi\f$ angles +you can indeed use these plots to distinguish the trajectories that moved to \f$C_7eq\f$ from the trajectories that moved to \f$C_7ax\f$. It is abundantly clear, however, +that the number of trajectories that moved to \f$C_7eq\f$ is not equal to the number of trajectories that moved to \f$C_7ax\f$. This CV, therefore, is not capturing the transition state ensemble. One thing you will have seen from these examples is that the \ref PCAVARS coordinate that were introduced in the previous @@ -263,14 +263,14 @@ it is perhaps clear why. \anchor ala-tstate \image html "lugano-2-trans-state.png" "The PCAVARS coordinate and the transition state are highlighted on this figure. As you can see the coordinate does not pass through the transition state" -You can see the location of the saddle point between these two states in this surface and it is very clear that the vector connecting the C7eq state to the -C7ax state does not pass through this point. In fact it would be extremely fortuitous if a vector connecting an initial state and a final state also passed +You can see the location of the saddle point between these two states in this surface and it is very clear that the vector connecting the \f$C_7eq\f$ state to the +\f$C_7ax\f$ state does not pass through this point. In fact it would be extremely fortuitous if a vector connecting an initial state and a final state also passed through the intermediate transition state between them. We can, after all, define the equation of straight line (a vector) if we are given only two points on it. In the next section we are thus going to see how we can resolve this problem by introducing a non-linear (or curvilinear) coordinate. 
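If you would like to automate the counting described above, the sketch below (a hypothetical post-processing snippet, not one of the tutorial scripts) classifies each trajectory by the final value of \f$\phi\f$ and counts how many committed to each basin. The dividing value of zero used here to separate the two basins is an assumption made purely for illustration.

```python
def committor_counts(final_phi_values, dividing_value=0.0):
    """Count how many trajectories finished on either side of a dividing value of phi."""
    # trajectories that committed to the basin on the left of the divider
    left = sum(1 for phi in final_phi_values if phi < dividing_value)
    # all remaining trajectories committed to the basin on the right
    right = len(final_phi_values) - left
    return left, right

# A CV that captures the transition state should give roughly a 50:50 split:
left, right = committor_counts([-1.4, -1.5, 0.9, 1.0, -1.3, 1.1])
print(left, right)  # 3 3
```

With the 50 trajectories generated by script.sh, one would build final_phi_values by reading the last value of \f$\phi\f$ from each output file.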
\subsection lugano-2-pathcvs Path collective variables -Consider the black path that connects the C7ax and C7eq states in the free energy shown below: +Consider the black path that connects the \f$C_7ax\f$ and \f$C_7eq\f$ states in the free energy shown below: \anchor lugano-2-good-bad-path-fig \image html belfast-2-good-bad-path.png "Examples of good and bad paths: the black path follows the minimum free energy path connecting the two metastable states, while the red path connects the two states directly via a linear path that passes through high energy" @@ -284,9 +284,9 @@ path: PATH REFERENCE=path-reference.pdb TYPE=OPTIMAL LAMBDA=15100. PRINT ARG=* STRIDE=2 FILE=colvar FMT=%12.8f \endverbatim -Lets thus use this input and run an isocomittor analsysis using the location of the maximum in this coordinate as the start point for all our trajectories. +Let's thus use this input and run an isocommittor analysis using the location of the maximum in this coordinate as the start point for all our trajectories. Everything you need to do this analysis is in the PATH-isocommittor directory. Once again you will find that there is a script.sh bash script inside this -directory, which, as in the previous section, you can use to run a large number of molecular dynamics simulation. Furhthermore, similarly to the last +directory, which, as in the previous section, you can use to run a large number of molecular dynamics simulations. Furthermore, similarly to the last section you will need to begin this exercise by modifying the location of path to gromacs within this script. Once you have made this modification submit your molecular dynamics jobs by issuing the command: @@ -301,7 +301,7 @@ load "script_path.gplt" \endverbatim Unlike what we saw for the \ref PCAVARS variables in the previous section we find that it is easy to use these \ref PATH variables to distinguish those -configurations that moved to C7ax from those that moved to C7eq. 
Having said that, however, we still have a large imbalance between the number of +configurations that moved to \f$C_7ax\f$ from those that moved to \f$C_7eq\f$. Having said that, however, we still have a large imbalance between the number of trajectories that move rightwards and the number that move leftwards. We are thus still a long way from unambiguously identifying the location of the transition state ensemble for this system. @@ -317,17 +317,17 @@ Suppose that you were giving your friend instructions as to how to get to your h If you think about these instructions in the abstract what you have is a set of way markers (the item I have put in bold) in a particular order. The list of way markers is important as is the order they appear in in the instructions so we incorporate both these items in the \ref PATH coordinates -that we have just used to study the transition between the transition between C7eq and C7ax. In these coordinates the way markers are a set of -interesting points in a high dimensional space. In other words, these are like the configurations of C7eq and C7ax that we used in the previous sections -when talking about \ref PCAVARS. Each of these configurations lies along the path connecting C7eq and C7ax and, as in the directions example above, one +that we have just used to study the transition between \f$C_7eq\f$ and \f$C_7ax\f$. In these coordinates the way markers are a set of +interesting points in a high dimensional space. In other words, these are like the configurations of \f$C_7eq\f$ and \f$C_7ax\f$ that we used in the previous sections +when talking about \ref PCAVARS. Each of these configurations lies along the path connecting \f$C_7eq\f$ and \f$C_7ax\f$ and, as in the directions example above, one must pass them in a particular order in order to pass between these two conformations. 
The final CV that we have used above is thus: \f[ S(X)=\frac{\sum_{i=1}^{N} i\ \exp^{-\lambda \vert X-X_i \vert }}{ \sum_{i=1}^{N} \exp^{-\lambda \vert X-X_i \vert } } \f] -In this expression \f$\vert X-X_i \vert\f$ is the distance between the instaneous coordinate of the system, \f$X\f$, in the high-dimensional space and -\f$X_i\f$ is the coordinate of the \f$i\f$th waymark in the path. The largest exponential in the sum that appears in the numerator and the denominator +In this expression \f$\vert X-X_i \vert\f$ is the distance between the instantaneous coordinate of the system, \f$X\f$, in the high-dimensional space and +\f$X_i\f$ is the coordinate of the \f$i\f$th way marker in the path. The largest exponential in the sum that appears in the numerator and the denominator will thus be the one that corresponds to the point that is closest to where the system currently lies. In other words, \f$S(X)\f$, measures the position on a (curvilinear) path that connects two states of interest as shown in red in the figure below: @@ -337,9 +337,9 @@ on a (curvilinear) path that connects two states of interest as shown in red in \subsection lugano-2-pathz The Z(X) collective variable You may reasonably ask what purpose these \ref PATH collective variables serve given that in this case they seem to do no better than -\f$\phi\f$ when it comes to the tests we have performed on calculating the isocomittor. To answer this question we are going to run one final set of -isocommittor simulations. The input for these calculations are in the directory called 2CV-isocommitor. Once again you will find that there is a script.sh bash script inside this -directory, which, as in the previous section, you can use to run a large number of molecular dynamics simulation. Furhthermore, similarly to the last +\f$\phi\f$ when it comes to the tests we have performed on calculating the isocommittor. To answer this question we are going to run one final set of +isocommittor simulations. 
The input for these calculations is in the directory called 2CV-isocommittor. Once again you will find that there is a script.sh bash script inside this +directory, which, as in the previous section, you can use to run a large number of molecular dynamics simulations. Furthermore, similarly to the last section you will need to begin this exercise by modifying the location of path to gromacs within this script. Once you have made this modification submit your molecular dynamics jobs by issuing the command: @@ -369,9 +369,9 @@ Z(X)=-\frac{1}{\lambda}\log (\sum_{i=1}^{N} \ \exp^{-\lambda \vert X-X_i \vert } \f] What this quantity measures is shown in green in the figure \ref lugano-2-ab-sz-fig. Essentially it measures the distance between the instantaneous configuration -the system finds itself in and the path that is marked out using the waymarkers. If you plot the data using script_path.gplt what you thus see is that the +the system finds itself in and the path that is marked out using the way markers. If you plot the data using script_path.gplt what you thus see is that the system never moves very far from the path that is defined using the \ref PATH command. In short the system follows this path from the transition state back to -either the C7eq or C7ax configuration. +either the \f$C_7eq\f$ or \f$C_7ax\f$ configuration. We can calculate a quantity similar to \f$Z(X)\f$ for the \ref PCAVARS collective variables. Furthermore, when we plot the data we have generated using this exercise using script_pca.gplt the value this quantity takes is shown plotted against the instantaneous value of the \ref PCAVARS collective variable. If you @@ -381,32 +381,32 @@ coordinate. This exercise illustrates the strength of these \ref PATH collective variables. We can use a \ref PATH to monitor how a large number of coordinates change during a chemical transition. 
Furthermore, we can use \f$Z(X)\f$ to measure how much real trajectories deviate from our \ref PATH and thus have a quantitative measure of how well our \ref PATH represents the true transition mechanism. Compare this with using a single CV such as \f$\phi\f$. When we use a single CV we -map the high-dimensional data from the traejctory into a lower dimensional space. We thus lose some information on what occurs during the transition. +map the high-dimensional data from the trajectory into a lower dimensional space. We thus lose some information on what occurs during the transition. To be clear we also lose information on what occurs during the transition when we use \ref PATH as any mapping into a lower dimensional space deletes information. In the \ref PATH case, however, we can use the value of \f$Z(X)\f$ to measure how much data has been lost in mapping the trajectory onto \f$S(X)\f$. These two coordinates, \f$S(X)\f$ and \f$Z(X)\f$, are very flexible. They have thus been used widely in the literature on modelling conformational changes of biomolecules. -A part of this flexibility comes because one can use any set of waymarkers to define the \ref PATH. Another flexibility comes, however, when you -recognise that you can also change the way in which the distance, \f$ \vert X- X_i \vert \f$, is calculated in the two formulas above. 
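To make the two formulas above concrete, here is a small Python sketch (an illustration, not the PLUMED implementation) that evaluates \f$S(X)\f$ and \f$Z(X)\f$ given the list of distances \f$\vert X-X_i \vert\f$ from the current configuration to the \f$N\f$ way markers; how those distances are themselves computed is precisely the flexibility discussed above.

```python
import math

def path_cvs(distances, lam):
    """Evaluate S(X) and Z(X) from the distances |X - X_i| to the N way markers."""
    weights = [math.exp(-lam * d) for d in distances]
    # S(X): exponentially weighted average of the way-marker indices 1..N
    s = sum(i * w for i, w in enumerate(weights, start=1)) / sum(weights)
    # Z(X): a soft minimum of the distances, i.e. the distance from the path itself
    z = -math.log(sum(weights)) / lam
    return s, z

# A configuration sitting on top of way marker 2 of a 4-frame path gives
# S(X) close to 2 and Z(X) close to 0:
s, z = path_cvs([1.0, 0.0, 1.0, 2.0], lam=15.0)
```

Notice that the large value of \f$\lambda\f$ makes the closest way marker dominate both sums, which is why the way markers must be sensibly ordered and spaced.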
For example +this distance can be calculated using the \ref RMSD distance or it can be calculated by measuring the sum of the squares of a set of displacements in collective variable values (see \ref TARGET). Changing the manner in which the distance between path way points is calculated thus provides a way to control the level of detail that is incorporated in the description of the reaction \ref PATH. -\subsection lugano-2-pathcv-find Optimising path collective variables +\subsection lugano-2-pathcv-find Optimizing path collective variables Hopefully the previous sections have allowed you to understand how \ref PATH collective variables work and the sorts of problems they might be used to solve. If you have one of these problems to solve the next reasonable question to ask is: how to collect the set of reference frames that serve as the -waymarkers on your \ref PATH. Unfortunately, there is no single answer to this question. Different researchers have used different methods -including using packages that morph one protein structure into another, using information from prior molecular dynamics or ehnanced sampling calculations +way markers on your \ref PATH. Unfortunately, there is no single answer to this question. Different researchers have used different methods +including using packages that morph one protein structure into another, using information from prior molecular dynamics or enhanced sampling calculations and even using \ref PATH collective variables that change adaptively as the simulation progresses. Ultimately, you will need to find the best method for solving your particular problem. Having said that, however, there is some general guidance on setting up \ref PATH collective variables and it is -this that we will focus on in this section. 
The first thing that you will need to double check is the spacing between the frames in your \ref PATH. Let's suppose that your \ref PATH has \f$N\f$ of these way markers upon it. You will then need to calculate the \f$N \times N\f$ matrix of distances between way markers. That is to say you will have to calculate the distance \f$\vert X_j - X_i \vert\f$ between each pair of frames. The values of the distance in this matrix for a good \ref PATH are shown in the figure below: \anchor lugano-2-good-matrix-fig -\image html belfast-2-good-matrix.png "A good distance matrix for path variables has the gullwing shape shape shown here." +\image html belfast-2-good-matrix.png "A good distance matrix for path variables has the gull wing shape shown here." For contrast the values of the distances in this matrix for a bad \ref PATH are shown in the figure below: @@ -416,7 +416,7 @@ For contrast the values of the distances in this matrix for a bad \ref PATH are If the distance matrix looks like the second of the two figures shown above this indicates that the frames in the \ref PATH that have been chosen are not particularly effective. Let's suppose that we have a \ref PATH with four way markers upon it. In order for the \f$S(x)\f$ CV that was defined earlier to work well frame number 3 must be further from frame number 1 than frame number 2. Similarly frame number 4 must be still further -from frame number 1 than frame number 3. This is what the gullwing shape in \ref lugano-2-good-matrix-fig is telling us. The order of the frames in the +from frame number 1 than frame number 3. This is what the gull wing shape in \ref lugano-2-good-matrix-fig is telling us. The order of the frames in the rows and columns of the matrix is the same as the order that they are run through in the sums in the equation for \f$S(X)\f$. 
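Both of these checks — the monotonic growth of \f$\vert X_j - X_i \vert\f$ with the separation \f$\vert j - i \vert\f$, and the even spacing between neighboring frames — are easy to script once the distance matrix is available. The snippet below is only an illustration: the 10% tolerance on the spacing is an arbitrary choice, and the matrix itself would have to be computed beforehand (for example with the plumed driver).

```python
def check_path_matrix(d):
    """Sanity checks on an N x N matrix of distances between path way markers."""
    n = len(d)
    # the distance d[i][j] should grow monotonically with the separation |j - i|
    monotonic = all(
        d[i][j] < d[i][k]
        for i in range(n) for j in range(n) for k in range(n)
        if abs(j - i) < abs(k - i)
    )
    # neighboring frames should be roughly equally spaced (10% tolerance here)
    gaps = [d[i][i + 1] for i in range(n - 1)]
    even = max(gaps) - min(gaps) < 0.1 * (sum(gaps) / len(gaps))
    return monotonic, even

# Four way markers placed evenly along a straight line pass both checks:
good = [[abs(j - i) for j in range(4)] for i in range(4)]
print(check_path_matrix(good))  # (True, True)
```

A matrix that fails the first check will not produce the gull wing surface discussed above, and a matrix that fails the second check signals that some frames should be added, removed, or redistributed along the path.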
The shape of the surface in this figure shows that the distance between frames \f$i\f$ and \f$j\f$ increases monotonically as the magnitude of the difference between \f$i\f$ and \f$j\f$ is increased, which is what is required. @@ -424,9 +424,9 @@ in this figure shows that the distance between frames \f$i\f$ and \f$j\f$ increa A second important requirement of a good \ref PATH is shown in the figure below: \anchor lugano-2-good-vs-bad-fig -\image html belfast-2-good-vs-bad.png "Comparison between the distances between neighbouring frames on the PATH. A good PATH will have a set of frames that are all approximately equally spaced along it." +\image html belfast-2-good-vs-bad.png "Comparison between the distances between neighboring frames on the PATH. A good PATH will have a set of frames that are all approximately equally spaced along it." -A good \ref PATH has an approximately equal spacing between the neighbouring frames along it. In other words, the distance between frame 1 and frame 2 +A good \ref PATH has an approximately equal spacing between the neighboring frames along it. In other words, the distance between frame 1 and frame 2 is approximately equal to the distance between frame 2 and frame 3 as shown above. When this condition is satisfied a good criterion for selecting a suitable \f$\lambda\f$ parameter to use is: @@ -438,11 +438,11 @@ suitable \f$\lambda\f$ parameter to use is: If you have worked through all of this tutorial make sure that you have understood it by ensuring that you understand what the list of learning outcomes in section \ref lugano-2-lo means and that you can use PLUMED to perform all these tasks. You might then want to read the original paper on the \ref PATH -collective variable method as well as a few other articles in which these coordinates have been used to analyse simulations and to accelerate sampling. 
+collective variable method as well as a few other articles in which these coordinates have been used to analyze simulations and to accelerate sampling. - Davide Branduardi and Francesco Luigi Gervasio and Michele Parrinello <a href="http://aip.scitation.org/doi/10.1063/1.2432340"> From A to B in free energy space </a> J. Chem. Phys., 126, 054103 (2007) -If you are interested in learning more about isocommitor surfaces and the transition state ensemble you should read up on the transition path sampling method. +If you are interested in learning more about isocommittor surfaces and the transition state ensemble you should read up on the transition path sampling method. */ diff --git a/user-doc/tutorials/moving.txt b/user-doc/tutorials/moving.txt index c322368aa..b4e8d105e 100644 --- a/user-doc/tutorials/moving.txt +++ b/user-doc/tutorials/moving.txt @@ -2,7 +2,7 @@ \page moving Moving from PLUMED 1 to PLUMED 2 Syntax in PLUMED 2 has been completely redesigned based on our experience -using PLUMED 1, hopefully makeing it clearer, more flexible, and less error prone. +using PLUMED 1, hopefully making it clearer, more flexible, and less error prone. The main difference is that whereas in PLUMED 1 lines could be inserted in any order, in PLUMED 2 the order of the lines matters. This is due to a major change in @@ -158,7 +158,7 @@ We decided to use as standard units in PLUMED kj/mol, nm, and ps. Perhaps this is because most of the developers are using GROMACS, which also adopts these units. However, we still allow -the personalization of units by means of the \ref UNITS keyword. +users to change units by means of the \ref UNITS keyword. Since this keyword is included directly in the PLUMED input file, even when using personalized units the input files remain perfectly portable across different MD engines. 
\section moving-Directives Directives What follows is a list of all the documented directives of PLUMED 1 together with their -plumed 2 equivalents. Be aware that the input syntaxes for these directives are not totally +plumed 2 equivalents. Be aware that the input syntax for these directives is not totally equivalent. You should read the documentation for the PLUMED 2 Action. <TABLE> diff --git a/user-doc/tutorials/munster.txt b/user-doc/tutorials/munster.txt index 3595a825d..a5fc91061 100644 --- a/user-doc/tutorials/munster.txt +++ b/user-doc/tutorials/munster.txt @@ -25,7 +25,7 @@ more realistic but slower, and a setup in gas phase, which is much faster. Simulations are made using GROMACS 4.6.7, which is here assumed to be already patched with PLUMED and properly installed. However, these examples could be easily converted to other MD software. -All the gromacs input files and analsys scripts are provided in this \tarball{munster}. +All the gromacs input files and analysis scripts are provided in this \tarball{munster}. Users are expected to write PLUMED input files based on the instructions below. @@ -90,7 +90,7 @@ Gromacs md can be run using on the command line: \verbatim > gmx_mpi mdrun -s topolA.tpr -nsteps 10000 \endverbatim -The nsteps flags can be used to change the number of timesteps and topolA.tpr is the name of the tpr file. +The nsteps flag can be used to change the number of time steps and topolA.tpr is the name of the tpr file. While running, gromacs will produce an md.log file, with log information, and a traj.xtc file, with a binary trajectory. The trajectory can be visualized with VMD using a command such as \verbatim @@ -182,7 +182,7 @@ be printed in the `analysis` file you should give more information to the driver (see \ref driver) In this case we inform the driver that the `traj.xtc` file was produced in a run with a timestep of 0.002 ps and -saving a snapshop every 1000 timesteps. +saving a snapshot every 1000 time steps. 
You might want to analyze a different collective variable, such as the gyration radius. The gyration radius tells how extended the molecule is in space. @@ -211,10 +211,10 @@ Now try to compute the time series of the gyration radius. In case you are running the simulation in water, you might see that at some point this variable shows some crazy jump. -The reason is that the trajectory containes coordinates where +The reason is that the trajectory contains coordinates where molecules are broken across periodic-boundary conditions. This happens with GROMACS and some other MD codes. These codes typically have tools to process -trajecories and restore whole molecules. This trick is ok for the a-posteriori analysis +trajectories and restore whole molecules. This trick is OK for the a-posteriori analysis we are trying now, but cannot be used when one needs to compute a collective variable on-the-fly or, as we will see later, one wants to add a bias to that collective variable. For this reason, we implemented a workaround in PLUMED, @@ -271,7 +271,7 @@ to manipulate MD trajectories. \hidden{Other analysis tools} \subsection munster-monitor-an Other analysis tools -PLUMED also allows you to make some analyis on the collective variables +PLUMED also allows you to perform some analysis on the collective variables you are calculating. For example, you can compute a histogram with an input like this one \plumedfile phi: TORSION ATOMS=5,7,9,15 @@ -297,7 +297,7 @@ PLUMED can do much more than a histogram, more information on analysis can be fo Notice that the plumed driver can also be used directly from VMD taking advantage of the PLUMED collective variable tool developed by Toni Giorgino (http://multiscalelab.org/utilities/PlumedGUI). -Just open a recent version of VMD and go to Extensions/Analysis/Collective Variable Analsys (PLUMED). +Just open a recent version of VMD and go to Extensions/Analysis/Collective Variable Analysis (PLUMED). 
This graphical interface can also be used to quickly build PLUMED input files based on template lines. \endhidden @@ -320,8 +320,8 @@ all the biasing methods available in PLUMED can be found at the \ref Bias page. \hidden{Summary of theory} \subsubsection munster-biasing-me-theory Summary of theory In metadynamics, an external history-dependent bias potential is constructed in the space of -a few selected degrees of freedom \f$ \vec{s}({q}) \f$, generally called collective variables (CVs) \cite metad. -This potential is built as a sum of Gaussians deposited along the trajectory in the CVs space: +a few selected degrees of freedom \f$\vec{s}({q}) \f$, generally called collective variables (CVs) \cite metad. +This potential is built as a sum of Gaussian kernels deposited along the trajectory in the CVs space: \f[ V(\vec{s},t) = \sum_{ k \tau < t} W(k \tau) @@ -331,7 +331,7 @@ V(\vec{s},t) = \sum_{ k \tau < t} W(k \tau) \f] where \f$ \tau \f$ is the Gaussian deposition stride, -\f$ \sigma_i \f$ the width of the Gaussian for the ith CV, and \f$ W(k \tau) \f$ the +\f$ \sigma_i \f$ the width of the Gaussian for the \f$i\f$th CV, and \f$ W(k \tau) \f$ the height of the Gaussian. The effect of the metadynamics bias potential is to push the system away from local minima into visiting new regions of the phase space. Furthermore, in the long time limit, the bias potential converges to minus the free energy as a function of the CVs: @@ -340,7 +340,7 @@ time limit, the bias potential converges to minus the free energy as a function V(\vec{s},t\rightarrow \infty) = -F(\vec{s}) + C. \f] -In standard metadynamics, Gaussians of constant height are added for the entire course of a +In standard metadynamics, Gaussian kernels of constant height are added for the entire course of a simulation. As a result, the system is eventually pushed to explore high free-energy regions and the estimate of the free energy calculated from the bias potential oscillates around the real value. 
@@ -364,16 +364,16 @@ where \f$ T \f$ is the temperature of the system. In the long time limit, the CVs thus sample an ensemble at a temperature \f$ T+\Delta T \f$ which is higher than the system temperature \f$ T \f$. The parameter \f$ \Delta T \f$ can be chosen to regulate the extent of free-energy exploration: - \f$ \Delta T = 0\f$ corresponds to standard molecular dynamics, \f$ \Delta T \rightarrow \infty \f$ to standard + \f$ \Delta T = 0\f$ corresponds to standard molecular dynamics, \f$ \Delta T \rightarrow\infty\f$ to standard metadynamics. In well-tempered metadynamics literature and in PLUMED, you will often encounter -the term "biasfactor" which is the ratio between the temperature of the CVs (\f$ T+\Delta T \f$) +the term "bias factor" which is the ratio between the temperature of the CVs (\f$ T+\Delta T \f$) and the system temperature (\f$ T \f$): \f[ \gamma = \frac{T+\Delta T}{T}. \f] -The biasfactor should thus be carefully chosen in order for the relevant free-energy barriers to be crossed +The bias factor should thus be carefully chosen in order for the relevant free-energy barriers to be crossed efficiently in the time scale of the simulation. Additional information can be found in the several review papers on metadynamics @@ -415,13 +415,13 @@ Results from a 200ps (100000 steps) trajectory in vacuum are shown in Figure \ref munster-ala-traj-metad. \anchor munster-ala-traj-metad -\image html munster-ala-traj-metad.png "(phi,psi) scatter plot obtained with metadynamics and two different values of the biasfactor. Simulation performed in vacuum." +\image html munster-ala-traj-metad.png "(phi,psi) scatter plot obtained with metadynamics and two different values of the bias factor. Simulation performed in vacuum." As you can see, exploration is greatly enhanced. -Notice that the explored ensemble can be tuned using the biasfactor \f$\gamma\f$. +Notice that the explored ensemble can be tuned using the bias factor \f$\gamma\f$. 
Larger \f$\gamma\f$ implies that the system will explore states with higher free energy. As a rule of thumb, if you expect a barrier of the order of \f$\Delta G^*\f$, -a reasonable choice for the biasfactor is \f$\gamma\approx\frac{\Delta G}{2k_BT}\f$. +a reasonable choice for the bias factor is \f$\gamma\approx\frac{\Delta G^*}{2k_BT}\f$. Finally, notice that the \ref METAD potential depends on the previously visited trajectories. As such, when you restart a previous simulation, it should read the previously deposited @@ -439,7 +439,7 @@ psi: TORSION ATOMS=7,9,15,17 # # Activate well-tempered metadynamics in phi depositing # a Gaussian every 500 time steps, with initial height equal -# to 1.2 kJoule/mol, biasfactor equal to 10.0, and width to 0.35 rad +# to 1.2 kJ/mol, bias factor equal to 10.0, and width to 0.35 rad METAD ... LABEL=metad @@ -464,7 +464,7 @@ The syntax for the command \ref METAD is simple. The directive is followed by a keyword ARG followed by the labels of the CVs on which the metadynamics potential will act. The keyword PACE determines the stride of Gaussian deposition in number of time steps, -while the keyword HEIGHT specifies the height of the Gaussian in kJoule/mol. For each CVs, one has +while the keyword HEIGHT specifies the height of the Gaussian in kJ/mol. For each CV, one has to specify the width of the Gaussian by using the keyword SIGMA. Gaussians will be written to the file indicated by the keyword FILE. @@ -497,7 +497,7 @@ by the metadynamics bias potential to visit the other local minimum. As the simu the bias potential fills the underlying free-energy landscape, and the system is able to diffuse in the entire phase space. -The HILLS file contains a list of the Gaussians deposited along the simulation. +The HILLS file contains a list of the Gaussian kernels deposited along the simulation. 
If we take a look at the header of this file, we can find relevant information about its content: \verbatim @@ -511,7 +511,7 @@ If we give a look at the header of this file, we can find relevant information a The line starting with FIELDS tells us what is displayed in the various columns of the HILLS file: the time of the simulation, the value of phi and psi, the width of the Gaussian in phi and psi, -the height of the Gaussian, and the biasfactor. +the height of the Gaussian, and the bias factor. We can use the HILLS file to visualize the decrease of the Gaussian height during the simulation, according to the well-tempered recipe: @@ -519,14 +519,14 @@ according to the well-tempered recipe: \image html munster-metad-phihills.png "Time evolution of the Gaussian height." If we look carefully at the scale of the y-axis, we will notice that in the beginning the value -of the Gaussian height is higher than the initial height specified in the input file, which should be 1.2 kJoule/mol. -In fact, this column reports the height of the Gaussian rescaled by the pre-factor that +of the Gaussian height is higher than the initial height specified in the input file, which should be 1.2 kJ/mol. +In fact, this column reports the height of the Gaussian scaled by the pre-factor that in well-tempered metadynamics relates the bias potential to the free energy. In this way, when -we will use \ref sum_hills, the sum of the Gaussians deposited will directly provide the free-energy, +we use \ref sum_hills, the sum of the Gaussian kernels deposited will directly provide the free energy, without further rescaling needed (see below). One can estimate the free energy as a function of the metadynamics CVs directly from the metadynamics -bias potential. 
In order to do so, the utility \ref sum_hills should be used to sum the Gaussian kernels deposited during the simulation and stored in the HILLS file. To calculate the free energy as a function of phi, it is sufficient to use the following command line: @@ -553,7 +553,7 @@ The result should look like this: To assess the convergence of a metadynamics simulation, one can calculate the estimate of the free energy as a function of simulation time. At convergence, the reconstructed profiles should be similar. -The option --stride should be used to give an estimate of the free energy every N Gaussians deposited, and +The option --stride should be used to give an estimate of the free energy every N Gaussian kernels deposited, and the option --mintozero can be used to align the profiles by setting the global minimum to zero. If we use the following command line: @@ -561,11 +561,11 @@ If we use the following command line: > plumed sum_hills --hills HILLS --stride 100 --mintozero \endverbatim -one free energy is calculated every 100 Gaussians deposited, and the global minimum is set to zero in all profiles. +one free energy is calculated every 100 Gaussian kernels deposited, and the global minimum is set to zero in all profiles. The resulting plot should look like the following: \anchor munster-metad-phifest-fig -\image html munster-metad-phifest.png "Estimates of the free energy as a function of the dihedral phi calculated every 100 Gaussians deposited." +\image html munster-metad-phifest.png "Estimates of the free energy as a function of the dihedral phi calculated every 100 Gaussian kernels deposited." To assess the convergence of the simulation more quantitatively, we can calculate the free-energy difference between the two local minima of the free energy along phi as a function of simulation time. 
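To see what sum_hills is doing under the hood, here is a minimal sketch of the well-tempered case (an illustration, not the actual implementation): the free-energy estimate on a grid of CV values is minus the sum of the deposited Gaussian kernels, using the already-rescaled heights stored in the HILLS file, followed by the shift that --mintozero applies.

```python
import math

def free_energy_estimate(grid, centers, heights, sigma):
    """Minus the sum of the deposited Gaussian kernels, with min(F) shifted to zero."""
    fes = [-sum(h * math.exp(-((s - c) ** 2) / (2.0 * sigma ** 2))
                for c, h in zip(centers, heights))
           for s in grid]
    fmin = min(fes)  # --mintozero: align the profile so the global minimum is zero
    return [f - fmin for f in fes]

# Two toy kernels of width 0.35 rad deposited at phi = -1.2 rad: the minimum
# of the resulting estimate sits at the deposition point.
fes = free_energy_estimate(grid=[-1.2, 0.0, 1.2],
                           centers=[-1.2, -1.2], heights=[1.2, 1.0], sigma=0.35)
```

Truncating the sum after the first \f$k\f$ kernels, as the --stride option does, gives the time-resolved estimates used below to monitor convergence.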
@@ -620,7 +620,7 @@ psi: TORSION ATOMS=7,9,15,17 # # Activate well-tempered metadynamics in psi depositing # a Gaussian every 500 time steps, with initial height equal -# to 1.2 kJoule/mol, biasfactor equal to 10.0, and width to 0.35 rad +# to 1.2 kJ/mol, bias factor equal to 10.0, and width to 0.35 rad METAD ... LABEL=metad @@ -797,7 +797,7 @@ PRINT ARG=phi,psi,res.bias Notice that here we are printing a quantity named `res.bias`. We do this because \ref RESTRAINT does not define a single value (that here would be theoretically named `res`) but a structure with several components. All biasing methods (including \ref METAD) do so, as well as many collective variables (see e.g. \ref DISTANCE used with COMPONENTS keyword). -Printing the bias allows one to know how much a given snapshop was penalized. +Printing the bias allows one to know how much a given snapshot was penalized. Also notice that PLUMED understands numbers in the for `{number}pi`. This is convenient when using torsions, since they are expressed in radians. diff --git a/user-doc/tutorials/others/isdb-1.txt b/user-doc/tutorials/others/isdb-1.txt index 64f951afc..9b947b422 100644 --- a/user-doc/tutorials/others/isdb-1.txt +++ b/user-doc/tutorials/others/isdb-1.txt @@ -3,7 +3,7 @@ \section isdb-1-aims Aims -The aim of this tutorial is to introduce the users to the ISDB module and in particular to Metadynamics Metainfenrence \cite Bonomi:2016ip \cite Bonomi:2016ge ensemble determination. +The aim of this tutorial is to introduce the users to the ISDB module and in particular to Metadynamics Metainference \cite Bonomi:2016ip \cite Bonomi:2016ge ensemble determination. We will reproduce the setup of the simulation for a simple system \cite Lohr:2017gc . For a general overview of the problem of ensembles determination please read \cite Bonomi:2017dn .
\section isdb-1-objectives Objectives @@ -19,7 +19,7 @@ The \tarball{isdb-1} for this project contains the following files: - system: a folder with reference files for gromacs (not needed) - reference-impl: a folder to perform a simple implicit solvent simulation - reference-impl-pbmetad: a folder to perform a pbmetad implicit solvent simulation -- m_and_m: a folder to perform a metadynamic metainference simulation +- m_and_m: a folder to perform a metadynamics metainference simulation This tutorial has been tested on a pre-release version of version 2.4. @@ -27,16 +27,16 @@ This tutorial has been tested on a pre-release version of version 2.4. Molecular dynamics simulations are the ideal tool to determine at atomistic resolution the behavior of complex molecules. This great resolution power comes at the cost of approximations that affects the agreement with actual experimental observables. At the same time experimental data alone are generally speaking not enough to determine -a structural ensemble due the inverse nature of the problem, that is to go from few observables to many atoms in many different configurations. Furthemore, experimental data +a structural ensemble due to the inverse nature of the problem, that is to go from few observables to many atoms in many different configurations. Furthermore, experimental data are affected by errors of multiple nature, from noise, systematic errors and errors in their atomistic interpretation. Most important experimental data are the result of -the averaging over the ensemble of structure so it is not trivial to deconvolve this signal. One possibility is that of employng MD simulations together with experimental data +the averaging over the ensemble of structures so it is not trivial to deconvolve this signal. One possibility is that of employing MD simulations together with experimental data to generate simulations already corrected for the data themselves.
With \ref METAINFERENCE this is done on-the-fly by adding an additional energy to the system that takes into account the agreement with the experimental data considering the multiple sources of errors. \section isdb-1-reference Run a reference simulation The system we use is the EGAAWAASS peptide used in ref. \cite Lohr:2017gc . -First of all we will run a simulation in implicit solvent using the EEF1-SB CHARMM36 force field. EEF1-SB includes a correction to the standard backbone torsion potentianl of CHARMM36, an electrostatic interaction with a distance dependent dielectric constant and a simple gaussian form for the solvation energy. The first two terms are implemented in the force field and using table potentials while the latter is implemented as a collective variable in PLUMED, \ref EEFSOLV . +First of all we will run a simulation in implicit solvent using the EEF1-SB CHARMM36 force field. EEF1-SB includes a correction to the standard backbone torsion potential of CHARMM36, an electrostatic interaction with a distance dependent dielectric constant and a simple Gaussian form for the solvation energy. The first two terms are implemented in the force field and using table potentials while the latter is implemented as a collective variable in PLUMED, \ref EEFSOLV .
\plumedfile # this is optional and tell to VIM that this is a PLUMED file @@ -52,13 +52,13 @@ bias: BIASVALUE ARG=solv \endplumedfile -This can be run using gromacs (unfortunately recent versions of gromacs do not support verlet groups with table potentials, so performances are currently suboptimal on the gromacs side) +This can be run using gromacs (unfortunately recent versions of gromacs do not support Verlet groups with table potentials, so performance is currently sub-optimal on the gromacs side) \verbatim gmx_mpi mdrun -s run.tpr -table table.xvg -tablep table.xvg -plumed plumed-eef1.dat -v \endverbatim -In order to have a converged sampling for this reference ensemble calculation it is usefull to setup a Metadynamics calculation. In particular we will use \ref PBMETAD because it is then a natural choice for Metadynamics Metainference later. The following input file is meant to be appended to the former. +In order to have a converged sampling for this reference ensemble calculation it is useful to set up a Metadynamics calculation. In particular we will use \ref PBMETAD because it is then a natural choice for Metadynamics Metainference later. The following input file is meant to be appended to the former. \plumedfile # CVs, Psi9, Phi1 are not defined @@ -110,7 +110,7 @@ PRINT FILE=COLVAR ARG=phi2,phi3,phi4,phi5,phi6,phi7,phi8,phi9,psi1,psi2,psi3,psi PRINT FILE=ENERGY ARG=bias.bias,pb.bias STRIDE=200 \endplumedfile -In this case we are running a multiple-replica simulation where the sampling is used to parallelise the Metadynamics time-dependent potential through the use of multiple walkers. +In this case we are running a multiple-replica simulation where the sampling is used to parallelize the Metadynamics time-dependent potential through the use of multiple walkers.
\verbatim mpiexec -np 14 gmx_mpi mdrun -s topolnew -multi 14 -plumed plumed-eef1-pbmetad.dat -table table.xvg -tablep table.xvg >& log.out & @@ -118,11 +118,11 @@ mpiexec -np 14 gmx_mpi mdrun -s topolnew -multi 14 -plumed plumed-eef1-pbmetad.d \section isdb-1-m_m Metadynamics Metainference -The former simulations should provide a converged (check for this) ensemble for the peptide. As shown in \cite Lohr:2017gc the agreement with the multiple avaible NMR experimental data is not perfect. In order to generate an ensemble compatible with most of the available experimetnal data it is possible to include them in the simulation using \ref METAINFERENCE . To do so the forward models for the data sets should be defined in the input file. In this case we have backbone chemical shifts, \ref CS2BACKBONE ; residual dipolar couplings for two bonds, \ref RDC ; and J-couplings for multiple atoms, \ref JCOUPLING. -Once the forward models are defined for the data sets, the calculated data together with the corresponding experimental values can be used to calculate the metainference score. The metainference score is additive so it can be splitted into multiple \ref METAINFERENCE entries. In this case we are using two metainference entries for the two sets of RDCs because these are compared with the experimental data modulo a constant that should be unique each data set. Then we use one metainference for all the jcouplings and another one for the chemical shifts. In this latter case we use a different noise model, i.e. NOISE=MOUTLIERS because the forward model for chemical shifts can result in systematic errors for some of them. +The former simulations should provide a converged (check for this) ensemble for the peptide. As shown in \cite Lohr:2017gc the agreement with the multiple available NMR experimental data is not perfect. 
In order to generate an ensemble compatible with most of the available experimental data it is possible to include them in the simulation using \ref METAINFERENCE . To do so the forward models for the data sets should be defined in the input file. In this case we have backbone chemical shifts, \ref CS2BACKBONE ; residual dipolar couplings for two bonds, \ref RDC ; and J-couplings for multiple atoms, \ref JCOUPLING. +Once the forward models are defined for the data sets, the calculated data together with the corresponding experimental values can be used to calculate the metainference score. The metainference score is additive so it can be split into multiple \ref METAINFERENCE entries. In this case we are using two metainference entries for the two sets of RDCs because these are compared with the experimental data modulo a constant that should be unique for each data set. Then we use one metainference for all the jcouplings and another one for the chemical shifts. In this latter case we use a different noise model, i.e. NOISE=MOUTLIERS because the forward model for chemical shifts can result in systematic errors for some of them. -The following input file is meant to be appended to the formers. +The following input file is meant to be appended to the former files. \plumedfile # EXPERIMENTAL DATA SECTION diff --git a/user-doc/tutorials/others/ves-lugano2017-01-metad.txt b/user-doc/tutorials/others/ves-lugano2017-01-metad.txt index bab75009d..dc86ab8aa 100644 --- a/user-doc/tutorials/others/ves-lugano2017-01-metad.txt +++ b/user-doc/tutorials/others/ves-lugano2017-01-metad.txt @@ -25,7 +25,7 @@ the following folders: \subsection ves-lugano2017-metad-subsection-1 The system -We consider the association/dissociation of NaCl in acqueous solution. +We consider the association/dissociation of NaCl in aqueous solution. The dissociation barrier is expected to be around 2.5 \f$k_\mathrm{B} T\f$.
One interesting aspect of the ion dissociation problem is that collective solvent motions play an important role in the transition. @@ -110,7 +110,7 @@ It is unable to cross the \f$ \sim 5 k_\mathrm{B} T\f$ barrier located at a dist vmd out.dcd -psf nacl.psf \endverbatim The trajectory has been saved in unwrapped format in order to avoid bonds -stretching from one side to the box to the other due to periodic bounday +stretching from one side of the box to the other due to periodic boundary conditions. In VMD we can wrap the atoms without breaking the bonds and show the box using the commands: \verbatim @@ -192,27 +192,27 @@ the trajectory and the effective FES. In principle the cost of a metadynamics simulation should increase as the simulation progresses due to the need of adding an increasing number of -gaussians to calculate the bias. However, since a grid is used to build up the +Gaussian kernels to calculate the bias. However, since a grid is used to build up the bias this effect is not observed. You can check what happens if you do not use the GRID_* keywords. Remember that the bins in the grid should be small enough to -describe the changes in the bias created by the gaussians. Normally a bin of +describe the changes in the bias created by the Gaussian kernels. Normally a bin of size \f$ \sigma / 5 \f$ (with \f$ \sigma \f$ the gaussian width) is small enough. \subsection ves-lugano2017-metad-subsection-4 Assess convergence -One way to identify if a WT-MetaD simulation has converged is observing the +One way to identify if a well-tempered metadynamics simulation has converged is observing the estimated free energy surface at different times. The FES is estimated by using the relation (again up to an arbitrary constant): \f[ F(\mathbf{s}) = - \left ( \frac{\gamma}{\gamma-1} \right ) V(\mathbf{s},t) \f] -and the bias potential is calculated as a sum of gaussians. +and the bias potential is calculated as a sum of Gaussian kernels.
This can be done with Plumed 2 using for instance: \verbatim plumed sum_hills --hills ../HILLS --min 0.1 --max 0.8 --bin 300 --stride 100 \endverbatim Most of the flags are self explanatory. The flag --stride 100 will result in -the FES been written every 100 gaussians, i.e. 100 ps. +the FES being written every 100 Gaussian kernels, i.e. 100 ps. Inside the folder FES_calculation you will find a script run.sh that executes the sum_hills command and a gnuplot script plot.gpi that can be used typing:. \verbatim gnuplot plot.gpi @@ -220,7 +220,7 @@ gnuplot plot.gpi After roughly 3 ns the free energy surface does not change significantly with time except for an immaterial constant \f$ c(t) \f$ that grows in time. This is in line with well-tempered metadynamics -asymptotic behaviour: +asymptotic behavior: \f[ V(\mathbf{s},t)= - \left ( 1- \frac{1}{\gamma} \right ) F(\mathbf{s}) + c(t). @@ -302,7 +302,7 @@ instance if some features of the FES could not be captured by the kernels, the reweighting procedure will show them. This will become clearer in the VES tutorial. The scripts to perform this calculation are found in the folder ReweightDistance. -In metadynamics quasistationary limit the weight assigned to a given +In the metadynamics quasi-stationary limit the weight assigned to a given configuration is \cite Tiwary_jp504920s : \f[ w(\mathbf{R}) \propto e^{\beta ( V(\mathbf{s},t) - c(t) )}. @@ -313,7 +313,7 @@ becomes clear. \f$ c(t) \f$ keeps growing even after long times, reflecting the approximately rigid shift of the bias with time. Normally the first part of the trajectory is not used for reweighting since -during this period the simulation has not reached the quasistationary limit. +during this period the simulation has not reached the quasi-stationary limit. In this case we can discard the first 2 or 3 ns of simulation.
To disregard the first 3 ns of simulation we can use sed to delete the first 15000 lines from the COLVAR file: \verbatim diff --git a/user-doc/tutorials/others/ves-lugano2017-02-ves1.txt b/user-doc/tutorials/others/ves-lugano2017-02-ves1.txt index 0d88e7bde..8046d2884 100644 --- a/user-doc/tutorials/others/ves-lugano2017-02-ves1.txt +++ b/user-doc/tutorials/others/ves-lugano2017-02-ves1.txt @@ -53,13 +53,12 @@ in some basis set: \f[ V(\mathbf{s}) = \sum\limits_{i} \alpha_i \, f_i(\mathbf{s}), \f] -where \f$ f_i(\mathbf{s}) \f$ are the basis functions and the \f$ -\boldsymbol\alpha \f$ are the coefficients in the expansion. -We then need to find the \f$ \boldsymbol\alpha \f$ that minimize \f$ \Omega +where \f$ f_i(\mathbf{s}) \f$ are the basis functions and the \f$\boldsymbol\alpha \f$ are the coefficients in the expansion. +We then need to find the \f$\boldsymbol\alpha \f$ that minimize \f$ \Omega [V] \f$. In principle one could use any optimization algorithm. In practice the algorithm that has become the default choice for VES is the so-called averaged stochastic gradient descent algorithm \cite Bach-NIPS-2013. 
-In this algorithm the \f$ \boldsymbol\alpha \f$ are evolved iteratively +In this algorithm the \f$\boldsymbol\alpha \f$ are evolved iteratively according to: \f[ \boldsymbol\alpha^{(n+1)} = \boldsymbol\alpha^{(n)}-\mu @@ -69,10 +68,9 @@ H(\bar{\boldsymbol\alpha}^{(n)})[\boldsymbol\alpha^{(n)}-\bar{\boldsymbol\alpha} \right] \f] where \f$\mu\f$ is the step size, -\f$ \bar{\boldsymbol\alpha}^{(n)} \f$ is the running average of \f$ -\boldsymbol\alpha^{(n)} \f$ at iteration \f$ n \f$, and -\f$ \nabla\Omega(\bar{\boldsymbol\alpha}^{(n)}) \f$ and -\f$ H(\bar{\boldsymbol\alpha}^{(n)}) \f$ +\f$\bar{\boldsymbol\alpha}^{(n)} \f$ is the running average of \f$\boldsymbol\alpha^{(n)} \f$ at iteration \f$ n \f$, and +\f$\nabla\Omega(\bar{\boldsymbol\alpha}^{(n)}) \f$ and +\f$H(\bar{\boldsymbol\alpha}^{(n)}) \f$ are the gradient and Hessian of \f$ \Omega[V] \f$ evaluated at the running average at iteration \f$ n \f$, respectively. The behavior of the coefficients will become clear in the examples below. @@ -175,8 +173,7 @@ Now that we have set the scene, we can run our simulation using the run.sh script in the Example1 folder. The simulation will produce several files: - COLVAR: Just as in the metadynamics example. -- coeffs.data : Values of the coefficients \f$ \boldsymbol\alpha \f$ and \f$ - \bar{\boldsymbol\alpha} \f$ . +- coeffs.data : Values of the coefficients \f$\boldsymbol\alpha \f$ and \f$\bar{\boldsymbol\alpha} \f$ . - bias.<bias-name>.iter-<iteration-number> : Bias potential as a function of \f$ \mathbf{s} \f$ at iteration <iteration-number>. - fes.<bias-name>.iter-<iteration-number> : FES at iteration <iteration-number>. @@ -184,8 +181,7 @@ script in the Example1 folder. The simulation will produce several files: You can first observe how the system moves in the CV space in a fashion similar to metadynamics. -Then we can see the evolution of \f$ \boldsymbol\alpha \f$ and \f$ - \bar{\boldsymbol\alpha} \f$ . 
The first lines of the file coeffs.data are: +Then we can see the evolution of \f$\boldsymbol\alpha \f$ and \f$\bar{\boldsymbol\alpha} \f$ . The first lines of the file coeffs.data are: \verbatim #! FIELDS idx_d1 b1.coeffs b1.aux_coeffs index @@ -233,8 +229,8 @@ Then we can see the evolution of \f$ \boldsymbol\alpha \f$ and \f$ \endverbatim The first column are the coefficient indices, the second are the -\f$ \bar{\boldsymbol\alpha} \f$, and the third are the -\f$ \boldsymbol\alpha \f$. Each block in the file corresponds to a different +\f$\bar{\boldsymbol\alpha} \f$, and the third are the +\f$\boldsymbol\alpha \f$. Each block in the file corresponds to a different iteration, in this case iteration 0 and 10. We can plot the evolution of the coefficients using the gnuplot script plotCoeffs.gpi . The output should be similar to the next figure. @@ -242,11 +238,11 @@ the next figure. \anchor ves-school-2017-ves1-coeffs1 \image html ves-lugano2017-ves1_coeffs1.png "Evolution of the instantaneous and averaged coefficients with simulation time" -The \f$ \boldsymbol\alpha \f$ change fast and oscillate around some mean -value. The \f$ \bar{\boldsymbol\alpha} \f$ evolve smoothly until they stabilize +The \f$\boldsymbol\alpha \f$ change fast and oscillate around some mean +value. The \f$\bar{\boldsymbol\alpha} \f$ evolve smoothly until they stabilize around some equilibrium value. It is important to remember that the bias is a -function of \f$ \bar{\boldsymbol\alpha} \f$ and since these evolve smoothly, so -will the bias. Once that the \f$ \bar{\boldsymbol\alpha} \f$ have stabilized, the +function of \f$\bar{\boldsymbol\alpha} \f$ and since these evolve smoothly, so +will the bias. Once the \f$\bar{\boldsymbol\alpha} \f$ have stabilized, the simulation can be considered converged. It is also interesting to observe how the estimation of the FES evolves in time. For this we @@ -426,7 +422,7 @@ cp ../coeffs.data .
\endverbatim In order to restart, the RESTART keyword must be added at the beginning of -Plumed's input named plumed.restart.dat: +the input file for PLUMED named plumed.restart.dat: \verbatim RESTART diff --git a/user-doc/tutorials/others/ves-lugano2017-03-ves2.txt b/user-doc/tutorials/others/ves-lugano2017-03-ves2.txt index 476a63124..74fd232fa 100644 --- a/user-doc/tutorials/others/ves-lugano2017-03-ves2.txt +++ b/user-doc/tutorials/others/ves-lugano2017-03-ves2.txt @@ -40,7 +40,7 @@ metadynamics. The advantages of this distribution are that the features of the FES (metastable states) are preserved and that the system is not forced to sample regions of high free energy as it would if we had chosen the uniform target distribution. This is specially important when biasing 2 CVs and there are large regions of very -high free energy and therefore they represent unphysical configurations. +high free energy, and therefore they represent unphysical configurations. There is a caveat though, \f$ p(\mathbf{s}) \f$ depends on \f$ F(\mathbf{s})\f$ that is the function that we are trying to calculate. @@ -132,7 +132,7 @@ disregard the first 1 ns of simulation: sed '2,5000d' ../COLVAR > COLVAR \endverbatim To calculate the biased and unbiased distribution of CVs we use the following -Plumed's input: +PLUMED input: \plumedfile distance: READ FILE=COLVAR IGNORE_TIME VALUES=d1 ves: READ FILE=COLVAR IGNORE_TIME VALUES=b1.bias @@ -204,7 +204,7 @@ Equivalently one may say that fluctuations are enhanced. \subsection ves-lugano2017-ves2-subsection-4 Example 2: Constructing a 2 dimensional bias In this example we will construct a 2 dimensional bias on the distance Na-Cl -and the coodrination number of Na with respect to O. +and the coordination number of Na with respect to O. The files to run this example are included in the Example2 folder.
Two dimensional biases in VES can be written: \f[ diff --git a/user-doc/tutorials/others/ves-lugano2017-04-kinetics.txt b/user-doc/tutorials/others/ves-lugano2017-04-kinetics.txt index be2e51cda..ae60a3578 100644 --- a/user-doc/tutorials/others/ves-lugano2017-04-kinetics.txt +++ b/user-doc/tutorials/others/ves-lugano2017-04-kinetics.txt @@ -2,7 +2,7 @@ \page ves-lugano2017-kinetics MARVEL-VES tutorial (Lugano Feb 2017): Kinetics \section ves-lugano2017-kinetics-aims Aims -The aim of this tutorial is to introduce the use of VES for obtaining kinetic information of activated processes. We will learn how to set up a biased simulation using a fixed flooding potential acting on a relevant slow degree of freedom. We can then rescale the accelerated MD time in a post-processing procedure. +The aim of this tutorial is to introduce the use of VES for obtaining kinetic information of activated processes. We will learn how to set up a biased simulation using a fixed flooding potential acting on a relevant slow degree of freedom. We can then scale the accelerated MD time in a post-processing procedure. \section ves-lugano2017-kinetics-lo Learning Outcomes @@ -39,7 +39,7 @@ The <a href="tutorial-resources/ves-lugano2017-kinetics.tar.gz" download="ves-lu \subsection ves-lugano2017-kinetics-exercise-1A Exercise 1A. Preliminary investigation using VES -As an example of an activated process in materials science, we will work with the Stone-Wales transformation in a carbon nanotube. The <a href="tutorial-resources/ves-lugano2017-kinetics.tar.gz" download="ves-lugano2017-kinetics.tar.gz"> tarball </a> for this project contains the inputs neccessary to run a simulation in LAMMPS for a 480 atom carbon nanotube. We use the AIREBO (Adaptive Intermolecular Reactive Empirical Bond Order) force field parameters which can approximately describe C-C bond breakage and formation at reasonable computational cost. 
For CVs we can use the coordination number in PLUMED to measure the number of covalent bonds among different groups of atoms. The transformation involves breaking two C-C bonds and forming two alternative C-C bonds.A definition of these CVs as well as the relevant C-C bonds are depicted in Figure \ref stone-wales. We prepare a PLUMED input file as follows. +As an example of an activated process in materials science, we will work with the Stone-Wales transformation in a carbon nanotube. The <a href="tutorial-resources/ves-lugano2017-kinetics.tar.gz" download="ves-lugano2017-kinetics.tar.gz"> tarball </a> for this project contains the inputs necessary to run a simulation in LAMMPS for a 480 atom carbon nanotube. We use the AIREBO (Adaptive Intermolecular Reactive Empirical Bond Order) force field parameters which can approximately describe C-C bond breakage and formation at reasonable computational cost. For CVs we can use the coordination number in PLUMED to measure the number of covalent bonds among different groups of atoms. The transformation involves breaking two C-C bonds and forming two alternative C-C bonds. A definition of these CVs as well as the relevant C-C bonds are depicted in Figure \ref stone-wales. We prepare a PLUMED input file as follows. \plumedfile # set distance units to angstrom, time to ps, and energy to eV @@ -206,7 +206,7 @@ velocity all create 1700. 495920 Choose a different 6 digit random number and repeat Exercise 2. How does the escape time compare to what you obtained before? Repeat the procedure several times with different velocity seeds to get a distribution of first passage times. Make sure you launch each simulation from a separate directory and keep all COLVAR files as you will need them in the next section where we will analyze the transition times. \subsection ves-lugano2017-kinetics-exercise-3 Exercise 3.
Post processing to obtain unbiased estimate for the transition rate -In the previous section you generated several trajectories with different first passage times. However, these times need to be re-weighted to correct for the bias potential. We can rescale the time according to the hyperdynamics formula +In the previous section you generated several trajectories with different first passage times. However, these times need to be re-weighted to correct for the bias potential. We can scale the time according to the hyperdynamics formula \f[ t^{*} = \Delta t_{MD} \sum_i^{n} e^{\beta V(s)} -- GitLab