diff --git a/src/isdb/CS2Backbone.cpp b/src/isdb/CS2Backbone.cpp
index 0c4fc0db63c384183edb9153d4e0c71c7ec955ea..b26c54ba309ef801caafa7398678fc1e665f6d4b 100644
--- a/src/isdb/CS2Backbone.cpp
+++ b/src/isdb/CS2Backbone.cpp
@@ -50,36 +50,38 @@ namespace isdb {
 /*
 Calculates the backbone chemical shifts for a protein.
 
-The functional form is that of CamShift \cite Kohlhoff:2009us. The chemical shifts
-of the selected nuclei/residues are saved as components. Reference experimental values
-can also be stored as components. The two sets of components can then be used to calculate
-either a scoring function as in \cite Robustelli:2010dn \cite Granata:2013dk, using
-the keyword CAMSHIFT or to calculate ensemble averaged chemical shift as in \cite Camilloni:2012je
-\cite Camilloni:2013hs (see \ref ENSEMBLE, \ref STATS and \ref RESTRAINT). Finally they can
-also be used as input for \ref METAINFERENCE, \cite Bonomi:2016ip . In the current implementation there is
-no need to pass the data to \ref METAINFERENCE because \ref CS2BACKBONE can internally enable Metainference
-using the keywork DOSCORE.
-
-CamShift calculation is relatively heavy because it often uses a large number of atoms, in order
-to make it faster it is currently parallelised with \ref Openmp.
-
-As a general rule, when using \ref CS2BACKBONE or other experimental restraints it is better to
+The functional form is that of CamShift \cite Kohlhoff:2009us. The chemical shift
+of the selected nuclei can be saved as components. Alternatively one can calculate either
+the CAMSHIFT score (usefull as a collective variable \cite Granata:2013dkor as a scoring
+function \cite Robustelli:2010dn) or a \ref METAINFERENCE score (using DOSCORE).
+For these two latter cases experimental chemical shifts must be provided.
+
+CS2BACKBONE calculation can be relatively heavy because it often uses a large number of atoms, it can
+be parallelised using MPI and \ref Openmp.
+
+As a general rule, when using \ref CS2BACKBONE or other experimental restraints it may be better to
 increase the accuracy of the constraint algorithm due to the increased strain on the bonded structure.
 In the case of GROMACS it is safer to use lincs-iter=2 and lincs-order=6.
 
-In general the system for which chemical shifts are calculated must be completly included in
+In general the system for which chemical shifts are calculated must be completely included in
 ATOMS and a TEMPLATE pdb file for the same atoms should be provided as well in the folder DATADIR.
-The atoms are made automatically whole unless NOPBC is used, in particular if the system is made of
+The system is made automatically whole unless NOPBC is used, in particular if the system is made by
 multiple chains it is usually better to use NOPBC and make the molecule whole \ref WHOLEMOLECULES
-selecting an appropriate order.
+selecting an appropriate order of the atoms. The pdb file is needed to the generate a simple topology of the protein.
+For histidines in protonation states different from D the HIE/HSE HIP/HSP name should be used.
+GLH and ASH can be used for the alternative protonation of GLU and ASP. Non-standard amino acids and other
+molecules are not yet supported, but in principle they can be named UNK. If multiple chains are present
+the chain identifier must be in the standard PDB format, together with the TER keyword at the end of each chain.
+Termini groups like ACE or NME should be removed from the TEMPLATE pdb because they are not recognized by
+CS2BACKBONE.
 
 In addition to a pdb file one needs to provide a list of chemical shifts to be calculated using one
 file per nucleus type (CAshifts.dat, CBshifts.dat, Cshifts.dat, Hshifts.dat, HAshifts.dat, Nshifts.dat),
-all the six files should always be present. A chemical shift for a nucleus is calculated if a value
-greater than 0 is provided. For practical purposes the value can correspond to the experimental value.
-Residues numbers should go from 1 to N irrespectively of the numbers used in the pdb file. The first and
-last residue of each chain should be preceeded by a # character. Termini groups like ACE or NME should
-be removed from the PDB.
+add only the files for the nuclei you need, but each file should include all protein residues.
+A chemical shift for a nucleus is calculated if a value greater than 0 is provided.
+For practical purposes the value can correspond to the experimental value.
+Residues numbers should match that used in the pdb file, but must be positive, so double check the pdb.
+The first and last residue of each chain should be preceeded by a # character.
 
 \verbatim
 CAshifts.dat:
@@ -89,31 +91,26 @@ CAshifts.dat:
 .
 .
 #last 0.0
-#last+1 (first) of second chain
+#first of second chain
 .
 #last of second chain
 \endverbatim
 
 The default behaviour is to store the values for the active nuclei in components (ca_#, cb_#,
 co_#, ha_#, hn_#, nh_# and expca_#, expcb_#, expco_#, expha_#, exphn_#, exp_nh#) with NOEXP it is possible
-to only store the backcalculated values.
-
-A pdb file is needed to the generate a simple topology of the protein. For histidines in protonation
-states different from D the HIE/HSE HIP/HSP name should be used. GLH and ASH can be used for the alternative
-protonation of GLU and ASP. Non-standard amino acids and other molecules are not yet supported, but in principle
-they can be named UNK. If multiple chains are present the chain identifier must be in the standard PDB format,
-together with the TER keyword at the end of each chain.
+to only store the backcalculated values, where # includes a chain and residue number.
 
-One more standard file is also needed in the folder DATADIR: camshift.db. This file includes all the CamShift parameters
-and can be found in regtest/isdb/rt-cs2backbone/data/ .
+One more standard file is also needed in the folder DATADIR: camshift.db. This file includes all the parameters needed to
+calculate the chemical shifts and can be found in regtest/isdb/rt-cs2backbone/data/ .
 
 All the above files must be in a single folder that must be specified with the keyword DATADIR.
 
-Additional material and examples can be also found in the tutorial \ref belfast-9
+Additional material and examples can be also found in the tutorial \ref belfast-9 as well as in the cs2backbone regtests
+in the isdb folder.
 
 \par Examples
 
-In this first example the chemical shifts are used to calculate a scoring function to be used
+In this first example the chemical shifts are used to calculate a collective variable to be used
 in NMR driven Metadynamics \cite Granata:2013dk :
 
 \plumedfile