# Hybrid Monte Carlo method for conformational sampling of proteins

 Date 04.01.2018 Size 505 b.

• ## Huge gap: sequence data and 3D structure data

• EMBL/GENBANK, DNA (nucleotide) sequences: 15 million sequences, 15,000 million base pairs
• SWISSPROT, protein sequences 120,000 entries
• PDB, 3D protein structures 20,000 entries
• ## Bridging the gap through prediction

• Aim of structural genomics:
• “Structurally characterize most of the protein sequences by an efficient combination of experiment and prediction,” Baker and Sali (2001)
• Thermodynamic hypothesis: the native state is at the global free energy minimum, Anfinsen (1973)

• ## Short time kinetics

• strong correctness possible
• e.g., transport properties, diffusion coefficients

• ## Sampling

• Compute equilibrium averages by visiting all (most) of “important” conformations
• Examples:
• Equilibrium distribution of solvent molecules in vacancies
• Free energies
• Characteristic conformations (misfolded and folded states)

• ## We can sample from a distribution with density p(x) by simulating a Markov chain with the following transitions:

• From the current state, $x$, a candidate state $x'$ is drawn from a proposal distribution $S(x,x')$. The proposed state is accepted with probability $\min\left[1, \frac{p(x')\,S(x',x)}{p(x)\,S(x,x')}\right]$; otherwise the chain stays at $x$
• If the proposal distribution is symmetric, $S(x',x) = S(x,x')$, then the acceptance probability depends only on $p(x') / p(x)$
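
The symmetric case can be sketched in a few lines of Python. This is a minimal illustration, not code from the slides: the Gaussian random-walk proposal, the standard normal target, and the names `metropolis` and `log_standard_normal` are all assumptions chosen for the example.

```python
import math
import random


def metropolis(log_p, x0, n_steps, step_size=1.0, seed=0):
    """Random-walk Metropolis sampler for a 1-D unnormalized log-density."""
    rng = random.Random(seed)
    x = x0
    samples = []
    n_accepted = 0
    for _ in range(n_steps):
        # Symmetric Gaussian proposal (assumed): S(x, x') = S(x', x),
        # so the proposal terms cancel in the acceptance ratio.
        x_new = x + rng.gauss(0.0, step_size)
        log_ratio = log_p(x_new) - log_p(x)
        # Accept with probability min(1, p(x') / p(x)).
        if log_ratio >= 0.0 or rng.random() < math.exp(log_ratio):
            x = x_new
            n_accepted += 1
        # A rejected move repeats the current state in the chain.
        samples.append(x)
    return samples, n_accepted / n_steps


def log_standard_normal(x):
    # Illustrative target: log p(x) up to an additive constant.
    return -0.5 * x * x


samples, acceptance = metropolis(log_standard_normal, x0=0.0, n_steps=50_000)
mean = sum(samples) / len(samples)
var = sum((s - mean) ** 2 for s in samples) / len(samples)
print(f"acceptance={acceptance:.2f} mean={mean:.3f} var={var:.3f}")
```

Working with log-densities avoids overflow when $p$ spans many orders of magnitude, and only the ratio $p(x')/p(x)$ is needed, so the normalization constant never has to be known.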

• ## Invalid proposals:

• x’ = 1 / x (Jacobian not 1)
• x’ = x + 5 (not reversible)

• ## 3. Compute change in total energy

• $\Delta H = H(q',p') - H(q,p)$
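
For context, the sketch below places the energy-change step inside one common form of the hybrid Monte Carlo loop. Only step 3 appears in the slides; the momentum draw, the leapfrog integrator, the acceptance rule $\min[1, e^{-\Delta H}]$, unit mass, $kT = 1$, and the 1-D harmonic potential are assumptions made for illustration.

```python
import math
import random


def leapfrog(q, p, grad_u, dt, n_steps):
    """Integrate Hamilton's equations with leapfrog (unit mass assumed)."""
    p = p - 0.5 * dt * grad_u(q)          # initial half kick
    for i in range(n_steps):
        q = q + dt * p                    # drift
        if i < n_steps - 1:
            p = p - dt * grad_u(q)        # full kick between drifts
    p = p - 0.5 * dt * grad_u(q)          # final half kick
    return q, p


def hmc(u, grad_u, q0, n_samples, dt=0.2, n_leapfrog=10, seed=0):
    """Hybrid Monte Carlo in 1-D with kT = 1 (assumed units)."""
    rng = random.Random(seed)
    q = q0
    samples = []
    for _ in range(n_samples):
        # Step 1 (assumed): draw a fresh momentum from a unit Gaussian.
        p = rng.gauss(0.0, 1.0)
        # Step 2 (assumed): propose (q', p') by a short MD trajectory.
        q_new, p_new = leapfrog(q, p, grad_u, dt, n_leapfrog)
        # Step 3: change in total energy, H(q', p') - H(q, p).
        delta_h = (u(q_new) + 0.5 * p_new**2) - (u(q) + 0.5 * p**2)
        # Step 4 (assumed): accept with probability min(1, exp(-delta_h)).
        if delta_h <= 0.0 or rng.random() < math.exp(-delta_h):
            q = q_new
        samples.append(q)
    return samples


# Illustrative potential: harmonic well, U(q) = q^2 / 2.
samples = hmc(lambda q: 0.5 * q * q, lambda q: q, q0=0.0, n_samples=20_000)
mean = sum(samples) / len(samples)
var = sum((s - mean) ** 2 for s in samples) / len(samples)
print(f"mean={mean:.3f} var={var:.3f}")
```

Because leapfrog is reversible and volume preserving, the acceptance test needs only the energy change; a smaller time step lowers the energy error and raises the acceptance rate at the cost of more force evaluations per trajectory.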

• ## Is method sampling from desired distribution?

• Does it preserve detailed balance?
• Use simple model systems that can be solved analytically, and compare against analytical results or well-established solution methods. Examples: Lennard-Jones liquid, butane
• Is it ergodic?
• Impossible to prove for realistic problems. Instead, show self-averaging of properties

• ## Is system equilibrated?

• Average values of set of properties fluctuate around mean value
• Convergence to steady state from
• Different initial conditions
• Different pseudo random number generators
• ## Are statistical errors small?

• Run should be about 10 times longer than the slowest relaxation time in the system
• Estimate statistical errors by independent block averaging
• Compute properties
• Vary system sizes
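
Block averaging can be sketched as follows. This is a minimal illustration under stated assumptions: the function name `block_average`, the choice of 20 blocks, and the synthetic AR(1) series standing in for a correlated simulation observable are not from the slides.

```python
import math
import random


def block_average(values, n_blocks):
    """Return (mean, standard error) estimated from non-overlapping blocks.

    Blocks must be much longer than the correlation time so that the
    block means can be treated as independent.
    """
    block_size = len(values) // n_blocks
    if n_blocks < 2 or block_size < 1:
        raise ValueError("need at least 2 blocks with 1 or more values each")
    block_means = [
        sum(values[i * block_size:(i + 1) * block_size]) / block_size
        for i in range(n_blocks)
    ]
    mean = sum(block_means) / n_blocks
    var = sum((b - mean) ** 2 for b in block_means) / (n_blocks - 1)
    return mean, math.sqrt(var / n_blocks)


# Synthetic correlated series (assumed): x_t = 0.9 * x_{t-1} + noise.
rng = random.Random(0)
x, series = 0.0, []
for _ in range(100_000):
    x = 0.9 * x + rng.gauss(0.0, 1.0)
    series.append(x)

block_mean, block_err = block_average(series, n_blocks=20)

# Naive error that wrongly treats every sample as independent.
naive_mean = sum(series) / len(series)
naive_var = sum((s - naive_mean) ** 2 for s in series) / (len(series) - 1)
naive_err = math.sqrt(naive_var / len(series))

print(f"mean={block_mean:.3f} block_err={block_err:.4f} naive_err={naive_err:.4f}")
```

For correlated data the naive estimate is too optimistic, which is why the block length should be tied to the slowest relaxation in the system: if the estimated error keeps growing as blocks get longer, the blocks are still correlated.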

• ## Shadow Hamiltonian: $SH_{2p} = H_M + O(\Delta t^{2p})$

• Arbitrary accuracy
• Easy to compute
• Stable energy graph

• ## Replace total energy H with shadow energy

• $\Delta SH_{2m} = SH_{2m}(q',p') - SH_{2m}(q,p)$

• ## Cost per conformation is total simulation time divided by number of new conformations discovered (2mlt, dt = 0.5 fs)

• HMC 122 s/conformation
• SHMC 16 s/conformation
• HMC discovered 270 conformations in 33000 seconds
• SHMC discovered 2340 conformations in 38000 seconds
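
The per-conformation costs follow directly from the run totals, as this short check shows. The dictionary layout and variable names are illustrative; the numbers are the ones quoted in the list.

```python
# Run totals as quoted: wall-clock seconds and new conformations found.
runs = {
    "HMC": {"seconds": 33000, "conformations": 270},
    "SHMC": {"seconds": 38000, "conformations": 2340},
}

# Cost per conformation = total simulation time / new conformations.
cost = {name: r["seconds"] / r["conformations"] for name, r in runs.items()}

for name, c in cost.items():
    print(f"{name}: {c:.0f} s/conformation")

speedup = cost["HMC"] / cost["SHMC"]
print(f"SHMC is about {speedup:.1f}x cheaper per conformation")
# prints:
# HMC: 122 s/conformation
# SHMC: 16 s/conformation
# SHMC is about 7.5x cheaper per conformation
```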

• ## System size

• Parallel Multigrid O(N) electrostatics
• ## Applications

• Free energy estimation for drug design
• Folding and metastable conformations
• Average estimation

• ## Dr. Edward Maginn’s “Monte Carlo Primer”
