Artificial Intelligence Resolves Kinetic Pathways of Magnesium Binding to RNA

Magnesium is an indispensable cofactor in countless vital processes. In order to understand its functional role, the characterization of the binding pathways to biomolecules such as RNA is crucial. Despite the importance, a molecular description is still lacking since the transition from the water-mediated outer-sphere to the direct inner-sphere coordination is on the millisecond time scale and therefore out of reach for conventional simulation techniques. To fill this gap, we use transition path sampling to resolve the binding pathways and to elucidate the role of the solvent in the binding process. The results reveal that the molecular void provoked by the leaving phosphate oxygen of the RNA is immediately filled by an entering water molecule. In addition, water molecules from the first and second hydration shell couple to the concerted exchange. To capture the intimate solute–solvent coupling, we perform a committor analysis as the basis for a machine learning algorithm that derives the optimal deep learning model from thousands of scanned architectures using hyperparameter tuning. The results reveal that the properly optimized deep network architecture recognizes the important solvent structures, extracts the relevant information, and predicts the commitment probability with high accuracy. Our results provide detailed insights into the solute–solvent coupling which is ubiquitous for kosmotropic ions and governs a large variety of biochemical reactions in aqueous solutions.


Convergence of the free energy profiles
The one-dimensional free energy profile as a function of the distance between Mg 2+ and the phosphate oxygen O1P was calculated from umbrella sampling. To ensure convergence, each window was simulated for 100 ns and the results were divided into 8 blocks ( Figure S1A). Each block yields identical results and agrees with the 100 ns simulation. Further insight into the quality of the umbrella sampling method is provided by the evenly spaced and overlapping histograms ( Figure S1B).

Free energy landscape
The two-dimensional free energy landscape as a function of the distance between Mg 2+ and the phosphate oxygen O1P is calculated from umbrella sampling. Without further restraints, the system has three stable states ( Figure S2). State A corresponds to the inner-sphere conformation in which Mg 2+ is coordinated by the O1P atom and five additional water molecules. State B corresponds to the outer-sphere conformation in which the O1P atom is replaced by the selected water molecule W ex for which the biasing umbrella potential is applied. State C corresponds to conformations in which O1P is replaced by any other of the N −6 water molecules (where N is the number of water molecules in the simulations box and 6 is the coordination number). In this two-dimensional projection, transitions between state A and state C involve additional water molecules that are hidden in the selected two-dimensional projection. Here, an N − 6 dimensional representation would be required to capture all relevant coordinates. However, since all outer-sphere states are identical since the water molecules are indistinguishable, it is sufficient to show only the exchanges between state A and state B in the two-dimensional projection. This is achieved by applying an additional biasing potential on the coordination number n 1 . Hereby, the coordination number is calculated from the switching function s(r) = 1 − (r/r 0 ) 24 1 − (r/r 0 ) 288 and n 1 = N s(r) with the ion-water distances r and the cutoff r 0 = 0.3 nm. With the hydration bias, all water molecules other than the five closest water molecules and the selected exchanging water are forced to remain outside the first hydration shell by setting S1  Figure S2: Two-dimensional free energy landscape F (r I , s 6 ) of the Mg 2+ -O1P distance r I and the hydration parameter s 6 . s 6 is the sum of the distances to the five closest water molecules (shown in red and white) and the selected water molecule W ex for which the biasing umbrella potential is applied (shown in blue). The green water molecule indicates one of the N − 6 unlabeled water molecules. The energy contour spacing is 5 k B T.
n 1 = 0 using a force constant of k h = 1000 kJ/mol.

Dependence of the performance on the deep learning model architecture
The quality of the deep learning model strongly depends on the choice of the underlying model architecture. To illustrate the dependence of the accuracy on the deep learning model architecture, 12 architectures out of several thousands were used to predict the transition state (0.45 < p pred A < 0.55) and to calculate the resulting committor distribution p(p true A ). The model architectures selected for this analysis were the following: Optimal model: 5 hidden layers with 256 neurons, ReLU activation and an initial learning rate of 1.9 * 10 −4 . A dropout layer with 30% dropout is placed directly before the output layer. Model 1: 5 hidden layers with 256 neurons, ReLU activation and an initial learning rate of 2.8 * 10 −3 . A dropout layer with 30% dropout is placed after the second hidden dense layer. Model 2: 3 hidden layers with 192 neurons, ReLU activation and an initial learning rate of 1.6 * 10 −3 . A dropout layer with 20% dropout is placed after the input layer. Model 3: 4 hidden layers with 192 neurons, ReLU activation and an initial learning rate of 3.8 * 10 −3 . A dropout layer with 20% dropout is placed after the input layer. Model 4: 3 hidden layers with 192 neurons, ReLU activation and an initial learning rate of 5.9 * 10 −3 . A dropout layer with 10% dropout is placed after the input layer. Model 5: 4 hidden layers with 224 neurons, ReLU activation and an initial learning rate of 1.3 * 10 −4 . A dropout layer with 30% dropout is placed after the input layer. Model 6: 5 hidden layers with 256 neurons, ReLU activation and an initial learning rate of 1.3 * 10 −3 without dropout layer. Model 7: 5 hidden layers with 192 neurons, ReLU activation and an initial learning rate of 8.0 * 10 −4 . A dropout layer with 20% dropout is placed after the input layer. Model 8: 4 hidden layers with 224 neurons, ReLU activation and an initial learning rate of 4.3 * 10 −4 . A dropout layer with 20% dropout is placed after the second hidden dense layer. Model 9: 5 hidden layers with 256 neurons, SELU activation and an initial learning rate of 3.6 * 10 −3 without dropout layer. Model 10: 6 hidden layers with 256 neurons, ReLU activation and an initial learning rate of 4.0 * 10 −3 without dropout layer. Model 11: 3 hidden layers with 224 neurons, ReLU activation and an initial learning rate of 2.0 * 10 −3 without dropout layer. Model 12: 2 hidden layers with 192 neurons, ReLU activation and an initial learning rate of 5.3 * 10 −3 . A dropout layer with 30% dropout is placed after the first hidden dense layer.

S2
In order to determine the error one makes upon discarding most of the features, we performed additional calculations in which we used only the two most important features. Figure S3A, B shows the direct comparison of the prediction using all 83 features or for using only the two most relevant features. The results show, that with only two features the prediction is not very accurate ( Figure S3B with RMS error of 23%). Further insight into the quality of the predictions with only two features is obtained from the distribution of the transition states ( Figure S3C). With the two most important features the distribution is very broad and the quality of the prediction is compatible to the one by the human expert. The influence of the splitting was analyzed by using the following amounts of data for training, test and validation: 90-5-5, 80-10-10 and 60-20-20%. We selected the optimal model by running a Keras Tuner random search in the space of model parameters (100 different model architectures) and training each model for 50 epochs. The results are shown in Figure S4. The RMS error of the prediction is 10.0% (90-5-5), 10.1% (80-10-10) and 10.6% (60-20-20). Larger amounts of data for training yield slightly better predictions as expected. Overall, the results are robust and the details of the splitting does not have a significant influence.

Commitment probability from constrained simulations
To test the influence of the second hydration shell on the commitment probability, we performed additional constrained simulations. For 600 shooting points, the positions of Mg 2+ , RNA, five closest water molecules and exchanging water were kept constant. All other atoms were equilibrated for 1 ns in the NVT ensemble. Subsequently, 100 trajectories were initiated with velocities drawn from a Maxwell-Boltzmann velocity distribution and run forward and backward for 2 ps (see main manuscript for further details). The commitment probability p A was calculated from the fraction of trajectories that reached state A. Figure S5 shows a comparison of the commitment probability for 600 conformations drawn directly from transition path sampling and for the same conformations after the constrained equilibration described above. Clearly, the values of p A are different for the unconstrained and the constrained conformations. We conclude that the positions of Mg 2+ , RNA, five closest water molecules and exchanging water are insufficient to predict p A . Consequently, the exchange reaction is influenced by water molecules outside the first hydration shell. A quantitative description should therefore include the water molecules from the second hydration shell or even beyond.

Features
Table S1 lists all features used for deep learning. The features include all distances r i between Mg 2+ and the 20 closest water molecules, distances between Mg 2+ and different RNA atoms and the Cl − ion, all angles a i formed between O1P, Mg 2+ and the 20 closest water molecules, Steinhardt-Nelson order parameters, tetrahedral order parameters and numbers of hydrogen bonds for different groups of atoms. Hereby, the tetrahedral order parameter calculates the degree to which the water molecules around O1P or W ex have a tetrahedral order. It is defined as (2) where r ij is the magnitude of the vector connecting atom i and atom j and x ij , y ij and z ij are its three components. σ(r ij ) is a switching function with a cutoff of r 0 = 0.35 nm. The Steinhardt-Nelson order parameters measure to which degree the hydration shell is ordered. The 3rd Steinhardt-Nelson order parameter around atom i is calculated from where Y 3m is the 3rd order spherical harmonics and σ is a switching function. σ is one for the atoms of the hydration shell under consideration and zero otherwise. The 4th Steinhardt-Nelson order parameter is calculated from where Y 4m is the fourth order spherical harmonics. The 6th Steinhardt-Nelson order parameter is calculated from where Y 6m is the sixth order spherical harmonics. All distances, angles, Steinhardt-Nelson order parameters and tetrahedral order parameter were calculated with PLUMED. The number of hydrogen bonds were calculated with GROMACS. distance between Mg 2+ and the phosphate oxygen O1P r 1 − r 5 distances of the 5 closest water molecules r ex distance between Mg 2+ and the exchanging water r 6 − r 20 distances of the 6-20 closest water molecules a 1 − a 20 angles between the oxygen atoms of the 20 closest waters, Mg 2+ and O1P ∆r Sum of distances for 5 closest waters + exchanging water λ atan2(s 6 − 1.18, r I − 0.18) q i0 i-th Steinhardt-Nelson order parameter (SNOP) for oxygen atoms of 5 closest waters (i=3, 4, 6) q i1 i-th SNOP for oxygen of 5 closest waters, O1P and oxygen atoms of W ex (i=3, 4, 6) q i2 i-th SNOP for oxygen atoms of second hydration shell (i=3, 4, 6) q i3 i-th SNOP for oxygen atoms of second hydration shell, O1P and oxygen of W ex q i i-th SNOP for all oxygen atoms in first and second hydration shell and O1P q W i0 i-th SNOP for 5 closest waters including hydrogens (i=3, 4, 6) q W i1 i-th SNOP for 5 closest waters, O1P and W ex (i=3, 4, 6) q W i2 i-th SNOP for water in second hydration shell (i=3, 4, 6) q W i3 i-th SNOP for water in second hydration shell, O1P and W ex q W h-bonds between O1P and water from the second hydration shell h 1 O2P h-bonds between O2P and water from the first hydration shell h 2 O2P h-bonds between O2P and water from the second hydration shell h h-bonds between first and second hydration shell s 2 s 2 = 12 i=1 exp(−σ((r i − r 0 ) 2 ) + (a i − a 0 ) 2 ) with σ = 50, r 0 = 0.35 nm, a 0 = 120 •