Abstract
Smoothened (SMO) is a membrane protein of the Class F subfamily of G-Protein Coupled Receptors (GPCRs) and maintains homeostasis of cellular differentiation. SMO undergoes conformational change during activation, transmitting the signal across the membrane, making it amenable to bind to its intracellular signaling partner. Receptor activation has been studied at length for Class A receptors, but the mechanism of Class F receptor activation remain unknown. SMO bound to agonists and antagonists at sites in the Transmembrane Domain (TMD) and the Cysteine Rich Domain (CRD) has been characterized; giving a static view of the various conformations SMO adopts. While these crystal structures of the inactive and active SMO outline the residue-level transitions, a kinetic view of the overall activation process remains unexplored for Class F receptors. We describe SMO’s activation process in atomistic detail by performing 300 µs of molecular dynamics simulations and combining it with Markov State Model theory. A molecular switch, conserved across Class F and analogous to the activation-mediating D-R-Y motif in Class A receptors, is observed to break during activation. We also show that this transition occurs in a stage-wise movement of the transmembrane helices - TM6 first, followed by TM5. To see how modulators affect SMO activity, we simulated agonist and antagonist-bound SMO. We observed that agonist-bound SMO has an expanded hydrophobic tunnel in SMO’s core TMD, while antagonist-bound SMO shrinks this tunnel; further supporting the hypothesis that cholesterol travels through a tunnel inside Smoothened to activate it. In summary, this study establishes the distinct activation mechanism of Class F GPCRs and shows that SMO’s activation process rearranges the core transmembrane domain to open a hydrophobic conduit for cholesterol transport.
Introduction
G-Protein Coupled Receptors (GPCRs) act as molecular telephones and transmit signals across the cellular membrane by associating with G-proteins1, 2 or β-arrestins.3 The process of signal transduction generally involves GPCRs binding to effectors (agonists) which aid the shift in conformational equilibrium, facilitating the receptors to transition to an ’active’ state. Activation allows the receptor to associate with intracellular binding partners, terminating the process of signal transduction.4 GPCR activation is an area of active research - with studies establishing conserved structural motifs like the E/DRY, NPxxY4–8 in Class A and PxxG, HETx9 in Class B receptors acting as molecular switches that stabilize the inactive state. Unlike Class A and B GPCRs, activation of Class F receptors (SMO, FZD1−10) is still poorly understood. A primary reason for this elusiveness is that these receptors share none of the structural motifs seen in Class A/B, and have less than 10% sequence similarity to Class A receptors. Since GPCRs are involved in mediating virtually every physiological response - they are crucial drug targets, as 34% of all FDA approved drugs target one of these proteins.10
Smoothened (SMO) is a transmembrane protein from the Class F family of GPCRs. Class F consists of proteins that are involved in maintaining tissue homeostasis and regenerative responses in adults, and are crucial in embryonic development, as they regulate cellular diff tiation by binding to sterol and WnT ligands.11–14 SMO is expressed in tissues throughout the body, particularly in cerebellar and pituitary tissue,15 and is a member of the Hedgehog (Hh) signaling pathway. When the endogenous inhibitor of SMO, a membrane protein Patched (PTCH), is inhibited by Sonic Hedgehog (ShH) binding, SMO translocates to the ciliary membrane, and undergoes conformational transitions (activation) to bind to its intracellular signaling partner Gi.16, 17 How PTCH inhibits SMO is still unclear. However, multiple studies have described PTCH’s inhibition on SMO as acting through reducing SMO’s accessibility to membrane cholesterol.18 A recent study described the effect of PTCH on the cholesterol accessibility of the upper leafl suggesting that PTCH inhibits SMO by either transporting cholesterol to the inner leafl , or to an extracellular acceptor.19 Hh signaling is critical to embryonic development, and any changes in signaling can lead to severe birth defects.20 Cyclopamine, a naturally occuring alkaloid in corn lily, has been identified as a teratogen (agents responsible for birth defects in infants),21 and was responsible for birth defects in lambs in Idaho in the 1950s.22 It was identified later that cyclopamine’s mechanism of action involved inhibiting HH signalling by binding to SMO.23–25 On the other hand, overstimulation of HH signaling via SMO has been linked to the pathogenesis of pediatric medulloblastoma and basal cell carcinoma. 26, 27 Vismodegib28 and Sonidegib29 are two FDA approved drugs that target SMO, but are prone to chemoresistance.30 Therefore, understanding activation mechanisms of Class F GPCRs is hence critical to design novel therapeutics.
Structures of SMO bound to agonists and antagonists outline the effects of allosteric and orthosteric modulators binding on SMO activity. These structures show the existence of two primary binding sites in SMO - in the CRD (Cysteine Rich Domain), which binds agonists cholesterol,31 cyclopamine 32 and in the TMD, which binds both antagonists LY2940680,33 SANT1 and AntaXV,34 cyclopamine, 35 TC114,36 Vismodegib31 and agonists SAG1.5,34 SAG,37 SAG21k,38 24,25 epoxycholesterol,39 and cholesterol.37 Mutagenesis studies have outlined the presence of an intracellular W7.55-R6.32 cation-π lock40 in Class F that is broken on activation (Fig. 1A), with mutations that disrupt this lock lead to increased agonist potency and pathway selection (Superscripts refer to the Ballesteros-Weinstein numbering system used to denote GPCR TM residues41). On the extracellular end, for SAG-bound SMO, the D-R-E network is broken in active SMO34 (Fig. 1A). The intracellular end of active SMO shows rearrangements in TM6 (outward, Fig. 1B), TM3 (outward, Fig. 1C) and TM5 (inward, Fig. 1D). These studies paint a static picture of how SMO activity can be attributed to structural rearrangements; however, a dynamic understanding of the process of SMO activation remains to be understood. Hence to provide a dynamic overview of activation, we simulated ∼ 250 µs Apo-SMO (no ligand bound) to understand SMO’s activation process in atomistic detail. Moreover, it has been shown that PTCH modulates SMO activity by controlling its access to membrane cholesterol18, 42 which then travels through a hydrophobic tunnel inside SMO to access the primary ligand binding site in CRD, showing an expanded tunnel in active SMO (Fig. 1A). Hence, we simulated agonist bound (SAG-SMO) and antagonist bound (SANT1-SMO) to explore the effects of bound modulators on SMO activity, and the mechanisms of action for these molecules. Using a highly parallel adaptive sampling based approach and constructing a Markov state-model (MSM),43, 44 we probe submillisecond dynamics of SMO, and show that SMO activation involves a intra-cellular structural motif that is conserved across Class F receptors. MSMs have been used to model membrane protein behavior at varied timescales, to probe activity of membrane transporters,45–47 as well as to study conformational dynamics of signaling proteins.8, 48–55 In particular, Markov State Models have been employed to investigate conformational dynamics of GPCRs, such as β2-AR,8, 52, 56 Dopamine D3 receptor,48 µ-opioid receptor,54, 55 Chemokine receptor CCR2,49 and Cannabinoid Receptors 1,2.50, 51 Using MSMs, we outline the involvement of multiple CRD-TMD salt-bridges that are rearranged during SMO activation, establishing a role for the CRD in SMO activation. We show that the hydrophobic tunnel inside SMO expands in the presence of an agonist, and is occluded by the antagonist. These observations are amenable to experimental observations that bolster the cholesterol transport-like activity of SMO. We then use a mutual-information based approach to outline the allosteric mechanisms through which the agonist SAG operates, i.e. by changing the allosteric pathways in SMO to more active-like SMO. These observations provide a detailed and atomistic in-depth view of SMO activation, and may aid in design of antagonists for cancer therapy.

Major structural changes during SMO Activation. (A) Comparison of the broken D-R-E network and the W-R ionic lock, and the expanded tunnel, in inactive (green) vs active (red) SMO (B)-(D) Comparison of inactive and active SMO, indicating the outward movement of the TM6 and TM3 and inward movement of TM5 in active SMO.
Results and Discussion
SMO activation involves a conserved molecular switch
To probe the transitions SMO undergoes during activation, SMO was simulated in a ligand-free form (Apo-SMO) from two starting points - inactive and active crystal structures. Simulations were performed using a parallel approach - by clustering the existing data based on selected features (feature selection explained in Methods) and seeding the next round of simulations by randomly selecting starting points from the least populated clusters - a technique known as Adaptive Sampling.57 (Fig. S1, Table S1). The high dimensionality of the data was reduced by transforming it using time-Independent Component Analysis (tICA).58, 59 tICA uses a linear combination of the supplied features to identify the slowest collective degress of freedom in the data by computing the time-lagged autocorrelation. The fi two tICA components account for the two slowest processes associated with activation(Fig. S2). The active and inactive structures were separated majorly in the fi tICA component (tIC 1), indicating that activation was the slowest process observed in simulations. Hence, features that were highly correlated with tIC 1 (Fig. S3) were considered pivotal to activation. The tICA-transformed data was clustered - dividing the data into kinetically distinct microstates. A Markov-State Model (MSM) was constructed on the clustered data to compute the transition rates between microstates, and to reweigh the data, eliminating the bias introduced by adaptive sampling.
At the intracellular end, we observe that W3393.50 shows a very dramatic reorientation on receptor activation, moving outwards from the center of the TM bundle, to accommodate the bound Gi. W3393.50 is conserved across all Class F receptors (Fig. S4). Upon further analysis, we observed that this rearrangement extends to include M4496.30 and G4536.34 (outward movement) , G4225.61 (translation) (Fig. 2A), as well as W5357.55 (inward rotation) - residues that are all conserved across the entire Class F family (Fig. S4).
M4496.30’s outward movement is a proxy for the outward movement of TM6 a process associated with canonical GPCR activation.;37, 39 However, instead of kinking outwards as observed in Class A/B receptors, TM6 in SMO undergoes translation, to accommodate Gi. This can be attributed to the absence of P6.43, a residue conserved across FZDs. P6.43 is replaced by F4626.43 in SMO - thereby increasing its rigidity and resistance to developing kinks.60 Recently published structure of active FZD761 shows this kink at P6.43. TM5 on the other hand, shows slight inward translation. To capture these outlined movements, we projected the entire Apo-SMO data onto W3393.50 – M4496.30 (TM3-TM6 distance) v/s W3393.50 – G4225.39 (TM3-TM5 distance) and computed the free energy associated with each state(Fig. 2B, Fig. S5). The free energy plot shows that this TM3-5-6 rearrangement follows a stage-wise process - with TM6 moving outwards fi by ∼ 4 Å , (State 1 in Fig. 2B) followed by the rest of the TM3 outward movement after a slight outward rearrangement in TM5 (State 2 in Fig. 2B). The overall free energy barrier for this rearrangement is ∼ 2.5 kcal/mol. The outward movement of TM6 is analogous to class A receptor activation (Fig. S6 A,B). A conserved molecular switch mediating SMO’s activation on the intracellular end is similar to the breakage of molecular switch E/DRY in Class A GPCRs- with W3393.50 being the residue analogous to R3.50 (Fig. S6 C,D). Hence, we posit that this conserved molecular motif (W-G-M) is integral to Class F receptor activation, and provides a basis for activation across the entire Class F receptors, while also showing the uniqueness of activation of Class F receptors.

Molecular metrics integral to SMO Activation. (A) Rearrangement of the WGM motif, a conserved molecular switch across class F GPCRs, undergoes rearrangement on SMO activation. (B) Relative free energies from MSM-weighted simulation data plotted on the TM3-TM6 distance vs TM3-TM5 distance measured at residues W3393.50, M4496.30 and G4225.61. (C) Breaking of the D-R-E network on the extracellular end of the TMD. (D) SImilar to (B), but for TM3-TM6 distance vs the D-E distance. (E) The ionic lock breaks by the sidechain rotation of W5357.55. (F) Same as (B) but for TM3-TM6 distance vs χ2 dihedral measured at W5357.55.
The crystal structure of SMO bound to the synthetic agonist SAG1.5 gives clues about the activation-specific residue-level rearrangements that occur on the extracellular end of SMO. D4736.54 has been established as a residue critical to SMO activity, as it forms a part of SMO’s core TMD ligand binding cavity, and is shown to interact with agonists SAG1.5, SAG, oxysterols and antagonists GDC-0449, AntaXV.34, 37, 39, 62, 63 Specifically, a network of salt bridges formed by the residues D4736.54, E5187.38 and R4005.39 is broken in SAG1.5-bound SMO (Fig. 2C).34 Hence, we also projected the data on the D4736.54 – E5187.38 distance v/s intracellular TM3-6 movement (Fig. 2D, Fig. S5). We observe that the TM6-TM3 outward movement (2 in Fig. 2D) is preceded by the breakage of the hydrogen bond between D473-E518 (1 in Fig. 2D).
To outline the role of the cation-π lock W5357.55-R4516.32 in activation, we projected this ionic lock contact v/s the TM3-6 outward movement (Fig. 2F, Fig. S5). Projecting the Apo-SMO data along the sidechain dihedral angle χ2 of W5357.55, clearly showed the distinct inactive and active states. This shows that the mechanism of cation-pi lock breaking involves the sidechain rotation of W5357.55. Additionally, we observe that the cation-pi lock breaks around the same TM3-TM6 distance as the outward movement of TM3. Thus, the WGM motif and the ionic lock at the intracellular end, and the D-R-E network at extracellular end are critical residue networks involved in SMO activation. These residues form a network of allosterically coupled residues, proving crucial for signal transduction across the membrane.
Residues at the CRD-TMD interface involve salt-bridge rearrangements in SMO activation
SMO, in addition to a heptahelical TM domain, possesses an extracellular domain called the Cysteine Rich Domain (CRD). The CRD consists of residues that are highly polar; in comparison to the TMD, which is mostly hydrophobic (Fig. S7). This domain is critical for SMO activation, as SMOΔCRD mutants show a higher constitutive activity - suggesting that the CRD represses SMO’s basal activity.64 The CRD also includes the primary sterol binding site in SMO31 - and it has been posited that PTCH inhibits SMO by reducing cholesterol access to this site.17 Structures of active xenopus laevis SMO (xSMO) show a dramatic reorientation of the CRD on xSMO activation-suggesting that the CRD has a very dynamic range of motion.63 However, this reorientation is not observed in human SMO (hSMO).31, 36 Thus to establish a role of the CRD in activation of hSMO, we sought residue pairs in SMO CRD-TMD interface that showed the highest variance along tIC1, the slowest process that captured SMO activation.
Fig. 3(A-F) show the residue pairs that have the highest change in contact frequency during activation - starting with the R4856.66 – D209CRD , salt-bridge, which breaks during activation (Fig. 3A) due to the outward movement of TM6. This indicates that the R485-D209 salt-bridge is involved in stabilizing the inward conformation of TM6 in the inactive state. This loss of the R485-D209 salt-bridge is however compensated by the formation of the nearby R161–D486 salt bridge, which is predominantly seen in the active conformation (Fig. 3E). Furthermore, the inactive state shows a salt-bridge E208CRD -K395ECL2 which breaks on activation, compensated by the formation of the nearby D201CRD –R296ECL1 (Fig. 3B, E). Additionally, activation strongly favors the formation of R159CRD –D209CRD (Fig. 3C) and D382ECL2–K204CRD (Fig. 3G) salt bridges.

Overall activation of SMO involves residues at CRD-TMD junction. (A)-(F) Snapshots and probability density plots outlining the salt-bridge rearrangements at the CRD-TMD interface during SMO activation.
The path along tIC1 from the inactive state to active state involves 3 intermediate states(I1−3) (Fig. 4A), characterized by free energy barriers of atleast 1 kcal mol− among them. Using Transition Path Theory on the constructed MSM, we calculated the fl of transitions between these states, to establish timescales for activation of SMO. (Fig. 4B). The simulations show that the entire process of activation from inactive to active has a MFPT (mean fi passage time) of 72µs (Fig. 4B), while the reverse process is ∼ 3X faster, with MFPT ∼24 µs.

(A)Relative free energies from MSM-weighted simulation data of Apo-SMO plotted along tIC1 and tIC2, the 2 slowest components, with the intermediate states I1−3 as shown. (B) Overall transition pathway of SMO activation process.
We observe that residue pair rearrangements that are associated with activation at the CRD-TMD junctions are salt-bridges, mostly between residues with one residue in CRD and the other one in TMD. Almost none of these polar residues are conserved (Fig. S4, S8), indicating that these residues contribute to a unique activation process for SMO at the CRD-TMD interface. Additionally, we observe that the entire CRD motion can be accounted for by a slight outward rotational motion of the CRD (Fig. S9), thereby causing TM6 to move outwards and triggering activation on the intracellular end. Since the CRD has a cholesterol binding site, it is possible that cholesterol binding to the CRD triggers this outward rotation, inducing the signal that causes TM6 to move out. This potentially outlines a mechanism for the activation of SMO by cholesterol, its endogenous agonist.
SMO’s Activation is linked to opening of a hydrophobic tunnel
Endogenously, on PTCH’s inhibition by ShH, SMO is activated. SMO’s activation is mediated endogenously by cholesterol, suggesting that PTCH’s inhibition facilitates SMO’s activation by cholesterol. This suggests that cholesterol from the membrane travels to the extracellular sterol binding site. How this transfer of cholesterol occurs to the SMO CRD is still unknown; however, SMO does indeed present itself with a unique topology - the presence of a tunnel inside the protein. This tunnel has been hypothesized37–39, 63 to facilitate the transport of cholesterol from the membrane to the binding site in the CRD31 ; making this tunnel a prime target for inhibitors. As noted by Qi et al., SMO antagonists (SANT1, AntaXV, LY2940680) bind deeper into a tunnel inside SMO, whereas SMO agonists (SAG) bind outside this tunnel. Adding a 4-aminomethyl moiety to the tail-end of SAG converts it to an antagonist, suggesting that this added moiety can hinder the tunnel.65 Mutations that introduced a bulky residue into the tunnel (V329F,V333F,V408F,I412F,T470Q), blocked SMO activity,31, 38 suggesting that the tunnel conformation was linked to how small molecule and mutations modulated SMO activity.37 This suggests that SMO antagonists like SANT1 act as steric antagonists by blocking the sterol tunnel inside SMO, while agonists like SAG allosterically activate SMO by breaking the D-R-E network, setting off receptor activation on the intracellular end. The mechanism and dynamics of the modulators acting on SMO’s activation is still unclear; hence we simulated SMO bound to antagonist SANT1 (SANT1-SMO) and agonist SAG (SAG-SMO) to probe the effect of bound agonist and antagonist on SMO’s activation.
SMO’s tunnel is characterized by markedly hydrophobic residues (Fig. S10), pointing further towards the idea that a hydrophobic molecule may be transported through it. This tunnel runs through the core of the receptor, spans the entire TM domain, starting at the conserved residues W3393.50, spans ∼ seven helical turns, and ends at the extracellular network of residues E5187.38, D4736.54 and R4005.39. These three residues form the base of the space between the CRD and TMD. Moving outwards along the path defi along the tunnel directly leads to the binding site, with TM6, ECL2 and ECL1 forming the bridge between these sites (Fig. S11).
In SANT1-SMO simulations, the tunnel remains almost completely blocked (Fig. 5A, B), indicating that the mechanism by which SANT1 modulates SMO activity is by binding deeply into the SMO tunnel core, precluding the potential transport of cholesterol. SANT1’s piperazine moiety directly interacts with H4706.51 and sidechain of M5257.45 - forming hydrogen bond interactions. The pyrrolic head of the ligand remains buried deep inside, with minimal lateral movement across the tunnel (Fig. S12). However, in Apo-SMO simulations, the tunnel remains relatively open(Fig. 5C, D). In SAG-SMO simulations however, the tunnel radius has a sudden kink outward (z ∼ -20 Å), suggesting that there is a relative expansion of the tunnel induced by SAG (Fig. 5 E, F). Since this expansion occurs at z ∼-20 Å, it suggests the opening is in the upper leafl (Fig. S13A). Recent studies suggest that active PTCH precludes SMO’s accessibility to cholesterol in the upper leafl 19 To further probe into the exact position of this tunnel opening, we observed that a cluster of openings occured at x ∼ 16 Å and y ∼ 22 Å; corresponding to the space between TM2 and TM3. (Fig. S13B). This is in agreement with a recent study that used coarse-grained simulations to observe a cholesterol binding site at the TM2-TM3 interface in the upper leafl 67 Thus, SAG acts as an agonist by allosterically expanding the tunnel at the cholesterol interaction site - giving further evidence for the cholesterol-transport like activity of SMO. Thus we conclude that SANT1 functions as a steric antagonist by blocking the tunnel, whereas SAG functions by allosterically expanding the tunnel, thereby establishing design rules for SMO agonists and antagonists.

Tunnel radius plots for SMO. (A) Free energy plot of the tunnel diameter along the z-coordinate for SANT1-bound SMO. (C) same as (A), but for Apo-SMO. (E) same as (A), but for SAG-bound SMO. SAG-bound SMO clearly shows the expansion of the tunnel as compared to Apo-SMO and SANT1-SMO. (B), (D), (F) - representative fi for SANT-1 SMO, Apo-SMO and SAG-SMO. Tunnel radii were calculated using the HOLE program66 transduction. Allosteric pathways contain a series of conformationally-coupled residues that link dynamically active and spatially distant residues. In Class A GPCRs, allosteric pathways are responsible for communicating conformational changes from the extracellular end to the intracellular end, completing the process of signal transduction.68–70 Since SMO’s activation process involves allosteric communication between the extracellular ligand binding site (D-R-E network) and the G-protein coupling site (WGM motif), we sought to analyze the allosteric pathways that connect the two sites. We computed the dynamic pairwise mutual information of Inactive-Apo-SMO, Active-Apo-SMO, SANT1-SMO and SAG-SMO on a residue-level basis, and construct a graphical network of residues that are allosterically linked. Based on this network, we present the allosteric pathway between the intra- and extracellular ends of TMD.

Allosteric pathways between E5187.38 and W3393.50. (A) - Pathway in Apo-Inactive-SMO. Since the tunnel radius is decreased, TM6 outward movement is restricted, and therefore the entire allosteric communications occurs via TM6. (B) In SANT1-SMO, due to slight outward movement of TM6, the pathways switches from TM7 to TM6 to TM3. (C,D) SAG-SMO and Apo-Active SMO show the same allosteric pathway, which spans TM7-TM6-TM5-TM3.
SAG alters the allosteric pathways in SMO during the process of SMO activation
To further investigate the mechanism by which SAG allosterically modulates SMO’s activity resulting in the expansion of the tunnel, we computed the allosteric pathways that connected the intra- and extracellular ends of SMO, responsible for transmembrane signal
In our simulations, we observe that the allosteric pathway between the intra and extra-cellular ends in Apo-Inactive SMO completely passes through TM6, encompassing residues T4666.47, F4606.41 and G4566.37 (Fig. 5A). This establishes an integral role for TM6 in mediating the signals across the transmembrane domain in inactive-SMO. SANT1-SMO on the other hand, unexpectedly shows a distinct pathway, fi going down intra-helically to A5247.24, crossing over to TM6 via A4596.40 and fi to L3353.46. This however can be explained by the observation that the SANT1 causes a slight outward movement of TM6, to accommodate itself in the deep core TMD ligand binding cavity (Fig. S14). This outward movement of TM6 drifts T4664.67 away from E5187.55; causing the network to rearrange itself, moving over to TM6 further downstream. On the other hand, SAG and Apo-Active SMO show the exact same networks, further indicating that SAG alters the allosteric networks in SMO to resemble Apo-SMO. These networks involve C4696.50, the most conserved residue in TM6, down L4646.45, and a fl over to TM5 as we move intracellularly, due to the outward intracellular movement of TM6, via L4125.51 and F4185.57. Thus, we can establish a basis for the mechanisms through which SAG and SANT1 effectively modulate SMO activity, and establish an integral role for TMs 7,6,5,3 in signal transduction.
Conclusions
Our study reveals the activation mechanism for SMO, a Class F GPCR, in atomistic detail via molecular dynamics simulations. We characterized the residue level transitions that SMO undergoes during activation. We simulated SMO in Apo, SAG and SANT1 bound states to probe the activation mechanism of SMO, and computed the free energy landscape of the process. Our MSM weighted free energy landscapes show a barrier of max free energy barrier of ∼ 3 kcal mol−1 while transitioning from an inactive to active state, involving three intermediate states.
Class A and Class B receptors have been the subject of major interest involving GPCR activation.4, 9 Receptor activation studies on Class F majorly focused on the start and end states of the receptor, without giving an overview of the dynamics of the process. Using computational methods, we show that SMO activation involves the rearrangement of a intracellular structural motif - the W-G-M motif, conserved across the entire Class F family. This lays the basis for a common activation mechanism for all Class F receptors on the intracellular end. Additionally, this motif involves W3.50, which is the residue equivalent to R3.50 in class A receptors, establishing the integral role of TM3 in GPCR activation. On the extracellular end of TMD, we see that the D-R-E network of residues is pivotal to activation, as it engages the agonist and sets off the activation process at the intracellular end. We also show evidence of allosteric coupling between these two sites, showing that the rearrangement of the D-R-E network is necessary to ensure intracellular rearrangement of the WGM motif.
We also establish a role for the CRD in SMO activation, establishing and breaking salt- bridges while transitioning to an active state, contacts that have not been discussed previously. This gives novelty to the methodology established, inferring that MD simulations can be used to discover contacts crucial to activation, previously unknown. We show that the agonist SAG expands an intra-TMD tunnel inside SMO, further supporting the hypothesis that SMO transports a cholesterol molecule through its hydrophobic tunnel to activate SMO.18, 19, 37, 39, 63 We also show that SAG acts as an allosteric modulator, by modifying SMO’s allosteric pathways to be similar to Apo-SMO. On the other hand, SANT1 acts as a steric antagonist, by occluding the hydrophobic tunnel inside SMO, hence lowering the radius. Therefore, we establish the mechanisms of action of antagonists and agonists in modulating SMO activity. How cholesterol, the endogenous agonist of SMO, modulates SMO activity in the presence of agonists, still needs to be explored. However, we propose that the overall mechanistic fi from this study can be used to design novel SMO antagonists, for chemotherapy.
Methods
Molecular Dynamics Simulations
Simulation setup
SMO crystal structures in the bound inactive conformation (inactive-SMO) (PDB ID: 5L7D31) and active conformation (active-SMO) (PDB ID: 6XBL37) were used as starting points for the SMO-Apo simulations. For apo systems, the bound ligand and the stabilizing antibodies were removed. The missing residues in the proteins were modeled using MODELLER71 (Table S2). The inactivating mutation V329F in the inac-SMO was corrected back to wild-type. For SAG-SMO, the bound SAG was retained in the SMO-SAG complex.37 For SANT1-SMO, owing to the lack of the CRD in the SANT1-SMO complex (PDB ID: 4N4W34); this crystal structure was instead aligned to inac-SMO 5L7D (to maintain the same binding pose for SANT1), and the 5L7D-SANT1 starting point was used for simulations. The terminal residues were capped using neutral terminal caps Acetyl (ACE) for N-terminus and MethylAmide (NME) for the C-terminus. The proteins were embedded in a membrane bilayer using CHARMM-GUI.72, 73 The atomic interactions were characterized using the CHARMM36 force fi 74, 75 The force fi parameters for synthetic ligands SAG and SANT1 were generated using ParamChem,76 an automated version of CGenFF.77, 78 Input fi were generated using the web-based input generator CHARMM-GUI.79 The composition of the membrane bilayer was based on lipid composition of the mice brain cerebellum80 - (75% POPC, 21% Cholesterol, 4% Sphingomyelin), to mimic physiological cerebellar membrane composition. The system was solvated using TIP3P water81 and 150mM NaCl, to mimic physiological conditions. Overall the system sizes for inac-SMO, act-SMO, SAG-SMO and SANT1-SMO were 106,415, 105,971, 105,100 and 105,582 atoms with box sizes 86x86x153 Å3, 86x86x152 Å3 86x86x152 Å3 and 85x85x153 Å3 respectively. The mass of non-protein hydrogens was repartitioned to 3.024 Da,82 to enable simulations with a long timestep (4fs). Parmed, a part of the AmberTools19 package, was used for this purpose.83
Pre-Production MD
The systems were minimized for 1000 steps, using the steepest descent method, followed by minimization using the SHAKE algorithm84 for 14000 steps. Systems were then heated from 0-310 K using NVT conditions for 5ns, constraining the backbone using a force constant of 10 kcal mol−1 Å−2. Systems were then equilibrated using the NPT conditions for 5ns, at 310 K and 1 bar, using similar backbone restraints. This was followed by an equilibration of 40 ns, without constraints. Apo-SMO and SANT-SMO simulations were performed using the AMBER1883, 85–88 biomolecular simulation package. SAG-SMO simulations were performed using NAMD 2.14.89, 90
Production MD
Post equilibration, the GPU-accelerated pmemd.cuda package from AMBER1883, 88 was used for production MD. Integrator timestep was set to 4fs. Periodic boundary conditions were used, and the temperature was maintained using the Langevin Thermostat. 91 Particle Mesh Ewald92 (PME) method was used for computing long-range electrostatic interactions. SHAKE84 algorithm was used to restrain the Hydrogen bonds. Cutoff for non-bonded interactions was set to 10 Å. Frames were saved every 25000 steps, giving a frame rate of 100 ps between each frame. Simulations were performed using the Blue Waters supercomputer(NVIDIA Tesla K20X GPUs) or our in-house computing cluster(NVIDIA GeForce GTX 980 GPUs).
Adaptive Sampling, Feature Selection and Clustering
Simulating biological systems using traditional long-MD simulations to observe sub-millisecond dynamics is unfeasible, hence we sorted to using a parallel approach to accelerate conformational sampling, called Adaptive Sampling. The simulation data after every round of simulations was clustered (feature selection explained below), and the least populated clusters were used to seed simulations for the next round. Overall, for Apo-SMO, 7 rounds of simualtions were performed, collecting 30-50µs per round. For SAG/SANT1-SMO, the data was collected in a similar fashion, for 3 rounds each, around 10-20 µs per round. The bias introduced in the system due to selectively starting simulations from least populated clusters was eliminated by constructing a Markov-state model, that estimated the reverse transition probabilities from each microstate.
The progress of the transition from inactive to active was monitored by calculating features, each of which was selected based on maximum magnitude of Δ RRCS (RRCS - Residue Residue Contact Score). RRCS is a order-parameter identifying technique that uses a fl linear-flat scoring scheme to assign a score to contact between every residue-pair in the system.6 Contacts that had |ΔRRCS| < 3.5 (58 such distances total) (Table S3) were used. K-means clustering was used to cluster the simulation data, based on these calculated features. Clustering was performed using the pyEMMA python library.93
Markov State Model construction
The high dimensionality of the data was fi st reduced using tICA. The tICA lagtime was optimized by observing the plateauing of the implied timescales (-2/lnλ, λ being the largest eigenvalue of the fi tICA eigenvector) vs the lag time, and was set to 30 ns for the 3 systems (Apo-SMO, SAG-SMO, SANT1-SMO). The tICA reduced-dimension data was then clustered using k-means clustering. The optimum number of clusters and no. of tICA components to be used was optimized by maximizing the VAMP2 score (sum of the squares of the highest eigenvalues of the transition matrix) for a particular the number of clusters, and the convergence of the implied timescales vs the MSM lag time. (Fig. S16,S17,S18). Accordingly, the number of clusters was set to 200 (Apo-SMO) and 100 (SANT1-SMO and SAG-SMO). The MSM lag time was set to 30 ns for the three systems (Apo-SMO, SAG-SMO, SANT1-SMO). The Chapman-Kolmogorov test, which tests the validity of the MSM on 5 macrostates, was performed using pyEMMA (Fig. S19,S20,S21).
Trajectory Analysis and Visualization
cpptraj94 was used for trajectory processing. VMD95, 96 and open-source PyMOL97 were used to visualize and render images. MDTraj98 was used for computing all order parameters. All plots were made using matplotlib99 and seaborn100 python libraries. Numpy101 was used for numerical computations. The salt-bridge based contacts were discovered by extracting probabilty-weighted 10000 frames from clusters the in Inactive, I1−3 and Active states, using cluster probabilities from the MSM. They were analyzed for unique contacts using GetContacts.102 Tunnel radii for for analysis of effect of SAG and SANT1 were calculated using HOLE.66
Mutual Information Calculations
Mutual Information for describing the allosteric pathways was computed using mdentropy,103 using the DihedralMutualInformation function. Analysis was performed on 10000 frames each extracted from Apo-SMO, SANT1-SMO and SAG-SMO data. The frames were chosen based on the predicted MSM probabilities, to represent the entire ensemble. A graph was constructed from the computed Mutual Information, and residues with C-α distances < 10 Å were considered to be connected by an edge. The weight of each edge was assigned as MI = MImax - MIab, with the MImax as the maximum mutual information computed among two residues in a protein, and MIab was the mutual information computed between residue pair ab. Edges with MI < MIavg were not considered. Allosteric pathways were computed by calculating the shortest paths between 2 nodes, (in our case E518 and W339) using Dijkstra’s algorithm.104 NetworkX,105 a python library was used for graph-construction, visualization and computing shortest paths.
Data Availability
Stripped trajectories and corresponding parameter fi have been uploaded to Box. Scripts used for MSM construction and trajectory analysis have been uploaded to github.
Supporting information
Acknowledgements
The authors thank The Blue Waters Petascale Computing Facility and National Center for Supercomputing Applications, which is supported by the National Science Foundation (awards OCI-0725070 and ACI-1238993) and the state of Illinois. Blue Waters is a joint effort of the University of Illinois at Urbana-Champaign and its National Center for Super-computing Applications. DS acknowledges support from the NSF Early CAREER award MCB-1845606 and seed grant from Cancer Center at Illinois for their support. PDB thanks Matthew Chan, Austin Weigle and Jiming Chen of the Shukla Group at University of Illinois for the valuable insights throughout the course of this study.
References
- (1)Activation of heterotrimeric G proteins by SmoothenedProc. Natl. Acad. Sci. U.S.A 103:12607–12612Google Scholar
- (2)G protein Gαi functions immediately downstream of Smoothened in Hedgehog signallingNature 456:967–970Google Scholar
- (3)Activity-Dependent Internalization of Smoothened Mediated by ß-Arrestin 2 and GRK2Science 306:2257–2260Google Scholar
- (4)The Molecular Basis of G Protein–Coupled Receptor ActivationAnnu. Rev. Biochem 87:897–919Google Scholar
- (5)GPCR Dynamics: Structures in MotionChem. Rev 117:139–155Google Scholar
- (6)Common activation mechanism of class A GPCRseLife 8Google Scholar
- (7)The Dynamic Process of β2-Adrenergic Receptor ActivationCell 152:532–542Google Scholar
- (8)Cloud-based simulations on Google Exacycle reveal ligand modulation of GPCR activation pathwaysNat. Chem 6:15–21Google Scholar
- (9)A combined activation mechanism for the glucagon receptorProc. Natl. Acad. Sci. U.S.A 117:15414–15422Google Scholar
- (10)G Protein-Coupled Receptors as Targets for Approved Drugs: How Many Targets and How Many Drugs?Mol. Pharmacol 93:251–258Google Scholar
- (11)The WNT signaling pathway in development and diseaseAnnu. Rev. Cell Dev. Biol 20:781–810Google Scholar
- (12)Sonic hedgehog mediates the polarizing activity of the ZPACell 75:1401–1416Google Scholar
- (13)The mechanisms of Hedgehog signalling and its roles in development and diseaseNat. Rev. Mol. Cell Biol 14:416–429Google Scholar
- (14)Hedgehog signallingDevelopment 143:367–372Google Scholar
- (15)GTEx Portalhttps://www.gtexportal.org/home/gene/SMO
- (16)Dual Roles for Patched in Sequestering and Transducing Hedge-hogCell 87:553–563Google Scholar
- (17)Biochemical mechanisms of vertebrate hedgehog signalingDevelopment 146Google Scholar
- (18)Cholesterol accessibility at the ciliary membrane controls hedgehog signalingeLife 8Google Scholar
- (19)Patched 1 reduces the accessibility of cholesterol in the outer leafl of membraneseLife 10Google Scholar
- (20)Hedgehog signaling and congenital malformationsClin. Genet 67:193–208Google Scholar
- (21)Toxic and teratogenic alkaloids of western range plantsJ. Agric. Food Chem 17:473–482Google Scholar
- (22)Cyclopamine and Hedgehog Signaling: Chemistry, Biology, Medical PerspectivesAngew. Chem 49:3418–3427Google Scholar
- (23)Effects of oncogenic mutations in Smoothened and Patched can be reversed by cyclopamineNature 406:1005–1009Google Scholar
- (24)Inhibition of Hedgehog signaling by direct binding of cyclopamine to SmoothenedGenes Dev 16:2743–2748Google Scholar
- (25)Structure and function of the Smoothened extracellular domain in vertebrate Hedgehog signalingeLife 2Google Scholar
- (26)Oxysterols stimulate Sonic hedgehog signal transduction and proliferation of medulloblastoma cellsProc. Natl. Acad. Sci. U.S.A 103:8408–8413Google Scholar
- (27)Misactivation of Hedgehog signaling causes inherited and sporadic cancersJ. Clin. Investig 129:465–475Google Scholar
- (28)US Food and Drug Administration approval: vismodegib for recurrent, locally advanced, or metastatic basal cell carcinomaClin. Cancer Res 19:2289–2293Google Scholar
- (29)Sonidegib: mechanism of action, pharmacology, and clinical utility for advanced basal cell carcinomasOncoTargets Ther 10:1645–1653Google Scholar
- (30)Emergence of chemoresistance in a metastatic basal cell carcinoma patient after complete response to hedgehog pathway inhibitor vismodegib (GDC-0449)Australas. J. Dermatol 55:218–221Google Scholar
- (31)Structural basis of Smoothened regulation by its extracellular domainsNature 535:517–522Google Scholar
- (32)Cellular Cholesterol Directly Activates Smoothened in Hedgehog SignalingCell 166:1176–1187Google Scholar
- (33)Structure of the human smoothened receptor bound to an antitumour agentNature 497:338–343Google Scholar
- (34)Structural basis for Smoothened receptor modulation and chemoresistance to anticancer drugsNat. Commun 5:4355Google Scholar
- (35)Lipidic cubic phase injector facilitates membrane protein serial femtosecond crystallographyNat. Commun 5:3309Google Scholar
- (36)Crystal structure of a multi-domain human smoothened receptor in complex with a super stabilizing ligandNat. Commun 8:15383Google Scholar
- (37)Sterols in an intramolecular channel of Smoothened mediate Hedgehog signalingNat. Chem. Biol 16:1368–1375Google Scholar
- (38)Smoothened stimulation by membrane sterols drives Hedgehog pathway activityNature 571:284–288Google Scholar
- (39)Cryo-EM structure of oxysterol-bound human Smoothened coupled to a heterotrimeric GiNature 571:279–283Google Scholar
- (40)A conserved molecular switch in Class F receptors regulates receptor activation and pathway selectionNat. Commun 10Google Scholar
- (41)Methods in NeurosciencesElsevier pp. 366–428Google Scholar
- (42)Cholesterol access in cellular membranes controls Hedgehog signalingNat. Chem. Biol 16:1303–1313Google Scholar
- (43)Markov State Models: From an Art to a ScienceJ. Am. Chem. Soc 140:2386–2396Google Scholar
- (44)Markov State Models Provide Insights into Dynamic Modulation of Protein FunctionAcc. Chem. Res 48:414–422Google Scholar
- (45)The substrate import mechanism of the human serotonin transporterBiophys. J 121:715–730Google Scholar
- (46)Molecular Basis of the Glucose Transport Mechanism in PlantsACS Cent. Sci 5:1085–1096Google Scholar
- (47)Free Energy Landscape of the Complete Transport Cycle in a Key Bacterial TransporterACS Cent. Sci 4:1146–1154Google Scholar
- (48)Dopamine D3 receptor antagonist reveals a cryptic pocket in aminergic GPCRsSci. Rep 8Google Scholar
- (49)Structural basis for ligand modulation of the CCR2 conformational landscapeProc. Natl. Acad. Sci. U.S.A 116:8131–8136Google Scholar
- (50)Distinct Binding Mechanisms for Allosteric Sodium Ion in Cannabinoid ReceptorsACS Chem. Neurosci 13:379–389Google Scholar
- (51)Mechanistic origin of partial agonism of tetrahydrocannabinol for cannabinoid receptorsJ. Biol. Chem 298:101764Google Scholar
- (52)Allosteric Effect of Nanobody Binding on Ligand-Specific Active States of the β2 Adrenergic ReceptorJ. Chem. Inf. Model 61:6024–6037Google Scholar
- (53)Universality of the Sodium Ion Binding Mechanism in Class A G-Protein-Coupled ReceptorsAngew. Chem 57:3048–3053Google Scholar
- (54)Dynamic and Kinetic Elements of µ-Opioid Receptor Functional SelectivitySci. Rep 7Google Scholar
- (55)Atomic-Level Characterization of the Methadone-Stabilized Active Conformation of µ-Opioid ReceptorMol. Pharmacol 98:475–486Google Scholar
- (56)Elucidating ligand-modulated conformational landscape of gpcrs using cloud-computing approachesMethods Enzymol 557:551–572Google Scholar
- (57)Enhanced Modeling via Network Theory: Adaptive Sampling of Markov State ModelsJ. Chem. Theory Comput 6:787–794Google Scholar
- (58)Identification of slow molecular order parameters for Markov model constructionJ. Chem. Phys 139:015102Google Scholar
- (59)Improvements in Markov State Model Construction Reveal Many Non-Native Interactions in the Folding of NTL9J. Chem. Theory Comput 9:2000–2009Google Scholar
- (60)Residue 6.43 defi receptor function in class F GPCRsNat. Commun 12:3919Google Scholar
- (61)Cryo-EM structure of constitutively active human Frizzled 7 in complex with heterotrimeric GsCell. Res 31:1311–1314Google Scholar
- (62)Small Molecule Inhibition of GDC-0449 Refractory Smoothened Mutants and Downstream Mechanisms of Drug ResistanceCancer Res 71:435–444Google Scholar
- (63)Structural Basis of Smoothened Activation in Hedgehog SignalingCell 174:312–324Google Scholar
- (64)Hedgehog Pathway Modulation by Multiple Lipid Binding Sites on the Smoothened Effector of Signal ResponseDev. Cell 26:346–357Google Scholar
- (65)Converse Conformational Control of Smoothened Activity by Structurally Related Small MoleculesJ. Biol. Chem 284:20876–20884Google Scholar
- (66)The pore dimensions of gramicidin ABiophys. J 65:2455–2460Google Scholar
- (67)Cholesterol Interaction Sites on the Transmembrane Domain of the Hedgehog Signal Transducer and Class F G Protein-Coupled Receptor SmoothenedStructure 27:549–559Google Scholar
- (68)Dynamic Role of the G Protein in Stabilizing the Active State of the Adenosine A2A ReceptorStructure 27:703–712Google Scholar
- (69)Differences in Allosteric Communication Pipelines in the Inactive and Active States of a GPCRBiophys. J 107:422–434Google Scholar
- (70)Thermostabilization of the β1-adrenergic receptor correlates with increased entropy of the inactive stateJ. Phys. Chem. B 117:7283–7291Google Scholar
- (71)Comparative Protein Structure Modeling Using ModellerCurr. Protoc. Bioinform :15Google Scholar
- (72)CHARMM-GUI: A web-based graphical user interface for CHARMMJ. Comput. Chem 29:1859–1865Google Scholar
- (73)CHARMM-GUI Membrane Builder for Complex Biological Membrane Simulations with Glycolipids and LipoglycansJ. Chem. Theory Comput 15:775–786Google Scholar
- (74)Update of the CHARMM All-Atom Additive Force Field for Lipids: Validation on Six Lipid TypesJ. Phys. Chem. B 114:7830–7843Google Scholar
- (75)Optimization of the additive CHARMM all-atom protein force fi targeting improved sampling of the backbone φ, ψ and side-chain χ1 and χ2 dihedral anglesJ. Chem. Theory Comput 8:3257–3273Google Scholar
- (76)CGenFF interface at paramchemhttps://cgenff.umaryland.edu/
- (77)CHARMM general force fi A force fi for drug-like molecules compatible with the CHARMM all-atom additive biological force fi ldsJ. Comput. Chem Google Scholar
- (78)Automation of the CHARMM General Force Field (CGenFF) II: Assignment of Bonded Parameters and Partial Atomic ChargesJ. Chem. Inf. Model :3155–3168Google Scholar
- (79)CHARMM-GUI Input Generator for NAMD, GROMACS, AMBER, OpenMM, and CHARMM/OpenMM Simulations Using the CHARMM36 Additive Force FieldJ. Chem. Theory Comput 12:405–413Google Scholar
- (80)Lipid content of brain, brain membrane lipid domains, and neurons from acid sphingomyelinase defi t miceJ. Neurochem 107:329–338Google Scholar
- (81)Comparison of simple potential functions for simulating liquid waterJ. Chem. Phys 79:926–935Google Scholar
- (82)Long-Time-Step Molec ular Dynamics through Hydrogen Mass RepartitioningJ. Chem. Theory Comput 11:1864–1874Google Scholar
- (83)AMBERGoogle Scholar
- (84)Rattle: A “velocity” version of the shake algorithm for molecular dynamics calculationsJ. Comput. Phys 52:24–34Google Scholar
- (85)An overview of the Amber biomolecular simulation packageWiley Interdiscip. Rev. Comput. Mol. Sci 3:198–210Google Scholar
- (86)The Amber biomolecular simulation programsJ. Comput. Chem 26:1668–1688Google Scholar
- (87)Routine Microsecond Molecular Dynamics Simulations with AMBER on GPUs. 1Generalized Born. J. Chem. Theory Comput 8:1542–1555Google Scholar
- (88)Routine Microsecond Molecular Dynamics Simulations with AMBER on GPUs. 2. Explicit Solvent Particle Mesh EwaldJ. Chem. Theory Comput 9:3878–3888Google Scholar
- (89)Scalable molecular dynamics with NAMDJ. Comput. Chem 26:1781–1802Google Scholar
- (90)Scalable molecular dynamics on CPU and GPU architectures with NAMDJ. Chem. Phys 153:044130Google Scholar
- (91)Langevin thermostat for rigid body dynamicsJ. Chem. Phys 130:234101Google Scholar
- (92)Particle mesh Ewald: An N log (N) method for Ewald sums in large systemsJ. Chem. Phys 98:10089–10092Google Scholar
- (93)PyEMMA 2: A Software Package for Estimation, Validation, and Analysis of Markov ModelsJ. Chem. Theory Comput 11:5525–5542Google Scholar
- (94)PTRAJ and CPPTRAJ: Software for Processing and Analysis of Molecular Dynamics Trajectory DataJ. Chem. Theory Comput 9:3084–3095Google Scholar
- (95)VMD: Visual molecular dynamicsJ. Mol. Graph 14:33–38Google Scholar
- (96)An Efficient Library for Parallel Ray Tracing and Animation. M.Sc. thesis, Computer Science DepartmentUniversity of Missouri-Rolla Google Scholar
- (97)The PyMOL Molecular Graphics Systemhttps://pymol.org/2/
- (98)MDTraj: A Modern Open Library for the Analysis of Molecular Dynamics TrajectoriesBiophys. J 109:1528–1532Google Scholar
- (99)Matplotlib: A 2D graphics environmentComput. Sci. Eng 9:90–95Google Scholar
- (100)seaborn: statistical data visualizationJ. Open Source Softw 6:3021Google Scholar
- (101)Array programming with NumPyNature 585:357–362Google Scholar
- (102)GetContacts, Interaction analysis for molecular structures and ensembleshttps://getcontacts.github.io/
- (103)mdentropy: v0.2https://doi.org/10.5281/zenodo.18859
- (104)A note on two problems in connexion with graphsNumer. Math 1:269–271Google Scholar
- (105)Exploring Network Structure, Dynamics, and Function using NetworkXIn: Proceedings of the 7th Python in Science Conference. Pasadena, CA USA pp. 11–15Google Scholar
Article and author information
Author information
Version history
Copyright
© 2022, Bansal et al.
This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.