Our research interests are in the general area of biomolecular interactions. For the last several years our group has been working in parallel in the areas of protein folding and protein-ligand recognition, though our recent work suggests that these two areas are more convergent than strictly parallel. On the one hand, we find evidence of a significant role for protein folding in molecular recognition, and on the other hand we find that protein folding can be viewed, both conceptually and experimentally, as a special case of molecular recognition in which the partners are normally intramolecular. A unifying theme of our work in these two areas is a focus on understanding the molecular origins of affinity, specificity, and cooperativity in the two kinds of recognition processes, and relating these features to the requirements of the biological systems they serve (for a recent review, see Szwajkajzer & Carey, 1997). We have worked on a broad range of systems, usually choosing those that display intriguing features in their folding or ligand-recognition reactions, in the hope that detailed comparisons may elucidate general principles important for folding and recognition.

Molecular strategies for specificity in protein-ligand interactions. We chose several bacterial regulons for our studies because these systems display a delicate balance of affinity and specificity. Regulons are regulatory units consisting of multiple DNA sites with imperfect sequence identity, recognized by a single master regulatory protein, which therefore must tolerate a range of DNA sequences while maintaining the ability to reject slightly more distantly related ones (Lavoie & Carey, 1994). In the case of the tryptophan repressor, TrpR, through a combination of structural (Lawson & Carey, 1993) and biochemical (Carey et al., 1991; Jin et al., 1993; Yang et al., 1996; Jin et al., 1999) approaches we identified multiple mechanisms, including folding coupled to ligand binding, that TrpR apparently uses to modulate affinity and specificity of DNA binding. Another unexpected finding was that the molecular origin of the extreme positive cooperativity observed for TrpR binding to very short DNA targets is rooted largely, though artificially, in constraints on stoichiometry (Carey et al., 1991), and we exploited this property to cocrystallize a tandem, cooperative complex of two TrpR dimers on one 17 bp target (Carey et al., 1993). We are now asking what mechanisms for control of affinity and specificity are used by the arginine repressor, which, in addition to its role as master regulator of one of the largest bacterial biosynthetic regulons, inexplicably also functions as a required factor in plasmid recombination, and thus apparently represents a novel class of proteins with the dual functions of gene regulation and gene organization; the specificity of such a class of proteins is of obvious interest.

Our studies on ArgR exemplify the range of molecular and biophysical techniques we typically employ. The E. coli protein is a hexamer of 100 kDa whose crystals diffract poorly and which is too large for direct NMR analysis by current methods. Using proteolytic methods that we refined in our work on protein folding, we identified structural and functional domains of the repressor according to both in vitro and in vivo criteria (Grandori et al., 1995), and showed that a minimal unit for DNA binding is a monomer of only 8 kDa that is soluble to 5 mM in water at neutral pH. We recently solved the NMR structure of this fragment using both unlabeled and 15N-labeled protein, and have combined it with an extensive body of biochemical data (Tian et al., 1992) on the intact hexamer to reconstruct a model for the complex with DNA (Sunnerhagen et al., 1997). The model suggests a large number of experimentally testable hypotheses about the regulatory and organizational functions of this protein, which are being tested in the current phase of our work. A major area of continuing investigation is determination of affinity, specificity, stoichiometry, and cooperativity of DNA binding by both wildtype and superrepressor mutants of ArgR to various DNA constructs (D. Szwajkajzer). The approaches being used include quantitative gel-mobility shift and footprinting titrations, two methods in which we have a great deal of expertise (Carey, 1991; Yang and Carey, 1995).
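As an illustration of how affinity and cooperativity are typically read from a titration of this kind, the following minimal sketch computes the fraction of DNA bound as a function of protein concentration. It assumes a simple Hill-type isotherm and a DNA concentration far enough below the dissociation constant that free and total protein are effectively equal; the function name, parameter names, and numerical values are hypothetical and are not taken from the studies cited above.

```python
def fraction_bound(protein_conc, kd, hill_n=1.0):
    """Fraction of DNA bound under an assumed Hill-type isotherm.

    protein_conc: free protein concentration (same units as kd)
    kd: concentration giving half-maximal binding (assumed model parameter)
    hill_n: Hill coefficient; 1.0 means no cooperativity (assumed model parameter)
    """
    if protein_conc < 0 or kd <= 0 or hill_n <= 0:
        raise ValueError("concentrations and Hill coefficient must be positive")
    p_n = protein_conc ** hill_n
    return p_n / (kd ** hill_n + p_n)


if __name__ == "__main__":
    # Hypothetical titration series in molar units, for illustration only.
    kd = 1e-9
    for conc in (1e-10, 3e-10, 1e-9, 3e-9, 1e-8):
        noncoop = fraction_bound(conc, kd, hill_n=1.0)
        coop = fraction_bound(conc, kd, hill_n=2.0)
        print(f"{conc:.1e} M  n=1: {noncoop:.3f}  n=2: {coop:.3f}")
```

In a sketch like this, a Hill coefficient above 1 steepens the transition around the midpoint, which is how positive cooperativity would appear in a titration curve; a real analysis would fit the model to measured band intensities rather than assume parameter values.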

Understanding protein structural hierarchies, stabilities, and folding. We demonstrated that proteolytic fragments of TrpR too small to exhibit stable secondary or tertiary structures in isolation could acquire such structures upon reconstitution with each other (Tasayco & Carey, 1992). The reassembly reaction regenerated a native-like structure in an obligately ordered series of steps which we speculated might reflect the order of steps in the folding pathway of intact TrpR, an hypothesis that is still under experimental evaluation. To test the more general hypothesis that reassembly might mimic folding, and thus might represent an alternative approach to studying folding pathways in complex proteins, we carried out similar experiments on cytochrome c, the folding pathway of which was already partly elucidated in structural terms. We reasoned that since the first structured intermediate on the cyt c folding pathway could form when the rest of the chain was unfolded, then it should also form when the rest of the chain was removed by proteolysis. We demonstrated that a two-fragment complex representing the first intermediate can form with surprisingly high affinity despite the absence of the central half of the protein chain (Wu et al., 1993), indicating that noncovalently bound fragments can recognize each other to form at least part of the fold of the protein. The similarity between this complex and the corresponding part of the intact protein structures is still under investigation. The results of our reconstitution studies have several implications about protein organization (Wu et al., 1994), one of the most significant of which is the suggestion that, if chain connectivity is not required at all points to form the fold, then chain topology (the linear order of secondary structure elements in the chain) cannot be the determinative feature of the fold, and should be de-emphasized as a basis for protein classification.

Recent work on cyt c has included the demonstration that two short peptide fragments can associate with heme noncovalently to form a three-component complex with substantial secondary structure (Kang & Carey, 1999). This finding opens the way for NMR analysis to determine the structure of the noncovalent complex in work in progress with J. Carson ’99 and I. Pelczer. We also recently completed the semisynthesis of two cytochromes c in which the heme group is replaced by biphenyl or by phenanthrene, each covalently attached via thioether links to Cys 14 and 17 just as the native heme is normally attached (Kang & Carey, submitted). Surprisingly, both heme proxy groups lead to formation of substantial helix content relative to apocyt c, despite their small size relative to the porphyrin group. In addition, helix content and stability are similar in the two derivatives despite large differences in rigid planarity between them. Taken together, the results highlight the dominant role of heme hydrophobicity in organizing helical structure in cytochrome c.

One general strategy in our work on protein structural organization exploits nonspecific proteolysis to approach several aspects of protein structure/function relationships by coupling that simple, old technology to modern cloning (and, in some cases, semisynthesis) and high-resolution structural methods. Because proteases with very different sequence specificities often have kinetically preferred cleavage sites close together in any given substrate protein, preferential cleavages are apparently controlled by accessibility of the peptide chain. Therefore proteolytic dissection is a general method for probing the structure and dynamics of the native state, provided careful attention is paid to the kinetics of cleavage (Wu et al., 1994). Our approach to proteolysis has the virtue of being quite general and systematic (Carey, Meths. Enzymol., in press), and we are presently applying it to define structural and functional domains of a group of cancer-related proteins (S. Osterdahl-Brijker with M. Sunnerhagen) and to further dissect the structural subdomains of cytochrome c (A. Kesarwala).

In our work on protein structural hierarchies we also do some sequence homology analysis to try to identify distant relatives of proteins or their subdomains. One such analysis led us to propose that the protein WrbA of E. coli is the founding member of a new class of possibly ubiquitous, novel flavodoxin-like proteins that use the flavodoxin domain itself to form multimers (Grandori & Carey, 1994). We then isolated WrbA and demonstrated that it binds FMN and participates in a dimer-tetramer equilibrium at micromolar concentrations (Grandori et al., 1998). In future work we intend to probe the structural requirements for WrbA multimerization both by proteolytic dissection coupled with analysis of the multimeric state, and by direct structural determination. Another intriguing question we intend to pursue with WrbA is the molecular basis for control of FMN affinity and redox potential, since this protein binds its redox cofactor much more weakly than typical flavodoxins, though still specifically.
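To make the dimer-tetramer equilibrium concrete, the sketch below computes the fraction of protein present as tetramer at a given total concentration. It assumes a simple two-state scheme, 2 D ⇌ T, with dissociation constant Kd = [D]²/[T], and expresses total protein in dimer units; the function name and the numerical values are hypothetical and are not measured WrbA parameters.

```python
import math


def fraction_tetramer(total_dimer, kd):
    """Fraction of dimer units found in tetramers for an assumed 2 D <=> T scheme.

    total_dimer: total protein concentration in dimer units, [D] + 2[T]
    kd: tetramer dissociation constant, [D]^2 / [T] (same units as total_dimer)
    """
    if total_dimer < 0 or kd <= 0:
        raise ValueError("total_dimer must be >= 0 and kd must be > 0")
    if total_dimer == 0:
        return 0.0
    # Mass balance: total = [D] + 2 [D]^2 / kd, a quadratic in the free dimer [D].
    free_dimer = (kd / 4.0) * (math.sqrt(1.0 + 8.0 * total_dimer / kd) - 1.0)
    return 1.0 - free_dimer / total_dimer


if __name__ == "__main__":
    # Hypothetical micromolar-range values, for illustration only.
    kd = 1e-6
    for total in (1e-7, 1e-6, 1e-5, 1e-4):
        print(f"{total:.0e} M total dimer -> {fraction_tetramer(total, kd):.2f} tetramer")
```

Under this assumed scheme, half of the dimer units are in tetramers when the total dimer concentration equals Kd, so an equilibrium observed at micromolar concentrations would correspond to a Kd of roughly that magnitude.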

Another link between our experimental work and more theoretical work is a recent combined analysis of subdomains in TrpR and its proteolytic fragments (Wallqvist et al., 1999). The use of proteolysis to test predicted stable substructures may have promise for elucidating structures of unknown proteins that are available in amounts adequate only for analytical protein chemistry.