What we are working on

 
Schematic overview of chromatin organization with respect to Cajal bodies and splicing.

We have established experimental systems in budding yeast, zebrafish embryos, and mammalian tissue culture cells to explore the regulation of transcription and splicing in a variety of biological contexts, using diverse tools that range from imaging to genome-wide approaches. Our observations have provided novel insights into transcription and splicing mechanisms as well as principles of cellular organization that facilitate efficient gene expression.

Our main research interests include:

Coordination of splicing and transcription

Schematic describing terminal exon pausing in co-transcriptional splicing.

All protein-coding genes are transcribed by RNA polymerase II (Pol II); the resulting pre-mRNA transcripts are spliced by a distinct macromolecular machine, the spliceosome, to produce mRNA. These two reactions, transcription and splicing, can occur independently of one another in vitro. Using “splicing factor ChIP” (chromatin immunoprecipitation), a method we developed, we showed that the spliceosome assembles while the nascent transcript is still attached to chromatin by Pol II. Thus, transcription and chromatin have the potential to influence splicing outcome in vivo. Current projects investigate how regulatory factors and chromatin modifications determine splicing efficiency and which of the many possible alternative transcripts cells express.

A long-standing question in the field has been whether transcription and splicing are directly coupled. Using a genome-wide approach in budding yeast, we recently discovered that Pol II pauses within terminal exons, allowing highly efficient co-transcriptional splicing. Until now, Pol II pausing was thought to occur regularly only during transcription initiation and termination. The phenomenon of terminal exon pausing indicates that specific mechanisms have evolved to directly couple transcription and splicing. We plan to determine the molecular mechanism of terminal exon pausing and how co-transcriptional splicing contributes to gene expression.

Why do genes have introns in the first place? We recently discovered that the presence of introns in genes enhances transcriptional output and fidelity. Because this effect is splicing-dependent, it represents another kind of coordination between splicing and transcription. In that study, we showed that the activating histone modifications H3K4me3 and H3K9ac map specifically to first exon-intron boundaries. This was surprising, because these marks help recruit general transcription factors (GTFs) to promoters. In genes with long first exons, promoter-proximal levels of H3K4me3 and H3K9ac were greatly reduced; consequently, GTFs and Pol II were low at transcription start sites (TSSs) and exhibited a second, promoter-distal peak from which transcription also initiates. In contrast, short first exons led to increased H3K4me3 and H3K9ac at promoters, higher expression levels, greater accuracy in TSS usage, and a lower frequency of antisense transcription. First exon length is therefore predictive of gene activity, and gene architecture and splicing together determine transcription quantity and quality as well as chromatin signatures. These observations indicate that gene architectures have evolved to take advantage of a distance-dependent, enhancer-like activity present at the end of first exons. They also raise the intriguing possibility that the transcription and splicing history of genes may be preserved in the longer term (e.g. across cell cycles) through these chromatin signatures deposited near promoters.

Schematic of H3K4me3 controlling transcription along first exons.

Cajal Bodies and the macromolecular assembly of RNPs

Fluorescence microscopy image showing green Cajal bodies in an early zebrafish embryo.

Cajal bodies (CBs) were identified more than 100 years ago by Ramón y Cajal in vertebrate neurons. The function of these 0.5-1 µm spherical structures, which like other cellular subcompartments (PML bodies, P bodies, P granules, stress granules, nucleoli) lack membranes, has long been mysterious. Do these bodies have functions per se? Or are they just sticky places where molecules collect? Using live-cell imaging, we have shown that assembly of the macromolecular splicing complexes, the spliceosomal snRNPs, occurs in CBs. Mathematical modeling predicted that snRNP assembly is ~10-fold more efficient when CBs are present, suggesting that CBs increase the efficiency of gene expression by facilitating splicing.
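The intuition behind such a prediction can be illustrated with a deliberately simple calculation. The sketch below is not the model used in our study; it assumes a single bimolecular assembly step whose rate scales with the product of the two component concentrations, and it assumes that both components partition equally into the CB. The function name, the volume fraction, and the partition fraction are all hypothetical values chosen only for illustration.

```python
def fold_enhancement(volume_fraction: float, partition_fraction: float) -> float:
    """Toy estimate of how much a compartment speeds a bimolecular reaction.

    Assumptions (illustrative only, not taken from the published model):
    - the assembly rate in a volume is proportional to [A] * [B] * volume;
    - a fraction `partition_fraction` of BOTH components sits in a body that
      occupies `volume_fraction` of the nucleus;
    - the rate constant is the same inside and outside the body.

    Returns the total assembly rate with the body present, divided by the
    rate when the same molecules are spread evenly through the nucleus.
    """
    if not 0.0 < volume_fraction < 1.0:
        raise ValueError("volume_fraction must lie strictly between 0 and 1")
    if not 0.0 <= partition_fraction <= 1.0:
        raise ValueError("partition_fraction must lie between 0 and 1")

    f, p = volume_fraction, partition_fraction
    rate_inside = p * p / f                      # concentrated pool in the body
    rate_outside = (1.0 - p) ** 2 / (1.0 - f)    # depleted pool in the nucleoplasm
    return rate_inside + rate_outside


if __name__ == "__main__":
    # Hypothetical numbers: bodies fill 1% of the nucleus and hold 30% of each component.
    print(f"fold enhancement: {fold_enhancement(0.01, 0.30):.2f}")
    # No enrichment (partition equals volume fraction) gives no speed-up.
    print(f"no enrichment:    {fold_enhancement(0.01, 0.01):.2f}")
```

In this toy picture the gain comes entirely from the squared dependence on local concentration: a small compartment that captures a sizeable share of both partners outperforms the uniformly mixed nucleus, whereas a compartment with no enrichment makes no difference.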

We established the zebrafish embryo as a model to test CB function. Combining high-resolution imaging in live embryos, targeted knockdown, biochemistry, and molecular biology techniques, we identified an essential function of CBs. Loss of CBs resulted in splicing defects and embryonic lethality, due to an inability to assemble sufficient snRNPs. Thus, CBs promote efficient macromolecular assembly of snRNPs. This work reveals a novel element in cellular logistics, in which CBs and likely other such compartments facilitate macromolecular assembly by concentrating interacting components without the diffusional barrier of membranes. We wonder whether the CB provides a “catalytic surface” for macromolecular assembly, perhaps by aligning interaction partners in favorable orientations. We are taking in vivo and in vitro approaches to understand the structure and molecular function of CBs in snRNP assembly.

mRNP Formation, Composition, and Function

Fluorescence microscopy image showing SR proteins, nuclei, and cytoplasm.


Genomes encode many hundreds of RNA-binding proteins that have roles in transcription, splicing, subcellular localization, stability, and translation. Yet we lack a comprehensive understanding of how they work. Each mRNA is bound by numerous RNA-binding proteins during its lifetime. How do nascent and mature mRNPs assemble? What is their composition? What are the specific functions of mRNP components in gene expression? These questions currently represent a black box in our knowledge of gene expression.

Our lab studies a family of essential RNA-binding proteins, the SR proteins, as representatives of this class of regulators. We established physiological expression of tagged versions of each SR protein from bacterial artificial chromosomes (BACs) stably integrated into multipotent murine cell lines. The uniform tag on each protein facilitates biochemical purification of SR protein-specific mRNPs, from which protein and RNA components are analyzed. We identified the mRNA cargoes of SR proteins in cycling and neural cells and found that individual SR proteins associate with a discrete set of mRNAs that changes upon neural differentiation.


Many target mRNAs required the cognate SR protein for their expression. Identification of mRNP components in cycling and neural cells by mass spectrometry is in progress. Targeted depletion of individual SR proteins leads to discrete, largely non-overlapping changes in alternative splicing. Our vision is that the SR proteins provide an opportunity to systematically determine the role of RNA-binding proteins in each step of gene expression, because we can compare and contrast family members that are structurally highly related. We are currently generating large genome-wide datasets to provide insight into the function of SR proteins at all phases of gene expression.