First published online April 22, 2005; 10.1105/tpc.105.032185 © 2005 American Society of Plant Biologists
Antiquity of MicroRNAs and Their Targets in Land Plants
ABSTRACT

INTRODUCTION
22-nucleotide regulatory RNAs that derive from stem-loop regions of endogenous precursor transcripts (Ambros, 2004
miRNAs are important controllers of development in flowering plants (Dugas and Bartel, 2004). The majority of known miRNA targets in Arabidopsis thaliana code for proteins with a known or suspected role in developmental control (Rhoades et al., 2002; Jones-Rhoades and Bartel, 2004). Disruption of individual miRNAs, or of their ability to properly regulate their targets, has been shown to cause floral and leaf-patterning defects (miR159 and miR319; Palatnik et al., 2003; Achard et al., 2004; Millar and Gubler, 2005), floral development and timing defects (miR172; Aukerman and Sakai, 2003; Chen, 2004), loss of organ polarity and altered vascular development (miR165/166; McConnell et al., 2001; Emery et al., 2003; Juarez et al., 2004; Mallory et al., 2004b; McHale and Koning, 2004; Zhong and Ye, 2004; Kim et al., 2005), defective organ separations and aberrant numbers of floral organs (miR164; Laufs et al., 2004; Mallory et al., 2004a; Baker et al., 2005), aberrant phyllotaxis, reduced fertility, and abortion of the shoot apical meristem (miR168; Vaucheret et al., 2004), and cotyledon and rosette leaf shape and symmetry defects, reduced fertility, and misexpression of early auxin response genes (miR160; Mallory et al., 2005). A null mutation in the Dicer-Like 1 (DCL1) locus, which codes for an endonuclease critical for miRNA accumulation (Park et al., 2002; Reinhart et al., 2002), causes embryonic lethality (Schauer et al., 2002), further implicating plant miRNAs in the elaboration of the multicellular plant body plan. Given these clear roles in plant development, it has been proposed that precise regulation of miRNA activity during various stages of growth and in specific cell types is of central importance for normal plant development (Rhoades et al., 2002; Bartel, 2004).
Many Arabidopsis miRNAs are conserved among flowering plants. For most miRNAs cloned from Arabidopsis, exact or nearly exact matches can be found in the rice (Oryza sativa) genome, which if transcribed would be in a context predicted to fold into stem-loops characteristic of miRNA primary transcripts (Reinhart et al., 2002; Bonnet et al., 2004; Wang et al., 2004). Similarly, rice homologs of many Arabidopsis miRNA targets have conserved miRNA complementary sites, implying that these miRNA-target interactions have been functioning at least since the last common ancestor of monocots and eudicots (Rhoades et al., 2002; Jones-Rhoades and Bartel, 2004; Sunkar and Zhu, 2004). Additional evidence for conservation of plant miRNAs has come from EST sequence data from diverse flowering plants and occasional nonflowering plants, in which sequences containing miRNA hairpins as well as sequences homologous to the known or predicted Arabidopsis targets retaining miRNA complementary sites have been observed (Palatnik et al., 2003; Jones-Rhoades and Bartel, 2004; Sunkar and Zhu, 2004). Direct observations have shown that miRNAs in the miR165/166 family are expressed and functional in wheat (Triticum aestivum; Tang et al., 2003; Mallory et al., 2004b) and maize (Zea mays; Juarez et al., 2004) and guide cleavage of homologous target mRNAs in basal plants such as the lycopod Selaginella kraussiana (Floyd and Bowman, 2004), implying that the miR165/166 regulatory circuit has remained intact since the last common ancestor of vascular plants.
Several different approaches enabling multiplexed detection of miRNAs using microarray technologies have been reported (Krichevsky et al., 2003; Babak et al., 2004; Liu et al., 2004; Miska et al., 2004; Nelson et al., 2004; Sun et al., 2004; Thomson et al., 2004; Liang et al., 2005), including one from our laboratory (Baskerville and Bartel, 2005). In our approach, probes consist of Tm-normalized DNA oligonucleotides antisense to the given small RNA sequence. Sample preparation begins by selecting small RNAs with the characteristic features of miRNAs, followed by reverse transcription and PCR amplification with a fluorescently labeled primer. Single-stranded Cy3-labeled biological samples are then hybridized to the array along with a synthetic reference library containing a constant amount of Cy5-labeled single-stranded DNA sample, which allows internal normalization of the experiments. This technology has proven to be semiquantitative, sensitive, and highly reproducible in experiments with vertebrate miRNAs (Baskerville and Bartel, 2005).
In this study, a microarray suitable for the detection of Caenorhabditis elegans, Drosophila melanogaster, and Arabidopsis miRNAs is described. Using this platform, the overall miRNA expression profile within the major organs of Arabidopsis was determined, providing a useful baseline for understanding the developmental dynamics of plant miRNA expression. Comparison with existing mRNA expression data revealed a significant negative correlation between the levels of miRNAs and those of their target messages. The array was also used as a phylogenetic profiling tool to probe RNA samples derived from specimens representative of major clades of land plants. We detected members of 11 miRNA families in a gymnosperm, eight in a fern, three in a lycopod, and two in a moss, indicating that many plant miRNA families have been long conserved during land plant evolution. Using a strategy for identification and validation of miRNA-regulated transcripts in the absence of any genomic information, we identified targets for several of these conserved miRNAs in organisms as divergent as Arabidopsis and moss. The newly identified targets of miR160, miR167, miR170/171, and miR172 in nonflowering plants were all homologous to the known Arabidopsis targets, demonstrating that multiple miRNA-target interactions have remained unchanged over very long periods of plant evolution.
RESULTS
Many plant miRNAs are members of closely related families that differ by only a few nucleotides in sequence. We arrayed separate spots for closely related family members if there were one or more nucleotide differences in the center portion of the sequence (more than four nucleotides from both the 5' and 3' termini). To test the discrimination between these closely related sequences, one-half of the synthetic reference library was labeled with Cy5 and the other with Cy3 and hybridized in duplicate to the array. The selection of Cy5- and Cy3-labeled oligonucleotides was such that nearly all members of closely related families were tested against each other. This experiment showed that 13 out of the 63 plant spots were cross-hybridizing (see Supplemental Table 2 online). Twelve of these cases could be sorted into seven families of small RNAs within which a closely related probe could be found that most likely accounts for the cross-hybridization. Thus, signals from spots within these seven families reflect the combination of the closely related family members. This experiment also revealed that for five other closely related plant small RNAs, discrimination between species differing by one or two nucleotides was achieved (see Supplemental Table 2 online). Because these experiments tested equimolar concentrations of all samples, cross-hybridization might still be a problem if a slightly mismatched miRNA was present in a biological sample at much higher concentrations than was the perfectly matched RNA. Nonetheless, because all cases of observed cross-hybridization among plant-specific probes, save that of miR158, could be accounted for by closely related probes, we conclude that the array is specifically reporting the abundance of the intended miRNA families.
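The spotting rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure used to design the array: it assumes the two family members have the same length and are compared position by position without gaps, and it reads "more than four nucleotides from both termini" as excluding the first four and last four positions. The function name `needs_separate_spot` and the example sequences are hypothetical.

```python
def needs_separate_spot(seq_a, seq_b, flank=4):
    """Return True if two family members differ in the central region,
    i.e. anywhere outside the first `flank` and last `flank` positions.

    Assumption: equal-length, ungapped sequences (an illustrative simplification).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("this sketch only compares equal-length sequences")
    central_positions = range(flank, len(seq_a) - flank)
    return any(seq_a[i] != seq_b[i] for i in central_positions)


# Hypothetical 21-nt sequences, not real miRNAs.
member_1 = "AUGCAUGCAUGCAUGCAUGCA"
member_2 = member_1[:10] + "C" + member_1[11:]   # differs in the centre
member_3 = member_1[:-1] + "U"                   # differs only at the 3' end

print(needs_separate_spot(member_1, member_2))
print(needs_separate_spot(member_1, member_3))
```

Under these assumptions, a central difference calls for its own spot, whereas a difference confined to the termini does not.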
Previously, it has been demonstrated that the cloning frequency of C. elegans miRNAs from a mixed stage total RNA sample correlates with absolute molecular abundance, as determined by quantitative RNA gel blots (Lim et al., 2003). When C. elegans miRNAs were analyzed by hybridization to the array, a positive linear correlation was observed between cloning frequency and array value, indicating that array value is generally indicative of RNA abundance (Figure 1A). However, it is important to note that there was significant variation in this relationship between array value and RNA abundance and that, therefore, two miRNAs with a similar array value may in fact vary substantially in their absolute abundance within a given sample. Of course, the primary purpose of expression analysis is not to compare the abundance of different miRNAs within a sample, but to compare the relative abundance of individual miRNAs across several samples. To investigate the array's utility for this purpose, four small RNA hybridizations were performed from two samples of C. elegans total RNA [wild type, mixed stage worms, and glp-4(bn2) worms; see Supplemental Table 3 online]. Figures 1B and 1C show that the technical variation was very low; linear regressions gave r² values of 0.949 and 0.969 [for the wild type and glp-4(bn2), respectively]. Therefore, we concluded that an individual array value was highly reproducible and that large variations in the array value for a given miRNA across an experimental panel were indicative of different steady state levels of that species within the different samples.
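As a minimal sketch of how such a replicate comparison can be quantified, the snippet below computes r² for a simple linear regression between two technical replicates, which equals the squared Pearson correlation. The function name `r_squared` and the array values are hypothetical; the text does not say what software or data transformation was used.

```python
def r_squared(x, y):
    """Coefficient of determination for a simple linear regression of y on x.

    For one predictor this equals the squared Pearson correlation.
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length series with at least two points")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("a constant series has no defined correlation")
    return (sxy * sxy) / (sxx * syy)


# Hypothetical array values for the same spots in two technical replicates.
replicate_a = [1.0, 2.1, 3.0, 4.2, 5.1]
replicate_b = [1.1, 2.0, 3.2, 4.0, 5.0]
print(round(r_squared(replicate_a, replicate_b), 3))
```

Values close to 1 indicate that the two hybridizations rank and scale the spots almost identically.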
A graphical representation of array values organized by hierarchical clustering of both genes and experiments is shown in Figure 2A. For clarity, values derived from spots designed for predicted rice homologs as well as any spot that was not above the detection threshold in both replicates in at least one organ were omitted from display. Consistent with the high reproducibility of these biological replicates, clustering of the experiments placed all biological replicates as most closely related to each other, with the exception of the long- and short-day seedling replicates, which were intercalated.
The organ expression map showed that non-miRNA small RNAs can be developmentally regulated. For instance, the DCL1-independent and RNA-dependent RNA polymerase 2-dependent species siRNA02, which originates from an inverted repeat on chromosome V (Xie et al., 2004), was detected by the array only in siliques and in two of four inflorescence samples. It is also worthwhile to note that many miRNAs were found to exhibit relatively uniform accumulation across the panel of tissues assayed. This does not necessarily imply that in these cases precise tissue- or cell type-specific miRNA activities are not important; indeed, such high-resolution miRNA accumulation patterns would be lost when assaying RNA from entire organs. Higher resolution methods to determine spatio-temporal accumulation patterns of miRNAs, such as in situ blot analysis (Chen, 2004; Juarez et al., 2004; Kidner and Martienssen, 2004) or sensor transgenes (Brennecke et al., 2003; Parizotto et al., 2004), will be necessary to discover the precise locations of many of these plant miRNAs.
miRNA Expression Is Generally Anticorrelated with That of Targeted mRNAs
Plant miRNAs generally direct endonucleolytic cleavage of mRNAs (Llave et al., 2002; Tang et al., 2003; Schwab et al., 2005), consistent with the suggestion that plant miRNAs enable rapid clearance of target mRNAs at specific points during plant development (Rhoades et al., 2002; Bartel, 2004). This hypothesis predicts a negative correlation between the expression of a miRNA and its target mRNAs within a given tissue or organ. We tested this hypothesis by comparing the expression levels of the differentially expressed miRNAs shown in Figure 2B with the expression levels of their known and predicted targets (Figure 3). To do this, we made use of portions of the AtGenExpress expression atlas of wild-type Arabidopsis development (Schmid et al., 2005), available from The Arabidopsis Information Resource (www.arabidopsis.org). These expression data consist of triplicates of Affymetrix ATH1 array hybridizations using RNA derived from various organs and growth stages of Columbia-0 (Col-0) wild-type plants. For each of the seven organs sampled in Figure 2, the median relative expression level of each differentially expressed miRNA was plotted versus the median relative expression level of the corresponding targets (Figure 3A). As controls, we compared the expression levels of randomly selected WRKY and MADS box transcription factors (two large families of plant-specific transcription factors, neither of which has any Arabidopsis members known to be regulated by miRNAs) to miRNA levels (Figure 3C). As expected, there was no apparent correlation between the expression of these control genes and that of the differentially expressed miRNAs.
For each set of miRNA versus target, paralogous nontarget, and control mRNA expression data, a correlation coefficient was calculated (Figure 3D). The expression of the majority of miRNA targets was negatively correlated with expression of their corresponding miRNAs. The paralogous nontarget set was also somewhat negatively correlated with miRNA expression, whereas the control set had a large range of correlation coefficients, as expected from a random selection of genes. The median correlation coefficient of the targets was significantly lower than those of both the paralogous nontargets and the controls (P = 0.0038 and P < 0.0001, respectively; Mann-Whitney U-test), demonstrating that in plants, expression of miRNAs and that of their targets are generally negatively correlated.
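The comparison above can be sketched as follows. This is a hedged illustration only: the text does not state which correlation coefficient was used, so Pearson's r is assumed here, and all expression values are hypothetical. The helper `mann_whitney_u` returns only the U statistic; turning it into a P value (as reported above) would require the appropriate reference distribution or a statistics package.

```python
def pearson(x, y):
    """Pearson correlation between two equal-length series (assumed metric)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def mann_whitney_u(sample_a, sample_b):
    """U statistic for sample_a: count of (a, b) pairs with a > b; ties count 0.5."""
    u = 0.0
    for a in sample_a:
        for b in sample_b:
            if a > b:
                u += 1.0
            elif a == b:
                u += 0.5
    return u


# Hypothetical median expression across seven organs for one miRNA and one target.
mirna_levels = [0.2, 0.5, 1.0, 1.8, 2.5, 3.0, 3.6]
target_levels = [3.1, 2.9, 2.2, 1.5, 1.1, 0.6, 0.4]
print(round(pearson(mirna_levels, target_levels), 3))

# Hypothetical per-pair correlation coefficients for targets and controls.
target_coefficients = [-0.9, -0.7, -0.6, -0.4]
control_coefficients = [-0.5, 0.1, 0.3, 0.8]
print(mann_whitney_u(target_coefficients, control_coefficients))
```

A small U for the target set relative to the control set means that target coefficients tend to be lower than control coefficients, which is the pattern described in the text.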
Many Plant miRNA Families Are Ancient
Cloning and computational analyses of Arabidopsis small RNAs suggest that many plant miRNAs and their predicted targets are conserved between monocots and eudicots, which are thought to have diverged >125 million years ago (Reinhart et al., 2002; Bonnet et al., 2004; Jones-Rhoades and Bartel, 2004; Sunkar and Zhu, 2004; Wang et al., 2004; Adai et al., 2005). For the miR165/166 family, functional conservation between eudicots and monocots has been experimentally demonstrated (McConnell et al., 2001; Emery et al., 2003; Tang et al., 2003; Juarez et al., 2004; Mallory et al., 2004b; McHale and Koning, 2004; Zhong and Ye, 2004), and conservation of target mRNA cleavage at the canonical site has been shown to occur in the lycopod S. kraussiana and at an offset potential target site in the moss Physcomitrella patens (Floyd and Bowman, 2004). To directly assay multiple miRNA families for conservation between distantly related land plants, the plant miRNA array was used to analyze samples derived from the eudicot Nicotiana benthamiana, the monocots rice and wheat (T. aestivum), the magnoliid Liriodendron tulipifera, the gymnosperm Pinus resinosa (pine), the fern Ceratopteris thalictroides, the lycopod Selaginella uncinata, and the moss Polytrichum juniperinum. miRNAs, but not endogenous small interfering RNAs (siRNAs), trans-acting siRNAs, or any of the nine families of unclassified Arabidopsis small RNAs with probes present on the array, were detected outside of Arabidopsis (see Supplemental Table 1 online; data not shown). miR161, miR163, and ASRP1729 (data not shown) were not detected outside of Arabidopsis, consistent with the hypothesis that these genes emerged recently (Allen et al., 2004). Out of the 23 families of Arabidopsis miRNAs analyzed, we detected expression of 21 in Arabidopsis (composite of all experiments), 19 in Arabidopsis rosette leaves, 13 in N. benthamiana leaves, 12 in wheat germ lysate, 13 in rice seedlings, 13 in magnoliid leaves, 11 in pine needles, eight in fern leaves and stems, three in lycopod leaves and stems, and two in moss leafy gametophytes (Figure 4A; see Supplemental Table 5 online). We noted that miR158 was detected in N. benthamiana, T. aestivum, and L. tulipifera (data not shown), but we suspect these detections were false positives for three reasons: this probe was the one that uniquely gave unexplained cross-hybridization in our control experiments (see Supplemental Tables 2 and 5 online), miR158 homologs are not computationally evident in either the rice (Jones-Rhoades and Bartel, 2004) or poplar genome (M. Jones-Rhoades, personal communication), and RNA gel blot analysis could not detect miR158 in these RNA samples (data not shown). RNA gel blots were performed for the two miRNAs that were detected in the moss sample, miR160 and miR390, using DNA probes antisense to the Arabidopsis RNA sequences (Figure 4B). Both miRNAs were detected in the same samples as those indicated by the array analyses. For the samples in which the array analysis did not detect miR390, blot analysis showed very slight (N. benthamiana and pine [P. resinosa]) or no (lycopod [S. uncinata]) accumulation of the miRNA, demonstrating that, with the exception of the abnormally performing probe for miR158, the method used to determine the lower limit of detection in the array analyses was sufficiently stringent to prevent false positives, with sensitivity at least comparable to that of RNA gel blots.
miRNA Targets in Nonflowering Plants Are Homologous to Those in Arabidopsis
The extraordinary conservation of the plant miRNAs shown in Figure 4 raised the question of whether their target mRNAs, and by inference their biological functions, have also been conserved. All known plant miRNA:mRNA interactions are characterized by extensive base pairing, and disruption of this pairing has been shown to render the regulatory circuit dysfunctional in numerous independent studies (reviewed in Dugas and Bartel, 2004). In the absence of genomic sequence data, this functionally critical complementarity was exploited to probe the transcriptomes of pine (P. resinosa), fern (C. thalictroides), lycopod (S. uncinata), and moss (P. juniperinum) for mRNAs containing a putative miRNA binding site (Figure 5A). Degenerate oligonucleotides corresponding to the expected sequences of functional miRNA target sites were used as gene-specific primers, along with an oligo(dT) adapter primer, to amplify the 3' regions of potential miRNA targets. The sequences from the 3' regions of these candidate targets were then used to design gene-specific primers for subsequent 5' rapid amplification of cDNA ends (5'-RACE) experiments. A plurality of 5'-RACE amplicons that terminate at the nucleotide that pairs to the tenth nucleotide of the miRNA is strong evidence that the mRNA in question is a miRNA target (Llave et al., 2002). Initial amplifications were attempted with oligonucleotides representing miRNA complementary sites for most of the miRNAs detected in pine, fern, lycopod, and moss, resulting in 29 candidate targets (see Supplemental Table 6 online). Subsequent 5'-RACE experiments using libraries enriched in uncapped messages as templates yielded single PCR products of the predicted size for six candidates (Figure 5B). For two other candidates, multiple bands were recovered, and in each case one of them corresponded to the predicted size for a cleavage product (Figure 5B, lanes 4 and 6). For the other 21 candidate sequences tested, either no 5'-RACE products could be obtained (n = 12) or no evidence for cleavage was observed (n = 9; see Supplemental Table 6 online). We suspect that most of these other 21 candidate targets do not contain miRNA complementary sites and instead represent artifacts obtained from using short, degenerate oligonucleotides during the initial PCR. Single bands from 5'-RACE reactions were gel-excised before cloning and sequencing, as were lanes with multiple bands (gel slices containing all visible species were excised in these two cases).
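The 5'-RACE criterion above (amplicons ending at the nucleotide paired to the tenth nucleotide of the miRNA) can be illustrated with a short Python sketch. It makes a strong simplifying assumption that real plant target sites often violate: the site is taken to be perfectly complementary to the miRNA, with no mismatches, wobbles, or bulges. The function names and sequences are hypothetical.

```python
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna):
    """Reverse complement of an RNA sequence written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))


def expected_fragment_start(mrna, mirna):
    """Return the 0-based mRNA index of the nucleotide paired to miRNA
    position 10, or None if no perfectly complementary site is found.

    Assumption: perfect, ungapped complementarity (illustrative only).
    """
    site = reverse_complement(mirna)
    site_start = mrna.find(site)
    if site_start == -1:
        return None
    # miRNA position i (1-based, counted from its 5' end) pairs with
    # site offset len(mirna) - i, because the duplex is antiparallel.
    return site_start + len(mirna) - 10


# Hypothetical 21-nt miRNA and a toy mRNA carrying its complementary site.
mirna = "AUGCAUGCAUGGCCAUUGACG"
mrna = "GGGG" + reverse_complement(mirna) + "AAAA"
print(expected_fragment_start(mrna, mirna))
```

In this toy case the predicted 5' end of the RACE product is the index printed above; an observed plurality of clones ending at that position would match the criterion described in the text.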
To test directly whether the fern miR171 and miR172 are offset relative to their Arabidopsis homologs, we performed PCR from a fern small RNA library using oligonucleotides designed for miR171 and miR172 detection and 5' end definition (Lim et al., 2003). In both cases, the experimentally determined 5' ends were indeed offset relative to the Arabidopsis homologs: fern miR171 was shifted three nucleotides to the 3' relative to Arabidopsis, whereas fern miR172 was shifted two nucleotides to the 5' relative to Arabidopsis (Figure 5C). Using these offset miRNAs as guides, the fern-171-1 and fern-172-1 target cleavage sites, as mapped in Figure 5C, were precisely at the nucleotide expected for miRNA-directed cleavage. Interestingly, register-shifted miRNAs have been cloned at low frequencies from Arabidopsis as well, including a shifted miR171 that matches the miR171 version that appears to predominate in the fern C. thalictroides (miR171.2/ASRP444; Gustafson et al., 2005). An apparent offset target site in the moss (P. patens) homolog of a miR165/166 HD-ZIP target has also been suggested (Floyd and Bowman, 2004). Together, these examples indicate that deeply conserved plant miRNAs can diverge from each other by shifts in register. If these offsets are too large relative to the Arabidopsis or rice sequences for which probes were designed, the miRNA will not be detected by the array. The putative offset target site in the P. patens miR165/166 target proposed by Floyd and Bowman (2004) was shifted by 10 nucleotides, which could explain why we did not detect a miR165/166 ortholog in our moss (P. juniperinum) sample.
To assign putative functions to the newly discovered miRNA targets, deduced protein sequences were used to query the Arabidopsis protein database. In all cases, the best hit in the database was found to be either a confidently predicted or confirmed target of that miRNA in Arabidopsis (Table 1). For instance, fern-160-1, moss-160-1, and moss-160-2 are all most similar to the Arabidopsis gene Auxin Response Factor 16 (ARF16), a target of Arabidopsis miR160 (Mallory et al., 2005). Pine-172-1 and pine-172-2 are most similar to two Pinus genes annotated as coding for Apetala 2 (AP2)-like proteins (Shigyo and Ito, 2004), whereas the full-length fern-172-1 is most similar to Arabidopsis AP2; in Arabidopsis, miR172 is known to target AP2 and related mRNAs (Aukerman and Sakai, 2003; Chen, 2004). Pine-167-1 is most similar to the Arabidopsis ARF6 gene, which is a predicted target of Arabidopsis miR167 (Rhoades et al., 2002). Fern-171-1 is most homologous to the Arabidopsis Scarecrow-Like 6-III (SCL6-III) gene, which is a confirmed target of miR170/171 in Arabidopsis (Llave et al., 2002). In summary, our direct detection of miRNAs and empirical target discovery demonstrate that plant miRNA-target interactions are frequently conserved between mosses, ferns, gymnosperms, and flowering plants, implying that these regulatory circuits have long been critical components of land-plant development.
The moss small RNA population had some characteristics reminiscent of those previously observed for Arabidopsis small RNA populations (Tang et al., 2003): As seen in Arabidopsis, a strong peak was observed at 21 nucleotides in length, and uridine was the most frequent 5' residue of these 21mers (Figure 6). However, the strong peak of 24 nucleotide species possessing a 5' adenine residue that is seen in Arabidopsis small RNA populations was not apparent in moss.
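A length and 5'-residue summary of this kind can be tabulated with a few lines of Python. The sketch below is illustrative only; the function name `summarize_small_rnas` and the toy reads are hypothetical and far shorter than real small RNA clones.

```python
from collections import Counter


def summarize_small_rnas(reads):
    """Return (length counts, per-length counts of the 5' nucleotide)."""
    length_counts = Counter()
    five_prime_by_length = {}
    for read in reads:
        if not read:
            continue  # skip empty entries rather than fail on read[0]
        length_counts[len(read)] += 1
        five_prime_by_length.setdefault(len(read), Counter())[read[0]] += 1
    return length_counts, five_prime_by_length


# Toy reads (hypothetical); real input would be cloned small RNA sequences.
reads = ["UAGC", "UAGG", "AAGC", "UA"]
lengths, five_prime = summarize_small_rnas(reads)
print(lengths.most_common(1))        # most frequent read length
print(five_prime[4].most_common(1))  # most frequent 5' residue at that length
```

Applied to a real library, the first table would show any peak at a given length and the second would show which 5' residue dominates within that size class.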
DISCUSSION
Expression Profile of Arabidopsis miRNAs and Their Targets
The global expression profile shown in Figure 2 demonstrates the organ specificity of some miRNAs, while highlighting a fairly uniform level of expression for many others. In instances where there are previous RNA gel blot data, the global expression profile corresponded well: For instance, miR156/157 expression is very strong in seedlings (Reinhart et al., 2002), miR398 is strongly expressed in leaves (Jones-Rhoades and Bartel, 2004), and miR171 is strongly expressed within inflorescences (Llave et al., 2002).
Accumulation of miRNA target mRNAs was frequently negatively correlated with that of the corresponding miRNAs (Figure 3D). There are several probable sources of noise in the comparison of the two expression profiling data sets: The experiments used different RNA samples from specimens grown under slightly different conditions and in both cases represented only a crude dissection of the organism where many cell types were combined in single samples. Additionally, some plant miRNA targets have very stable 3' cleavage fragments that may raise the mRNA expression value as detected by the ATH1 microarray. Because of these sources of noise, it is possible that the true extent of the negative correlations is higher. Expression profiling in human cells has revealed a similar phenomenon: Probable target genes (defined by partial complementarity to the miRNA and by their repression upon ectopic expression of the miRNA) have their lowest expression values within tissues where the corresponding miRNA is maximally expressed (Lim et al., 2005).
The observation that miRNA targets rarely accumulate to high levels in the organs in which the corresponding miRNAs are most highly expressed (Figure 3A) is consistent with the hypothesis that plant miRNAs often act to clear target messages at certain stages of development. However, this observation and the more general anticorrelation between miRNAs and their targets are likely to have more complex causes, with other contributions to mRNA target expression in addition to miRNA-mediated clearance. In animals, the magnitude of the miRNA-induced downregulation of targets appears too small to fully explain the anticorrelation between miRNA and target expression, suggesting that the miRNAs are reinforcing regulation also occurring at the transcriptional and other levels (Lim et al., 2005). The same could be true in plants. Indeed, the paralogs of miRNA targets, which themselves have not been confidently predicted as targets, also rarely accumulate to high levels in the organs in which the corresponding miRNAs are most highly expressed (Figure 3B). In some instances, this could indicate functional miRNA complementary sites with more mispaired residues than have been allowed when reliably predicting plant miRNA targets. For instance, one of the paralogs of the miR172 targets, AINTEGUMENTA (At4g37750), has a site with six mismatches to miR172, none of which occur in the region complementary to the 5' of the miRNA, a region that has been shown to be critical for plant miRNA function (Mallory et al., 2004b; Parizotto et al., 2004; Schwab et al., 2005). This pairing to miR172, together with its AP2-like domain, which is present in all of the more confidently predicted miR172 targets, suggests that AINTEGUMENTA may also be a direct target of miR172. Nonetheless, in the absence of any evidence to the contrary, it is possible that the similar expression profiles of miRNA targets and their closely related paralogs are not due to the direct action of miRNAs on highly mispaired complementary sites, but are instead due to alternative, non-miRNA-mediated control that enables the expression of the paralogs of the miRNA targets to mirror that of the targets. Such non-miRNA-mediated control processes could also be influencing expression of the miRNA targets. This being said, the observation that most miRNA targets are transcription factors (Rhoades et al., 2002; Jones-Rhoades and Bartel, 2004) brings up the possibility that in some cases expression of nontarget paralogs might be transcriptionally regulated by miRNA targets, such that miRNA-mediated repression of targets indirectly leads to repression of paralogous nontargets as well.
The Antiquity of Plant miRNAs: Evolutionary and Developmental Implications
The microarray platform allowed the direct detection of deeply conserved miRNAs from a gymnosperm, a fern, a lycopod, and a moss. The fact that these basal land plants with radically different lifestyles and morphologies share miRNAs in common with flowering plants indicates that these miRNAs have long been under selection pressure. Sequencing of a limited amount of moss small RNAs showed that there is a large and diverse population of 21-nucleotide species that predominantly possess a 5' uridine residue. Arabidopsis miRNAs and trans-acting siRNAs are most often 21 nucleotides in length (Reinhart et al., 2002; Vazquez et al., 2004), and miRNAs have a strong bias toward uridine as the 5' residue (Reinhart et al., 2002); thus, this initial sampling of small RNAs in moss suggests a wealth of small silencing RNAs in lower plants. The anticipated completion of the P. patens genome will soon enable the examination of potential miRNA families that have emerged specifically in the bryophyte lineage or have been lost in flowering plants.
Most of the 11 miRNA families detected in pine and all eight families detected in fern have targets in Arabidopsis that are developmentally implicated, either as DNA binding transcription factors or as a core component of the miRNA machinery itself (miR168; AGO1). The deeply conserved miR390, cloned from Arabidopsis by the Carrington group as ASRP754 (http://asrp.cgrb.oregonstate.edu/; Gustafson et al., 2005), from rice by Sunkar et al. (2005), and computationally predicted by several groups (Bonnet et al., 2004; Wang et al., 2004; Adai et al., 2005), does not have any confirmed targets in Arabidopsis. However, Sunkar et al. (2005) have recently demonstrated that rice miR390 targets an mRNA encoding a Leu-rich repeat containing receptor-like kinase (RLK). Pairing guidelines used to predict the targets of plant miRNAs (Jones-Rhoades and Bartel, 2004) suggest that miR390 could potentially regulate several Arabidopsis RLK mRNAs (At1g34110, At1g55610, At1g56130, At1g73070, At3g24660, At3g43740, At4g08850, At5g07180, At5g14210, At5g44700, At5g49660, and At5g62230), which are homologous to the confirmed miR390 target in rice. However, despite extensive attempts, we have been unable to detect 3' cleavage fragments indicative of miR390-mediated cleavage for any of these possible RLK targets in Arabidopsis. There are 3 to 3.5 mismatches (counting G:U wobbles as 0.5 mismatches) between miR390 and each of these Arabidopsis RLK mRNAs, which is just at the cutoff for confident target prediction (Jones-Rhoades and Bartel, 2004). The Arabidopsis miRNAs whose targets are not obviously involved in developmental control (e.g., miR161, miR163, miR397, and miR398) were not detected outside of flowering plants. Because the predicted or confirmed Arabidopsis targets of all of the miRNA families detected in nonflowering plants have at least potential developmental connections, we propose that the deeply conserved miRNAs are primarily involved in ancient circuits of gene regulation whose outputs have been affecting the morphology of plants throughout their diversification.
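The mismatch counting mentioned above (G:U wobbles counted as 0.5 mismatches) can be sketched as follows. This is a simplified illustration rather than the published prediction method: it assumes an ungapped duplex of equal length, so bulges and any position-specific rules are not handled. The function name `mismatch_score` and the toy sequences are hypothetical.

```python
WATSON_CRICK = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}
WOBBLE = {("G", "U"), ("U", "G")}


def mismatch_score(mirna, target_site):
    """Score an ungapped miRNA:target duplex.

    Both sequences are written 5' to 3'. Watson-Crick pairs score 0,
    G:U wobbles score 0.5, and every other pairing scores 1.
    Assumption: equal lengths and no bulges (illustrative simplification).
    """
    if len(mirna) != len(target_site):
        raise ValueError("this sketch only scores equal-length, ungapped duplexes")
    score = 0.0
    # The duplex is antiparallel, so read the target site 3' to 5'.
    for mirna_base, target_base in zip(mirna, reversed(target_site)):
        pair = (mirna_base, target_base)
        if pair in WATSON_CRICK:
            continue
        score += 0.5 if pair in WOBBLE else 1.0
    return score


# Toy 4-nt examples (hypothetical), chosen so each case is easy to check by hand.
print(mismatch_score("AUGC", "GCAU"))  # perfect complement
print(mismatch_score("GUGC", "GCAU"))  # one G:U wobble
print(mismatch_score("AAGC", "GCAU"))  # one A:A mismatch
```

A candidate site would then be compared with whatever cutoff the prediction scheme uses; the text indicates that scores of 3 to 3.5 sit at the edge of confident prediction.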
The discovery of miRNA targets from basal plants (Figure 5) demonstrates unequivocally that several miRNA-target interactions have been constant throughout plant evolution. miR160, miR167, miR170/171, and miR172 all direct cleavage of targets in nonflowering plants whose closest known homologs are the very same targets that they are known or thought to cleave in Arabidopsis, as do miRNAs in the miR165/166 family (Floyd and Bowman, 2004).
Technical limitations of our target discovery strategy might have prevented identification of additional targets: Target discovery depends first upon the presence of full-length target mRNAs in the sample and is probably helped by having target sites close to the 3' end of the transcript and by targets with short 3' untranslated regions, all of which combine to make amplification of the 3' portion of the message more robust (Figure 5A, step 2). Target validation depends on there also being a reasonably sized pool of somewhat stable 3' cleavage fragments present in vivo. It is probable that one or more of these factors prevented our detection of the miR165/166 targets previously reported by Floyd and Bowman (2004)
in gymnosperms and lycopods; those authors analyzed preselected mRNAs for evidence of miRNA-mediated cleavage. It is likely that careful construction of representative cDNA libraries coupled with more extensive PCR optimization could enable the discovery of additional targets from plants that have not yet been sequenced.
If, as these data suggest, most ancient miRNAs in plants have always been regulating the same targets, it follows that the downstream molecular effects of deeply conserved miRNA circuits may also be conserved, although perhaps with differing morphological outcomes. Such highly conserved, molecularly compact developmental modules would seem to be excellent substrates for the natural selection of plant form. Although the molecular identities of the miRNAs and their targets have remained constant, it is easy to envision that small changes in the temporal, spatial, or environmental regulation of these modules over time could have had large phenotypic effects on plant morphology. It is interesting to consider the extent to which the deeply conserved miRNA-target modules may have been recruited for nonhomologous functions in different plant lineages. Understanding in molecular detail both miRNA regulation of these conserved targets and how the targets themselves cause their downstream effects in diverse model systems should significantly enhance the understanding of the molecular roots of plant morphology.
| METHODS |
|---|
|
|
|---|
55°C (mean = 54.79°C, SD = 1.33°C). Plant small silencing RNAs can be classified as either miRNAs or siRNAs by the structure of their parent genes and their genetic requirements for biogenesis and function (Reinhart et al., 2002
RNA Sources and Extractions
C. elegans RNA was obtained from wild-type, mixed stage worms and from glp-4(bn2) worms cultured under standard conditions. Arabidopsis total RNA samples from inflorescences (stages 1 to 12), siliques (>4 d after fertilization), stems, cauline leaves, and rosette leaves were harvested from wild-type Col-0 50- to 60-d-old, long-day (16 h light/8 h dark) grown plants at 18°C. Arabidopsis root RNA samples were derived from Col-0 roots harvested from 14-d-old plants grown in constant light in liquid culture (1x MS salts + vitamins, 1% sucrose, and 5 mM Mes-KOH, pH 5.7), shaking at 60 rpm in constant light at 22°C. Short-day and constant light seedling RNA samples were taken from Col-0 10-d-old seedlings grown in soil under an 8-h-light/16-h-dark regime or under constant light at 18°C, respectively. All biological replicate samples were derived from two separate crops grown at different times under the same conditions. Nicotiana benthamiana RNA was obtained from leaves of 21- to 28-d-old plants grown under long-day conditions (16 h light/8 h dark) at 26°C. O. sativa cv indica (rice) RNA was derived from 7-d-old seedlings grown under long-day conditions on plates containing 1x MS salts + vitamins, 1% sucrose, 10 mM Mes-KOH, pH 5.7, and 0.8% bacto-agar. Triticum aestivum (wheat) total RNA was derived from wheat germ lysate prepared as described (Tang et al., 2003
). Liriodendron tulipifera (tulip tree, a Magnoliid) total RNA was harvested from mature leaves of a specimen located on Cambridge Street, Cambridge, MA, in July. Pinus resinosa (red pine, a Gymnosperm) total RNA was derived from mature needles of a specimen located in John F. Kennedy Park, Cambridge, MA, in July. Ceratopteris thalictroides (water sprite, a fern) total RNA was derived from the leaves and stems of a specimen purchased from Doctors Foster and Smith (Rhinelander, WI). Selaginella uncinata (a lycopod) total RNA was derived from the leaves and stems of a specimen purchased from Plant Delights Nursery (Raleigh, NC). Polytrichum juniperinum (a moss) total RNA was derived from leafy gametophytes collected in Nickerson State Park, Brewster, MA, in October. Total RNA from all Arabidopsis, N. benthamiana, O. sativa, and T. aestivum samples was harvested as described by Mallory et al. (2001)
. Total RNA from all other specimens was prepared using a method for pine tree RNA isolation (Chang et al., 1993
).
Array Hybridizations
Small RNAs were fractionated, sequentially ligated to 3' and 5' adapters, and reverse transcribed as described (Lau et al., 2001
). First-stage PCR used oligonucleotides 17.92 and 17.93D (Lau et al., 2001
) and proceeded until amplifications were in linear stage (as determined by visualization of products from successive cycles; typically 17 to 19 cycles). A 1/100 dilution of this reaction was used as template in a labeling PCR using oligonucleotides 5' Cy3-labeled 17.93D and a reverse oligo containing a 20-nucleotide 5' poly(A) tract followed by an internal 18-carbon spacer and the 17.92D sequence (17.92_c18_A20) for 10 cycles to create an asymmetric PCR product (Baskerville and Bartel, 2005
). A Cy5-labeled reference library was generated by 10 cycles of PCR using 5' Cy5-17.93D and 17.92_c18_A20, with a 45 nM pool containing equal amounts of all 225 reference oligonucleotides as template. Labeled PCR products were fractionated through a 6% denaturing polyacrylamide gel, enabling excision of the shorter Cy3- or Cy5-labeled strand. Samples were adjusted to 5 µM in water. For each hybridization, 2 µL of 5 µM Cy3-labeled sample and 2 µL of 5 µM Cy5-labeled reference were added to 20 µL of hybridization buffer (3.5x SSC, 1% [m/v] BSA, 0.1% [m/v] SDS, 93 µg/mL salmon testes DNA, 187 µg/mL Escherichia coli tRNA, and 37 µg/mL polyadenine) for a final concentration of 0.417 µM each. After heating for 4 min at 85°C, samples were applied to arrays that had been prehybridized for 45 min in 3.5x SSC, 1% (m/v) BSA, 0.1% (m/v) SDS, rinsed with deionized water, and dried. Arrays were incubated at 57°C for 16 h, then washed for 5 min at 50°C in 2x SSC, 0.1% SDS, followed by 10 min at room temperature in 0.1x SSC, 0.1% SDS, and 3 x 1 min at room temperature in 0.1x SSC. Arrays were then dried and scanned using the GenePix 4000B (Axon Instruments, Union City, CA) at 10 µm per pixel, line average two, and constant photomultiplier tube gains for both 635 nm and 532 nm.
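The final concentration stated for the hybridization mix follows from simple dilution arithmetic (C1 x V1 = C2 x V2). The short sketch below reproduces that calculation from the volumes given above; the function and variable names are illustrative only.

```python
def final_concentration_um(stock_um, stock_volume_ul, total_volume_ul):
    """Concentration after dilution, from C1 * V1 = C2 * V2."""
    return stock_um * stock_volume_ul / total_volume_ul


# Volumes taken from the hybridization setup described above.
sample_ul = 2.0     # Cy3-labeled sample at 5 µM
reference_ul = 2.0  # Cy5-labeled reference at 5 µM
buffer_ul = 20.0    # hybridization buffer
total_ul = sample_ul + reference_ul + buffer_ul  # 24 µL in total

cy3_final = final_concentration_um(5.0, sample_ul, total_ul)
cy5_final = final_concentration_um(5.0, reference_ul, total_ul)
print(f"Cy3: {cy3_final:.3f} µM, Cy5: {cy5_final:.3f} µM")  # 0.417 µM each
```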
miRNA Array Data Analysis
Raw data were extracted from scanned array images using GenePix Pro 5.1 (Axon Instruments). Spots with an unacceptably low signal in the reference channel (defined as less than or equal to the median background at 635 nm plus 4 standard deviations) were eliminated from analysis, as were rare spots whose median intensities at either 532 or 635 nm were saturated. Median local background was then subtracted from median spot intensities to arrive at background-corrected median intensities in both channels for all spots. Typical global normalizations for standard two-channel arrays operate on the assumption that, on average, the total intensities of both channels should be equal (Causton et al., 2003
); this assumption is clearly false for experiments comparing a constant, synthetic sample against varying biological samples. Instead, a limited global normalization was performed based upon the noncognate spot intensities: The summed median intensities in both the 635-nm channel (Cy5) and the 532-nm channel (Cy3) were derived from all noncognate spots (i.e., the D. melanogaster and C. elegans spots for plant experiments) from each hybridization. The ratio of total noncognate Cy3/total noncognate Cy5 ($a$) was calculated for each hybridization, and a mean ratio ($\bar{a}$) was derived from all arrays to be compared. For each array $n$, a normalization factor $b_n$ was derived by dividing $a_n$ by $\bar{a}$ ($b_n = a_n/\bar{a}$); the final value for each spot was the ratio of background-corrected median Cy3/background-corrected median Cy5 divided by $b_n$. Because of our desire to always use the same amplified, synthetic Cy5-labeled reference set for every experiment, we did not perform dye-swap experiments. Thus, although we cannot rule out small dye-specific effects, they are expected to be minimal because the dyes were introduced via end-labeled oligonucleotides rather than by direct incorporation and because our normalization procedure incorporates the nonspecific background Cy3/Cy5 ratio in its calculations. After normalization, the four replicate spots for each small RNA were averaged together. The determination of a lower limit of detection (thresholding) was also guided by the presence of the noncognate probes: Values for all noncognate spots in a given analysis were compiled into a histogram. The value at which greater than or equal to 99% of all noncognate values were lower was called the lower limit of detection in these analyses. Finally, RNAs that were called detected and whose total Cy3 + total Cy5 intensities were below the 25th percentile of all spots in the analysis were manually reexamined and eliminated from consideration if warranted. To find small RNAs that are differentially expressed in at least one of the organs studied, the values derived from the four replicate spots on each array were first condensed to the mean. Single-factor analysis of variance was performed on the 29 small RNAs that were expressed at detectable levels in at least half of the organs tested in all biological replicates, and those with P-values < 0.01 were listed as being differentially expressed.
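The noncognate-based normalization and the detection threshold described above can be sketched as follows. This is one plausible reading of the procedure, not the original analysis code: all function names are hypothetical, the inputs are assumed to be background-corrected median intensities, and the fallback used when too few noncognate values exist is an assumption.

```python
from bisect import bisect_left
from statistics import mean


def noncognate_ratio(cy3_noncognate, cy5_noncognate):
    """a_n: summed noncognate Cy3 signal over summed noncognate Cy5 signal."""
    return sum(cy3_noncognate) / sum(cy5_noncognate)


def normalization_factors(ratios):
    """b_n = a_n / mean(a), one factor per array being compared."""
    a_bar = mean(ratios)
    return [a_n / a_bar for a_n in ratios]


def normalized_spot(cy3, cy5, b_n):
    """Background-corrected Cy3/Cy5 ratio divided by the array's b_n."""
    return (cy3 / cy5) / b_n


def detection_limit(noncognate_values, percent=99):
    """Smallest noncognate value with at least `percent`% of values below it."""
    ordered = sorted(noncognate_values)
    n = len(ordered)
    for value in ordered:
        strictly_lower = bisect_left(ordered, value)
        if strictly_lower * 100 >= percent * n:  # integer math avoids rounding
            return value
    return ordered[-1]  # assumption: fall back to the maximum for small sets


# Toy numbers (hypothetical) for three arrays.
ratios = [1.0, 2.0, 3.0]
factors = normalization_factors(ratios)
print(factors)
print(normalized_spot(300.0, 100.0, factors[2]))
print(detection_limit(range(1, 101)))
```

Replicate spots would then be averaged, and only values above the detection limit treated as detected.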
Using the Bonferroni-Holm stepdown correction to adjust P-values for multiple comparisons, we find that eight of these (F3-3_B01-5, miR157, miR172, miR156, miR396, miR398, miR160, and miR163) have corrected P-values < 0.05, whereas six [miR167, miR169, miR394, miR158, siR480(+), and miR171] have corrected P-values between 0.05 and 0.118. Hierarchical clustering of log2 transformed values was performed with Cluster (M. Eisen, Stanford University, Stanford, CA) and visualized using Java Treeview (M. Eisen). Files containing the normalized, detected, and log2-transformed data used in the C. elegans analysis, Arabidopsis organ map, and the phylogenetic survey are available in Supplemental Tables 3 to 5 online, respectively.
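For readers unfamiliar with the Bonferroni-Holm stepdown correction mentioned above, a minimal sketch is given here. The P-values in the example are invented for illustration and are not the values from this study; the function name is hypothetical.

```python
def holm_adjust(p_values):
    """Return Holm step-down adjusted P-values in the original order."""
    m = len(p_values)
    # Visit hypotheses from the smallest raw P-value to the largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The k-th smallest P-value is multiplied by (m - k + 1), capped at 1.
        candidate = min(1.0, (m - rank) * p_values[idx])
        # Enforce monotonicity: adjusted values never decrease along the order.
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted


# Hypothetical raw P-values for three tests.
raw = [0.01, 0.04, 0.03]
print(holm_adjust(raw))
```

A small RNA would then be called significant at a given level when its adjusted P-value falls below that level.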
mRNA Array Data Analysis
Raw data from the following triplicate experiments were downloaded from the AtGenExpress expression atlas of wild-type Arabidopsis development (The Arabidopsis Information Resource, http://www.arabidopsis.org/, accession number ME00319): ATGE_7 (green parts of seedlings, 7 d, 23°C, continuous light, soil grown), ATGE_13 (rosette leaf 4, 1 cm long, 17 d, 23°C, continuous light, soil grown), ATGE_26 (cauline leaves, 21+ d, continuous light, 23°C, soil grown), ATGE_27 (stem, 2nd internode, 21+ d, continuous light, 23°C, soil grown), ATGE_29 (shoot apex, inflorescences, 21 d, continuous light, 23°C, soil grown), ATGE_78 (siliques, with seeds, stage 5; 8 weeks, continuous light, 23°C, soil grown), and ATGE_93 (roots, 15 d, long days [16/8], 22°C, 1x MS agar with 1% sucrose). These data correspond to our miRNA array data for long-day seedlings, rosette leaves, cauline leaves, stems, inflorescences, siliques, and roots, respectively. Raw expression values from each hybridization were normalized by dividing each value by the median value of the chip and multiplying the result by 100. The resulting expression values for the miRNA targets of the differentially expressed miRNAs shown in Figure 2B, control WRKY and MADS box genes, and paralogous nontargets were retrieved. These values were normalized on a per-gene basis, such that each value was divided by the median value of that gene across all tissues examined. Expression values from calls flagged "absent" or "marginal" were excluded from the median calculation and subsequently replaced with the lowest observed "present" value for that gene in the experiments examined. The median, log2-transformed values for each gene of interest were then calculated and plotted against the medians of similarly normalized relative expression values of the cognate miRNAs. Linear correlation coefficients between relative miRNA and relative target, paralogous nontarget, and control mRNAs were calculated as described by Baskerville and Bartel (2005)
. The targets were as defined by Jones-Rhoades and Bartel (2004)
, with the exception of the targets of the trans-acting siRNA designated siR480(+), for which we examined the expression levels of At5g18040 and At4g29770, which were validated by Vazquez et al. (2004)
; note that for three of the differentially expressed small RNAs (miR163, miR158, and the unclassified small RNA F3-3_B01-5) shown in Figure 2B, there are no conserved predicted targets. The control genes were 20 randomly selected WRKY and 15 randomly selected MADS box transcription factors that were flagged "present" in the majority of the tissues analyzed. Two different random WRKY genes and either one or two different random MADS box genes were randomly paired with the values for each of the 10 miRNAs with known targets. Gene pairings and normalized median relative expression levels for this experiment are located in Supplemental Table 8 online.
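The two-step mRNA normalization described in this section (per-chip median scaling, then per-gene median scaling with replacement of "absent" or "marginal" calls) can be sketched as below. This is an illustrative reading of the text, not the original analysis code: the function names and toy values are hypothetical, and the exact ordering of the replicate median and log2 steps is an assumption.

```python
import math
from statistics import median


def normalize_chip(raw_values):
    """Divide every value on a chip by the chip median and multiply by 100."""
    chip_median = median(raw_values)
    return [100.0 * value / chip_median for value in raw_values]


def normalize_gene(values, calls):
    """Per-gene log2 relative expression across tissues.

    values: chip-normalized expression for one gene, one entry per tissue
    calls:  detection call per tissue ("present", "absent", or "marginal")
    """
    present = [v for v, c in zip(values, calls) if c == "present"]
    if not present:
        raise ValueError("gene has no 'present' calls")
    gene_median = median(present)  # median uses 'present' values only
    floor = min(present)           # replacement for absent/marginal calls
    filled = [v if c == "present" else floor for v, c in zip(values, calls)]
    return [math.log2(v / gene_median) for v in filled]


# Toy data (hypothetical): one gene measured in four tissues.
print(normalize_chip([50, 100, 200]))
print(normalize_gene([100, 400, 10, 200],
                     ["present", "present", "absent", "present"]))
```

The resulting per-tissue values for each target could then be plotted against the similarly normalized values of the cognate miRNA.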
RNA Gel Blots
Approximately 25 µg of total (lanes 1 to 5 of Figure 4B) or poly(A)-depleted (lanes 6 to 9 of Figure 4B) RNA was fractionated through a 15% polyacrylamide-urea gel along with 33P end-labeled RNA standards and transferred and fixed to a nylon filter as described (Lau et al., 2001
). 32P end-labeled DNA oligonucleotides antisense to the Arabidopsis miR390 (5'-GGCGCTATCCCTCCTGAGCTT-3'), Arabidopsis miR160 (5'-TGGCATACAGGGAGCCAGGCA-3'), and the U6 small nuclear RNA (snRNA) (5'-TTGCGTGTCATCCTTGCGCAGG-3') were prepared using T4 polynucleotide kinase. Hybridizations were performed in PerfectHyb-Plus (Sigma-Aldrich, St. Louis, MO) supplemented at 100 µg/mL with denatured salmon testes DNA, at 20°C below the Tm of the probe, where Tm (in °C) is defined as 4 x (number of G + C) + 2 x (number of A + T). Blots were washed at 50°C for 2 x 10 min in 2x SSC, 0.2% SDS followed by 2 x 5 min in 0.5x SSC, 0.2% SDS, and imaged using a phosphor imager. Blots were stripped in between hybridizations by washing for 30 min, at room temperature, with 200 mL of initially boiling 0.1% SDS, and exposed for at least 16 h to verify complete removal of probe before rehybridization.
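The Tm rule given above is easy to apply to the three probe sequences listed. The sketch below computes the Tm and the corresponding hybridization temperature (20°C below Tm) for each probe; the function names are illustrative, and the printed temperatures are simply what the stated formula yields, not values reported in the text.

```python
def probe_tm(probe):
    """Tm in °C from the rule 4 x (G + C) + 2 x (A + T)."""
    seq = probe.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    if gc + at != len(seq):
        raise ValueError("probe contains characters other than A, C, G, T")
    return 4 * gc + 2 * at


def hybridization_temp(probe, offset=20):
    """Hybridization temperature, set `offset` °C below the probe Tm."""
    return probe_tm(probe) - offset


# Probe sequences as listed in the text above.
probes = {
    "miR390": "GGCGCTATCCCTCCTGAGCTT",
    "miR160": "TGGCATACAGGGAGCCAGGCA",
    "U6 snRNA": "TTGCGTGTCATCCTTGCGCAGG",
}
for name, seq in probes.items():
    print(f"{name}: Tm = {probe_tm(seq)}°C, "
          f"hybridize at {hybridization_temp(seq)}°C")
```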
Empirical Discovery of miRNA Targets and miRNA 5' Definition
The 3'-RACE oligonucleotides were designed for queried miRNA targets (see Supplemental Table 6 online) to be antisense to a consensus of known Arabidopsis targets and EST homologs of Ara