Loss of Polycomb proteins CLF and LHP1 leads to excessive RNA degradation in Arabidopsis


(Derkacheva et al., 2013; Hecker et al., 2015), suggesting that it acts as a bridge between PRC2 and PRC1. In addition, clf and lhp1 mutants have similar developmental phenotypes and significantly overlapping transcriptomes, supporting a common function (Veluchamy et al., 2016; Wang et al., 2016; Zhou et al., 2017b). Taken together, these observations indicate that PRC1 and PRC2 cooperate, to a certain extent, to down-regulate the expression of genes and to keep them silenced.
However, several studies have demonstrated or suggested more diverse functions for PcG proteins than the strict transcriptional repression of genes. First, recent reports have illustrated PcG functions that are not restricted to genes with low expression. In mammalian cells for instance, a non-canonical PRC1 targets preferentially active loci, associated with metabolic functions (van den Boom et al., 2016). Likewise, it has been demonstrated in Drosophila that PRC1 could regulate the expression of a subset of active genes, with a large part of this regulation being certainly direct (Pherson et al., 2017).
In Arabidopsis, a subset of H2AUb-marked genes corresponds to transcriptionally active genes (Zhou et al., 2017a). Interestingly, these genes are also associated with metabolic functions, but the impact of PcG protein regulation on them has not been analyzed. Another study in Arabidopsis reported that NRT2.1, a highly expressed gene coding for a major and essential root nitrate transporter, is constantly marked by H3K27me3 even under transcriptionally permissive conditions (Bellegarde et al., 2018). Loss of CLF function leads to a reduction of H3K27me3 levels at the NRT2.1 locus, increasing its mRNA levels without altering its expression pattern (Bellegarde et al., 2018). On the other hand, non-transcriptional functions have also been described for PcG proteins. Indeed, different reports have demonstrated that PcG proteins can be directly implicated in cellular processes independently of the regulation of gene expression, including the regulation of cell cycle progression and the regulation of alternative splicing (Brien et al., 2015; Gonzalez et al., 2015; Lecona et al., 2013; Mohd-Sarip et al., 2012). Therefore, growing evidence shows that PcG proteins may have functions that are not strictly associated with repression of gene expression. However, even though the presence of PcG proteins and their associated chromatin marks on transcriptionally active loci has been clearly demonstrated, their implication in the regulation of these loci is still elusive and deserves further investigation. Here, we used NRT2.1 as a model gene to further explore the role of PcG proteins in the regulation of transcriptionally active genes in Arabidopsis. We showed that, in addition to being controlled by PRC2, NRT2.1 is also regulated by PRC1 members, including LHP1. Concurrent mutations in CLF and LHP1 lead to an unexpected down-regulation of NRT2.1 expression, associated with the accumulation of sRNAs resulting from mRNA degradation.
We demonstrated that overaccumulation of sRNAs in the clf lhp1 double mutant concerns a large set of transcriptionally active genes, which are consequently deregulated. Finally, we observed that mRNA degradation globally increases in the clf lhp1 mutant, likely reflecting a misregulation of components of the RNA degradation machinery.

Expression analysis
Root samples were frozen in liquid nitrogen and tissues were disrupted for 1 min at 30 s−1 in a Retsch mixer mill MM301 homogenizer (Retsch, Haan, Germany). Total RNA was extracted using TRI REAGENT (MRC). Subsequently, 500 ng of total RNA was treated with DNase (DNase I; SIGMA-ALDRICH, St. Louis, MO, USA) following the manufacturer's instructions. Reverse transcription was performed with M-MLV reverse transcriptase (RNase H minus, Point Mutant, Promega) using 1 µL of an anchored oligo(dT)20 primer (50 µM) for mRNA, or specific complementary primers (50 µM) with stem-loop secondary structures for miRNAs (Varkonyi-Gasic et al., 2007). Accumulation of transcripts was measured by qRT-PCR (LightCycler 480, Roche Diagnostics) using SYBR® Premix Ex Taq™ (TaKaRa), according to the manufacturer's instructions, with 1 µL of cDNA in a total reaction volume of 10 µL. Gene expression was normalized using UBQ10 or ACT2 as internal standards and using the 2^(−ΔCt) method (Livak and Schmittgen, 2001). Sequences of primers used in qPCR for gene expression and stem-loop analysis are listed in the Supplemental dataset.
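As an illustration of the normalization step, the following minimal sketch computes a relative expression value from qRT-PCR threshold cycles. It assumes that the normalization takes the form 2^(−ΔCt), with ΔCt = Ct(target) − Ct(reference); the function name and the Ct values are hypothetical and are not taken from the study.

```python
def relative_expression(ct_target: float, ct_reference: float) -> float:
    """Return 2^(-dCt), where dCt = Ct(target) - Ct(reference).

    Assumption: one reference gene (e.g. UBQ10 or ACT2) measured on the
    same cDNA sample as the target gene.
    """
    delta_ct = ct_target - ct_reference
    return 2.0 ** (-delta_ct)


# Hypothetical Ct values, for illustration only.
nrt21_ct = 22.0   # target gene
ubq10_ct = 20.0   # internal standard
print(relative_expression(nrt21_ct, ubq10_ct))  # 0.25
```

A target that crosses the threshold two cycles later than the reference is thus reported at one quarter of the reference level, assuming a perfect doubling per cycle.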

Root nitrate influx
NO3− influx was assayed using 0.2 mM 15NO3−. Roots were then dried at 70°C for 48 h, and samples were analyzed for total N and atom% 15N using a continuous-flow isotope ratio mass spectrometer coupled with a C/N elemental analyzer (model Euroflash; Eurovector, Pavia, Italy) as previously described (Jacquot et al., 2020).

Transcriptome profiling
Genome-wide expression analyses were performed using Arabidopsis Affymetrix Gene 1.

Small RNA Northern blot (electrophoresis, labelling, hybridization, quantification)
Northern blot of small RNAs was performed as described by Blevins (2017) with minor modifications.
Total RNA from each sample was loaded on denaturing polyacrylamide-urea gels and separated by 4 hours of migration. Small RNAs were transferred overnight onto a nylon membrane (Whatman®, 0.45 µm) and cross-linked with EDC (Thermo Scientific™ Pierce™). Membranes were pre-hybridized for 4 hours with PerfectHyb Plus Hybridization Buffer (Sigma) and hybridized overnight with radioactive probes. Probes were generated from denatured PCR products labelled with 32P-CTP by Klenow enzyme (Fig. 3 and Supplemental Fig. 3), or from single-strand oligonucleotides labelled with 32P-ATP by T4 PNK (Fig. 5). The primers and oligonucleotides used for probe design are listed in the Supplemental dataset. Hybridized membranes were washed with a 2X SSC, 0.1% SDS solution and exposed to CL-XPosure Film (ThermoFisher). Signal intensity was quantified using ImageJ.

Run-on
Root samples were frozen in liquid nitrogen and nuclei were isolated using NIB as described above.
Nuclei were resuspended in 75 µL of 1.3X transcription buffer (65 mM Tris HCl pH 8, 6.5 mM MgCl2, using the SYBR® Premix Ex Taq™ (TaKaRa). Gene expression was normalized using UBQ10 as an internal standard.

Cell fractionation
Root samples were frozen in liquid nitrogen and nuclei were isolated using NIB as described above.
Supernatants were recovered and taken as representative of the cytoplasmic fraction. Nuclei were further purified using a 30% Percoll cushion. RNAs were extracted from both fractions using LS Trizol (Invitrogen) and resuspended in 50% formamide.

sRNAs sequencing
Small RNA sequencing was performed from total RNA by the company Fasteris. Total RNA from each sample was size selected for 18 to 50-nucleotide sRNAs using denaturing polyacrylamide-urea gels.
Sequencing was done on a NextSeq instrument using 50-base single-end mode.

Bioinformatics analysis
Transcriptomics analyses and identification of DEGs. Normalization of the microarray raw intensities was performed using the Robust Multiarray Averaging (RMA) method from the oligo R package of Bioconductor. DEGs were identified using the limma R package. A gene was identified as a DEG in a selected genotype when the adjusted P-value was <0.05.
Analysis of RNA-seq data. Initial paired-end RNA-Seq reads were pre-processed using fastp. High-quality reads from WT and clf-29 lhp1-4 were aligned to the reference genome (A. thaliana TAIR10 release) using STAR. To analyze mis-splicing of mRNA, STAR output bam files were converted to bed format and we selected the reads mapping on higher than randomly expected, and its value of significance, was tested and calculated using GeneSect (http://virtualplant.bio.nyu.edu). GeneSect uses a bootstrapping-based method that randomly selects 1,000 gene lists of Arabidopsis AGI identifiers having the same size as the observed lists, and counts the number of times that the intersection of the random lists is equal to or larger than the intersection observed for the two tested gene lists, which yields a P-value (Krouk et al., 2010).
Gene ontologies. Gene ontology enrichments were interrogated using the clusterProfiler R package, using the biological process ontology. The enrichment tests correspond to Fisher's exact tests, determining whether the proportion of a specific ontology in a list of interest is statistically higher than expected among all Arabidopsis genes. P-values from those tests are corrected for multiple testing, and the threshold for over-representation was set to 0.05. In the graphical results, each significant ontology is thus characterized by an adjusted P-value and a "Count", representing the number of genes associated with this ontology in the list of interest.
Small RNA sequencing analysis.
Small RNA-seq was analyzed using the ShortStack package, after removing 21-, 23-, 24- and 33-nt reads from the fastq files. In order to analyze sRNAs associated with degradation of mRNA, clusters of sRNAs obtained with ShortStack were first filtered to retain only those mapping to coding genes. We then applied (i) a coverage filter, requiring at least 64 reads associated with a coding gene across all libraries, (ii) a strand-specific filter, keeping only clusters of sRNAs whose strand corresponds to the mRNA strand, and (iii) a minimum complexity filter of 0.7, as provided by ShortStack.
Normalization of sRNA-seq read density using miRNA expression. In order to normalize the density of sRNA reads between WT and clf-29 lhp1-4, we used absolute expression values of several miRNAs measured by stem-loop quantitative PCR. This analysis shows that these miRNAs are not significantly deregulated in clf-29 lhp1-4 (Supplemental Fig. 3). We then related these values to the number of reads observed for these miRNAs in the sRNA-seq for the two libraries of WT and clf-29 lhp1-4. Compiled together, this leads to a normalization factor of 8.44 that we applied to calculate normalized read density between WT and clf-29 lhp1-4.
Metagene profiling. Metagene profiling was performed using the computeMatrix function of deepTools, using the transcript location of each gene and bigwig files with normalized density of sRNAs for each genotype as inputs. Ten groups of genes were made according to their level of expression in WT, with 0-10 corresponding to the first 10% (least expressed) and 90-100 corresponding to the last 10% (most expressed). Results were plotted with the plotProfile function of deepTools.
Calculation of RNA degradation score and significance on PcG target genes.
sRNA-seq reads corresponding to degradation products were selected via the three filters previously described on ShortStack output (coverage, strand, and complexity), and quantified for each gene. The number of degradation products of each gene was divided by its RNA-Seq counts in the same conditions, providing a degradation rate, or degradation score, relative to gene expression. This normalization is designed to compensate for the fact that degradation increases with gene expression. At this step, an expression filter was set to an average of 10 counts per condition to remove genes with extremely low expression in the RNA-Seq data. The median of degradation scores among a list of genes of interest (i.e. PcG target genes) was used as the observed value in a bootstrap hypothesis test. The null distribution of this test was simulated as the median degradation scores of 1000 lists of genes randomly sampled genome-wide, the size of the lists being equal to the size of the list of interest. This null distribution served to assess whether the observed median degradation score in marked genes was significantly higher than expected in random lists of genes. The P-values, defined as the number of random lists with a median degradation score above the observed degradation score divided by 1000, were 0 (that is, below the 1/1000 resolution of the test) for all markings and both genotypes. The derived Z-scores, defined as the number of standard deviations between the observed score and the mean of the null distribution, had values between 6 and 11, depending on the marking and genotype.
Bootstrapping results were plotted with ggplot2 in R.
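The bootstrap hypothesis test described above can be sketched as follows. This is a minimal illustration, not the authors' script: the function name, the toy degradation scores and the use of the sample standard deviation for the Z-score are assumptions.

```python
import random
import statistics


def bootstrap_median_test(scores, genes_of_interest, n_lists=1000, seed=0):
    """Compare the median degradation score of a gene list with random lists.

    scores: dict mapping gene identifier -> degradation score.
    Returns (observed median, empirical P-value, Z-score).
    """
    rng = random.Random(seed)
    observed = statistics.median(scores[g] for g in genes_of_interest)
    all_genes = sorted(scores)
    size = len(genes_of_interest)
    # Null distribution: medians of random gene lists of the same size.
    null = [
        statistics.median(scores[g] for g in rng.sample(all_genes, size))
        for _ in range(n_lists)
    ]
    # P-value: fraction of random lists whose median exceeds the observed one.
    p_value = sum(m > observed for m in null) / n_lists
    # Z-score: distance to the null mean in units of null standard deviation.
    z_score = (observed - statistics.mean(null)) / statistics.stdev(null)
    return observed, p_value, z_score


# Toy data (hypothetical): 200 genes, the 20 highest scores form the list of interest.
toy_scores = {f"gene{i}": float(i) for i in range(200)}
toy_targets = [f"gene{i}" for i in range(180, 200)]
obs, p, z = bootstrap_median_test(toy_scores, toy_targets)
print(obs, p, z)
```

With 1000 random lists, a reported P-value of 0 only means that no random list exceeded the observed median, so it is best read as P < 0.001.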

Results
In order to get new insights into the role of PcG proteins in the regulation of actively transcribed genes, we first measured the expression of NRT2.1 under highly transcription-permissive conditions (low nitrate) in mutants for PcG-related factors, including LHP1 and BMI1 proteins. As we previously observed, NRT2.1 expression was slightly higher in the clf-29 mutant than in WT (Fig. 1A) (Bellegarde et al., 2018). Similarly, we observed that the expression of NRT2.1 was increased in the lhp1-4 mutant in comparison to WT, and also to a lesser extent in bmi1 mutants, presumably due to strong redundancy between BMI1 proteins (Merini et al., 2017) (Fig. 1A). We concluded that NRT2.1 might be targeted by both PRC1 and PRC2. This is consistent with epigenome-wide analyses, where NRT2.1 is found to be marked by both H3K27me3 and H2AUb, and is described as an LHP1 target (Veluchamy et al., 2016; Zhou et al., 2017a). Several reports showed that mutants for CLF or LHP1 display similar phenotypes, and have thus proposed a functional redundancy between these two PcG proteins in the regulation of genes with low expression (Derkacheva et al., 2013; Veluchamy et al., 2016; Wang et al., 2016; Zhou et al., 2017b). Therefore, we wondered whether CLF and LHP1 could also have a certain degree of redundancy in the regulation of NRT2.1 and other highly expressed genes. Surprisingly, the effect of concurrent loss of function for CLF and LHP1 has not been described yet in the literature. To address this question, we generated a clf-29 lhp1-4 double mutant. This double mutant yields very few to almost no seeds, and was thus maintained at the heterozygous state for the clf-29 mutation. However, the clf-29 lhp1-4 mutant does not display severe PcG phenotypes like the callus-like structures observed in clf swn or bmia/b/c mutants (Bratzel et al., 2010; Chanvivattana et al., 2004) (Supplemental Fig. 1A, B).
In contrast, even though the clf-29 lhp1-4 mutant shows strong growth defects, it displays a relatively correct organogenesis including a root system, leaves and the development of flowers and siliques. Concerning the root system, the length of the primary root of clf-29 lhp1-4 was not significantly different from that of the WT, but the lateral root density was significantly reduced in clf-29 lhp1-4 in comparison to WT (Supplemental Fig. 1C). The root anatomy of the clf-29 lhp1-4 mutant appears properly organized, as revealed by propidium iodide staining and confocal microscopy (Supplemental Fig. 1D). Therefore, we measured NRT2.1 expression level in the roots of reporter line, that we introduced in the clf-29, lhp1-4 and clf-29 lhp1-4 mutants (Fig. 1B), strongly suggesting again the redundancy between CLF and LHP1 in the regulation of NRT2.1 expression. In agreement with these observations, the root nitrate influx, which depends mostly on the function of NRT2.1, was strongly diminished in clf-29 lhp1-4 mutants (Fig. 1C).
To explore the extent of the combined effects of the clf-29 and lhp1-4 mutations on the regulation of transcriptionally active genes, we generated and compared the root transcriptome profiles of WT and the double mutant clf-29 lhp1-4 under low nitrate conditions. In agreement with NRT2.1 expression in clf-29 lhp1-4, we found that other highly expressed genes involved in N nutrition and usually coexpressed with NRT2.1, like NRT1.1, NIA1 and NIR1, also display a significantly reduced expression in clf-29 lhp1-4 (Fig. 2A). Genome-wide, concurrent loss of CLF and LHP1 functions leads to the differential expression of almost 3500 genes (adjusted P-value < 0.05), in agreement with the major  Table 1). To understand the effect of CLF and LHP1 concurrent mutations on transcriptionally active genes at a larger scale, we tested whether deregulation of genes in clf-29 lhp1-4 was linked to their level of expression in WT. We clearly observed that genes showing high levels of expression in WT, like NRT2.1, tend to be down-regulated in clf-29 lhp1-4, in contrast to genes showing low levels of expression, which tend to be up-regulated (Fig. 2B). In order to test whether down-regulation of highly expressed genes in clf-29 lhp1-4 correlates with the presence of H3K27me3 and LHP1 at their loci, we isolated among the highly expressed genes those significantly down-regulated in clf-29 lhp1-4 and marked by both H3K27me3 and LHP1. To do so, we used ChIP-seq data from two studies that previously described H3K27me3 and LHP1 target genes (Roudier et al., 2011; Veluchamy et al., 2016). Even if the nitrate conditions are not identical between those studies and ours, it has been demonstrated that the H3K27me3 profile is globally very robust in Arabidopsis, and does not change according to nitrate provision (Bellegarde et al., 2018; Roudier et al., 2011).
This leads to a relatively low number of 70 genes, as transcriptionally active genes are under-represented among H3K27me3- and LHP1-marked genes (Fig. 2C). However, this number was significantly higher than randomly expected, demonstrating that genes highly expressed and marked by H3K27me3 and LHP1 in WT tend to be down-regulated in clf-29 lhp1-4 (Supplemental Fig. 2). Finally, we analyzed the function of these 70 NRT2.1-like genes by looking at gene ontology enrichments. Although we interrogated a relatively low number of genes, we identified several significantly over-represented functional categories.
In particular, we observed that "response to nitrate", "inorganic ion transmembrane transport" and "inorganic anion transport" were among the most significantly over-represented functional categories (Fig. 2D). This led us to the conclusion that PcG proteins might target a group of transcriptionally active genes associated with metabolism and nutrition, including nitrate transport, and that the concurrent absence of the regulation exerted by CLF and LHP1 leads to their down-regulation.
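The over-representation test behind such an ontology analysis can be illustrated with a one-sided Fisher's exact test, which is equivalent to the upper tail of a hypergeometric distribution. The sketch below is a stand-alone illustration with hypothetical counts; it is not the clusterProfiler implementation and it omits the multiple-testing correction.

```python
from math import comb


def enrichment_p_value(k: int, n: int, K: int, N: int) -> float:
    """One-sided over-representation P-value, P(X >= k).

    N: number of genes in the background (all annotated genes)
    K: background genes annotated with the ontology term
    n: size of the gene list of interest
    k: genes of the list annotated with the term
    """
    upper = min(n, K)
    total = comb(N, n)
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total


# Hypothetical counts: 70 genes of interest, 10 of them annotated with a term
# that covers 50 of 1000 background genes.
p = enrichment_p_value(k=10, n=70, K=50, N=1000)
print(p)
```

A small P-value indicates that the term is found in the list more often than expected from its frequency in the background.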
Excessive
However, it revealed in clf-29 lhp1-4 a massive amount of degradation products (Fig. 3C). The signature of NRT2.1 degradation was very intense in clf-29 lhp1-4, and interestingly traces of such degradation, although to a much lesser extent, also appear in the WT line. We also observed that NRT2.1 degradation was likely established in clf-29 and lhp1-4 single mutants, although the amount of degradation was clearly lower than in clf-29 lhp1-4 (Supplemental Fig. 3). Interestingly, the moderate levels of NRT2.1 RNA degradation observed in clf-29 and lhp1-4 single mutants compared transcript level was not due to a diminution in transcription (Fig. 3D). We also observed that the increase in NRT2.1 transcription rate in clf-29 lhp1-4 was not in a higher range than the increase in NRT2.1 transcript level in clf-29 or lhp1-4 single mutants (Fig. 1A), suggesting that the concurrent mutations of CLF and LHP1 do not lead to a burst of NRT2.1 expression. Altogether, these observations suggest that CLF and LHP1 might control not only the level of NRT2.1 expression, but also the integrity or the quality of the NRT2.1 transcript, as a large part of NRT2.1 mRNAs is subjected to degradation in the clf-29 lhp1-4 mutant, leading to a reduced transcript level compared to WT.
Aberrant or defective transcripts are extremely difficult to isolate, as they are rapidly degraded to avoid deleterious effects. Therefore, to assess the genome-wide implication of PcG proteins in the regulation of transcript integrity of transcriptionally active genes in Arabidopsis, we focused on the final products of mRNA degradation. We thus analyzed the profile of sRNAs in WT and clf-29 lhp1-4 by sRNA sequencing (sRNA-seq). First, we observed at the global level that the size distribution of sRNA populations was strikingly different between WT and clf-29 lhp1-4. Indeed, the sRNA populations typically abundant in WT, constituted by 21-, 24- or 33-nt sRNAs, were apparently absent in clf-29 lhp1-4 (Fig. 4A). In contrast, the clf-29 lhp1-4 mutant line shows a massive increase in non-canonical sRNAs ranging continuously from the lowest to the highest sizes of the sRNA-seq (Fig. 4A). In clf-29 lhp1-4, this population of sRNAs, likely associated with mRNA degradation, represented more than 75% of the sRNA population, compared with less than 45% in the WT (Fig. 4B). Therefore, the apparent under-representation of 21-, 24- and 33-nt sRNAs in clf-29 lhp1-4 could be due to an increase in sRNAs of all other sizes that would affect the whole sequencing representation. To test this hypothesis, we performed stem-loop qRT-PCR with primers specific to four representative miRNAs. The result indicates that there is no decrease in the abundance of these miRNAs in clf-29 lhp1-4 (Supplemental Fig. 4). Therefore, to perform a correct quantitative comparison of sRNA-seq results between clf-29 lhp1-4 and WT, we used those four miRNAs to generate a normalization factor based on the effective quantification by stem-loop qRT-PCR and on the number of reads obtained in the sRNA-seq data (described in the Methods section).
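One possible way to derive such a normalization factor is sketched below. The exact formula is not given in the text, so the calculation shown here (the mean, over reference miRNAs, of the qPCR mutant/WT ratio divided by the read-count mutant/WT ratio) is an assumption, and the miRNA names and values are hypothetical.

```python
from statistics import mean


def normalization_factor(qpcr_wt, qpcr_mut, reads_wt, reads_mut):
    """Factor that rescales mutant sRNA-seq read counts to the WT scale.

    Each argument is a dict keyed by miRNA name. For every miRNA, the
    mutant/WT ratio measured by stem-loop qPCR is divided by the
    mutant/WT ratio of sequencing reads; the factors are then averaged.
    """
    factors = []
    for mirna in qpcr_wt:
        qpcr_ratio = qpcr_mut[mirna] / qpcr_wt[mirna]
        read_ratio = reads_mut[mirna] / reads_wt[mirna]
        factors.append(qpcr_ratio / read_ratio)
    return mean(factors)


# Hypothetical values: miRNA abundance is unchanged by qPCR, but the
# mutant library contains eight times fewer reads for each miRNA.
qpcr_wt = {"miR_a": 1.0, "miR_b": 2.0}
qpcr_mut = {"miR_a": 1.0, "miR_b": 2.0}
reads_wt = {"miR_a": 800, "miR_b": 1600}
reads_mut = {"miR_a": 100, "miR_b": 200}

factor = normalization_factor(qpcr_wt, qpcr_mut, reads_wt, reads_mut)
normalized_density = 50.0 * factor  # rescale a raw mutant read density
print(factor, normalized_density)  # 8.0 400.0
```

In this toy case the reference miRNAs are equally abundant in both genotypes, so the whole factor reflects their dilution in the mutant library by the extra degradation-derived reads.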
A first targeted analysis of sRNA-seq revealed that genes with an expression profile similar to NRT2.1 in clf-29 lhp1-4 also display a strong accumulation of sRNAs at their locus in this genotype (Fig. 4C).
Moreover, nearly all of these non-canonical sRNAs were oriented in the sense strand of the transcripts, strongly supporting the idea that they originate from degraded mRNAs (Supplemental Fig. 5). To properly extract sRNA reads corresponding to RNA degradation, we discarded reads that had the size of canonical sRNAs (21-, 23-, 24- and 33-nt) and used filters for complexity (i.e. a broad distribution over a gene coding sequence), read sense (corresponding to the coding strand of the transcript), and minimal coverage of sRNA clusters. Then, we inferred the abundance of sRNA reads corresponding to RNA degradation in WT and clf-29 lhp1-4 at the genome-wide level. We observed a substantial increase of sRNA abundance over genes in clf-29 lhp1-4, with genes globally accumulating as much as eight times more sRNAs in clf-29 lhp1-4 than in WT (Fig. 4D). This confirmed at the genome-wide level what we observed on NRT2.1 and other co-expressed genes. We then compared the extent of degradation products between WT and clf-29 lhp1-4 according to the level of expression in WT. To do so, we performed a metagene analysis representing the density of sRNA reads associated with RNA degradation for subpopulations of genes ranked by their level of expression in WT. In agreement with our previous observations, we could see a strong increase in the degradation profile of transcripts in the double mutant in comparison to the WT. In the WT, only the 10% most expressed genes display a discernible level of RNA degradation. In contrast, we observed for clf-29 lhp1-4 intense profiles of RNA degradation that were especially strong for the most expressed genes (Fig. 4E). The higher the gene expression level, the stronger the effect of the clf-29 lhp1-4 mutations on sRNA accumulation (Fig. 4E).
Then we could infer a list of more than 4000 genes that show an increase (at least 2-fold) in their amount of transcript degradation-related reads in clf-29 lhp1-4. Next, we wanted to test whether these genes are, like NRT2.1, overall highly expressed and misregulated in clf-29 lhp1-4. We therefore compared these genes with the list of deregulated genes in clf-29 lhp1-4 and with the 10% most expressed genes in WT. This analysis showed a significant overlap between highly transcribed genes in WT and those degraded and deregulated in clf-29 lhp1-4 (Fig. 4F). These data confirm that the loss of both CLF and LHP1 gives rise to massive transcript degradation on genes that are highly expressed, and that this degradation globally affects the expression level of these genes.
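The significance of an overlap between two gene lists can be assessed with a randomization test in the spirit of the GeneSect procedure described in the Methods. The sketch below is a simplified stand-alone illustration with toy gene identifiers; the function name and the data are hypothetical, and it is not the GeneSect code.

```python
import random


def overlap_p_value(list_a, list_b, universe, n_draws=1000, seed=0):
    """Empirical P-value for the overlap between two gene lists.

    Random lists of the same sizes are drawn from the universe, and the
    P-value is the fraction of draws whose intersection is at least as
    large as the observed one.
    """
    rng = random.Random(seed)
    observed = len(set(list_a) & set(list_b))
    pool = sorted(universe)
    hits = 0
    for _ in range(n_draws):
        random_a = set(rng.sample(pool, len(list_a)))
        random_b = set(rng.sample(pool, len(list_b)))
        if len(random_a & random_b) >= observed:
            hits += 1
    return observed, hits / n_draws


# Toy example (hypothetical identifiers): two lists of 50 genes sharing
# 25 genes, drawn from a universe of 1000 genes.
universe = [f"gene{i}" for i in range(1000)]
degraded = universe[0:50]
highly_expressed = universe[25:75]
observed, p_value = overlap_p_value(degraded, highly_expressed, universe)
print(observed, p_value)
```

Two random lists of 50 genes out of 1000 share 2.5 genes on average, so an overlap of 25 genes is essentially never reached by chance in this toy setting.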
Degradation of transcripts is an important mechanism in eukaryotes that eliminates mRNAs with aberrant features that may ultimately compromise cellular functions. Given the excessive amount of transcript degradation related to transcriptionally active genes in clf-29 lhp1-4, we analyzed whether transcript integrity might be affected following the loss of both CLF and LHP1 function. To do so, we analyzed RNA-seq data from WT and clf-29 lhp1-4 lines, and looked at several features associated with premature degradation of mRNAs, including longer transcripts, splicing fidelity, and the presence of small insertions, deletions or unexpected polymorphisms. This analysis shows, for the whole genome as for genes targeted by H3K27me3 or LHP1, that mRNAs from clf-29 lhp1-4 did not show higher signatures of transcript defects than those of WT (Supplemental Table 2). This suggests that transcriptional fidelity was not significantly affected in clf-29 lhp1-4. On the other hand, we asked whether the excessive production of sRNAs observed in clf-29 lhp1-4 is linked to the chromatin environment mediated by PcG complexes. To do so, we compared sRNA abundance on PcG-targeted and non-PcG-targeted genes in WT and clf-29 lhp1-4. We calculated for each gene of the genome a degradation score, corresponding to the amount of sRNAs at each locus normalized by the level of expression. The RNA degradation score was compared between the list of H3K27me3- and/or LHP1-targeted genes and lists of randomly selected genes. In WT, we observed strikingly that genes marked by H3K27me3, LHP1, or both have a significantly higher RNA degradation score than the rest of the genome (Fig. 5A). This, surprisingly, shows that PcG-targeted genes are generally prone to mRNA degradation in comparison to the rest of the genome. Two important observations emerged from the corresponding analysis in clf-29 lhp1-4.
First, the degradation score was much higher than in the WT, confirming that degradation of mRNAs globally increases in clf-29 lhp1-4. Second, we observed that the degradation score increased for PcG-targeted genes but also for the rest of the genome, to a very similar extent (Fig. 5A). This strongly suggests that the increase of mRNA degradation in clf-29 lhp1-4 does not solely affect PcG-targeted genes but the whole genome, and therefore that the accumulation of sRNAs is not directly mediated by the loss of the PcG-related chromatin environment, but rather indirectly by another mechanism. Interestingly, when looking more carefully at the expression of genes annotated with the function "exoribonuclease activity and mRNA catabolic process", we found that some of these genes, such as XRN4, XRN3 or XRN2, could be slightly induced in clf-29 lhp1-4 (Fig. 5B). This suggests that the mRNA degradation machinery could be altered and boosted in clf-29 lhp1-4. However, this would also be an indirect explanation, as XRN2, XRN3 and XRN4 are not targeted by PcG chromatin marks (Roudier et al., 2011). To test the origin of sRNAs associated with excessive RNA degradation in clf-29 lhp1-4, we performed sRNA Northern blots targeting the NRT2.1 transcript on nuclear and cytoplasmic compartments in WT and clf-29 lhp1-4. This analysis showed that NRT2.1-associated sRNAs accumulate in clf-29 lhp1-4 in both nuclear and cytoplasmic compartments, with a stronger accumulation in the cytoplasm (Fig. 5C). These results suggest that sRNAs associated with increased RNA degradation in clf-29 lhp1-4 originate from both the cytoplasm and the nucleus. Therefore, we concluded that the excessive mRNA degradation caused by the loss of both CLF and LHP1 is presumably due to a misregulation of the RNA degradation machinery, affecting the expression of genes with major physiological functions.

Discussion
In this work, we explored the role of Arabidopsis PcG proteins in the regulation of transcriptionally active genes, in relation to mRNA degradation. Several other recent reports have illustrated PcG functions that are not restricted to genes with low expression. In Arabidopsis, a subset of H2AUb-marked genes corresponds to transcriptionally active genes (Zhou et al., 2017a).
These genes are associated with metabolic functions, but the impact of PcG protein regulation on them has not been analyzed yet. PcG proteins have usually been described for their roles in keeping the expression level of developmental genes to a minimum, creating a so-called facultative heterochromatin. However, it has already been shown that PcG proteins do not only target genes with low expression. In our data, the list of genes that are highly expressed and targeted by PcG proteins is not enriched in development-related loci but in genes involved in metabolism. Indeed, active PcG targets are highly enriched in genes involved in the transport of diverse nutrients such as nitrate, amino acids, sulfate, phosphate, potassium and water. The main model gene used in this study, NRT2.1, is also involved in metabolism. Loss of expression of the gene encoding the main nitrate transporter in clf-29 lhp1-4 dramatically reduces the nitrate influx capacity of the root system.
Transcriptionally active PcG protein targets have already been characterized for their enrichment in metabolic functions (van den Boom et al., 2016; Zhou et al., 2017a). This suggests that PcG proteins are important not only to maintain organ identity by repressing homeotic genes but also to ensure the integrity of active transcripts involved in major metabolic functions. Our previous study also described that NRT2.1 is constantly marked by H3K27me3 even under transcriptionally permissive conditions (Bellegarde et al., 2018). Loss of CLF function leads to a reduction of H3K27me3 levels at the NRT2.1 locus, increasing its mRNA levels without altering its expression pattern (Bellegarde et al., 2018). Here, using NRT2.1 as a model for active genes, we showed that concurrent loss of LHP1 and CLF, which are members of PRC1 and PRC2 respectively, leads surprisingly to down-regulation of NRT2.1 expression. Defects in NRT2.1 expression could be due to morphological changes, which are known to be frequent in Polycomb mutants. However, several observations argue against this possibility. Indeed, we observed that root development and anatomy do not seem particularly disrupted in clf-29 lhp1-4. In addition, we previously demonstrated that the regulation mediated by CLF does not modify the tissue-specific expression of NRT2.1 (Bellegarde et al., 2018). Finally, and most notably, we observed a striking association between the down-regulation of NRT2.1 in clf-29 lhp1-4 and the accumulation of sRNAs originating from the NRT2.1 transcript. Additional experiments will be helpful to characterize the cellular and morphological defects in clf-29 lhp1-4, but our results strongly suggest that NRT2.1 down-regulation is due to the excessive RNA degradation that occurs in clf-29 lhp1-4. At the genome-wide level, loss of CLF and LHP1 function also leads to the down-regulation of numerous metabolism-related active genes.
The analysis of the sRNA profile at NRT2.1 and genome-wide revealed that down-regulation of expression of active genes following loss of LHP1 and CLF is associated with the accumulation of sRNAs resulting from post-transcriptional degradation of mRNAs.
This might suggest a link between the PcG-mediated chromatin environment and RNA quality control (RQC) mechanisms.
Indeed, a recent study in Caenorhabditis elegans showed that the LSM2-8 XRN2 associated complex, localized in the nucleus, is involved in the repression of gene expression applied by PcG proteins (Mattout et al., 2020). The authors showed that the LSM2-8 complex and XRN2 specifically degrade transcripts coming from genes subjected to PcG repression, to enhance and ensure their transcriptional silencing.
In our study, we observed, whatever the genotype, that PcG-targeted genes are more prone to mRNA degradation than the rest of the genome. However, by differentiating PcG-targeted and non-PcG-targeted genes, we showed that the mechanism for degradation of mRNAs in PcG mutants is likely indirect, as genes that are not marked by H3K27me3 and LHP1 are also subjected to a similar enhancement of mRNA degradation in clf-29 lhp1-4. This suggests that the global enhancement of mRNA degradation in clf-29 lhp1-4 might be due to a deregulation of the expression of genes important for RQC mechanisms. In agreement, we observed that key genes involved in mRNA

Conflict of Interest
The authors have no conflicts to declare.

between clf-29 lhp1-4 and WT for differentially expressed genes (adjusted P-value < 0.05). C. Venn diagram representing the overlap between transcriptionally active genes (corresponding to the 10% most expressed) marked by H3K27me3 and LHP1, and transcriptionally active genes co-regulated with NRT2.1 in clf-29 lhp1-4. Significant overlap is indicated by asterisks (*** P < 0.001, nonparametric randomization test). D. Gene ontology analysis performed on the set of NRT2.1-like genes in clf-29 lhp1-4. The enrichment tests correspond to Fisher's exact tests. P-values are corrected for multiple testing, and the threshold for over-representation tests was set to 0.05. Gene counts represent the number of genes associated with a given ontology in the list of interest.