Chapter 10-2. Epigenomics Sequencing
Recommended Reading: 【Biology】 Chapter 10. Genome Projects and Sequencing Technologies
1. Type 1. Identifying gene function
2. Type 2. Identifying transcription regulation
3. Type 3. Identifying post-transcriptional regulation
4. Type 4. Programmable cell function
1. Type 1. Identifying gene function
⑴ Perturb-seq
① 1st. Treat Cas9-expressing cells with a gRNA library.
○ CRISPR-Cas9 introduces a different perturbation in each cell, depending on which gRNA the cell receives.
○ At least 2.5 million cells should remain after filtering.
○ There should be at least 100 cells per perturbation on average.
○ Each cell should have approximately 10,000 UMIs.
② 2nd. Use sequencing technologies that detect both gRNA and mRNA (e.g., scRNA-seq, MERFISH)
③ 3rd. Group cells by gRNA expression: cells are naturally grouped according to perturbation condition.
④ 4th. Identify gene function.
○ Premise: Genes with similar functions are likely to exhibit similar expression patterns.
○ Discoverable Result 1. Changes in gene expression due to perturbations.
○ Discoverable Result 2. Differences in perturbation effects due to genetic variants.
Figure 1. General schematic of Perturb-seq including validation experiments
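The grouping and comparison steps above can be sketched in a few lines of Python. This is a minimal illustration, not the pipeline used in any published Perturb-seq study: the cell records, the gene names, the gRNA labels, and the `min_umis` cutoff are all hypothetical, and a real analysis would use a statistical test rather than a difference of means.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical toy data: one detected gRNA per cell, a UMI total, and
# normalized expression for two genes. All values are made up.
cells = [
    {"grna": "control", "umis": 11000, "expr": {"GENE_A": 5.0, "GENE_B": 2.0}},
    {"grna": "control", "umis": 10500, "expr": {"GENE_A": 5.2, "GENE_B": 1.8}},
    {"grna": "sgGENE_A", "umis": 9800, "expr": {"GENE_A": 1.0, "GENE_B": 3.9}},
    {"grna": "sgGENE_A", "umis": 12000, "expr": {"GENE_A": 1.2, "GENE_B": 4.1}},
    {"grna": "sgGENE_A", "umis": 300, "expr": {"GENE_A": 0.0, "GENE_B": 0.0}},
]


def group_by_grna(cells, min_umis=1000):
    """Drop low-UMI cells (assumed cutoff), then group the rest by gRNA."""
    groups = defaultdict(list)
    for cell in cells:
        if cell["umis"] >= min_umis:
            groups[cell["grna"]].append(cell)
    return dict(groups)


def mean_expression(group):
    """Average expression of each gene across the cells of one group."""
    genes = group[0]["expr"].keys()
    return {g: mean(c["expr"][g] for c in group) for g in genes}


def perturbation_effect(groups, grna, control="control"):
    """Difference in mean expression between perturbed and control cells."""
    ctrl = mean_expression(groups[control])
    pert = mean_expression(groups[grna])
    return {g: pert[g] - ctrl[g] for g in ctrl}


groups = group_by_grna(cells)
effect = perturbation_effect(groups, "sgGENE_A")
print(effect)
```

In this toy data the knockdown lowers GENE_A and raises GENE_B, which is the kind of expression change listed as Discoverable Result 1.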
⑵ In vivo Perturb-seq
① 1st. Deliver AAV (adeno-associated virus) containing Cre to mice to induce Cas9 expression.
② 2nd. Deliver lentivirus containing sgRNA.
③ 3rd. In vivo, the CRISPR-Cas9 system induces various perturbations in each Cas9-expressing cell depending on the gRNA.
④ 4th. Perform scRNA-seq and MERFISH after approximately 10 days.
Figure 2. In vivo Perturb-seq process
2. Type 2. Identifying transcription regulation
⑴ BS-seq (bisulfite sequencing)
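① Treat DNA with bisulfite, which converts unmethylated cytosine to uracil (read as T) while leaving methylated cytosine unchanged; comparing the reads to the reference reveals methylation patterns.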
② Enables epigenomic profiling.
③ Application 1. snmC-seq
○ Plate-based sequencing.
○ Measures sum of 5-methyl- and 5-hydroxymethyl-cytosines.
○ 1-2 million reads per cell.
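As a rough sketch of how bisulfite reads become methylation levels: at each reference cytosine, a read showing C was protected from conversion and a read showing T was converted. The function name, the read format, and the toy sequences below are hypothetical; the sketch assumes forward-strand reads that are already aligned and ignores incomplete conversion.

```python
def methylation_levels(reference, reads):
    """Per-cytosine protected fraction from aligned bisulfite reads.

    reads: list of (start, sequence) tuples on the forward strand.
    At a reference C, a read base of C counts as protected (methylated)
    and a read base of T counts as converted (unmethylated).
    """
    protected = {}
    total = {}
    for start, seq in reads:
        for offset, base in enumerate(seq):
            pos = start + offset
            if pos >= len(reference) or reference[pos] != "C":
                continue
            if base not in ("C", "T"):
                continue  # sequencing error or N: not informative
            total[pos] = total.get(pos, 0) + 1
            if base == "C":
                protected[pos] = protected.get(pos, 0) + 1
    return {pos: protected.get(pos, 0) / n for pos, n in total.items()}


reference = "ACGTCGA"  # cytosines at positions 1 and 4
reads = [(0, "ACGTTGA"), (0, "ACGTCGA"), (1, "TGTTG")]
levels = methylation_levels(reference, reads)
print(levels)
```

Because both 5-methyl- and 5-hydroxymethyl-cytosine resist conversion, the fraction computed this way is their sum, as noted above.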
⑵ ChIP-seq (chromatin immunoprecipitation sequencing)
① Definition: Enables analysis of DNA regions bound to specific proteins, such as transcription factors.
② Combines DNA sequencing with ChIP (chromatin immunoprecipitation) to identify binding sites of DNA-associated proteins.
③ 1st. Treat with a cross-linking agent (e.g., formaldehyde) to fix proteins (e.g., transcription factors) and DNA.
④ 2nd. Lyse the cells, leaving only the DNA/protein complexes.
⑤ 3rd. Use sonication to shear the DNA into small fragments (200-350 bp); the protein-DNA cross-links remain intact.
⑥ 4th. Add antibodies attached to magnetic beads; applying a magnetic field allows isolation of specific proteins and their linked DNA.
⑦ 5th. Reverse cross-linking: Apply heat to the precipitate to separate DNA from proteins.
⑧ 6th. Prepare a library (adaptor addition, PCR amplification, fragment size selection) and sequence the recovered DNA.
⑨ 7th. Compare the sequencing results to the genome reference to identify binding sites of transcription factors, etc.
Figure 3. ChIP-seq process
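The final comparison step can be illustrated with a deliberately simplified enrichment scan: pile up aligned fragments, then flag bins where the ChIP sample has much more coverage than a control (input) sample. The bin size, fold threshold, and pseudocount are arbitrary assumptions, and the sketch skips library-size normalization; dedicated peak callers use a proper statistical model instead.

```python
def coverage(fragments, length):
    """Per-base depth from half-open (start, end) fragment intervals."""
    depth = [0] * length
    for start, end in fragments:
        for pos in range(max(start, 0), min(end, length)):
            depth[pos] += 1
    return depth


def enriched_bins(chip, control, length, bin_size=100, min_fold=4.0,
                  pseudocount=1.0):
    """Return (bin_start, bin_end, fold) for bins enriched in ChIP."""
    chip_cov = coverage(chip, length)
    ctrl_cov = coverage(control, length)
    hits = []
    for b in range(0, length, bin_size):
        chip_sum = sum(chip_cov[b:b + bin_size])
        ctrl_sum = sum(ctrl_cov[b:b + bin_size])
        # The pseudocount avoids division by zero in empty control bins.
        fold = (chip_sum + pseudocount) / (ctrl_sum + pseudocount)
        if fold >= min_fold:
            hits.append((b, min(b + bin_size, length), fold))
    return hits


# Hypothetical toy fragments on a 400 bp region.
chip = [(110, 190), (120, 200), (105, 195), (300, 380)]
control = [(10, 90), (300, 380)]
hits = enriched_bins(chip, control, length=400)
print(hits)
```

Only the 100-200 bin is reported here; the 300-400 bin has equal coverage in both samples, so it is treated as background.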
⑩ Application 1. ChIP-chip: the microarray-based counterpart; ChIP-seq is the high-throughput sequencing (HTS) version of ChIP-chip.
⑪ Application 2. ChIP-PET
⑫ Application 3. ChIA-PET: ChIP-seq identifies only binding locations of specific proteins, while ChIA-PET investigates interactions between bound DNA regions.
⑬ Application 4. RIP-seq
⑭ Application 5. CLIP-seq
⑮ Application 6. meDIP-seq: Immunoprecipitates methylated or hydroxymethylated DNA to profile DNA methylation or hydroxymethylation.
⑯ Application 7. ChIP-exo: Uses lambda exonuclease digestion. Difficult experimentally.
⑰ Application 8. CUT&RUN (Cleavage under Targets and Release using Nuclease)
○ Uses Micrococcal Nuclease (MNase).
⑱ Application 9. CUT&Tag (Cleavage under Targets and Tagmentation)
○ Overview
○ Tn5 transposase is an enzyme that cuts open regions of DNA, generating various fragments such as single nucleosomes, dimers, trimers, and more.
○ The Tn5 transposase catalyzes DNA insertion by creating sticky ends.
○ It inserts adaptors directly at the cut sites (tagmentation), so the fragments can be amplified and sequenced without a separate adaptor-ligation step.
○ Procedure
○ Step 1. Bind nuclei to beads to anchor them. Magnetic capture leads to high retention of cells.
○ Step 2. Add primary antibody: Quality of data is highly dependent on quality (specificity & sensitivity) of antibody.
○ Step 3. Add secondary antibody.
○ Step 4. Add conjugated Tn5 transposase, which binds to the antibody complex and cuts the surrounding open DNA.
○ Step 5. PCR amplification, fragment size selection, sequencing
○ Types
○ MuLTI-Tag: Minimizes crossover in multiplexing via direct barcode conjugation.
○ multi-CUT&Tag: Uses barcoded Tn5/pA-antibody complexes. Identifies colocalization of marks.
○ spatial-CUT&Tag: Spatially visualizes histone modifications and chromatin states on tissue sections.
Feature | CUT&Tag | CUT&RUN | ChIP-Seq |
---|---|---|---|
Native condition? | Yes | Yes | No |
Sample input | Nuclei | Cells or nuclei | Sheared chromatin |
Cell number | 100,000 cells | 500,000 cells | 1-10 million cells |
Chromatin fragmentation | Tn5-based tagmentation | MNase digestion | Sonication |
Ideal target | Histone PTM | Histone PTM, Chromatin-associated protein, Remodeler | Histone PTM, Chromatin-associated protein |
Secondary antibody | Yes | No | No |
Library preparation | No (direct-to-PCR) | Yes | Yes |
Integrated library | Possible; uses tagmentation | Impossible | Impossible |
Sequencing depth | 5-8 million reads | 3-5 million reads | 20-50 million reads |
Workflow length | < 2 days | < 3 days | ~ 1 week |
Automation compatibility | High | High | Low |
Signal-to-noise | High | High | Low |
Table 1. CUT&Tag vs. CUT&RUN vs. ChIP-Seq (ref, ref, ref, ref)
⑶ Hi-C seq (high-throughput chromatin conformation capture sequencing)
① Definition: Identifies DNA sequences that lie close together in 3D space within the nucleus, even when they are far apart along the chromosome.
② Uses DNA-DNA contact frequencies to reveal the 3D folding structure of chromosomes within the nucleus.
Figure 4. Hi-C seq process
③ Application 1. ChIA-PET
○ Difference: Hi-C investigates all naturally occurring DNA-DNA interactions, while ChIA-PET focuses on DNA-DNA interactions mediated by specific proteins.
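Hi-C data are usually summarized as a contact matrix: the chromosome is cut into fixed-size bins and each read pair adds one count to the cell for its two bins. The sketch below assumes read pairs from a single chromosome given as hypothetical (position, position) tuples, with an arbitrary bin size; it does no normalization.

```python
def contact_matrix(pairs, chrom_length, bin_size):
    """Symmetric bin-by-bin contact counts from paired read positions."""
    n_bins = (chrom_length + bin_size - 1) // bin_size
    matrix = [[0] * n_bins for _ in range(n_bins)]
    for pos_a, pos_b in pairs:
        i, j = pos_a // bin_size, pos_b // bin_size
        matrix[i][j] += 1
        if i != j:
            matrix[j][i] += 1  # keep the matrix symmetric
    return matrix


# Hypothetical read pairs: two pairs link the start and end of the region.
pairs = [(1_000, 95_000), (2_500, 98_000), (40_000, 45_000), (10_000, 12_000)]
m = contact_matrix(pairs, chrom_length=100_000, bin_size=25_000)
for row in m:
    print(row)
```

High counts far from the diagonal, such as between the first and last bins here, indicate loci that are distant in sequence but close in the folded chromosome.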
⑷ DNA ticker tape (prime editing)
① 1st. Initially, only the first site is activated.
② 2nd. After the first event, the second site becomes activated.
③ 3rd. Sequentially records molecular events over time.
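The sequential activation logic can be modeled as a small state machine: exactly one site is writable at a time, and writing to it makes the next site writable. The class below is a conceptual toy with hypothetical names; it does not model prime editing chemistry.

```python
class TickerTape:
    """Toy model of an array of sites that can only be written in order."""

    def __init__(self, n_sites):
        self.sites = [None] * n_sites
        self.active = 0  # initially only the first site is active

    def record(self, event):
        """Write an event to the active site and activate the next one."""
        if self.active >= len(self.sites):
            return False  # tape is full; the event is lost
        self.sites[self.active] = event
        self.active += 1
        return True

    def read(self):
        """Events in the order they occurred, recovered from site order."""
        return [s for s in self.sites if s is not None]


tape = TickerTape(n_sites=3)
results = [tape.record(e) for e in ["A", "B", "C", "D"]]
print(tape.read())
```

Because position along the tape encodes order, reading the sites from first to last recovers the temporal sequence of events without any timestamps.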
⑸ ENGRAM (enhancer-driven genomic recording of transcriptional activity in multiplex)
① Uses pegRNA (prime editing guide RNA) linked to synthetic enhancers.
② Records the sequence and intensity of signaling.
③ Reference: Chen et al., bioRxiv (2021)
⑹ ATAC-seq (Assay for Transposase-Accessible Chromatin using sequencing)
① Overview
○ Definition: A sequencing technique that identifies open (accessible) chromatin regions, i.e., euchromatin.
○ Pseudo-expression: Accessibility of euchromatin regions can be used as a proxy for gene expression.
○ Tn5 transposase is an enzyme that cuts open regions of DNA, generating various fragments such as single nucleosomes, dimers, trimers, and more.
○ The Tn5 transposase catalyzes DNA insertion by creating sticky ends.
○ It inserts adaptors directly at the cut sites (tagmentation), so the fragments can be amplified and sequenced without a separate adaptor-ligation step.
○ The ~10.5 bp periodicity observed in ATAC-seq fragment sizes reflects the helical pitch of DNA: one full turn of the B-DNA helix spans roughly 10 to 10.5 bp.
○ The fourth alpha-helix in Tn5 acts as the major “recognition helix” and makes several base pair-specific contacts in the major groove of the DNA from positions 7 to 13. (ref)
○ Tn5 transposase bends the helical axis, leading to an increase in the roll and tilt angles, and in deviations of the major and minor groove widths and depths, compared with average B-DNA. (ref)
○ This distortion may contribute to the observed ~10.5 bp periodicity, which differs slightly from the 10 bp per turn often quoted for idealized B-DNA.
Figure 5. ATAC-seq fragment size distribution
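A fragment size distribution like this is often summarized by assigning each fragment to a nucleosome class by length. The length cutoffs below are illustrative assumptions, not values stated in these notes, and should be adjusted to the dataset at hand.

```python
from collections import Counter


def classify_fragment(length):
    """Assign an ATAC-seq fragment to a class by length (assumed cutoffs)."""
    if length < 100:
        return "nucleosome_free"
    if 180 <= length <= 247:
        return "mononucleosome"
    if 315 <= length <= 473:
        return "dinucleosome"
    return "other"


# Hypothetical fragment lengths in bp.
sizes = [60, 75, 200, 210, 400, 150]
summary = Counter(classify_fragment(s) for s in sizes)
print(summary)
```

Short fragments come from Tn5 cutting twice within open DNA, while the longer classes span one or more nucleosomes.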
② Type 1. bulk ATAC-seq
○ Step 1. Nuclei isolation
○ Step 2. Tn5 transposase treatment: This cuts less condensed open regions in the chromosome and inserts DNA sequence tags.
○ Step 3. Amplification & sequencing
○ Step 4. Data analysis
③ Type 2. scATAC-seq
○ Defines cell type-specific CREs to identify regulatory TFs and cell types associated with diseases and traits.
○ Used in interpreting GWAS variants.
○ Comparison of scRNA-seq and scATAC-seq
○ scRNA-seq: xij ∈ ℤ≥0
○ scATAC-seq: xij ∈ {0, 1}, j ≫ i
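The difference between the two value ranges can be made concrete with a toy matrix. The sketch assumes rows are cells and columns are features, which is one common convention; the counts are made up.

```python
def binarize(counts):
    """Map a count matrix to {0, 1}: 1 if any read was observed."""
    return [[1 if x > 0 else 0 for x in row] for row in counts]


def sparsity(matrix):
    """Fraction of entries that are zero."""
    values = [x for row in matrix for x in row]
    return sum(1 for x in values if x == 0) / len(values)


# Hypothetical counts: 2 cells (rows) by 3 features (columns).
counts = [[0, 2, 1], [3, 0, 0]]
accessible = binarize(counts)
print(accessible, sparsity(accessible))
```

scRNA-seq analyses typically keep the integer counts, whereas scATAC-seq analyses often work with the binarized matrix because each locus has few copies per cell.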
④ Type 3. spatial ATAC-seq
⑤ Type 4. FAIRE-seq: Uses formaldehyde to crosslink chromatin and phenol-chloroform extraction to recover the sheared DNA.
⑥ Type 5. DNaseI-seq: DNase I endonuclease to digest chromatin. Can identify the types of transcription factors interacting with chromatin.
Figure 6. Example results of DNaseI-seq
⑦ Type 6. MNase-seq: Endo-exonuclease to processively digest DNA until obstruction.
Figure 7. Comparison of ATAC-seq with Various Sequencing Techniques
⑺ NOMe-seq (nucleosome occupancy and methylome sequencing)
① Defines nucleosome-depleted regions (NDR) where TFs bind.
② Identifies transcription factors interacting with chromatin.
⑻ MBD-seq
⑼ Bru-seq & BruChase-seq
① Newly transcribed nascent RNA is labeled with Bru (bromouridine) and then sequenced.
② Used for studying RNA synthesis, RNA stability, and splicing.
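For the RNA stability use case, a half-life can be estimated from the labeled signal right after the pulse and after a chase period. The sketch assumes simple first-order (exponential) decay, which is a modeling assumption rather than something stated in these notes; the function name and the numbers are hypothetical.

```python
import math


def half_life(signal_pulse, signal_chase, chase_hours):
    """Half-life under first-order decay: N(t) = N0 * exp(-k * t)."""
    if signal_pulse <= 0 or signal_chase <= 0:
        raise ValueError("signals must be positive")
    if signal_chase >= signal_pulse:
        raise ValueError("no decay observed; half-life is undefined")
    decay_rate = math.log(signal_pulse / signal_chase) / chase_hours
    return math.log(2) / decay_rate


# Hypothetical transcript: signal drops from 100 to 25 after a 6 h chase.
t_half = half_life(100.0, 25.0, 6.0)
print(t_half)
```

A drop to one quarter corresponds to two half-lives, so a 6 h chase gives a half-life of about 3 h in this example.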
⑽ TT-seq
⑾ MNIST-seq: Studies the role of RNA methylation in chromatin regulation.
⑿ GRO-seq: global run-on sequencing
3. Type 3. Identifying post-transcriptional regulation
⑴ Ribo-seq
① Sequencing of ribosome-protected RNA, indicating active translation.
② scRibo-seq
○ Measures ribosomal occupancy per single codon.
○ 1st. FACS and lysis.
○ 2nd. Nuclease footprinting: MNase digestion → inactivation → release of footprints.
○ 3rd. Create small-RNA library: End repair → 3’ ligation → 5’ ligation → cDNA synthesis → indexing PCR.
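Once footprints are sequenced and aligned, per-codon occupancy can be estimated by mapping each footprint to the codon in the ribosomal P-site. The sketch assumes a fixed P-site offset of 12 nt from the footprint 5' end, a commonly used value that in practice depends on footprint length and nuclease; the coordinates are hypothetical.

```python
from collections import Counter


def codon_occupancy(footprint_starts, cds_start, cds_length, psite_offset=12):
    """Count footprints per codon index using an assumed P-site offset."""
    counts = Counter()
    for start in footprint_starts:
        psite = start + psite_offset
        rel = psite - cds_start
        if 0 <= rel < cds_length:  # ignore footprints outside the CDS
            counts[rel // 3] += 1
    return counts


# Hypothetical transcript: CDS starts at position 50 and spans 10 codons.
starts = [38, 39, 41, 44, 100]
occ = codon_occupancy(starts, cds_start=50, cds_length=30)
print(occ)
```

Codons with unusually high counts relative to their neighbors are candidates for ribosome pausing.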
⑵ STAMP-RBP
① Uses scRNA-seq to identify the binding sites of RNA-binding proteins (RBPs).
② 1st. Fuse APOBEC to the RBP.
③ 2nd. APOBEC performs C-to-U editing near the site where the RBP binds mRNA, replacing cytosine (C) with uracil (U).
④ 3rd. RNA-seq
⑤ 4th. Use SAILOR to identify C-to-U editing sites.
⑥ Can also identify isoform-specific binding profiles using long-read sequencing.
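C-to-U edits appear as C-to-T mismatches in the sequenced cDNA, so candidate sites can be found by counting T reads at reference cytosines. The function below is a simplified stand-in for illustration and is not the SAILOR algorithm; the thresholds, read format, and sequences are hypothetical, and a real analysis would also compare against a control without the fusion protein.

```python
def edit_sites(reference, reads, min_coverage=3, min_fraction=0.1):
    """Reference C positions with a notable fraction of C-to-T reads.

    reads: list of (start, sequence) tuples on the forward strand.
    """
    coverage = {}
    edited = {}
    for start, seq in reads:
        for offset, base in enumerate(seq):
            pos = start + offset
            if pos >= len(reference) or reference[pos] != "C":
                continue
            if base not in ("C", "T"):
                continue  # other mismatches are not C-to-U edits
            coverage[pos] = coverage.get(pos, 0) + 1
            if base == "T":
                edited[pos] = edited.get(pos, 0) + 1
    sites = {}
    for pos, n in coverage.items():
        fraction = edited.get(pos, 0) / n
        if n >= min_coverage and fraction >= min_fraction:
            sites[pos] = fraction
    return sites


reference = "GACCTA"  # cytosines at positions 2 and 3
reads = [(0, "GATCTA"), (0, "GATCTA"), (0, "GACCTA"), (0, "GACCTA")]
sites = edit_sites(reference, reads)
print(sites)
```

Clusters of edited cytosines mark regions where the APOBEC fusion, and therefore the RBP, was bound.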
4. Type 4. Programmable cell function
⑴ RADARS
① 1st. When the target transcript is present, the sensor RNA pairs with it to form a dsRNA structure, allowing ADAR to induce A-to-I editing.
② 2nd. This editing enables expression of a payload that drives cellular behaviors, such as GFP expression or caspase activation.
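The sensing logic can be illustrated with a toy model. It assumes, as a simplification, that the sensor carries a UAG stop codon upstream of the payload and that editing happens only when the target transcript is present; inosine is decoded as guanosine, so an edited UAG reads as UGG. The codon list and function names are hypothetical.

```python
STOP_CODONS = {"UAA", "UAG", "UGA"}


def adar_edit(codon, target_present):
    """A-to-I editing of UAG; inosine is decoded as G, giving UGG."""
    if target_present and codon == "UAG":
        return "UGG"
    return codon


def payload_translated(sensor_codons, target_present):
    """True if no stop codon remains upstream of the payload."""
    for codon in sensor_codons:
        if adar_edit(codon, target_present) in STOP_CODONS:
            return False
    return True


# Hypothetical sensor: start codon, spacer, editable stop, spacer.
sensor = ["AUG", "GCU", "UAG", "GGC"]
print(payload_translated(sensor, target_present=True))
print(payload_translated(sensor, target_present=False))
```

In this model the payload (for example GFP or a caspase) is produced only in cells that express the target transcript, which is what makes the cell behavior programmable.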
⑵ LADL (light-activated dynamic looping)
① An example of photo-activatable gene expression.
Input: 2022.01.10 00:03
Modified: 2023.01.28 23:12