Next-Generation Sequencing (NGS)

 

A. Illumina Short-Read NGS

 

A.1   Whole Genome Sequencing (WGS)

We offer two options for short-read WGS library preparation: sonication-based fragmentation and Tn5 transposase-based tagmentation. Physical shearing by sonication (Covaris M220 Focused-Ultrasonicator) requires a relatively large amount of starting DNA, but the resulting library gives more even coverage across the whole genome. In contrast, enzymatic fragmentation with Tn5 transposase (Illumina Nextera DNA Prep or Nextera XT DNA Library Prep) requires very little genomic DNA (down to 1 ng) but shows some bias in genome coverage due to the sequence preference of the enzyme. Such tagmentation-associated biases may matter less in some applications, such as small-genome re-sequencing.
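
As a rough illustration of what "even coverage" means in practice, the sketch below summarizes per-base read depths with two simple statistics: the coefficient of variation and the fraction of positions at or above 0.2x the mean depth. The function name, the 0.2x cutoff, and the depth values are illustrative assumptions, not a metric or data produced by the BIG Core.

```python
from statistics import mean, pstdev

def coverage_evenness(depths, min_fraction_of_mean=0.2):
    """Return (mean depth, coefficient of variation, fraction of positions
    whose depth is at least min_fraction_of_mean x the mean depth)."""
    if not depths:
        raise ValueError("depths must not be empty")
    avg = mean(depths)
    if avg == 0:
        raise ValueError("mean depth is zero")
    cv = pstdev(depths) / avg  # lower CV means more even coverage
    threshold = min_fraction_of_mean * avg
    well_covered = sum(1 for d in depths if d >= threshold) / len(depths)
    return avg, cv, well_covered

# Hypothetical per-base depths with the same mean but different evenness.
even_library = [30, 31, 29, 30, 30, 28, 32, 30]
biased_library = [60, 55, 4, 2, 58, 3, 50, 8]

for name, depths in (("even", even_library), ("biased", biased_library)):
    avg, cv, frac = coverage_evenness(depths)
    print(f"{name}: mean={avg}, CV={cv:.2f}, well covered={frac:.3f}")
```

Two libraries with the same mean depth can differ widely on these statistics, which is why the mean alone does not tell you whether a coverage bias is present.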

A.2   Whole Exome Sequencing (WES)

Unlike WGS, WES sequences only the exons of protein-coding genes across the genome, thereby reducing the amount of sequence needed per sample. WES uses biotinylated oligonucleotide probes to capture the protein-coding exons from the genomic DNA library. WES is compatible with formalin-fixed paraffin-embedded (FFPE) samples. Different WES probe sets may be used; each comes with a BED file of target regions, which is needed for the downstream WES bioinformatics analysis. Our preferred WES option is Illumina DNA Prep with Enrichment, which uses the Nextera Rapid Capture Exome or Expanded Exome probe set and offers a rapid workflow and comprehensive exome content (Exome: 214,405 exons, 37 Mb target size; Expanded Exome, with UTRs and miRNA: 201,121 exons, 62 Mb). Other WES probe sets may also be requested.
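
A target size such as the 37 Mb quoted above can be recomputed from a probe set's BED file. The following is a minimal sketch, assuming standard BED conventions (whitespace-separated chrom, start, end columns with 0-based, half-open coordinates); the function name and the example intervals are hypothetical and do not come from any vendor file.

```python
from collections import defaultdict

def target_size_bp(bed_lines):
    """Sum the number of bases covered by BED intervals, counting
    overlapping intervals on the same chromosome only once."""
    intervals = defaultdict(list)
    for line in bed_lines:
        line = line.strip()
        # Skip blank lines and common BED header lines.
        if not line or line.startswith(("#", "track", "browser")):
            continue
        chrom, start, end = line.split()[:3]
        intervals[chrom].append((int(start), int(end)))

    total = 0
    for spans in intervals.values():
        spans.sort()
        cur_start, cur_end = spans[0]
        for start, end in spans[1:]:
            if start <= cur_end:      # overlapping or adjacent: extend
                cur_end = max(cur_end, end)
            else:                     # gap: close the current block
                total += cur_end - cur_start
                cur_start, cur_end = start, end
        total += cur_end - cur_start
    return total

# Hypothetical intervals for illustration only.
example = [
    "chr1\t100\t200",
    "chr1\t150\t300",
    "chr1\t400\t450",
    "chr2\t0\t50",
]
print(target_size_bp(example))
```

To use it on a real file, pass an open file handle, e.g. `target_size_bp(open("targets.bed"))`, where `targets.bed` stands in for whatever BED file accompanies the chosen probe set.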

A.3   Whole Transcriptome Sequencing (WTS) or RNA Sequencing (RNA-Seq)

Functional mRNA accounts for < 5% of the total RNA in the cell, whereas ribosomal RNA (rRNA) accounts for approximately 80%. Thus, for transcriptome or genome-wide gene expression studies, enrichment of mRNA from total RNA samples is preferable (please see the subsections below). We offer Illumina TruSeq Stranded Total RNA Library Prep for WTS/RNA-Seq, which provides precise measurement of strand orientation and uniform coverage of transcripts. Other RNA-Seq library preparations may also be requested.
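
To see why enrichment matters, the proportions above can be turned into a rough read budget. This sketch makes the simplifying assumption that, without enrichment, the share of reads from each RNA class tracks its share of total RNA; it also treats the "< 5%" figure for mRNA as an upper bound. The function name and the 50 million read budget are illustrative only.

```python
def reads_by_class(total_reads, percent_by_class):
    """Split a read budget across RNA classes using whole-number percentages."""
    return {name: total_reads * pct // 100
            for name, pct in percent_by_class.items()}

# Hypothetical run of 50 million reads from un-enriched total RNA.
budget = reads_by_class(50_000_000, {"mRNA": 5, "rRNA": 80})
print(budget)  # {'mRNA': 2500000, 'rRNA': 40000000}
```

Under these assumptions, at most about 2.5 million of 50 million reads would come from mRNA, while roughly 40 million would be spent on rRNA.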

A.3.1   Poly-A pull-down

Poly-adenylated transcripts, mainly mRNAs, are pulled down by hybridization to poly-T oligos bound to magnetic beads; this is the most commonly used method for mRNA enrichment. Poly-A enrichment requires high-quality total RNA (RIN > 7 is recommended) to recover full-length mRNA. It is not suitable for FFPE or otherwise degraded samples.

A.3.2   rRNA depletion

Depletion of ribosomal RNA (rRNA) is applied when mRNA transcripts do not carry a poly-A tail (e.g., bacterial mRNA); when long non-coding RNA (lncRNA) needs to be characterized as well as mRNA; or when RNA quality is low with partial degradation (low RIN), as in FFPE samples. Illumina Ribo-Zero is a subtractive hybridization method that uses oligonucleotide probes and magnetic beads to capture and remove rRNA (and globin mRNA) from the sample. Alternatively, an RNase H digestion method, which enzymatically degrades rRNA targeted by complementary oligonucleotides, may be considered.

A.3.3   RNA immunoprecipitation (RIP)

RIP uses a specific antibody raised against the protein of interest, together with protein A/G magnetic beads, to pull down the RNA-binding protein (RBP) in complex with its target RNAs. RIP is useful for functional studies of the complex interactions between RBPs and coding/non-coding RNAs.

A.4   Small RNA (microRNA) sequencing

Because of their small size, small RNAs, including microRNAs, are lost during most RNA-Seq library preparation procedures. For small RNA (microRNA) sequencing, we offer Illumina TruSeq Small RNA Library Prep for NGS library construction. Other small RNA sequencing library preparations may also be requested. The small RNA library prep can start either from enriched small RNA or directly from total RNA. Notably, the fraction of small RNA in a total RNA sample varies with tissue, organism, and RNA extraction kit/method. If starting from total RNA, a higher concentration is preferred.

A.5   Targeted Sequencing, 16S and/or ITS Metagenomic Sequencing

Targeted sequencing is used when deep sequencing of specific genomic target regions is desired, e.g., the 16S rRNA gene for metagenomic sequencing of bacterial communities and the internal transcribed spacer (ITS) for fungal communities. Two approaches are commonly used for targeted sequencing: probe-based capture and amplicon sequencing. We can help select a suitable commercially available target gene panel or design a custom solution to interrogate your genes/regions of interest. Please contact the BIG Core for more details.

A.6   ChIP-Seq, 3C/HiC-Seq, ATAC-Seq, and Others

Chromatin immunoprecipitation (ChIP) combined with NGS is used to investigate genome-wide interactions between a protein of interest and DNA in the cell. Chromatin conformation capture (3C) techniques, including 3C (one vs. one), 4C (one vs. all), 5C (many vs. many), and HiC (all vs. all), combined with NGS are a set of methods for genome-wide analysis of chromatin spatial organization in the cell. Assay for Transposase-Accessible Chromatin with high-throughput sequencing (ATAC-Seq) is used to assess genome-wide chromatin accessibility and is an alternative to Micrococcal Nuclease digestion with deep sequencing (MNase-Seq), Formaldehyde-Assisted Isolation of Regulatory Elements for deep sequencing (FAIRE-Seq), and DNase I hypersensitive sites sequencing (DNase-Seq). Because of the high sensitivity and complexity involved in sampling, we do not perform ChIP, 3C/HiC, or nuclease digestion assays in the BIG Core. However, we do prepare Illumina NGS libraries from DNA obtained with these protocols. Please note that no guarantee is offered with this library service, other than that we will do our best! Please contact the BIG Core for more details on the options.

 

B. PacBio Long-Read Sequencing

 

B.1   Whole Genome Sequencing (WGS)

B.1.1   Continuous Long Read (CLR) WGS:

CLR WGS utilizes long-insert library preparations, up to 30 kb, from intact, high-molecular-weight (HMW) gDNA. The sequencing data generated from these libraries, without PCR amplification, can be used to detect a wide range of structural variants with a reference-based alignment approach, to assemble highly contiguous reference genomes de novo, and to analyze DNA base modifications (e.g., methylation) and their sequence motifs.

B.1.2   Circular Consensus Sequence (CCS) WGS:

CCS WGS, i.e., HiFi WGS, utilizes slightly shorter library preparations, ranging from 10 to 18 kb, that are tightly size-selected. During sequencing, each molecule is read repeatedly, and these multiple passes are collapsed to generate a single highly accurate (>99.9%) CCS read. The sequencing data can be used to characterize single nucleotide polymorphisms (SNPs) and structural variants when compared to a reference genome.
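
The idea of collapsing repeated passes into one accurate read can be illustrated with a toy per-position majority vote. This is a deliberately simplified sketch and not PacBio's CCS algorithm: it assumes the passes are already aligned, have equal length, and contain only substitution errors. The function name and the example sequences are made up.

```python
from collections import Counter

def majority_consensus(passes):
    """Return the per-position majority base across equal-length,
    pre-aligned passes over the same molecule (toy model only)."""
    if not passes:
        raise ValueError("at least one pass is required")
    length = len(passes[0])
    if any(len(p) != length for p in passes):
        raise ValueError("all passes must have the same length")
    consensus = []
    for column in zip(*passes):
        # Pick the most frequent base in this column.
        base, _count = Counter(column).most_common(1)[0]
        consensus.append(base)
    return "".join(consensus)

# Five hypothetical passes; three of them carry one random error each.
passes = ["ACGTACGT", "ACGAACGT", "ACGTACCT", "ACGTACGT", "TCGTACGT"]
print(majority_consensus(passes))  # ACGTACGT
```

Because the errors in individual passes fall at different positions, each is outvoted by the other passes, which is the intuition behind consensus accuracy increasing with the number of passes.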

B.1.3   Low Input WGS:

WGS libraries can be prepared from as little as 500 ng of gDNA when it is necessary to analyze very precious samples. The sequencing data from these low-input libraries are processed using the HiFi WGS workflow described above and may be used for de novo genome assembly and for variant analysis when compared to a reference genome.

B.1.4   Ultra Low Input WGS:

Ultra Low Input WGS uses whole-genome amplification (WGA) to generate enough material for library preparation when the amount of gDNA is extremely limited (> 5 ng HMW gDNA for genomes up to 500 Mb). The sequencing data are also processed using the HiFi WGS workflow described above and may be used for de novo genome assembly and for variant analysis when compared to a reference genome.

B.2   Isoform Sequencing (Iso-Seq):

Iso-Seq libraries are constructed from full-length cDNA generated by oligo-dT priming of total RNA, so as to capture all poly-adenylated transcripts. Because each isoform is represented by highly accurate HiFi reads, the sequencing data are typically used to generate novel reference transcriptomes, examine alternative splicing patterns, resolve fusion transcripts, or characterize alternative promoter, exon, and UTR usage under different experimental conditions.

B.3   Amplicon Sequencing:

Amplicons up to 12 kb in length, from cDNA, DNA, or bisulfite-treated DNA, can yield high-quality sequence data on the Sequel IIe System using HiFi sequencing with CCS analysis. Usually, a two-step PCR is employed: the first PCR uses sequence-specific oligos tagged at their 5′ ends with universal tags, and the second PCR uses the universal tags to add sample-specific 16 bp barcode indices to both ends of each amplicon. Because of the specificity of the primers used and the complexity of multiplexed samples, we encourage investigators to carry out these PCRs themselves, as well as the equimolar pooling of the amplicons after fluorometric quantification. For full-length 16S metagenomic sequencing, the suggested protocol is described in detail here: Full-Length-16S-Pacbio-Protocol, which can also serve as a good reference for other targeted amplicon sequencing. For more details on indexing and primer design, please contact the BIG Core.
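
For the equimolar pooling step, the sketch below converts fluorometric concentrations (ng/µL) to molarity and then computes the volume of each amplicon needed to contribute the same number of molecules. It assumes double-stranded DNA at an average of 660 g/mol per base pair, a commonly used approximation; the function names, sample names, and concentrations are hypothetical and are not part of the BIG Core protocol.

```python
def molarity_nM(conc_ng_per_ul, length_bp, g_per_mol_per_bp=660.0):
    """Convert a dsDNA concentration in ng/uL to nM (equivalently fmol/uL)."""
    if conc_ng_per_ul <= 0 or length_bp <= 0:
        raise ValueError("concentration and length must be positive")
    return conc_ng_per_ul * 1e6 / (g_per_mol_per_bp * length_bp)

def equimolar_volumes(samples, fmol_each):
    """Return the volume (uL) of each sample that contains fmol_each femtomoles.

    samples maps a sample name to (concentration in ng/uL, amplicon length in bp).
    """
    return {name: fmol_each / molarity_nM(conc, length)
            for name, (conc, length) in samples.items()}

# Hypothetical barcoded full-length 16S amplicons (about 1,500 bp each).
samples = {
    "sample_01": (10.0, 1500),
    "sample_02": (20.0, 1500),
    "sample_03": (5.0, 1500),
}
for name, volume in equimolar_volumes(samples, fmol_each=50).items():
    print(f"{name}: {volume:.2f} uL")
```

A sample at twice the concentration needs half the volume; when amplicon lengths differ between samples, the length term matters as much as the concentration.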