Reconstructing transcript models from RNA-seq data and establishing them as independent transcriptional units can be challenging, especially for low-abundance long non-coding RNAs (lncRNAs).
The Zipper plot is a visualization and analysis method that lets users interrogate putative transcription start sites (TSSs) in relation to features indicative of transcriptional activity. These features are obtained from publicly available datasets, including CAGE-sequencing (CAGE-seq), ChIP-sequencing (ChIP-seq) for histone marks and DNase-sequencing (DNase-seq), across a large collection of tissues and cell types. The presence of peaks in the vicinity of a TSS increases the likelihood that the transcript is an independent transcriptional unit.
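The idea of checking for peaks near a TSS can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the application's actual implementation: the function name, the representation of peaks as (start, end) intervals on the same chromosome, the 5,000 bp search window and the sign convention (negative means upstream of the TSS with respect to the strand) are all hypothetical choices.

```python
def signed_distance_to_closest_peak(tss, strand, peaks, window=5000):
    """Return the strand-aware distance from a TSS to its closest peak.

    tss    : genomic coordinate of the putative TSS
    strand : "+" or "-"
    peaks  : list of (start, end) intervals on the same chromosome
    window : hypothetical search radius in bp (an assumption of this sketch)

    Returns 0 if the TSS lies inside a peak, a negative value if the closest
    peak is upstream, a positive value if it is downstream, and None if no
    peak lies within the window.
    """
    if strand not in ("+", "-"):
        raise ValueError(f"strand must be '+' or '-', got {strand!r}")

    best = None
    for start, end in peaks:
        if start <= tss <= end:
            dist = 0                 # TSS overlaps the peak
        elif end < tss:
            dist = end - tss         # peak at lower coordinates (negative)
        else:
            dist = start - tss       # peak at higher coordinates (positive)

        # On the minus strand, upstream and downstream are mirrored.
        if strand == "-":
            dist = -dist

        if abs(dist) <= window and (best is None or abs(dist) < abs(best)):
            best = dist
    return best
```

Computing one such distance per TSS and per feature type would give the kind of TSS-peak relationship the text describes; how the application itself ranks and tests these associations is described in the manuscript cited below.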
The Zipper plot application requires three tab-separated fields as input: chromosome, genomic coordinate (hg19) of the TSS, and strand. Users who have labels for the genomic features being studied can optionally supply them as a fourth column (e.g.: chr12 3884608 + lnc-PRMT8-4:1). The application then generates a report containing a detailed summary table, a Zipper plot and several statistics for assessing the significance of the various TSS-peak associations, and emails it directly to the user.
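As a convenience, the input format described above can be produced and checked with a small script before submission. The following is a minimal sketch: the helper names `format_tss_line` and `parse_tss_line` are hypothetical, and the checks (a "chr" prefix on the chromosome name, a positive integer coordinate, a strand of + or -) are assumptions inferred from the example line, not rules documented by the application.

```python
def format_tss_line(chrom, position, strand, label=None):
    """Build one tab-separated input line: chromosome, TSS coordinate, strand, optional label."""
    # Assumption: chromosome names carry a "chr" prefix, as in the example "chr12".
    if not chrom.startswith("chr") or len(chrom) <= 3:
        raise ValueError(f"unexpected chromosome name: {chrom!r}")
    position = int(position)  # raises ValueError for non-numeric input
    if position <= 0:
        raise ValueError(f"coordinate must be positive, got {position}")
    if strand not in ("+", "-"):
        raise ValueError(f"strand must be '+' or '-', got {strand!r}")

    fields = [chrom, str(position), strand]
    if label is not None:
        fields.append(label)  # optional fourth column
    return "\t".join(fields)


def parse_tss_line(line):
    """Split one input line into (chromosome, coordinate, strand, label or None)."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) not in (3, 4):
        raise ValueError(f"expected 3 or 4 tab-separated fields, got {len(fields)}")
    chrom, position, strand = fields[:3]
    label = fields[3] if len(fields) == 4 else None
    format_tss_line(chrom, position, strand, label)  # reuse the checks above
    return chrom, int(position), strand, label


if __name__ == "__main__":
    # The example TSS from the text, written as a four-column line.
    print(format_tss_line("chr12", 3884608, "+", "lnc-PRMT8-4:1"))
```

Writing one such line per TSS to a plain-text file yields an input in the three- or four-column layout described above.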
Avila Cobos, F. et al. Zipper plot: visualizing transcriptional activity of genomic regions. BMC Bioinformatics 18, 231 (2017).
Our manuscript is available at BioMed Central