Home
Feel free to suggest questions, report problems, or share feedback: post to the GitHub issues or email the authors at lvruitu@gmail.com.
If you're interested in analyzing KAS-seq data from rare species using KAS-Analyzer, simply provide the reference genome sequence and gene annotation to the authors. You can send these materials to lvruitu@gmail.com, or post the download link in the KAS-Analyzer GitHub issue or discussion section. The authors will promptly update KAS-Analyzer with the new species data.
2. Is it necessary to install or activate the KAS-Analyzer conda environment every time I use KAS-Analyzer?
KAS-Analyzer incorporates numerous top-performing tools from the NGS field, such as bwa, bowtie2, epic2, deeptools, bedtools, samtools, macs2, and homer. If you prefer not to activate the KAS-Analyzer conda environment each time you use KAS-Analyzer, ensure that all required tools are accessible in your system. You can verify this by running 'KAS-Analyzer install -check' in the terminal. Nonetheless, it is recommended to activate the KAS-Analyzer conda environment when generating plots or performing differential KAS-seq analysis, as these tasks typically require loading specific R packages.
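As a rough complement to 'KAS-Analyzer install -check', the sketch below shows one way to see which tools are reachable on your PATH. It is not part of KAS-Analyzer. The executable names are assumptions: deeptools and homer each ship several commands, so one representative command (`bamCoverage`, `findMotifsGenome.pl`) stands in for each.

```python
import shutil

# Assumed executable names; adjust to match your installation.
REQUIRED_TOOLS = [
    "bwa", "bowtie2", "epic2", "bamCoverage",
    "bedtools", "samtools", "macs2", "findMotifsGenome.pl",
]

def missing_tools(tools=REQUIRED_TOOLS):
    """Return the tools that cannot be found on the current PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]

missing = missing_tools()
if missing:
    print("Not found on PATH:", ", ".join(missing))
else:
    print("All listed tools are available.")
```

If anything is reported missing, activating the KAS-Analyzer conda environment is the simplest fix.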
Whether KAS-seq or spKAS-seq data are of good quality can be judged in several ways: 1) Peak number >= 50,000, or fraction of reads in peaks (FRiP) >= 25%. 2) The typical 'valley' pattern of KAS-seq signal over genic regions is visible in heatmaps or metagene profiles: sharp peaks around the TSS, broad, low-level enrichment over the gene body, and broad, strong enrichment over terminator regions. 3) When the data are viewed in a genome browser (UCSC, IGV, or WashU Epigenome Browser), many prominent KAS-seq peaks are visible, especially over genic regions.
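The first criterion can be checked with a few lines of arithmetic. The sketch below is illustrative only: the function names are hypothetical, and the read counts are assumed to come from your own peak-calling and alignment statistics.

```python
def frip(reads_in_peaks, total_mapped_reads):
    """Fraction of reads in peaks (FRiP)."""
    if total_mapped_reads <= 0:
        raise ValueError("total_mapped_reads must be positive")
    return reads_in_peaks / total_mapped_reads

def passes_peak_qc(peak_count, reads_in_peaks, total_mapped_reads,
                   min_peaks=50_000, min_frip=0.25):
    """True if the peak number OR the FRiP reaches the thresholds given above."""
    return (peak_count >= min_peaks
            or frip(reads_in_peaks, total_mapped_reads) >= min_frip)

# Made-up example: 32,000 peaks, 12M of 40M reads fall in peaks (FRiP = 0.30).
print(passes_peak_qc(32_000, 12_000_000, 40_000_000))
```

Passing this check does not replace inspecting the metagene profile and the browser tracks.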
We recommend running a saturation analysis using 'KAS-Analyzer saturation'. As a rule of thumb, 40M deduplicated mapped reads should be enough for a regular KAS-seq assay. You can estimate how many raw sequencing reads to request from the sequencing facility from the mapping ratio (~95%) and duplication ratio (20%-40%) typical of most KAS-seq data in our lab.
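The estimate can be worked through as follows. This is a minimal sketch that assumes the duplication ratio is the fraction of mapped reads removed as duplicates; the function name is hypothetical.

```python
import math

def raw_reads_needed(target_dedup_reads, mapping_ratio, duplication_ratio):
    """Raw reads required to end up with the target number of
    deduplicated mapped reads."""
    usable_fraction = mapping_ratio * (1.0 - duplication_ratio)
    if usable_fraction <= 0:
        raise ValueError("mapping and duplication ratios leave no usable reads")
    return math.ceil(target_dedup_reads / usable_fraction)

target = 40_000_000
for dup in (0.20, 0.40):
    needed = raw_reads_needed(target, mapping_ratio=0.95, duplication_ratio=dup)
    print(f"duplication {dup:.0%}: about {needed / 1e6:.1f}M raw reads")
```

With these figures the requirement is roughly 53M to 70M raw reads, so requesting toward the upper end leaves a margin for libraries with higher duplication.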
KAS-Analyzer provides a subcommand (KAS-Analyzer complexity) to calculate the library complexity metrics of (sp)KAS-seq data, including the PCR Bottlenecking Coefficient (PBC) and the Non-Redundant Fraction (NRF).
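To show what these two metrics measure, here is a small sketch using the commonly cited ENCODE-style definitions: NRF is distinct read positions divided by total reads, and PBC is positions covered by exactly one read divided by distinct positions. It is an assumption that 'KAS-Analyzer complexity' uses the same definitions, and the input format (chromosome, start, strand tuples) is hypothetical.

```python
from collections import Counter

def library_complexity(read_positions):
    """Compute NRF and PBC from an iterable of (chrom, start, strand) tuples,
    one tuple per mapped read."""
    counts = Counter(read_positions)
    total_reads = sum(counts.values())
    if total_reads == 0:
        raise ValueError("no reads supplied")
    distinct_positions = len(counts)
    single_read_positions = sum(1 for n in counts.values() if n == 1)
    return {
        "NRF": distinct_positions / total_reads,
        "PBC": single_read_positions / distinct_positions,
    }

# Toy example: four reads, two of which are duplicates of the same position.
reads = [("chr1", 100, "+"), ("chr1", 100, "+"),
         ("chr1", 200, "+"), ("chr2", 50, "-")]
print(library_complexity(reads))
```

Values close to 1 for both metrics indicate a complex library with few PCR duplicates.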
We recommend normalizing KAS-seq data by the number of deduplicated mapped reads or by FPKM if spike-in normalization is not available. Calculate normalization factors that bring the number of deduplicated reads in each KAS-seq sample to the same level, then use 'KAS-Analyzer normalize' to generate normalized KAS-seq density files (bedGraph) from those factors.
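One simple way to derive such factors is to scale every sample down to the smallest library. The sketch below assumes that convention; the sample names and read counts are made up, and the exact factor format expected by 'KAS-Analyzer normalize' should be taken from its help text rather than from this example.

```python
def normalization_factors(dedup_read_counts):
    """Map each sample to a factor that scales its deduplicated read
    count to that of the smallest library."""
    if not dedup_read_counts:
        raise ValueError("no samples supplied")
    target = min(dedup_read_counts.values())
    if target <= 0:
        raise ValueError("read counts must be positive")
    return {sample: target / count for sample, count in dedup_read_counts.items()}

def scale_bedgraph_line(line, factor):
    """Multiply the signal column of one bedGraph line by the factor."""
    chrom, start, end, value = line.rstrip("\n").split("\t")
    return f"{chrom}\t{start}\t{end}\t{float(value) * factor:.4f}"

# Hypothetical samples and counts.
counts = {"KAS_rep1": 40_000_000, "KAS_rep2": 50_000_000}
factors = normalization_factors(counts)
print(factors)
print(scale_bedgraph_line("chr1\t0\t100\t5", factors["KAS_rep2"]))
```

Scaling to the smallest library avoids inflating signal in shallow samples; scaling to a fixed depth such as 1M reads is an equally valid convention.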
Single-stranded transcribing enhancers are enhancers that overlap KAS-seq peaks; they are 'single-stranded' active enhancers with higher activity and are enriched for unique TF motifs. We found that most single-stranded transcribing enhancers overlap ATAC-seq peaks and Pol II binding sites, and some also show stable eRNA transcription.
Pol II pauses near the promoter on a large fraction of transcribed genes, and transcription initiation and elongation are under distinct control. The pausing index is defined as the ratio of pause peak density to gene body density. Previously, the pausing index was usually calculated from nascent RNA density measured by GRO-seq; KAS-Analyzer instead uses ssDNA density calculated from KAS-seq data.
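The ratio defined above can be sketched in a few lines. The window sizes and signal values here are made up for illustration, and the windows KAS-Analyzer actually uses around the TSS and over the gene body are not specified here.

```python
def density(signal_sum, region_length_bp):
    """Average signal per base pair over a region."""
    if region_length_bp <= 0:
        raise ValueError("region length must be positive")
    return signal_sum / region_length_bp

def pausing_index(pause_signal, pause_length_bp, body_signal, body_length_bp):
    """Ratio of pause peak density to gene body density.
    Returns None when the gene body carries no signal."""
    body_density = density(body_signal, body_length_bp)
    if body_density == 0:
        return None
    return density(pause_signal, pause_length_bp) / body_density

# Hypothetical gene: 600 signal units in a 300 bp promoter-proximal window,
# 5,000 signal units over a 10 kb gene body.
print(pausing_index(600, 300, 5_000, 10_000))
```

Genes with no gene body signal are reported as None rather than as an infinite index, so they can be filtered out before comparing conditions.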
Transcription termination is the process by which a nascent RNA is released from its complex with RNA polymerase and the DNA template. The termination index is defined as the ratio of terminator peak density to gene body density, and can be interpreted as how difficult it is for Pol II to leave its DNA template. The termination index can only be calculated using KAS-seq data and KAS-Analyzer.
We recommend at least 3 replicates per condition for differential KAS-seq analysis. We mainly use DESeq2 for differential RNA Pols activity analysis in two-condition KAS-seq experiments, and ImpulseDE2 for time-course KAS-seq experiments. Higher replicate numbers can marginally reduce false positives.
Please post your question in the GitHub discussion section.
KAS-pipe2 is still under active development, and we have not made official releases yet.