Detection of products of RNA degradation by collateral activity of Type VI CRISPR-Cas system from L. shahii

matveykolesnik/LshCas13a_RNA_cleavage

LshCas13a_RNA_cleavage

Analysis of RNA-Seq data from E. coli cells expressing a targeting or nontargeting Type VI CRISPR-Cas system.

Each top-level directory contains the data and scripts for one experiment:

  • LshCas13a_C3000 - RNA-Seq of total RNA extracted from E. coli C3000 cells carrying activated/nonactivated LshCas13a enzyme;
  • LshCas13a_d10LVM - RNA-Seq of total RNA extracted from E. coli Δ10LVM cells carrying activated/nonactivated LshCas13a enzyme;
  • LshCas13a_in_vitro_total_RNA - RNA-Seq of total RNA extracted from E. coli C3000 cells after in vitro incubation with activated/nonactivated LshCas13a enzyme;
  • LshCas13a_in_vitro_tRNAs - RNA-Seq of a total tRNA sample after in vitro incubation with activated/nonactivated LshCas13a enzyme.

Each experiment directory contains the following subdirectories:

  • Data - directory containing the raw read data;
  • Annotations - directory containing GFF tables with genomic features;
  • Alignments - directory containing alignments produced with the read_mapping.sh script;
  • Reference_sequences - directory containing FASTA files of the sequences used for read mapping;
  • Scripts - directory containing scripts for data processing;
  • Results - directory containing the results of data processing.

The "Results" directory contains the following subdirectories:

  • Tables
    • Ends_counts - contains files with the coordinates of 5' ends of fragments;
    • Fragment_coords - contains files with the coordinates of fragments (SeqID - Fragment_start - Fragment_end - Strand);
    • Merged_ends_counts - contains tables with 5' end counts derived from the samples designated for comparison;
    • Read_pairs_TABs - contains tables with the coordinates of read pairs.
  • WIG_files - contains WIG files with 5' end coverage.
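
The relationship between the "Fragment_coords", "Ends_counts" and "WIG_files" outputs can be illustrated with a minimal Python sketch. It is not the code from this repository: the function names, the 1-based inclusive coordinates, and the convention that the 5' end of a minus-strand fragment is its rightmost coordinate are assumptions made for illustration, and the output uses the standard variableStep WIG layout.

```python
from collections import Counter


def count_five_prime_ends(fragments):
    """Count fragment 5' ends.

    fragments: iterable of (seq_id, start, end, strand) tuples,
    assumed to be 1-based and inclusive (an assumption of this sketch).
    """
    counts = Counter()
    for seq_id, start, end, strand in fragments:
        # On the plus strand the 5' end is the leftmost coordinate,
        # on the minus strand it is the rightmost one.
        pos = start if strand == "+" else end
        counts[(seq_id, strand, pos)] += 1
    return counts


def to_wig(counts, strand, track_name):
    """Render the counts for one strand as variableStep WIG text."""
    lines = [f"track type=wiggle_0 name={track_name}"]
    by_seq = {}
    for (seq_id, s, pos), n in counts.items():
        if s == strand:
            by_seq.setdefault(seq_id, []).append((pos, n))
    for seq_id in sorted(by_seq):
        lines.append(f"variableStep chrom={seq_id}")
        for pos, n in sorted(by_seq[seq_id]):
            lines.append(f"{pos} {n}")
    return "\n".join(lines) + "\n"


# Hypothetical fragments: two plus-strand fragments share a 5' end at 10.
fragments = [("chr", 10, 50, "+"), ("chr", 10, 80, "+"), ("chr", 30, 90, "-")]
counts = count_five_prime_ends(fragments)
wig_plus = to_wig(counts, "+", "example_plus")
```

Keeping the strand in the key lets the two strands be written to separate tracks, which is a design choice of this sketch rather than a documented property of the repository's WIG files.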

The "Scripts" directory contains the scripts for data processing. A "basic" set of scripts is common to all experiments:

  • raw_data_processing.sh - assesses read quality, removes adapters and discards low-quality reads.
    • Requirements:
      • fastqc, trimmomatic
  • read_mapping.sh - maps paired-end reads to the reference sequences. Since the SAM alignment files are quite large, the output is compressed with gzip.
    • Requirements:
      • bowtie2
  • return_fragment_coords_table.py - takes alignment files (in gzipped SAM format) as input, generates all tables stored in the "Results/Tables" directory (except "Merged_ends_counts"), and produces WIG files with 5' end coverage.
    • Requirements:
      • python3 with gzip and pandas modules
  • merge_ends_count_tables.py - combines 5' end count tables from different samples into one table.
    • Requirements:
      • python3 with pandas, gzip, re and functools modules
  • TCS_calling.R - performs a statistical test and produces a table with the position, logFC and p-value for each tested site.
    • Requirements:
      • R with dplyr, data.table, tidyr and edgeR packages
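
The merging step performed before the statistical test can be sketched in a few lines of Python. This is an illustration only, not the contents of merge_ends_count_tables.py (which uses pandas): the function name, the sample names, the column names and the choice to fill missing positions with zero are all assumptions of this sketch.

```python
def merge_end_counts(tables):
    """Combine per-sample 5' end counts into one table.

    tables: {sample_name: {(seq_id, strand, position): count}}
    Returns (header, rows); positions absent from a sample get a count of 0.
    """
    samples = sorted(tables)
    # Union of all positions observed in any sample.
    keys = sorted(set().union(*(t.keys() for t in tables.values())))
    header = ["SeqID", "Strand", "Position"] + samples
    rows = [list(key) + [tables[s].get(key, 0) for s in samples] for key in keys]
    return header, rows


# Hypothetical counts for one targeting and one nontargeting sample.
tables = {
    "targeting_1": {("chr", "+", 10): 5, ("chr", "+", 20): 1},
    "nontargeting_1": {("chr", "+", 10): 2},
}
header, rows = merge_end_counts(tables)
```

A table with one row per position and one column per sample is the usual input shape for count-based differential tests such as those in edgeR, which is presumably why the merged tables are produced before TCS_calling.R is run.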
