Description
Intron predictions from short-read RNA-Seq experiments
Methods
- The reads from 432 single-end and paired-end poly(A)+ short-read
RNA-Seq experiments were obtained from ENCODE. The selection
emphasized brain, testis, liver, and the H1 cell line to maximize transcript
diversity, as well as read lengths of at least 100 bp. The experiments included are listed below.
- Reads were aligned to the T2T CHM13 assembly using STAR 2.7.5 with
parameters derived from the ENCODE pipeline. The maximum number of
loci a read is allowed to map to was increased to 2048 to favor sensitivity
over specificity. The intention is to develop post-mapping filters to obtain
more specific mappings.
--outFilterMultimapNmax 2048 |
--alignSJoverhangMin 8 |
--alignSJDBoverhangMin 1 |
--outFilterMismatchNmax 999 |
--outFilterMismatchNoverReadLmax 0.04 |
--alignIntronMin 20 |
--alignIntronMax 1000000 |
--alignMatesGapMax 1000000 |
--outSAMunmapped Within |
--outFilterType BySJout |
--outSAMstrandField intronMotif (single ended) |
--sjdbScore 1 |
- Introns were called using intronProspector with --min-confidence-score=1.0.
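As a rough illustration of the alignment step, the sketch below assembles a STAR command line from the options listed above. The index directory, FASTQ file names, and output prefix are hypothetical placeholders, and the helper star_command is not part of any pipeline described here. Adding --outSAMstrandField intronMotif only for single-end input is one reading of the "(single ended)" note and should be treated as an assumption.

```python
import shlex

# Alignment options as listed in the Methods section.
STAR_OPTIONS = {
    "--outFilterMultimapNmax": "2048",
    "--alignSJoverhangMin": "8",
    "--alignSJDBoverhangMin": "1",
    "--outFilterMismatchNmax": "999",
    "--outFilterMismatchNoverReadLmax": "0.04",
    "--alignIntronMin": "20",
    "--alignIntronMax": "1000000",
    "--alignMatesGapMax": "1000000",
    "--outSAMunmapped": "Within",
    "--outFilterType": "BySJout",
    "--sjdbScore": "1",
}


def star_command(genome_dir, read_files, out_prefix):
    """Build a STAR argument list for one FASTQ (single-end) or two (paired-end)."""
    if len(read_files) not in (1, 2):
        raise ValueError("expected one or two read files")
    cmd = [
        "STAR",
        "--genomeDir", genome_dir,
        "--readFilesIn", *read_files,
        "--outFileNamePrefix", out_prefix,
    ]
    for option, value in STAR_OPTIONS.items():
        cmd += [option, value]
    if len(read_files) == 1:
        # Assumption: this option is applied to single-end runs only.
        cmd += ["--outSAMstrandField", "intronMotif"]
    return cmd


if __name__ == "__main__":
    # Hypothetical paths, for illustration only.
    print(shlex.join(star_command("chm13_star_index", ["sample_R1.fastq"], "sample.")))
```

Building the command as a list rather than a single string keeps each option and value as a separate argument, so the result can be passed to subprocess.run without shell quoting concerns.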
Display Conventions and Configuration
Each intron call is represented by a two-block record, with the two blocks
showing the longest exon overlap observed on each side of the intron. The name field contains the splice junction motif,
with known splice junction motifs capitalized. Items are colored by
intron type:
U2-type intron | green |
U12-type intron | blue |
unknown intron type | red |
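The two-block layout can be read back into intron coordinates with a few lines of code. The sketch below assumes the records are available as standard tab-separated BED12 lines and that the name field holds only the motif (for example GT/AG); both are assumptions about the file layout, and the helper names and the example record are invented for illustration.

```python
from typing import NamedTuple


class IntronCall(NamedTuple):
    chrom: str
    start: int         # 0-based position of the first intronic base
    end: int           # half-open end of the intron
    strand: str
    motif: str
    known_motif: bool  # True when the motif is written in upper case


def parse_intron_call(bed_line: str) -> IntronCall:
    """Derive the intron as the gap between the two blocks of a BED12 record."""
    fields = bed_line.rstrip("\n").split("\t")
    if len(fields) < 12 or int(fields[9]) != 2:
        raise ValueError("expected a two-block BED12 record")
    chrom_start = int(fields[1])
    block_sizes = [int(x) for x in fields[10].rstrip(",").split(",")]
    block_starts = [int(x) for x in fields[11].rstrip(",").split(",")]
    # The intron begins where the first block ends and stops where the second begins.
    intron_start = chrom_start + block_starts[0] + block_sizes[0]
    intron_end = chrom_start + block_starts[1]
    name = fields[3]
    return IntronCall(fields[0], intron_start, intron_end, fields[5], name, name.isupper())


if __name__ == "__main__":
    # Invented example record, not taken from the track.
    line = "chr1\t100\t250\tGT/AG\t12\t+\t100\t250\t0\t2\t20,30\t0,120"
    print(parse_intron_call(line))
```

The block sizes themselves (20 and 30 in the example) are the exon overlaps on each side and are not part of the intron interval.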
Data access
Release history
- 2020-09-01: initial t2tChm13_20200727 release
Contacts
Credits
- Mark Diekhans
- Ann Mc Cartney