Description

This track shows copy number estimates derived from short-read sequence data. Short reads were split into 36 bp non-overlapping fragments (assemblies were fragmented into 36 bp overlapping segments) and mapped with mrsFAST to a reference masked by TRF and RepeatMasker, allowing at most two substitution mismatches and no indels. Mapping counts were then aggregated into windows containing 500 bp of non-repeat-masked sequence, and these counts were corrected for GC bias per sample using windows known to be single copy in the genome. Finally, copy number was determined from the corrected mapping counts using a linear regression of read depth against copy number in control regions of known copy number. Regions are colored from cold to hot (0 - 120+), and the exact copy number can be found by clicking on the region of interest.
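The fragmentation and regression steps described above can be sketched as follows. This is a minimal illustration, not the pipeline's actual code: the function names are hypothetical, the regression is assumed to be an ordinary least-squares fit of copy number on GC-corrected read depth, and clamping negative estimates to zero is also an assumption.

```python
def fragment_sequence(seq, k=36, step=None):
    """Split a sequence into k-bp fragments.

    step=k (the default) gives non-overlapping fragments, as for reads;
    a smaller step gives overlapping fragments, as for assemblies.
    Trailing bases that do not fill a whole fragment are dropped.
    """
    if step is None:
        step = k
    return [seq[i:i + k] for i in range(0, len(seq) - k + 1, step)]


def fit_depth_to_copy(depths, copies):
    """Least-squares fit of copy = slope * depth + intercept.

    depths: corrected read depths in control regions (assumed input)
    copies: known copy numbers of those same regions
    """
    n = len(depths)
    mean_d = sum(depths) / n
    mean_c = sum(copies) / n
    sxx = sum((d - mean_d) ** 2 for d in depths)
    sxy = sum((d - mean_d) * (c - mean_c) for d, c in zip(depths, copies))
    slope = sxy / sxx
    intercept = mean_c - slope * mean_d
    return slope, intercept


def estimate_copy(depth, slope, intercept):
    """Predict copy number for a window; negative values are clamped to 0 (assumption)."""
    return max(0.0, slope * depth + intercept)
```

For example, fitting on control depths of 10, 20, and 30 with known copy numbers 1, 2, and 3 gives a slope near 0.1, so a window with corrected depth 40 would be estimated at roughly copy number 4.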

NOTE: Copy number is calculated in overlapping windows of 500 bp of unique sequence with a 100 bp slide. To reduce file size and redundancy, each window in the track shows only the 100 bp resulting from the slide.
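The windowing scheme in the note can be illustrated with a short sketch. The function name and the boolean mask input are hypothetical, and it is an assumption here that the displayed 100 bp segment is the first 100 unmasked bases of each window; the track's actual convention may differ.

```python
def sliding_windows(masked, window=500, slide=100):
    """Return (full_span, shown_span) pairs of half-open genomic coordinates.

    masked: list of booleans, True where the reference base is repeat-masked.
    Each window covers `window` unmasked bases and successive windows start
    `slide` unmasked bases apart, so the genomic span of a window can be
    longer than `window` when it crosses masked sequence.
    """
    unmasked = [i for i, is_masked in enumerate(masked) if not is_masked]
    spans = []
    for start in range(0, len(unmasked) - window + 1, slide):
        full = (unmasked[start], unmasked[start + window - 1] + 1)
        # Assumption: only the first `slide` unmasked bases are displayed.
        shown = (unmasked[start], unmasked[start + slide - 1] + 1)
        spans.append((full, shown))
    return spans
```

With 700 unmasked bases and no masking, this yields three windows whose full spans overlap by 400 bp while the displayed spans tile the sequence without overlap.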

Copy Number Key

Copy number    Color
0
1
2
3
4
5
6
7
8
9
10
20
30
40
50
60
70
80
90
100
110
120
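As a rough illustration of how an estimate could be matched to an entry in the key, the sketch below assigns a copy number to the highest key value that does not exceed it, with everything at or above 120 falling into the top bin. The binning rule and the names `KEY_BINS` and `key_bin` are assumptions for illustration; the track's actual color assignment is not specified here.

```python
from bisect import bisect_right

# Key values from the table above: 0 through 10 in steps of 1,
# then 20 through 120 in steps of 10.
KEY_BINS = list(range(0, 11)) + list(range(20, 121, 10))


def key_bin(copy_number):
    """Return the key entry for a copy number (assumed rule: round down to a key value)."""
    if copy_number < 0:
        raise ValueError("copy number must be non-negative")
    return KEY_BINS[bisect_right(KEY_BINS, copy_number) - 1]
```

Under this assumed rule, an estimate of 7.6 maps to the key entry 7, an estimate of 15 maps to 10, and an estimate of 500 maps to 120.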

Credits

Please feel free to contact William Harvey or Mitchell Vollger with any questions and/or concerns regarding this track.

References

Bailey JA, Gu Z, Clark RA, Reinert K, Samonte RV, Schwartz S, Adams MD, Myers EW, Li PW, Eichler EE. Recent segmental duplications in the human genome. Science. 2002

Hach F, Hormozdiari F, Alkan C, et al. mrsFAST: a cache-oblivious algorithm for short-read mapping. Nat Methods. 2010

Sudmant PH, Kitzman JO, Antonacci F, Alkan C, Malig M, Tsalenko A, et al. Diversity of human copy number. Science. 2010

Sudmant PH, Mallick S, Nelson BJ, Hormozdiari F, Krumm N, Huddleston J, et al. Global diversity, population stratification, and selection of human copy-number variation. Science. 2015