1.3 Representation of genomic data structures

In Chapter 4 we explore the limits of tidy data semantics by extending plyranges to analyse coverage estimated from RNA-seq data, using a new software tool called superintronic. We show that the long-form tidy representation is an effective way of combining the experimental design and reference annotations into a single genomic data structure for exploration. We use superintronic to develop a framework for discovering interesting regions of coverage and apply our approach to integrating intron signal from RNA-seq data. This chapter is based on my software and analysis contributions to S. Lee et al. (2020).