A.1 Ranges revisited

In Bioconductor there are two classes, IRanges and GRanges, that are standard data structures for representing genomics data. Throughout this document I use Ranges to refer to either class when an operation can be performed on both; otherwise I state explicitly whether a function applies to IRanges or to GRanges.

Ranges objects can represent either sets of integers, as IRanges (which have start, end and width attributes), or genomic intervals, as GRanges (which have two additional attributes: sequence name and strand). In addition, both types of Ranges can store information about their intervals as metadata columns (for example, GC content over a genomic interval).
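As a minimal sketch of the two classes, the following constructs a small IRanges and then a GRanges built on the same intervals. The coordinates, the sequence name "chr1", and the gc values are made up for illustration and do not come from a real annotation.

```r
library(IRanges)
library(GenomicRanges)

# Three integer intervals. Any two of start, end and width
# determine the third (end = start + width - 1).
ir <- IRanges(start = c(1, 10, 20), width = c(5, 4, 11))
end(ir)  # 5 13 30

# The same intervals placed on a sequence and given a strand.
# "chr1" is an illustrative sequence name.
gr <- GRanges(seqnames = "chr1", ranges = ir, strand = c("+", "-", "*"))

# Attach a metadata column; these GC values are invented for the example.
mcols(gr)$gc <- c(0.41, 0.55, 0.38)
gr
```

Printing gr shows the interval attributes on the left and the metadata column on the right of the separator.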

Ranges objects follow the tidy data principle: each row of a Ranges object corresponds to an interval, each column represents a variable about that interval, and each object generally represents a single unit of observation (such as gene annotations).
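One way to see the row-per-interval layout is to coerce a GRanges to a data frame. This is only a sketch with invented values; the gc column is an illustrative metadata variable.

```r
library(GenomicRanges)

# A small GRanges with one metadata column (values are invented).
gr <- GRanges(
  seqnames = c("chr1", "chr1", "chr2"),
  ranges = IRanges(start = c(1, 10, 20), width = c(5, 4, 11)),
  strand = c("+", "-", "*"),
  gc = c(0.41, 0.55, 0.38)
)

# Each interval becomes one row; each attribute or metadata
# variable becomes one column.
df <- as.data.frame(gr)
colnames(df)  # "seqnames" "start" "end" "width" "strand" "gc"
nrow(df) == length(gr)  # TRUE
```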

Consequently, Ranges objects provide a powerful representation for reasoning about genomic data. In this vignette, you will learn more about Ranges objects and how to perform common data tasks on them via grouping, restriction and aggregation.