A.1 Ranges revisited
In Bioconductor there are two classes, IRanges and GRanges, that are standard data structures for representing genomics data. Throughout this document I refer to either of these classes as Ranges when an operation can be performed on both; otherwise I state explicitly whether a function applies to an IRanges or a GRanges.
A Ranges object represents either sets of integers, as an IRanges (with start, end and width attributes), or genomic intervals, as a GRanges (which has two additional attributes, sequence name and strand). In addition, both types of Ranges can store information about their intervals as metadata columns (for example, GC content over a genomic interval).
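As a minimal sketch of these two classes, the following builds a small IRanges and wraps it in a GRanges using the constructors from the IRanges and GenomicRanges packages. The coordinates, sequence names and the gc column are made-up values chosen only for illustration.

```r
library(IRanges)
library(GenomicRanges)

# An IRanges holds integer intervals; width is derived from start and end.
ir <- IRanges(start = c(1, 10, 20), end = c(5, 14, 29))
start(ir)
end(ir)
width(ir)

# A GRanges adds a sequence name and a strand to each interval.
gr <- GRanges(
  seqnames = c("chr1", "chr1", "chr2"),
  ranges = ir,
  strand = c("+", "-", "*")
)

# Metadata columns sit alongside the intervals.
# The GC values here are invented for the example.
mcols(gr)$gc <- c(0.41, 0.55, 0.38)
gr
```

Printing gr shows the interval attributes on the left and the metadata column on the right, separated by a vertical bar.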
Ranges objects follow the tidy data principle: each row of a Ranges object corresponds to an interval, each column represents a variable about that interval, and each object generally represents a single unit of observation (such as gene annotations).
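One way to see this row-and-column layout is to coerce a GRanges to a data frame. The sketch below assumes a hypothetical set of three gene annotations; the coordinates and the gene_id values are invented for the example.

```r
library(GenomicRanges)

# A toy set of gene annotations: one interval per gene.
# Extra named arguments to GRanges() become metadata columns.
genes <- GRanges(
  seqnames = c("chr1", "chr1", "chr2"),
  ranges = IRanges(start = c(100, 500, 250), width = c(50, 120, 80)),
  strand = c("+", "-", "+"),
  gene_id = c("geneA", "geneB", "geneC")
)

# Coercing to a data frame gives one row per interval and
# one column per variable describing that interval.
df <- as.data.frame(genes)
nrow(df) == length(genes)
colnames(df)
```

The resulting data frame has the core interval variables (seqnames, start, end, width, strand) followed by the metadata column gene_id.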
Consequently, Ranges objects provide a powerful representation for reasoning about genomic data. In this vignette, you will learn more about Ranges objects and how to perform common data tasks on them via grouping, restriction and aggregation.