6.2 Further Work

A limitation of the grammar as we have implemented it in plyranges is its lack of scalability and computational speed for data sets that do not fit in memory. We attempted several techniques for performing delayed operations over range-based data; however, a more general approach that leverages existing Bioconductor frameworks and allows for data stored in the cloud or in scientific data formats like HDF5 would be useful. We showed in Chapters 3 and 4 that an analyst is able to perform complex data transformations and re-sampling procedures by casting results into GRanges objects. However, it is unclear whether the semantics of our grammar can be extended to data that cannot be efficiently reshaped into long-form tidy representations. Moreover, further work is required to explore the design space of grammars for data transformations and grammars for graphics when the data are large, multifaceted and non-rectangular.

We showed in Chapter 5 that tours provide a global overview that can be used as a tool for exploring model fits. An issue that arises is how to scale the tour as the number of observations increases. Latencies in sending data from the back end to the visualisation client cause lag during animation. One could also question whether point-based displays are appropriate in this case, and it would be worth exploring the usability of animations based on binning the projections. Moreover, when the number of observations is large, the points in the projections are concentrated in the centre of the tour display, obscuring interesting aspects of the data. This is mitigated by the ability to zoom, but further research into transforming the projections to avoid crowding would be valuable. Changes to the visual display add a further complexity, namely the design of user interactions, and several promising avenues based on section tours could be explored (Laa et al. 2020; Laa, Cook, and Valencia 2020).
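To make the binning idea concrete, the following is a minimal sketch in Python rather than an implementation used in this thesis. It assumes the data have already been centred and scaled so that most projected points fall in a square of half-width `half_range`, and the function names `project` and `bin_projection` are hypothetical. Each frame of the tour is reduced to a grid of counts, so the amount of data sent to the client depends on the number of bins rather than on the number of observations.

```python
from collections import Counter


def project(data, basis):
    """Project each row of `data` (n x p) onto a p x 2 basis.

    `basis` is a list of p rows [b1, b2]; it is assumed (not checked)
    to have orthonormal columns, as in a tour projection.
    """
    return [
        (
            sum(x * b[0] for x, b in zip(row, basis)),
            sum(x * b[1] for x, b in zip(row, basis)),
        )
        for row in data
    ]


def bin_projection(points, n_bins=20, half_range=1.0):
    """Count projected points in an n_bins x n_bins square grid.

    The grid covers [-half_range, half_range] in both directions.
    Points outside the grid are dropped; this is an assumption of the
    sketch, and a real display might clamp or rescale them instead.
    """
    width = 2 * half_range / n_bins
    counts = Counter()
    for x, y in points:
        if abs(x) > half_range or abs(y) > half_range:
            continue
        # Points on the upper edge are assigned to the last bin.
        i = min(int((x + half_range) / width), n_bins - 1)
        j = min(int((y + half_range) / width), n_bins - 1)
        counts[(i, j)] += 1
    return counts
```

Under these assumptions, a frame is described by at most `n_bins * n_bins` counts, which could then be drawn as a heatmap or as glyphs sized by count. Whether such a display preserves the structure that makes point-based tours useful remains an open question.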