Visualization Grammar¶
Genome browser applications typically couple the visual representations to specific file formats and provide few customization options. GenomeSpy has a more abstract approach to visualization, providing combinatorial building blocks such as marks, transformations, and scales. As a result, users can author tailored visualizations that display the underlying data more effectively.
The concept was first introduced in The Grammar of Graphics and developed further in ggplot2 and Vega-Lite.
A dialect of Vega-Lite
The visualization grammar of GenomeSpy is a dialect of Vega-Lite, providing partial compatibility. However, the goals of GenomeSpy and Vega-Lite are different – GenomeSpy is more domain-specific and primarily intended for the visualization and analysis of large datasets containing genomic coordinates. Nevertheless, GenomeSpy tries to follow Vega-Lite's grammar where practical, and thus, this documentation has several references to its documentation.
A single view specification¶
Each view specification must have at least the data
to be visualized, the
mark
that will represent the data items, and an encoding
that specifies how
the fields of data are mapped to the visual channels of the mark. In addition,
an optional transform
steps allow for modifying the data before they are
encoded into mark instances.
{
"data": { "url": "sincos.csv" },
"transform": [
{ "type": "formula", "expr": "abs(datum.sin)", "as": "abs(sin)" }
],
"mark": "point",
"encoding": {
"x": { "field": "x", "type": "quantitative" },
"y": { "field": "abs(sin)", "type": "quantitative" },
"size": { "field": "x", "type": "quantitative" }
}
}
Properties¶
data
- Specifies a data source. If omitted, the data source is inherited from the parent view.
transform
- An array of transformations applied to the data before visual encoding.
mark
- The graphical mark presenting the data objects.
encoding
- Specifies how data is encoded using the visual channels.
name
- An internal name that can be used for referring the view. For referencing purposes, the name should be unique within the whole view hierarchy.
width
- Width of the view. Check child sizing for details.
height
- Height of the view. Check child sizing for details.
view
- View background. An object with the following
"rect"
mark's properties:fill
,stroke
,strokeWidth
,fillOpacity
,strokeOpacity
, andborderRadius
. padding
- Padding applied to the view. Accepts either a number reprenting pixels or a
PaddingConfig. Example:
padding: { top: 10, right: 20, bottom: 10, left: 20 }
title
- View title. Accepts a string or a
title specification
object. N.B.: Currently, GenomeSpy doesn't do bound calculation, and you need to
manually specify proper
padding
for the view to ensure that the title is visible. description
- A description of the view. Can be used for documentation. The description of the top-level view is shown in the toolbar of the GenomeSpy app.
baseUrl
- The base URL for relative URL data sources. The base URLs are inherited in the view hierarchy unless overridden with this property. By default, the top-level view's base URL equals to the visualization specification's base URL.
opacity
- Configures a static or dynamic opacity of the view. The latter enables semantic zooming. TODO: Elaborate.
visible
- The default visibility of the view. An invisible view is removed from the layout and not rendered. For context, see toggleable view visibility.
View composition for more complex visualizations¶
View composition allows for building more complex
visualizations from multiple single-view specifications. For example, the
layer
operator allows creation of custom glyphs and
the concatenation operators enables stacked layouts
resembling genome browsers with multiple tracks.