Visualizing Sample Collections¶
Developer Documentation
This page is intended for users who develop tailored visualizations using the GenomeSpy app.
Getting started¶
You can use the following HTML template to create a web page for your
visualization. The template loads the app from a content delivery network
and the visualization specification from a separate spec.json file placed
in the same directory. See the getting started
page for more information.
<!DOCTYPE html>
<html>
  <head>
    <title>GenomeSpy</title>
    <link
      rel="stylesheet"
      type="text/css"
      href="https://cdn.jsdelivr.net/npm/@genome-spy/app@0.51.x/dist/style.css"
    />
  </head>
  <body>
    <script
      type="text/javascript"
      src="https://cdn.jsdelivr.net/npm/@genome-spy/app@0.51.x"
    ></script>
    <script>
      genomeSpyApp.embed(document.body, "spec.json", {
        // Show the dataflow inspector button in the toolbar (default: true)
        showInspectorButton: true,
      });
    </script>
  </body>
</html>
For a complete example, check the website-examples repository on GitHub.
Specifying a Sample View¶
The GenomeSpy app extends the core library with a new view composition operator that allows visualization of multiple samples. In this context, a sample means a set of data objects representing an organism, a piece of tissue, a cell line, a single cell, etc. Each sample gets its own track in the visualization, and the behavior resembles the facet operator of Vega-Lite. However, there are subtle differences in the behavior.
A sample view is defined by the samples and spec properties. To assign a
track for a data object, define a sample-identifier field using the sample
channel. More complex visualizations can be created using the
layer operator. Each composed view may have
a different data source, enabling concurrent visualization of multiple data
types. For instance, the bottom layer could display segmented copy-number data,
while the top layer might show single-nucleotide variants.
{
  "samples": {
    // Optional sample identifiers and metadata
    ...
  },
  "spec": {
    // A single or layer specification
    ...,
    "encoding": {
      ...,
      // The sample channel identifies the track
      "sample": {
        "field": "sampleId"
      }
    }
  }
}
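As a rough illustration of the layered case described above, the following sketch builds a sample view with a copy-number layer and a variant layer as a JavaScript object and serializes it to JSON. The file names (segments.tsv, variants.tsv) and all field names are hypothetical placeholders, and the shared sample encoding is assumed to be inherited by both layers.

```javascript
// Hypothetical file names and field names; only the overall shape
// (samples + spec + sample channel) comes from the text above.
const sampleView = {
  samples: {},
  spec: {
    layer: [
      {
        // Bottom layer: segmented copy-number data
        data: { url: "segments.tsv" },
        mark: "rect",
        encoding: {
          x: { chrom: "chrom", pos: "start", type: "locus" },
          x2: { chrom: "chrom", pos: "end" },
          color: { field: "logR", type: "quantitative" },
        },
      },
      {
        // Top layer: single-nucleotide variants
        data: { url: "variants.tsv" },
        mark: "point",
        encoding: {
          x: { chrom: "chrom", pos: "pos", type: "locus" },
        },
      },
    ],
    // Assumed to be inherited by both layers: the sample channel
    // assigns each datum to a track
    encoding: {
      sample: { field: "sampleId" },
    },
  },
};

// Serialize to the JSON that would be saved as spec.json
const specJson = JSON.stringify(sampleView, null, 2);
console.log(specJson);
```

Each layer has its own data source here, so both files need a sampleId column for their data objects to land on the right track.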
Y axis ticks
Y axis ticks are currently not available in sample views; this will be fixed at a later time. However, they would not be particularly practical with a high number of samples.
But we have Band scale?
Superficially similar results can be achieved by using the
"band" scale on the y channel. However, you cannot
adjust the intra-band y-position, as the y channel is already reserved for
assigning a band for a datum. On the other hand, with the band scale, the
graphical marks can span multiple bands. You could, for example, draw lines
between the bands.
Implicit sample identifiers¶
By default, the identifiers of the samples are extracted from the data, and each sample gets its own track.
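Conceptually, implicit extraction amounts to collecting the distinct values of the field bound to the sample channel. The sketch below is only an illustration of that idea, not GenomeSpy's actual implementation; the function name extractSampleIds and the ordering by first appearance are assumptions.

```javascript
// Illustrative only: collect distinct sample ids from the data,
// in order of first appearance (the ordering is an assumption).
function extractSampleIds(data, field) {
  return [...new Set(data.map((datum) => datum[field]))];
}

const data = [
  { sampleId: "S1", start: 100 },
  { sampleId: "S2", start: 150 },
  { sampleId: "S1", start: 300 },
];

// One track per distinct identifier
const sampleIds = extractSampleIds(data, "sampleId");
console.log(sampleIds);
```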
Defining samples (minimal example)¶
For most sample-collection visualizations, define:
- sample identity (identity)
- one or more metadata sources (metadataSources)
{
  "samples": {
    "identity": {
      "data": { "url": "samples.tsv" },
      "idField": "sample",
      "displayNameField": "displayName"
    },
    "metadataSources": [
      {
        "id": "clinical",
        "name": "Clinical metadata",
        "initialLoad": "*",
        "excludeColumns": ["sample", "displayName"],
        "backend": {
          "backend": "data",
          "data": { "url": "samples.tsv" },
          "sampleIdField": "sample"
        }
      }
    ]
  },
  ...
}
This configuration reads sample ids and display names from samples.tsv, then
loads metadata columns from the same file at startup. The sample and
displayName columns are excluded from metadata import, so only actual metadata
attributes are shown in the metadata panel.
For advanced metadata-source configuration (for example, lazy Zarr-backed imports), see Configuring Metadata Sources.
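To make the effect of excludeColumns in the minimal example concrete, the sketch below splits the rows of a sample table into identity fields and metadata attributes. It is a simplified model of the behavior described above, not the app's code; the function name splitSampleTable and the example columns (age, stage) are hypothetical.

```javascript
// Simplified model: separate identity fields from metadata attributes.
function splitSampleTable(rows, { idField, displayNameField, excludeColumns }) {
  const excluded = new Set(excludeColumns);
  return rows.map((row) => ({
    id: row[idField],
    displayName: row[displayNameField],
    // Keep only the columns that are not excluded from metadata import
    attributes: Object.fromEntries(
      Object.entries(row).filter(([column]) => !excluded.has(column))
    ),
  }));
}

// Hypothetical contents of samples.tsv after parsing
const rows = [
  { sample: "S1", displayName: "Patient 1", age: 61, stage: "III" },
  { sample: "S2", displayName: "Patient 2", age: 54, stage: "IV" },
];

const samples = splitSampleTable(rows, {
  idField: "sample",
  displayNameField: "displayName",
  excludeColumns: ["sample", "displayName"],
});
console.log(samples);
```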
Adjusting font sizes, etc.¶
The samples object can also be used to adjust the font sizes, etc. of the
metadata attributes. For example, to increase the font sizes of the sample and
attribute labels, use the following configuration:
{
  "samples": {
    ...,
    "labelFontSize": 12,
    "attributeLabelFontSize": 10
  },
  ...
}
The following properties allow for fine-grained control of the font styles:
labelFont
Type: string
The font typeface. GenomeSpy uses SDF versions of Google Fonts. Check their availability at the A-Frame Fonts repository. System fonts are not supported.
Default value: "Lato"
labelFontSize
Type: number
The font size in pixels.
Default value: 11
labelFontWeight
Type: number | "thin" | "light" | "regular" | "normal" | "medium" | "bold" | "black"
The font weight. The following strings and numbers are valid values: "thin" (100), "light" (300), "regular" (400), "normal" (400), "medium" (500), "bold" (700), "black" (900)
Default value: "regular"
labelFontStyle
Type: "normal" | "italic"
The font style. Valid values: "normal" and "italic".
Default value: "normal"
labelAlign
Type: "left" | "center" | "right"
The horizontal alignment of the text. One of "left", "center", or "right".
Default value: "left"
attributeLabelFont
Type: string
The font typeface. GenomeSpy uses SDF versions of Google Fonts. Check their availability at the A-Frame Fonts repository. System fonts are not supported.
Default value: "Lato"
attributeLabelFontSize
Type: number
The font size in pixels.
Default value: 11
attributeLabelFontWeight
Type: number | "thin" | "light" | "regular" | "normal" | "medium" | "bold" | "black"
The font weight. The following strings and numbers are valid values: "thin" (100), "light" (300), "regular" (400), "normal" (400), "medium" (500), "bold" (700), "black" (900)
Default value: "regular"
attributeLabelFontStyle
Type: "normal" | "italic"
The font style. Valid values: "normal" and "italic".
Default value: "normal"
In addition, the following properties are supported:
labelTitleText
Type: string
Text in the label title
Default: "Sample name"
labelLength
Type: number
How much space in pixels to reserve for the labels.
Default: 140
attributeSize
Type: number
Default size (width) of the metadata attribute columns. Can be configured per attribute using the attributes property.
Default value: 10
attributeLabelAngle
Type: number
Angle to be added to the default label angle (-90).
Default value: 0
attributeSpacing
Type: number
Spacing between attribute columns in pixels.
Default value: 1
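The font weight properties listed above accept either a number or one of seven names. The following sketch shows how those names correspond to numeric weights, using the values given in the property descriptions. The helper resolveFontWeight is hypothetical and only illustrates the mapping.

```javascript
// Name-to-number mapping taken from the property descriptions above
const FONT_WEIGHTS = new Map([
  ["thin", 100],
  ["light", 300],
  ["regular", 400],
  ["normal", 400],
  ["medium", 500],
  ["bold", 700],
  ["black", 900],
]);

// Hypothetical helper: normalize a weight to its numeric value
function resolveFontWeight(weight) {
  if (typeof weight === "number") {
    return weight;
  }
  if (!FONT_WEIGHTS.has(weight)) {
    throw new Error(`Unknown font weight: ${weight}`);
  }
  return FONT_WEIGHTS.get(weight);
}

console.log(resolveFontWeight("bold"));
console.log(resolveFontWeight(300));
```

Note that "regular" and "normal" are synonyms for the same weight.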
Handling variable sample heights¶
The height of a single sample depends on the number of samples and the height of the sample view. Moreover, the end user can toggle between a bird's eye view and a close-up view, which makes the height highly dynamic.
To adapt the maximum size of "point" marks to the
height of the samples, you need to specify a dynamic
scale range for the size channel. The following example
demonstrates how to use expressions and the
height parameter to adjust the point size:
"encoding": {
  "size": {
    "field": "VAF",
    "type": "quantitative",
    "scale": {
      "domain": [0, 1],
      "range": [
        { "expr": "0" },
        { "expr": "pow(clamp(height * 0.65, 2, 18), 2)" }
      ]
    }
  },
  ...
}
In this example, the height parameter, provided by the sample view, contains
the height of a single sample. By multiplying it with 0.65, the points get
some padding at the top and bottom. To prevent the points from becoming too
small or excessively large, the clamp function is used to limit the point's
diameter to a minimum of 2 and a maximum of 18 pixels. As the size channel
encodes the area, not the diameter of the points, the pow function is used
to square the value. The technique shown here is used in the
PARPiCL example.
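To see what the range expression evaluates to for different sample heights, the same arithmetic can be reproduced in plain JavaScript. This is only a re-implementation of the formula for illustration; clamp is defined locally here, and maxPointSize is a hypothetical name.

```javascript
// Local stand-in for the clamp function used in the expression
const clamp = (value, min, max) => Math.min(Math.max(value, min), max);

// Upper end of the size range: squared, clamped diameter
function maxPointSize(height) {
  const diameter = clamp(height * 0.65, 2, 18);
  return Math.pow(diameter, 2);
}

// Very short, medium, and tall samples (heights in pixels)
for (const height of [2, 20, 40]) {
  console.log(height, maxPointSize(height));
}
```

With a sample height of 2, the diameter is clamped up to 2 (size 4); with 20, it is about 13 (size about 169); with 40, it is clamped down to 18 (size 324).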
Aggregation¶
TODO
Bookmarking¶
With the GenomeSpy app, users can save the current visualization state,
including scale domains and view visibilities, as bookmarks. These bookmarks are
stored in the
IndexedDB
of the user's web browser. Each database is unique to an
origin, which
typically refers to the hostname and domain of the web server hosting the
visualization. Since the server may host multiple visualizations, each
visualization must have a unique ID assigned to it. To enable bookmarking,
simply add the specId property with an arbitrary but unique string value to
the top-level view. Example:
{
  "specId": "My example visualization",
  "vconcat": { ... },
  ...
}
Pre-defined bookmarks and bookmark tour¶
You may want to provide users with a few pre-defined bookmarks that showcase interesting findings from the data. Since bookmarks support Markdown-formatted notes, you can also explain the implications of the findings and present essential background information.
The remote bookmarks feature allows for storing bookmarks in a JSON file on a
web server and provides them to users through the bookmark menu. In addition,
you can optionally enable the tour function, which automatically opens the
first bookmark in the file and allows the user to navigate the tour using
previous/next buttons.
Enabling remote bookmarks¶
{
  "bookmarks": {
    "remote": {
      "url": "tour.json",
      "tour": true
    }
  },
  "vconcat": { ... },
  ...
}
The remote object accepts the following properties:
url (Required)
Type: string
URL to the remote bookmark file.
initialBookmark
Type: string
Name of the bookmark that should be loaded as the initial state. The bookmark description dialog is shown only if the tour property is set to true.
tour
Type: boolean
Should the user be shown a tour of the remote bookmarks when the visualization is launched? If the initialBookmark property is not defined, the tour starts from the first bookmark.
Default value: false
afterTourBookmark
Type: string
Name of the bookmark that should be loaded when the user ends the tour. If null, the dialog will be closed and the current state is retained. If undefined, the default state without any performed actions will be loaded.
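The interplay of these properties can be summarized as a small decision sketch. It encodes one reading of the descriptions above and is not the app's implementation; the function names and the returned object shapes are hypothetical. In a JSON spec, "undefined" simply means that the property is omitted.

```javascript
// Which bookmark is loaded at launch, and is its description dialog shown?
function initialBookmarkState(remote, bookmarks) {
  const tour = remote.tour ?? false; // documented default: false
  if (remote.initialBookmark !== undefined) {
    return { bookmark: remote.initialBookmark, showDialog: tour };
  }
  if (tour && bookmarks.length > 0) {
    // Without initialBookmark, the tour starts from the first bookmark
    return { bookmark: bookmarks[0].name, showDialog: true };
  }
  return { bookmark: undefined, showDialog: false };
}

// What happens when the user ends the tour?
function afterTourAction(remote) {
  if (remote.afterTourBookmark === null) {
    return { type: "keepCurrentState" };
  }
  if (remote.afterTourBookmark === undefined) {
    return { type: "restoreDefaultState" };
  }
  return { type: "loadBookmark", name: remote.afterTourBookmark };
}

const bookmarks = [{ name: "First bookmark" }, { name: "Second bookmark" }];
console.log(initialBookmarkState({ url: "tour.json", tour: true }, bookmarks));
console.log(afterTourAction({ url: "tour.json", afterTourBookmark: null }));
```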
The bookmark file¶
The remote bookmark file consists of an array of bookmark objects. The easiest way to create such bookmark objects is to create a bookmark in the app and choose Share from the submenu of the bookmark item. The sharing dialog provides the bookmark both in a URL-encoded format and as a JSON object. Copy and paste the JSON object into the bookmark file to make it available to all users. A simplified example:
[
  {
    "name": "First bookmark",
    "actions": [ ... ],
    ...
  },
  {
    "name": "Second bookmark",
    "actions": [ ... ],
    ...
  }
]
Providing the user with an initial state
If you want to provide the user with an initial state comprising specific
actions performed on the samples, a particular visible genomic region, etc.,
you can create a bookmark with the desired settings and set the
initialBookmark property to the bookmark's name. See the documentation
above for details.
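Because initialBookmark refers to a bookmark by name, a typo in the name is easy to miss. As a hedged suggestion, a small check like the one below could be run against the bookmark file before publishing. The helper findMissingBookmarks is hypothetical and not part of the app.

```javascript
// Report name-valued properties that do not match any bookmark in the file
function findMissingBookmarks(remote, bookmarks) {
  const names = new Set(bookmarks.map((bookmark) => bookmark.name));
  return ["initialBookmark", "afterTourBookmark"]
    .filter((property) => typeof remote[property] === "string")
    .filter((property) => !names.has(remote[property]))
    .map((property) => ({ property, name: remote[property] }));
}

// Parsed contents of a bookmark file (actions omitted for brevity)
const bookmarkFile = [{ name: "First bookmark" }, { name: "Second bookmark" }];

const problems = findMissingBookmarks(
  { url: "tour.json", initialBookmark: "Frist bookmark" },
  bookmarkFile
);
console.log(problems);
```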
Toggleable View Visibility¶
When working with a complex visualization that includes multiple tracks and extensive metadata, it may not always be necessary to display all views simultaneously. The GenomeSpy app offers users the ability to toggle the visibility of nodes within the view hierarchy. This visibility state is also included in shareable links and bookmarks, allowing users to easily access their preferred configurations.
Views have two properties for controlling the visibility:
visible
Type: boolean
The default visibility of the view. An invisible view is removed from the layout and not rendered. For context, see toggleable view visibility.
Default: true
configurableVisibility
Type: boolean | AppVisibilityGroupSpec
Whether the visibility is configurable from the GenomeSpy App view visibility menu.
Configurability requires an explicit view name that is unique in its import scope.
Set to an object with group to make configurable views mutually exclusive in the menu (radio buttons) within the same import scope.
Default value: false for children of layer, true for others
Use object-form configurableVisibility to make views mutually exclusive in the
menu. Views that share the same group in the same import scope are shown as
radio buttons:
{
  "name": "rawCoverage",
  "configurableVisibility": { "group": "coverageMode" },
  ...
}
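The radio-button behavior of grouped views can be modeled as follows. This is a conceptual sketch rather than the app's code: the function showExclusive and the view name smoothedCoverage are hypothetical, and import scopes are ignored for simplicity.

```javascript
// Make one view visible and hide the other members of its group
function showExclusive(views, name) {
  const target = views.find((view) => view.name === name);
  if (!target) {
    throw new Error(`No view named ${name}`);
  }
  const group = target.configurableVisibility?.group;
  return views.map((view) => {
    if (view.name === name) {
      return { ...view, visible: true };
    }
    const sameGroup =
      group !== undefined && view.configurableVisibility?.group === group;
    return sameGroup ? { ...view, visible: false } : view;
  });
}

const views = [
  { name: "rawCoverage", visible: true, configurableVisibility: { group: "coverageMode" } },
  { name: "smoothedCoverage", visible: false, configurableVisibility: { group: "coverageMode" } },
  { name: "genes", visible: true, configurableVisibility: true },
];

const updated = showExclusive(views, "smoothedCoverage");
console.log(updated.map((view) => [view.name, view.visible]));
```

Views outside the group, such as the hypothetical genes track here, keep their visibility.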
Search¶
The location/search field in the toolbar allows users to quickly navigate to
features in the data. To make features searchable, use the search channel
on marks that represent the searchable data objects.
search accepts either a single field definition or an array of field
definitions. When multiple fields are provided, a datum matches if any of the
fields matches the entered term (case-insensitive exact match).
Examples:
{
  ...,
  "mark": "rect",
  "encoding": {
    "search": {
      "field": "geneSymbol"
    },
    ...,
  },
  ...
}

{
  ...,
  "mark": "rect",
  "encoding": {
    "search": [
      { "field": "geneSymbol" },
      { "field": "geneId" },
      { "field": "alias" }
    ],
    ...,
  },
  ...
}
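The matching rule described above (any field, case-insensitive, exact) can be sketched as a small predicate. This is an illustration of the rule only, not the app's search implementation; the function matchesSearch and the example datum are hypothetical.

```javascript
// True if any of the search fields equals the term, ignoring case
function matchesSearch(datum, search, term) {
  const fieldDefs = Array.isArray(search) ? search : [search];
  const needle = term.toLowerCase();
  return fieldDefs.some((def) => {
    const value = datum[def.field];
    return value != null && String(value).toLowerCase() === needle;
  });
}

const search = [{ field: "geneSymbol" }, { field: "geneId" }, { field: "alias" }];
const datum = { geneSymbol: "TP53", geneId: "ENSG00000141510", alias: "p53" };

console.log(matchesSearch(datum, search, "tp53"));
console.log(matchesSearch(datum, search, "P53"));
console.log(matchesSearch(datum, search, "TP5"));
```

The last call does not match, because a prefix is not an exact match.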
A practical example¶
Work in progress
This part of the documentation is still under construction. For a live example, check the PARPiCL visualization, which is also available for interactive exploration.