
GenomeSpy

A visualization grammar and a GPU-accelerated rendering engine for genomic (and other) data.

Use GenomeSpy to make your own visualizations!

GenomeSpy builds upon the concepts originally introduced in The Grammar of Graphics and later implemented in ggplot2 and Vega-Lite. The building blocks that GenomeSpy provides allow users to build tailored, interactive genomic visualizations, which can be embedded on web pages or within JavaScript applications. The carefully crafted GPU-accelerated rendering engine guarantees smoothly animated interactions and a pleasant user experience for end users. Scroll down for live examples.

The Building Blocks

Your data: The currently supported formats are CSV, TSV, JSON, FASTA, indexed FASTA, BigWig, BigBed, and GFF3.
Transformations: Filter and derive data, or perform computations such as pileup or coverage.
Scales: Make the data dimensions suitable for visual representation.
Graphical marks: Use the point mark for a scatter plot or mutations; adapt the rect mark for a bar chart or genomic segments.
Visual channels: Map the scale-transformed data to the properties of the marks, for example position, size, color, and symbol.
View composition: Combine multiple views, optionally sharing data and scales. Concatenate, layer, and facet.
View specification: Put everything together using the grammar. GenomeSpy's visualization grammar is heavily inspired by Vega-Lite, extending it with functionality often needed with genomic data.
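
To make the building blocks concrete, here is a minimal sketch of how they might come together in a single view specification, written as a TypeScript object literal. It follows the Vega-Lite-style structure that GenomeSpy's grammar is inspired by. The type definitions, the variable names (`barData`, `barChartSpec`), and the data values are assumptions made for illustration, and the exact property names should be checked against the GenomeSpy documentation before use.

```typescript
// Illustrative types for this sketch only. They are NOT GenomeSpy's official
// typings, which are considerably richer.
type Encoding = Record<string, Record<string, unknown>>;

interface UnitSpec {
  mark: string | { type: string; [prop: string]: unknown };
  encoding?: Encoding;
}

interface LayerSpec {
  data: { values: Array<Record<string, unknown>> } | { url: string };
  transform?: Array<{ type: string; [param: string]: unknown }>;
  encoding?: Encoding;
  layer: UnitSpec[];
}

// data: inline values (hypothetical numbers, not from any real dataset)
const barData = [
  { category: "A", count: 4 },
  { category: "B", count: 0 },
  { category: "C", count: 7 },
];

const barChartSpec: LayerSpec = {
  data: { values: barData },

  // transform: drop empty categories before they reach the marks
  transform: [{ type: "filter", expr: "datum.count > 0" }],

  // channels: encodings declared here are shared by both layers
  encoding: {
    x: { field: "category", type: "nominal" },
    // scale: an explicit domain for the quantitative axis
    y: { field: "count", type: "quantitative", scale: { domain: [0, 10] } },
  },

  // view composition: layer a text mark on top of a rect mark
  layer: [
    { mark: "rect" },
    { mark: "text", encoding: { text: { field: "count" } } },
  ],
};

// A spec is plain data, so it can be serialized to JSON and stored or shared.
const barChartJson: string = JSON.stringify(barChartSpec, null, 2);
console.log(barChartJson);
```

Because the specification is plain JSON-compatible data, it can be generated programmatically, stored in a file, or handed to the library's embedding function in a browser. See the GenomeSpy documentation for the exact embedding call.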

Resources

Publications

Abstract example: Using rect and text marks to specify a labeled bar chart.

Abstract example: Using rect and text marks to make a labeled heatmap. The labels are automatically scaled to fit the cells. Try to zoom in and out!

A scatter plot with one and a half million points decorated with some annotations visualizes a miserably failed t-SNE attempt.

A Manhattan plot for a genome-wide association study (GWAS).

Multiple sequence alignment. Loads data from a FASTA file and displays it as a scrollable heatmap and a sequence logo.

A structural variation visualization that uses the link mark to show pretty arcs connecting the breakpoints. There's also some segmented copy-number data.

GC content of the human genome: One dataset, two visual representations. The data are loaded lazily from a BigWig file and the scale domains are autoscaled to accommodate the region.

Using lazy data loading, data transformations, and multiple layers to visualize the GENCODE gene annotation stored in a hierarchical GFF3 file.

An Observable notebook describing how to replicate ASCAT's copy-number segmentation visualization. The visualization is interactive and thoroughly commented.

Exploring a sample collection with the GenomeSpy App. The visualization shows several cell-line samples with segmented copy numbers, loss of heterozygosity, and SNPs and INDELs.

GenomeSpy in action: Lahtinen, A., Lavikka, K., et al. (2023) Evolutionary states and trajectories characterized by distinct pathways stratify patients with ovarian high grade serous carcinoma.

SegmentModel Spy. Visualize GATK's copy-number segment models together with read and allelic counts. An example of using GenomeSpy as a visualization library in a special-purpose web application.

Copyright © 2019-2024 Kari Lavikka

GenomeSpy is developed in The Systems Biology of Drug Resistance in Cancer group at the University of Helsinki.

This project has received funding from the European Union's Horizon 2020 Research and Innovation Programme under Grant agreement No. 965193 (DECIDER) and No. 847912 (RESCUER).