Scale

Scales are functions that map abstract data values (e.g., a type of a point mutation) to visual values (e.g., colors that indicate the type).

By default, GenomeSpy configures scales automatically based on the data type (e.g., "ordinal"), the visual channel, and the data domain. As the defaults may not always be optimal, the scales can be configured explicitly.

Scale defaults can also be configured globally using config.scale and config.range. For example, color defaults by data type can be set with nominalColorScheme, ordinalColorScheme, and quantitativeColorScheme. See Config, Themes, and Styles.
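As a rough sketch, a global color default might be set as follows. The placement of nominalColorScheme directly under config.scale and the scheme name "category10" are assumptions made for illustration only; Config, Themes, and Styles remains the authoritative reference for the actual structure.

```json
{
  "config": {
    "scale": {
      "nominalColorScheme": "category10"
    }
  }
}
```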

Specifying a scale for a channel

{
  "encoding": {
    "y": {
      "field": "impact",
      "type": "quantitative",
      "scale": {
        "type": "linear",
        "domain": [0, 1]
      }
    }
  },
  ...
}

In composed views, shared scales can also be configured at the view level. See Shared scales in composed views.

Vega-Lite scales

GenomeSpy implements most of Vega-Lite's scale types. The aim is to replicate their behavior exactly unless stated otherwise. Although that goal has not yet been fully reached, Vega-Lite's scale documentation generally applies to GenomeSpy as well.

The supported scales are: "linear", "pow", "sqrt", "symlog", "log", "ordinal", "band", "point", "quantize", and "threshold". Disabling the scale is also supported on quantitative channels such as x and opacity.

Currently, the following scales are not supported: "time", "utc", "quantile", "bin-linear", "bin-ordinal".

Relation to Vega scales

In fact, GenomeSpy uses Vega scales, which are based on d3-scale. However, GenomeSpy has GPU-based implementations for the actual scale transformations, ensuring high rendering performance.

GenomeSpy-specific scales

GenomeSpy provides two additional scales that are designed for molecular sequence data.

Index scale

The "index" scale allows mapping index-based values such as nucleotide or amino-acid locations to positional visual channels. It has traits from both the continuous "linear" and the discrete "band" scale. It is linear and zoomable but maps indices to the range like the band scale does – each index has its own band. Properties such as padding work just as in the band scale.

The indices must be zero-based, i.e., the counting must start from zero. The numbering of the axis labels can be adjusted to give an impression of, for example, one-based indexing.

The index scale is used by default when the field type is "index".

User-facing two-point domains on index scales are inclusive. For example, "domain": [2, 4] covers the indices 2, 3, and 4. Domains inferred from observed data also include the last observed index.
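A minimal sketch of an explicit inclusive domain, reusing the point data from the other examples in this section. With "domain": [2, 4], the configured domain spans the bands of the indices 2, 3, and 4. The choice of the "rect" mark is arbitrary, and the handling of data outside the domain is not shown here.

```json
{
  "data": { "values": [0, 2, 4, 7, 8, 10, 12] },

  "mark": "rect",

  "encoding": {
    "x": {
      "field": "data",
      "type": "index",
      "scale": { "domain": [2, 4] }
    }
  }
}
```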

Point indices

When only the primary positional channel is defined, marks such as "rect" fill the whole band.

{
  "description": "Index scale example with band-filling marks.",

  "data": { "values": [0, 2, 4, 7, 8, 10, 12] },

  "encoding": {
    "x": { "field": "data", "type": "index" }
  },

  "layer": [
    {
      "mark": "rect",
      "encoding": {
        "color": { "field": "data", "type": "nominal" }
      }
    },
    {
      "mark": "text",
      "encoding": {
        "text": { "field": "data" }
      }
    }
  ]
}

Marks such as "point" that do not support the secondary positional channel are centered.

{
  "description": "Index scale example with centered point marks.",

  "data": { "values": [0, 2, 4, 7, 8, 10, 12] },

  "mark": "point",

  "encoding": {
    "x": { "field": "data", "type": "index" },
    "color": { "field": "data", "type": "nominal" },
    "size": { "value": 300 }
  }
}

Segment indices

When the index scale is used with segments, e.g., a "rect" mark that has both the x and x2 channels defined, the ranges must be half open. For example, if a segment should cover the indices 2, 3, and 4, a half-open range would be defined as: x = 2 (inclusive), x2 = 5 (exclusive).

Thus, scale.domain uses inclusive bounds, whereas ranged mark encodings such as x/x2 use half-open interval edges directly.

{
  "description": "Index scale example with half-open ranges.",

  "data": {
    "values": [
      { "from": 0, "to": 2 },
      { "from": 2, "to": 5 },
      { "from": 8, "to": 9 },
      { "from": 10, "to": 13 }
    ]
  },

  "encoding": {
    "x": { "field": "from", "type": "index" },
    "x2": { "field": "to" }
  },

  "layer": [
    {
      "mark": "rect",
      "encoding": {
        "color": { "field": "from", "type": "nominal" }
      }
    },
    {
      "mark": "text",
      "encoding": {
        "text": { "expr": "'[' + datum.from + ', ' + datum.to + ')'" }
      }
    }
  ]
}

Adjusting the indexing of axis labels

The index scale expects zero-based indexing. However, it may be desirable to display the axis labels using one-based indexing. Use the numberingOffset property to adjust the label indices.

{
  "description": "Index scale example with one-based axis numbering.",

  "data": { "values": [0, 2, 4, 7, 8, 10, 12] },

  "encoding": {
    "x": {
      "field": "data",
      "type": "index",
      "scale": { "numberingOffset": 1 }
    }
  },

  "layer": [
    {
      "mark": "rect",
      "encoding": {
        "color": { "field": "data", "type": "nominal" }
      }
    },
    {
      "mark": "text",
      "encoding": {
        "text": { "field": "data" }
      }
    }
  ]
}

Locus scale

The "locus" scale is similar to the "index" scale, but provides a genome-aware axis with concatenated chromosomes. See genomic coordinates for assembly and coordinate-system details. Locus scales resolve their assembly from scale.assembly or, if omitted, from the root assembly. If root assembly is omitted and root genomes has exactly one entry, that entry is used as the default assembly.

The locus scale is used by default when the field type is "locus".
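A sketch of selecting the assembly on the scale itself rather than at the root. It assumes that scale.assembly accepts an assembly name in the same form as the root-level assembly property; "hg38" is used only because it also appears in the locus example on this page, and the chrom and pos field names are placeholders.

```json
{
  "encoding": {
    "x": {
      "chrom": "chrom",
      "pos": "pos",
      "type": "locus",
      "scale": { "assembly": "hg38" }
    }
  }
}
```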

Note

The locus scale does not map the discrete chromosomes onto the concatenated axis. That mapping is done by the linearizeGenomicCoordinate transform.

Specifying the domain

By default, the domain of the locus scale consists of the whole genome. However, you can specify a custom domain using either linearized or genomic coordinates. A genomic coordinate consists of a chromosome (chrom) and an optional position (pos). The left bound's position defaults to zero, whereas the right bound's position defaults to the size of the chromosome. Thus, when the positions are omitted, both endpoint chromosomes are included in full.

Two-point locus domains are inclusive and cover both endpoint positions.

For example, chromosomes 3, 4, and 5:

[{ "chrom": "chr3" }, { "chrom": "chr5" }]

Only chromosome 3:

[{ "chrom": "chr3" }]

A specific region inside chromosome 3:

[
  { "chrom": "chr3", "pos": 1000000 },
  { "chrom": "chr3", "pos": 2000000 }
]

A region inside chromosome 1, given as linearized coordinates:

[1000000, 2000000]

Example

{
  "description": "Locus scale example with an explicit genomic domain.",

  "assembly": "hg38",

  "data": {
    "values": [
      { "chrom": "chr3", "pos": 134567890 },
      { "chrom": "chr4", "pos": 123456789 },
      { "chrom": "chr9", "pos": 34567890 }
    ]
  },

  "mark": "point",

  "scales": {
    "x": {
      "domain": [{ "chrom": "chr3" }, { "chrom": "chr9" }]
    }
  },

  "encoding": {
    "x": {
      "chrom": "chrom",
      "pos": "pos",
      "type": "locus"
    },
    "size": { "value": 200 }
  }
}

Domain from selection parameters

Scale domains can be linked to interval selection parameters. To do so, use an object-valued domain:

{
  "scale": {
    "zoom": true,
    "domain": {
      "param": "brush",
      "initial": [10, 20]
    }
  }
}

Properties

encoding

Type: string

Selection interval channel to use.

If omitted, GenomeSpy infers the channel from the scale channel when possible (e.g., x -> x, x2 -> x, y -> y, y2 -> y).

initial

Type: NumericDomain | string[] | boolean[] | ComplexDomain

Initial configured domain for the linked scale when the linked interval selection is empty.

Only supported when the linked scale is zoomable.

Clearing the linked interval selection resets the domain to the normal default/data-derived domain instead of restoring initial.

param Required

Type: string

Name of an interval selection parameter that provides the domain.

Clearing the linked interval selection returns the scale to its normal default or data-derived domain instead of restoring initial.

Zoomable linked scales automatically synchronize the domain back to the selection, whereas non-zoomable linked scales only read the selection. This distinction matters particularly for "index" and "locus" scales, which are zoomable by default.
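When the selection channel cannot be inferred from the scale channel, the encoding property can be given explicitly. The following sketch assumes that an interval selection parameter named "brush" is defined elsewhere in the specification and that its interval is on the x channel; both are placeholders rather than requirements.

```json
{
  "scale": {
    "zoom": true,
    "domain": {
      "param": "brush",
      "encoding": "x",
      "initial": [10, 20]
    }
  }
}
```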

For detailed brushing-and-linking guidance and interactive examples, see Parameters: Interval selection.

Zooming and panning

To enable zooming and panning of continuous scales on positional channels, set the zoom scale property to true. Example:

{
  "x": {
    "field": "foo",
    "type": "quantitative",
    "scale": {
      "zoom": true
    }
  }
}

Both "index" and "locus" scales are zoomable by default.

Zoom extent

The zoom extent allows you to control how far the scale can be zoomed out or panned (translated). Zoom extent equals the scale domain by default, except for the "locus" scale, where it includes the whole genome. For "index" and "locus" scales, two-point zoom extents are inclusive.

Example:

{
  ...,
  "scale": {
    "domain": [10, 20],
    "zoom": {
      "extent": [0, 30]
    }
  }
}

Domain transitions

By default, domain updates are applied with a smooth transition when that is possible. Set domainTransition to false to apply the new domain immediately. ExprRef-driven domains default to domainTransition: false unless overridden.
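A minimal sketch of disabling the transition. It assumes that domainTransition is set on the scale object next to domain, which this page implies but does not show; the domain values are arbitrary.

```json
{
  "scale": {
    "domain": [10, 20],
    "domainTransition": false
  }
}
```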

Shared scales in composed views

The channel-level scale property follows the Vega-Lite style: scale settings are placed inside an encoding channel. This works well for local scale settings in simple unit views. However, in composed GenomeSpy views, especially genome-browser-like multi-track views, a shared positional scale often represents the viewport of the whole subtree. Placing that viewport domain in one child encoding makes it harder to see which domain controls the composed view.

For example, the channel-level form places the domain inside a child encoding. This is valid, but it makes a subtree-level setting look local to one participant:

Channel-level scale configuration

{
  "layer": [
    {
      "mark": "rect",
      "encoding": {
        "x": {
          "chrom": "chrom",
          "pos": "start",
          "type": "locus",
          "scale": {
            "domain": [
              { "chrom": "chr15", "pos": 92925000 },
              { "chrom": "chr15", "pos": 92949000 }
            ]
          }
        },
        "x2": {
          "chrom": "chrom",
          "pos": "end"
        }
      }
    },
    {
      "mark": "point",
      "encoding": {
        "x": {
          "chrom": "chrom",
          "pos": "pos",
          "type": "locus"
        }
      }
    }
  ]
}

Use view-level scales to configure the same shared scale at the subtree that owns it:

View-level scale configuration

{
  "scales": {
    "x": {
      "domain": [
        { "chrom": "chr15", "pos": 92925000 },
        { "chrom": "chr15", "pos": 92949000 }
      ]
    }
  },
  "layer": [
    {
      "mark": "rect",
      "encoding": {
        "x": {
          "chrom": "chrom",
          "pos": "start",
          "type": "locus"
        },
        "x2": {
          "chrom": "chrom",
          "pos": "end"
        }
      }
    },
    {
      "mark": "point",
      "encoding": {
        "x": {
          "chrom": "chrom",
          "pos": "pos",
          "type": "locus"
        }
      }
    }
  ]
}

Use resolve.scale to choose how scales are shared. A view-level scales.<channel> entry configures the shared scale used by that view subtree. If the subtree has multiple independent scales for the same channel, place scales.<channel> closer to the intended subtree or make the sharing explicit with resolve.scale.

Do not mix view-level scales.<channel> with participating encoding.<channel>.scale objects for the same shared scale. Keep encoding.<channel>.type on member encodings; it describes the encoded data and drives default scale type inference.
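As a sketch of making the sharing explicit, the following stacks two tracks and declares their x scale as shared at the same view that configures its domain. It assumes that resolve.scale takes a per-channel "shared" or "independent" value in the Vega-Lite style and that the tracks are composed with vconcat; neither detail is spelled out on this page. The field names and the domain are reused from the examples above, and data and assembly are left out for brevity.

```json
{
  "resolve": { "scale": { "x": "shared" } },
  "scales": {
    "x": {
      "domain": [
        { "chrom": "chr15", "pos": 92925000 },
        { "chrom": "chr15", "pos": 92949000 }
      ]
    }
  },
  "vconcat": [
    {
      "mark": "rect",
      "encoding": {
        "x": { "chrom": "chrom", "pos": "start", "type": "locus" },
        "x2": { "chrom": "chrom", "pos": "end" }
      }
    },
    {
      "mark": "point",
      "encoding": {
        "x": { "chrom": "chrom", "pos": "pos", "type": "locus" }
      }
    }
  ]
}
```

Note that the member encodings keep their type but carry no scale objects of their own, in line with the guidance above.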

Named scales

Giving a scale a name makes it accessible through the API.

{
  ...,
  "scale": {
    "name": "myScale"
  }
}

Axes

Positional channels are usually annotated with axes, which are automatically generated based on the scale type. However, you can customize the axis by specifying the axis property in the encoding block.

{
  ...,
  "encoding": {
    "x": {
      "field": "foo",
      "type": "quantitative",
      "axis": {
        "title": "My axis title"
      }
    }
  }
}

GenomeSpy implements most of Vega-Lite's axis properties. See the interface definition for supported properties. TODO: Write proper documentation.

Grid lines

Grid lines are hidden by default in GenomeSpy and can be enabled for each view using the grid property. Global defaults can be configured with config.axis*.
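A minimal sketch of enabling grid lines for one channel. It assumes that grid is set inside the channel's axis object, as in the genome axis example on this page; the field name "foo" is a placeholder.

```json
{
  "encoding": {
    "y": {
      "field": "foo",
      "type": "quantitative",
      "axis": { "grid": true }
    }
  }
}
```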

Genome axis for loci

The genome axis is a special axis for the "locus" scale. It displays chromosome names and the intra-chromosomal coordinates. You can adjust the style of the genome axis and grid using various parameters.

{
  "description": "Genome axis styling example for locus scales.",

  "assembly": "hg38",

  "data": { "values": [] },

  "mark": "point",

  "encoding": {
    "x": {
      "chrom": "a",
      "pos": "b",
      "type": "locus",
      "axis": {
        "chromTickColor": "#5F87F5",
        "chromLabelColor": "#E16B67",
        "grid": true,
        "gridColor": "gray",
        "gridOpacity": 0.5,
        "gridDash": [1, 11],
        "chromGrid": true,
        "chromGridDash": [3, 3],
        "chromGridColor": "#5F87F5",
        "chromGridOpacity": 0.7,
        "chromGridFillEven": "#BEFACC",
        "chromGridFillOdd": "#FDFCE8"
      }
    }
  }
}

Fully customized axes

You can also disable the genome axis and grid and specify a custom axis instead. The "axisGenome" data source provides the chromosomes and their sizes, which can be used to create custom axes or grids for a view.