Visualizing a SEG file
Outdated information
This example is currently broken because of recent changes to sample handling and importing.
These examples visualize segmented data with two different visual encodings.
The example data consists of segmentations for two samples. Each segment has a chromosome, intra-chromosomal start and end coordinates, and two quantitative values, num.mark and seg.mean:
'ID | chrom | loc.start | loc.end | num.mark | seg.mean |
---|---|---|---|---|---|
GenomeWideSNP_416532 | 1 | 51598 | 76187 | 14 | -0.7116 |
GenomeWideSNP_416532 | 1 | 76204 | 16022502 | 8510 | -0.029 |
GenomeWideSNP_416532 | 1 | 16026084 | 16026512 | 6 | -2.0424 |
GenomeWideSNP_416532 | 1 | 16026788 | 17063449 | 424 | -0.1024 |
... | ... | ... | ... | ... | ... |
Data source: https://software.broadinstitute.org/software/igv/SEG
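As a rough illustration of the data layout, the rows above could be read with Python's standard csv module. This is only a sketch: the inline sample text, the helper name parse_seg, and the output key names are assumptions made for this illustration. GenomeSpy itself loads the file through the data property used in the examples on this page.

```python
import csv
import io

# A few rows copied from the table above, tab-separated as in a SEG file.
SAMPLE_SEG = (
    "'ID\tchrom\tloc.start\tloc.end\tnum.mark\tseg.mean\n"
    "GenomeWideSNP_416532\t1\t51598\t76187\t14\t-0.7116\n"
    "GenomeWideSNP_416532\t1\t76204\t16022502\t8510\t-0.029\n"
    "GenomeWideSNP_416532\t1\t16026084\t16026512\t6\t-2.0424\n"
    "GenomeWideSNP_416532\t1\t16026788\t17063449\t424\t-0.1024\n"
)


def parse_seg(text):
    """Parse SEG-formatted text into a list of dicts with typed values.

    Hypothetical helper written for this illustration only.
    """
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    segments = []
    for row in reader:
        segments.append({
            "sample": row["'ID"],  # the header keeps its leading apostrophe
            "chrom": row["chrom"],
            "start": int(row["loc.start"]),
            "end": int(row["loc.end"]),
            "num_mark": int(row["num.mark"]),
            "seg_mean": float(row["seg.mean"]),
        })
    return segments


segments = parse_seg(SAMPLE_SEG)
for seg in segments:
    # Print the sample, chromosome, segment length, and segment mean.
    print(seg["sample"], seg["chrom"], seg["end"] - seg["start"], seg["seg_mean"])
```

Note that the first column header is literally 'ID, including the apostrophe, and that three of the column names contain dots. This is why the specs on this page escape those characters in field names.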
A simple example
The following example uses a conventional heatmap (rect mark) to display the segments. The color scale has been configured to match the Integrative Genomics Viewer.
{
  "genome": { "name": "hg18" },
  "concat": [
    { "import": { "name": "cytobands" } },
    {
      "data": {
        "url": "example.seg",
        "format": { "type": "tsv" }
      },
      "mark": "rect",
      "encoding": {
        "x": { "chrom": "chrom", "pos": "loc\\.start", "type": "quantitative" },
        "x2": { "chrom": "chrom", "pos": "loc\\.end" },
        "y": { "field": "\\'ID", "type": "nominal" },
        "color": {
          "field": "seg\\.mean",
          "type": "quantitative",
          "scale": {
            "domain": [-1.5, 1.5],
            "range": ["blue", "white", "red"]
          }
        }
      }
    },
    { "import": { "name": "genomeAxis" } }
  ]
}
An advanced example: emphasizing focal segments
The data contains focal segments that are short and barely visible. Although zooming reveals them, finding them all requires a lot of effort. The following example uses an alternative visual encoding for the data, emphasizing the focal segments.
The quantitative value is encoded as position (height) instead of color.
Focal segments (here, those shorter than 8000 base pairs) are extracted from the data using the filter transform and displayed using the point mark.
{
  "genome": {
    "name": "hg18"
  },
  "concat": [
    {
      "import": { "name": "cytobands" }
    },
    {
      "name": "layers",
      "data": {
        "url": "example.seg",
        "format": { "type": "tsv" }
      },
      "encoding": {
        "sample": { "field": "\\'ID", "type": "nominal" },
        "color": {
          "field": "seg\\.mean",
          "type": "quantitative",
          "scale": {
            "type": "threshold",
            "domain": [0],
            "range": ["#2277ff", "#dd4422"]
          }
        },
        "y": {
          "field": "seg\\.mean",
          "type": "quantitative"
        }
      },
      "layer": [
        {
          "mark": {
            "type": "rect",
            "minWidth": 1,
            "minOpacity": 0.2
          },
          "encoding": {
            "x": {
              "chrom": "chrom",
              "pos": "loc\\.start",
              "type": "quantitative"
            },
            "x2": { "chrom": "chrom", "pos": "loc\\.end" }
          }
        },
        {
          "transform": [
            {
              "type": "filter",
              "expr": "datum['loc.end'] - datum['loc.start'] < 8000"
            },
            {
              "type": "formula",
              "expr": "(datum['loc.start'] + datum['loc.end']) / 2",
              "as": "centre"
            }
          ],
          "mark": "point",
          "encoding": {
            "x": {
              "chrom": "chrom",
              "pos": "centre",
              "type": "quantitative"
            },
            "y": {
              "field": "seg\\.mean",
              "type": "quantitative"
            },
            "size": {
              "value": 40
            }
          }
        }
      ]
    },
    {
      "import": { "name": "genomeAxis" }
    }
  ]
}
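To make the data flow of the point layer easier to follow, the filter and formula steps of the spec above can be approximated in plain Python. This is a minimal sketch and not how GenomeSpy evaluates the expressions internally: the names focal_points and threshold_color are hypothetical, and the color lookup assumes the threshold scale follows the usual d3 convention, where a value equal to a threshold falls into the upper bin.

```python
from bisect import bisect_right

FOCAL_MAX_LENGTH = 8000  # same cutoff as the filter expression in the spec

# (loc.start, loc.end, seg.mean) for the rows shown in the table at the top
rows = [
    (51598, 76187, -0.7116),
    (76204, 16022502, -0.029),
    (16026084, 16026512, -2.0424),
    (16026788, 17063449, -0.1024),
]


def focal_points(rows, max_length=FOCAL_MAX_LENGTH):
    """Keep segments shorter than max_length and compute their centres."""
    points = []
    for start, end, seg_mean in rows:
        if end - start < max_length:      # the "filter" step
            centre = (start + end) / 2    # the "formula" step
            points.append((centre, seg_mean))
    return points


def threshold_color(value, domain=(0,), colors=("#2277ff", "#dd4422")):
    """Pick a color by counting the thresholds that are <= value.

    Assumed d3-style semantics: value < 0 gives the first color,
    value >= 0 gives the second.
    """
    return colors[bisect_right(domain, value)]


for centre, seg_mean in focal_points(rows):
    print(centre, seg_mean, threshold_color(seg_mean))
```

With the four rows from the table, only the third segment (428 base pairs long) passes the filter. Its negative seg.mean places it in the first color bin.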