Regex Extract

The "regexExtract" transform matches a string field against a regular expression, extracts the captured groups, and adds them to the data objects as new fields.

Parameters

as Required
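The name of the new field, or an array of field names, where the extracted values are written. When the regex has several groups, list one field per group in the same order, as in the example below.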
field Required
The source field that contains the string to match.
regex Required

A valid JavaScript regular expression with at least one capturing group. For example: "^Sample(\\d+)$".

Read more at: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Guide/Regular_Expressions
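The doubled backslash in the example pattern comes from string escaping: the configuration is JSON, so "\\d" in the file becomes \d in the pattern that the regex engine sees. The following is a small sketch in plain JavaScript, independent of the transform itself; the input value "Sample42" is made up for illustration.

```javascript
// The JSON text as it would appear in a configuration file.
// String.raw keeps the backslashes exactly as written.
const jsonText = String.raw`{"regex": "^Sample(\\d+)$"}`;

// JSON parsing turns the escaped "\\" into a single backslash.
const { regex } = JSON.parse(jsonText);
console.log(regex); // ^Sample(\d+)$

// The parsed string is what ends up as the regular expression source.
const match = new RegExp(regex).exec("Sample42");
console.log(match[1]); // 42
```

A single backslash ("\d") would not work here, because \d is not a valid escape sequence in a JSON string.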

skipInvalidInput

If true, invalid input is skipped silently instead of being reported, and the new fields are left undefined on the affected datum.

Default: false
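How the parameters fit together can be sketched in plain JavaScript. This is an illustration only, not the library's implementation: the function below is a hypothetical stand-in, and it assumes that "invalid input" means a value that is not a string or does not match the regex, that such input raises an error unless skipInvalidInput is true, and that extracted values stay strings. The field names "sample" and "sampleNumber" are made up.

```javascript
// Hypothetical stand-in for the transform, written only to illustrate
// the parameters described above.
function regexExtract(data, { field, regex, as, skipInvalidInput = false }) {
  const re = new RegExp(regex);
  // "as" may be a single field name or an array of field names.
  const targets = Array.isArray(as) ? as : [as];

  return data.map((datum) => {
    const value = datum[field];
    const match = typeof value === "string" ? re.exec(value) : null;

    if (!match) {
      if (skipInvalidInput) {
        // Leave the new fields undefined on this datum.
        return { ...datum };
      }
      throw new Error(`Value ${JSON.stringify(value)} does not match ${regex}`);
    }

    // Group 1 goes to the first target field, group 2 to the second, etc.
    const extracted = {};
    targets.forEach((name, i) => {
      extracted[name] = match[i + 1];
    });
    return { ...datum, ...extracted };
  });
}

const rows = [{ sample: "Sample12" }, { sample: "Control" }];

const result = regexExtract(rows, {
  field: "sample",
  regex: "^Sample(\\d+)$",
  as: "sampleNumber",
  skipInvalidInput: true,
});

console.log(result[0].sampleNumber); // 12
console.log(result[1].sampleNumber); // undefined
```

With skipInvalidInput left at its default, the second row would make this sketch throw instead of being passed through.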

Example

Given the following data:

Gene    Genome Location
AKT1    14:104770341-104792643

... and configuration:

{
  "type": "regexExtract",
  "field": "Genome Location",
  "regex": "^(X|Y|\\d+):(\\d+)-(\\d+)$",
  "as": ["Chrom", "Start", "End"]
}

Three new fields are added to the data:

Gene    Genome Location           Chrom    Start        End
AKT1    14:104770341-104792643    14       104770341    104792643
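The pattern from this example can be tried out in plain JavaScript before it goes into a configuration. The sketch below only exercises the regular expression itself. The "chr14:..." input is a made-up variant that shows which values the pattern rejects, and the numeric conversion at the end is an illustration, not something the transform is documented to do.

```javascript
// Same pattern as in the configuration above. The escaping rules for
// "\\d" are the same in a JavaScript string literal as in JSON.
const re = new RegExp("^(X|Y|\\d+):(\\d+)-(\\d+)$");

// match[0] is the whole match; the groups start at index 1.
const match = re.exec("14:104770341-104792643");
const [, Chrom, Start, End] = match;
console.log(Chrom, Start, End); // 14 104770341 104792643

// Groups captured by a JavaScript regex are always strings.
console.log(typeof Start); // string
const startNumber = Number(Start);

// A "chr" prefix is not covered by the pattern, so this value would
// count as input that does not match.
const matchesPrefixed = re.test("chr14:104770341-104792643");
console.log(matchesPrefixed); // false
```

Values like the prefixed one are where skipInvalidInput becomes relevant.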