Version: 0.9

Data

Users can specify the data of each visualization (i.e., track) through a track.data property.

{
  "tracks": [{
    "data": {...}, // specify the data used in this track
    "mark": "rect",
    "color": ...,
    ...
  }]
}

Supported Data Formats

For flexible data exploration, Gosling supports two kinds of datasets:

  1. Plain Datasets (No HiGlass Server): These datasets can be used directly in Gosling without any data preprocessing. They include CSV, JSON, BigWig, and BAM.

  2. Pre-aggregated Datasets (HiGlass Server): These datasets are preprocessed for scalable data exploration and require a HiGlass server to access them in Gosling. They include Vector, Multivec, and BEDDB. To learn more about preprocessing your data and setting up the server, please visit the HiGlass website.

CSV (No HiGlass Server)

Any sufficiently small tabular data file, such as TSV, CSV, BED, BEDPE, or GFF, can be loaded using the "csv" data specification.

{
  "tracks": [
    {
      "data": {
        "url": "https://raw.githubusercontent.com/sehilyi/gemini-datasets/master/data/UCSC.HG38.Human.CytoBandIdeogram.csv",
        "type": "csv",
        "chromosomeField": "Chromosome",
        "genomicFields": ["chromStart", "chromEnd"]
      },
      ...
    }
  ]
}

| property | type | description |
|---|---|---|
| url | string | Required. Specify the URL address of the data file. |
| type | string | Required. Must be "csv". |
| separator | string | Specify the file separator. Default: ',' |
| sampleLength | number | Specify the number of rows loaded from the URL. Default: 1000 |
| quantitativeFields | string[] | Specify the names of quantitative data fields. |
| headerNames | string[] | Specify the names of data fields if the CSV file is headerless. |
| genomicFields | string[] | Specify the names of genomic data fields. |
| chromosomeField | string | Specify the name of the chromosome data field. |
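
As a sketch of how separator and headerNames can be combined, the following data object describes a headerless, tab-separated file. The URL and all column names are hypothetical placeholders rather than a dataset that ships with Gosling; the fragment is meant to sit inside a track, as in the example above.

```json
{
  "data": {
    // hypothetical URL; replace with the address of your own file
    "url": "https://example.com/regions.bed",
    "type": "csv",
    "separator": "\t",
    // hypothetical names assigned to the columns of the headerless file, in order
    "headerNames": ["chrom", "start", "end", "score"],
    "chromosomeField": "chrom",
    "genomicFields": ["start", "end"],
    "quantitativeFields": ["score"]
  }
}
```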

JSON (No HiGlass Server)

This format allows users to include data directly in Gosling's JSON specification.

{
  "tracks": [{
    "data": {
      "type": "json",
      "chromosomeField": "Chromosome",
      "genomicFields": [
        "chromStart",
        "chromEnd"
      ],
      "values": [
        {
          "Chromosome": "chr1",
          "chromStart": 0,
          "chromEnd": 2300000,
          "Name": "p36.33",
          "Stain": "gneg"
        },
        {
          "Chromosome": "chr1",
          "chromStart": 2300000,
          "chromEnd": 5300000,
          "Name": "p36.32",
          "Stain": "gpos25"
        }, ...
      ]
    },
    ... // other configurations of this track
  }]
}

| property | type | description |
|---|---|---|
| values | Datum[] | Required. Values in the form of JSON. |
| type | string | Required. Must be "json". Defines the data type. |
| sampleLength | number | Specify the number of rows loaded from the URL. Default: 1000 |
| quantitativeFields | string[] | Specify the names of quantitative data fields. |
| genomicFields | string[] | Specify the names of genomic data fields. |
| chromosomeField | string | Specify the name of the chromosome data field. |

BigWig (No HiGlass Server)

{
  "tracks": [{
    "data": {
      "url": "https://s3.amazonaws.com/gosling-lang.org/data/4DNFIMPI5A9N.bw",
      "type": "bigwig",
      "column": "position",
      "value": "peak"
    },
    ... // other configurations of this track
  }]
}

| property | type | description |
|---|---|---|
| value | string | Required. Assign a field name for the quantitative values. |
| url | string | Required. Specify the URL address of the data file. |
| type | string | Required. Must be "bigwig". |
| column | string | Required. Assign a field name for the middle position of genomic intervals. |
| start | string | Assign a field name for the start position of genomic intervals. |
| end | string | Assign a field name for the end position of genomic intervals. |
| binSize | number | Bin the genomic interval in tiles (unit size: 256). |

BAM (No HiGlass Server)

Binary Alignment Map (BAM) is a comprehensive raw data format for genome sequencing; it is the lossless, compressed binary representation of Sequence Alignment Map (SAM) files.

| property | type | description |
|---|---|---|
| url | string | Required. URL link to the BAM data file. |
| type | string | Required. Must be "bam". |
| indexUrl | string | Required. URL link to the index file of the BAM file. |
| maxInsertSize | number | |
| loadMates | boolean | |
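
This section has no example, so here is a minimal sketch that uses only the three required properties from the table above. Both URLs are hypothetical placeholders; the fragment is meant to sit inside a track, like the data objects in the other sections.

```json
{
  "data": {
    "type": "bam",
    // hypothetical URLs; point these at your own BAM file and its index file
    "url": "https://example.com/alignments.bam",
    "indexUrl": "https://example.com/alignments.bam.bai"
  }
}
```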

Vector (Require HiGlass Server)

One-dimensional quantitative values along genomic position (e.g., bigwig) can be converted into HiGlass' "vector" format data. Find out more about this format at HiGlass Docs.

{
  "tracks": [{
    "data": {
      "url": "https://resgen.io/api/v1/tileset_info/?d=VLFaiSVjTjW6mkbjRjWREA",
      "type": "vector",
      "column": "position",
      "value": "peak"
    },
    ... // other configurations of this track
  }]
}

| property | type | description |
|---|---|---|
| value | string | Required. Assign a field name for the quantitative values. |
| url | string | Required. Specify the URL address of the data file. |
| type | string | Required. Must be "vector". |
| column | string | Required. Assign a field name for the middle position of genomic intervals. |
| start | string | Assign a field name for the start position of genomic intervals. |
| end | string | Assign a field name for the end position of genomic intervals. |
| binSize | number | Bin the genomic interval in tiles (unit size: 256). |

Multivec (Require HiGlass Server)

Two-dimensional quantitative values, one axis for genomic coordinate and the other for different samples, can be converted into HiGlass' "multivec" data. For example, multiple BigWig files can be converted into a single multivec file. You can also convert sequence data (FASTA) into this format where rows will be different nucleotide bases (e.g., A, T, G, C) and quantitative values represent the frequency. Find out more about this format at HiGlass Docs.

{
  "tracks": [{
    "data": {
      "url": "https://resgen.io/api/v1/tileset_info/?d=UvVPeLHuRDiYA3qwFlm7xQ",
      "type": "multivec",
      "row": "sample",
      "column": "position",
      "value": "peak",
      "categories": ["sample 1", "sample 2", "sample 3", "sample 4"]
    },
    ... // other configurations of this track
  }]
}

| property | type | description |
|---|---|---|
| value | string | Required. Assign a field name for the quantitative values. |
| url | string | Required. Specify the URL address of the data file. |
| type | string | Required. Must be "multivec". |
| row | string | Required. Assign a field name for the samples. |
| column | string | Required. Assign a field name for the middle position of genomic intervals. |
| start | string | Assign a field name for the start position of genomic intervals. |
| end | string | Assign a field name for the end position of genomic intervals. |
| categories | string[] | Assign names to the individual samples. |
| binSize | number | Bin the genomic interval in tiles (unit size: 256). |

BEDDB (Require HiGlass Server)

Regular BED files, or similar, can be pre-aggregated for scalable data exploration. Find out more about this format at HiGlass Docs.

{
  "tracks": [{
    "data": {
      "url": "https://higlass.io/api/v1/tileset_info/?d=OHJakQICQD6gTD7skx4EWA",
      "type": "beddb",
      "genomicFields": [
        {"index": 1, "name": "start"},
        {"index": 2, "name": "end"}
      ],
      "valueFields": [
        {"index": 5, "name": "strand", "type": "nominal"},
        {"index": 3, "name": "name", "type": "nominal"}
      ],
      "exonIntervalFields": [
        {"index": 12, "name": "start"},
        {"index": 13, "name": "end"}
      ]
    },
    ... // other configurations of this track
  }]
}

| property | type | description |
|---|---|---|
| url | string | Required. Specify the URL address of the data file. |
| type | string | Required. Must be "beddb". |
| genomicFields | object[] | Required. Specify the names of genomic data fields. Each object follows the format {"index":"number","name":"string"}. |
| valueFields | object[] | Specify the column indexes, field names, and field types. Each object follows the format {"index":"number","name":"string","type":"string"}. |

Data Transform

Gosling supports a diverse set of data transforms, including Filter Transform, Str Concat Transform, Str Replace Transform, Log Transform, Displace Transform, Exon Split Transform, Genomic Length Transform, Coverage Transform, Combine Mates Transform, and JSON Parse Transform.

{
  "tracks": [{
    "data": ...,
    // a list of data transforms can be applied to the data
    "dataTransform": [
      { "type": "filter", "field": "type", "oneOf": ["gene"] },
      { "type": "filter", "field": "strand", "oneOf": ["+"], "not": true }
    ],
    "mark": "rect",
    ...
  }]
}

Filter Transform

Users can apply three types of filters: oneOf, inRange, and include. Each filter transform has the following properties:

Properties of One Of Filter

| property | type | description |
|---|---|---|
| type | string | Required. Must be "filter". |
| oneOf | string[] or number[] | Required. Check whether the value is an element of the provided list. |
| field | string | Required. The filter is applied based on the values of the specified data field. |
| not | boolean | When {"not": true}, apply a logical NOT operation to the filter. Default: false |

Properties of In Range Filter

| property | type | description |
|---|---|---|
| type | string | Required. Must be "filter". |
| inRange | number[] | Required. Check whether the value is in a number range. |
| field | string | Required. The filter is applied based on the values of the specified data field. |
| not | boolean | When {"not": true}, apply a logical NOT operation to the filter. Default: false |

Properties of Include Filter

| property | type | description |
|---|---|---|
| type | string | Required. Must be "filter". |
| include | string | Required. Check whether the value includes a substring. |
| field | string | Required. The filter is applied based on the values of the specified data field. |
| not | boolean | When {"not": true}, apply a logical NOT operation to the filter. Default: false |
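
The earlier example only shows the oneOf filter, so the sketch below illustrates the other two filter types. The field names "score" and "name" and the filter values are hypothetical; whether the range bounds are inclusive is not stated in the tables above and is left open here.

```json
{
  "dataTransform": [
    // keep rows whose "score" (hypothetical field) is in the number range 0 to 100
    { "type": "filter", "field": "score", "inRange": [0, 100] },
    // with "not": true, drop rows whose "name" (hypothetical field) includes the substring "LOC"
    { "type": "filter", "field": "name", "include": "LOC", "not": true }
  ]
}
```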

Str Concat Transform

| property | type | description |
|---|---|---|
| type | string | Required. Must be "concat". |
| separator | string | Required. |
| newField | string | Required. |
| fields | string[] | Required. |
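
A minimal sketch of a concat transform follows. The table above gives no descriptions, so the reading in the comments (join the values of fields with separator and store the result in newField) is an assumption based on the property names, and all field names are hypothetical.

```json
{
  "dataTransform": [
    {
      "type": "concat",
      // hypothetical input fields whose values are assumed to be joined in order
      "fields": ["chromosome", "band"],
      "separator": "-",
      // hypothetical name of the field assumed to receive the joined string
      "newField": "label"
    }
  ]
}
```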

Str Replace Transform

| property | type | description |
|---|---|---|
| type | string | Required. Must be "replace". |
| replace | object[] | Required. Each object follows the format {"from":"string","to":"string"}. |
| newField | string | Required. |
| field | string | Required. |
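
A minimal sketch of a replace transform follows. The semantics noted in the comments (substitute each "from" string with its "to" string in the values of field, writing the result to newField) are assumed from the property names, and the field names are hypothetical.

```json
{
  "dataTransform": [
    {
      "type": "replace",
      // hypothetical source field
      "field": "chromosome",
      // assumed: occurrences of "chr" are replaced with an empty string
      "replace": [{ "from": "chr", "to": "" }],
      // hypothetical name of the field assumed to receive the result
      "newField": "chromosomeNumber"
    }
  ]
}
```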

Log Transform

| property | type | description |
|---|---|---|
| type | string | Required. Must be "log". |
| field | string | Required. |
| newField | string | If specified, store transformed values in a new field. |
| base | number or string | If not specified, 10 is used. |
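
As an illustration, the sketch below applies a base-2 log to a quantitative field and stores the result in a new field. The field names "peak" and "log2Peak" are hypothetical.

```json
{
  "dataTransform": [
    // hypothetical field names; without "base", 10 would be used
    { "type": "log", "field": "peak", "base": 2, "newField": "log2Peak" }
  ]
}
```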

Displace Transform

| property | type | description |
|---|---|---|
| type | string | Required. Must be "displace". |
| newField | string | Required. |
| method | string | Required. One of "pile", "spread". A string that specifies the type of displacement. |
| boundingBox | BoundingBox | Required. |
| maxRows | number | Specify the maximum number of rows to be generated (no limit by default). |
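
A hedged sketch of a displace transform that piles up overlapping intervals is shown below. The boundingBox properties follow the BoundingBox type described in the Types section; the field names and numeric values are hypothetical, and the assumption that newField receives the assigned row is inferred from the property names rather than stated in the table.

```json
{
  "dataTransform": [
    {
      "type": "displace",
      "method": "pile",
      // hypothetical quantitative fields marking where each element starts and ends
      "boundingBox": { "startField": "start", "endField": "end", "padding": 5 },
      // hypothetical name of the field assumed to receive the displacement result
      "newField": "row",
      "maxRows": 20
    }
  ]
}
```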

Exon Split Transform

| property | type | description |
|---|---|---|
| type | string | Required. Must be "exonSplit". |
| separator | string | Required. |
| flag | object | Required. The object follows the format {"field":"string","value":"number or string"}. |
| fields | object[] | Required. Each object follows the format {"chrField":"string","field":"string","newField":"string","type":"string"}. |
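
The sketch below shows how the property shapes in the table might fit together for a gene annotation table in which two hypothetical columns, "exonStarts" and "exonEnds", hold comma-separated exon coordinates. The table documents only the shapes, so the behavior described in the comments, the "genomic" type value, and every field name are assumptions.

```json
{
  "dataTransform": [
    {
      "type": "exonSplit",
      // assumed: the character that separates coordinates within one cell
      "separator": ",",
      // assumed: rows generated by the split are marked with "exon" in the "type" field
      "flag": { "field": "type", "value": "exon" },
      "fields": [
        // all field names are hypothetical; "genomic" is an assumed type value
        { "chrField": "chr", "field": "exonStarts", "newField": "start", "type": "genomic" },
        { "chrField": "chr", "field": "exonEnds", "newField": "end", "type": "genomic" }
      ]
    }
  ]
}
```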

Coverage Transform

Aggregate rows and calculate coverage.

| property | type | description |
|---|---|---|
| type | string | Required. Must be "coverage". |
| startField | string | Required. |
| endField | string | Required. |
| newField | string | |
| groupField | string | The name of a nominal field to group rows by prior to piling up. |
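
A minimal sketch of a coverage transform follows. The field names are hypothetical, and the assumption that newField names the field holding the computed coverage is inferred from the property name, since the table leaves it undescribed.

```json
{
  "dataTransform": [
    {
      "type": "coverage",
      // hypothetical fields holding the start and end position of each row
      "startField": "start",
      "endField": "end",
      // hypothetical name; assumed to receive the calculated coverage
      "newField": "coverage"
    }
  ]
}
```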

JSON Parse Transform

Parse a JSON object array and append the parsed objects vertically, as rows.

| property | type | description |
|---|---|---|
| type | string | Required. Must be "subjson". |
| genomicLengthField | string | Required. Length of the genomic interval. |
| genomicField | string | Required. Relative genomic position to parse. |
| field | string | Required. The field that contains the JSON object array. |
| baseGenomicField | string | Required. Base genomic position when parsing the relative position. |
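
To make the four field properties concrete, the sketch below assumes rows that carry a JSON object array in a hypothetical "substitutions" field, where each object stores a position relative to the row's start. All field names are hypothetical, and which properties refer to the parent row versus the parsed objects is an assumption drawn from the descriptions above.

```json
{
  "dataTransform": [
    {
      "type": "subjson",
      // hypothetical field that contains the JSON object array
      "field": "substitutions",
      // hypothetical key of each parsed object: position relative to the base
      "genomicField": "pos",
      // hypothetical field giving the base genomic position (assumed to be on the parent row)
      "baseGenomicField": "start",
      // hypothetical key giving the length of the genomic interval
      "genomicLengthField": "length"
    }
  ]
}
```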

Apart from these data transforms, users can also aggregate data values (min, max, bin, mean, and count). Read more about data aggregation.

Types

Type: Datum

| property | type | description |
|---|---|---|
| stringKey | number or string | Values in the form of JSON. |

Type: BoundingBox

| property | type | description |
|---|---|---|
| startField | string | Required. The name of a quantitative field that represents the start position. |
| endField | string | Required. The name of a quantitative field that represents the end position. |
| padding | number | The padding around visual elements, in either px or bp. |
| isPaddingBP | boolean | Whether to treat the padding as a length in bp. |
| groupField | string | The name of a nominal field to group rows by prior to piling up. |