Monday, September 23, 2013

v24 (2 of 3): JSON datahub format

Last modified: April 16, 2014



Please refer to the Datahub page on Wubrowse Wiki. This post is no longer updated.



Starting with version 24, the Browser supports JSON datahubs. The tabular text hub format is still supported, but it is deprecated and support will be dropped in a future release.

The JSON format is a better fit for both users and developers. Take a look at this feature-rich example datahub:

http://vizhub.wustl.edu/hubSample/hg19/hub.json



  Defining a track  
A track is defined as an object:

{
    type:"bedgraph",
    url:"http://vizhub.wustl.edu/hubSample/hg19/qual3.gz",
    name:"a track with quantitative values",
    mode:"show",
    custom_annotation:["aa","cc"],
    colorpositive:"#ff3333/#b30000",
    height:40,
    fixedscale:{min:0, max:10},
},

  • type
    • Required
    • Track type. Value is one of bedGraph, bed, bigWig, categorical, BAM, or longrange.
    • Type names are case-insensitive.
  • url
    • Required
    • URL of the track file
  • name
    • Label to be displayed along with the track. Not a unique identifier.
  • mode
    • Display mode of the track. To display the track by default, use show/full/thin/density, depending on its track type. Use "hide" to hide the track by default.
    • If this attribute is not provided, the track will be hidden by default.
  • custom_annotation
    • Annotation using the custom metadata vocabulary, which must be defined in the same hub. Value is a list of attributes.
  • annotation
    • Annotation using native metadata terms; term IDs must be used. For internal use.
  • colorpositive
    • Rendering color for positive numerical values
    • Optionally, a second color can be supplied after a slash; it is used for values beyond the max threshold. Numerical tracks only.
  • colornegative
    • Same as colorpositive but for negative values
  • fixedscale
    • Fixed Y scale threshold values
  • height
    • Plot height for numerical and categorical tracks
  • details
    • Itemized details in the form of key:value pairs. Can be displayed via a menu option.
    • Example value: {'antibody':'CTCF','protocol':'standard chip-seq',}
  • details_text
    • Same as details, but given as one long string. Mainly for internal use; details is preferred.
    • Example: "antibody=CTCF; protocol=standard chip-seq; "
  • geo
    • A comma-separated string of GEO accessions
    • Example: "GSM123,GSM456"
  • categories
    • Category information for categorical tracks. 
    • Value is {id:["name","color"],id2:["name2","color2"], ...}

The following attributes are experimental:
  • group: places tracks inside a group; value is an integer (1, 2, 3, ...). Numerical tracks in the same group share the same Y scale range, which is automatically set to the extreme values across all tracks in the group. You therefore cannot change the Y scale type to threshold-based for member tracks.
  • normalize:
  • total_mapped_reads
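To illustrate what "extreme values of all tracks in the group" means for the group attribute, here is a small Python sketch. It is an illustration only and not the Browser's code: the function name group_y_scale, the track names, and the numbers are made up, and which data points the Browser actually considers (for example, only those in the current view) is not specified here.

```python
def group_y_scale(values_by_track):
    """Return (min, max) over the values of every track in one group.

    Hypothetical helper: values_by_track maps a track name to the list
    of numerical values that track would plot.
    """
    all_values = [v for values in values_by_track.values() for v in values]
    return min(all_values), max(all_values)


# Two made-up member tracks of the same group.
group_1 = {
    "track A": [0.5, 3.0, 7.2],
    "track B": [-1.0, 2.5],
}

# Both tracks would be drawn on this one shared Y range.
shared_min, shared_max = group_y_scale(group_1)
print(shared_min, shared_max)
```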



  Defining a metadata vocabulary  

The metadata vocabulary can be as complex as needed:

  • Hierarchical in nature, with arbitrary depth
  • Trees and directed acyclic graphs are supported; cycles are not


{
type:"metadata",
vocabulary:{
    "epigenetic mark":{
        "DNA methylation":["aa","bb","cc"],
        "histone mark":["dd","ee"],
        },
    "rna-seq":{
        "mRNA":["aa","ee"],
        "small RNA":["dd","bb"],
        },
    },
show:["epigenetic mark","rna-seq"],
tag:"My Metadata",
},

About the keywords:
  • vocabulary: the actual metadata vocabulary. In this example, "epigenetic mark" is a root-level term with "DNA methylation" and "histone mark" as its children, and "aa" is one of the leaf-level attributes. Only leaf-level attributes can be used to annotate tracks.
  • show: the list of terms that will be shown in the metadata colormap. Both leaf and non-leaf terms can be used.
  • tag: optional name of this vocabulary
Please note that you can only define one vocabulary in a hub.
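Since only leaf-level attributes can annotate tracks, it can help to list the leaves of a vocabulary and compare them with a track's custom_annotation values. The Python sketch below does that for the example vocabulary. The helper names leaf_attributes and unknown_annotations are hypothetical, and the sketch assumes every leaf level is written as a list, as in the example above.

```python
def leaf_attributes(node):
    """Collect the leaf-level attributes of a nested vocabulary.

    Dicts are internal terms; lists hold leaf attributes. A leaf that
    appears under several parents (the DAG case) is collected once.
    """
    if isinstance(node, dict):
        leaves = set()
        for child in node.values():
            leaves |= leaf_attributes(child)
        return leaves
    return set(node)


def unknown_annotations(track, vocabulary):
    """Return custom_annotation values that are not leaf attributes."""
    leaves = leaf_attributes(vocabulary)
    return [a for a in track.get("custom_annotation", []) if a not in leaves]


vocabulary = {
    "epigenetic mark": {
        "DNA methylation": ["aa", "bb", "cc"],
        "histone mark": ["dd", "ee"],
    },
    "rna-seq": {
        "mRNA": ["aa", "ee"],
        "small RNA": ["dd", "bb"],
    },
}

print(sorted(leaf_attributes(vocabulary)))
print(unknown_annotations({"custom_annotation": ["aa", "cc"]}, vocabulary))
```

Here "histone mark" would be reported as unknown if used as an annotation, because it is a non-leaf term.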



  Making gene sets  



  Adding terms from native metadata vocabulary   

{
type:"native_metadata_terms",
list:["Assay","Sample",30002],
},

"list" contains native metadata term names or IDs; these terms will show up in the metadata color map when the hub is loaded.



  Comments  
Any line starting with '#' is treated as a comment and is not parsed.

A comment must occupy a full line: it cannot follow a JSON statement on the same line, and it cannot have space characters preceding it.

The '#' comment is a feature designed to help users maintain their hub files. Technically, a '#' comment is not legitimate JSON syntax, so it should be avoided if your hub file will be handled by any third-party software.
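If a commented hub file does need to go through a standard JSON parser, the comment lines can be removed first. The Python sketch below follows the rule described above (only lines whose very first character is '#' are comments). It is not the Browser's own parser: load_hub_text is a hypothetical helper, the sample text is made up, and the sketch assumes the rest of the file is strict JSON (quoted keys, no trailing commas) with a list at the top level.

```python
import json


def load_hub_text(text):
    """Drop full-line '#' comments, then parse the remainder as JSON.

    A line counts as a comment only if '#' is its first character,
    so an indented '#' or a '#' inside a URL is left untouched.
    """
    kept = [line for line in text.splitlines() if not line.startswith("#")]
    return json.loads("\n".join(kept))


sample = """# my hub file
[
# first track
{"type": "bedgraph", "url": "http://example.org/a.gz#frag"}
]
"""

hub = load_hub_text(sample)
print(hub[0]["type"], hub[0]["url"])
```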



  Using JSON hubs  

To submit a track hub to the browser, click "CustomTK" > "+ Add new custom tracks" > "DataHub", enter the hub file URL, and click submit.

(You need to switch the selection box to "json" to submit a JSON hub.)

The hub can also be loaded via the "datahub_jsonfile" URL parameter:

http://epigenomegateway.wustl.edu/browser/?genome=hg19&datahub_jsonfile=http://vizhub.wustl.edu/hubSample/hg19/hub.json
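Links of this form can also be generated from a script. The following Python sketch rebuilds the URL above from its parts; the function name hub_link is made up for illustration, and only the "genome" and "datahub_jsonfile" parameters shown in this post are used.

```python
from urllib.parse import urlencode

BROWSER = "http://epigenomegateway.wustl.edu/browser/"


def hub_link(genome, hub_url):
    """Build a Browser link that loads a JSON hub on the given genome."""
    # Keep ':' and '/' unescaped so the hub URL reads as in the example above.
    query = urlencode({"genome": genome, "datahub_jsonfile": hub_url}, safe=":/")
    return f"{BROWSER}?{query}"


link = hub_link("hg19", "http://vizhub.wustl.edu/hubSample/hg19/hub.json")
print(link)
```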