Tuesday, September 25, 2012

v11: search track by GEO accession, and bug fixes

Get the source code from the Dropbox folder.

The version 11 code release provides a new feature that lets you search for tracks with a list of GEO accession numbers, plus a few bug fixes.


Find track by GEO accession
Follow these steps to see this new feature in action.

Click this link to open up the Browser with human genome hg19 but no data:

http://epigenomegateway.wustl.edu/browser/?genome=hg19

On the navigation bar, click the "Genome heatmap" tab:


Inside the panel, click the "Find by GEO" tab:


Enter GEO accessions. You can use a list of them here, one accession per line:


GSM469970
GSM521901
GSM521895
GSM521897
GSM521909
GSM521913
GSM469968
GSM521889
GSM608165
GSM945297
GSM788085
GSM733692
GSM607494
GSM733776
GSM945228

This function accepts only Sample accessions (names starting with GSM). Series accessions (names starting with GSE) and Platform accessions (names starting with GPL) are not accepted.
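If you keep accessions in a mixed list, it may help to separate the GSM entries before pasting them in. The following is a minimal sketch, not the Browser's own code: the function name `split_accessions` is made up for illustration, the pattern assumes a Sample accession is "GSM" followed by digits, and the GSE and GPL entries in the example are placeholders.

```python
import re

# Assumption: a GEO Sample accession is "GSM" followed by one or more digits.
GSM_PATTERN = re.compile(r"^GSM\d+$")

def split_accessions(text):
    """Split pasted text into (accepted GSM accessions, rejected entries)."""
    accepted, rejected = [], []
    for line in text.splitlines():
        entry = line.strip()
        if not entry:
            continue  # ignore blank lines
        if GSM_PATTERN.match(entry):
            accepted.append(entry)
        else:
            rejected.append(entry)
    return accepted, rejected

# Two Sample accessions from the list above, plus placeholder GSE/GPL entries.
pasted = "GSM469970\nGSE12345\nGPL570\n\nGSM521901\n"
accepted, rejected = split_accessions(pasted)
print("paste these:", accepted)
print("left out:", rejected)
```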

Press the "Search" button. After a short moment the search results appear:


Tracks are listed in the table on the right. You can press the button "Display all" to have all of them displayed.



Otherwise, you can selectively display a few by clicking on the track labels, and the selected tracks will be added to a table in the toolbox panel. After you finish your selection, press the button there to add them.

Thanks to Rebecca Lowdon for suggesting this feature!


Bug fix
A bug associated with displaying bigwig tracks as "wreath tracks" for the Henge View is corrected.

A bug affecting the parsing of mismatching base pairs in the SAM track is fixed.

Sunday, September 23, 2012

v10 minor update: bug fix and more scrolling options

Download the source code from Dropbox.

This code release fixes a bug associated with SAM track display. When read alignment data was displayed in genomic juxtaposition mode, the track image appeared shifted. This is now fixed.

In addition, you can now drag on any of the genome annotation tracks to scroll.

Try this link to open the browser and show the tracks as displayed in the screenshot below:

http://epigenomegateway.wustl.edu/browser/?genome=hg19&juxtapose=LTR&gftk=LTR,full&coordinate=chr1:1340000-1390000&custombam=stat1hela,http://vizhub.wustl.edu/hubSample/hg19/sam1.gz,thin


As illustrated in this example, genomic juxtaposition focuses the view on LTR elements and reveals that STAT1 ChIP-Seq reads pile up on one LTR copy, suggesting that this copy plays a role in STAT1 function in HeLa cells.
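To see what the link above asks the Browser to do, you can take its query string apart. This is only a sketch using the Python standard library. Reading the three comma-separated parts of "custombam" as a label, a file URL and a display mode is an assumption based on how the example looks, not a documented specification.

```python
from urllib.parse import urlsplit, parse_qsl

url = ("http://epigenomegateway.wustl.edu/browser/?genome=hg19&juxtapose=LTR"
       "&gftk=LTR,full&coordinate=chr1:1340000-1390000"
       "&custombam=stat1hela,http://vizhub.wustl.edu/hubSample/hg19/sam1.gz,thin")

# Turn "key=value&key=value" into a dictionary.
params = dict(parse_qsl(urlsplit(url).query))
for key, value in params.items():
    print(key, "=", value)

# The coordinate has the form chrom:start-stop.
chrom, span = params["coordinate"].split(":")
start, stop = (int(x) for x in span.split("-"))
print("region width:", stop - start)

# Assumed layout of custombam: label, file URL, display mode.
label, file_url, mode = params["custombam"].split(",")
print(label, file_url, mode)
```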

STAT1 ChIP-Seq data comes from this publication: Genome-wide profiles of STAT1 DNA association using chromatin immunoprecipitation and massively parallel sequencing

Sunday, September 16, 2012

v9 major update: continued bigWig support, new custom track UI, bug fix

Access source code from our public Dropbox folder.


bigWig format continues to be supported

In this code release we continue to support the bigWig file format. Users now have two options for visualizing quantitative data: bedGraph files compressed and indexed by tabix, or bigWig files.

The main advantage of bigWig over bedGraph files is that bigWig performs much faster at high-level, low-resolution browsing.

This applies when you look at data over very large genomic intervals or whole chromosomes, especially in Bird's Eye View.

When browsing at a finer scale, however, we don't observe any performance difference between the two data formats.

However, sessions saved before this change are no longer valid. We're sorry for the inconvenience.


New user interface for custom track function

Click a tab belonging to one type of custom track to see the submission panel.


Showing the panel for bedGraph tracks. Click "GO BACK" at the top to slide back.







Bug fix


A bug associated with Gene Plot is fixed. Now it correctly handles genes with only 1 exon.

A bug with getting chromosome sequence during Gene Set View is fixed. When the Browser is running Gene Set View at very fine zoom level, the chromosome sequence can be correctly shown.

Saturday, September 15, 2012

Prepare custom long-range interaction track

A sample script for converting certain UCSC ChIA-PET track files into WashU Browser track format is now available at http://epigenomegateway.wustl.edu/browser/script/, with the name "makeTrack_from_ucscChiapet.py".

To use this script, first download a ChIA-PET track file from the UCSC/ENCODE public file directory: http://hgdownload.cse.ucsc.edu/goldenPath/hg19/encodeDCC/wgEncodeGisChiaPet/

Here we use "wgEncodeGisChiaPetHct116Pol2InteractionsRep1.bed.gz" as input for the script. Only files ending with ".bed.gz" or in a similar format can be processed by this script.

Make sure the bedSort, bgzip and tabix programs are installed on your computer. On a Linux computer, run these commands:

gunzip wgEncodeGisChiaPetHct116Pol2InteractionsRep1.bed.gz

python makeTrack_from_ucscChiapet.py wgEncodeGisChiaPetHct116Pol2InteractionsRep1.bed abcd

After these two steps, two files will be generated: "abcd.gz" and "abcd.gz.tbi". Follow step 4 below to display this track via the custom track mechanism.

You will likely need to make small modifications to this script so it can process data in a different format.




0 Make sure you have the tabix program installed.
You can download the latest source and compile:
http://sourceforge.net/projects/samtools/files/tabix/

Or if you're using the Ubuntu operating system, install it using apt-get:
$ apt-get install tabix
You should have both tabix and bgzip programs available on your computer.

1 Make a text file for your long-range interaction data with the following columns:
  1. chromosome name
  2. start coordinate
  3. stop coordinate
  4. information about the interacting region (e.g. chrX:123-456,3.14, where "chrX:123-456" is the coordinate of the mate, and "3.14" is the score of the interaction)
  5. ID (unique non-negative integer)
  6. relative direction of the interacting region
Be sure to make TWO records for a pair of interacting loci, one record for each locus.


As an example, if interval "chr1:111-222" interacts with interval "chr2:333-444" with a score of 55, the following two lines represent this interaction (each "\t" stands for a tab character):

chr1   \t   111   \t   222   \t   chr2:333-444,55   \t   1   \t   .
chr2   \t   333   \t   444   \t   chr1:111-222,55   \t   2   \t   .
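The two-records-per-pair rule can be sketched in a few lines of Python. This is an illustration, not the Browser's own tooling. The function name `interaction_records` is made up, the direction column is always written as "." as in the example above, and the records are sorted by chromosome and start position because tabix indexing expects position-sorted input.

```python
def interaction_records(pairs):
    """Build tab-delimited records, two per interacting pair.

    pairs: iterable of ((chrom, start, stop), (chrom, start, stop), score).
    Returns a list of lines sorted by chromosome name and start coordinate.
    """
    rows = []
    next_id = 1
    for locus_a, locus_b, score in pairs:
        # One record for each locus, naming the other locus as its mate.
        for here, mate in ((locus_a, locus_b), (locus_b, locus_a)):
            mate_field = "%s:%d-%d,%s" % (mate[0], mate[1], mate[2], score)
            rows.append((here[0], here[1], here[2], mate_field, next_id, "."))
            next_id += 1  # every record gets its own unique ID
    rows.sort(key=lambda r: (r[0], r[1]))
    return ["\t".join(str(field) for field in row) for row in rows]

lines = interaction_records([(("chr1", 111, 222), ("chr2", 333, 444), 55)])
for line in lines:
    print(line)
```

Writing these lines to "interaction.txt" gives a file ready for the compression step.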


2 Compress the text file:
$ bgzip interaction.txt

The old file is gone and a new file "interaction.txt.gz" is there instead.

3 Build the tabix index of the compressed file:
$ tabix -p bed interaction.txt.gz

The "interaction.txt.gz" is untouched but an index file "interaction.txt.gz.tbi" is generated.

4 Display this file as a custom long-range interaction track on WashU Genome Browser.
Place both files, ".gz" and ".gz.tbi", in the SAME directory on your web server.
Use only the URL to the .gz file to make the custom track.
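Because only the .gz URL is submitted, the index has to be reachable right next to it. The sketch below shows that relationship. It assumes the index is looked up by appending ".tbi" to the track URL, which matches the file names produced in step 3. The helper names and the example.org address are hypothetical.

```python
import posixpath
from urllib.parse import urlsplit

def index_url(track_url):
    """Return the URL where the tabix index is assumed to live."""
    if not track_url.endswith(".gz"):
        raise ValueError("expected a URL ending in .gz")
    return track_url + ".tbi"

def same_directory(url_a, url_b):
    """True if both URLs point into the same directory on the same server."""
    a, b = urlsplit(url_a), urlsplit(url_b)
    return ((a.scheme, a.netloc, posixpath.dirname(a.path)) ==
            (b.scheme, b.netloc, posixpath.dirname(b.path)))

# Hypothetical server address, for illustration only.
track = "http://example.org/tracks/interaction.txt.gz"
index = index_url(track)
print(index)
print(same_directory(track, index))
```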

Sunday, September 9, 2012

v8 major update - tabix indexing, chromHMM tracks

The WashU Genome Browser has been on a fast track of change. Today we announce yet another major update.


tabix for file indexing/querying
The Browser is now using tabix to store, index and query the track files (http://samtools.sourceforge.net/tabix.shtml).

Tabix plays a role similar to UCSC's bigWig/bigBed system, but it is more generic and simpler.

Users of custom tracks need to migrate their data. Please refer to the following posts on how to convert UCSC formats to tabix format:

bigWig to tabix
bigBed to tabix
BAM to tabix




chromatin state tracks (categorical data)

Broad chromatin state data (http://compbio.mit.edu/ChromHMM/) on 9 human cell lines are now displayed. The data were obtained from the ENCODE project. To see all of them, click the link below:

http://epigenomegateway.wustl.edu/browser/?genome=hg19&hmtk=wgEncodeBroadHmmGm12878HMM,wgEncodeBroadHmmH1hescHMM,wgEncodeBroadHmmHepg2HMM,wgEncodeBroadHmmHmecHMM,wgEncodeBroadHmmHsmmHMM,wgEncodeBroadHmmHuvecHMM,wgEncodeBroadHmmK562HMM,wgEncodeBroadHmmNhekHMM,wgEncodeBroadHmmNhlfHMM&metadata=Sample
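The link above follows a regular pattern, so it can be assembled from a list of cell line names. The following sketch rebuilds it with the Python standard library. The parameter names come from the link itself. The function name `chromhmm_url` is made up, and the sketch is not an official URL builder for the Browser.

```python
from urllib.parse import urlencode

BASE = "http://epigenomegateway.wustl.edu/browser/"

# Cell line names as they appear inside the track names in the link above.
CELL_LINES = ["Gm12878", "H1hesc", "Hepg2", "Hmec", "Hsmm",
              "Huvec", "K562", "Nhek", "Nhlf"]

def chromhmm_url(cell_lines, genome="hg19"):
    """Build a Browser link that shows chromHMM tracks for the given cell lines."""
    tracks = ["wgEncodeBroadHmm%sHMM" % name for name in cell_lines]
    query = urlencode(
        {"genome": genome, "hmtk": ",".join(tracks), "metadata": "Sample"},
        safe=",",  # keep the commas between track names readable
    )
    return BASE + "?" + query

url = chromhmm_url(CELL_LINES)
print(url)
```

Passing a shorter list, for example `chromhmm_url(["K562", "Hepg2"])`, should produce a link with only those two tracks.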


The chromHMM tracks are built on a new type of track: the track with categorical data. Tracks of this type are displayed in the genome heatmap alongside all the genome-wide quantitative assay results, but show data of a categorical nature, e.g. different chromatin states.


When you invoke the configuration options on the chromatin state tracks, you will see quite different options compared with quantitative tracks: the complete list of "states" or "categories", and controls to change the color of each state.

This new feature is still under development. We're working on adding custom track support for this new track type.

Generate tabix file from SAM/BAM file

Update 6/1/2013: SAM format is no longer supported; please use BAM format files instead.

0 Make sure you have the tabix program installed.
You can download the latest source and compile:
http://sourceforge.net/projects/samtools/files/tabix/

Or if you're using the Ubuntu operating system, install it using apt-get:
$ apt-get install tabix
You should have both tabix and bgzip programs available on your computer.

1 Skip this step if you already have a SAM file.

Convert the BAM file to a SAM file using samtools:
$ samtools view input.bam > input.sam

2 Compress the SAM file:
$ bgzip input.sam

The old file is gone and a new file "input.sam.gz" is there instead.

3 Build tabix index of the compressed SAM file:
$ tabix -p sam input.sam.gz

The "input.sam.gz" is untouched but an index file "input.sam.gz.tbi" is generated.

4 Display this file as a custom SAM track on WashU Genome Browser.
Put the .gz and .gz.tbi files in the SAME directory on your web server.
Use only the URL to the .gz file to make the custom track.

Prepare custom track of annotation data (or "bed" track)

0 Make sure you have the tabix program installed.
You can download the latest source and compile:
http://sourceforge.net/projects/samtools/files/tabix/

Or if you're using the Ubuntu operating system, install it using apt-get:
$ apt-get install tabix
You should have both tabix and bgzip programs available on your computer.

1 Skip this step if your file is already in BED format.

Run the bigBedToBed command from the UCSC genome browser tool set to convert the bigBed file to a BED text file.

2 Compress the BED file:
$ bgzip input.bed

The old file is gone and a new file "input.bed.gz" is there instead.

3 Build tabix index of the compressed BED file:
$ tabix -p bed input.bed.gz

The "input.bed.gz" is untouched but an index file "input.bed.gz.tbi" is generated.

4 Display this file as a custom bed track on WashU Genome Browser.
Put the .gz and .gz.tbi files in the SAME directory on your web server.
Use only the URL to the .gz file to make the custom track.

The BED format used by WashU Epigenome Browser:
  1. chromosome name
  2. start coordinate
  3. stop coordinate
  4. Name (if absent, use dot)
  5. ID (unique non-negative integer)
  6. Strand (+/-/.)
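A quick check of each record against the six columns above can catch formatting slips before compression and indexing. The sketch below is illustrative only. The function name `check_bed_line` is made up, and treating a start larger than the stop as an error is an assumption, not a rule stated by the Browser. ID uniqueness across the whole file is not checked here.

```python
def is_uint(text):
    """True if text is a non-negative integer written in ASCII digits."""
    return text.isascii() and text.isdigit()

def check_bed_line(line):
    """Return a list of problems found in one tab-delimited record."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 6:
        return ["expected 6 tab-delimited fields, found %d" % len(fields)]
    chrom, start, stop, name, rec_id, strand = fields
    problems = []
    if not chrom:
        problems.append("chromosome name is empty")
    if not (is_uint(start) and is_uint(stop)):
        problems.append("start and stop must be non-negative integers")
    elif int(start) > int(stop):
        problems.append("start is larger than stop")  # assumed to be an error
    if not name:
        problems.append("name is empty, use a dot instead")
    if not is_uint(rec_id):
        problems.append("ID must be a non-negative integer")
    if strand not in ("+", "-", "."):
        problems.append("strand must be +, - or .")
    return problems

print(check_bed_line("chr1\t100\t200\t.\t1\t+"))
print(check_bed_line("chr1\t200\t100\tgeneA\tx\t?"))
```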

Prepare custom track of numerical data

0 Make sure you have the tabix program installed.
You can download the latest source and compile:
http://sourceforge.net/projects/samtools/files/tabix/

Or if you're using the Ubuntu operating system, install it using apt-get:
$ apt-get install tabix
You should have both tabix and bgzip programs available on your computer.

1 Skip this step if your file is bedGraph format.

For bigWig files...
Run the bigWigToBedGraph command from the UCSC genome browser tool set to convert the bigWig file to a bedgraph text file.

For wiggle files...
Convert them into bedgraph text files. Only a few lines of code are needed for this task.
Or if you really hate coding, you can convert the wiggle file to bigWig format using wigToBigWig (also from UCSC genome browser tool set), then do bigWigToBedGraph.
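For those who do not mind coding, the wiggle conversion could look like the sketch below. It handles the two wiggle layouts, fixedStep and variableStep, and shifts the 1-based wiggle positions to the 0-based start used by bedgraph. The function name `wiggle_to_bedgraph` is made up. The sketch does not merge neighbouring intervals and assumes the wiggle file is already ordered by position.

```python
def wiggle_to_bedgraph(lines):
    """Yield bedgraph lines (chrom, start, end, value) from wiggle text lines."""
    mode = chrom = None
    start = step = span = None
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith(("#", "track", "browser")):
            continue  # skip blank lines, comments and header lines
        if line.startswith(("fixedStep", "variableStep")):
            words = line.split()
            mode = words[0]
            opts = dict(word.split("=", 1) for word in words[1:])
            chrom = opts["chrom"]
            span = int(opts.get("span", 1))
            if mode == "fixedStep":
                start = int(opts["start"])
                step = int(opts.get("step", 1))
            continue
        if mode == "fixedStep":
            pos, value = start, line   # data line holds only the value
            start += step
        elif mode == "variableStep":
            pos_text, value = line.split()  # data line: position value
            pos = int(pos_text)
        else:
            raise ValueError("data line before a declaration line")
        # Wiggle positions are 1-based; bedgraph starts are 0-based.
        yield "%s\t%d\t%d\t%s" % (chrom, pos - 1, pos - 1 + span, value)

wig = [
    "track type=wiggle_0",
    "fixedStep chrom=chr1 start=101 step=10 span=5",
    "1.5",
    "2.5",
    "variableStep chrom=chr2",
    "300 7",
]
converted = list(wiggle_to_bedgraph(wig))
for row in converted:
    print(row)
```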

2 Compress the bedgraph text file:
$ bgzip input.bedgraph

The old file is gone and a new file "input.bedgraph.gz" is there instead.

3 Build tabix index of the compressed bedgraph file:
$ tabix -p bed input.bedgraph.gz

The "input.bedgraph.gz" is untouched but an index file "input.bedgraph.gz.tbi" is generated.

4 Display this file as a custom bedgraph track on WashU Genome Browser.
Put the .gz and .gz.tbi files in the SAME directory on your web server.
Use only the URL to the .gz file to make the custom track.