CGAP SAGE Track Settings
CGAP Long SAGE   (All mRNA and EST tracks)

Data coordinates converted via liftOver from: Mar. 2006 (NCBI36/hg18)
Data last updated: 2010-12-16



This track displays genomic mappings for human LongSAGE tags from The Cancer Genome Anatomy Project. SAGE (Serial Analysis of Gene Expression) [Velculescu 1995] is a quantitative technique for measuring gene expression. For a brief overview of SAGE, see the CGAP SAGE information page.

Display Conventions and Configuration

Genomic mappings of 17-base LongSAGE tags are displayed. Tag counts are normalized to tags per million (TPM) in each tissue or library. Tags with higher TPM are more darkly shaded. The CATG restriction site before the start of the tag is rendered as a thick line; the 17 bases of the tag are drawn as a thinner line. Thus the thin end of the tag points in the direction of transcription. The track display modes are:

  • dense - Draws locations of mapped tags on a single line.
  • squish - Draws one item per tag per library without labels.
  • pack - Draws one item per tag per tissue with labels. The label includes the number of libraries of each tissue type containing the tag. Clicking on an item lists the libraries containing the tag, with the libraries from the selected tissue in bold. Clicking on a library in the list displays detailed information about that library.
  • full - Draws one item per tag per library. Clicking on an item displays information about the library, along with other libraries containing the tag.

The track can be configured to display only tags from a selected tissue.
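As an illustration of the tags-per-million normalization mentioned above, the following minimal sketch scales the raw tag counts of one library so that they sum to one million. The function name and the example counts are hypothetical and are not taken from the CGAP pipeline.

```python
def tags_per_million(tag_counts):
    """Scale raw tag counts from one library to tags per million (TPM)."""
    total = sum(tag_counts.values())
    if total == 0:
        # An empty library has no meaningful TPM; report zeros.
        return {tag: 0.0 for tag in tag_counts}
    return {tag: count * 1_000_000 / total for tag, count in tag_counts.items()}


# Hypothetical raw counts for a single library (illustrative values only).
library = {
    "AAAAAAAAAAAAAAAAA": 30,
    "CCCCCCCCCCCCCCCCC": 10,
    "GGGGGGGGGGGGGGGGG": 60,
}
tpm = tags_per_million(library)
```

Because each library is scaled by its own total, TPM values can be compared across libraries of different sequencing depth.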


Tag and library data, along with genomic mappings, were obtained from The Cancer Genome Anatomy Project.

Information about the various SAGE libraries, data downloads and other tools for exploring and analyzing these data is available from the CGAP SAGE Genie web site.

Mapping SAGE tags to the human genome

The goal of the SAGE tag mapping is to identify the genomic loci of the associated mRNAs. Since it is impossible to disambiguate tags that map to multiple loci, only unique genomic mappings are kept. To compensate for polymorphisms between the reference genome and the mRNA libraries, the mapping algorithm takes SNPs into account.

For each position in the genome, on both strands, all possible 21-mers were considered, given all combinations of SNP alleles. Those 21-mers beginning with CATG (the 4-base restriction site followed by a 17-base tag) were generated for use in mapping. Only 21-mers that were unique across the genome were used in placing SAGE tags.
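The enumeration described above could be sketched roughly as follows. This is a simplified illustration, not the CGAP implementation: it assumes a small in-memory reference string, a hypothetical `snps` dictionary mapping 0-based positions to sets of alternate alleles, and it treats a 21-mer as unique when it arises at exactly one (start, strand) locus.

```python
from collections import defaultdict
from itertools import product

COMPLEMENT = str.maketrans("ACGT", "TGCA")
KMER_LEN = 21   # 4-base CATG site + 17-base LongSAGE tag
SITE = "CATG"


def reverse_complement(seq):
    return seq.translate(COMPLEMENT)[::-1]


def window_variants(reference, snps, start):
    """Yield every 21-mer at `start` over all SNP allele combinations."""
    choices = []
    for pos in range(start, start + KMER_LEN):
        # Reference base plus any alternate alleles known at this position.
        alleles = {reference[pos]} | snps.get(pos, set())
        choices.append(sorted(alleles))
    for combo in product(*choices):
        yield "".join(combo)


def unique_catg_kmers(reference, snps):
    """Map each CATG-initial 21-mer found at exactly one locus to that locus."""
    loci = defaultdict(set)
    for start in range(len(reference) - KMER_LEN + 1):
        for kmer in window_variants(reference, snps, start):
            if kmer.startswith(SITE):
                loci[kmer].add((start, "+"))
            rc = reverse_complement(kmer)
            if rc.startswith(SITE):
                loci[rc].add((start, "-"))
    return {k: next(iter(where)) for k, where in loci.items() if len(where) == 1}


# Toy reference (hypothetical): one CATG site followed by 17 bases, plus padding.
reference = "CATG" + "ACGTACGTACGTACGTA" + "TT"
snps = {5: {"T"}}   # hypothetical C/T SNP at 0-based position 5
usable = unique_catg_kmers(reference, snps)
```

In this toy example both allelic forms of the 21-mer map to the same locus, so a tag carrying either allele would be placed there. A real genome would need an indexed or streaming approach, since enumerating every window in memory does not scale.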

Only SNPs from dbSNP with the following characteristics were used:

  • single-base
  • maps to a single genomic location
  • reference allele matches reference genome
  • does not occur in a tandem repeat
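The four criteria above amount to a simple filter over SNP records. The sketch below shows one way to express it; the `SnpRecord` fields are hypothetical names chosen for illustration and do not correspond to the actual dbSNP schema.

```python
from dataclasses import dataclass

BASES = {"A", "C", "G", "T"}


@dataclass
class SnpRecord:
    # Hypothetical record layout; field names are assumptions, not dbSNP's.
    ref_allele: str         # reference allele reported for the SNP
    alt_alleles: tuple      # alternate alleles reported for the SNP
    map_count: int          # number of genomic locations the SNP maps to
    genome_base: str        # base in the reference genome at that location
    in_tandem_repeat: bool  # whether the location lies in a tandem repeat


def is_usable(snp):
    """Return True if the SNP meets all four criteria listed above."""
    single_base = snp.ref_allele in BASES and all(
        allele in BASES for allele in snp.alt_alleles
    )
    return (
        single_base
        and snp.map_count == 1
        and snp.ref_allele == snp.genome_base
        and not snp.in_tandem_repeat
    )


# Illustrative records only.
good = SnpRecord("C", ("T",), 1, "C", False)
indel = SnpRecord("C", ("-",), 1, "C", False)
multi_mapped = SnpRecord("C", ("T",), 2, "C", False)
```

Here `is_usable(good)` is true, while the indel and the multiply mapped SNP are rejected.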

Human embryonic stem cell (ESC) library construction

Detailed information regarding the human ESC lines used in this study can be found at and in Hirst, et al. 2007. The ESC tags were generated from RNA purified from human ESCs maintained under conditions that promote their maintenance in an undifferentiated state.

A complete set of embryonic stem cell LongSAGE tags is available through the CGAP web portal.


Many thanks to Martin Hirst of Canada's Michael Smith Genome Sciences Centre for his assistance in developing this track.

The LongSAGE data and genomic mappings were provided by The Cancer Genome Anatomy Project of the National Cancer Institute, U.S. National Institutes of Health.

The human embryonic stem cell library was supported by funds from the National Cancer Institute, National Institutes of Health, under Contract No. N01-C0-12400 and by grants from Genome Canada, Genome British Columbia and the Canadian Stem Cell Network.


Velculescu VE, Zhang L, Vogelstein B, Kinzler KW. Serial analysis of gene expression. Science. 1995 Oct 20;270(5235):484-7.

Hirst M, Delaney A, Rogers SA, Schnerch A, Persaud DR, O'Connor MD, Zeng T, Moksa M, Fichter K, Mah D, et al. LongSAGE profiling of nine human embryonic stem cell lines. Genome Biol. 2007 Jun 14;8(6):R113.

Saha S, Sparks AB, Rago C, Akmaev V, Wang CJ, Vogelstein B, Kinzler KW, Velculescu VE. Using the transcriptome to annotate the genome. Nat Biotechnol. 2002 May;20(5):508-12.

Siddiqui AS, Khattra J, Delaney AD, Zhao Y, Astell C, Asano J, Babakaiff R, Barber S, Beland J, Bohacec S, et al. A mouse atlas of gene expression: Large-scale digital gene-expression profiles from precisely defined developing C57BL/6J mouse tissues and cells. Proc Natl Acad Sci U S A. 2005 Dec 20;102(51):18485-90.

Khattra J, Delaney AD, Zhao Y, Siddiqui A, Asano J, McDonald H, Pandoh P, Dhalla N, Prabhu AL, Ma K, et al. Large-scale production of SAGE libraries from microdissected tissues, flow-sorted cells, and cell lines. Genome Res. 2007 Jan;17(1):108-16.

Lal A, Lash AE, Altschul SF, Velculescu V, Zhang L, McLendon RE, Marra MA, Prange C, Morin PJ, Polyak K, et al. A public database for gene expression in human cancers. Cancer Res. 1999 Nov 1;59(21):5403-7.

Riggins GJ, Strausberg RL. Genome and genetic resources from the Cancer Genome Anatomy Project. Hum Mol Genet. 2001 Apr;10(7):663-7.

Boon K, Osorio EC, Greenhut SF, Schaefer CF, Shoemaker J, Polyak K, Morin PJ, Buetow KH, Strausberg RL, De Souza SJ, Riggins GJ. An anatomy of normal and malignant gene expression. Proc Natl Acad Sci U S A. 2002 Aug 20;99(17):11287-92.

Liang P. SAGE Genie: a suite with panoramic view of gene expression. Proc Natl Acad Sci U S A. 2002 Sep 3;99(18):11547-8.