This track shows pseudogenes identified by the Yale Pseudogene Pipeline.
Pseudogenes are defined in this analysis as genomic sequences that are
similar to known genes but carry inactivating disablements (e.g., premature
stop codons or frameshifts) in their "putative" protein coding regions.
Pseudogenes are flagged as either recently processed, recently duplicated,
or of uncertain origin (either ancient fragments or resulting from a
Briefly, the protein sequences of known human genes (as annotated by Ensembl Release
60) were used to search the human genome for similar sequences that do not overlap known genes.
Matching sequences were then assessed as disabled copies of genes
based on the occurrence of premature stop codons or frameshifts. The
intron-exon structure of the functional gene was further used to infer
whether a pseudogene was recently duplicated or processed. A duplicated
pseudogene retains the intron-exon structure of its parent functional
gene, whereas a processed pseudogene shows evidence that the introns
have been spliced out. Small pseudogene sequences that cannot be confidently
assigned to either the processed or duplicated category may be ancient
fragments. Further details are in the references below.
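The classification logic described above can be illustrated with a minimal Python sketch. This is not the Yale Pseudogene Pipeline itself: the function names, the inputs (a candidate coding sequence, alignment gap lengths, intron counts) and the rule that a parent without introns yields an uncertain call are all assumptions made for illustration.

```python
from typing import Iterable

STOP_CODONS = {"TAA", "TAG", "TGA"}


def has_premature_stop(cds: str) -> bool:
    """Return True if an in-frame stop codon occurs before the last codon."""
    cds = cds.upper()
    # Split into complete codons, reading in frame from position 0.
    codons = [cds[i:i + 3] for i in range(0, len(cds) - 2, 3)]
    return any(codon in STOP_CODONS for codon in codons[:-1])


def has_frameshift(gap_lengths: Iterable[int]) -> bool:
    """Return True if any alignment gap is not a multiple of three bases."""
    return any(length % 3 != 0 for length in gap_lengths)


def classify_pseudogene(parent_introns: int, retained_introns: int,
                        is_fragment: bool) -> str:
    """Hypothetical rule set mirroring the categories in the text.

    Assumption: if the parent gene has no introns, its structure cannot
    separate the two categories, so the call is 'uncertain'.
    """
    if is_fragment or parent_introns == 0:
        return "uncertain"
    if retained_introns > 0:
        return "duplicated"  # intron-exon structure of the parent retained
    return "processed"       # introns appear to have been spliced out


if __name__ == "__main__":
    candidate = "ATGTAAGGGTGA"  # made-up sequence with an early TAA
    disabled = has_premature_stop(candidate) or has_frameshift([3, 6])
    print(disabled, classify_pseudogene(4, 0, False))
```

In this sketch a candidate counts as disabled if either check fires, and the origin call is made separately from the intron counts, which keeps the two questions (is it disabled, and how did it arise) independent.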
These data were generated by the pseudogene annotation group in the
Gerstein Lab at Yale University.
More information is available from
Zhang Z, Harrison PM, Liu Y, Gerstein M.
Millions of years of evolution preserved: a comprehensive catalog of the processed pseudogenes in
the human genome.
Genome Res. 2003 Dec;13(12):2541-58.
PMID: 14656962; PMC: PMC403796
Zheng D, Zhang Z, Harrison PM, Karro J, Carriero N, Gerstein M.
Integrated pseudogene annotation for human chromosome 22: evidence for transcription.
J Mol Biol. 2005 May 27;349(1):27-45.