A twoBit file is a highly efficient way to store genomic sequence.
The format is defined here. To complete the steps below you
will need to download the faToTwoBit, twoBitInfo, and twoBitToFa utilities. For more information
on downloading our command line utilities, please see these
instructions.
To create a twoBit file, follow these steps:
- Prepare the sequence for your twoBit file in a FASTA-formatted file (e.g. genome.fa).
- Run the faToTwoBit program on your FASTA file.
faToTwoBit genome.fa genome.2bit
- Use twoBitInfo to verify the sequences in this assembly and to create a chrom.sizes file, which is needed in later processing to construct the big* files:
twoBitInfo genome.2bit stdout | sort -k2rn > genome.chrom.sizes
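As an optional cross-check that does not depend on the UCSC utilities, the sequence lengths can also be computed directly from the FASTA file and compared with genome.chrom.sizes. The following is a minimal sketch using only awk and sort; the function name fasta_sizes is hypothetical, and it assumes that the sequence name is the first word of each header line.

```shell
# Hypothetical helper (not a UCSC utility): print "name<TAB>length" for each
# FASTA record, largest sequence first, in the same layout as chrom.sizes.
fasta_sizes() {
  awk '
    { sub(/\r$/, "") }                      # tolerate Windows line endings
    /^>/ {
      if (name != "") printf "%s\t%d\n", name, len
      name = substr($1, 2)                  # first word after ">" (assumed name)
      len = 0
      next
    }
    { len += length($0) }                   # count bases on sequence lines
    END { if (name != "") printf "%s\t%d\n", name, len }
  ' "$1" | sort -k2rn
}
```

For example, `fasta_sizes genome.fa | diff - genome.chrom.sizes` should print nothing if the two agree.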
The twoBit commands can also read a .2bit file from a URL. The -udcDir option sets the directory used for the local cache of remote data:
twoBitInfo -udcDir=. http://your-website.edu/~user/genome.2bit stdout | sort -k2nr > genome.chrom.sizes
Sequence can be extracted from the .2bit file with the twoBitToFa command, for example:
twoBitToFa -seq=chr1 -udcDir=. http://your-website.edu/~user/genome.2bit stdout > genome.chr1.fa
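twoBitToFa can also extract part of a sequence with its -start and -end options, which take a zero-based start and a non-inclusive end. The sketch below wraps that conversion for coordinates given as 1-based and inclusive. The function name extract_region and the TWOBITTOFA override variable are hypothetical conveniences, not part of the UCSC tools, and the example assumes a local genome.2bit file.

```shell
# Hypothetical wrapper: extract bases START..END (1-based, inclusive) of SEQ.
# TWOBITTOFA may be set to point at a different twoBitToFa binary (assumption).
extract_region() {
  twobit=$1; seq=$2; start=$3; end=$4; out=$5
  # 1-based inclusive START becomes zero-based -start; END is unchanged
  # because -end is non-inclusive.
  "${TWOBITTOFA:-twoBitToFa}" -seq="$seq" -start="$((start - 1))" -end="$end" \
    "$twobit" "$out"
}
```

For example, `extract_region genome.2bit chr1 101 200 stdout > chr1.101-200.fa` would request the 100 bases from position 101 through 200 of chr1.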