bash - Script for finding a longest and shortest word or string in a file?

bash - Script for finding a longest and shortest word or string in a file? - Ask Ubuntu

May 15, 2011

I am working in genomics. I have a file in FASTA format containing reads. These are genes. Each gene is called a read or contig. Each contig starts with a header, followed by letters (nucleotides, e.g. ACTG) of a specific length. I want to determine the longest contig and the shortest contig (or read, or gene) in the file. Please tell me an Ubuntu script to find such contigs. Each contig or read in FASTA format looks as follows:

>locus_1000_transcript_1/1_confidence_0.000_length_648 ftbs=645 (header)
ccgccttggtaacctcgccagcatattgagctttggatccggatggtcgtagaatggcaag
gcaggagagagtgtctaatgtggcgccgctctgtacccggggggtaacaatgaatttgcga
cgacgtggtatgcccttcgttgaaacccttattagttggagccgctatgtggcggtccaat
tatcaagtatttcccacatcttgaagcgcttctggatgtacgcatactatgggttgacgtt
agtgtagccgagatttcacagtagctccgaacggtggtagcagacgcccgttcacaaaaac

The header has a defined format that shows the gene locus and the number of genes, and there is a space between each contig or read. Each read or contig in the file starts with a header of the same type as mentioned above, but the values may differ. Each contig or read starts with a > sign. There may be contigs of the same length. – science 3 mins ago

Assuming the length values in the FASTA headers are correct, just extract them from there:

sed -nre 's/^>.*_length_([0-9]+) .*/\1/p' \

Then sort them numerically:

| sort -n \

Then output the first and the last line:

| sed -ne '1p;$p'

In one statement:

sed -nre 's/^>.*length_([0-9]+) .*/\1/p' | sort -n | sed -ne '1p;$p'
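As a quick check of the header-based pipeline, here is a minimal sketch that runs it on a made-up three-record sample. The sample data and the function names `sample_fasta` and `declared_minmax` are hypothetical, and `-E` is used as the portable spelling of the `-r` option.

```shell
# Hypothetical sample; the headers only mimic the format shown in the question.
sample_fasta() {
  cat <<'EOF'
>locus_1_transcript_1/1_confidence_0.000_length_12 ftbs=9
acgtacgtacgt
>locus_2_transcript_1/1_confidence_0.000_length_5 ftbs=2
acgta
>locus_3_transcript_1/1_confidence_0.000_length_8 ftbs=5
acgtacgt
EOF
}

# The pipeline from the answer, wrapped as a filter on standard input.
declared_minmax() {
  sed -nE 's/^>.*_length_([0-9]+) .*/\1/p' | sort -n | sed -ne '1p;$p'
}

# Prints the shortest declared length, then the longest.
sample_fasta | declared_minmax
```

On this sample the output is 5 followed by 12. If the input holds only one record, the first line is also the last line, so the same number is printed twice.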

If the lengths declared in the headers cannot be trusted, count the lengths of the FASTA sequences themselves: first convert them to unfasta, then print the line length of every second line, followed by the same sort | sed filter as above:

uf | awk 'NR%2==0 {print length}' | sort -n | sed -n '1p;$p'

where uf is a simple bash script found here.
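The linked uf script is not reproduced here. As a rough stand-in, the sketch below assumes that "unfasta" means one header line followed by one line holding the whole sequence, which is what the `NR%2==0` test relies on. The function names `unfasta` and `measured_minmax` are hypothetical and this is not the script the answer refers to.

```shell
# Hypothetical stand-in for uf: print each header on its own line and the
# joined sequence on the next, so sequences land on even-numbered lines.
unfasta() {
  awk '
    /^>/ { if (seq_open) print seq; print; seq = ""; seq_open = 1; next }
         { gsub(/[ \t\r]/, ""); seq = seq $0 }
    END  { if (seq_open) print seq }
  '
}

# Measure the sequences themselves instead of trusting the headers.
measured_minmax() {
  unfasta | awk 'NR % 2 == 0 { print length }' | sort -n | sed -ne '1p;$p'
}

# Three made-up records with sequence lengths 7, 2 and 10.
printf '>a\nacgt\nacg\n>b\nac\n>c\nacgtacgtac\n' | measured_minmax
```

The first record is deliberately split over two lines to show that wrapped sequences are joined before they are measured.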

Note: both one-liners are filters; they read input from standard input and write to standard output. Use cat to feed them from files (or wget -O - to feed them from the internet).
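The one-liners report only the two lengths, while the question asks which contigs they belong to and the comment notes that several contigs may share a length. One possible way to cover that, sketched here with a hypothetical function name `extreme_contigs`, is to measure every record in awk and list all headers that tie for the minimum or maximum.

```shell
# Reads FASTA on standard input and prints every header whose measured
# sequence length equals the minimum or the maximum; ties are all listed.
extreme_contigs() {
  awk '
    function store() {
      if (header == "") return
      n++; name[n] = header; len[n] = total
      if (n == 1 || total < min) min = total
      if (n == 1 || total > max) max = total
    }
    /^>/ { store(); header = $0; total = 0; next }
         { gsub(/[ \t\r]/, ""); total += length($0) }
    END {
      store()
      for (i = 1; i <= n; i++) if (len[i] == min) print "shortest", len[i], name[i]
      for (i = 1; i <= n; i++) if (len[i] == max) print "longest", len[i], name[i]
    }
  '
}

# Made-up input in which two records tie for the shortest length.
printf '>a\nacgt\nacg\n>b\nac\n>c\nac\n>d\nacgtacgtac\n' | extreme_contigs
```

Like the one-liners, this is a filter, so a file can be fed to it with cat, for example `cat contigs.fasta | extreme_contigs` (the file name is only an illustration).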
