Input file format (tab-separated)
seq_1     atgagagataac
seq_2     tagcgattcaca


Output file format (FASTA)
>seq_1
atgagagataac
>seq_1_recom
gttatctctcat

>seq_2
tagcgattcaca
>seq_2_recom
tgtgaatcgcta

$ while read -r id seq; do echo -e ">$id\n$seq"; recom_seq=$(echo "$seq" | rev | tr "atgc" "tacg"); recom_id="${id}_recom"; echo -e ">$recom_id\n$recom_seq"; done < "sequence file" > output.fasta
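
The one-liner is easier to read and reuse when split into functions. This is only a sketch of the same rev + tr idea, not a replacement for the command above. The function names revcomp and tsv_to_fasta_with_revcomp and the file name sequences.tsv are made up for illustration, and handling uppercase bases is an added assumption (the original command only translates lowercase atgc).

```shell
# Hypothetical helper: reverse-complement one DNA string.
# Assumption: uppercase bases are translated too (the original handles lowercase only).
revcomp() {
    printf '%s\n' "$1" | rev | tr 'acgtACGT' 'tgcaTGCA'
}

# Hypothetical helper: read "id<TAB>seq" lines from stdin and print each
# sequence followed by its reverse complement as FASTA records.
tsv_to_fasta_with_revcomp() {
    tab=$(printf '\t')
    while IFS="$tab" read -r id seq; do
        # Skip blank lines so they do not produce empty records.
        [ -z "$id" ] && continue
        printf '>%s\n%s\n' "$id" "$seq"
        printf '>%s_recom\n%s\n' "$id" "$(revcomp "$seq")"
    done
}
```

Usage, assuming the input is saved as sequences.tsv:

$ tsv_to_fasta_with_revcomp < sequences.tsv > output.fasta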


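Example (first 1300 lines only, skipping lines that contain gggg, with the ID in column 2 and the sequence in column 8):
$ awk 'NR<=1300' "sequence file" | grep -v gggg | cut -f2,8 | while read -r id seq; do echo -e ">$id\n$seq"; recom_seq=$(echo "$seq" | rev | tr "atgc" "tacg"); recom_id="${id}_recom"; echo -e ">$recom_id\n$recom_seq"; done > output.fasta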
