This article shows how to merge the exon positions that belong to the same transcript in a merged.gtf file.
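The merged.gtf file records every exon on its own line. The script below uses this file to collect all exon positions of each transcript onto a single line.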
Example of a merged.gtf file (fields are tab-separated in the actual file):
Chr00 Cufflinks exon 37990 38333 . + . gene_id "XLOC_000001"; transcript_id "TCONS_00000001"; exon_number "1"; gene_name "MD00G1000200"; oId "CUFF.2.1"; nearest_ref "mRNA:MD00G1000200"; class_code "j"; tss_id "TSS1";
Chr00 Cufflinks exon 38607 38710 . + . gene_id "XLOC_000001"; transcript_id "TCONS_00000001"; exon_number "2"; gene_name "MD00G1000200"; oId "CUFF.2.1"; nearest_ref "mRNA:MD00G1000200"; class_code "j"; tss_id "TSS1";
Chr00 Cufflinks exon 38814 38898 . + . gene_id "XLOC_000001"; transcript_id "TCONS_00000001"; exon_number "3"; gene_name "MD00G1000200"; oId "CUFF.2.1"; nearest_ref "mRNA:MD00G1000200"; class_code "j"; tss_id "TSS1";
Chr00 Cufflinks exon 42611 42713 . + . gene_id "XLOC_000001"; transcript_id "TCONS_00000001"; exon_number "4"; gene_name "MD00G1000200"; oId "CUFF.2.1"; nearest_ref "mRNA:MD00G1000200"; class_code "j"; tss_id "TSS1";
Chr00 Cufflinks exon 42906 43203 . + . gene_id "XLOC_000001"; transcript_id "TCONS_00000001"; exon_number "5"; gene_name "MD00G1000200"; oId "CUFF.2.1"; nearest_ref "mRNA:MD00G1000200"; class_code "j"; tss_id "TSS1";
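The ninth column packs its attributes as key "value"; pairs. As a minimal sketch of how they could be pulled into a hash, assuming every value is double-quoted as in the sample above (the helper name parse_attributes is hypothetical and not part of the original script):

```perl
use strict;
use warnings;

# Hypothetical helper: turn a GTF attribute string into a hash reference.
# Assumes every value is double-quoted, as in the Cufflinks sample.
sub parse_attributes {
    my ($attr) = @_;
    my %pairs;
    # Each pair looks like: key "value";
    while ($attr =~ /(\w+) "([^"]*)";/g) {
        $pairs{$1} = $2;
    }
    return \%pairs;
}

my $attr = 'gene_id "XLOC_000001"; transcript_id "TCONS_00000001"; exon_number "1"; gene_name "MD00G1000200";';
my $info = parse_attributes($attr);
print join("\t", $info->{gene_id}, $info->{gene_name}, $info->{transcript_id}), "\n";
```

A key that is absent from a line, such as gene_name, simply yields an undefined hash entry, so callers should check with defined before using it.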
Example of the output file:
Chr00 + XLOC_000001 MD00G1000200 TCONS_00000001 exon 37990-38333 38607-38710 38814-38898 42611-42713 42906-43203
Chr00 + XLOC_000001 MD00G1000200 TCONS_00000002 exon 38005-38333 38607-38710 38814-38898 42611-42726 42906-43167
Chr00 + XLOC_000002 MD00G1000400 TCONS_00000003 exon 50386-50877
Chr00 + XLOC_000003 MD00G1000500 TCONS_00000004 exon 76659-76991 77468-77544 77649-77715 77889-77970 78355-78424
Chr00 + XLOC_000004 MD00G1000600 TCONS_00000005 exon 101951-102138 102228-102398 102957-103004 103099-103138 103227-103327
Chr00 + XLOC_000004 MD00G1000600 TCONS_00000006 exon 102003-102138 102228-102398 102957-103004 103099-103138 103227-103327
Chr00 + XLOC_000005 MD00G1000700 TCONS_00000007 exon 105542-105626 105926-106541 108356-108832
Chr00 + XLOC_000005 MD00G1000700 TCONS_00000009 exon 105542-105626 105926-106541 108902-109696
Chr00 + XLOC_000005 MD00G1000700 TCONS_00000008 exon 105542-105626 105926-106541 108949-109696
Chr00 + XLOC_000006 MD00G1001100 TCONS_00000010 exon 276592-277221 280928-280975
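Column 1 is the chromosome; column 2 is the strand (+ or -); column 3 is the gene_id; column 4 is the gene_name; column 5 is the transcript ID; column 6 is the feature type (exon). Each remaining column gives the start-end position of one exon.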
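To illustrate this layout, here is a small sketch that reads one output line, splits off the six leading columns, and sums the exon lengths. The function name summarize_transcript is hypothetical, and the length formula assumes GTF's 1-based, inclusive coordinates:

```perl
use strict;
use warnings;

# Hypothetical helper: return (transcript ID, exon count, total exon length)
# for one tab-separated line of the output file.
sub summarize_transcript {
    my ($line) = @_;
    chomp $line;
    # Six fixed columns, then one start-end column per exon
    my ($chr, $strand, $gene_id, $gene_name, $trans_id, $feature, @ranges) = split(/\t/, $line);
    my $total = 0;
    for my $range (@ranges) {
        my ($start, $end) = split(/-/, $range);
        $total += $end - $start + 1;    # inclusive coordinates
    }
    return ($trans_id, scalar(@ranges), $total);
}

my $line = join("\t", "Chr00", "+", "XLOC_000006", "MD00G1001100",
                "TCONS_00000010", "exon", "276592-277221", "280928-280975");
my ($id, $n_exons, $length) = summarize_transcript($line);
# TCONS_00000010 has 2 exons: 630 + 48 = 678 bp
print "$id\t$n_exons\t$length\n";
```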
The code is as follows:
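#!/usr/bin/perl
use strict;
use warnings;
use Getopt::Long;
use Cwd qw(abs_path);

my $version = "1.2";

## Parse command-line options: --gtf <merged.gtf> --od <output directory>
my %opts;
GetOptions(\%opts, "gtf=s", "od=s", "h");
if ($opts{h} || !defined $opts{gtf} || !defined $opts{od}) {
    die "Usage: perl $0 --gtf merged.gtf --od output_dir\n";
}

## Create the output directory first, then resolve its absolute path
my $od = $opts{od};
mkdir $od unless (-d $od);
$od = abs_path($od);

## The result is written to <od>/merged.gtf, so refuse to overwrite the input
my $input = abs_path($opts{gtf});
die "output would overwrite $opts{gtf}; choose another --od\n"
    if (defined $input && $input eq "$od/merged.gtf");

## Pass 1: write the needed fields of every exon line to merged.tpm
open(IN, "<", $opts{gtf}) || die "open file $opts{gtf} failed.";
open(OUT, ">", "$od/merged.tpm") || die "open file $od/merged.tpm failed.";
while (<IN>) {
    next if (/^#/);
    chomp;
    # GTF columns: seqname, source, feature, start, end, score, strand, frame, attributes
    my ($chr, $source, $feature, $start, $end, $score, $strand, $frame, $attr) = split("\t", $_);
    next unless (defined $attr && $feature eq "exon");
    my ($gene_id)   = $attr =~ /gene_id "([^"]*)"/;
    my ($trans_id)  = $attr =~ /transcript_id "([^"]*)"/;
    my ($gene_name) = $attr =~ /gene_name "([^"]*)"/;
    next unless (defined $gene_id && defined $trans_id);
    $gene_name = "-" unless (defined $gene_name);    # placeholder when gene_name is absent
    print OUT join("\t", $chr, $feature, $start, $end, $strand, $gene_id, $gene_name, $trans_id) . "\n";
}
close(IN);
close(OUT);

## Pass 2: join the exon ranges of consecutive lines that share a transcript ID
open(IN, "<", "$od/merged.tpm") || die "open file $od/merged.tpm failed.";
open(OUT, ">", "$od/merged.gtf") || die "open file $od/merged.gtf failed.";
my $cmd = "";    # output line being built for the current transcript
my $key = "";    # transcript ID of the current output line
while (<IN>) {
    chomp;
    my ($chr, $feature, $start, $end, $strand, $gene_id, $gene_name, $trans_id) = split("\t", $_);
    if ($key eq $trans_id) {
        $cmd .= "\t" . $start . "-" . $end;
    } else {
        print OUT $cmd . "\n" if ($cmd ne "");
        $key = $trans_id;
        $cmd = join("\t", $chr, $strand, $gene_id, $gene_name, $trans_id, $feature, $start . "-" . $end);
    }
}
print OUT $cmd . "\n" if ($cmd ne "");    # flush the last transcript
close(IN);
close(OUT);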
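The script above relies on all exons of a transcript appearing on consecutive lines. As an alternative sketch, not the author's method, the exons could be grouped in a hash keyed by transcript ID in a single pass, which avoids the intermediate file. The function name collect_exons, the "-" placeholder for a missing gene_name, and sorting exons by start position are all assumptions made for illustration:

```perl
use strict;
use warnings;

# Hypothetical single-pass variant: group exon ranges by transcript ID.
# Takes an open filehandle on a GTF file and returns the output lines.
sub collect_exons {
    my ($fh) = @_;
    my (%record, @order);
    while (my $line = <$fh>) {
        next if $line =~ /^#/;
        chomp $line;
        my ($chr, $source, $feature, $start, $end, $score, $strand, $frame, $attr) = split(/\t/, $line);
        next unless defined $attr && $feature eq 'exon';
        my ($gene_id)   = $attr =~ /gene_id "([^"]*)"/;
        my ($trans_id)  = $attr =~ /transcript_id "([^"]*)"/;
        my ($gene_name) = $attr =~ /gene_name "([^"]*)"/;
        next unless defined $gene_id && defined $trans_id;
        $gene_name = '-' unless defined $gene_name;    # assumed placeholder
        unless (exists $record{$trans_id}) {
            push @order, $trans_id;                    # remember first-seen order
            $record{$trans_id} = {
                head  => [$chr, $strand, $gene_id, $gene_name, $trans_id, $feature],
                exons => [],
            };
        }
        push @{ $record{$trans_id}{exons} }, [$start, $end];
    }
    my @rows;
    for my $id (@order) {
        # Sort exons by start so the result does not depend on line order
        my @sorted = sort { $a->[0] <=> $b->[0] } @{ $record{$id}{exons} };
        push @rows, join("\t", @{ $record{$id}{head} }, map { $_->[0] . "-" . $_->[1] } @sorted);
    }
    return @rows;
}

# Usage (file names are examples): perl collect_exons.pl merged.gtf > merged.exon.txt
if (@ARGV) {
    open(my $in, '<', $ARGV[0]) or die "open file $ARGV[0] failed: $!";
    print "$_\n" for collect_exons($in);
    close($in);
}
```

The trade-off is memory: every exon is held in the hash until the end of the file, whereas the two-pass script only keeps one transcript at a time.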