Use a regular expression to write every line whose gene type is miRNA, rRNA, or snRNA to the file protin_coding.txt
import re
Gene = {}
pattern = re.compile(r'gene_type=((mi|r|sn)RNA);')
with open("gencode.v33lift37_gene.gff3", "r") as File:
    for line in File:
        m = pattern.search(line)
        if m is not None:
            genetype = m.group(1)
            # (I don't know how to continue from here)
with open("protin_coding.txt", "w") as out:
    for i in sorted(Gene.items(), key=lambda x: x[1], reverse=True):
        print('{0}\t{1}'.format(i[0], i[1]))
        print('{0}\t{1}'.format(i[0], i[1]), file=out)
Gene={} is a dictionary, so what do you want to use as its keys and values?
If you just want to copy the whole line, it should be:
import re
Gene = []
pattern = re.compile(r'gene_type=((mi|r|sn)RNA);')
with open("gencode.v33lift37_gene.gff3", "r") as File:
    for line in File:
        m = pattern.search(line)
        if m is not None:
            Gene.append(line.strip())
with open("protin_coding.txt", "w") as out:
    for v in sorted(Gene, reverse=True):
        print(v)
        print(v, file=out)
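If you want to check the filter before running it on the full GFF3 file, here is a minimal sketch that applies the same idea to a few in-memory lines. The function name `filter_lines` and the sample records are made up for illustration (they are shortened, not real GENCODE lines), and the non-capturing group `(?:...)` is just a small variation on the pattern above.

```python
import re

# Matches the gene_type attribute only when its value is exactly miRNA, rRNA, or snRNA.
PATTERN = re.compile(r'gene_type=((?:mi|r|sn)RNA);')

def filter_lines(lines):
    """Return the stripped lines whose gene_type is miRNA, rRNA, or snRNA, in input order."""
    return [line.strip() for line in lines if PATTERN.search(line)]

# Hypothetical sample records, shortened for illustration (not real GENCODE lines).
sample = [
    "chr1\tHAVANA\tgene\t1\t100\t.\t+\t.\tID=g1;gene_type=protein_coding;gene_name=A\n",
    "chr1\tENSEMBL\tgene\t200\t300\t.\t-\t.\tID=g2;gene_type=miRNA;gene_name=B\n",
    "chr1\tENSEMBL\tgene\t400\t500\t.\t+\t.\tID=g3;gene_type=Mt_rRNA;gene_name=C\n",
    "chr2\tENSEMBL\tgene\t600\t700\t.\t+\t.\tID=g4;gene_type=snRNA;gene_name=D\n",
]

kept = filter_lines(sample)
for line in kept:
    print(line)
```

Because the pattern starts at `gene_type=`, a value such as `Mt_rRNA` is skipped. If you also want those lines, the pattern would have to be loosened on purpose.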
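I'm not sure what output format you want in the end, so I saved both the matched field and the full text line. If you need a different format, you can get it by changing the last line of the code: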
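import re
Gene = {}
# Anchor on gene_type= so that other text containing these substrings
# (for example a gene_type such as Mt_rRNA) is not matched
pattern = re.compile(r'gene_type=((mi|r|sn)RNA);')
with open("gencode.v33lift37_gene.gff3", "r") as File:
    for line in File:
        m = pattern.search(line)
        if m is not None:
            genetype = m.group(1)
            # key: the full line, value: its gene type
            Gene[line.strip()] = genetype
with open("protin_coding.txt", "w") as out:
    # Sort by gene type in descending order and write "type<TAB>line"
    for i in sorted(Gene.items(), key=lambda x: x[1], reverse=True):
        out.write(f'{i[1]}\t{i[0]}\n')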
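Another way to answer the "what should be the key and the value" question is to flip the dictionary around: use the gene type as the key and a list of matching lines as the value. The sketch below is only one possible layout, not necessarily what the asker needs. The helper names `group_by_type` and `format_groups` and the sample lines are hypothetical.

```python
import re
from collections import defaultdict

PATTERN = re.compile(r'gene_type=((?:mi|r|sn)RNA);')

def group_by_type(lines):
    """Map each matched gene type to the list of stripped lines that carry it."""
    groups = defaultdict(list)
    for line in lines:
        m = PATTERN.search(line)
        if m is not None:
            groups[m.group(1)].append(line.strip())
    return groups

def format_groups(groups):
    """Build 'type<TAB>line' rows, with gene types in alphabetical order."""
    rows = []
    for genetype in sorted(groups):
        for line in groups[genetype]:
            rows.append(f"{genetype}\t{line}")
    return rows

# Hypothetical sample records, shortened for illustration (not real GENCODE lines).
sample = [
    "chr1\t.\tgene\t1\t10\t.\t+\t.\tID=g1;gene_type=miRNA;gene_name=A\n",
    "chr1\t.\tgene\t20\t30\t.\t+\t.\tID=g2;gene_type=snRNA;gene_name=B\n",
    "chr1\t.\tgene\t40\t50\t.\t-\t.\tID=g3;gene_type=miRNA;gene_name=C\n",
    "chr1\t.\tgene\t60\t70\t.\t-\t.\tID=g4;gene_type=protein_coding;gene_name=D\n",
]

groups = group_by_type(sample)
rows = format_groups(groups)
for row in rows:
    print(row)
```

With this layout, `len(groups["miRNA"])` gives the count per type directly, and writing the result to protin_coding.txt is just a loop over `rows` with `out.write(row + "\n")`.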