I have the following text data:
chr1 12176 12178 region1
chr1 12178 12182 region2
chr1 12182 12194 region3
chr1 12194 12215 region4
chr1 12215 12232 region5
chr1 12232 12233 region6
chr1 12235 12238 region7
chr1 12238 12242 region8
chr1 12242 12250 region9
chr1 12255 12260 region10
I want to turn it into:
chr1 12176 12233 region1+region2+region3+region4+region5+region6
chr1 12235 12250 region7+region8+region9
chr1 12255 12260 region10
How should I implement this in Python?
dataList = []
with open("a.txt", "r") as f:
    for line in f:
        data = line.strip()
        if not data:
            continue  # skip blank lines
        dataList.append(data.split())

dataNewList = []
for data in dataList:
    # Merge when the region is on the same chromosome and starts
    # exactly where the previous merged region ends
    if (dataNewList
            and data[0] == dataNewList[-1][0]
            and data[1] == dataNewList[-1][2]):
        dataNewList[-1][2] = data[2]
        dataNewList[-1][3] += "+" + data[3]
    else:
        dataNewList.append(data)

# Convert to strings
strList = [" ".join(line) for line in dataNewList]
for s in strList:
    print(s)

# Note: this overwrites the input file a.txt
with open("a.txt", "w") as f:
    for s in strList:
        f.write(s + "\n")
Result: for the data in the question, a.txt ends up containing the three merged lines asked for above.
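If you want to reuse the merge step, it can also be written as a small function. This is only a rough sketch: the name `merge_adjacent` is made up for illustration, and it assumes each row has exactly four whitespace-separated fields and that rows are already sorted by start position. It compares coordinates as integers rather than strings and checks the chromosome as well.

```python
def merge_adjacent(rows):
    """Merge consecutive rows on the same chromosome where start == previous end.

    rows: iterable of [chrom, start, end, name] (start/end may be strings).
    Assumes rows are already sorted by start position.
    """
    merged = []
    for chrom, start, end, name in rows:
        start, end = int(start), int(end)
        last = merged[-1] if merged else None
        if last is not None and last[0] == chrom and last[2] == start:
            # Extend the previous region and append this region's name
            last[2] = end
            last[3] += "+" + name
        else:
            merged.append([chrom, start, end, name])
    return merged


# Small sample taken from the data in the question
sample = """chr1 12176 12178 region1
chr1 12178 12182 region2
chr1 12235 12238 region7"""

rows = [line.split() for line in sample.splitlines() if line.strip()]
for chrom, start, end, name in merge_adjacent(rows):
    print(chrom, start, end, name)
```

To run it on a file, you could build `rows` from the file's lines in the same way and write the result to a different output file, so the input is not overwritten.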