Requirement: read file 1 through file 4 and output the final merged file.
import pandas as pd
# Contents of files 1 to 4, each already loaded as a dict
a = {'abc': 3, 'adb': 4, 'aer': 5}
b = {'abc': 2, 'adb': 3, 'sdf': 4}
c = {'abc': 1, 'qwe': 4, 'aer': 3}
d = {'adc': 4, 'aer': 5, 'add': 3}
df = pd.DataFrame([a,b,c,d]).fillna(0)
df
   abc  adb  aer  sdf  qwe  adc  add
0  3.0  4.0  5.0  0.0  0.0  0.0  0.0
1  2.0  3.0  0.0  4.0  0.0  0.0  0.0
2  1.0  0.0  3.0  0.0  4.0  0.0  0.0
3  0.0  0.0  5.0  0.0  0.0  4.0  3.0
df.T
       0    1    2    3
abc  3.0  2.0  1.0  0.0
adb  4.0  3.0  0.0  0.0
aer  5.0  0.0  3.0  5.0
sdf  0.0  4.0  0.0  0.0
qwe  0.0  0.0  4.0  0.0
adc  0.0  0.0  0.0  4.0
add  0.0  0.0  0.0  3.0
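Finally, just write the result out with to_csv (it is a DataFrame method, so call it as df.T.to_csv(path) rather than pd.to_csv).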
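The answer above starts from dicts that are already in memory, so here is a minimal end-to-end sketch of the same idea. It assumes each input file holds one whitespace-separated "key value" pair per line with integer values; the question does not state that format. The function names `read_counts`, `build_table` and `write_table` are made up for illustration.

```python
import pandas as pd


def read_counts(path):
    """Read one "key value" file into a dict such as {'abc': 3, 'adb': 4}."""
    counts = {}
    with open(path) as fh:
        for line in fh:
            parts = line.split()
            if len(parts) == 2:  # skip blank or malformed lines (assumption)
                counts[parts[0]] = int(parts[1])
    return counts


def build_table(paths):
    """One row per key, one column per input file, 0 where a key is absent."""
    df = pd.DataFrame([read_counts(p) for p in paths]).fillna(0)
    return df.T


def write_table(paths, out_path):
    """Build the merged table and save it as CSV."""
    # to_csv is called on the DataFrame itself.
    build_table(paths).to_csv(out_path)
```

Calling `write_table(["file1.txt", "file2.txt", "file3.txt", "file4.txt"], "out.csv")` would then produce the transposed table shown above, provided the files really are named and laid out that way.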
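import pandas as pd
import re
result = []
# Assumes the files are named file1.txt through file4.txt
for i in range(4):
    with open("file" + str(i + 1) + ".txt") as a:
        # Each non-empty line is "key value"; split it on whitespace into two columns
        rows = [re.split(r"\s+", k.strip()) for k in a if k.strip()]
        result.append(pd.DataFrame(rows, columns=["key", "data" + str(i + 1)]))
data = result[0]
for i in range(3):
    # The outer merge keeps every key seen in any file; missing values become 0
    data = data.merge(result[i + 1], on="key", how="outer").fillna(0)
data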
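One thing to note about the merge version: the values produced by re.split stay strings, so the result mixes strings with the 0 filled in by fillna. A possible variant, sketched here rather than taken from the answer, lets pd.read_csv parse the numbers and uses pd.concat on the key index in place of the chained merges. It assumes the files have no header row and that each key appears at most once per file; `merge_key_value_files` is a hypothetical name.

```python
import pandas as pd


def merge_key_value_files(paths):
    """Outer-join whitespace-separated "key value" files on their key column."""
    frames = []
    for n, path in enumerate(paths, start=1):
        # sep=r"\s+" splits on any run of whitespace; header=None because
        # the files are assumed to have no header row.
        frame = pd.read_csv(path, sep=r"\s+", header=None,
                            names=["key", "data%d" % n], index_col="key")
        frames.append(frame)
    # axis=1 aligns the frames on their index (the keys); the default
    # join="outer" keeps keys that occur in any file.
    return pd.concat(frames, axis=1).fillna(0)
```

The returned frame has one row per key and columns data1 to data4, so it can be saved directly with its to_csv method.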