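As shown in Figure 1 below, the code and box type columns are the raw data. For example, 1002-2:A,3:D means that code 1002 used 2 boxes of type A and 3 boxes of type D. I want to total up how many boxes of each type were used. How can I do this in Python? Any guidance would be much appreciated!
The desired result is shown in Figure 2 below.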
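Read the data with pandas, process the strings, and then save the result as CSV. Sample code (it builds the data inline and prints the totals rather than writing a file):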
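import pandas as pd

# Sample data: each '箱形' entry is a comma-separated list of "count:type" pairs
df = pd.DataFrame({'编码': ['1001', '1002', '1003'], '箱形': ['1:A,2:B,5:C', '2:A,3:D', '1:C,1:D']})
# Split each entry into a list of "count:type" strings
a = df['箱形'].str.split(',').tolist()
d = {}  # box type -> total count
for x in a:
    for y in x:
        m = y.split(':')  # m[0] is the count, m[1] is the box type
        if m[1] in d:
            d[m[1]] += int(m[0])
        else:
            d[m[1]] = int(m[0])
df1 = pd.DataFrame(d.items())
print(df1)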
Output:
F:\2021\qa\ot2>t9
   0  1
0  A  3
1  B  2
2  C  6
3  D  4
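The same totals can also be computed without an explicit loop. The following is only a sketch, assuming the same inline sample data as above; the column names `count` and `box_type` and the variable `csv_text` are made up for illustration. It uses `Series.explode` to get one row per "count:type" item, then `groupby` to sum the counts, and finally produces the CSV text mentioned in the answer.

```python
import pandas as pd

# Same sample data as in the answer above
df = pd.DataFrame({
    "编码": ["1001", "1002", "1003"],
    "箱形": ["1:A,2:B,5:C", "2:A,3:D", "1:C,1:D"],
})

# One row per "count:type" item
items = df["箱形"].str.split(",").explode()

# Split each item into two columns (names are illustrative)
parts = items.str.split(":", expand=True)
parts.columns = ["count", "box_type"]
parts["count"] = parts["count"].astype(int)

# Sum the counts for each box type
totals = parts.groupby("box_type")["count"].sum().reset_index()
print(totals)

# With no path argument, to_csv returns the CSV as a string;
# pass a file name instead to write it to disk.
csv_text = totals.to_csv(index=False)
print(csv_text)
```

To write a file instead of a string, a call such as `totals.to_csv("box_totals.csv", index=False)` should work; the file name here is just an example.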
Reference code:
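import re

# Each entry has the form "code-count:type,count:type,..."
li = [
    "1001-3:A,1:B",
    "1002-2:A,3:D",
    "1003-6:C,4:D",
    "1004-5:B,7:A"
]
dic = {}  # box type -> total count
for v in li:
    # Find every (count, type) pair; the type is a single uppercase letter
    res = re.findall(r"(\d+):([A-Z])", v)
    for n, k in res:
        dic[k] = dic.get(k, 0) + int(n)
for k, v in dic.items():
    print(k, ":", v)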
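One thing to watch for: the pattern `[A-Z]` matches only a single uppercase letter, so a box type such as "AB" would be counted as "A". If box types can be longer, a plain string split with `collections.Counter` is one alternative. This is a rough sketch, assuming every entry has the form "code-count:type,..." as in the list above; the function name `count_boxes` is made up for illustration.

```python
from collections import Counter

records = [
    "1001-3:A,1:B",
    "1002-2:A,3:D",
    "1003-6:C,4:D",
    "1004-5:B,7:A",
]

def count_boxes(lines):
    """Sum the quantity of each box type across all entries."""
    totals = Counter()
    for line in lines:
        # Drop the leading code ("1001-") and keep the box list
        _, _, boxes = line.partition("-")
        for item in boxes.split(","):
            qty, box_type = item.split(":")
            totals[box_type.strip()] += int(qty)
    return totals

totals = count_boxes(records)
for box_type, qty in sorted(totals.items()):
    print(f"{box_type}: {qty}")
```

Unlike the regex version, this raises a `ValueError` on a malformed item instead of silently skipping it, which may or may not be what you want.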
If this helps, please accept the answer. Thanks!