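Please accept this answer if it helps, thanks!
You can use Python's split() string method to separate each address from its corresponding target in the Conll file.
For example, given: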
address,target
123 Main Street,Home
456 Park Avenue,Work
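you can split it with the following code:
with open('conll_file.txt', 'r', encoding='utf-8') as f:
    next(f, None)  # skip the "address,target" header row
    for line in f:
        line = line.strip()  # drop the trailing newline
        if line:  # ignore blank lines
            address, target = line.rsplit(',', 1)
            print(address, target)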
Output:
123 Main Street Home
456 Park Avenue Work
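If an address can itself contain a comma, a plain split(',') will cut it in the wrong place. A minimal sketch using the standard csv module is shown here, assuming such addresses are wrapped in double quotes; the sample data and the read_pairs helper are made up for illustration and are not part of the original question.

```python
import csv
import io

# Hypothetical sample; in practice pass open('conll_file.txt', newline='', encoding='utf-8')
sample = 'address,target\n"123 Main Street, Apt 4",Home\n456 Park Avenue,Work\n'

def read_pairs(stream):
    """Return (address, target) tuples from a CSV stream that has a header row."""
    reader = csv.DictReader(stream)  # uses the first row as column names
    return [(row['address'], row['target']) for row in reader]

pairs = read_pairs(io.StringIO(sample))
for address, target in pairs:
    print(address, target)
```

csv.DictReader consumes the header row itself, so no manual skipping is needed.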
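If the file is in token-per-line Conll format instead, where each line holds a token and its tag separated by whitespace and a blank line separates addresses, you can merge the tokens by tag type like this:
address_priority = ['prov', 'city', 'district', 'town', 'poi', 'subpoi', 'houseno', 'floorno']
with open('address.txt', 'r', encoding='utf-8') as f:
    lines = f.readlines()
address = {}
for line in lines:
    txt = line.split()  # txt[0] is the token, txt[1] is its tag
    if len(txt) > 1:
        for i in address_priority:
            if txt[1].endswith(i):
                # 'subpoi' also ends with 'poi', so skip the 'poi' match for subpoi tags
                if txt[1].endswith('subpoi') and i != 'subpoi':
                    continue
                # append the token to the text collected so far for this tag type
                address[i] = address.get(i, '') + txt[0]
                break
    else:
        # a blank line ends the current address
        if address:
            print(address)
        address = {}
# print the last address if the file does not end with a blank line
if address:
    print(address)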
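The same grouping can be wrapped in a function that returns the records instead of printing them, which makes it easier to reuse and check. This is only a sketch: the group_by_tag name and the sample lines are hypothetical, and it assumes the tags carry a prefix joined by '-' (such as B-prov or E-prov), so it compares the part after the last '-' rather than using endswith.

```python
ADDRESS_PRIORITY = ['prov', 'city', 'district', 'town', 'poi', 'subpoi', 'houseno', 'floorno']

def group_by_tag(lines):
    """Group tokens by tag type, returning one dict per blank-line-separated record."""
    records, current = [], {}
    for line in lines:
        parts = line.split()
        if len(parts) > 1:
            token, tag = parts[0], parts[1]
            # Assumption: the tag looks like 'B-prov'; keep only the part after the last '-'
            label = tag.rsplit('-', 1)[-1]
            if label in ADDRESS_PRIORITY:
                current[label] = current.get(label, '') + token
        elif current:
            # A blank line closes the current record
            records.append(current)
            current = {}
    if current:
        records.append(current)
    return records

# Hypothetical sample in "token tag" format
sample = [
    "Zhe B-prov",
    "jiang E-prov",
    "Hang B-city",
    "zhou E-city",
    "",
    "Xi B-poi",
    "hu E-poi",
]

print(group_by_tag(sample))
```

Matching on the exact label means 'subpoi' can never be confused with 'poi', so the special case in the loop above is not needed here.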