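Python crawler data export problem

I am trying to scrape net asset values (NAV) from 私募排排网 (simuwang.com).

First, the exported data gets overwritten.

Second, how do I iterate over the product names and export them as well?

The code is as follows: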

import requests
from lxml import etree
import json
import pandas as pd
import csv
import openpyxl


main_url="https://dc.simuwang.com/"
headers={
    'User-Agent':'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/90.0.4430.85 Safari/537.36'
}

response=requests.get(url=main_url,headers=headers)
# response.encoding='gbk'
page_text=response.text
tree=etree.HTML(page_text)
id_list=[]
all_data_list=[]
divs = tree.xpath('//*[@id="tab-1-1"]//div/@name')# product IDs
title=tree.xpath('//*[@id="tab-1-1"]//a/@title')# product names
for i in divs:
    headers={
    'Referer': 'https://dc.simuwang.com/',
    'user-agent':'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/90.0.4430.85 Safari/537.36'
         ,'cookie':'focus-certification-pop=-1; Hm_lvt_c3f6328a1a952e922e996c667234cdae=1618189142,1618297537,1618310809,1619398821; http_tK_cache=b16cd5c24912be382d0d1ed2818ff61052d56859; cur_ck_time=1619399295; ck_request_key=4Wy8CIVn%2BwKK6tfR%2BRMS8WGfTH7vwsoY6AgOYRTg7iA%3D; passport=773283%09u3457744307362%09AAgMAA1RXAxSUFQFUQ9dAgAHA1cBVlsPCQdTBFRSVFA%3D00ba1ca13f; rz_rem_u_p=6H3dlc4C3f9fGhRVuYbzlrukhq0oUgH2zH4ETGpdofQ%3D%24UuKkSpvRl1u49BZ%2FZQTXr4ewhEWnlkLkc%2FX722YcrJE%3D; certification=1; qualified_investor=1; evaluation_result=3; sensorsdata2015jssdkcross=%7B%22distinct_id%22%3A%22773283%22%2C%22first_id%22%3A%22773283%22%2C%22props%22%3A%7B%22%24latest_traffic_source_type%22%3A%22%E7%9B%B4%E6%8E%A5%E6%B5%81%E9%87%8F%22%2C%22%24latest_search_keyword%22%3A%22%E6%9C%AA%E5%8F%96%E5%88%B0%E5%80%BC_%E7%9B%B4%E6%8E%A5%E6%89%93%E5%BC%80%22%2C%22%24latest_referrer%22%3A%22%22%2C%22%24latest_utm_source%22%3A%2210003%22%2C%22%24latest_utm_medium%22%3A%22cpc%22%2C%22%24latest_utm_campaign%22%3A%22S-%E5%93%81%E7%89%8C%E8%AE%A1%E5%88%92-B-PC%22%2C%22%24latest_utm_content%22%3A%22%E5%93%81%E4%B8%93%E8%AF%8D%22%2C%22%24latest_utm_term%22%3A%22%E7%A7%81%E5%8B%9F%E6%8E%92%E6%8E%92%E7%BD%91%22%2C%22_latest_utm_sign%22%3A%22baidu%22%2C%22_latest_utm_platform%22%3A%22pc%22%7D%2C%22%24device_id%22%3A%221786c594b4a64b-04021956498512-1633685c-1296000-1786c594b4b8bd%22%7D; smppw_tz_auth=1; Hm_lpvt_c3f6328a1a952e922e996c667234cdae=1619399323'
    }
#
    url='https://ppwapi.simuwang.com/chart/fundNavTrend'
    data={
        'fund_id': i,
        'index_type': 0,
        'period': 12,
        'rz_type': 3,
        'nav_flag':1,# how many months back from today
        'muid': 773283,
        'USER_ID': 773283,  }

    wbdata=requests.post(url,headers=headers,data=data).text

    j = json.loads(wbdata)
    categories = j['data']['categories']
    value = []
    fillname='/Users/bingtangdunxueli/Desktop/公司/1.csv'
    with open(fillname,'w',encoding='utf-8') as f:
        f.write("name,categories,value\n")
        for i in j['data']['data'][0]:
            value.append(i['value'])

        a = pd.DataFrame()

        a['categories'] = categories
        a['value'] = value
        a['categories'] =a['categories'].map(lambda x:x.replace('-', '') )
        print(a)
        #a.to_csv('fund_id.csv')
        for li in a.values.tolist():
                s = ",".join(map(str,li))
                f.write(s+"\n")

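Export result

The file contains only the data for the last product in the loop.

Nothing is overwritten in the printed output, though; the overwrite only shows up in the exported file.

How can I fix this so that the product names are exported along with the data and nothing gets overwritten?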
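
The symptom can be reproduced without touching the site. This is a minimal sketch, not the original script: the file name demo.csv and the sample rows are made up for illustration. It opens the same path with mode 'w' inside a loop, so every pass is printed but only the last one survives in the file.

```python
import os
import tempfile

# Hypothetical sample data standing in for the rows of three products.
batches = [["a,1"], ["b,2"], ["c,3"]]

# Throwaway file in a temporary directory (name is an assumption).
path = os.path.join(tempfile.mkdtemp(), "demo.csv")

for rows in batches:
    print(rows)  # every batch is printed, so the console looks complete
    # Mode 'w' truncates the file each time it is opened.
    with open(path, "w", encoding="utf-8") as f:
        for row in rows:
            f.write(row + "\n")

with open(path, encoding="utf-8") as f:
    remaining = f.read().splitlines()

print(remaining)  # only the last batch is still in the file
```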

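with open(fillname,'w',encoding='utf-8') as f:

Change this to

with open(fillname,'a',encoding='utf-8') as f:

'a' is append mode. Both 'w' and 'w+' truncate the file every time it is opened, so each pass through the loop wipes out what the previous pass wrote. The header line is also written inside the loop, so move it before the loop; otherwise it will repeat for every product.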
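
A minimal sketch of one way to handle both points at once: collect the rows for every product, tag each row with the product name, and open the output file only once. The helper names `nav_rows` and `write_nav_csv`, the output file name, and the sample payloads are hypothetical. The payload shape (`data.categories` plus `data.data[0]` as a list of objects with a `value` key) is assumed from the question's code, not from any documentation of the site's API.

```python
import csv
import os
import tempfile

def nav_rows(name, payload):
    """Turn one parsed JSON payload into (name, date, value) rows."""
    dates = payload["data"]["categories"]
    values = [point["value"] for point in payload["data"]["data"][0]]
    # Strip the dashes from the dates, as the question's code does.
    return [(name, d.replace("-", ""), v) for d, v in zip(dates, values)]

def write_nav_csv(path, named_payloads):
    """Write the rows for every product to one CSV file, opened once."""
    # newline="" is what the csv module recommends for files it writes.
    with open(path, "w", encoding="utf-8", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["name", "categories", "value"])  # header once
        for name, payload in named_payloads:
            writer.writerows(nav_rows(name, payload))

# Hypothetical sample standing in for the parsed JSON of two products.
sample = [
    ("Fund A", {"data": {"categories": ["2021-04-01", "2021-04-02"],
                         "data": [[{"value": 1.01}, {"value": 1.02}]]}}),
    ("Fund B", {"data": {"categories": ["2021-04-01"],
                         "data": [[{"value": 2.5}]]}}),
]

out_path = os.path.join(tempfile.mkdtemp(), "nav_demo.csv")
write_nav_csv(out_path, sample)
```

In the question's script this would mean looping with `for fund_id, name in zip(divs, title):`, appending `(name, j)` to a list after each request, and calling `write_nav_csv` once after the loop. This assumes the two XPath results line up one-to-one, which is worth checking by comparing their lengths.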
