import urllib.parse
import urllib.request

headers = {
    'User-Agent ': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/72.0.3626.121 Safari/537.36'
}
url = 'http://httpbin.org/post'
# Form-encode the payload and convert it to bytes, as urlopen requires for POST data
data = bytes(urllib.parse.urlencode({'word': 'hello'}), encoding='utf-8')
req = urllib.request.Request(url=url, data=data, headers=headers, method='POST')
response = urllib.request.urlopen(req)
print(response.read().decode('utf-8'))
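The result looks like this:
If I use this code to crawl other sites that have anti-scraping measures, it raises an error.
How do I get rid of the python part at the front of the User-Agent?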
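You haven't said which library you are using or how you modified the user-agent, so nobody can tell where the problem is. Judging from the symptom, the problem is presumably in your code: you are appending to the user-agent rather than replacing it.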
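One thing worth checking in the question's code is the header key itself: it is written as 'User-Agent ' with a trailing space. urllib treats that as a different header name, so it still adds its own default Python-urllib User-Agent next to yours. A minimal sketch that shows the difference without sending any request (the shortened user-agent string here is just for illustration):

```python
import urllib.request

url = 'http://httpbin.org/post'
ua = 'Mozilla/5.0 (Windows NT 10.0; Win64; x64)'  # shortened for illustration

# Key with a trailing space, as in the question: stored under a separate name.
bad = urllib.request.Request(url, headers={'User-Agent ': ua})
# Key without the trailing space: the name urllib checks before adding its default.
good = urllib.request.Request(url, headers={'User-Agent': ua})

# Request stores header names via str.capitalize(), hence the lowercase 'a'.
print(bad.has_header('User-agent'))   # False
print(good.has_header('User-agent'))  # True
```

If has_header('User-agent') is False when the request is opened, urllib fills in its default value, which would explain the python part showing up in the result.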
import urllib.request

headers = {"User-Agent": "Mozilla/5.0 (Windows NT 6.3; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/59.0.3071.86 Safari/537.36"}
url = ""
req = urllib.request.Request(url, headers=headers)
result = urllib.request.urlopen(req).read()  # sometimes you need to call .decode('utf-8') on this afterwards
print(result)
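On the .decode('utf-8') remark: read() returns bytes, so printing it directly shows a b'...' literal. A small sketch of decoding the body, assuming UTF-8 when no charset is known (decode_body is a hypothetical helper name, not part of urllib):

```python
def decode_body(raw: bytes, charset=None) -> str:
    """Decode a response body, falling back to UTF-8 when no charset is given."""
    # errors='replace' avoids an exception on stray bytes (an assumption; use
    # 'strict' if you would rather fail loudly).
    return raw.decode(charset or 'utf-8', errors='replace')

print(decode_body('你好'.encode('utf-8')))       # 你好
print(decode_body('你好'.encode('gbk'), 'gbk'))  # 你好
```

With a real response, the charset can usually be taken from response.headers.get_content_charset(), which returns None when the server does not declare one.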
https://www.cnblogs.com/AJim-ggsddu-999/p/9608642.html
The bottom of that page explains how to obtain a user-agent string.
When using urllib.request.urlretrieve you sometimes need to add a User-Agent as well. Here is a workaround:
opener = urllib.request.build_opener()
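opener.addheaders = [('User-agent', 'Opera/9.80 (Android 2.3.4; Linux; Opera Mobi/build-1107180945; U; en-GB) Presto/2.8.149 Version/11.10')]  # the attribute is addheaders, not headers
urllib.request.install_opener(opener)  # urlretrieve goes through the globally installed opener
urllib.request.urlretrieve(url, filename)  # filename: local path to save to (placeholder)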
Reference: https://stackoverflow.com/questions/2364593/urlretrieve-and-user-agent-python
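To make the urlretrieve workaround easier to reuse, it could be wrapped roughly like this. This is only a sketch: install_user_agent is a hypothetical helper name, and the user-agent string is shortened for illustration. Nothing is downloaded here; it only sets up the opener.

```python
import urllib.request

def install_user_agent(user_agent: str) -> urllib.request.OpenerDirector:
    """Install a global opener whose default User-Agent is `user_agent`."""
    opener = urllib.request.build_opener()
    # A fresh opener starts with [('User-agent', 'Python-urllib/<version>')];
    # assigning addheaders replaces that default instead of appending to it.
    opener.addheaders = [('User-agent', user_agent)]
    # urlopen() and urlretrieve() both use the globally installed opener.
    urllib.request.install_opener(opener)
    return opener

default_headers = urllib.request.build_opener().addheaders
opener = install_user_agent('Opera/9.80 (Android 2.3.4; Linux)')
print(default_headers)
print(opener.addheaders)
```

After calling it once, later urllib.request.urlretrieve calls in the same process should send the new User-Agent. Note that install_opener changes global state, so it affects every urlopen call too.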