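I am fetching a web page with the HttpClient class from the System.Net.Http namespace. All tags on this page are uppercase, e.g. "<HTML></HTML>", and the request fails.
Pages that use lowercase tags, e.g. "<html></html>", can be fetched correctly.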
HttpClient hc2 = new HttpClient();
hc2.DefaultRequestHeaders.Add("Accept", "text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,*/*;q=0.8");
hc2.DefaultRequestHeaders.Add("Accept-Encoding", "gzip, deflate");
hc2.DefaultRequestHeaders.Add("Accept-Language", "");
hc2.DefaultRequestHeaders.Add("Connection", "keep-alive");
hc2.DefaultRequestHeaders.Add("Host", "device IP");
hc2.DefaultRequestHeaders.Add("User-Agent", "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:100.0) Gecko/20100101 Firefox/100.0");
// A page whose tags are uppercase (<HTML> </HTML>) cannot be fetched
var response3 = await hc2.GetAsync("device IP");
// A page whose tags are lowercase (<html></html>) can be fetched normally
System.Net.Http.HttpRequestException
HResult=0x80131500
Message=Received an invalid status line: ''.
Source=System.Net.Http
I have not found a solution.
I would like to configure HttpClient so that it can handle pages with uppercase tags.
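The exception message above refers to the status line, which is the first line of an HTTP response (for example `HTTP/1.1 200 OK`) and arrives before any HTML. So it may be worth looking at what the device sends before the markup. Below is a minimal sketch for that, not a fix: it opens a plain TCP connection and prints the start of the raw response. The class name `RawHttpProbe`, its method names, port 80 and the address `192.0.2.10` are all assumptions for illustration; replace them with the real device values.

```csharp
using System;
using System.Net.Sockets;
using System.Text;
using System.Threading.Tasks;

public static class RawHttpProbe
{
    // Builds a minimal HTTP/1.1 GET request.
    // "Connection: close" asks the server to close the socket after replying.
    public static string BuildRequest(string host, string path)
    {
        return $"GET {path} HTTP/1.1\r\n" +
               $"Host: {host}\r\n" +
               "Accept: text/html\r\n" +
               "Connection: close\r\n" +
               "\r\n";
    }

    // Sends the request over a raw TCP socket and returns the first chunk
    // (up to 1024 bytes) of whatever the server writes back, undecoded by HttpClient.
    public static async Task<string> ReadResponseStartAsync(string host, int port, string path)
    {
        using var tcp = new TcpClient();
        await tcp.ConnectAsync(host, port);
        using NetworkStream stream = tcp.GetStream();

        byte[] request = Encoding.ASCII.GetBytes(BuildRequest(host, path));
        await stream.WriteAsync(request, 0, request.Length);

        byte[] buffer = new byte[1024];
        int read = await stream.ReadAsync(buffer, 0, buffer.Length);
        // ASCII is enough to inspect the status line and headers.
        return Encoding.ASCII.GetString(buffer, 0, read);
    }

    public static async Task Main(string[] args)
    {
        // Hypothetical placeholder address; pass the real device IP as the first argument.
        string host = args.Length > 0 ? args[0] : "192.0.2.10";
        string start = await ReadResponseStartAsync(host, 80, "/");
        Console.WriteLine(start);
    }
}
```

If the first printed line does not look like `HTTP/1.1 200 OK`, the problem is probably in how the device writes its response header rather than in the case of the tags.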
Try changing the headers to different types, for example:
_httpClient.Timeout = new TimeSpan(0, 0, 30);
_httpClient.DefaultRequestHeaders.Clear();
_httpClient.DefaultRequestHeaders.Accept.Add(
new MediaTypeWithQualityHeaderValue("application/json"));
_httpClient.DefaultRequestHeaders.Accept.Add(
new MediaTypeWithQualityHeaderValue("text/xml"));
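The header setup suggested above can be collected into one place. This is only a sketch under assumptions: `DeviceClientFactory` and `Create` are made-up names, and the header values are taken from the question. Host is left out because HttpClient derives it from the request URL, and Accept-Encoding is left out because the handler adds it itself once `AutomaticDecompression` is set.

```csharp
using System;
using System.Net;
using System.Net.Http;
using System.Net.Http.Headers;

public static class DeviceClientFactory
{
    // Creates an HttpClient with typed Accept headers and a 30 second timeout.
    public static HttpClient Create()
    {
        var handler = new HttpClientHandler
        {
            // Lets the handler send Accept-Encoding and unpack gzip/deflate bodies.
            AutomaticDecompression = DecompressionMethods.GZip | DecompressionMethods.Deflate
        };

        var client = new HttpClient(handler) { Timeout = TimeSpan.FromSeconds(30) };
        client.DefaultRequestHeaders.Accept.Add(
            new MediaTypeWithQualityHeaderValue("text/html"));
        client.DefaultRequestHeaders.Accept.Add(
            new MediaTypeWithQualityHeaderValue("application/xhtml+xml"));
        client.DefaultRequestHeaders.Accept.Add(
            new MediaTypeWithQualityHeaderValue("*/*", 0.8));
        client.DefaultRequestHeaders.UserAgent.ParseAdd(
            "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:100.0) Gecko/20100101 Firefox/100.0");
        return client;
    }

    public static void Main()
    {
        using HttpClient client = Create();
        // Print the Accept header that will be sent with every request.
        Console.WriteLine(client.DefaultRequestHeaders.Accept);
    }
}
```

Whether different headers change anything for this device is not certain; the sketch only makes them easy to vary while testing.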
Console output of the code: (screenshot)
Test code:
// Test page used by the answerer
string url = "https://www.xcsharp.top/resume/test.html";
HttpClient client = new HttpClient();
// await instead of .Result so the calling thread is not blocked
string result = await client.GetStringAsync(url);
Console.WriteLine("Requested HTML content:");
Console.WriteLine(result);
Take this line:
hc2.DefaultRequestHeaders.Add("Accept", "text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,*/*;q=0.8");
and change it to uppercase.