Bri*_*ort 5 c# screen-scraping login httpwebrequest httpwebresponse
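Although I've found many articles and other information on how to do GET and POST with HttpWebRequest and HttpWebResponse, I'm finding it hard to get things working the way I expect.
I've been playing with several ideas I found, but so far nothing works... I'll post my code: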
private void start_post()
{
    string username = txtUser.Text;
    string password = txtPassword.Text;
    string strResponce;
    byte[] buffer = Encoding.ASCII.GetBytes("username="+username+"&password="+password);
    HttpWebRequest WebReq = (HttpWebRequest)WebRequest.Create(txtLink.Text);
    WebReq.Method = "POST";
    //WebReq.Headers.Add("user-agent", "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.2; .NET CLR 1.0.3705;)");
    WebReq.UserAgent = "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.2; .NET CLR 1.0.3705;)";
    WebReq.Headers.Add("Translate", "F");
    WebReq.AllowAutoRedirect = true;
    WebReq.CookieContainer = cookieJar;
    WebReq.KeepAlive = true;
    WebReq.ContentType = "application/x-www-form-urlencoded";
    WebReq.ContentLength = buffer.Length;
    Stream PostData = WebReq.GetRequestStream();
    PostData.Write(buffer, 0, buffer.Length);
    PostData.Close();
    HttpWebResponse WebResp = (HttpWebResponse)WebReq.GetResponse();
    //txtResult.Text = WebResp.StatusCode.ToString() + WebResp.Server.ToString();
    Stream answer = WebResp.GetResponseStream();
    StreamReader _answer = new StreamReader(answer);
    strResponce = _answer.ReadToEnd();
    //txtResult.Text = txtResult.Text + _answer.ReadToEnd();
    answer.Close();
    _answer.Close();
    foreach (Cookie cookie in WebResp.Cookies)
    {
        cookieJar.Add(new Cookie(cookie.Name.Trim(), cookie.Value.Trim(), cookie.Path, cookie.Domain));
        txtResult.Text += cookie.Name.ToString() + Environment.NewLine + cookie.Value.ToString() + Environment.NewLine + cookie.Path.ToString() + Environment.NewLine + cookie.Domain.ToString();
    }
    if (strResponce.Contains("Log On Successful") || strResponce.Contains("already has a webseal session"))
    {
        MessageBox.Show("Login success");
        foreach (Control cont in this.Controls)
        {
            cont.Visible = true;
        }
    }
    else
    {
        MessageBox.Show("Login Failed.");
    }
}
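One thing worth ruling out in the snippet above: the POST body is built by plain string concatenation, so a username or password containing characters such as `&`, `=`, `+` or a space would reach the server split up or altered. A minimal sketch of percent-encoding each field first, assuming the field names really are `username` and `password` (`FormEncoding` and `BuildFormBody` are illustrative names, not library APIs):

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class FormEncoding
{
    // Builds an application/x-www-form-urlencoded body from name/value pairs,
    // percent-encoding every name and value so reserved characters survive.
    public static string BuildFormBody(IEnumerable<KeyValuePair<string, string>> fields)
    {
        var sb = new StringBuilder();
        foreach (var field in fields)
        {
            if (sb.Length > 0) sb.Append('&');
            sb.Append(Uri.EscapeDataString(field.Key));
            sb.Append('=');
            sb.Append(Uri.EscapeDataString(field.Value ?? string.Empty));
        }
        return sb.ToString();
    }
}
```

The resulting string could then be passed to `Encoding.ASCII.GetBytes` in place of the concatenated one. Note that `Uri.EscapeDataString` writes a space as `%20` rather than `+`, which form decoders generally accept.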
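The code runs all the way to the end, but the login still fails. To test it, I created a login form on http://www.comicearth.com (my own site, PHP on Apache) and posted the username and password to it. It reports a failed login, which is fine for now. I'm also using Fiddler to watch what is happening.
So from this, I know I'm doing something wrong in the code above.
However, when I point it at another web application, I get the following error on this line: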
HttpWebResponse WebResp = (HttpWebResponse)WebReq.GetResponse();
"Content-Length or Chunked Encoding cannot be set for an operation that does not write data."
I tried to track down the error, and everything I found says it is caused by a 302 redirect...
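A workaround that is often suggested for this error is to switch off `AllowAutoRedirect` and follow the `Location` header by hand, issuing each follow-up as a plain GET with no body and no `ContentLength`. The sketch below assumes that approach. `RedirectHelper`, its methods and `maxHops` are illustrative names, and the `Location` value is resolved against the current URI because servers may send a relative path.

```csharp
using System;
using System.Net;

public static class RedirectHelper
{
    // True for the status codes that carry a Location to follow.
    public static bool IsRedirect(HttpStatusCode status)
    {
        int code = (int)status;
        return code == 301 || code == 302 || code == 303 || code == 307 || code == 308;
    }

    // Location may be absolute or relative; resolve it against the URI just requested.
    public static Uri ResolveLocation(Uri requestUri, string location)
    {
        return new Uri(requestUri, location);
    }

    // Issues body-less GETs, following redirects manually while sharing one cookie jar.
    // The caller is responsible for closing the returned response.
    public static HttpWebResponse GetFollowingRedirects(Uri start, CookieContainer jar, int maxHops)
    {
        Uri current = start;
        for (int hop = 0; hop <= maxHops; hop++)
        {
            var request = (HttpWebRequest)WebRequest.Create(current);
            request.Method = "GET";            // no ContentLength, no request stream
            request.AllowAutoRedirect = false;
            request.CookieContainer = jar;
            var response = (HttpWebResponse)request.GetResponse();
            string location = response.Headers["Location"];
            if (!IsRedirect(response.StatusCode) || string.IsNullOrEmpty(location))
                return response;
            response.Close();
            current = ResolveLocation(current, location);
        }
        throw new InvalidOperationException("Too many redirects");
    }
}
```

After a login POST made with `AllowAutoRedirect = false`, the first `Location` value could be resolved with `ResolveLocation` and handed to `GetFollowingRedirects` together with the same `CookieContainer`.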
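Looking at Fiddler, I can see big differences between what my code posts and what the browser sends when I log in through the web page. So I know I'm not doing enough, but I don't know where to look.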
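One common source of such differences is hidden `<input>` fields on the login form, which a browser submits along with the username and password. Whether this site has any is an assumption that the Fiddler capture would confirm or rule out. A rough sketch of collecting them from the login page's HTML with a regular expression follows. `LoginFormFields` and `ExtractHiddenFields` are illustrative names, only quoted attribute values are handled, and a real HTML parser such as HtmlAgilityPack would be more robust.

```csharp
using System;
using System.Collections.Generic;
using System.Net;
using System.Text.RegularExpressions;

public static class LoginFormFields
{
    // Returns name -> value for every <input type="hidden"> found in the HTML.
    public static Dictionary<string, string> ExtractHiddenFields(string html)
    {
        var fields = new Dictionary<string, string>();
        foreach (Match tag in Regex.Matches(html, @"<input\b[^>]*>", RegexOptions.IgnoreCase))
        {
            string type = GetAttribute(tag.Value, "type");
            string name = GetAttribute(tag.Value, "name");
            if (name == null || !string.Equals(type, "hidden", StringComparison.OrdinalIgnoreCase))
                continue;
            string value = GetAttribute(tag.Value, "value") ?? string.Empty;
            fields[name] = WebUtility.HtmlDecode(value);
        }
        return fields;
    }

    // Reads one double- or single-quoted attribute value from a tag, or null if absent.
    private static string GetAttribute(string tag, string attribute)
    {
        Match m = Regex.Match(
            tag,
            @"(?<![\w-])" + attribute + @"\s*=\s*(?:""([^""]*)""|'([^']*)')",
            RegexOptions.IgnoreCase);
        if (!m.Success) return null;
        return m.Groups[1].Success ? m.Groups[1].Value : m.Groups[2].Value;
    }
}
```

The returned pairs could be merged with the credentials before the body is encoded, so that the POST matches what the browser sends more closely.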
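My goal is to build an application that can log in to the website and then pull the necessary data through its search options, which our users currently do by hand. If I could automate some of that tedious work, it would really help everyone out. However, I'm currently stuck on logging in, understanding cookies, and so on. Also, the site uses frames. I don't know whether that will be a problem, but I thought I'd mention it in case it is another hurdle I haven't hit yet.
If you need to see more code, let me know. At the moment I'm using HttpWebRequest and HttpWebResponse, and I've also read a bit about WebClient.
I've downloaded and played with HtmlAgilityPack, but at this point I'm not 100% sure how it fits into all of this.
If you know of any good articles or other information that covers this topic in more depth, or have anything I could try, please let me know.
Thank you very much for your time.
Update with new code, see the comments below: OK, I found that I was getting the error message ("Content-Length or Chunked Encoding...") because of the redirect. So I set AllowAutoRedirect = false, and now I look for the "Location" header and follow the redirect myself. That got rid of the message, but I'm still not logged in to the site, which is disappointing, and I can't figure out why at the moment. :S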
private void start_post2()
{
    string username = txtUser.Text;
    string password = txtPassword.Text;
    Uri link = new Uri(txtLink.Text);
    string postArgs = string.Format(@"userId={0}&password={1}", username, password);
    byte[] buffer = Encoding.ASCII.GetBytes(postArgs);
    HttpWebRequest WebReq = (HttpWebRequest)WebRequest.Create(txtLink.Text);
    WebReq.Method = "POST";
    //WebReq.Headers.Add("user-agent", "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.2; .NET CLR 1.0.3705;)");
    WebReq.UserAgent = "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.2; .NET CLR 1.0.3705;)";
    //WebReq.ClientCertificates.Add("Accept: application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5");
    WebReq.AllowAutoRedirect = false;
    WebReq.Accept = "application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5";
    WebReq.Accept = "*/*";
    //WebReq.Headers.Add(HttpRequestHeader.Cookie, cookieJar);
    WebReq.CookieContainer = cookieJar;
    WebReq.KeepAlive = true;
    WebReq.ContentType = "application/x-www-form-urlencoded";
    WebReq.ContentLength = buffer.Length;
    Stream PostData = WebReq.GetRequestStream();
    PostData.Write(buffer, 0, buffer.Length);
    PostData.Close();
    HttpWebResponse WebResp = (HttpWebResponse)WebReq.GetResponse();
    if (WebResp == null) throw new Exception("Response is null");
    foreach (Cookie cookie in WebResp.Cookies)
    {
        cookieJar.Add(new Cookie(cookie.Name.Trim(), cookie.Value.Trim(), cookie.Path, cookie.Domain));
        //txtResult.Text += cookie.Name.ToString() + Environment.NewLine + cookie.Value.ToString() + Environment.NewLine + cookie.Path.ToString() + Environment.NewLine + cookie.Domain.ToString();
    }
    if (!string.IsNullOrEmpty(WebResp.Headers["Location"]))
    {
        string newLocation = WebResp.Headers["Location"];
        //Request the new location
        WebReq = (HttpWebRequest)WebRequest.Create(newLocation);
        WebReq.Method = "GET";
        WebReq.ContentType = "application/x-www-form-unlencoded";
        WebReq.AllowAutoRedirect = false;
        WebReq.CookieContainer = cookieJar;
        WebReq.CookieContainer.Add(WebResp.Cookies);
        buffer = Encoding.ASCII.GetBytes("userId=" + username + "&password=" + password);
        WebReq.ContentLength = buffer.Length;
        PostData = WebReq.GetRequestStream();
        PostData.Write(buffer, 0, buffer.Length);
        PostData.Close();
        WebResp = (HttpWebResponse)WebReq.GetResponse();
        foreach (Cookie cookie in WebResp.Cookies)
        {
            cookieJar.Add(new Cookie(cookie.Name.Trim(), cookie.Value.Trim(), cookie.Path, cookie.Domain));
            //txtResult.Text += cookie.Name.ToString() + Environment.NewLine + cookie.Value.ToString() + Environment.NewLine + cookie.Path.ToString() + Environment.NewLine + cookie.Domain.ToString();
        }
    }
    else if (!string.IsNullOrEmpty(WebResp.Headers["Set-Cookie"]))
    {
        // thinking...
    }
    foreach (Cookie cookie in cookieJar.GetCookies(link))
    {
        MessageBox.Show(cookie.Name.ToString() + Environment.NewLine + cookie.Value.ToString() + Environment.NewLine + cookie.Path.ToString() + Environment.NewLine + cookie.Domain.ToString());
    }
    StreamReader sr = new StreamReader(WebResp.GetResponseStream());
    string responseHtml = sr.ReadToEnd().Trim();
    SearchPatient(WebReq, username, password);
}
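A note on the cookie handling above: when a `CookieContainer` is assigned to a request, the cookies from the response are stored in it automatically, so copying them over by hand should not be needed. The container also scopes each cookie by domain and path, which is worth checking if a session cookie seems to vanish between requests. The small sketch below only illustrates that scoping and makes no network calls. `CookieJarDemo` and `CookieHeaderAfter` are illustrative names, and the cookie values are made up.

```csharp
using System;
using System.Net;

public static class CookieJarDemo
{
    // Returns the Cookie header a fresh jar would send to requestUri after it has
    // stored setCookieHeader as if it had been received from responseUri.
    public static string CookieHeaderAfter(Uri responseUri, string setCookieHeader, Uri requestUri)
    {
        var jar = new CookieContainer();
        jar.SetCookies(responseUri, setCookieHeader);
        return jar.GetCookieHeader(requestUri);
    }
}
```

Calling `GetCookieHeader` on the real `cookieJar` with the search page's URI, right after the login request, would show whether the session cookie is actually going to be sent there.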