mal*_*man 5 html c# httpwebresponse html-agility-pack
我试图解析几个值的HTML响应,然后将它们插入SQL.我能够获得两个值但是,因为代码包含在foreach语句中,所以我得到它们两次.
这是我的HTML回复
<div align="CENTER" class='dataTitle'>Host State Breakdowns:</div>
<p align='center'>
<a href='trends.cgi?host=hostname&includesoftstates=no&assumeinitialstates=yes&initialassumedhoststate=0&backtrack=4'><img src='trends.cgi?createimage&host=hostname&includesoftstates=no&initialassumedhoststate=0&backtrack=4' border="1" alt='Host State Trends' title='Host State Trends' width='500' height='20'></a><br>
</p>
<div align="CENTER">
<table border="0" class='data'>
<tr><th class='data'>State</th><th class='data'>Type / Reason</th><th class='data'>Time</th><th class='data'>% Total Time</th><th class='data'>% Known Time</th></tr>
<tr class='dataEven'><td class='hostUP' rowspan="3">UP</td><td class='dataEven'>Unscheduled</td><td class='dataEven'>0d 10h 5m 19s</td><td class='dataEven'>100.000%</td><td class='dataEven'>100.000%</td></tr>
<tr class='dataEven'><td class='dataEven'>Scheduled</td><td class='dataEven'>0d 0h 0m 0s</td><td class='dataEven'>0.000%</td><td class='dataEven'>0.000%</td></tr>
<tr class='hostUNREACHABLE'><td class='hostUNREACHABLE'>Total</td><td class='hostUNREACHABLE'>0d 0h 0m 0s</td><td class='hostUNREACHABLE'>0.000%</td><td class='hostUNREACHABLE'>0.000%</td></tr>
<tr class='dataOdd'><td class='dataOdd' rowspan="3">Undetermined</td><td class='dataOdd'>Nagios Not Running</td><td class='dataOdd'>0d 0h 0m 0s</td><td class='dataOdd'>0.000%</td><td class='dataOdd'></td></tr>
<tr class='dataOdd'><td class='dataOdd'>Insufficient Data</td><td class='dataOdd'>0d 0h 0m 0s</td><td class='dataOdd'>0.000%</td><td class='dataOdd'></td></tr>
<tr class='dataOdd'><td class='dataOdd'>Total</td><td class='dataOdd'>0d 0h 0m 0s</td><td class='dataOdd'>0.000%</td><td class='dataOdd'></td></tr>
<tr><td colspan="3"></td></tr>
<tr class='dataEven'><td class='dataEven'>All</td><td class='dataEven'>Total</td><td class='dataEven'>0d 10h 5m 19s</td><td class='dataEven'>100.000%</td><td class='dataEven'>100.000%</td></tr>
</table>
</div>
<br><br>
<div align="CENTER" class='dataTitle'>State Breakdowns For Host Services:</div>
<div align="CENTER">
<table border="0" class='data'>
<tr><th class='data'>Service</th><th class='data'>% Time OK</th><th class='data'>% Time Warning</th><th class='data'>% Time Unknown</th><th class='data'>% Time Critical</th><th class='data'>% Time Undetermined</th></tr>
<tr class='dataOdd'><td class='dataOdd'><a href='avail.cgi?host=hostname&service=servicename&t1=1478498400&t2=1478534719&backtrack=4&assumestateretention=yes&assumeinitialstates=yes&assumestatesduringnotrunning=yes&initialassumedhoststate=0&initialassumedservicestate=0&show_log_entries&showscheduleddowntime=yes&rpttimeperiod=24x7'>servicename</a></td><td class='serviceOK'>100.000% (100.000%)</td><td class='serviceWARNING'>0.000% (0.000%)</td><td class='serviceUNKNOWN'>0.000% (0.000%)</td><td class='serviceCRITICAL'>0.000% (0.000%)</td><td class='dataOdd'>0.000%</td></tr>
<tr class='dataEven'><td class='dataEven'><a href='avail.cgi?host=hostname&service=servicename2&t1=1478498400&t2=1478534719&backtrack=4&assumestateretention=yes&assumeinitialstates=yes&assumestatesduringnotrunning=yes&initialassumedhoststate=0&initialassumedservicestate=0&show_log_entries&showscheduleddowntime=yes&rpttimeperiod=24x7'>servicename2</a></td><td class='serviceOK'>100.000% (100.000%)</td><td class='serviceWARNING'>0.000% (0.000%)</td><td class='serviceUNKNOWN'>0.000% (0.000%)</td><td class='serviceCRITICAL'>0.000% (0.000%)</td><td class='dataEven'>0.000%</td></tr>
</table>
</div>
Run Code Online (Sandbox Code Playgroud)
这是我的代码:
var response = (HttpWebResponse)request.GetResponse();
var stream = response.GetResponseStream();
HtmlDocument doc = new HtmlDocument();
doc.Load(stream);
foreach (HtmlNode node in doc.DocumentNode.SelectNodes("//table[@class]"))
{
foreach (HtmlNode node2 in node.SelectNodes("//td[@class = 'serviceOK']"))
{
var value = node2.InnerText;
}
foreach (HtmlNode node3 in node.SelectNodes("//a[contains(@href, 'avail.cgi')]"))
{
var name = node3.InnerText;
}
}
Run Code Online (Sandbox Code Playgroud)
name显示了servicename,value显示了类serviceOK,但由于第一个foreach,它再次重复.
我的结果如下:
100.000% (100.000%)
100.000% (100.000%)
servicename
servicename2
100.000% (100.000%)
100.000% (100.000%)
servicename
servicename2
Run Code Online (Sandbox Code Playgroud)
有没有办法,首先,匹配值,两个,只有他们显示一次?
Your first foreach traverses the entire document as do both of your other foreach statements inside of the first.
Because there are 2 table elements matching your XPath expression
"//table[@class]"
Run Code Online (Sandbox Code Playgroud)
you are getting your answer twice. If you had more table elements matching your XPath expression, say 7 for example, you would get the result 7 times.
What you want is to find all table divisions (td) with class "serviceOK" that are within a table row (tr) within a table.
Once you have this HtmlNode you can just go to the previous sibling which will contain the service name.
var response = (HttpWebResponse)request.GetResponse();
var stream = response.GetResponseStream();
HtmlDocument doc = new HtmlDocument();
doc.Load(stream);
foreach (HtmlNode serviceOkNode in doc.DocumentNode.SelectNodes("//table[@class]/tr/td[@class = 'serviceOK']"))
{
HtmlNode serviceNameNode = serviceOkNode.PreviousSibling;
var value = serviceOkNode.InnerText;
var name = serviceNameNode.InnerText;
}
Run Code Online (Sandbox Code Playgroud)