BIL*_*ILL 4 c# html-agility-pack
I have a table:
<table>
<tr class="odd">
<td class="ind gray">1</td>
<td><b>acceding</b></td>
<td class="transcr">[?ks?i?d??]</td>
<td class="tran">?????????????</td>
</tr>
<!-- .... -->
<tr class="odd">
<td class="ind gray">999</td>
<td><b>related</b></td>
<td class="transcr">[r?l?e??t?d]</td>
<td class="tran">???????????</td>
</tr>
</table>
I want to parse three "td" cells from each row. My code:
Dictionary<string, Word> words = new Dictionary<string, Word>();
string text = webBrowser1.DocumentText;
HtmlAgilityPack.HtmlDocument doc = new HtmlAgilityPack.HtmlDocument();
doc.LoadHtml(text);
for (int i = 0; i < doc.DocumentNode.SelectNodes("//tr").Count; i++)
{
    HtmlNode node = doc.DocumentNode.SelectNodes("//tr")[i];
    Word word = null;
    if (TryParseWord(node, out word))
    {
        try
        {
            if (!words.ContainsKey(word.eng))
            {
                words.Add(word.eng, word);
            }
        }
        catch
        { continue; }
    }
}
And the parsing function:
private bool TryParseWord(HtmlNode node, out Word word)
{
    word = null;
    try
    {
        var eng = node.SelectNodes("//td")[1].InnerText;
        var trans = node.SelectNodes("//td")[2].InnerText;
        var rus = node.SelectNodes("//td")[3].InnerText;
        word = new Word();
        word.eng = eng;
        word.rus = rus;
        word.trans = trans;
        return true;
    }
    catch
    {
        word = null;
        return false;
    }
}
In my TryParseWord method I only ever get the values from the first row. How can I fix this?
You can get these values easily:
HtmlAgilityPack.HtmlDocument doc = new HtmlAgilityPack.HtmlDocument();
doc.LoadHtml(html);
var table = doc.DocumentNode
    .Descendants("tr")
    .Select(n => n.Elements("td").Select(e => e.InnerText).ToArray());
Usage:
foreach (var tr in table)
{
    Console.WriteLine("{0} {1} {2} {3}", tr[0], tr[1], tr[2], tr[3]);
}
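As for why TryParseWord only ever sees the first row: an XPath that starts with "//" is evaluated from the document root, even when SelectNodes is called on a row node, so node.SelectNodes("//td") returns every td in the whole document and indexes 1 to 3 always land in the first row. A relative path such as "td" or ".//td" keeps the search inside the current row. The following is a minimal sketch of the question's method rewritten that way, not a definitive implementation. The Word class here is a hypothetical stand-in with the three fields used in the question, and WordParser, ParseWords, and the Trim() calls are additions for illustration.

```csharp
using System.Collections.Generic;
using HtmlAgilityPack;

// Hypothetical stand-in for the Word class used in the question.
public class Word
{
    public string eng;
    public string trans;
    public string rus;
}

public static class WordParser
{
    // Parses one <tr>. "td" is a relative XPath that matches only the
    // direct <td> children of this row; "//td" would restart the search
    // at the document root and always return the first row's cells.
    public static bool TryParseWord(HtmlNode row, out Word word)
    {
        word = null;

        // SelectNodes returns null (not an empty collection) when nothing matches.
        HtmlNodeCollection cells = row.SelectNodes("td");
        if (cells == null || cells.Count < 4)
            return false;

        // Cell 0 is the row index; cells 1 to 3 hold the data, as in the question.
        word = new Word
        {
            eng = cells[1].InnerText.Trim(),
            trans = cells[2].InnerText.Trim(),
            rus = cells[3].InnerText.Trim()
        };
        return true;
    }

    // Builds the dictionary from the question, querying the rows only once.
    public static Dictionary<string, Word> ParseWords(string html)
    {
        var words = new Dictionary<string, Word>();
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        HtmlNodeCollection rows = doc.DocumentNode.SelectNodes("//tr");
        if (rows == null)
            return words;

        foreach (HtmlNode row in rows)
        {
            Word word;
            if (TryParseWord(row, out word) && !words.ContainsKey(word.eng))
                words.Add(word.eng, word);
        }
        return words;
    }
}
```

Checking the cell count up front replaces the blanket try/catch, so a row with fewer than four cells is skipped instead of being hidden behind a swallowed exception. The same guard is worth adding to the LINQ version above, since tr[3] throws for any shorter row.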