Selenium - Get an element's HTML rather than its text value

Lon*_*der 20 c# html-parsing selenium-webdriver

With this code, I extract all the text I need from the HTML document:

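private void RunThroughSearch(string url)
{
    // Local variables cannot carry an access modifier, so declare and assign in one step.
    IWebDriver driver = new FirefoxDriver();
    INavigation nav = driver.Navigate();
    nav.GoToUrl(url);

    // The results wrapper, looked up by id.
    var div = driver.FindElement(By.Id("results"));
    // FindElements returns a collection of every element with the class "sa_wr".
    var elements = driver.FindElements(By.ClassName("sa_wr"));
}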

However, I need to refine the extracted results into the following structure:

Container
    HEADER -> Title of a given block
    Url -> Link to the relevant block
    text -> body of a given block
/Container

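As you can see in my code, I can get the value of the text part as plain text, which is fine. But what if I want the container's value as HTML rather than as extracted text?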

<div class="container">
    <div class="Header"> Title...</div>
    <div class="Url"> www.example.co.il</div>
    <div class="ResConent"> bla.. </div>
</div>

The container appears roughly 10 times on the page, and I need to extract the innerHTML of each one.

Any ideas? (Using Selenium)

Oli*_*ell 39

This seems to work for me, and it takes less code:

var element = driver.FindElement(By.ClassName("sa_wr"));
var innerHtml = element.GetAttribute("innerHTML");
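Since the question mentions about ten containers on the page, the same call can be applied to each match in turn. The following is only a minimal sketch, assuming the blocks carry the class `container` as in the question's markup and that the page URL is passed as the first command-line argument. The `ContainerScraper` class and the `GetInnerHtmlOfAll` helper are hypothetical names introduced for illustration.

```csharp
using System;
using System.Collections.Generic;
using OpenQA.Selenium;
using OpenQA.Selenium.Firefox;

public static class ContainerScraper
{
    // Hypothetical helper: collects the innerHTML of every element
    // that has the given class name.
    public static List<string> GetInnerHtmlOfAll(IWebDriver driver, string className)
    {
        var result = new List<string>();
        // FindElements returns all matches (an empty collection if there are none).
        foreach (IWebElement element in driver.FindElements(By.ClassName(className)))
        {
            result.Add(element.GetAttribute("innerHTML"));
        }
        return result;
    }

    public static void Main(string[] args)
    {
        // Assumption: args[0] holds the URL of the results page.
        using (IWebDriver driver = new FirefoxDriver())
        {
            driver.Navigate().GoToUrl(args[0]);
            // Assumption: "container" is the class used in the question's markup.
            foreach (string html in GetInnerHtmlOfAll(driver, "container"))
            {
                Console.WriteLine(html);
            }
        }
    }
}
```

Wrapping the driver in `using` closes the browser when the block exits, which the snippets above leave to the caller.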


Yi *_*eng 9

First find the element, then use IJavaScriptExecutor to get its inner HTML.

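// Use FindElement (singular): the script reads innerHTML from one element, not from a collection.
var element = driver.FindElement(By.ClassName("sa_wr"));
IJavaScriptExecutor js = driver as IJavaScriptExecutor;
if (js != null) {
    // arguments[0] is the element passed in after the script text.
    string innerHtml = (string)js.ExecuteScript("return arguments[0].innerHTML;", element);
}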