use*_*er1 8 html excel iframe vba web-scraping
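Main points:
I have successfully used VBA to do the following:
Log in to a website using getElementsByName
Select the parameters for the report that will be generated (using getElementsBy...)
Important to note: the website is client-side
The above is the easy part. The difficult part is the following:
Click a GIF image inside an iframe that exports the dataset to CSV
I have tried the following: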
Dim idoc As HTMLDocument
Dim iframe As HTMLFrameElement
Dim iframe2 As HTMLDocument
Set idoc = objIE.document
Set iframe = idoc.all("iframename")
Set iframe2 = iframe.contentDocument
Do Until InStr(1, objIE.document.all("iframename").contentDocument.innerHTML, "img.gif", vbTextCompare) = 0
DoEvents
Loop
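Some background on the logic above:
It is on this line that the error "Object doesn't support this property or method" is raised.
I also tried accessing the iframe GIF through the a element and its href attribute, but that failed completely. I also tried grabbing the image from its source URL, but all that does is take me to the page the image is on.
Note: the iframe has no ID and, strangely, the GIF image has no "onclick" element/event.
A final consideration: I tried using R to scrape the iframe.
Accessing the iframe's HTML node was simple, but trying to access the iframe's attributes, and subsequently the table's nodes, proved unsuccessful. All it returned was "character(0)".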
library(rvest)
library(magrittr)
Blah <-read_html("web address redacted") %>%
html_nodes("#iframe")%>%
html_nodes("#img")%>%
html_attr("#src")%>%
#read_html()%>%
head()
Blah
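As soon as I include read_html, the script returns the following error:
Error in if (grepl("<|>", x)) { : argument is of length zero
I suspect this refers to the character(0).
Any guidance here is appreciated!
Many thanks,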
HTML
<div align="center">
  <table id="table1" style="border-collapse: collapse" width="700" cellspacing="0" cellpadding="0" border="0">
    <tbody>
      <tr>
        <td colspan="6"> </td>
      </tr>
      <tr>
        <td colspan="6">
          <a href="href redacted">
            <img src="img.gif" width="38" height="38" border="0" align="right">
          </a>
          <strong>x - </strong>
        </td>
      </tr>
    </tbody>
  </table>
</div>
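Working with iframes can be tricky. Based on the HTML you provided, I created this example. It works locally, but does it also work for you?
To get to the IFrame, the frames collection can be used. I hope you know the name of the IFrame?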
Dim iframeDoc As MSHTML.HTMLDocument
Set iframeDoc = doc.frames("iframename").document
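If the name of the IFrame is not known, one way to find it is to list the iframe elements of the main document. The following is only a minimal sketch: the routine name ListIFrames is hypothetical, and it assumes doc is the main page's MSHTML.HTMLDocument with the Microsoft HTML Object Library referenced.

```vba
' Hypothetical helper (not part of the original answer): prints the position,
' name and src of every iframe element in the given document to the Immediate window.
Sub ListIFrames(ByVal doc As MSHTML.HTMLDocument)
    Dim iframes As MSHTML.IHTMLElementCollection
    Dim i As Long
    Set iframes = doc.getElementsByTagName("iframe")
    For i = 0 To iframes.Length - 1
        ' getAttribute returns Null when the attribute is absent
        Debug.Print i, iframes.Item(i).getAttribute("name"), iframes.Item(i).getAttribute("src")
    Next i
End Sub
```

If an iframe turns out to have no name at all, the frames collection should also accept a numeric position, e.g. doc.frames(0), although that is worth verifying against the real page.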
Then, to get to the image, we can use the querySelector method, e.g.:
Dim img As MSHTML.HTMLImg
Set img = iframeDoc.querySelector("div table[id='table1'] tbody tr td a[href^='https://stackoverflow.com'] img")
The selector a[href^='https://stackoverflow.com'] selects the anchor whose href attribute starts with the given text. The ^ stands for "starts with".
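Since the real href was redacted in the question, other attribute selectors may be easier to adapt. The sketch below shows the "ends with" ($=) and "contains" (*=) variants next to "starts with" (^=). The routine name SelectorVariants is hypothetical, and it assumes iframeDoc was obtained through the frames collection as shown earlier.

```vba
' Hypothetical demo routine (not part of the original answer).
' Each querySelector call returns Nothing when no element matches.
Sub SelectorVariants(ByVal iframeDoc As MSHTML.HTMLDocument)
    Dim el As MSHTML.IHTMLElement
    ' ^= : href starts with the given text
    Set el = iframeDoc.querySelector("a[href^='https://stackoverflow.com']")
    Debug.Print "starts-with match found: "; Not el Is Nothing
    ' $= : src ends with the given text
    Set el = iframeDoc.querySelector("img[src$='img.gif']")
    Debug.Print "ends-with match found: "; Not el Is Nothing
    ' *= : href contains the given text anywhere
    Set el = iframeDoc.querySelector("a[href*='stackoverflow']")
    Debug.Print "contains match found: "; Not el Is Nothing
End Sub
```

Matching on the image's src (img.gif) avoids depending on the anchor's URL, which may be the more stable choice when the href changes between reports.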
Then, once we have the image, a simple call to Click on its parent, which is the desired anchor, does the job. HTH
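On a live site the iframe may finish loading later than the main page, so the element might not exist yet when querySelector runs. A polling helper with a timeout is one way to guard against that. This is a minimal sketch under stated assumptions: the function name WaitForFrameElement and the 10-second default are hypothetical, and both the Microsoft Internet Controls and Microsoft HTML Object Library references are required.

```vba
' Hypothetical helper (not part of the original answer): polls the named frame
' until the selector matches an element or the timeout expires.
' Returns Nothing on timeout.
Function WaitForFrameElement(ByVal ie As SHDocVw.InternetExplorer, _
                             ByVal frameName As String, _
                             ByVal selector As String, _
                             Optional ByVal timeoutSeconds As Long = 10) As MSHTML.IHTMLElement
    Dim deadline As Date
    Dim doc As MSHTML.HTMLDocument
    Dim frameDoc As MSHTML.HTMLDocument
    deadline = DateAdd("s", timeoutSeconds, Now)
    Do
        DoEvents
        Set frameDoc = Nothing
        On Error Resume Next    ' the frame or its document may not be ready yet
        Set doc = ie.document
        Set frameDoc = doc.frames(frameName).document
        If Not frameDoc Is Nothing Then
            Set WaitForFrameElement = frameDoc.querySelector(selector)
        End If
        On Error GoTo 0
        If Not WaitForFrameElement Is Nothing Then Exit Function
    Loop While Now < deadline
End Function
```

A call such as Set el = WaitForFrameElement(ie, "iframename", "img[src$='img.gif']") followed by an Is Nothing check would then replace a fixed wait. Using Now with a deadline rather than Timer avoids problems when the wait crosses midnight.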
Complete example:
Option Explicit
' Add reference to Microsoft Internet Controls (SHDocVw)
' Add reference to Microsoft HTML Object Library
Sub Demo()
    Dim ie As SHDocVw.InternetExplorer
    Dim doc As MSHTML.HTMLDocument
    Dim url As String
    url = "file:///C:/Users/dusek/Documents/My Web Sites/mainpage.html"
    Set ie = New SHDocVw.InternetExplorer
    ie.Visible = True
    ie.navigate url
    ' Wait until the main page has finished loading
    While ie.Busy Or ie.readyState <> READYSTATE_COMPLETE
        DoEvents
    Wend
    Set doc = ie.document
    ' Get the document hosted inside the iframe named "iframename"
    Dim iframeDoc As MSHTML.HTMLDocument
    Set iframeDoc = doc.frames("iframename").document
    If iframeDoc Is Nothing Then
        MsgBox "IFrame with name 'iframename' was not found."
        ie.Quit
        Exit Sub
    End If
    ' Find the image inside the anchor whose href starts with the given text
    Dim img As MSHTML.HTMLImg
    Set img = iframeDoc.querySelector("div table[id='table1'] tbody tr td a[href^='https://stackoverflow.com'] img")
    If img Is Nothing Then
        MsgBox "Image element within iframe was not found."
        ie.Quit
        Exit Sub
    Else
        ' Click the parent anchor of the image
        img.parentElement.Click
    End If
    ie.Quit
End Sub
Main page HTML used:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
  <!-- saved from url=(0016)http://localhost -->
  <meta content="text/html; charset=utf-8" http-equiv="Content-Type" />
  <title>x -</title>
</head>
<body>
  <iframe name="iframename" src="iframe1.html">
  </iframe>
</body>
</html>
IFrame HTML used (saved as file iframe1.html):
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
  <!-- saved from url=(0016)http://localhost -->
  <meta content="text/html; charset=utf-8" http-equiv="Content-Type" />
  <title>Untitled 2</title>
</head>
<body>
  <div align="center">
    <table id="table1" style="border-collapse: collapse" width="700" cellspacing="0" cellpadding="0" border="0">
      <tbody>
        <tr>
          <td colspan="6"> </td>
        </tr>
        <tr>
          <td colspan="6">
            <a href="https://stackoverflow.com/questions/44902558/accessing-object-in-iframe-using-vba">
              <img src="img.gif" width="38" height="38" border="0" align="right">
            </a>
            <strong>x - </strong>
          </td>
        </tr>
      </tbody>
    </table>
  </div>
</body>
</html>