Detecting when a web page has loaded without using Sleep

use*_*474 5 vbscript internet-explorer dom wsh web-scraping

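I'm creating a VBScript on Windows that opens a site in IE. What I want: detect when the web page has loaded and display a message. I achieved this with WScript.Sleep, waiting roughly as many seconds as the site takes to load. However, partway through loading, the site pops up a username and password prompt, and the page finishes loading only after the user enters credentials. So instead of sleeping for an approximate number of seconds, I want an exact function or method that detects when the page has loaded. I searched online and tried a Do While loop and the onload and onclick functions, but nothing works. To simplify: even if I write a script that opens a site such as Yahoo and shows the message "Hi" once the page has loaded, it doesn't work without WScript.Sleep.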

ome*_*pes 5

Try the conventional approach:

Set objIE = CreateObject("InternetExplorer.Application")
objIE.Visible = True
objIE.Navigate "https://www.yahoo.com/"
Do While objIE.ReadyState <> 4
    WScript.Sleep 10
Loop
' your code here
' ...

UPD: This version should also check for errors:

Set objIE = CreateObject("InternetExplorer.Application")
objIE.Visible = True
objIE.Navigate "https://www.yahoo.com/"
On Error Resume Next
Do 
    If objIE.ReadyState = 4 Then
        If Err = 0 Then
            Exit Do
        Else
            Err.Clear
        End If
    End If
    WScript.Sleep 10
Loop
On Error Goto 0
' your code here
' ...

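UPD2: You wrote that IE gets disconnected when the login pop-up appears. Presumably there is a way to catch the disconnection and then get the IE instance again. Note that this is "abnormal programming" :) I hope this helps: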

Option Explicit
Dim objIE, strSignature, strInitType

Set objIE = CreateObject("InternetExplorer.Application") ' create IE instance
objIE.Visible = True
strSignature = Left(CreateObject("Scriptlet.TypeLib").GUID, 38) ' generate uid
objIE.putproperty "marker", strSignature ' tokenize the instance
strInitType = TypeName(objIE) ' get typename
objIE.Navigate "https://www.yahoo.com/"
MsgBox "Initial type = " & TypeName(objIE) ' for visualisation

On Error Resume Next
Do While TypeName(objIE) = strInitType ' wait until typename changes (ActiveX disconnection), may cause error 800A000E if not within OERN
    WScript.Sleep 10
Loop
MsgBox "Changed type = " & TypeName(objIE) ' for visualisation

Set objIE = Nothing ' excessive statement, just for clearance
Do
    For Each objIE In CreateObject("Shell.Application").Windows ' loop through all explorer windows to find tokenized instance
        If objIE.getproperty("marker") = strSignature Then ' our instance found
            If TypeName(objIE) = strInitType Then Exit Do ' may be excessive type check
        End If
    Next
    WScript.Sleep 10
Loop
MsgBox "Found type = " & TypeName(objIE) ' for visualisation
On Error GoTo 0

Do While objIE.ReadyState <> 4 ' conventional wait if instance not ready
    WScript.Sleep 10
Loop

MsgBox "Title = " & objIE.Document.Title ' for visualisation

You can get all the text nodes, links, and so on from the DOM, as shown below:

Option Explicit
Dim objIE, colTags, strResult, objTag, objChild, arrResult

Set objIE = CreateObject("InternetExplorer.Application")
objIE.Visible = True
objIE.Navigate "https://www.yahoo.com/"

Do While objIE.ReadyState <> 4
    WScript.Sleep 10
Loop

Set colTags = objIE.Document.GetElementsByTagName("a")
strResult = "Total " & colTags.Length & " DOM Anchor Nodes:" & vbCrLf
For Each objTag In colTags
    strResult = strResult & objTag.GetAttribute("href") & vbCrLf
Next
ShowInNotepad strResult

Set colTags = objIE.Document.GetElementsByTagName("*")
arrResult = Array()
For Each objTag In colTags
    For Each objChild In objTag.ChildNodes
        If objChild.NodeType = 3 Then
            ReDim Preserve arrResult(UBound(arrResult) + 1)
            arrResult(UBound(arrResult)) = objChild.NodeValue
        End If
    Next
Next
strResult = "Total " & colTags.Length & " DOM object nodes + total " & UBound(arrResult) + 1 & " #text nodes:" & vbCrLf
strResult = strResult & Join(arrResult, vbCrLf)
ShowInNotepad strResult

objIE.Quit

Sub ShowInNotepad(strToFile)
    Dim strTempPath
    With CreateObject("Scripting.FileSystemObject")
        strTempPath = CreateObject("WScript.Shell").ExpandEnvironmentStrings("%TEMP%") & "\" & .gettempname
        With .CreateTextFile(strTempPath, True, True)
            .WriteLine (strToFile)
            .Close
        End With
        CreateObject("WScript.Shell").Run "notepad.exe " & strTempPath, 1, True
        .DeleteFile (strTempPath)
    End With
End Sub

Also take a look at getting text data.

UPD3: I'd like to add here a check that the web page has finished loading and initializing:

' ...
' Navigating to some url
objIE.Navigate strUrl
' Wait for IE ready
Do While objIE.ReadyState <> 4 Or objIE.Busy
    WScript.Sleep 10
Loop
' Wait for document complete
Do While objIE.Document.ReadyState <> "complete"
    WScript.Sleep 10
Loop
' Processing loaded webpage code
' ...

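UPD4: In some cases you need to track whether the target node has already been created in the document (typically, you have to do this if you get an "Object required" error when trying to access the node via .getElementById and the like):

If the page uses AJAX (the loaded page's source HTML does not contain the target node; active content such as JavaScript creates it dynamically), the page fragment below shows what that can look like. The text node 5.99 may be created only after the page has fully loaded and an additional request has been sent to the server for the extra data to display: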

...
<td class="price-label">
    <span id="priceblock" class="price-big color">
        5.99
    </span>
</td>
...

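Or you may be loading, for example, a Google search results page and waiting for the Next button to appear (especially if you invoked the .click method on the previous page), or loading a page with a login web form and waiting for the username input field, such as <input name="userID" id="userID" type="text" maxlength="24" required="" placeholder="Username" autofocus="">.

The following code adds a check that the target node is accessible: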

With objIE
    ' Navigating to some url
    .Navigate strUrl
    ' Wait for IE ready
    Do While .ReadyState <> 4 Or .Busy
        WScript.Sleep 10
    Loop
    ' Wait for document complete
    Do While .Document.ReadyState <> "complete"
        WScript.Sleep 10
    Loop
    ' Wait for target node created
    Do While TypeName(.Document.getElementById("userID")) = "Null"
        WScript.Sleep 10
    Loop
    ' Processing target node
    .Document.getElementById("userID").Value = "myusername"
    ' ...
    '
End With
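All of the wait loops above spin forever if the page never reaches the expected state (for example, if the login prompt is cancelled). One possible way to guard against that is to wrap the same checks in a helper with a timeout. This is only a sketch: the names IsPageLoaded and WaitForPage, the 100 ms polling interval, and the timeout parameter are assumptions made for illustration and are not part of the original answer.

```vbscript
Option Explicit

' Hypothetical helper: True only when IE and its document both report "complete".
' Any error (disconnected instance, no document yet) is treated as "not loaded".
Function IsPageLoaded(objIE)
    Dim blnReady
    blnReady = False
    On Error Resume Next
    ' If any property access fails, the assignment is skipped and blnReady stays False
    blnReady = (objIE.ReadyState = 4) And (Not objIE.Busy) And (objIE.Document.ReadyState = "complete")
    If Err.Number <> 0 Then blnReady = False
    Err.Clear
    On Error GoTo 0
    IsPageLoaded = blnReady
End Function

' Hypothetical helper: polls IsPageLoaded every 100 ms.
' Returns True once the page is loaded, False after roughly lngTimeoutMs milliseconds.
Function WaitForPage(objIE, lngTimeoutMs)
    Dim lngWaited
    lngWaited = 0
    WaitForPage = True
    Do Until IsPageLoaded(objIE)
        If lngWaited >= lngTimeoutMs Then
            WaitForPage = False
            Exit Do
        End If
        WScript.Sleep 100
        lngWaited = lngWaited + 100
    Loop
End Function
```

With these in the script, calling WaitForPage(objIE, 30000) right after objIE.Navigate returns False if the page has not finished loading within about 30 seconds, so the script can show a message or quit instead of hanging. The elapsed time is only approximate, because it counts sleep intervals rather than reading a clock.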