fac*_*497 1 excel vba excel-vba
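I'm trying to find a way to get data from yelp.com.
I have a spreadsheet with several keywords and locations. I'd like to pull data from Yelp listings based on the keywords and locations already in my spreadsheet.
I wrote the following code, but it seems to return garbage data rather than the exact information I'm looking for.
I want the business name, address, and phone number, but I get nothing useful. Can anyone here help me solve this?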
Sub find()
Dim ie As Object
Set ie = CreateObject("InternetExplorer.Application")
With ie
ie.Visible = False
ie.Navigate "http://www.yelp.com/search?find_desc=boutique&find_loc=New+York%2C+NY&ns=1&ls=3387133dfc25cc99#start=10"
' Don't show window
ie.Visible = False
'Wait until IE is done loading page
Do While ie.Busy
Application.StatusBar = "Downloading information, Please wait..."
DoEvents
Loop
' Make a string from IE content
Set mDoc = ie.Document
peopleData = mDoc.body.innerText
ActiveSheet.Cells(1, 1).Value = peopleData
End With
peopleData = "" 'Nothing
Set mDoc = Nothing
End Sub
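If you right-click and choose View Source in IE, it becomes clear that the data the site serves is not part of the document.Body.innerText property. I've noticed this is often the case with dynamically served data, and this approach is simply too simplistic for most web scraping.
I opened the page in Google Chrome and inspected the elements to see what I actually wanted and how to find it with a DOM/HTML parser; you will need to add a reference to the Microsoft HTML Object Library.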

I think you could have it return a collection of <DIV> tags and then check each one's class name with an If statement inside the loop.
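The <DIV> idea above might look roughly like the following. This is only a sketch: PrintDivsByClass and its targetClass parameter are hypothetical names introduced here, and the class name to pass in has to be read off the page with the browser's element inspector. It assumes html is an already loaded document, such as ie.Document.

```vba
'Sketch only: writes the text of every <DIV> whose class attribute
'contains targetClass to column A of the active sheet, one per row.
'PrintDivsByClass and targetClass are illustrative names, not part of any library.
Sub PrintDivsByClass(ByVal html As Object, ByVal targetClass As String)
    Dim divs As Object   'IHTMLElementCollection
    Dim d As Object      'IHTMLElement
    Dim r As Long

    Set divs = html.getElementsByTagName("DIV")
    For Each d In divs
        'className holds the whole class attribute, e.g. "a b c";
        'padding with spaces avoids matching only part of a class name
        If InStr(1, " " & d.className & " ", " " & targetClass & " ", vbTextCompare) > 0 Then
            ActiveSheet.Range("A1").Offset(r, 0).Value = d.innerText
            r = r + 1
        End If
    Next d
End Sub
```

It would be called once the page has finished loading, for example PrintDivsByClass ie.Document, "some-class". Nested <DIV> elements that carry the same class will each produce a row, so the output may contain overlapping text.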
I made some revisions to the original answer; this should print each record in a new cell:
Option Explicit
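'Sleep is a Windows API call; PtrSafe is required in 64-bit Office (VBA7)
#If VBA7 Then
Private Declare PtrSafe Sub Sleep Lib "kernel32" (ByVal dwMilliseconds As Long)
#Else
Private Declare Sub Sleep Lib "kernel32" (ByVal dwMilliseconds As Long)
#End If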
Sub find()
'Uses late binding, or add reference to Microsoft HTML Object Library
' and change variable Types to use intellisense
Dim ie As Object 'InternetExplorer.Application
Dim html As Object 'HTMLDocument
Dim Listings As Object 'IHTMLElementCollection
Dim l As Object 'IHTMLElement
Dim r As Long
Set ie = CreateObject("InternetExplorer.Application")
With ie
.Visible = False ' Don't show window
.Navigate "http://www.yelp.com/search?find_desc=boutique&find_loc=New+York%2C+NY&ns=1&ls=3387133dfc25cc99#start=10"
'Wait until IE is done loading page
Do While .readyState <> 4
Application.StatusBar = "Downloading information, Please wait..."
DoEvents
Sleep 200
Loop
Set html = .Document
End With
Set Listings = html.getElementsByTagName("LI") ' ## returns the list
For Each l In Listings
'## make sure this list item looks like the listings Div Class:
' then, build the string to put in your cell
If InStr(1, l.innerHTML, "media-block clearfix media-block-large main-attributes") > 0 Then
Range("A1").Offset(r, 0).Value = l.innerText
r = r + 1
End If
Next
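Application.StatusBar = False 'Restore the default status bar
ie.Quit 'Close the hidden IE instance so it does not linger
Set html = Nothing
Set ie = Nothing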
End Sub
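Since the question mentions keywords and locations already stored in a spreadsheet, one way to drive the routine above from those cells is to build the search URL per row. The sketch below makes several assumptions: keywords sit in column A and locations in column B with a header in row 1, only the find_desc and find_loc parameters seen in the URL above are needed, and Excel 2013 or later is available for WorksheetFunction.EncodeURL. BuildSearchUrl and ListSearchUrls are hypothetical helper names.

```vba
'Sketch only: builds a search URL from a keyword and a location.
'find_desc and find_loc are the query parameters visible in the URL above;
'whether the site needs the other parameters (ns, ls) is not verified here.
Function BuildSearchUrl(ByVal keyword As String, ByVal location As String) As String
    With Application.WorksheetFunction
        BuildSearchUrl = "http://www.yelp.com/search?find_desc=" & .EncodeURL(keyword) & _
                         "&find_loc=" & .EncodeURL(location)
    End With
End Function

'Prints one URL per data row to the Immediate window.
'Assumed layout: keywords in column A, locations in column B, header in row 1.
Sub ListSearchUrls()
    Dim lastRow As Long
    Dim i As Long

    With ActiveSheet
        lastRow = .Cells(.Rows.Count, "A").End(xlUp).Row
        For i = 2 To lastRow
            Debug.Print BuildSearchUrl(CStr(.Cells(i, "A").Value), CStr(.Cells(i, "B").Value))
        Next i
    End With
End Sub
```

Each generated URL could then be passed to .Navigate in place of the hard-coded address.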