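I have been using the GetSearchResults web service, and it suddenly stopped working because it asks me to answer a Captcha. That makes no sense: it is an API, so it should not require a human response.
This code runs on Google App Engine. It works fine on localhost but fails in production.
Below is what my code is trying to fetch. The HTML that comes back contains a Captcha; I should be getting XML back. What is going on?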
zillow: http://www.zillow.com/webservice/GetSearchResults.htm?zws-id=[API KEY REMOVED]&address=10797+Alameda+Ave&citystatezip=92316&rentzestimate=1
zillow results:
<html><head><title>Zillow: Real Estate, Apartments, Mortgage & Home Values in the US</title><meta http-equiv="X-UA-Compatible" content="IE=8, IE=9"/><meta name="ROBOTS" content="NOINDEX, NOFOLLOW"/><link href="//fonts.googleapis.com/css?family=Open+Sans:400&subset=latin" rel="stylesheet" type="text/css"/><link href="http://www.zillowstatic.com/vstatic/5b67875/static/css/z-pages/captcha.css" type="text/css" rel="stylesheet" media="screen"/><script language="javascript">
function onReCaptchaLoad() {
window.reCaptchaLoaded = true;
}
window.setTimeout(function () {
if (!window.reCaptchaLoaded) {
document.getElementById('norecaptcha').value = true;
document.getElementById('captcha-form').submit();
}
}, 5000);
</script></head><body><main class="zsg-layout-content"><div class="error-content-block"><div class="error-text-content"><!-- <h1>Captcha</h1> --><h5>Please verify you're a human to continue.</h5><div id="content" class="captcha-container"><form method="POST" action="" id="captcha-form"><script type="text/javascript">
var RecaptchaOptions = {"theme":"white","lang":"en-US"};
</script>
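<script …
Just recently, a script that had been running perfectly started getting 410 responses for HTTP GET calls to http://www.zillow.com/webservice/GetSearchResults.htm and http://www.zillow.com/webservice/GetDeepSearchResults.htm.
Pasting the URL the script fetches into a browser produces the same 410 error, as if they had removed the endpoint entirely.
I see references to their new Bridge API, but I cannot find any notice that their old API has been discontinued. Any insights?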
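The two reports above describe different failure modes: an HTML Captcha page served where XML was expected, and a 410 status. A minimal sketch for telling them apart before any XML parsing is attempted follows. The function name `classify_response` and the labels it returns are hypothetical, chosen only for illustration; the checks are simple heuristics, not anything documented by Zillow.

```python
def classify_response(status_code, content_type, body):
    """Return a short label describing what the server sent back.

    Heuristic only: the labels and checks are illustrative assumptions.
    """
    # 410 Gone means the resource was removed; retrying will not help.
    if status_code == 410:
        return "gone"
    text = body.lstrip().lower()
    # An HTML document that mentions a captcha is a bot check, not API data.
    if text.startswith("<html") and "captcha" in text:
        return "captcha"
    # Otherwise accept the payload as XML if either signal says so.
    if "xml" in content_type.lower() or text.startswith("<?xml"):
        return "xml"
    return "unknown"
```

With a client such as `requests`, the three arguments would come from `response.status_code`, `response.headers.get("Content-Type", "")`, and `response.text`. Logging the label in production makes it easier to see whether the hosting environment is being treated as a bot while localhost is not.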
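I am trying to get the API from a site called Zillow working for me, but I am new to web development. They try to explain how to use it here, but it lost me, so I looked at their forum. Someone there posted an "example", but I cannot see where their code even calls the API. Basically I need a form field that takes an address and sends it off to fetch the data. Here is the source code taken from that example: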
<html xml:lang="en" lang="en">
<head>
<title></title>
</head>
<body>
<h3><font face="Verdana, Arial, Helvetica, sans-serif">Get Property < # >Zestimates
from Zillow</a></font></h3>
<form method="post" action="/Real-Estate/Zestimate.php" name="zip_search">
<table align="center" width="618">
<tr>
<td colspan="2"><font face="verdana, arial, sans-serif">Please specify the
Property address. </font></td>
<td width="205" align="left"> <div align="left"><font face="Verdana, Arial, Helvetica, sans-serif"><#></a></font></div></td>
</tr>
<tr>
<td colspan="2"><font face="Verdana, Arial, Helvetica, sans-serif">Street</font>:
<input id="street2" type="text" maxlength="50" size="50" value="" name="street"/></td>
<td> </td>
</tr>
<tr>
<td colspan="2"><font face="verdana, arial, sans-serif">City, State or ZipCode:</font>
<input id="citystatezip3" type="text" maxlength="50" size="20" value="" …
I am having some trouble with the Zillow API:
The problem is that I cannot find a way to run a general search with the Zillow API, for example searching by zip code alone. Here is a sample query from the deep search:
<SearchResults:searchresults xsi:schemaLocation="http://www.zillow.com/static/xsd/SearchResults.xsd http://www.zillowstatic.com/vstatic/419b583f682a74b83f007039dd9c49f8/static/xsd/SearchResults.xsd" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:SearchResults="http://www.zillow.com/static/xsd/SearchResults.xsd">
<request>
<address>15096 Oak Creek Rd</address>
<citystatezip>El Cajon, CA</citystatezip>
</request>
<message>
<text>Request successfully processed</text>
<code>0</code>
</message>
<response>
<results>
<result>
<zpid>16893601</zpid>
<links>
<homedetails>http://www.zillow.com/homedetails/15096-Oak-Creek-Rd-El-Cajon-CA-92021/16893601_zpid/</homedetails>
<graphsanddata>http://www.zillow.com/homedetails/15096-Oak-Creek-Rd-El-Cajon-CA-92021/16893601_zpid/#charts-and-data</graphsanddata>
<mapthishome>http://www.zillow.com/homes/16893601_zpid/</mapthishome>
<comparables>http://www.zillow.com/homes/comps/16893601_zpid/</comparables>
</links>
<address>
<street>15096 Oak Creek Rd</street>
<zipcode>92021</zipcode>
<city>El Cajon</city>
<state>CA</state>
<latitude>32.86576</latitude>
<longitude>-116.847964</longitude>
</address>
<FIPScounty>6073</FIPScounty>
<useCode>SingleFamily</useCode>
<taxAssessmentYear>2012</taxAssessmentYear>
<taxAssessment>496002.0</taxAssessment>
<yearBuilt>2006</yearBuilt>
<lotSizeSqFt>108900</lotSizeSqFt>
<finishedSqFt>2700</finishedSqFt>
<bathrooms>3.0</bathrooms>
<bedrooms>3</bedrooms>
<totalRooms>7</totalRooms>
<lastSoldDate>03/22/1999</lastSoldDate>
<lastSoldPrice currency="USD">268000</lastSoldPrice>
<zestimate>
<amount currency="USD">581783</amount>
<last-updated>05/12/2013</last-updated>
<oneWeekChange deprecated="true"/>
<valueChange duration="30" currency="USD">12050</valueChange>
<valuationRange>
<low currency="USD">523605</low> …
I want to access GetDeepSearchResults information from the Zillow API.
My code:
library(ZillowR)
zapi_key = getOption('Myapikey')
GetDeepSearchResults(
address = '600 S. Quail Ct.',
zipcode = '67114',
rentzestimate = FALSE,
api_key = zapi_key
)
Error:
Error in GetDeepSearchResults(address = "600 S. Quail Ct.", zipcode = "67114", :
unused arguments (zipcode = "67114", api_key = zapi_key)
Why does this error occur? What can I do to fix it?
Edit: I changed my code based on the comments and got:
My code:
library(ZillowR)
zapi_key = getOption('myapikey')
GetDeepSearchResults(
address = '600 S. Quail Ct.',
citystatezip = '67114',
rentzestimate = FALSE,
zws_id = 'myapikey',
url = "http://www.zillow.com/webservice/GetDeepSearchResults.htm"
)
Output:
$request
$request$address
NULL …
I am using the Zillow API but am having trouble retrieving rental data. I am currently using a Python Zillow wrapper, but I am not sure whether it supports pulling rental data.
This is the help page I am using for the Zillow API: https://www.zillow.com/howto/api/GetSearchResults.htm
import pyzillow
from pyzillow.pyzillow import ZillowWrapper, GetDeepSearchResults
import pandas as pd
house = pd.read_excel('Housing_Output.xlsx')
### Login to Zillow API
address = ['123 Test Street City, State Abbreviation'] # Fill this in with an address
zip_code = ['zip code'] # fill this in with a zip code
zillow_data = ZillowWrapper(API KEY)
deep_search_response = zillow_data.get_deep_search_results(address, zip_code)
result = GetDeepSearchResults(deep_search_response)
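# These API calls work, but I am not sure …
Following this tutorial, I am trying to extract basic property information from zillow.com. More specifically, I want to extract the information associated with the property cards shown on the site.
Although there are several property cards on the first page, the code below only extracts information for 3 properties. Can someone explain why the code skips the remaining ones?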
import requests
import ast
from bs4 import BeautifulSoup
url = 'https://www.zillow.com/homes/for_sale/house,multifamily,townhouse_type/?searchQueryState=%7B%22pagination%22%3A%7B%7D%2C%22mapBounds%22%3A%7B%22west%22%3A-106.43826441618356%2C%22east%22%3A-103.36483912321481%2C%22south%22%3A38.903882034738686%2C%22north%22%3A40.52008627183672%7D%2C%22mapZoom%22%3A9%2C%22customRegionId%22%3A%22fcac4612c1X1-CR9xde3hldsvpa_v24ah%22%2C%22isMapVisible%22%3Afalse%2C%22filterState%22%3A%7B%22hoa%22%3A%7B%22max%22%3A200%7D%2C%22con%22%3A%7B%22value%22%3Afalse%7D%2C%22apa%22%3A%7B%22value%22%3Afalse%7D%2C%22sch%22%3A%7B%22value%22%3Atrue%7D%2C%22ah%22%3A%7B%22value%22%3Atrue%7D%2C%22sort%22%3A%7B%22value%22%3A%22globalrelevanceex%22%7D%2C%22land%22%3A%7B%22value%22%3Afalse%7D%2C%22schu%22%3A%7B%22value%22%3Afalse%7D%2C%22manu%22%3A%7B%22value%22%3Afalse%7D%2C%22schr%22%3A%7B%22value%22%3Afalse%7D%2C%22apco%22%3A%7B%22value%22%3Afalse%7D%2C%22basf%22%3A%7B%22value%22%3Atrue%7D%2C%22schc%22%3A%7B%22value%22%3Afalse%7D%2C%22schb%22%3A%7B%22min%22%3A%227%22%7D%7D%2C%22isListVisible%22%3Atrue%7D'
headers = {
'accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9',
'accept-encoding': 'gzip, deflate, br',
'accept-language': 'en-US,en;q=0.9',
'cookie': 'zguid=23|%24ca6368b9-7b92-4d51-ab67-c2be89065efd; _ga=GA1.2.1460486079.1621047110; _pxvid=7fa13d96-b528-11eb-9860-0242ac120012; _gcl_au=1.1.2025797213.1621047113; __gads=ID=66253ab863481044:T=1621047113:S=ALNI_MZr3mehwm2Wjo7NOrmalVtEcJSXag; __pdst=50987f626deb4767a53b5d8ca2ea406a; _fbp=fb.1.1621047115574.1019382068; _pin_unauth=dWlkPU5EVm1PRGRpTVRBdE5UTTFaUzAwWlRBNExUZzJZall0TWpZMU1HWTBNV0ppWlRkbA; G_ENABLED_IDPS=google; userid=X|3|231a9d744e104379%7C3%7CiEt8bkUx9hWaFeyCeAwN9tHl_T0d0Cq-kynGuEvNYr4%3D; loginmemento=1|c2274ba4a4ad76bbe89263d30695c182e9177b9c40a2691f3054987d66a944be; zjs_user_id=%22X1-ZU158jhpb2klds9_4wzn7%22; zgcus_lbut=; zgcus_aeut=189997416; zgcus_ludi=b44a961b-c7ef-11eb-a48f-96824e7eff50-18999; optimizelyEndUserId=oeu1623111792776r0.8778663892923859; _cs_c=1; WRUIDAWS=3326630244368428; visitor_id701843=248614376; visitor_id701843-hash=4be116fbd77089f953bfb6eaf5996ef92662a6ef7d237d3c49f154ffaf4eaa9295c64fb254b106bdff234e183c94498c01af2aab; __stripe_mid=80125db1-17d1-4fc5-ae37-86b12a68709cf3da6d; g_state={"i_p":1627697570928,"i_l":4}; zjs_anonymous_id=%22ca6368b9-7b92-4d51-ab67-c2be89065efd%22; _gac_UA-21174015-56=1.1626042638.Cj0KCQjwraqHBhDsARIsAKuGZeH8gi095UkXfohW-WWvyLosdmTdL8cfJwgAabYF9hS2XU6JlXqpWLcaAq5SEALw_wcB; _gcl_aw=GCL.1626042640.Cj0KCQjwraqHBhDsARIsAKuGZeH8gi095UkXfohW-WWvyLosdmTdL8cfJwgAabYF9hS2XU6JlXqpWLcaAq5SEALw_wcB; zgsession=1|1edd82e6-372a-4546-bc8b-c2bbadfd29b4; DoubleClickSession=true; fbc=fb.1.1626412984774.IwAR2QM6bzrTskAWN5Sk8UnmPlAxb1HRy1h1GRch888QqXfczHZZWb2vDZfIw; _fbc=fb.1.1626413249162.IwAR2QM6bzrTskAWN5Sk8UnmPlAxb1HRy1h1GRch888QqXfczHZZWb2vDZfIw; _csrf=lV2BBFim7Vy2gFTn--PUt0VA; _gaexp=GAX1.2.w27igyYtRQaAa8XQM3MjDw.18837.2!VDVoDKTnRcyv8f4FAcJ8PA.18915.2!Khnq27RoQmSe5DEusmh5xA.18913.3; _gid=GA1.2.705011419.1630004829; FSsampler=707279376; __CT_Data=gpv=26&ckp=tld&dm=zillow.com&apv_82_www33=26&cpv_82_www33=26&rpv_82_www33=13; 
OptanonConsent=isIABGlobal=false&datestamp=Fri+Aug+27+2021+12%3A39%3A52+GMT-0600+(Mountain+Daylight+Time)&version=5.11.0&landingPath=NotLandingPage&groups=1%3A1%2C3%3A1%2C4%3A1&AwaitingReconsent=false; _cs_id=41cbdc9c-bb0b-aad9-9521-b1328a65ff77.1623111795.22.1630089665.1630089591.1.1657275795752; utag_main=v_id:01796deff9e3001a59964343177e03079002907100838$_sn:41$_se:2$_ss:0$_st:1630255637884$dc_visit:38$ses_id:1630253822479%3Bexp-session$_pn:1%3Bexp-session$dcsyncran:1%3Bexp-session$tdsyncran:1%3Bexp-session$dc_event:2%3Bexp-session$dc_region:us-east-1%3Bexp-session$ttd_uuid:7b8796ca-44dd-45c9-97d9-bcb642d04cd1%3Bexp-session; JSESSIONID=6CB8C410E0FE216644E8C3A0D0851618; ZILLOW_SID=1|AAAAAVVbFRIBVVsVEklf443J474nftKzJe5PKLD80sujgHvySB7tGcqZunX3BDDH9VwceMqGMTPC54%2F0q4CH%2BfmwsC6P; KruxPixel=true; _derived_epik=dj0yJnU9ai1PSUp1eHZ2Y3J3d0c2NVU1N3BBOFlHbnRBOGFzT0smbj1vLWRISDFwdUNoblN5MjQ4cTVyN213Jm09MSZ0PUFBQUFBR0VzRjRVJnJtPTEmcnQ9QUFBQUFHRXNGNFU; KruxAddition=true; search=6|1632872450375%7Crect%3D40.241821806991595%252C-103.77545313688668%252C39.18758562803622%252C-106.02765040251168%26disp%3Dmap%26mdm%3Dauto%26type%3Dhouse%252Cmultifamily%252Ctownhouse%26fs%3D1%26fr%3D0%26mmm%3D1%26rs%3D0%26ah%3D0%09%0911093%09%09%09%09%09%09; _uetsid=d5e0465006a011ecbe3bd1a0f1c47d01; _uetvid=987e1c70c40a11ebaed8859af36f82fb; _px3=ba45c3df5d5d63d4d9780a102253cd60b21ab52b04778344e332e05474011c21:oCvapPXE6jD0rCXhSf4UjtEC2U956148EDyiWwRFOF8z5vwK63/hC8OWsk09O61g1spnZw64iXApZu1wOmKpyA==:1000:68UzJ5+ar5XwNm61bm41bhSHp8Zp1PfQQlL/5tcqdUIJ3RmA106//vvYGewCCwmln6acqbDAVKgqfB8Th05yX0Cw0TBW7dhfNdeNRjp9bxeLvKqZ56yuW+aVoYYp/zj6MNKv9c16vKlP771xSdCgUTvZ0CDmh7Ng55sHugOHt/jj+2Zmp2WLnuYR4rf7SEndqWBbAyQhhG4BKeyrZyEMpA==; AWSALB=3BIj2fUDeYgoAcLKaZdMkcyTzWSof62v91DQuCssJMyknlpZWcRcVnUU5Me29AcnFcjg1k9H2ehS6N0rSwxo4w8lmEvFCy6hgQfKm1HH8oVoWtpICS36NoLMMxmZ; AWSALBCORS=3BIj2fUDeYgoAcLKaZdMkcyTzWSof62v91DQuCssJMyknlpZWcRcVnUU5Me29AcnFcjg1k9H2ehS6N0rSwxo4w8lmEvFCy6hgQfKm1HH8oVoWtpICS36NoLMMxmZ',
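'referer': …
I am trying to use the Zillow API. It works on my local machine and returns all the data I need, but when I deploy it to our hosting, the API returns "Request blocked, crawler detected".
Here is sample code that works locally but not on our server.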
echo @file_get_contents("example.xml");
Thanks!
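I am working with the Zillow Zestimate and trying to get some data from it. I have a form that collects the customer's current address.
What I need to know is how to call the API with the API key I received and then, once I have the XML data back, how to display it on my web page.
Here is the link to the API call:
What do I need to do to solve this?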
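The two steps asked about here (build the request URL from a key and an address, then pull a value out of the returned XML) can be sketched with the Python standard library alone. The query parameter names and the element path are taken from the request URL and the sample response quoted earlier in this listing; the helper names `build_url` and `extract_zestimate` are hypothetical. Another question in this listing reports the endpoint returning 410, so this only illustrates the shape of the exchange and may not work against the live service.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Endpoint as quoted in the questions above; it may no longer respond.
BASE_URL = "http://www.zillow.com/webservice/GetSearchResults.htm"


def build_url(zws_id, address, citystatezip):
    """Build the GET URL; urlencode escapes spaces and punctuation."""
    query = urlencode({
        "zws-id": zws_id,
        "address": address,
        "citystatezip": citystatezip,
    })
    return BASE_URL + "?" + query


def extract_zestimate(xml_text):
    """Return (amount, currency) for the first result, or None if absent."""
    root = ET.fromstring(xml_text)
    # Child elements in the sample response carry no namespace prefix,
    # so a plain path works even though the root element is prefixed.
    amount = root.find("./response/results/result/zestimate/amount")
    if amount is None:
        return None
    return int(amount.text), amount.get("currency")
```

The value returned by `extract_zestimate` is plain data, so it can be passed to whatever template or page-rendering code the site already uses.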
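My attempts to collect data from Zillow have been unsuccessful.
Example:
url = https://www.zillow.com/homes/for_sale/Los-Angeles-CA_rb/?fromHomePage=true&shouldFireSellPageImplicitClaimGA=false&fromHomePageTab=buy
I want to extract information such as address, price, Zestimate, and location for all homes in Los Angeles.
I have tried HTML scraping with packages like BeautifulSoup. I have also tried using json. I am almost certain that Zillow's API will not help. My understanding is that the API is best suited to collecting information about a specific property.
I have been able to scrape information from other sites, but Zillow appears to use dynamic IDs (which change on every refresh), which makes that information harder to access.
Update: I tried the following code, but it still produces no results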
import requests
from bs4 import BeautifulSoup
url = 'https://www.zillow.com/homes/for_sale/Los-Angeles-CA_rb/?fromHomePage=true&shouldFireSellPageImplicitClaimGA=false&fromHomePageTab=buy'
page = requests.get(url)
data = page.content
soup = BeautifulSoup(data, 'html.parser')
for li in soup.find_all('div', {'class': 'zsg-photo-card-caption'}):
    try:
        # There are sponsored links in the list; you might need to handle those.
        # Better to check for null values, which we are not doing here.
        print(li.find('span', {'class': 'zsg-photo-card-price'}).text)
        print(li.find('span', {'class': 'zsg-photo-card-info'}).text)
        print(li.find('span', {'class': 'zsg-photo-card-address'}).text)
        print(li.find('span', {'class': 'zsg-photo-card-broker-name'}).text)
    except:
        print('An error occurred')
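When a loop like the one above prints nothing, one thing worth checking is whether the expected elements are present at all in the HTML that was actually downloaded. The sketch below uses only the standard library to count start tags carrying a given class. `ClassCounter` and `count_class` are hypothetical names introduced for this illustration. A count of zero would suggest the cards are not in the static response, for instance because the page builds them with JavaScript or because the server returned a Captcha page like the one quoted at the top of this listing; it does not by itself say which.

```python
from html.parser import HTMLParser


class ClassCounter(HTMLParser):
    """Count start tags whose class attribute contains a given class name."""

    def __init__(self, class_name):
        super().__init__()
        self.class_name = class_name
        self.count = 0

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; value can be None.
        classes = (dict(attrs).get("class") or "").split()
        if self.class_name in classes:
            self.count += 1


def count_class(html, class_name):
    """Return how many elements in html carry class_name."""
    parser = ClassCounter(class_name)
    parser.feed(html)
    return parser.count
```

For the code above, `count_class(page.text, 'zsg-photo-card-caption')` would report how many caption containers `requests` actually received.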