I'm trying to automate downloading some data from this government website's API.
The site instructs:
All requests for this version should be made to the following URL:
http://api.finder.healthcare.gov/v2.0/
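I can find plenty of information on how to send XML requests in general, but none of the examples are R-specific. There is also plenty of R code out there showing how to use the XML, httr, and RCurl packages, but I can't find a single example on SO or the r-help mailing list of how to actually send an XML request; most of the documentation covers parsing the response instead.
On the government site, if you click the PlansForIndividualOrFamily Samples example, it shows the XML request that needs to be sent (code below).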
url <- "http://api.finder.healthcare.gov/v2.0/"
xml.request <-
"<?xml version='1.0' encoding='UTF-8'?>
<PrivateOptionsAPIRequest>
<PlansForIndividualOrFamilyRequest>
<Enrollees>
<Primary>
<DateOfBirth>1990-01-01</DateOfBirth>
<Gender>Male</Gender>
<TobaccoUser>Smoker</TobaccoUser>
</Primary>
</Enrollees>
<Location>
<ZipCode>69201</ZipCode>
<County>
<CountyName>CHERRY</CountyName>
<StateCode>NE</StateCode>
</County>
</Location>
<InsuranceEffectiveDate>2012-10-01</InsuranceEffectiveDate>
<IsFilterAnalysisRequiredIndicator>false</IsFilterAnalysisRequiredIndicator>
<PaginationInformation>
<PageNumber>1</PageNumber>
<PageSize>10</PageSize>
</PaginationInformation>
<SortOrder>
<SortField>OOP LIMIT - INDIVIDUAL - IN NETWORK</SortField>
<SortDirection>ASC</SortDirection>
</SortOrder>
<Filter/>
</PlansForIndividualOrFamilyRequest>
</PrivateOptionsAPIRequest>"
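If the same request has to be sent for several locations, one option is to edit the request string before posting it. The following is only a minimal sketch in base R: `set_field` is a hypothetical helper (not part of any package), the ZIP code `68001` is an arbitrary illustrative value, and the replacement value is assumed to be plain text without XML special characters or backslashes.

```r
# Hypothetical helper: replace the text of the first <tag>...</tag>
# element found in an XML string.
set_field <- function(xml, tag, value) {
  pattern <- sprintf("<%s>[^<]*</%s>", tag, tag)
  replacement <- sprintf("<%s>%s</%s>", tag, value, tag)
  sub(pattern, replacement, xml)
}

# A fragment of the request above, repeated here so the example runs on its own.
location <- paste0(
  "<Location><ZipCode>69201</ZipCode>",
  "<County><CountyName>CHERRY</CountyName><StateCode>NE</StateCode></County>",
  "</Location>"
)

# Swap in a different ZIP code (arbitrary value, for illustration only).
location <- set_field(location, "ZipCode", "68001")
cat(location, "\n")
```

The same call could be applied to the full `xml.request` string. In a real request the county and state would presumably have to match the new ZIP code.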
With RCurl you do it like this:
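library(RCurl)
# HTTP headers: declare the body as XML and give its length in bytes
myheader <- c(Connection = "close",
              'Content-Type' = "application/xml",
              'Content-length' = nchar(xml.request, type = "bytes"))
# Supplying postfields makes getURL() send a POST with the XML as the body
data <- getURL(url = url,
               postfields = xml.request,
               httpheader = myheader,
               verbose = TRUE)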
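Since the question also mentions httr, here is a rough equivalent with that package. It is a sketch that has not been run against this API: `post_xml` is a hypothetical helper name, and it assumes the `url` and `xml.request` objects defined in the question.

```r
# Sketch only: send an XML body with httr instead of RCurl.
# post_xml() is a hypothetical helper name, not part of httr.
post_xml <- function(url, xml_body) {
  resp <- httr::POST(
    url,
    body = xml_body,              # a character string is sent as-is
    httr::content_type_xml()      # sets Content-Type: application/xml
  )
  httr::stop_for_status(resp)     # turn HTTP error codes into R errors
  httr::content(resp, as = "text", encoding = "UTF-8")
}

# Usage (needs network access plus `url` and `xml.request` from the question):
# data <- post_xml(url, xml.request)
```

The returned string can be parsed in the same way as the RCurl result.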
That's all there is to it. You can then use xpathApply from the XML package to retrieve the data. For example, to get the plan IDs:
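library(XML)
# Parse the response text into an internal document so XPath can be used on it
xmltext <- xmlTreeParse(data, asText = TRUE, useInternalNodes = TRUE)
# Extract the text of every PlanID node; adjust the XPath for other fields
unlist(xpathApply(xmltext, '//Plan/PlanID', xmlValue))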
"29678NE0780012" "29678NE0780011" "29678NE0140010"
"29678NE0780010" "29678NE0140019" "29678NE0140018" "29678NE0140017"
"29678NE0140016" "29678NE0780005" "29678NE0780004"
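To try the parsing step without calling the API, the extraction can be run on a small stand-in document. In the sketch below, `sample_response` is a made-up, heavily simplified response (the `Response` and `Plans` element names are assumptions; only the `Plan/PlanID` path and the two ID values come from the code and output above). It also uses `xpathSApply`, which simplifies the result to a vector so `unlist` is not needed.

```r
library(XML)

# Made-up, simplified stand-in for the API response (structure assumed).
sample_response <- "<Response><Plans>
  <Plan><PlanID>29678NE0780012</PlanID></Plan>
  <Plan><PlanID>29678NE0780011</PlanID></Plan>
</Plans></Response>"

# xmlParse() returns an internal document, ready for XPath queries.
doc <- xmlParse(sample_response, asText = TRUE)

# xpathSApply() simplifies the node values to a character vector.
plan_ids <- xpathSApply(doc, "//Plan/PlanID", xmlValue)

# Collect the values in a data frame for further work.
plans <- data.frame(plan_id = plan_ids, stringsAsFactors = FALSE)
print(plans)
```

Further columns could be added the same way, one `xpathSApply` call per field, as long as every `Plan` node contains that field exactly once.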