I am trying to extract data from the following site:
https://www.zomato.com/ncr/restaurants/north-indian
using R. I am a learner and a beginner in this field!
Here is what I tried:
> library(XML)
> doc<-htmlParse("the url mentioned above")
> Warning message:
> XML content does not seem to be XML: 'https://www.zomato.com/ncr/restaurants/north-indian'
That was one attempt... I also tried readLines(), with the following output:
> readLines("the URL as mentioned above") [i can't specify more than two links so typing this]
> Error in file(con, "r") : cannot open the connection
> In addition: Warning message:
> In file(con, "r") : unsupported URL scheme
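I understand that the page is not XML, as the error says, but is there any other way I can capture data from this site? I did try using tidy html to convert it to XML or XHTML and then process it, but I got nowhere; maybe I don't yet know the actual procedure for using tidy html! :( Not sure! Any suggestions for solving this, and corrections if any?
Why does this error occur:
Error: Namespace for prefix 'xsi' has not been declared.
Here is my Java code: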
package com.emp.ma.jbl.nsnhlrspmlpl.nsnhlrspmlpl.internal.action;

import com.emp.ma.util.xml.XMLDocument;
import com.emp.ma.util.xml.XMLDocumentBuilder;

public class yay {
    public static void main(String[] args) {
        XMLDocument xmldoc = XMLDocumentBuilder.newDocument().addRoot("spml:modifyRequest");
        xmldoc.gotoRoot().addTag("modification").addText("");
        xmldoc.gotoChild("modification").addTag("valueObject").addText("");
        xmldoc.gotoChild("valueObject").addAttribute("xsi:type","halo");
        System.out.println(xmldoc);
    }
}
This code worked fine until I experimented with converting an XML file to HTML and tried throwing a transformer exception. I need to create an XML file in the following format:
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<spml:modifyRequest>
<modification>
<valueObject xsi:type="halo">
</valueObject>
</modification>
</spml:modifyRequest>
I have since removed the transformer part from the code, but I still get this error in Eclipse:
ERROR: 'Namespace for prefix 'xsi' has not been declared.'
Exception in thread "main" com.emp.ma.util.xml.XMLDocumentException: java.lang.RuntimeException: Namespace for prefix 'xsi' has not been declared.
at com.emp.ma.util.xml.XMLDocumentImpl.toResult(XMLDocumentImpl.java:1244)
at com.emp.ma.util.xml.XMLDocumentImpl.toStream(XMLDocumentImpl.java:1314)
at com.emp.ma.util.xml.XMLDocumentImpl.toString(XMLDocumentImpl.java:1336)
at com.emp.ma.util.xml.XMLDocumentImpl.toString(XMLDocumentImpl.java:1325)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source) …
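The message appears to come from the serializer: an attribute named `xsi:type` is being written, but no `xmlns:xsi` declaration is in scope, so the prefix cannot be resolved. The `com.emp.ma.util.xml` classes are not public, so the sketch below does not use them; it reproduces the same document with the standard JAXP DOM API instead and declares both prefixes on the root element. The class name `NamespaceSketch`, the method `buildModifyRequest`, and the `urn:example:spml` URI are placeholders made up for illustration (the real SPML namespace URI is not given in the question); only the `xsi` URI is the standard XML Schema instance namespace.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

class NamespaceSketch {
    // Standard namespace URIs defined by the W3C.
    static final String XSI_NS = "http://www.w3.org/2001/XMLSchema-instance";
    static final String XMLNS_NS = "http://www.w3.org/2000/xmlns/";
    // ASSUMPTION: placeholder URI; substitute the real SPML namespace.
    static final String SPML_NS = "urn:example:spml";

    static String buildModifyRequest() {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder().newDocument();

            // Root element carries the declarations for both prefixes.
            Element root = doc.createElementNS(SPML_NS, "spml:modifyRequest");
            root.setAttributeNS(XMLNS_NS, "xmlns:spml", SPML_NS);
            root.setAttributeNS(XMLNS_NS, "xmlns:xsi", XSI_NS);
            doc.appendChild(root);

            // Unprefixed children, as in the target document.
            Element modification = doc.createElement("modification");
            root.appendChild(modification);
            Element valueObject = doc.createElement("valueObject");
            // The prefixed attribute is bound to the declared xsi namespace.
            valueObject.setAttributeNS(XSI_NS, "xsi:type", "halo");
            modification.appendChild(valueObject);

            // Serialize the DOM tree to a string.
            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.INDENT, "yes");
            StringWriter out = new StringWriter();
            transformer.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("Could not build the document", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(buildModifyRequest());
    }
}
```

If the in-house `XMLDocument` wrapper offers a way to add attributes to the root, adding `xmlns:xsi` (and `xmlns:spml`) there before serializing would presumably have the same effect, but that is a guess about an API not shown in the question.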