How do I access the Wikidata SPARQL interface from Java?

Question

And*_*ann 5 java sesame sparql wikidata wikidata-api

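I am trying to query all instances of an entity from Wikidata. The only way I have found so far is to use the SPARQL API.

I found an example query that does what I want, and it runs successfully from the web interface. Unfortunately, I can't seem to execute it from my Java code. I am using the openRDF SPARQL library. Here is the relevant code: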

SPARQLRepository sparqlRepository = new SPARQLRepository(
        "https://query.wikidata.org/");
SPARQLConnection sparqlConnection = new SPARQLConnection(
        sparqlRepository);

String query = "SELECT ?s ?desc ?authorlabel (COUNT(DISTINCT ?sitelink) as ?linkcount) WHERE {"
        + "?s wdt:P31 wd:Q571 ."
        + "?sitelink schema:about ?s ."
        + "?s wdt:P50 ?author"
        + "OPTIONAL { ?s rdfs:label ?desc filter (lang(?desc) = \"en\"). }"
        + "OPTIONAL {"
        + "?author rdfs:label ?authorlabel filter (lang(?authorlabel) = \"en\")."
        + "}"
        + "} GROUP BY ?s ?desc ?authorlabel ORDER BY DESC(?linkcount)";

TupleQuery tupleQuery = sparqlConnection.prepareTupleQuery(
        QueryLanguage.SPARQL, query);
System.out.println("Result for tupleQuery" + tupleQuery.evaluate());

Here is the response I get:

Exception in thread "main" org.openrdf.query.QueryEvaluationException: <html>
<head><title>405 Not Allowed</title></head>
<body bgcolor="white">
<center><h1>405 Not Allowed</h1></center>
<hr><center>nginx/1.9.4</center>
</body>
</html>
    at org.openrdf.repository.sparql.query.SPARQLTupleQuery.evaluate(SPARQLTupleQuery.java:59)
    at main.Test.main(Test.java:72)
Caused by: org.openrdf.repository.RepositoryException: <html>
<head><title>405 Not Allowed</title></head>
<body bgcolor="white">
<center><h1>405 Not Allowed</h1></center>
<hr><center>nginx/1.9.4</center>
</body>
</html>
    at org.openrdf.http.client.HTTPClient.handleHTTPError(HTTPClient.java:953)
    at org.openrdf.http.client.HTTPClient.sendTupleQueryViaHttp(HTTPClient.java:718)
    at org.openrdf.http.client.HTTPClient.getBackgroundTupleQueryResult(HTTPClient.java:602)
    at org.openrdf.http.client.HTTPClient.sendTupleQuery(HTTPClient.java:367)
    at org.openrdf.repository.sparql.query.SPARQLTupleQuery.evaluate(SPARQLTupleQuery.java:52)
    ... 1 more

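Normally I would take this to mean that I need an API key of some kind, but the Wikidata API appears to be completely open. Did I make a mistake when setting up the connection?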

Answer 1

Jee*_*tra 5

The correct endpoint URL for Wikidata is https://query.wikidata.org/sparql. You are missing the final path segment.
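To make the difference between the two URLs concrete, here is a minimal sketch that only builds the request URI for the `/sparql` endpoint, using nothing but the JDK (Java 10 or later). It assumes the standard SPARQL 1.1 Protocol convention of passing the query in a `query` parameter; the class name `WikidataEndpoint` and the helper `buildQueryUri` are made up for illustration and are not part of openRDF. It does not send anything over the network.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

class WikidataEndpoint {
    // Endpoint URL from the answer above. The bare https://query.wikidata.org/
    // is the address the question used, which returned 405.
    static final String ENDPOINT = "https://query.wikidata.org/sparql";

    // Hypothetical helper: puts the URL-encoded query into the "query"
    // parameter, as the SPARQL 1.1 Protocol describes for GET requests.
    static URI buildQueryUri(String endpoint, String sparql) {
        String encoded = URLEncoder.encode(sparql, StandardCharsets.UTF_8);
        return URI.create(endpoint + "?query=" + encoded);
    }

    public static void main(String[] args) {
        // A deliberately small query; the LIMIT is an illustrative choice.
        String sparql = "SELECT ?s WHERE { ?s wdt:P31 wd:Q571 } LIMIT 5";
        System.out.println(buildQueryUri(ENDPOINT, sparql));
    }
}
```

Pasting the printed URI into a browser is a quick way to check that the endpoint itself answers before debugging the library setup.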

I also noticed a few minor problems in your code. First, you do this:

SPARQLConnection sparqlConnection = new SPARQLConnection(sparqlRepository);

It should be this instead:

RepositoryConnection sparqlConnection = sparqlRepository.getConnection();

Always obtain your connection from the Repository object using getConnection(). That way resources are shared, and the Repository can close "dangling" connections when necessary.

Second, you can't print the result of a query like this:

System.out.println("Result for tupleQuery" + tupleQuery.evaluate());

If you want the results written to System.out, you should do something like this:

tupleQuery.evaluate(new SPARQLResultsTSVWriter(System.out));

Or (if you want more control over how the results are presented):

for (BindingSet bs : QueryResults.asList(tupleQuery.evaluate())) {
    System.out.println(bs);
}

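For what it's worth: with the changes above the query request runs, but it looks like your query is too "heavy" for Wikidata. At least, I got a timeout error from the server. Try a simpler query and you will see that the code works.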
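As a rough sketch of what a "simpler query" could look like, the snippet below drops the sitelink count and the author join and adds a `LIMIT`. The exact query and the limit of 10 are assumptions for illustration and have not been checked against the live service. It also joins the fragments with newlines: in the question's code, `"?s wdt:P50 ?author" + "OPTIONAL {"` concatenates to `?authorOPTIONAL {`, which is probably not what was intended. The class `SimplerQuery` and the helper `joinLines` are hypothetical names.

```java
import java.util.List;

class SimplerQuery {
    // Hypothetical helper: joins fragments with a newline so that adjacent
    // tokens such as "?author" and "OPTIONAL" can never fuse together.
    static String joinLines(List<String> fragments) {
        return String.join("\n", fragments);
    }

    public static void main(String[] args) {
        // Reduced version of the question's query: instances of Q571 with an
        // optional English label, capped by LIMIT (an illustrative value).
        String query = joinLines(List.of(
                "SELECT ?s ?desc WHERE {",
                "  ?s wdt:P31 wd:Q571 .",
                "  OPTIONAL { ?s rdfs:label ?desc FILTER (lang(?desc) = \"en\") }",
                "}",
                "LIMIT 10"));
        System.out.println(query);
    }
}
```

The resulting string can be passed to `prepareTupleQuery(QueryLanguage.SPARQL, query)` in place of the original one.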

Archived:	9 years, 7 months ago
Views:	1558
Last activity:	9 years, 7 months ago