I carefully crafted this regular expression:
<entry>\\n<(\w+)>(.+?)</\w+>\\n</entry>
to parse the following RSS feed:
<?xml version="1.0" encoding="UTF-8"?>\n<feed version="0.3" xmlns="http://purl.org/atom/ns#">\n<title>Gmail - Inbox for g.bargelli@gmail.com</title>\n<tagline>New messages in your Gmail Inbox</tagline>\n<fullcount>2</fullcount>\n<link rel="alternate" href="http://mail.google.com/mail" type="text/html" />\n<modified>2011-03-15T11:07:48Z</modified>\n<entry>\n<title>con due mail...</title>\n<summary>Gianluca Bargelli http://about.me/proudlygeek/bio</summary>\n<link rel="alternate" href="http://mail.google.com/mail?account_id=g.bargelli@gmail.com&message_id=12eb9332c2c1fa27&view=conv&extsrc=atom" type="text/html" />\n<modified>2011-03-15T11:07:42Z</modified>\n<issued>2011-03-15T11:07:42Z</issued>\n<id>tag:gmail.google.com,2004:1363345158434847271</id>\n<author>\n<name>me</name>\n<email>g.bargelli@gmail.com</email>\n</author>\n</entry>\n<entry>\n<title>test nuova mail</title>\n<summary>Gianluca Bargelli sono tornato!?& http://about.me/proudlygeek/bio</summary>\n<link rel="alternate" href="http://mail.google.com/mail?account_id=g.bargelli@gmail.com&message_id=12eb93140d9f7627&view=conv&extsrc=atom" type="text/html" />\n<modified>2011-03-15T11:05:36Z</modified>\n<issued>2011-03-15T11:05:36Z</issued>\n<id>tag:gmail.google.com,2004:1363345026546890279</id>\n<author>\n<name>me</name>\n<email>g.bargelli@gmail.com</email>\n</author>\n</entry>\n</feed>\n
The problem is that I get no matches when I use Python's re module:
import re
regex = re.compile("""<entry>\\n<(\w+)>(.+?)</\w+>\\n</entry>""")
regex.findall(rss_string) # Returns an empty list
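With an online regex tester (such as this one) it works as expected, so I don't think the regular expression itself is the problem.
I'm well aware that using regular expressions to parse a context-free grammar is a bad idea, but in my case the regex only needs to work for that one RSS feed (a Gmail inbox feed, by the way), and I know I could use an external library or an XML parser for this task: this is just an exercise, not a habit.
The question should be …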
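One way to see why the pattern can come back empty is sketched below. It is only an illustration under an assumption: the full rss_string is not shown, so the `sample` string here is a hypothetical miniature feed that contains real newline characters. In a non-raw Python literal, `\\n` reaches the regex engine as `\n` (a newline), so the pattern only matches an `<entry>` with exactly one child element on a single line; the second half shows one possible alternative that collects every child of each entry.

```python
import re

# Hypothetical miniature feed (an assumption, not the original rss_string):
# it uses real newline characters between elements.
sample = "<entry>\n<title>hello</title>\n<summary>world</summary>\n</entry>\n"

# Same pattern as in the question, written as a non-raw literal.
# "\\n" becomes the regex escape \n, which matches one real newline.
one_child = re.compile("<entry>\\n<(\\w+)>(.+?)</\\w+>\\n</entry>")

# The entry has two children, and "." does not cross newlines by default,
# so the pattern cannot span the whole entry.
no_matches = one_child.findall(sample)

# Alternative sketch: grab each entry body first, then every child inside it.
entry_re = re.compile(r"<entry>\n(.*?)</entry>", re.DOTALL)
child_re = re.compile(r"<(\w+)>(.+?)</\1>")
children = [child_re.findall(body) for body in entry_re.findall(sample)]

print(no_matches)
print(children)
```

If the string instead holds literal backslash-n pairs (as the dump above suggests a repr might), the escaping has to change accordingly; an XML parser avoids both problems.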
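I want to save an RSS feed to an XML document on my computer. I parse XML myself with XPath and Java, so all I want is a file containing the source (XML) that I see when I view the source of a website's RSS page.
In other words, rather than copying the source of the RSS page and pasting it into a file that I save as XML, I want to write a program that fetches it for me.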
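The question mentions Java; the sketch below uses Python instead, purely to illustrate the idea of fetching the feed and writing its bytes to disk unchanged. The function name `save_feed` and both of its parameters are hypothetical names chosen for this example.

```python
import shutil
import urllib.request
from pathlib import Path


def save_feed(url: str, destination: str) -> Path:
    """Download the raw feed at `url` and write it unchanged to `destination`.

    `save_feed` is a hypothetical helper written for this illustration.
    """
    target = Path(destination)
    # Copy raw bytes rather than decoded text, so the encoding named in the
    # XML declaration still matches what is stored in the file.
    with urllib.request.urlopen(url) as response, target.open("wb") as out:
        shutil.copyfileobj(response, out)
    return target
```

A call such as `save_feed("https://example.com/rss.xml", "feed.xml")` (placeholder URL) would leave a file that an XPath-based parser can then read from disk.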
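What is the best way to load an RSS feed into PHP?
I also want to handle media:content so that I can display images. At the moment I have the following code, but I don't know whether it is the best approach.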
<?php
$rss = new DOMDocument();
$rss->load('http://www.hln.be/rss.xml');
$feed = array();
foreach ($rss->getElementsByTagName('item') as $node) {
$item = array (
'title' => $node->getElementsByTagName('title')->item(0)->nodeValue,
'desc' => $node->getElementsByTagName('description')->item(0)->nodeValue,
'link' => $node->getElementsByTagName('link')->item(0)->nodeValue,
'date' => $node->getElementsByTagName('pubDate')->item(0)->nodeValue,
'media' => $node->getElementsByTagName('media:content url')->item(0)->nodeValue,
);
array_push($feed, $item);
}
$limit = 20;
for($x=0;$x<$limit;$x++) {
$title = str_replace(' & ', ' &amp; ', $feed[$x]['title']);
$link = $feed[$x]['link'];
$description = $feed[$x]['desc'];
$date = date('l F d, Y', strtotime($feed[$x]['date']));
$image = $feed[$x]['media'];
echo '<p><strong><a href="'.$link.'" title="'.$title.'">'.$title.'</a></strong><br />';
echo '<small><em>Posted on …
I'm trying to create an RSS feed for my news posts. I googled it and came up with this code:
def feed
  @posts = News.all(:conditions => "#{Settings.show} = 1", :select => "id, title, heading, content, date_posted", :order => "date_posted DESC")
  respond_to do |format|
    format.rss { render :layout => false }
  end
end
Then, in a file named "feed.rss.builder", I have this:
xml.instruct! :xml, :version => "1.0"
xml.rss :version => "2.0" do
  xml.channel do
    xml.title "Your Blog Title"
    xml.description "A blog about software and chocolate"
    xml.link posts_url
    for post in @posts
      xml.item do
        xml.title post.title
        xml.description post.content
        xml.pubDate post.date_posted.to_s(:rfc822)
        xml.link post_url(post)
        xml.guid post_url(post)
      end
    end
  end
end
I've added it to my routes file, …
I'm using the following code:
private string covertRss(string url)
{
    var s = RssReader.Read(url);
    StringBuilder sb = new StringBuilder();
    foreach (RssNews rs in s) //ERROR LINE
    {
        sb.AppendLine(rs.Title);
        sb.AppendLine(rs.PublicationDate);
        sb.AppendLine(rs.Description);
    }
    return sb.ToString();
}
I get an error:
Error 1: foreach statement cannot operate on variables of type 'System.Threading.Tasks.Task<System.Collections.Generic.List<Cricket.MainPage.RssNews>>' because 'System.Threading.Tasks.Task<System.Collections.Generic.List<Cricket.MainPage.RssNews>>' does not contain a public definition for 'GetEnumerator'
The RssNews class is:
public class RssNews
{
    public string Title;
    public string PublicationDate;
    public string Description;
}
What code should I add so that the error goes away without compromising the purpose of the code? Thanks in advance!
Code for RssReader.Read():
public class RssReader
{
public static async System.Threading.Tasks.Task<List<RssNews>> Read(string url)
{
HttpClient httpClient = new HttpClient();
string result = await httpClient.GetStringAsync(url);
XDocument document = XDocument.Parse(result);
return (from descendant in …
I'm creating an Atom feed with the rss library from Ruby's stdlib. The library is largely undocumented, but I got it working using the example provided on this page:
require 'rss'
rss = RSS::Maker.make("atom") do |m|
  m.channel.author = "Steve Wattam"
  m.channel.updated = Time.now
  m.channel.about = "http://stephenwattam.com/blog/"
  m.channel.title = "Steve W's Blog"
  storage.posts.each do |p|
    m.items.new_item do |item|
      item.link = p.link
      item.title = p.title
      item.updated = p.edited
      item.pubDate = p.date
      item.summary = p.summary
    end
  end
end
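This works fine. However, I can't add a content element. There is currently no such thing as item.content=, and I can't seem to find any example code online. Browsing the source suggests that content is stored in the item (documentation here), but I don't know enough to tease it out.
Does anyone know how I might add a content element?
By the way, I know other libraries exist that can do this, but ideally I'd like to make it work without requiring any gems.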
I'm using the Google Feed API to pull blog entries from a tumblr feed.
I've been able to pull the content, but the output comes with HTML tags, like this:
<p>I remember one day asking one of my mentors James if he ever got nervous around people. James replied, “Only when I need something from them.”</p>
The code is simple, as follows:
<script type="text/javascript" src="https://www.google.com/jsapi"></script>
<script type="text/javascript">
google.load("feeds", "1");
function initialize() {
var feed = new google.feeds.Feed("http://adriennetran.tumblr.com/rss");
feed.load(function(result) {
if (!result.error) {
var container = document.getElementById("feed");
for (var i = 0; i < result.feed.entries.length; i++) {
var entry = result.feed.entries[i];
window.content = document.createTextNode(entry.content); …
I'm working on a project that reads an RSS feed using Java. I followed this tutorial, which uses the StAX parser. My question is: how can I read attribute values?
http://www.vogella.com/tutorials/RSSFeed/article.html
This is the RSSReader class,
package de.vogella.rss.read;
import java.io.IOException;
import java.io.InputStream;
import java.net.MalformedURLException;
import java.net.URL;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.events.Characters;
import javax.xml.stream.events.XMLEvent;
import de.vogella.rss.model.Feed;
import de.vogella.rss.model.FeedMessage;
public class RSSFeedParser {
static final String TITLE = "title";
static final String DESCRIPTION = "description";
static final String CHANNEL = "channel"; …
I'm trying to fetch the 2 most recent posts from my personal site using the following code from http://codex.wordpress.org/Function_Reference/fetch_feed#Usage:
<h2><?php _e( 'Recent news from Some-Other Blog:', 'my-text-domain' ); ?></h2>
<?php // Get RSS Feed(s)
include_once( ABSPATH . WPINC . '/feed.php' );
// Get a SimplePie feed object from the specified feed source.
$rss = fetch_feed( 'THISISWHEREMYURLGOES/' );
$maxitems = 0;
if ( ! is_wp_error( $rss ) ) : // Checks that the object is created correctly
// Figure out how many total items there are, but limit it to 5.
$maxitems = $rss->get_item_quantity( 2 …
I need to create XML tags named geo:lat and geo:long to build a GeoRSS feed. But it throws
The ':' character, hexadecimal value 0x3A, cannot be included in a name.
Part of the code looks like this:
XElement("geo:lat", item.Latitude);
XElement("geo:long", item.Longitude);
How can I produce this format in C#?
<?xml version="1.0"?>
<?xml-stylesheet href="/eqcenter/catalogs/rssxsl.php?feed=eqs7day-M5.xml" type="text/xsl"
media="screen"?>
<rss version="2.0"
xmlns:geo="http://www.w3.org/2003/01/geo/wgs84_pos#"
xmlns:dc="http://purl.org/dc/elements/1.1/">
<channel>
<title>USGS M5+ Earthquakes</title>
<description>Real-time, worldwide earthquake list for the past 7 days</description>
<link>https://earthquake.usgs.gov/eqcenter/</link>
<dc:publisher>U.S. Geological Survey</dc:publisher>
<pubDate>Thu, 27 Dec 2007 23:56:15 PST</pubDate>
<item>
<pubDate>Fri, 28 Dec 2007 05:24:17 GMT</pubDate>
<title>M 5.3, northern Sumatra, Indonesia</title>
<description>December 28, 2007 05:24:17 GMT</description>
<link>https://example.com</link>
<geo:lat>5.5319</geo:lat>
<geo:long>95.8972</geo:long>
</item>
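The error arises because geo:lat is a prefix plus a local name, not a single element name; the element has to be created in the namespace that the geo prefix is bound to. The sketch below shows that idea in Python's xml.etree.ElementTree rather than C# (a swapped-in language, used only for illustration). The namespace URI is taken from the sample feed above; `build_item` is a hypothetical helper name. In LINQ to XML the same idea is usually expressed by combining an XNamespace with the local name instead of passing "geo:lat" as a string.

```python
import xml.etree.ElementTree as ET

# Namespace URI copied from the xmlns:geo declaration in the sample feed.
GEO_NS = "http://www.w3.org/2003/01/geo/wgs84_pos#"

# Bind the "geo" prefix so the serializer writes geo:lat instead of ns0:lat.
ET.register_namespace("geo", GEO_NS)


def build_item(latitude: float, longitude: float) -> ET.Element:
    """Hypothetical helper: build an <item> with namespaced geo children."""
    item = ET.Element("item")
    # "{uri}local" is ElementTree's notation for a namespace-qualified name.
    ET.SubElement(item, f"{{{GEO_NS}}}lat").text = str(latitude)
    ET.SubElement(item, f"{{{GEO_NS}}}long").text = str(longitude)
    return item


xml_text = ET.tostring(build_item(5.5319, 95.8972), encoding="unicode")
print(xml_text)
```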