Posts by McP*_*dr0

How do I use Streams2 objectMode?

In Node v0.10.11, I am trying to push objects down a pipe, but I always get an error.

events.js:72
    throw er; // Unhandled 'error' event
          ^
TypeError: Invalid non-string/buffer chunk
    at validChunk (_stream_writable.js:150:14)
    at WriteStream.Writable.write (_stream_writable.js:179:12)


I can do

this.push(chunk)


to pipe the data straight through, but I cannot do

var result = {'the web content is': chunk}
this.push(result)


A runnable example in 30 lines of code:

var stream = require('stream');

var MsgExtractStream = function() {
  stream.Transform.call(this,{objectMode: true});
}

MsgExtractStream.prototype = Object.create(
  stream.Transform.prototype, {constructor: {value: MsgExtractStream}} )

MsgExtractStream.prototype._transform = function(chunk, encoding, callback) {
  var result = {'the website is': chunk};
  this.push(result);
}

MsgExtractStream.prototype.write = function () {
  this._transform.apply(this, arguments);
};

MsgExtractStream.prototype.end = function …

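A minimal sketch of one way to structure this, assuming the TypeError comes from a byte-oriented destination (the trace points at WriteStream.Writable.write) being handed an object. MsgExtractStream follows the question; JsonLineStream is a hypothetical name introduced here for a stage that converts objects back to text before they reach such a destination. The sketch also leaves the inherited write() and end() alone and calls callback() in _transform, which differs from the excerpt above.

```javascript
var stream = require('stream');

// Stage 1: wraps each incoming chunk in an object (objectMode on both sides).
function MsgExtractStream() {
  stream.Transform.call(this, { objectMode: true });
}
MsgExtractStream.prototype = Object.create(
  stream.Transform.prototype, { constructor: { value: MsgExtractStream } });

MsgExtractStream.prototype._transform = function (chunk, encoding, callback) {
  this.push({ 'the website is': String(chunk) });
  callback(); // tell the stream this chunk is done so the next one can arrive
};

// Stage 2 (hypothetical name): turns each object back into a line of JSON text,
// so that a byte-oriented destination only ever receives strings.
function JsonLineStream() {
  stream.Transform.call(this, { objectMode: true });
}
JsonLineStream.prototype = Object.create(
  stream.Transform.prototype, { constructor: { value: JsonLineStream } });

JsonLineStream.prototype._transform = function (obj, encoding, callback) {
  this.push(JSON.stringify(obj) + '\n');
  callback();
};

// write() and end() are inherited from Transform and left untouched.
var extractor = new MsgExtractStream();
extractor.pipe(new JsonLineStream()).pipe(process.stdout);
extractor.write('hello');
extractor.end();
```

Objects can flow freely between the two objectMode stages; only the last hop, into process.stdout, needs strings or buffers.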

node.js streams2

McP*_*dr0

2013 07-11

14
Votes

1
Answers

3449
Views

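Node request throwing: Error: Invalid URI "www.urlworksinbrowser.com" or options.uri is a required argument

I am using Node v0.10.11 on Ubuntu 12.04. I can't figure out what I am missing to make streams of URLs work with the request module. The program tries to visit a mailing list site, find the download link for each month, and then download each month's page.

Mikael's README says "The first argument can be either a url or an options object. The only required option is uri; all others are optional. uri || url - fully qualified uri or a parsed url object from url.parse()"

If I call url.parse(www.targeturl.com), I get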

Error: options.uri is a required argument


If I don't use url.parse, I get

Error: Invalid URI "www.freelists.org/archive/si-list/06-2013"


(This link works perfectly in my browser.)

I have cut the code down to 42 lines. Any suggestions are welcome.

var request = require('request'),
  url = require('url'),
  stream = require('stream'),
  cheerio = require('cheerio'), // a reduced jQuery style DOM library
  Transform = require('stream').Transform

var DomStripStream = function(target) {
  this.target = target;
  stream.Transform.call(this,{objectMode: true});
}

DomStripStream.prototype = Object.create(
  Transform.prototype, {constructor: {value: DomStripStream}} 
)

DomStripStream.prototype.write = function () {
  this._transform.apply(this, arguments);
};

DomStripStream.prototype.end = function () …

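A small sketch of one likely cause, assuming the links scraped from the page carry no scheme: a browser silently adds "http://" to a bare "www.…" address, while a URL parser needs a fully qualified URI, as the README quote says. ensureProtocol and isAbsoluteUrl are hypothetical helpers written for this illustration and are not part of request. The check uses the WHATWG URL class that ships with current Node releases rather than url.parse.

```javascript
// Hypothetical helper: prepend a scheme when the link does not already have one.
function ensureProtocol(link, defaultProtocol) {
  var protocol = defaultProtocol || 'http:';
  // A scheme is a letter followed by letters, digits, "+", "-" or ".", then "://".
  if (/^[a-z][a-z0-9+.-]*:\/\//i.test(link)) {
    return link;
  }
  return protocol + '//' + link;
}

// Hypothetical helper: true when the string parses as an absolute URL.
function isAbsoluteUrl(link) {
  try {
    new URL(link);
    return true;
  } catch (err) {
    return false;
  }
}

var bare = 'www.freelists.org/archive/si-list/06-2013';
var fixed = ensureProtocol(bare);

console.log(isAbsoluteUrl(bare));     // the bare link is rejected
console.log(isAbsoluteUrl(fixed));    // the prefixed link is accepted
console.log(new URL(fixed).hostname); // host is now recognised
```

If this assumption holds, passing the prefixed string as the uri option would be the thing to try first.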

url parsing request node.js streams2

McP*_*dr0

2018 07-18

6
Votes

1
Answers

10k
Views

Return data from HTMLParser handle_starttag

My question is a simpler version of this one.

I have a YouTube iframe:

<iframe width="560" height="315" src="//www.youtube.com/embed/fY9UhIxitYM" frameborder="0" allowfullscreen></iframe>

I am working on a small web app and need to extract the random code (fY9UhIxitYM in this case). I want to use the standard library rather than importing Beautiful Soup.

from HTMLParser import HTMLParser

class YoutubeLinkParser(HTMLParser):
    def __init__(self):
        HTMLParser.__init__(self)
        self.data = []

    def handle_starttag(self, tag, attrs):
        data = attrs[2][1].split('/')[-1]
        self.data.append(data)

iframe = open('iframe.html').read()
parser = YoutubeLinkParser()
linkCode = parser.feed(iframe)


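The examples I have found use handle_data(self, data), but I need information from the attrs of the opening tag. I can print the value inside the method, but when I try to capture a return value, linkCode comes back as None.

What am I missing? Thanks!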

html python parsing class

McP*_*dr0

2017 05-23

4
Votes

1
Answers

5041
Views