I'm looking for a way to show the properties/values that differ between two given objects...
$obj1 = new StdClass; $obj1->prop = 1;
$obj2 = new StdClass; $obj2->prop = 2;
var_dump(array_diff((array)$obj1, (array)$obj2));
//output array(1) { ["prop"]=> int(1) }
This approach works fine as long as the properties are not objects or arrays.
$obj1 = new StdClass; $obj1->prop = array(1,2);
$obj2 = new StdClass; $obj2->prop = array(1,3);
var_dump(array_diff((array)$obj1, (array)$obj2));
// Output array(0) { }
// Expected output - array { ["prop"]=> array { [1]=> int(2) } }
Is there a way to get around this, even when the property is another object?!
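With debug mode enabled, the crawl stops after roughly 400,000 URLs, most likely because the server runs out of memory. Without debug mode it takes up to 5 days, which seems rather slow to me, and it uses a lot of memory (96%).
Any hints are very welcome :)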
import scrapy
import csv

def get_urls_from_csv():
    with open('data.csv', newline='') as csv_file:
        data = csv.reader(csv_file, delimiter=',')
        scrapurls = []
        for row in data:
            scrapurls.append("http://"+row[2])
        return scrapurls

class rssitem(scrapy.Item):
    sourceurl = scrapy.Field()
    rssurl = scrapy.Field()

class RssparserSpider(scrapy.Spider):
    name = "rssspider"
    allowed_domains = ["*"]
    start_urls = ()

    def start_requests(self):
        return [scrapy.http.Request(url=start_url) for start_url in …
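One thing that might be worth trying, offered only as a sketch: `get_urls_from_csv()` above builds the complete URL list before the crawl starts, and `start_requests` then appears to build a second full list of request objects. A generator that yields one URL at a time avoids holding all of them at once. The name `iter_urls_from_csv`, the `path` parameter, and the skipping of short or empty rows are assumptions made for illustration and are not part of the original code. Whether this is the main cause of the 96% memory usage is not something the snippet above can confirm.

```python
import csv
from typing import Iterator


def iter_urls_from_csv(path: str = "data.csv") -> Iterator[str]:
    """Yield URLs one by one instead of collecting them in a list.

    Hypothetical helper: it assumes, like the original code, that the
    host name is stored in the third CSV column.
    """
    with open(path, newline="") as csv_file:
        for row in csv.reader(csv_file, delimiter=","):
            # Assumption: rows that are too short or have an empty
            # third column are skipped rather than raising IndexError.
            if len(row) > 2 and row[2].strip():
                yield "http://" + row[2].strip()
```

In the spider, `start_requests` could then be written as a generator too, for example `for url in iter_urls_from_csv(): yield scrapy.Request(url=url)`, so that requests are created as they are consumed rather than all up front.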