use*_*529 5 python canvas image beautifulsoup web-scraping
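I am scraping a web page where various numbers also appear as images of small price charts.
If I click one of these images in the browser, I can save the chart as a .png image.
When I inspect the element, the source looks like this: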
<div class="performance_2d_sparkline graph ng-isolate-scope ng-scope" x-data-percent-change-day="ticker.pct_chge_1D" x-sparkline="watchlistData.sparklineData[ticker.ticker]">
<span class="inlinesparkline ng-binding">
<canvas width="100" height="40" style="display: inline-block; width: 100px; height: 40px; vertical-align: top;">
</canvas>
</span>
</div>
Is there any way to scrape the same image that I can save manually through the browser?
小智 6
If you are using Selenium for web scraping, you can use the following snippet to grab the canvas element and save it to an image file:
import base64

# Get the canvas contents as a data URL and keep only the base64 payload,
# i.e. everything after the "data:image/png;base64," prefix
# (`driver` is an existing Selenium WebDriver with the page loaded)
base64_image = driver.execute_script("return document.querySelector('.inlinesparkline canvas').toDataURL('image/png').split(',')[1];")
# Decode the base64 payload into raw PNG bytes
output_image = base64.b64decode(base64_image)
# Write the bytes to the output file
with open("image.png", 'wb') as f:
    f.write(output_image)
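The class names in the question (`watchlistData`, one sparkline per ticker) suggest the page holds several such canvases, so the same idea could be extended to save all of them. The following is only a minimal sketch under that assumption: `data_url_to_bytes` and `save_canvases` are hypothetical helper names, the `sparkline_<index>.png` file naming is made up for illustration, and `driver` is assumed to be an already-initialised Selenium WebDriver with the page fully rendered.

```python
import base64


def data_url_to_bytes(data_url):
    """Decode a base64 data URL such as 'data:image/png;base64,...' into bytes."""
    header, sep, payload = data_url.partition(",")
    if not sep or not header.startswith("data:") or not header.endswith(";base64"):
        raise ValueError("not a base64 data URL")
    return base64.b64decode(payload)


def save_canvases(driver, selector=".inlinesparkline canvas", prefix="sparkline"):
    """Save every canvas matching `selector` to '<prefix>_<index>.png'.

    `driver` is assumed to be a Selenium WebDriver (hypothetical setup,
    not shown here) that has already loaded and rendered the page.
    """
    # Ask the browser for one PNG data URL per matching canvas.
    script = (
        "return Array.from(document.querySelectorAll(arguments[0]))"
        ".map(function (c) { return c.toDataURL('image/png'); });"
    )
    paths = []
    for index, data_url in enumerate(driver.execute_script(script, selector)):
        path = f"{prefix}_{index}.png"
        with open(path, "wb") as f:
            f.write(data_url_to_bytes(data_url))
        paths.append(path)
    return paths
```

Calling `save_canvases(driver)` would then return the list of files it wrote. Note that `toDataURL` raises a security error in the browser if the canvas was drawn from cross-origin images, so this approach may not work on every page.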