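I have spent hours trying to replace my link-based pdf.js with an npm install of pdfjs-dist, because I noticed that the links I was using are not meant to serve as CDNs and may become unstable, as described here.
Apart from a few examples, I could not find much documentation on how to make this work, and where Webpack is involved the examples mostly use React, whereas I am just using ES6 inside the Django framework (compiling statically into the required Django directory, without a webpack plugin).
After exchanging a few messages with one of the people working on pdf.js, it seems my compilation errors are caused by the way Webpack handles the library internally. This is what I see: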
WARNING in ./node_modules/worker-loader/dist/index.js
Module not found: Error: Can't resolve 'webpack/lib/web/FetchCompileAsyncWasmPlugin' in '/home/giampaolo/dev/KJ_import/KJ-JS/node_modules/worker-loader/dist'
@ ./node_modules/worker-loader/dist/index.js
@ ./node_modules/worker-loader/dist/cjs.js
@ ./node_modules/pdfjs-dist/webpack.js
@ ./src/js/views/pdfViews.js
@ ./src/js/index.js
WARNING in ./node_modules/worker-loader/dist/index.js
Module not found: Error: Can't resolve 'webpack/lib/web/FetchCompileWasmPlugin' in '/home/giampaolo/dev/KJ_import/KJ-JS/node_modules/worker-loader/dist'
@ ./node_modules/worker-loader/dist/index.js
@ ./node_modules/worker-loader/dist/cjs.js
@ ./node_modules/pdfjs-dist/webpack.js
@ ./src/js/views/pdfViews.js
@ ./src/js/index.js
ERROR in (webpack)/lib/node/NodeTargetPlugin.js
Module not …
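I have been trying to scrape a specific page for days, to no avail. I am a newbie at both scraping and Python.
What I am really after is the last large table on the page, but it has no ID I can rely on, so I tried scraping all the tables.
I came up with this code: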
import requests
from bs4 import BeautifulSoup

url = "https://www.freecell.net/f/c/personal.html?uname=Giampaolo44&submit=Go"
r = requests.get(url)
r.raise_for_status()
html_content = r.text
soup = BeautifulSoup(html_content, "html.parser")

# Collect every <table> on the page, since the target table has no ID
tables = soup.find_all("table")
for table in tables:
    row_data = []
    # find_all searches all descendants, so rows of nested tables are included too
    for row in table.find_all('tr'):
        cols = row.find_all('td')
        cols = [ele.text.strip() for ele in cols]
        row_data.append(cols)
    print(row_data)
With the code above, I get a lot of garbage in the printout (*), which has been my standard output for two days.
(*) i.e.:
['12/155:27\xa0pm8x4\xa05309-6Streak4:07Won12/155:23\xa0pm8x4\xa013396-6Streak2:49Won', '5:27\xa0pm8x4\xa05309-6Streak4:07Won12/155:23\xa0pm8x4\xa013396-6Streak2:49Won', '8x4\xa05309-6Streak4:07Won12/155:23\xa0pm8x4\xa013396-6Streak2:49Won', 'Streak4:07Won12/155:23\xa0pm8x4\xa013396-6Streak2:49Won', '4:07Won12/155:23\xa0pm8x4\xa013396-6Streak2:49Won', 'Won12/155:23\xa0pm8x4\xa013396-6Streak2:49Won', '12/155:23\xa0pm8x4\xa013396-6Streak2:49Won', '5:23\xa0pm8x4\xa013396-6Streak2:49Won', '8x4\xa013396-6Streak2:49Won', 'Streak2:49Won', '2:49Won', 'Won'], ['12/155:23\xa0pm8x4\xa013396-6Streak2:49Won', '5:23\xa0pm8x4\xa013396-6Streak2:49Won', '8x4\xa013396-6Streak2:49Won', 'Streak2:49Won', '2:49Won', 'Won']]
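I thought reduce might help me solve a simple check on a NodeList with minimal code, but I can't tell whether I am simply failing to make it work or whether I am overlooking some limitation of reduce, such as "it doesn't work on NodeLists".
Here is what I have tried so far:
HTML: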
<input class="form-check-input typeOfRepetition" type="radio" id="r1">
<input class="form-check-input typeOfRepetition" type="radio" id="r2">
<input class="form-check-input typeOfRepetition" type="radio" id="r3">
JS
typeOfRepetition__all= document.querySelectorAll('.typeOfRepetition')
const reducer = (acc, currV) => acc + (currV.checked ? 1 : 0)
const choiceChecked = els.typeOfRepetition__all.reduce(reducer);
if (choiceChecked) {
    //...do some stuff
}
I tried the same approach, reduce with a ternary-operator check, in the MDN sandbox, and it works fine there, i.e.:
const array1 = [1, 2, 3, 4];
const reducer = (accumulator, currentValue) => accumulator + ( currentValue % 2 ? 1 : 0);
// 1 + 2 …
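One possible explanation for the difference, offered as a sketch rather than a confirmed diagnosis: a NodeList returned by querySelectorAll is array-like but does not provide reduce, whereas the MDN example runs on a real array. Converting with Array.from and passing an initial value of 0 may be enough. The object `fakeNodeList` and the helper `countChecked` below are hypothetical names used only so the example runs outside a browser; they are not part of the original code.

```javascript
// Hypothetical stand-in for document.querySelectorAll('.typeOfRepetition'):
// an array-like object with indexed entries and a length, but, like a
// NodeList, no reduce method of its own.
const fakeNodeList = {
  0: { id: 'r1', checked: false },
  1: { id: 'r2', checked: true },
  2: { id: 'r3', checked: false },
  length: 3,
};

// Same reducer as in the question: adds 1 for every checked element.
const reducer = (acc, currV) => acc + (currV.checked ? 1 : 0);

// Hypothetical helper: copy the array-like into a real array, then reduce.
// The initial value 0 matters: without it, the first element itself would
// be used as the accumulator instead of a number.
function countChecked(nodeList) {
  return Array.from(nodeList).reduce(reducer, 0);
}

const choiceChecked = countChecked(fakeNodeList);
if (choiceChecked) {
  console.log(`${choiceChecked} option(s) checked`);
}
```

In a browser, the same helper could be called as `countChecked(document.querySelectorAll('.typeOfRepetition'))`.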