dec*_*ott 14 html python json html-table beautifulsoup
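I'm trying to convert a table I extracted with BeautifulSoup into JSON.
So far I've managed to isolate all the rows, but I'm not sure how to work with the data from here. Any advice would be much appreciated.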
[<tr><td><strong>Balance</strong></td><td><strong>$18.30</strong></td></tr>,
<tr><td>Card name</td><td>Name</td></tr>,
<tr><td>Account holder</td><td>NAME</td></tr>,
<tr><td>Card number</td><td>1234</td></tr>,
<tr><td>Status</td><td>Active</td></tr>]
(I broke the lines for readability.)
Here is my attempt:
result = []
allrows = table.tbody.findAll('tr')
for row in allrows:
    result.append([])
    allcols = row.findAll('td')
    for col in allcols:
        thestrings = [unicode(s) for s in col.findAll(text=True)]
        thetext = ''.join(thestrings)
        result[-1].append(thetext)
This gives me the following result:
[
[u'Card balance', u'$18.30'],
[u'Card name', u'NAMEn'],
[u'Account holder', u'NAME'],
[u'Card number', u'1234'],
[u'Status', u'Active']
]
H.D*_*.D. 27
Your data probably looks like this:
html_data = """
<table>
<tr>
<td>Card balance</td>
<td>$18.30</td>
</tr>
<tr>
<td>Card name</td>
<td>NAMEn</td>
</tr>
<tr>
<td>Account holder</td>
<td>NAME</td>
</tr>
<tr>
<td>Card number</td>
<td>1234</td>
</tr>
<tr>
<td>Status</td>
<td>Active</td>
</tr>
</table>
"""
We can get the result as a list of rows with the following code:
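from bs4 import BeautifulSoup

# Build one list of cell texts per table row; the parser is named explicitly
# to avoid the "no parser specified" warning.
table_data = [[cell.text for cell in row("td")]
              for row in BeautifulSoup(html_data, "html.parser")("tr")]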
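For anyone on Python 3, the same extraction could be sketched roughly as below. The helper name `table_to_rows` and the one-row `sample` string are hypothetical, made up for illustration; the sketch assumes every `<td>` in a row is a cell you want and uses `get_text(strip=True)` to trim whitespace around each cell.

```python
from bs4 import BeautifulSoup


def table_to_rows(html):
    """Return one list of cell strings per <tr> found in the given HTML."""
    soup = BeautifulSoup(html, "html.parser")
    return [
        # strip=True trims leading/trailing whitespace in each cell's text
        [cell.get_text(strip=True) for cell in row.find_all("td")]
        for row in soup.find_all("tr")
    ]


# Hypothetical one-row table, just to show the shape of the result.
sample = "<table><tr><td>Status</td><td> Active </td></tr></table>"
rows = table_to_rows(sample)
```

Each inner list holds the cell texts of one row, so a two-column table like the one in the question yields key/value pairs.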
To convert the result to JSON, if you don't care about the order of the keys:
import json
print(json.dumps(dict(table_data)))
Result:
{
"Status": "Active",
"Card name": "NAMEn",
"Account holder": "NAME",
"Card number": "1234",
"Card balance": "$18.30"
}
If you need to keep the original order, use the following:
from collections import OrderedDict
import json
print(json.dumps(OrderedDict(table_data)))
Which gives you:
{
"Card balance": "$18.30",
"Card name": "NAMEn",
"Account holder": "NAME",
"Card number": "1234",
"Status": "Active"
}
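On Python 3.7 and later a plain `dict` preserves insertion order, so `OrderedDict` is not strictly needed there. A minimal sketch, with `table_data` hard-coded to the rows from the example rather than parsed from HTML (the `as_json` name is just for illustration):

```python
import json

# Rows as produced by the extraction step: one [label, value] pair per row.
table_data = [
    ["Card balance", "$18.30"],
    ["Card name", "NAMEn"],
    ["Account holder", "NAME"],
    ["Card number", "1234"],
    ["Status", "Active"],
]

# dict() accepts an iterable of two-item sequences; on Python 3.7+ the
# keys keep the order in which the rows were supplied.
as_json = json.dumps(dict(table_data), indent=2)
print(as_json)
```

Note that `dict(table_data)` only works when every row has exactly two cells; rows with a different cell count raise a `ValueError`.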