Best way to parse a line in Python into a dictionary

ran*_*ght 1 python parsing delimiter

I have a file with lines like

account = "TEST1" Qty=100 price = 20.11 subject="some value" values="3=this, 4=that"

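There is no special delimiter. Each key's value is enclosed in double quotes if it is a string, and left unquoted if it is a number. Every key has a value, although the value may be a blank string, written as "". There is no escape character for quotes, since none is needed.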

I want to know a good way to parse this kind of line with Python and store the values as key-value pairs in a dictionary.

bob*_*nce 11

We're going to need a regex for this.

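import re, decimal

line = 'account = "TEST1" Qty=100 price = 20.11 subject="some value" values="3=this, 4=that"'

# key, optional spaces around '=', then either a quoted string or a bare token
r = re.compile(r'([^ =]+) *= *("[^"]*"|[^ ]*)')

d = {}
for k, v in r.findall(line):
    if v[:1] == '"':
        d[k] = v[1:-1]               # quoted: strip the surrounding quotes
    else:
        d[k] = decimal.Decimal(v)    # unquoted: parse as an exact decimal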

>>> d
{'account': 'TEST1', 'subject': 'some value', 'values': '3=this, 4=that', 'price': Decimal('20.11'), 'Qty': Decimal('100')}

You can use float instead of decimal if you prefer, but that is probably a bad idea if money is involved.
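To make the float-versus-decimal trade-off concrete, here is a minimal sketch that wraps the same regex in a function. The names `parse_line`, `PAIR_RE`, and the `number` parameter are hypothetical, introduced only for illustration; they are not part of the answer above.

```python
import re
from decimal import Decimal

# Same pattern as in the answer: key, '=', then a quoted string or a bare token.
PAIR_RE = re.compile(r'([^ =]+) *= *("[^"]*"|[^ ]*)')

def parse_line(line, number=Decimal):
    """Hypothetical helper: parse one line into a dict.

    `number` is the callable used to convert unquoted values.
    """
    result = {}
    for key, raw in PAIR_RE.findall(line):
        if raw[:1] == '"':
            result[key] = raw[1:-1]      # quoted string: drop the quotes
        else:
            result[key] = number(raw)    # unquoted: treat as a number
    return result

line = ('account = "TEST1" Qty=100 price = 20.11 '
        'subject="some value" values="3=this, 4=that"')

exact = parse_line(line)                  # numbers as Decimal
approx = parse_line(line, number=float)   # numbers as float

# Decimal arithmetic keeps the digits exactly as written in the file.
print(exact['price'] * 3)                 # 60.33

# Binary floats cannot represent most decimal fractions exactly.
print(0.1 + 0.2 == 0.3)                   # False
```

Passing the converter as an argument keeps the parsing logic in one place, so switching between `Decimal` and `float` is a one-word change at the call site.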


Pau*_*McG 5

Maybe a bit simpler to follow is the pyparsing rendition:

from pyparsing import *

# define basic elements - use re's for numerics, faster and easier than
# composing from pyparsing objects
integer = Regex(r'[+-]?\d+')
real = Regex(r'[+-]?\d+\.\d*')
ident = Word(alphanums)
value = real | integer | quotedString.setParseAction(removeQuotes)

# define a key-value pair, and a configline as one or more of these
# wrap configline in a Dict so that results are accessible by given keys
kvpair = Group(ident + Suppress('=') + value)
configline = Dict(OneOrMore(kvpair))

src = 'account = "TEST1" Qty=100 price = 20.11 subject="some value" ' \
        'values="3=this, 4=that"'

configitems = configline.parseString(src)

Now you can access the parsed pieces using the returned configitems ParseResults object:

>>> print configitems.asList()
[['account', 'TEST1'], ['Qty', '100'], ['price', '20.11'], 
 ['subject', 'some value'], ['values', '3=this, 4=that']]

>>> print configitems.asDict()
{'account': 'TEST1', 'Qty': '100', 'values': '3=this, 4=that', 
  'price': '20.11', 'subject': 'some value'}

>>> print configitems.dump()
[['account', 'TEST1'], ['Qty', '100'], ['price', '20.11'], 
 ['subject', 'some value'], ['values', '3=this, 4=that']]
- Qty: 100
- account: TEST1
- price: 20.11
- subject: some value
- values: 3=this, 4=that

>>> print configitems.keys()
['account', 'subject', 'values', 'price', 'Qty']

>>> print configitems.subject
some value
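Note that every value in the results above is a string, including '100' and '20.11'. If numeric types are wanted, one option is to post-process the dictionary with the standard library. The sketch below is an assumption-laden illustration, not part of the pyparsing answer: `convert` is a hypothetical helper, and `items` is simply the asDict() output copied from above.

```python
from decimal import Decimal, InvalidOperation

def convert(value):
    """Hypothetical helper: int if possible, else Decimal, else unchanged."""
    try:
        return int(value)
    except ValueError:
        pass
    try:
        return Decimal(value)
    except InvalidOperation:
        return value                     # not numeric: keep the string

# The asDict() output shown above, copied as a literal.
items = {'account': 'TEST1', 'Qty': '100', 'values': '3=this, 4=that',
         'price': '20.11', 'subject': 'some value'}

typed = {key: convert(value) for key, value in items.items()}
print(typed['Qty'], typed['price'], typed['subject'])
```

One caveat: once the quotes have been removed, a quoted value that happens to look like a number (for example subject="42") can no longer be told apart from a real number, so this approach only suits data where that cannot occur.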