Parsing a Lisp file with Python

qua*_*qua 8 python parsing

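I have the following Lisp file, which comes from the UCI Machine Learning Repository. I would like to convert it to a flat text file using Python. A typical line looks like this: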

(1 ((st 8) (pitch 67) (dur 4) (keysig 1) (timesig 12) (fermata 0))((st 12) (pitch 67) (dur 8) (keysig 1) (timesig 12) (fermata 0)))

I would like to parse it into a text file such as:

time pitch duration keysig timesig fermata
8    67    4        1      12      0
12   67    8        1      12      0

Is there a Python module that can parse this intelligently? This is the first time I have seen Lisp.

geo*_*org 21

As shown in this answer, pyparsing appears to be the right tool for the job:

inputdata = '(1 ((st 8) (pitch 67) (dur 4) (keysig 1) (timesig 12) (fermata 0))((st 12) (pitch 67) (dur 8) (keysig 1) (timesig 12) (fermata 0)))'

from pyparsing import OneOrMore, nestedExpr

data = OneOrMore(nestedExpr()).parseString(inputdata)
print(data)

# [['1', [['st', '8'], ['pitch', '67'], ['dur', '4'], ['keysig', '1'], ['timesig', '12'], ['fermata', '0']], [['st', '12'], ['pitch', '67'], ['dur', '8'], ['keysig', '1'], ['timesig', '12'], ['fermata', '0']]]]
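If pyparsing is not available, the same nested-list structure can be approximated with the standard library alone. The following is only a minimal sketch, not how pyparsing is implemented: the names `TOKEN` and `parse_sexpr` are hypothetical, and it assumes the input contains only parentheses and whitespace-separated atoms (no quoted strings or comments).

```python
import re

# Hypothetical helper names; a simplified stand-in for nestedExpr, not pyparsing's code.
# A token is either a single parenthesis or a run of non-space, non-parenthesis characters.
TOKEN = re.compile(r"[()]|[^\s()]+")

def parse_sexpr(text):
    """Parse parenthesised expressions into nested lists of strings."""
    stack = [[]]  # stack[0] collects the top-level expressions
    for tok in TOKEN.findall(text):
        if tok == "(":
            stack.append([])          # open a new nested list
        elif tok == ")":
            if len(stack) == 1:
                raise ValueError("unbalanced ')'")
            done = stack.pop()        # close the current list ...
            stack[-1].append(done)    # ... and attach it to its parent
        else:
            stack[-1].append(tok)     # plain atom, kept as a string
    if len(stack) != 1:
        raise ValueError("unbalanced '('")
    return stack[0]

line = ("(1 ((st 8) (pitch 67) (dur 4) (keysig 1) (timesig 12) (fermata 0))"
        "((st 12) (pitch 67) (dur 8) (keysig 1) (timesig 12) (fermata 0)))")
parsed = parse_sexpr(line)
print(parsed)
```

As with the pyparsing result, every atom stays a string, so numeric fields need an explicit `int()` conversion if they are to be used as numbers.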

For completeness, here is how to format the result (using texttable):

from texttable import Texttable

tab = Texttable()
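for row in data.asList()[0][1:]:  # skip the leading id, keep the notes
    row = dict(row)  # e.g. {'st': '8', 'pitch': '67', ...}
    # The header is reset on every pass; all notes share the same keys.
    tab.header(list(row.keys()))
    tab.add_row(list(row.values()))
print(tab.draw())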
Output (the column order shown comes from an unordered dict; on Python 3.7+ the columns follow the input order, starting with st):
+---------+--------+----+-------+-----+---------+
| timesig | keysig | st | pitch | dur | fermata |
+=========+========+====+=======+=====+=========+
| 12      | 1      | 8  | 67    | 4   | 0       |
+---------+--------+----+-------+-----+---------+
| 12      | 1      | 12 | 67    | 8   | 0       |
+---------+--------+----+-------+-----+---------+
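The question asked for a flat text file rather than a drawn table. A minimal sketch of that step, using only the standard library, could look like the following. The names `COLUMNS` and `to_flat_text` are hypothetical, the choice of tab as separator is an assumption, and `parsed` is a hand-written copy of the nested list shown above for one input line. The column titles and their mapping (`st` to time, `dur` to duration) are taken from the desired output in the question.

```python
# Hand-written copy of the nested-list result for one input line (id, then notes).
parsed = ['1',
          [['st', '8'], ['pitch', '67'], ['dur', '4'],
           ['keysig', '1'], ['timesig', '12'], ['fermata', '0']],
          [['st', '12'], ['pitch', '67'], ['dur', '8'],
           ['keysig', '1'], ['timesig', '12'], ['fermata', '0']]]

# (column title from the question, field name in the data)
COLUMNS = [("time", "st"), ("pitch", "pitch"), ("duration", "dur"),
           ("keysig", "keysig"), ("timesig", "timesig"), ("fermata", "fermata")]

def to_flat_text(chorale):
    """Return tab-separated text: one header line, then one line per note."""
    notes = chorale[1:]  # chorale[0] is the identifier
    lines = ["\t".join(title for title, _ in COLUMNS)]
    for note in notes:
        fields = dict(note)  # [['st', '8'], ...] -> {'st': '8', ...}
        lines.append("\t".join(fields[key] for _, key in COLUMNS))
    return "\n".join(lines)

flat = to_flat_text(parsed)
print(flat)
```

Looking fields up by name rather than by position keeps the column order fixed even if a line lists its fields in a different order; writing `flat` to a file is then a single `open(...).write(flat)` call.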

To convert that data back to Lisp notation:

def lisp(x):
    return '(%s)' % ' '.join(lisp(y) for y in x) if isinstance(x, list) else x

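# data is the ParseResults from above; asList() yields the plain lists that lisp() checks for
d = lisp(data.asList()[0])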