Hacking a JavaScript array into JSON with Python

Gre*_*reg 4 javascript python json

I'm fetching a .js file from a remote site. It contains data that I want to process as JSON using the simplejson library on my Google App Engine site. The .js file looks like this:

var txns = [
    { apples: '100', oranges: '20', type: 'SELL'}, 
    { apples: '200', oranges: '10', type: 'BUY'}]

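I have no control over the format of this file. My initial hack was to strip the "var txns = " bit from the string and then run a series of .replace(old, new, [count]) calls on it until it looked like standard JSON: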

cleanJSON = malformedJSON.replace("'", '"').replace('apples:', '"apples":').replace('oranges:', '"oranges":').replace('type:', '"type":').replace('{', '{"transaction":{').replace('}', '}}')

So now it looks like this:

[{ "transaction" : { "apples": "100", "oranges": "20", "type": "SELL"} }, 
 { "transaction" : { "apples": "200", "oranges": "10", "type": "BUY"} }]
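For reference, the whole cleanup described above, including the prefix removal that the one-liner leaves out, can be sketched as a single function. The name `clean_txns` and the `PREFIX` constant are hypothetical, and the standard-library `json` module stands in for simplejson here. It assumes the file always has exactly the shape shown above.

```python
import json

# Assumed literal prefix of the remote file, as in the sample above.
PREFIX = "var txns = "

def clean_txns(js_source):
    """Turn the .js payload into a JSON string using plain string replacement."""
    body = js_source.strip()
    if body.startswith(PREFIX):
        body = body[len(PREFIX):]
    # Same replacements as the one-liner above, one per line for readability.
    return (body.replace("'", '"')
                .replace('apples:', '"apples":')
                .replace('oranges:', '"oranges":')
                .replace('type:', '"type":')
                .replace('{', '{"transaction":{')
                .replace('}', '}}'))

raw = """var txns = [
    { apples: '100', oranges: '20', type: 'SELL'},
    { apples: '200', oranges: '10', type: 'BUY'}]"""

txns = json.loads(clean_txns(raw))
print(txns[0]["transaction"]["type"])
```

This breaks as soon as a new key appears or a value contains a quote, a brace, or text such as `type:`, which is why it is only a hack.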

How would you tackle this formatting problem? Is there a known way (a library or script) to convert a JavaScript array into JSON notation?

Tor*_*rek 5

It's not hard to write your own small parser with PyParsing.

import json
from pyparsing import *

data = """var txns = [
   { apples: '100', oranges: '20', type: 'SELL'}, 
   { apples: '200', oranges: '10', type: 'BUY'}]"""


def js_grammar():
    key = Word(alphas).setResultsName("key")
    value = QuotedString("'").setResultsName("value")
    pair = Group(key + Literal(":").suppress() + value)
    object_ = nestedExpr("{", "}", delimitedList(pair, ","))
    array = nestedExpr("[", "]", delimitedList(object_, ","))
    return array + StringEnd()

JS_GRAMMAR = js_grammar()

def parse(js):
    return JS_GRAMMAR.parseString(js[len("var txns = "):])[0]

def to_dict(object_):
    return dict((p.key, p.value) for p in object_)

result = [
    {"transaction": to_dict(object_)}
    for object_ in parse(data)]
print(json.dumps(result))

This prints:

[{"transaction": {"type": "SELL", "apples": "100", "oranges": "20"}},
 {"transaction": {"type": "BUY", "apples": "200", "oranges": "10"}}]

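You could also add the assignment to the grammar itself. That said, since ready-made parsers already exist for this, you would be better off using one of them.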
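If adding a parsing library is not an option, a lighter alternative is to quote the bare keys with a regular expression and hand the result to `json.loads`. The following is only a sketch using the standard library. The names `BARE_KEY` and `js_array_to_python` are made up for illustration. It assumes the values never contain quotes, or text that looks like `, word:`.

```python
import json
import re

# A bare identifier key: preceded by "{" or ",", followed by ":".
BARE_KEY = re.compile(r'([{,]\s*)([A-Za-z_$][\w$]*)(\s*:)')

def js_array_to_python(js_source):
    """Convert a simple JS array-of-objects literal into Python data."""
    # Drop an optional "var name =" prefix and a trailing semicolon.
    body = re.sub(r'^\s*var\s+\w+\s*=\s*', '', js_source).strip().rstrip(';')
    # Wrap each bare key in double quotes, whatever its name is.
    body = BARE_KEY.sub(r'\1"\2"\3', body)
    # Assumption: single quotes only ever delimit strings.
    body = body.replace("'", '"')
    return json.loads(body)

data = """var txns = [
   { apples: '100', oranges: '20', type: 'SELL'},
   { apples: '200', oranges: '10', type: 'BUY'}]"""

result = [{"transaction": txn} for txn in js_array_to_python(data)]
print(json.dumps(result))
```

Unlike the chained `.replace` calls in the question, this does not hard-code the key names. It is still a textual hack rather than a real JavaScript parser.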