How do I parse the malformed JSON in Apple's IAP receipts?

est*_*est 4 python json in-app-purchase

I get JSON like this from Apple:

{
    "original-purchase-date-pst" = "2012-06-28 02:46:02 America/Los_Angeles";
    "original-transaction-id" = "1000000051960431";
    "bvrs" = "1.0";
    "transaction-id" = "1000000051960431";
    "quantity" = "1";
    "original-purchase-date-ms" = "1340876762450";
    "product-id" = "com.x";
    "item-id" = "523404215";
    "bid" = "com.x";
    "purchase-date-ms" = "1340876762450";
    "purchase-date" = "2012-06-28 09:46:02 Etc/GMT";
    "purchase-date-pst" = "2012-06-28 02:46:02 America/Los_Angeles";
    "original-purchase-date" = "2012-06-28 09:46:02 Etc/GMT";
}

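This is not JSON as we know it. The JSON specification clearly states:

Each name is followed by : (colon) and the name/value pairs are separated by , (comma).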

How can I parse this with Python's json (or simplejson) module?

The json module only supports separators in json.dumps(), not in json.loads(), and in simplejson/decoder.py, def JSONObject() has the : and , delimiters hard-coded.
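To illustrate the asymmetry, here is a minimal sketch using only the standard json module. The names `receipt` and `loads_or_none` are made up for this example, and the one-pair receipt is a shortened stand-in for the real data above.

```python
import json

# Shortened stand-in for the receipt above (hypothetical sample).
receipt = '{ "quantity" = "1"; }'

def loads_or_none(text):
    """Return the parsed object, or None if json.loads rejects the text."""
    try:
        return json.loads(text)
    except ValueError:  # json.JSONDecodeError is a subclass of ValueError
        return None

# Reading: loads() has no separators option, so the "=" is rejected.
rejected = loads_or_none(receipt)

# Writing: dumps() takes (item_separator, key_separator).
written = json.dumps({"quantity": "1"}, separators=("; ", " = "))
```

So the format can be produced with dumps(), but not read back with loads().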

What can I do? Write my own parser?

Mar*_*ers 5

That is indeed rather a mess. A quick fix is to replace the offending separators with a regular expression:

import re
# Rewrite every `"key" = "value";` pair as `"key": "value",`
line = re.compile(r'("[^"]*")\s*=\s*("[^"]*");')
result = line.sub(r'\1: \2,', result)

You also need to remove the last comma:

trailingcomma = re.compile(r',(\s*})')
result = trailingcomma.sub(r'\1', result)

With these two substitutions, the example loads as JSON:

>>> import json, re
>>> result = '''\
... {
...     "original-purchase-date-pst" = "2012-06-28 02:46:02 America/Los_Angeles";
...     "original-transaction-id" = "1000000051960431";
...     "bvrs" = "1.0";
...     "transaction-id" = "1000000051960431";
...     "quantity" = "1";
...     "original-purchase-date-ms" = "1340876762450";
...     "product-id" = "com.x";
...     "item-id" = "523404215";
...     "bid" = "com.x";
...     "purchase-date-ms" = "1340876762450";
...     "purchase-date" = "2012-06-28 09:46:02 Etc/GMT";
...     "purchase-date-pst" = "2012-06-28 02:46:02 America/Los_Angeles";
...     "original-purchase-date" = "2012-06-28 09:46:02 Etc/GMT";
... }
... '''
>>> line = re.compile(r'("[^"]*")\s*=\s*("[^"]*");')
>>> trailingcomma = re.compile(r',(\s*})')
>>> corrected = trailingcomma.sub(r'\1', line.sub(r'\1: \2,', result))
>>> json.loads(corrected)
{u'product-id': u'com.x', u'purchase-date-pst': u'2012-06-28 02:46:02 America/Los_Angeles', u'transaction-id': u'1000000051960431', u'original-purchase-date-pst': u'2012-06-28 02:46:02 America/Los_Angeles', u'bid': u'com.x', u'purchase-date-ms': u'1340876762450', u'original-transaction-id': u'1000000051960431', u'bvrs': u'1.0', u'original-purchase-date-ms': u'1340876762450', u'purchase-date': u'2012-06-28 09:46:02 Etc/GMT', u'original-purchase-date': u'2012-06-28 09:46:02 Etc/GMT', u'item-id': u'523404215', u'quantity': u'1'}

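It should handle nested mappings too. It does assume that the values themselves contain no escaped " quotes, though. If they do, you will need a parser anyway.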
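If escaped quotes do turn up, one possible middle ground short of a full parser is to tokenize the text, so that quoted strings are skipped as a whole and only the = and ; outside them are rewritten. This is a rough sketch, not a tested receipt parser: `TOKEN`, `_swap`, and `parse_receipt` are names invented here, and it assumes that any backslash escapes inside the strings are ones JSON also accepts (such as \" and \\).

```python
import json
import re

# One token is either a complete double-quoted string (backslash escapes
# allowed), a "=", a ";" directly before a closing brace, or any other ";".
TOKEN = re.compile(r'"(?:[^"\\]|\\.)*"|=|;(\s*\})|;')

def _swap(match):
    tok = match.group(0)
    if tok == '=':
        return ':'
    if tok.startswith(';'):
        # Group 1 is set only for the last ";" before a "}": drop that ";"
        # so no trailing comma is left behind. Any other ";" becomes ",".
        return match.group(1) if match.group(1) is not None else ','
    return tok  # quoted string: keep it untouched

def parse_receipt(text):
    """Convert the '=' / ';' separators to JSON ones, then parse (sketch)."""
    return json.loads(TOKEN.sub(_swap, text))
```

Because every string is consumed as a single token, a value such as "x \"y\" = z;" passes through unchanged, and the separators around nested braces are rewritten as well.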