Is it possible to successfully parse a malformed JSON string?

fir*_*iel 0 java json

Here is a sample string:

String s = "{\"source\": \"another \"quote inside\" text\"}";

What is the best way to parse this? I have already tried 4 parsers: json-lib, json-simple, gson, and Grails' built-in JSON parser.

I'm using Java, and I'd like to know whether there is a way to repair the string after catching a MalformedJsonException.

Note: or could this be a bug in the Twitter API? Here is a sample response string:

{
    "coordinates": null,
    "user": {
        "is_translator": false,
        "show_all_inline_media": false,
        "following": null,
        "geo_enabled": false,
        "profile_background_color": "C0DEED",
        "listed_count": 11,
        "profile_background_image_url": "http://a3.twimg.com/a/1298064126/images/themes/theme1/bg.png",
        "favourites_count": 4,
        "followers_count": 66,
        "contributors_enabled": false,
        "statuses_count": 1078,
        "time_zone": "Tokyo",
        "profile_text_color": "333333",
        "friends_count": 51,
        "profile_sidebar_fill_color": "DDEEF6",
        "id_str": "107723125",
        "profile_background_tile": false,
        "created_at": "Sat Jan 23 14:16:03 +0000 2010",
        "profile_image_url": "http://a3.twimg.com/profile_images/652140488/--------------_normal.jpg",
        "description": "Mu8ecdu56e3u306eu56e3u9577u3068u30eau30fcu30c0u30fcu3067u3059u3002u8da3u5473u306fu7af6u99acu306eu4e88u60f3u3068u30b0u30e9u30c3u30d7u30eau30f3u30b0u3068u6253u6483u3092u30e1u30a4u30f3u3068u3057u3066u3044u307eu3059u3063uff01",
        "location": "u5bccu5c71u770c",
        "notifications": null,
        "profile_link_color": "0084B4",
        "protected": false,
        "screen_name": "mattsun0209",
        "follow_request_sent": null,
        "lang": "ja",
        "profile_sidebar_border_color": "C0DEED",
        "name": "u307eu3063u3064u3093",
        "verified": false,
        "id": 107723125,
        "profile_use_background_image": true,
        "utc_offset": 32400,
        "url": null
    },
    "in_reply_to_screen_name": null,
    "in_reply_to_status_id": null,
    "in_reply_to_status_id_str": null,
    "in_reply_to_user_id": null,
    "text": "u3042u30fcu3001u7d50u819cu708eu306bu306au3063u3066u3057u307eu3063u305fu3002",
    "contributors": null,
    "retweeted": false,
    "in_reply_to_user_id_str": null,
    "retweet_count": 0,
    "source": "u003Ca href="http: //twtr.jp" rel="nofollow"u003EKeitai Webu003C/au003E",
    "id_str": "42128197566861312",
    "created_at": "Mon Feb 28 07:45:19 +0000 2011",
    "geo": null,
    "entities": {
        "hashtags": [],
        "user_mentions": [],
        "urls": []
    },
    "truncated": false,
    "place": null,
    "id": 42128197566861312,
    "favorited": false
}

Note the source property:

"source": "u003Ca href="http: //twtr.jp" rel="nofollow"u003EKeitai Webu003C/au003E"

T.J*_*der 6

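I'm afraid this is a classic case of "garbage in, garbage out." The JSON is invalid, so you can't parse it correctly; you can only guess at what it means. Now, we humans can make a pretty good guess at what was intended (obviously), but that is much harder to do at the parser level.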

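If you know you'll consistently get this invalid source property, you could pre-process the string before deserializing it, but the real fix has to be at the source of the invalid data: Twitter, or whatever twit (as it were) is supplying it. I'm assuming this is the actual string data you're receiving, rather than a processed form of it.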
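
For what it's worth, here is a minimal sketch of what such a pre-processing step might look like. It is a guess, not a parser: the class name JsonQuoteRepair and the method escapeStrayQuotes are hypothetical, and the whole thing rests on the assumption that a genuine closing quote is always followed (after optional whitespace) by `,`, `}`, `]`, `:` or the end of the input. Any other quote inside a string is treated as stray and gets escaped.

```java
public class JsonQuoteRepair {

    /**
     * Escapes quotes that appear to be stray quotes inside a JSON string value.
     * Assumption (heuristic, not part of any JSON library): a real closing quote
     * is followed, after optional whitespace, by ',', '}', ']', ':' or end of input.
     */
    public static String escapeStrayQuotes(String json) {
        StringBuilder out = new StringBuilder(json.length() + 16);
        boolean inString = false;
        for (int i = 0; i < json.length(); i++) {
            char c = json.charAt(i);
            if (!inString) {
                // Outside a string: a quote opens one, everything is copied as-is.
                if (c == '"') {
                    inString = true;
                }
                out.append(c);
            } else if (c == '\\' && i + 1 < json.length()) {
                // Copy an existing escape sequence untouched.
                out.append(c).append(json.charAt(++i));
            } else if (c == '"') {
                if (looksLikeClosingQuote(json, i + 1)) {
                    inString = false;
                    out.append(c);
                } else {
                    // Guess: this quote belongs to the text, so escape it.
                    out.append("\\\"");
                }
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    /** True if the next non-whitespace character could legally follow a string. */
    private static boolean looksLikeClosingQuote(String s, int from) {
        int i = from;
        while (i < s.length() && Character.isWhitespace(s.charAt(i))) {
            i++;
        }
        if (i >= s.length()) {
            return true;
        }
        char next = s.charAt(i);
        return next == ',' || next == '}' || next == ']' || next == ':';
    }

    public static void main(String[] args) {
        String s = "{\"source\": \"another \"quote inside\" text\"}";
        // Pass the repaired string to your JSON parser of choice afterwards.
        System.out.println(escapeStrayQuotes(s));
    }
}
```

This handles the sample string and the source property from the question, but it will guess wrong whenever a stray quote happens to be followed by a comma, colon, or closing bracket (for example `"he said "hi", then left"`), so treat it as a stopgap while the data gets fixed at its origin.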