Escape Unicode characters in JSON encoding in Go

Jac*_*cob 5 unicode encoding character-encoding go

Given the following example:

    func main() {
        buf := new(bytes.Buffer)
        enc := json.NewEncoder(buf)
        toEncode := []string{"hello", "wörld"}
        enc.Encode(toEncode)
        fmt.Println(buf.String())
    }

I'd like the output to contain escaped Unicode characters:

["hello","w\u00f6rld"]

instead of:

["hello","wörld"]

I tried writing a function that quotes the Unicode characters with strconv.QuoteToASCII and feeds the result to Encode(). However, that results in double escaping:

    func quotedUnicode(data []string) []string {
        for index, element := range data {
            quotedUnicode := strconv.QuoteToASCII(element)
            // get rid of additional quotes
            quotedUnicode = strings.TrimSuffix(quotedUnicode, "\"")
            quotedUnicode = strings.TrimPrefix(quotedUnicode, "\"")
            data[index] = quotedUnicode
        }
        return data
    }
["hello","w\\u00f6rld"]

How can I make sure the output of json.Encode contains correctly escaped Unicode characters?


hgs*_*gs3 0

The encoding/json package doesn't support this, but you can implement it yourself.


For each string field of your struct, change its type from string to json.RawMessage and quote it using the following function:

    // QuoteToJSON returns s as a JSON string literal in which every non-ASCII
    // rune is written as a \uXXXX escape. ASCII runes are copied unchanged.
    func QuoteToJSON(s string) json.RawMessage {
        var sb strings.Builder
        for _, r := range s {
            if r > 0xFFFF {
                // Outside the Basic Multilingual Plane: emit a UTF-16 surrogate pair.
                r1, r2 := utf16.EncodeRune(r)
                sb.WriteString(fmt.Sprintf("\\u%04X", r1))
                sb.WriteString(fmt.Sprintf("\\u%04X", r2))
            } else if r > 0x7F {
                sb.WriteString(fmt.Sprintf("\\u%04X", r))
            } else {
                sb.WriteRune(r)
            }
        }
        return json.RawMessage(`"` + sb.String() + `"`)
    }
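QuoteToJSON copies ASCII characters unchanged, so an input containing a double quote, a backslash, or a control character would produce a string literal that is not valid JSON. A minimal sketch of a variant that also escapes those characters is shown here; the name quoteToASCIIJSON is hypothetical and not part of any package.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
	"unicode/utf16"
)

// quoteToASCIIJSON is a hypothetical variant of QuoteToJSON that also
// escapes quotes, backslashes, and control characters.
func quoteToASCIIJSON(s string) json.RawMessage {
	var sb strings.Builder
	sb.WriteByte('"')
	for _, r := range s {
		switch {
		case r == '"' || r == '\\':
			// These two must be backslash-escaped inside a JSON string.
			sb.WriteByte('\\')
			sb.WriteRune(r)
		case r > 0xFFFF:
			// Outside the BMP: emit a UTF-16 surrogate pair.
			r1, r2 := utf16.EncodeRune(r)
			fmt.Fprintf(&sb, `\u%04x\u%04x`, r1, r2)
		case r < 0x20 || r > 0x7F:
			// Control characters and non-ASCII runes inside the BMP.
			fmt.Fprintf(&sb, `\u%04x`, r)
		default:
			sb.WriteRune(r)
		}
	}
	sb.WriteByte('"')
	return json.RawMessage(sb.String())
}

func main() {
	raw := quoteToASCIIJSON("a \"quoted\"\tHí 😊")
	// Prints the literal followed by whether encoding/json accepts it.
	fmt.Println(string(raw), json.Valid(raw))
}
```

Bytes that are not valid UTF-8 come out of the range loop as U+FFFD, so they end up as \ufffd in the result.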

Full example:

    type Greeting struct {
        Text string
    }

    func (g *Greeting) MarshalJSON() ([]byte, error) {
        return json.Marshal(&struct {
            Text json.RawMessage `json:"text"`
        }{
            Text: QuoteToJSON(g.Text),
        })
    }

    func main() {
        out := Greeting{"Hí, 😊!"}
        dat, _ := json.Marshal(&out)
        fmt.Println(string(dat)) // {"text":"H\u00ED, \uD83D\uDE0A!"}

        var in Greeting
        json.Unmarshal(dat, &in)
        fmt.Println(in.Text) // "Hí, 😊!"
    }

Go Playground: https://go.dev/play/p/Jk6GZwdvyvm
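For values that are not structs, such as the []string in the question, one possible alternative is to marshal normally and then rewrite the non-ASCII runes in the encoded output. The sketch below assumes its input is valid JSON produced by encoding/json, where non-ASCII characters can only occur inside string literals. escapeNonASCII is a hypothetical helper, not a library function.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
	"unicode/utf16"
)

// escapeNonASCII (hypothetical helper) rewrites every non-ASCII rune in
// already-encoded JSON as \uXXXX escapes and leaves ASCII bytes untouched.
func escapeNonASCII(encoded []byte) []byte {
	var sb strings.Builder
	for _, r := range string(encoded) {
		switch {
		case r > 0xFFFF:
			// Outside the BMP: JSON needs a UTF-16 surrogate pair.
			r1, r2 := utf16.EncodeRune(r)
			fmt.Fprintf(&sb, `\u%04x\u%04x`, r1, r2)
		case r > 0x7F:
			fmt.Fprintf(&sb, `\u%04x`, r)
		default:
			sb.WriteRune(r)
		}
	}
	return []byte(sb.String())
}

func main() {
	dat, err := json.Marshal([]string{"hello", "wörld", `say "hi" 😊`})
	if err != nil {
		panic(err)
	}
	fmt.Println(string(escapeNonASCII(dat)))
	// ["hello","w\u00f6rld","say \"hi\" \ud83d\ude0a"]
}
```

Because quotes and backslashes are already escaped by json.Marshal before the rewrite, this avoids the double escaping seen in the question, at the cost of a second pass over the output.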


Some people have suggested using strconv.QuoteToASCII, but this has two problems:

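  1. Strings will be double-escaped when marshalled, e.g. they will look like "\\u4e09" rather than "\u4e09".
  2. JSON requires characters outside the Basic Multilingual Plane to be encoded using UTF-16 surrogate pairs, e.g. the string "😊" should be encoded as "\ud83d\ude0a" and not "\U0001f60a".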
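Both problems can be reproduced with a short program. This is only an illustrative sketch: marshalQuoted and surrogateEscape are hypothetical helper names, and marshalQuoted mimics the approach from the question (quote with strconv.QuoteToASCII, strip the surrounding quotes, then marshal).

```go
package main

import (
	"encoding/json"
	"fmt"
	"strconv"
	"unicode/utf16"
)

// marshalQuoted (hypothetical) quotes s with strconv.QuoteToASCII, strips
// the surrounding quotes, and marshals the result as a JSON string.
func marshalQuoted(s string) string {
	quoted := strconv.QuoteToASCII(s)
	inner := quoted[1 : len(quoted)-1]
	dat, err := json.Marshal(inner)
	if err != nil {
		panic(err)
	}
	return string(dat)
}

// surrogateEscape (hypothetical) formats a rune outside the BMP as the
// pair of \uXXXX escapes that JSON expects.
func surrogateEscape(r rune) string {
	r1, r2 := utf16.EncodeRune(r)
	return fmt.Sprintf(`\u%04x\u%04x`, r1, r2)
}

func main() {
	// Problem 1: the backslash added by QuoteToASCII is escaped again.
	fmt.Println(marshalQuoted("三")) // "\\u4e09"

	// Problem 2: Go's \U escape is not a JSON escape sequence.
	fmt.Println(strconv.QuoteToASCII("😊")) // "\U0001f60a"
	fmt.Println(surrogateEscape('😊'))      // \ud83d\ude0a
}
```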