Clojure equivalent of Python's encode('hex') and decode('hex')

Ice*_*ack 13 hex clojure

Is there an idiomatic way to encode a string to hexadecimal and decode it back in Clojure? An example in Python:

'Clojure'.encode('hex')
# => '436c6f6a757265'
'436c6f6a757265'.decode('hex')
# => 'Clojure'

To show some effort of my own:

(defn hexify [s]
  (apply str
    (map #(format "%02x" (int %)) s)))

(defn unhexify [hex]
  (apply str
    (map 
      (fn [[x y]] (char (Integer/parseInt (str x y) 16))) 
      (partition 2 hex))))

(hexify "Clojure")
;; => "436c6f6a757265"

(unhexify "436c6f6a757265")
;; => "Clojure"

sw1*_*1nn 15

Your implementation does not work for non-ASCII characters:

(defn hexify [s]
  (apply str
    (map #(format "%02x" (int %)) s)))

(defn unhexify [hex]
  (apply str
    (map 
      (fn [[x y]] (char (Integer/parseInt (str x y) 16))) 
        (partition 2 hex))))

(= "\u2195" (unhexify (hexify "\u2195")))
;=> false ; should be true
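To see where the round trip breaks, it helps to look at the intermediate values. A minimal sketch that reuses the question's two functions unchanged (the names naive-hexify and naive-unhexify are mine, chosen only to avoid clashing with the other definitions in this thread):

```clojure
;; The question's functions, renamed for illustration only.
(defn naive-hexify [s]
  (apply str (map #(format "%02x" (int %)) s)))

(defn naive-unhexify [hex]
  (apply str
         (map (fn [[x y]] (char (Integer/parseInt (str x y) 16)))
              (partition 2 hex))))

;; (int \u2195) is 0x2195, so one char becomes four hex digits, not two.
(naive-hexify "\u2195")
;=> "2195"

;; Decoding then reads those four digits as two separate chars, 0x21 and 0x95.
(map int (naive-unhexify "2195"))
;=> (33 149)
```

The encoder writes UTF-16 code units at whatever width they need, while the decoder assumes exactly two digits per character, so the two only agree for code points below 0x100.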

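To fix this, you need to serialize the string to bytes using the character encoding you want; each character may take more than one byte.

There are a few "gotchas" with this: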

  • Remember that all of the JVM's integer types other than char are signed.
  • There are no unsigned bytes.
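Both points show up as soon as you look at the raw bytes from Clojure. A small sketch, assuming UTF-8 as the encoding (arrow-bytes is just an illustrative name):

```clojure
;; U+2195 takes three bytes in UTF-8, and each one has its high bit set,
;; so the JVM reports all three as negative numbers.
(def arrow-bytes (seq (.getBytes "\u2195" "UTF-8")))

arrow-bytes
;=> (-30 -122 -107)

;; Masking with 0xFF recovers the unsigned values 0xE2 0x86 0x95.
(map #(bit-and % 0xFF) arrow-bytes)
;=> (226 134 149)
```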

In idiomatic Java, you would use the low byte of an integer and mask it wherever you use it.

    int intValue = 0x80;
    byte byteValue = (byte)(intValue & 0xff); // use only the low byte

    System.out.println("int:\t" + intValue);
    System.out.println("byte:\t" + byteValue);

    // output:
    // int:   128
    // byte:  -128

Clojure has (unchecked-byte), which does effectively the same thing.
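A quick REPL sketch of the difference between the unchecked and checked coercions, using the same 0x80 value as the Java snippet:

```clojure
;; unchecked-byte keeps only the low 8 bits, like Java's (byte) cast.
(unchecked-byte 0x80)
;=> -128

;; The checked coercion refuses values outside -128..127 instead.
(try
  (byte 0x80)
  (catch IllegalArgumentException e
    :out-of-range))
;=> :out-of-range
```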

For example, using UTF-8 you could do the following:

(defn hexify [s]
  (apply str (map #(format "%02x" %) (.getBytes s "UTF-8"))))

(defn unhexify [s]
  (let [bytes (into-array Byte/TYPE
                 (map (fn [[x y]]
                    (unchecked-byte (Integer/parseInt (str x y) 16)))
                       (partition 2 s)))]
    (String. bytes "UTF-8")))

; with the above implementation:

;=> (hexify "\u2195")
"e28695"
;=> (unhexify "e28695")
"↕"
;=> (= "\u2195" (unhexify (hexify "\u2195")))
true


Grz*_*ywo 15

Since all of the posted solutions have some flaws, I'm sharing my own:

(defn hexify "Convert byte sequence to hex string" [coll]
  (let [hex [\0 \1 \2 \3 \4 \5 \6 \7 \8 \9 \a \b \c \d \e \f]]
    (letfn [(hexify-byte [b]
              (let [v (bit-and b 0xFF)] ; treat the signed byte as 0-255
                [(hex (bit-shift-right v 4)) (hex (bit-and v 0x0F))]))]
      (apply str (mapcat hexify-byte coll)))))

(defn hexify-str [s]
  (hexify (.getBytes s "UTF-8"))) ; explicit charset, not the platform default

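(defn unhexify "Convert hex string to byte sequence" [s]
  (letfn [(unhexify-2 [c1 c2]
            (unchecked-byte
              (+ (bit-shift-left (Character/digit c1 16) 4)
                 (Character/digit c2 16))))]
    (map #(apply unhexify-2 %) (partition 2 s))))

(defn unhexify-str [s]
  ;; Decode the bytes as UTF-8. Mapping char over them would throw for any
  ;; byte >= 0x80, because unchecked-byte returns those as negative values.
  (String. (byte-array (unhexify s)) "UTF-8"))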

Advantages:

  • High performance
  • Generic byte stream <-> string conversions, with specialized wrappers
  • Handles leading zeros in the hex result


Jer*_*eld 6

Sadly, the "idiom" appears to be using Apache Commons Codec, for example as buddy does:

(ns name-of-ns
  (:import org.apache.commons.codec.binary.Hex))

(defn str->bytes
  "Convert string to byte array."
  ([^String s]
   (str->bytes s "UTF-8"))
  ([^String s, ^String encoding]
   (.getBytes s encoding)))

(defn bytes->str
  "Convert byte array to String."
  ([^bytes data]
   (bytes->str data "UTF-8"))
  ([^bytes data, ^String encoding]
   (String. data encoding)))

(defn bytes->hex
  "Convert a byte array to hex encoded string."
  [^bytes data]
  (Hex/encodeHexString data))

(defn hex->bytes
  "Convert hexadecimal encoded string to bytes array."
  [^String data]
  (Hex/decodeHex (.toCharArray data)))
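If you would rather avoid the extra dependency, newer JDKs ship a built-in codec. A sketch assuming Java 17 or later, where java.util.HexFormat is available; the names bytes->hex* and hex->bytes* are illustrative and not part of any library:

```clojure
(import 'java.util.HexFormat)

(defn bytes->hex*
  "Encode a byte array as a lowercase hex string (JDK 17+)."
  [^bytes data]
  (.formatHex (HexFormat/of) data))

(defn hex->bytes*
  "Decode a hex string into a byte array (JDK 17+)."
  [^String s]
  (.parseHex (HexFormat/of) s))

(bytes->hex* (.getBytes "Clojure" "UTF-8"))
;=> "436c6f6a757265"

(String. (hex->bytes* "436c6f6a757265") "UTF-8")
;=> "Clojure"
```

On older JVMs this class does not exist, so the Commons Codec wrappers above remain the safer choice there.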


Ósc*_*pez 5

I believe your unhexify function is as idiomatic as it can be. However, hexify can be written in a simpler way:

(defn hexify [s]
  (format "%x" (new java.math.BigInteger (.getBytes s))))

  • This formats byte arrays whose first bit is 1 as "negative". (4 upvotes)
  • I used this approach too, until I realized that leading zeros get dropped. (2 upvotes)
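Both comments are easy to reproduce. A short sketch (the name bigint-hexify is mine, and it takes a byte array directly so that the inputs are explicit):

```clojure
(defn bigint-hexify [^bytes bs]
  (format "%x" (BigInteger. bs)))

;; A first byte with its high bit set makes BigInteger read the array as
;; a negative two's-complement number.
(bigint-hexify (.getBytes "\u2195" "UTF-8"))
;=> "-1d796b"   ; the bytes are actually e2 86 95

;; Leading zero bytes carry no numeric value, so they vanish.
(bigint-hexify (byte-array [0 1]))
;=> "1"         ; a byte-wise encoder would give "0001"
```

So the one-liner only round-trips when the first byte is between 0x10 and 0x7f, which happens to cover strings that start with a printable ASCII character.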