Bai*_*ose 148
There is no standard module for this, but I have written my own functions to do it.
BASE62 = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"


def encode(num, alphabet=BASE62):
    """Encode a positive number in Base X

    Arguments:
    - `num`: The number to encode
    - `alphabet`: The alphabet to use for encoding
    """
    if num == 0:
        return alphabet[0]
    arr = []
    base = len(alphabet)
    while num:
        num, rem = divmod(num, base)
        arr.append(alphabet[rem])
    arr.reverse()
    return ''.join(arr)


def decode(string, alphabet=BASE62):
    """Decode a Base X encoded string into the number

    Arguments:
    - `string`: The encoded string
    - `alphabet`: The alphabet to use for encoding
    """
    base = len(alphabet)
    strlen = len(string)
    num = 0

    idx = 0
    for char in string:
        power = (strlen - (idx + 1))
        num += alphabet.index(char) * (base ** power)
        idx += 1

    return num
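Notice that you can give it any alphabet to use for encoding and decoding. If you leave out the `alphabet` argument, you get the 62-character alphabet defined on the first line of code, and hence encoding/decoding to/from base 62.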
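As a quick illustration of the `alphabet` argument, here is a minimal self-contained sketch. It repeats compact versions of the `encode` and `decode` functions above so that it runs on its own; the `HEX` alphabet is an assumption added for the example and is not part of the original code.

```python
BASE62 = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"
HEX = "0123456789abcdef"  # illustrative alphabet, not from the original answer


def encode(num, alphabet=BASE62):
    """Encode a non-negative integer using the given alphabet."""
    if num == 0:
        return alphabet[0]
    digits = []
    base = len(alphabet)
    while num:
        num, rem = divmod(num, base)
        digits.append(alphabet[rem])
    # Digits were collected least significant first.
    return "".join(reversed(digits))


def decode(string, alphabet=BASE62):
    """Decode a string produced by encode() back into an integer."""
    base = len(alphabet)
    num = 0
    for char in string:
        num = num * base + alphabet.index(char)
    return num


print(encode(61))        # last symbol of the default alphabet: Z
print(encode(62))        # 1 * 62 + 0: 10
print(encode(255, HEX))  # same function, base 16: ff
print(decode("10"))      # 62
```

The base is simply the length of whatever alphabet is passed in, so the same two functions cover base 16, base 62, or anything else.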
Hope this helps.
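PS - For URL shorteners, I have found that it is better to leave out a few confusing characters like 0Ol1oI etc. So I use this alphabet for my URL shortening needs - "23456789abcdefghijkmnpqrstuvwxyzABCDEFGHJKLMNPQRSTUVWXYZ"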
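A minimal sketch of how that reduced alphabet might be used for slugs, assuming integer ids as input. The names `shorten`, `expand`, and `row_id` are hypothetical and only serve the example; the alphabet itself is the one quoted above.

```python
# Alphabet quoted in the post: 0, O, l, 1, o and I are left out (56 symbols).
URL_ALPHABET = "23456789abcdefghijkmnpqrstuvwxyzABCDEFGHJKLMNPQRSTUVWXYZ"
URL_VALUES = {char: value for value, char in enumerate(URL_ALPHABET)}


def shorten(num):
    """Encode a non-negative integer as a short slug (hypothetical helper)."""
    if num == 0:
        return URL_ALPHABET[0]
    base = len(URL_ALPHABET)
    digits = []
    while num:
        num, rem = divmod(num, base)
        digits.append(URL_ALPHABET[rem])
    return "".join(reversed(digits))


def expand(slug):
    """Decode a slug; characters outside the alphabet raise ValueError."""
    num = 0
    for char in slug:
        if char not in URL_VALUES:
            raise ValueError("Unexpected character %r" % char)
        num = num * len(URL_ALPHABET) + URL_VALUES[char]
    return num


row_id = 123456  # hypothetical database id
slug = shorten(row_id)
print(slug, expand(slug))
```

Because the confusing characters are not in the alphabet at all, a slug that contains one of them (a mistyped `O` for instance) is rejected instead of silently decoding to the wrong id.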
Have fun.
Wol*_*lph 45
I once wrote a script to do this as well, and I think it's quite elegant :)
import string

# Remove the `_@` below for base62, now it has 64 characters
# (string.ascii_letters replaces string.letters, which is gone in Python 3)
BASE_LIST = string.digits + string.ascii_letters + '_@'
BASE_DICT = dict((c, i) for i, c in enumerate(BASE_LIST))


def base_decode(string, reverse_base=BASE_DICT):
    length = len(reverse_base)
    ret = 0
    for i, c in enumerate(string[::-1]):
        ret += (length ** i) * reverse_base[c]

    return ret


def base_encode(integer, base=BASE_LIST):
    if integer == 0:
        return base[0]

    length = len(base)
    ret = ''
    while integer != 0:
        ret = base[integer % length] + ret
        integer //= length  # floor division keeps `integer` an int on Python 3

    return ret
Example usage:
for i in range(100):
    print(i, base_decode(base_encode(i)), base_encode(i))
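The following decoder-maker works with any reasonable base, has a much tidier loop, and gives an explicit error message when it meets an invalid character.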
def base_n_decoder(alphabet):
    """Return a decoder for a base-n encoded string

    Argument:
    - `alphabet`: The alphabet used for encoding
    """
    base = len(alphabet)
    char_value = dict(((c, v) for v, c in enumerate(alphabet)))

    def f(string):
        num = 0
        try:
            for char in string:
                num = num * base + char_value[char]
        except KeyError:
            raise ValueError('Unexpected character %r' % char)
        return num
    return f


if __name__ == "__main__":
    func = base_n_decoder('0123456789abcdef')
    # The last test string contains the invalid character 'q',
    # so the loop ends with a ValueError on purpose.
    for test in ('0', 'f', '2020', 'ffff', 'abqdef'):
        print(test)
        print(func(test))
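For symmetry, an encoder can be built the same way. The sketch below is a companion written for illustration, not part of the original answer; the name `base_n_encoder` is hypothetical, and rejecting negative numbers is a design choice made here.

```python
def base_n_encoder(alphabet):
    """Return an encoder that turns a non-negative integer into a base-n string."""
    base = len(alphabet)

    def f(num):
        if num < 0:
            raise ValueError('Cannot encode negative number %r' % num)
        if num == 0:
            return alphabet[0]
        digits = []
        while num:
            num, rem = divmod(num, base)
            digits.append(alphabet[rem])
        # Digits were produced least significant first.
        return ''.join(reversed(digits))
    return f


to_hex = base_n_encoder('0123456789abcdef')
print(to_hex(0x2020))  # 2020
print(to_hex(65535))   # ffff
```

Collecting the digits in a list and joining once is one option; prepending to a string, as the other answers do, produces the same result.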
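If you are looking for the highest efficiency (like django), you will want something like the following. This code is a combination of the efficient methods from Baishampayan Ghose, WoLpH, and John Machin.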
# Edit this list of characters as desired.
BASE_ALPH = tuple("0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz")
BASE_DICT = dict((c, v) for v, c in enumerate(BASE_ALPH))
BASE_LEN = len(BASE_ALPH)


def base_decode(string):
    num = 0
    for char in string:
        num = num * BASE_LEN + BASE_DICT[char]
    return num


def base_encode(num):
    if not num:
        return BASE_ALPH[0]

    encoding = ""
    while num:
        num, rem = divmod(num, BASE_LEN)
        encoding = BASE_ALPH[rem] + encoding
    return encoding
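You may also want to calculate your dictionary in advance. (Note: building the encoding as a string turned out to be more efficient than building it as a list, even with very long numbers.)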
>>> timeit.timeit("for i in xrange(1000000): base.base_decode(base.base_encode(i))", setup="import base", number=1)
2.3302059173583984
Encoded and decoded 1 million numbers in under 2.5 seconds. (2.2 GHz i7-2670QM)
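The timing above uses Python 2 (`xrange`) and a module named `base`. A rough Python 3 equivalent in a single file might look like the sketch below. The `round_trip` helper and the smaller loop count are assumptions made for the example, and the time it prints depends entirely on the machine, so no expected number is shown.

```python
import timeit

BASE_ALPH = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"
BASE_DICT = {c: v for v, c in enumerate(BASE_ALPH)}  # computed once, up front
BASE_LEN = len(BASE_ALPH)


def base_decode(string):
    num = 0
    for char in string:
        num = num * BASE_LEN + BASE_DICT[char]
    return num


def base_encode(num):
    if not num:
        return BASE_ALPH[0]
    encoding = ""
    while num:
        num, rem = divmod(num, BASE_LEN)
        encoding = BASE_ALPH[rem] + encoding
    return encoding


def round_trip(count):
    """Encode and decode every integer below `count` (hypothetical helper)."""
    for i in range(count):
        base_decode(base_encode(i))


# timeit.timeit accepts a callable; number=1 runs the whole loop once.
elapsed = timeit.timeit(lambda: round_trip(100000), number=1)
print("100000 round trips took %.3f s" % elapsed)
```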
If you use the django framework, you can use the django.utils.baseconv module.
>>> from django.utils import baseconv
>>> baseconv.base62.encode(1234567890)
1LY7VK
In addition to base62, baseconv also defines base2/base16/base36/base56/base64.
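Note that the order of the alphabet matters if results must match another implementation. The `1LY7VK` output above is consistent with a digits, then uppercase, then lowercase ordering. The sketch below is plain Python and does not need django; `encode` is a stand-in written for illustration. It shows that the same number encodes differently under the lowercase-first alphabet from the first answer.

```python
UPPER_FIRST = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"
LOWER_FIRST = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"


def encode(num, alphabet):
    """Encode a non-negative integer using the given alphabet."""
    if num == 0:
        return alphabet[0]
    base = len(alphabet)
    encoding = ""
    while num:
        num, rem = divmod(num, base)
        encoding = alphabet[rem] + encoding
    return encoding


print(encode(1234567890, UPPER_FIRST))  # 1LY7VK
print(encode(1234567890, LOWER_FIRST))  # 1ly7vk
```

Strings produced with one ordering will decode to a different number under the other, so both sides of a system need to agree on the exact alphabet.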