将整数表示为一系列乘数

Question

将整数表示为一系列乘数

Esk*_*sko 11 language-agnostic algorithm

向下滚动以查看最新的编辑,我在此留下所有这些文本,以便我不会使此问题到目前为止收到的回复无效!

我有以下的脑筋急转弯我希望得到一个解决方案,我试图解决这个问题,但是因为我在数学上没有那么高于平均水平(也就是说,我认为我非常接近平均水平)我可以'似乎把我的头围住了.

的问题:给定数目x应拆分到的系列multipliers,其中每个multiplier <= y,y是像10或16或任何一个常数.在系列(技术上array of integers)中,应该添加最后一个数字而不是相乘以能够将乘数转换回原始数字.

作为一个例子,让我们假设x=29和y=10.在这种情况下,预期的数组将是{10,2,9}有意义的10*2+9.但是,如果y=5它是{5,5,4}意义5*5+4或者是否y=3,它将是{3,3,3,2}那样的3*3*3+2.

我尝试通过这样做来解决这个问题:

同时x >= y,存储y到multipliers,然后x = x - y
什么时候x < y,存储x到multipliers

显然这不起作用,我也尝试分别存储"剩余"部分,并在其他所有内容之后添加,但这也不起作用.我认为我的主要问题是我试图以过于复杂的方式思考这个问题,而解决方案显而易见且简单明了.

重申一下,这些是此算法应具有的限制:

必须使用64位长
必须返回一个32位整数数组(......好吧,短裤也可以)
虽然支持已签名的数字(+和 - )会很好,如果它有助于任务只有无符号数字是必须的

虽然我使用Java做这个,但我宁愿将任何可能的代码示例作为伪代码,我特别不想轻易做出答案,我只需要一个轻推(好吧,更多的强力踢)以便我可以解决这至少部分是我自己.提前致谢.

编辑:进一步澄清

为了避免一些混乱,我想我应该改写一下:

结果数组中的每个整数应小于或等于y,包括最后一个数字.
是的,最后一个数字只是一个神奇的数字.
不,这不是模数,因为在大多数情况下第二个数字将大于y.
是的,大多数可用数字都有多个答案,但是我正在寻找数学运算量最少的数字.至于我的逻辑是,这意味着找到大乘数尽可能的最高金额,例如x=1 000 000,y=100是100*100*100,即使10*10*10*10*10*10是同样正确答案的数学明智的.

到目前为止,我需要仔细考虑给出的答案,但如果你还有什么要补充的话,请做!我非常感谢您对此已经表现出的兴趣,谢谢大家.

编辑2:更多解释+赏金

好吧,看起来我在这里的目标是不能按照我认为的方式完成.我对自己的目标过于暧昧,在考虑了一下之后,我决定只告诉你我想做什么,看看你能想出什么.

我最初的目标是想出一个特定的方法来将1..n大整数(也就是长整数)打包在一起,这样它们的String表示明显比写实际数字短.认为十,10 ^ 6和1 000 000的倍数是相同的,但是表示的字符长度不是.

为此我想以某种方式组合数字,因为预计这些数字彼此有些接近.我firsth以为代表100, 121, 282的100+21+161可能是要走的路,但在字符串长度节约忽略不计,在最好的,真的不工作那么好,如果数字是不是很接近对方.基本上我想要超过10%.

所以我想出了如果我将数字按公共属性(如乘数)分组并将数字的其余部分除以单个组件,然后我可以将其表示为字符串.这就是这个问题所在的地方,我认为例如1 000 000和10 000可以表示为10 ^(5 | 6)但是由于我的目标用法的背景,这有点太过于片状:

上下文是Web.RESTful URL:s具体.这就是为什么我提到使用64个字符(web安全的字母数字非保留字符然后一些)的原因,从那以后我可以创建看似随机的URL,可以解压缩到表示一组id号的整数列表.在这一点上,我想到创建一个类似64的基数系统来表示基数10/2数字,但由于我不是数学天才,所以我不知道如何做到这一点.

赏金

现在我已经写完了整个故事(对不起,这是一个漫长的故事),我正在为这个问题打开奖金.关于先前指定的首选算法的要求的所有内容仍然有效.我还想说,我已经感谢到目前为止我收到的所有答案,如果按照你们所做的那样的方式完成,我觉得自己被证明是错误的.

结论

好吧,现在给予赏金.我在回复中散布了一些评论,主要是为了将来参考和我自己,如果您认为我们应该能够在多个答案中进行传播,您还可以查看我的SO Uservoice关于传播与此问题相关的赏金的建议.

谢谢大家抽出时间回答!

Answer 1

Unk*_*own 16

更新

我无法抗拒试图为第一个问题提出我自己的解决方案,即使它没有进行压缩.这是一个使用名为pyecm的第三方分解算法的Python解决方案.

这个解决方案可能比Yevgeny的解决方案效率高几个.对于合理的y值,计算需要几秒而不是几小时甚至几周/年.对于x = 2 ^ 32-1和y = 256,我的核心组合1.2 ghz需要1.68秒.

>>> import time
>>> def test():
...     before = time.time()
...     print factor(2**32-1, 256)
...     print time.time()-before
...
>>> test()
[254, 232, 215, 113, 3, 15]
1.68499994278
>>> 254*232*215*113*3+15
4294967295L

Run Code Online (Sandbox Code Playgroud)

以下是代码:

def factor(x, y):
    # y should be smaller than x. If x=y then {y, 1, 0} is the best solution
    assert(x > y)

    best_output = []

    # try all possible remainders from 0 to y 
    for remainder in xrange(y+1):
        output = []
        composite = x - remainder
        factors = getFactors(composite)

        # check if any factor is larger than y
        bad_remainder = False
        for n in factors.iterkeys():
            if n > y: 
                bad_remainder = True
                break
        if bad_remainder: continue

        # make the best factors
        while True:
            results = largestFactors(factors, y)
            if results == None: break
            output += [results[0]]
            factors = results[1]

        # store the best output
        output = output + [remainder]
        if len(best_output) == 0 or len(output) < len(best_output):
            best_output = output

    return best_output

# Heuristic
# The bigger the number the better. 8 is more compact than 2,2,2 etc...

# Find the most factors you can have below or equal to y
# output the number and unused factors that can be reinserted in this function
def largestFactors(factors, y):
    assert(y > 1)
    # iterate from y to 2 and see if the factors are present.
    for i in xrange(y, 1, -1):
        try_another_number = False
        factors_below_y = getFactors(i)
        for number, copies in factors_below_y.iteritems():
            if number in factors:
                if factors[number] < copies:
                    try_another_number = True
                    continue # not enough factors
            else:
                try_another_number = True
                continue # a factor is not present

        # Do we want to try another number, or was a solution found?
        if try_another_number == True:
            continue
        else:
            output = 1
            for number, copies in factors_below_y.items():
                remaining = factors[number] - copies
                if remaining > 0:
                    factors[number] = remaining
                else:
                    del factors[number]
                output *= number ** copies

            return (output, factors)

    return None # failed




# Find prime factors. You can use any formula you want for this.
# I am using elliptic curve factorization from http://sourceforge.net/projects/pyecm
import pyecm, collections, copy

getFactors_cache = {}
def getFactors(n):
    assert(n != 0)
    # attempt to retrieve from cache. Returns a copy
    try:
        return copy.copy(getFactors_cache[n])
    except KeyError:
        pass

    output = collections.defaultdict(int)
    for factor in pyecm.factors(n, False, True, 10, 1):
        output[factor] += 1

    # cache result
    getFactors_cache[n] = output

    return copy.copy(output)

Run Code Online (Sandbox Code Playgroud)

回答第一个问题

你说你想要压缩数字,但是根据你的例子,这些序列比未分解的数字更长.如果没有更多细节到你遗漏的系统(序列概率/是否有可编程客户端？),就无法压缩这些数字.你能详细说明一下吗？

这是一个数学解释,为什么当前对问题的第一部分的答案永远不会解决你的第二个问题.它与背包问题无关.

$香农的熵$

这是香农的熵算法.它告诉您表示序列{X0,X1,X2,...,Xn-1,Xn}所需的理论最小位数,其中p(Xi)是看到令牌Xi的概率.

假设X0到Xn是0到4294967295(整数范围)的跨度.根据您的描述,每个数字都可能与另一个数字一样出现.因此,每个元素的概率是1/4294967296.

当我们将其插入Shannon的算法时,它将告诉我们表示流所需的最小位数.

import math

def entropy():
    num = 2**32
    probability = 1./num
    return -(num) * probability * math.log(probability, 2)
    # the (num) * probability cancels out

Run Code Online (Sandbox Code Playgroud)

不出所料,熵是32.我们需要32位来表示一个整数,其中每个数字的可能性相等.减少这个数字的唯一方法是增加某些数字的概率,并降低其他数字的概率.您应该更详细地解释该流.

回答第二个问题

正确的方法是在与HTTP通信时使用base64.显然Java在标准库中没有这个,但是我找到了一个免费实现的链接:

http://iharder.sourceforge.net/current/java/base64/

这是"伪代码",它在Python中完美运行,并且应该很难转换为Java(我的Java生锈):

def longTo64(num):
    mapping = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-_"
    output = ""

    # special case for 0
    if num == 0:
        return mapping[0]

    while num != 0:
        output = mapping[num % 64] + output
        num /= 64

    return output

Run Code Online (Sandbox Code Playgroud)

如果您可以控制Web服务器和Web客户端,并且可以毫无问题地解析整个HTTP请求,则可以升级到base85.根据维基百科,url编码最多允许85个字符.否则,您可能需要从映射中删除一些字符.

这是Python中的另一个代码示例

def longTo85(num):
    mapping = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-_.~!*'();:@&=+$,/?%#[]"
    output = ""
    base = len(mapping)

    # special case for 0
    if num == 0:
        return mapping[0]

    while num != 0:
        output = mapping[num % base] + output
        num /= base

    return output

Run Code Online (Sandbox Code Playgroud)

这是逆操作:

def stringToLong(string):
    mapping = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-_.~!*'();:@&=+$,/?%#[]"
    output = 0
    base = len(mapping)

    place = 0
    # check each digit from the lowest place
    for digit in reversed(string):
        # find the number the mapping of symbol to number, then multiply by base^place
        output += mapping.find(digit) * (base ** place)
        place += 1

    return output

Run Code Online (Sandbox Code Playgroud)

这是Shannon算法在不同基础上的图表. 替代文字

如您所见,基数越高,表示数字所需的符号越少.在base64处,需要~11个符号来表示长.在base85,它变成~10个符号.

Answer 2

Yev*_*tor 6

最终解释后编辑:

我认为base64是最好的解决方案,因为有标准功能可以处理它,而这个想法的变体并没有带来太大的改进.这里的其他人更详细地回答了这个问题.

关于原始问题,尽管代码有效,但不能保证在任何合理的时间内运行,正如LFSR咨询公司对此问题的回答和评论.

原答案:

你的意思是这样的？

编辑 - 评论后更正.

shortest_output = {}

foreach (int R = 0; R <= X; R++) {
    // iteration over possible remainders
    // check if the rest of X can be decomposed into multipliers
    newX = X - R;
    output = {};

    while (newX > Y) {
       int i;
       for (i = Y; i > 1; i--) {
           if ( newX  % i == 0) { // found a divider
           output.append(i);
           newX  = newX /i;  
           break;
           }
       }

       if (i == 1) { // no dividers <= Y
          break;
       }
    }
    if (newX != 1) {
        // couldn't find dividers with no remainder
        output.clear();
    }
    else {
        output.append(R);
            if (output.length() < shortest_output.length()) {
                 shortest_output = output;
            }
    }
}

Run Code Online (Sandbox Code Playgroud)

问题的原始问题谈到将一个数字分解为乘数.任何算法在合理的时间内都无法解决这个问题 - 这就是RSA(公钥/私钥)加密基于...... (2认同)

Answer 3

Dav*_*ave 5

听起来好像你想要压缩随机数据 - 出于信息理论的原因,这是不可能的.(请参阅http://www.faqs.org/faqs/compression-faq/part1/preamble.html问题9.)在数字的连接二进制表示中使用Base64并完成它.

归档时间：	17 年，1 月前
查看次数：	1913 次
最近记录：	17 年前