Kee*_*Jay 37 c++ python algorithm performance
我最近写了一个简短的算法来计算python中的快乐数字.该程序允许您选择一个上限,它将确定它下面的所有快乐数字.为了进行速度比较,我决定将我所知道的算法从python直接翻译成c ++.
令人惊讶的是,c ++版本的运行速度明显慢于python版本.在发现前10,000个满意数字的执行时间之间进行准确的速度测试表明python程序平均在0.59秒内运行,而c ++版本平均在8.5秒内运行.
我将这个速度差异归结为我必须在c ++版本中编写部分计算的辅助函数(例如,确定元素是否在列表/数组/向量中),这些函数已经内置到python语言中.
首先,这是否是这种荒谬的速度差异的真正原因,其次,如何更改c ++版本以比python版本更快地执行(在我看来应该如此).
带有速度测试的两段代码在这里:Python Version,C++ Version.谢谢您的帮助.
#include <iostream>
#include <vector>
#include <string>
#include <ctime>
#include <windows.h>
using namespace std;
bool inVector(int inQuestion, vector<int> known);
int sum(vector<int> given);
int pow(int given, int power);
void calcMain(int upperBound);
int main()
{
while(true)
{
int upperBound;
cout << "Pick an upper bound: ";
cin >> upperBound;
long start, end;
start = GetTickCount();
calcMain(upperBound);
end = GetTickCount();
double seconds = (double)(end-start) / 1000.0;
cout << seconds << " seconds." << endl << endl;
}
return 0;
}
void calcMain(int upperBound)
{
vector<int> known;
for(int i = 0; i <= upperBound; i++)
{
bool next = false;
int current = i;
vector<int> history;
while(!next)
{
char* buffer = new char[10];
itoa(current, buffer, 10);
string digits = buffer;
delete buffer;
vector<int> squares;
for(int j = 0; j < digits.size(); j++)
{
char charDigit = digits[j];
int digit = atoi(&charDigit);
int square = pow(digit, 2);
squares.push_back(square);
}
int squaresum = sum(squares);
current = squaresum;
if(inVector(current, history))
{
next = true;
if(current == 1)
{
known.push_back(i);
//cout << i << "\t";
}
}
history.push_back(current);
}
}
//cout << "\n\n";
}
bool inVector(int inQuestion, vector<int> known)
{
for(vector<int>::iterator it = known.begin(); it != known.end(); it++)
if(*it == inQuestion)
return true;
return false;
}
int sum(vector<int> given)
{
int sum = 0;
for(vector<int>::iterator it = given.begin(); it != given.end(); it++)
sum += *it;
return sum;
}
int pow(int given, int power)
{
int original = given;
int current = given;
for(int i = 0; i < power-1; i++)
current *= original;
return current;
}
Run Code Online (Sandbox Code Playgroud)
#!/usr/bin/env python
import timeit
upperBound = 0
def calcMain():
known = []
for i in range(0,upperBound+1):
next = False
current = i
history = []
while not next:
digits = str(current)
squares = [pow(int(digit), 2) for digit in digits]
squaresum = sum(squares)
current = squaresum
if current in history:
next = True
if current == 1:
known.append(i)
##print i, "\t",
history.append(current)
##print "\nend"
while True:
upperBound = input("Pick an upper bound: ")
result = timeit.Timer(calcMain).timeit(1)
print result, "seconds.\n"
Run Code Online (Sandbox Code Playgroud)
Asi*_*sik 154
对于100000个元素,Python代码需要6.9秒,而C++最初需要37秒.
我对您的代码进行了一些基本的优化,并设法使C++代码比Python实现快100倍.它现在在0.06秒内完成100000个元素.这比原始C++代码快617倍.
最重要的是在发布模式下进行编译,并进行所有优化.在调试模式下,此代码实际上要慢一个数量级.
接下来,我将解释我所做的优化.
通过使用预先分配的数组而不是向量,可以更好地优化代码,但这将是更多的工作,我将把它作为练习留给读者.:P
这是优化的代码:
#include <iostream>
#include <vector>
#include <string>
#include <ctime>
#include <algorithm>
#include <windows.h>
using namespace std;
void calcMain(int upperBound, vector<int>& known);
int main()
{
while(true)
{
vector<int> results;
int upperBound;
cout << "Pick an upper bound: ";
cin >> upperBound;
long start, end;
start = GetTickCount();
calcMain(upperBound, results);
end = GetTickCount();
for (size_t i = 0; i < results.size(); ++i) {
cout << results[i] << ", ";
}
cout << endl;
double seconds = (double)(end-start) / 1000.0;
cout << seconds << " seconds." << endl << endl;
}
return 0;
}
void calcMain(int upperBound, vector<int>& known)
{
vector<int> history;
for(int i = 0; i <= upperBound; i++)
{
int current = i;
history.clear();
while(true)
{
int temp = current;
int sum = 0;
while (temp > 0) {
sum += (temp % 10) * (temp % 10);
temp /= 10;
}
current = sum;
if(find(history.begin(), history.end(), current) != history.end())
{
if(current == 1)
{
known.push_back(i);
}
break;
}
history.push_back(current);
}
}
}
Run Code Online (Sandbox Code Playgroud)
ily*_* n. 20
有一个新的,速度更快的版本作为单独的答案,所以这个答案已被弃用.
我重写了你的算法,只要它找到快乐或不快乐的数字就进行缓存.我也试图尽可能地使它成为pythonic,例如通过创建单独的函数digits()
和happy()
.很抱歉使用Python 3,但我也可以从中展示一些有用的东西.
这个版本要快得多.它运行在1.7S这是10倍比你原来的程序,需要更快的18S(好,我的MacBook是相当老又慢:))
#!/usr/bin/env python3
from timeit import Timer
from itertools import count
print_numbers = False
upperBound = 10**5 # Default value, can be overidden by user.
def digits(x:'nonnegative number') -> "yields number's digits":
if not (x >= 0): raise ValueError('Number should be nonnegative')
while x:
yield x % 10
x //= 10
def happy(number, known = {1}, happies = {1}) -> 'True/None':
'''This function tells if the number is happy or not, caching results.
It uses two static variables, parameters known and happies; the
first one contains known happy and unhappy numbers; the second
contains only happy ones.
If you want, you can pass your own known and happies arguments. If
you do, you should keep the assumption commented out on the 1 line.
'''
# assert 1 in known and happies <= known # <= is expensive
if number in known:
return number in happies
history = set()
while True:
history.add(number)
number = sum(x**2 for x in digits(number))
if number in known or number in history:
break
known.update(history)
if number in happies:
happies.update(history)
return True
def calcMain():
happies = {x for x in range(upperBound) if happy(x) }
if print_numbers:
print(happies)
if __name__ == '__main__':
upperBound = eval(
input("Pick an upper bound [default {0}]: "
.format(upperBound)).strip()
or repr(upperBound))
result = Timer(calcMain).timeit(1)
print ('This computation took {0} seconds'.format(result))
Run Code Online (Sandbox Code Playgroud)
Joh*_*ohn 16
看起来你正在按值传递向量到其他函数.这将是一个显着的减速,因为程序实际上会在向量传递给函数之前制作完整的矢量副本.要解决此问题,请将常量引用传递给向量而不是副本.所以代替:
int sum(vector<int> given)
Run Code Online (Sandbox Code Playgroud)
使用:
int sum(const vector<int>& given)
Run Code Online (Sandbox Code Playgroud)
执行此操作时,您将无法再使用vector :: iterator,因为它不是常量.你需要用vector :: const_iterator替换它.
您也可以传入非常量引用,但在这种情况下,您根本不需要修改参数.
我可以看到你有很多不必要的堆分配
例如:
while(!next)
{
char* buffer = new char[10];
Run Code Online (Sandbox Code Playgroud)
这看起来不是很优化.因此,您可能希望预先分配数组并在循环中使用它.这是一种易于发现和完成的基本优化技术.它也可能变得一团糟,所以要小心.
你也在使用atoi()函数,我真的不知道它是否真的被优化了.也许做模数10并得到数字可能会更好(你必须衡量你,我没有测试这个).
您进行线性搜索(inVector)的事实可能不好.用std :: set替换矢量数据结构可能会加快速度.hash_set也可以做到这一点.
但我认为最糟糕的问题是字符串和这个循环中堆上的东西的分配.这看起来不太好.我会先尝试那些地方.
这是我的第二个答案; 它可以缓存诸如平方和之类的值<= 10**6
:
happy_list[sq_list[x%happy_base] + sq_list[x//happy_base]]
Run Code Online (Sandbox Code Playgroud)
那是,
我不认为Python版本可以比这更快(好吧,如果你扔掉旧版本,这是try:
开销,它快10%).
我认为这是一个很好的问题,它表明,事实上,
好的,在这里(第2版现在......):
#!/usr/bin/env python3
'''Provides slower and faster versions of a function to compute happy numbers.
slow_happy() implements the algorithm as in the definition of happy
numbers (but also caches the results).
happy() uses the precomputed lists of sums of squares and happy numbers
to return result in just 3 list lookups and 3 arithmetic operations for
numbers less than 10**6; it falls back to slow_happy() for big numbers.
Utilities: digits() generator, my_timeit() context manager.
'''
from time import time # For my_timeit.
from random import randint # For example with random number.
upperBound = 10**5 # Default value, can be overridden by user.
class my_timeit:
'''Very simple timing context manager.'''
def __init__(self, message):
self.message = message
self.start = time()
def __enter__(self):
return self
def __exit__(self, *data):
print(self.message.format(time() - self.start))
def digits(x:'nonnegative number') -> "yields number's digits":
if not (x >= 0): raise ValueError('Number should be nonnegative')
while x:
yield x % 10
x //= 10
def slow_happy(number, known = {1}, happies = {1}) -> 'True/None':
'''Tell if the number is happy or not, caching results.
It uses two static variables, parameters known and happies; the
first one contains known happy and unhappy numbers; the second
contains only happy ones.
If you want, you can pass your own known and happies arguments. If
you do, you should keep the assumption commented out on the 1 line.
'''
# This is commented out because <= is expensive.
# assert {1} <= happies <= known
if number in known:
return number in happies
history = set()
while True:
history.add(number)
number = sum(x**2 for x in digits(number))
if number in known or number in history:
break
known.update(history)
if number in happies:
happies.update(history)
return True
# This will define new happy() to be much faster ------------------------.
with my_timeit('Preparation time was {0} seconds.\n'):
LogAbsoluteUpperBound = 6 # The maximum possible number is 10**this.
happy_list = [slow_happy(x)
for x in range(81*LogAbsoluteUpperBound + 1)]
happy_base = 10**((LogAbsoluteUpperBound + 1)//2)
sq_list = [sum(d**2 for d in digits(x))
for x in range(happy_base + 1)]
def happy(x):
'''Tell if the number is happy, optimized for smaller numbers.
This function works fast for numbers <= 10**LogAbsoluteUpperBound.
'''
try:
return happy_list[sq_list[x%happy_base] + sq_list[x//happy_base]]
except IndexError:
return slow_happy(x)
# End of happy()'s redefinition -----------------------------------------.
def calcMain(print_numbers, upper_bound):
happies = [x for x in range(upper_bound + 1) if happy(x)]
if print_numbers:
print(happies)
if __name__ == '__main__':
while True:
upperBound = eval(input(
"Pick an upper bound [{0} default, 0 ends, negative number prints]: "
.format(upperBound)).strip() or repr(upperBound))
if not upperBound:
break
with my_timeit('This computation took {0} seconds.'):
calcMain(upperBound < 0, abs(upperBound))
single = 0
while not happy(single):
single = randint(1, 10**12)
print('FYI, {0} is {1}.\n'.format(single,
'happy' if happy(single) else 'unhappy'))
print('Nice to see you, goodbye!')
Run Code Online (Sandbox Code Playgroud)
归档时间: |
|
查看次数: |
10754 次 |
最近记录: |