Sha*_*hay 4 python oop performance class
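I'm writing a program that (at least for now) retrieves stream information from TwitchTV (a streaming platform). The program is just for self-education, but when I run it, it takes 2 minutes just to print the name of one streamer.
I'm using Python 2.7.3 64-bit on Windows 7, if that matters in any way.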
classes.py:
#imports:
import urllib2  #the code below calls urllib2.urlopen, so urllib2 (not urllib) must be imported
import re

#classes:
class Streamer:
    #constructor:
    def __init__(self, name, mode, link):
        self.name = name
        self.mode = mode
        self.link = link

class Information:
    #constructor:
    def __init__(self, TWITCH_STREAMS, GAME, STREAMER_INFO):
        self.TWITCH_STREAMS = TWITCH_STREAMS
        self.GAME = GAME
        self.STREAMER_INFO = STREAMER_INFO

    def get_game_streamer_names(self):
        "Connects to Twitch.TV API, extracts and returns all streams for a specific game."
        #start connection
        self.con = urllib2.urlopen(self.TWITCH_STREAMS + self.GAME)
        self.info = self.con.read()
        self.con.close()
        #regular expressions to get all the stream names
        self.info = re.sub(r'"teams":\[\{.+?"\}\]', '', self.info) #remove all team names (they have the same name: parameter as streamer names)
        self.streamers_names = re.findall('"name":"(.+?)"', self.info) #looks for the name of each streamer in the pile of info
        #drop all "live_user_NAME" values; build a new list, because removing
        #items from a list while iterating over it skips elements
        self.streamers_names = [name for name in self.streamers_names
                                if not name.startswith("live_user_")]
        #end method
        return self.streamers_names

    def get_streamer_mode(self, name):
        "Returns a streamer's mode (online/offline)"
        #start connection
        self.con = urllib2.urlopen(self.STREAMER_INFO + name)
        self.info = self.con.read()
        self.con.close()
        #check if stream is online or offline ("stream":null indicates offline stream)
        if self.info.count('"stream":null') > 0:
            return "offline"
        else:
            return "online"
main.py:
#imports:
from classes import *

#consts:
TWITCH_STREAMS = "https://api.twitch.tv/kraken/streams/?game=" #add the game name at the end of the link (space = "+", eg: Game+Name)
STREAMER_INFO = "https://api.twitch.tv/kraken/streams/" #add streamer name at the end of the link
GAME = "League+of+Legends"

def main():
    #create an information object
    info = Information(TWITCH_STREAMS, GAME, STREAMER_INFO)
    streamer_list = [] #create a streamer list
    for name in info.get_game_streamer_names():
        #run for every streamer name, create a streamer object and place it in the list
        mode = info.get_streamer_mode(name)
        streamer_name = Streamer(name, mode, 'http://twitch.tv/' + name)
        streamer_list.append(streamer_name)
    #this line is just to try and print something
    print streamer_list[0].name, streamer_list[0].mode

if __name__ == '__main__':
    main()
The program itself runs perfectly, it's just really slow.
Any ideas?
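Program efficiency typically falls under the 80/20 rule (or what some people call the 90/10 rule, or even the 95/5 rule). That is, 80% of the time the program is actually running in 20% of the code. In other words, there is a good chance that your code has a "bottleneck": a small part of the code that runs slowly, while the rest runs very fast. Your goal is to identify that bottleneck (or bottlenecks), then fix it (them) to run faster.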
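The best way to do this is to profile your code. This means you log the time at which specific actions occur with the logging module, use timeit as a commenter suggested, use one of the built-in profilers, or simply print out the current time at various points in the program. Eventually, you will find one part of the code that seems to be taking most of the time.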
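To make the "print the time at various points" idea concrete, here is a minimal sketch, not your actual program. `fake_fetch` and `fake_parse` are hypothetical stand-ins I made up (the first sleeps instead of doing a real HTTP request), and `timed` is just a small helper built on `time.time()`. The point is only to show how accumulating time per labelled section makes the slow part stand out.

```python
import time
from contextlib import contextmanager

timings = {}  # label -> accumulated wall-clock seconds


@contextmanager
def timed(label):
    """Add the time spent inside the with-block to timings[label]."""
    start = time.time()
    try:
        yield
    finally:
        timings[label] = timings.get(label, 0.0) + (time.time() - start)


def fake_fetch(name):
    # Hypothetical stand-in for a network request: sleeps instead of doing I/O.
    time.sleep(0.01)
    return '{"stream":null}'


def fake_parse(body):
    # Hypothetical stand-in for the in-memory check on the response body.
    return "offline" if '"stream":null' in body else "online"


def run(names):
    """Fetch and parse once per name, timing the two steps separately."""
    modes = []
    for name in names:
        with timed("fetch"):
            body = fake_fetch(name)
        with timed("parse"):
            modes.append(fake_parse(body))
    return modes


if __name__ == "__main__":
    run(["a", "b", "c", "d", "e"])
    for label, seconds in sorted(timings.items()):
        print("%-6s %.4f s" % (label, seconds))
```

If you'd rather not touch the code at all, the built-in profiler can be run from the command line with `python -m cProfile -s cumulative main.py`, which lists functions sorted by cumulative time.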
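Experience will tell you that I/O (things like reading from disk or accessing resources over the Internet) takes far longer than in-memory calculations. My guess as to the problem is that you're using 1 HTTP connection to get the list of streamers, and then one HTTP connection per streamer to get that streamer's status. Let's say there are 10000 streamers: your program will need to make 10001 HTTP connections before it finishes.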
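The arithmetic behind that guess can be sketched without touching the network. `CountingClient` below is a hypothetical stand-in that only counts how many requests would be made, and `collect_modes` has the same shape as the loop in your `main()`: one call for the list, then one call per name. The per-request latency passed to `estimated_seconds` is an assumption you would have to measure yourself.

```python
class CountingClient(object):
    """Hypothetical stand-in for the HTTP layer; it only counts requests."""

    def __init__(self, names):
        self.names = list(names)
        self.requests = 0

    def list_streamers(self):
        self.requests += 1  # one request for the whole list
        return list(self.names)

    def streamer_mode(self, name):
        self.requests += 1  # one more request for every streamer
        return "online"


def collect_modes(client):
    """One list call, then one status call per returned name."""
    return dict((name, client.streamer_mode(name))
                for name in client.list_streamers())


def estimated_seconds(n_streamers, seconds_per_request):
    """Rough sequential cost: (1 + N) requests, each paying a full round trip."""
    return (1 + n_streamers) * seconds_per_request


if __name__ == "__main__":
    client = CountingClient("streamer%d" % i for i in range(10000))
    collect_modes(client)
    print(client.requests)
```

Because the requests run one after another, the total time grows linearly with the number of streamers: whatever a single round trip costs on your connection, you pay it N + 1 times.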
If this is indeed the case, there are a few ways to fix it: