Ath*_*euz 1 python caching flask
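I'm trying to create a Flask web application that has to request an entire non-local website, and I was wondering whether it's possible to cache it to speed things up. The website doesn't change often, but I'd still like the cache to be refreshed once a day.
Anyway, I looked around and found Flask-Cache, which seemed to do what I wanted, so I made the appropriate changes and came up with adding this: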
from flask.ext.cache import Cache
[...]
cache = Cache()
[...]
cache.init_app(app)
[...]
@cache.cached(timeout=86400, key_prefix='content')
def get_content():
    return lxml.html.fromstring(urllib2.urlopen('http://WEBSITE.com').read())
Then I call it from the functions that need the content, like this:
content = get_content()
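Now I'd expect the cached lxml.html object to be reused on every call, but that's not what I'm seeing. The object's id changes on every call, and there's no speed-up at all. So have I misunderstood what Flask-Cache does, or am I doing something wrong here? I've tried using the memoize decorator instead, and I've tried lowering the timeout or removing it altogether, but nothing seems to make a difference.
Thanks.
The default CACHE_TYPE is null, which gives you a NullCache, so you get no caching at all. The documentation doesn't state this explicitly, but this line in the source of Cache.init_app does: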
self.config.setdefault('CACHE_TYPE', 'null')
To actually get some caching, initialise your Cache instance to use a suitable cache type.
cache = Cache(config={'CACHE_TYPE': 'simple'})
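Aside: note that SimpleCache is great for development and testing, and for this example, but you shouldn't use it in production. Something like MemCached or RedisCache would be much better.
Now, with actual caching in place, you'll run into the next problem. On the second call the cached lxml.html object will be retrieved from the Cache, but it will be broken, because these objects are not cacheable. The stack trace looks like this: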
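To make the difference between the two cache types concrete, here is a minimal standard-library sketch. The NullCache and SimpleCache classes and the cached decorator below are simplified stand-ins written for illustration (they borrow the names but are not Flask-Cache's actual implementation), and count_calls is a hypothetical helper.

```python
import functools


class NullCache:
    """Stand-in for a 'null' backend: accepts writes but never stores anything."""

    def get(self, key):
        return None

    def set(self, key, value):
        pass


class SimpleCache:
    """Stand-in for an in-memory backend: a plain dict, with no expiry handling."""

    def __init__(self):
        self._store = {}

    def get(self, key):
        return self._store.get(key)

    def set(self, key, value):
        self._store[key] = value


def cached(cache, key):
    """Hypothetical decorator: reuse the value stored under `key` if there is one."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper():
            value = cache.get(key)
            if value is None:
                # Cache miss: run the real function and try to store the result.
                value = func()
                cache.set(key, value)
            return value
        return wrapper
    return decorator


def count_calls(cache):
    """Call a cached function twice and report how often its body actually ran."""
    calls = []

    @cached(cache, "content")
    def get_content():
        calls.append(1)
        return "<html>...</html>"

    get_content()
    get_content()
    return len(calls)


null_runs = count_calls(NullCache())
simple_runs = count_calls(SimpleCache())
print(null_runs, simple_runs)  # prints "2 1"
```

With the null stand-in every call is a miss, so the function body runs each time, which matches the symptom in the question of a new object on every call and no speed-up.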
Traceback (most recent call last):
  File "/home/day/.virtualenvs/so-flask/lib/python2.7/site-packages/flask/app.py", line 1701, in __call__
    return self.wsgi_app(environ, start_response)
  File "/home/day/.virtualenvs/so-flask/lib/python2.7/site-packages/flask/app.py", line 1689, in wsgi_app
    response = self.make_response(self.handle_exception(e))
  File "/home/day/.virtualenvs/so-flask/lib/python2.7/site-packages/flask/app.py", line 1687, in wsgi_app
    response = self.full_dispatch_request()
  File "/home/day/.virtualenvs/so-flask/lib/python2.7/site-packages/flask/app.py", line 1360, in full_dispatch_request
    rv = self.handle_user_exception(e)
  File "/home/day/.virtualenvs/so-flask/lib/python2.7/site-packages/flask/app.py", line 1358, in full_dispatch_request
    rv = self.dispatch_request()
  File "/home/day/.virtualenvs/so-flask/lib/python2.7/site-packages/flask/app.py", line 1344, in dispatch_request
    return self.view_functions[rule.endpoint](**req.view_args)
  File "/home/day/q12030403.py", line 20, in index
    return "get_content returned: {0!r}\n".format(get_content())
  File "lxml.etree.pyx", line 1034, in lxml.etree._Element.__repr__ (src/lxml/lxml.etree.c:41389)
  File "lxml.etree.pyx", line 881, in lxml.etree._Element.tag.__get__ (src/lxml/lxml.etree.c:39979)
  File "apihelpers.pxi", line 15, in lxml.etree._assertValidNode (src/lxml/lxml.etree.c:12306)
AssertionError: invalid Element proxy at 3056741852
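So instead of caching the lxml.html object, you should cache just the simple string, i.e. the content of the website you download, and then re-parse it to get a fresh lxml.html object each time. Your cache still helps, because you aren't hitting the other website every time. Here is a complete program demonstrating a solution that works: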
from flask import Flask
from flask.ext.cache import Cache
import time
import lxml.html
import urllib2

app = Flask(__name__)
cache = Cache(config={'CACHE_TYPE': 'simple'})
cache.init_app(app)

@cache.cached(timeout=86400, key_prefix='content')
def get_content():
    app.logger.debug("get_content called")
    # return lxml.html.fromstring(urllib2.urlopen('http://daybarr.com/wishlist').read())
    return urllib2.urlopen('http://daybarr.com/wishlist').read()

@app.route("/")
def index():
    app.logger.debug("index called")
    return "get_content returned: {0!r}\n".format(get_content())

if __name__ == "__main__":
    app.run(debug=True)
When I run the program and make two requests to http://127.0.0.1:5000/, I get this output. Note that get_content is not called the second time, because the content is served from the cache.
* Running on http://127.0.0.1:5000/
* Restarting with reloader
--------------------------------------------------------------------------------
DEBUG in q12030403 [q12030403.py:20]:
index called
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
DEBUG in q12030403 [q12030403.py:14]:
get_content called
--------------------------------------------------------------------------------
127.0.0.1 - - [21/Dec/2012 00:03:28] "GET / HTTP/1.1" 200 -
--------------------------------------------------------------------------------
DEBUG in q12030403 [q12030403.py:20]:
index called
--------------------------------------------------------------------------------
127.0.0.1 - - [21/Dec/2012 00:03:33] "GET / HTTP/1.1" 200 -