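I'm trying to keep my program's memory footprint under control. I thought I'd start with the imports, since I only use 3-4 functions from the rather large PyObjC library. However, I was a little surprised to find that importing specific parts of a larger module has no effect on what actually gets loaded into memory.
Loading the entire Quartz.CoreGraphics library on OS X: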
Line # Mem usage Increment Line Contents
================================================
77 @profile
78 7.953 MB 0.000 MB def test_import_all():
79 26.734 MB 18.781 MB import Quartz.CoreGraphics as CG
It pulls in the whole library, at nearly 19 MB.
Trying to pull in only what I need gives the same 19 MB result:
Line # Mem usage Increment Line Contents
================================================
82 @profile
83 7.941 MB 0.000 MB def test_import_some():
84 26.727 MB 18.785 MB from Quartz.CoreGraphics import CGImageGetWidth
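So it seems that importing specific names has no bearing on what actually gets loaded.
Needing only a small subset of functions from a huge module seems like a common use case. Is there a way to load only the parts of the module I need into memory, or is this just a consequence of using external libraries?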
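The behaviour described above can be reproduced without PyObjC. The following is a minimal sketch that uses the standard-library `json` module as a stand-in (an assumption made only so the example runs anywhere): `from module import name` still executes and caches the whole module, and only the name bound in the importing namespace differs.

```python
import sys

# Ask for a single name only, as in "from Quartz.CoreGraphics import CGImageGetWidth".
from json import dumps

# The complete module object is nevertheless created and cached in sys.modules.
json_module = sys.modules["json"]

# Names we never asked for are present on the cached module.
other_names = [n for n in ("loads", "load", "dump") if hasattr(json_module, n)]
print(other_names)  # ['loads', 'load', 'dump']
```

Moving the import inside the function that needs it would only postpone this cost until the first call; it would not reduce it.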
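I'm trying to scrape practice quizzes from a history textbook's website so that I can combine them into one big test to study from.
The page is here.
It's all driven by JavaScript, so I'm trying to use HtmlUnit to scrape the page.
I'm more of a Python person, so I kept my initial code very close to HtmlUnit's Getting Started section: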
import com.gargoylesoftware.htmlunit.*;
import com.gargoylesoftware.htmlunit.WebClient;
import com.gargoylesoftware.htmlunit.html.HtmlPage;

public class HelloWorld {
    public static void main(String[] args) throws Exception {
        homePage();
        System.out.println("Done.");
    }

    public static void homePage() throws Exception {
        final WebClient webClient = new WebClient();
        String url = "http://www.wwnorton.com/college/polisci/we-the-people8/shorter/ch/15/quiz.aspx";
        final HtmlPage page = webClient.getPage(url);
        System.out.println(page.asText());
        webClient.closeAllWindows();
    }
}
When I run it, I get the following output:
Apr 27, 2013 12:50:16 PM com.gargoylesoftware.htmlunit.IncorrectnessListenerImpl notify
WARNING: Obsolete content type encountered: 'application/x-javascript'.
Apr 27, 2013 12:50:16 PM com.gargoylesoftware.htmlunit.IncorrectnessListenerImpl notify
WARNING: Obsolete content type encountered: 'application/x-javascript'.
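Apr …
I'm trying to submit some data to a form programmatically. I have a small problem in that the server "doesn't like" what I'm sending. Frustratingly, there is no error message or anything else that would help diagnose the problem; all that happens is that I land back on the same page I started from when I call br.submit().
When I click the submit button manually in a browser, the resulting page shows a small "Success!" message. No such message appears when I submit via the script. Moreover, no changes are actually posted to the server. This is odd, and it's the first time I've run into this behavior.
Looking through the Mechanize documentation, it suggests that for these strange, hard-to-diagnose problems it's best to replicate the request headers the browser actually submits.
My question is: how can I see what the request headers are when I call br.submit()?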
import base64
import mechanize

location = 'http://ww.mysite.com'
br = mechanize.Browser()
cj = mechanize.LWPCookieJar()
br.set_cookiejar(cj)
username = MY_USER_NAME
password = MY_PASSWORD
# b64encode, unlike encodestring, does not append a trailing newline to the header value
br.addheaders.append(('Authorization', 'Basic %s' % base64.b64encode('%s:%s' % (username, password))))
br.open(location)
br.select_form(nr=0)
br['text'] = 'MY JUNK TO SUBMIT' # Text field. Can put anything
br['DropDown1'] = ['4'] # This is a dropdown of integer values
br['DropDown2'] = ['3'] # Also a dropdown of ints
br.submit()
How can I see the headers that are being sent when the form is submitted?
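One way to reason about the headers is to build the request by hand and inspect it before anything goes over the wire. The sketch below deliberately swaps mechanize for Python 3's standard-library `urllib.request` so that it runs without extra packages; the URL, credentials, and the helper names `basic_auth_value` and `build_request` are hypothetical. Within mechanize itself, the documentation describes `br.set_debug_http(True)` for printing HTTP headers, which is probably the more direct route.

```python
import base64
import urllib.parse
import urllib.request


def basic_auth_value(username, password):
    """Build a Basic auth header value with no trailing newline."""
    raw = f"{username}:{password}".encode("utf-8")
    return "Basic " + base64.b64encode(raw).decode("ascii")


def build_request(url, username, password, form_data):
    """Assemble a POST request without sending it (hypothetical helper)."""
    body = urllib.parse.urlencode(form_data).encode("ascii")
    request = urllib.request.Request(url, data=body)
    request.add_header("Authorization", basic_auth_value(username, password))
    return request


# Placeholder URL and credentials, for illustration only.
req = build_request(
    "http://example.invalid/form",
    "user",
    "secret",
    {"text": "MY JUNK TO SUBMIT", "DropDown1": "4", "DropDown2": "3"},
)

# Inspect the headers explicitly attached to the request so far.
for name, value in req.header_items():
    print(f"{name}: {value}")
```

Headers that the opener adds at send time (for example Content-Type or Host) do not show up in `header_items()` until the request has been processed, so this only covers what the script sets itself.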
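I'm trying to upload a CSV file to this site. However, I've run into some problems that I think stem from an incorrect MIME type (maybe).
I'm trying to post the file manually via urllib2, so my code looks like this: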
import urllib
import urllib2
import mimetools, mimetypes
import os, stat
from cStringIO import StringIO

#============================
# Note: I found this recipe online. I can't remember where exactly though..
#=============================

class Callable:
    def __init__(self, anycallable):
        self.__call__ = anycallable

# Controls how sequences are uncoded. If true, elements may be given multiple values by
# assigning a sequence.
doseq = 1

class MultipartPostHandler(urllib2.BaseHandler):
    handler_order = urllib2.HTTPHandler.handler_order - 10 # needs to run first

    def http_request(self, request):
        data = …
I'm making a bunch of ctypes calls into the underlying OS libraries. My progress slows to a crawl whenever the documentation references a constant whose value is stored in a .h file somewhere, because I have to track the file down and work out the actual value before I can pass it to the function.
Is there a way to load a .h file with ctypes and access all of its constants?
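ctypes itself does not parse C headers, so one workaround is to extract the constants separately. The following is a rough sketch, not a real C preprocessor: it only recognises `#define NAME <integer literal>` lines and silently skips function-like macros and expressions. The function names and the sample header text are made up for illustration.

```python
import re

# Matches "#define NAME 123", "#define NAME 0x1F", optional u/l suffixes,
# and an optional trailing comment. Anything more complex is ignored.
_DEFINE_RE = re.compile(
    r"^\s*#\s*define\s+([A-Za-z_]\w*)\s+(0[xX][0-9a-fA-F]+|\d+)[uUlL]*"
    r"\s*(?:/\*.*?\*/\s*|//.*)?$"
)


def _parse_c_int(literal):
    """Convert a C integer literal (hex, octal, or decimal) to an int."""
    if literal.lower().startswith("0x"):
        return int(literal, 16)
    if len(literal) > 1 and literal.startswith("0"):
        return int(literal, 8)
    return int(literal, 10)


def parse_simple_defines(header_text):
    """Return {name: value} for simple integer #define lines."""
    constants = {}
    for line in header_text.splitlines():
        match = _DEFINE_RE.match(line)
        if not match:
            continue
        try:
            constants[match.group(1)] = _parse_c_int(match.group(2))
        except ValueError:
            # For example "09", which is not a valid octal literal.
            continue
    return constants


# Hypothetical header content, for illustration only.
SAMPLE_HEADER = """
#define EXAMPLE_FLAG_A 0x01
#define EXAMPLE_FLAG_B 0x02  /* second flag */
#define EXAMPLE_MAX 255u
#define EXAMPLE_MACRO(x) ((x) + 1)
#define EXAMPLE_EXPR (1 << 4)
"""

consts = parse_simple_defines(SAMPLE_HEADER)
print(consts)
```

The resulting integers can then be passed to ctypes function calls like any other Python int. Headers that rely on nested includes, conditional compilation, or computed values would need a real preprocessor or a dedicated binding generator instead.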
Context: splitting a list of integers into separate lists of even and odd numbers.
even = []
odd = []
for i in my_list:
    if i % 2 == 0:
        even.append(i)
    else:
        odd.append(i)
Is there a way to turn the above into a nice, compact list comprehension...?
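For reference, here is a small sketch of two compact alternatives to the loop above, using a made-up `my_list`. The first uses two list comprehensions and walks the list twice; the second keeps a single pass by indexing a pair of lists with the remainder.

```python
my_list = [1, 2, 3, 4, 5, 6, 7]

# Two comprehensions: short and readable, but iterates my_list twice.
even = [i for i in my_list if i % 2 == 0]
odd = [i for i in my_list if i % 2 != 0]

# Single pass: i % 2 is 0 for even and 1 for odd, so it can pick the bucket.
buckets = ([], [])
for i in my_list:
    buckets[i % 2].append(i)
even_single_pass, odd_single_pass = buckets

print(even, odd)
```

In Python, `i % 2` is 0 or 1 for negative integers as well, so the bucket indexing stays valid for any list of ints.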
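I'm trying to do a bit of scraping on this site to look up polling information programmatically. I initially tried Python, which worked well for loading the site and navigating around the aspx forms, but it couldn't extract the embedded map data (since no package (so far) handles JavaScript). So I decided to dust off my Java skills and break out HtmlUnit. However, I hit a roadblock almost immediately.
It looks as though the site has dead links to JavaScript files that don't exist. When HtmlUnit tries to load them, it gets a 404 and self-destructs.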
Jul 21, 2013 9:51:22 PM com.gargoylesoftware.htmlunit.html.HtmlPage loadExternalJavaScriptFile
SEVERE: Error loading JavaScript from [http://www.eci-polldaymonitoring.nic.in/psl/GoogleMapForASPNet.ascx/jsdebug].
com.gargoylesoftware.htmlunit.FailingHttpStatusCodeException: 404 Not Found for http://www.eci-polldaymonitoring.nic.in/psl/GoogleMapForASPNet.ascx/jsdebug
at com.gargoylesoftware.htmlunit.WebClient.throwFailingHttpStatusCodeExceptionIfNecessary(WebClient.java:544)
at com.gargoylesoftware.htmlunit.html.HtmlPage.loadJavaScriptFromUrl(HtmlPage.java:1119)
at com.gargoylesoftware.htmlunit.html.HtmlPage.loadExternalJavaScriptFile(HtmlPage.java:1059)
at com.gargoylesoftware.htmlunit.html.HtmlScript.executeScriptIfNeeded(HtmlScript.java:399)
at com.gargoylesoftware.htmlunit.html.HtmlScript$3.execute(HtmlScript.java:260)
at com.gargoylesoftware.htmlunit.html.HtmlScript.onAllChildrenAddedToPage(HtmlScript.java:276)
at com.gargoylesoftware.htmlunit.html.HTMLParser$HtmlUnitDOMBuilder.endElement(HTMLParser.java:676)
at org.apache.xerces.parsers.AbstractSAXParser.endElement(Unknown Source)
at com.gargoylesoftware.htmlunit.html.HTMLParser$HtmlUnitDOMBuilder.endElement(HTMLParser.java:635)
at org.cyberneko.html.HTMLTagBalancer.callEndElement(HTMLTagBalancer.java:1170)
at org.cyberneko.html.HTMLTagBalancer.endElement(HTMLTagBalancer.java:1072)
at org.cyberneko.html.filters.DefaultFilter.endElement(DefaultFilter.java:206)
at org.cyberneko.html.filters.NamespaceBinder.endElement(NamespaceBinder.java:330)
at org.cyberneko.html.HTMLScanner$ContentScanner.scanEndElement(HTMLScanner.java:3074)
at org.cyberneko.html.HTMLScanner$ContentScanner.scan(HTMLScanner.java:2041)
at org.cyberneko.html.HTMLScanner.scanDocument(HTMLScanner.java:918)
at org.cyberneko.html.HTMLConfiguration.parse(HTMLConfiguration.java:499)
at org.cyberneko.html.HTMLConfiguration.parse(HTMLConfiguration.java:452)
at org.apache.xerces.parsers.XMLParser.parse(Unknown Source)
at com.gargoylesoftware.htmlunit.html.HTMLParser$HtmlUnitDOMBuilder.parse(HTMLParser.java:892)
at com.gargoylesoftware.htmlunit.html.HTMLParser.parse(HTMLParser.java:241)
at com.gargoylesoftware.htmlunit.html.HTMLParser.parseHtml(HTMLParser.java:187)
at com.gargoylesoftware.htmlunit.DefaultPageCreator.createHtmlPage(DefaultPageCreator.java:268)
at com.gargoylesoftware.htmlunit.DefaultPageCreator.createPage(DefaultPageCreator.java:156)
at com.gargoylesoftware.htmlunit.WebClient.loadWebResponseInto(WebClient.java:434)
at com.gargoylesoftware.htmlunit.WebClient.getPage(WebClient.java:309) …
I'm trying to understand what is going on in this recursive function. It reverses a String, but I don't quite understand how the separate return calls end up being combined into a single string.
def reverse(string: String): String = {
  if (string.length() == 0)
    return string
  return reverse(string.substring(1)) + string.charAt(0)
}
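I've analyzed the function by adding print statements, and while I sort of understand how it works (conceptually), I don't understand, well... how it actually works.
For example, I know that each cycle of the recursion pushes things onto the stack.
So I would expect reverse("hello") to turn into a stack of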
o
l
l
e
h
But it must be more complicated than that, since the recursive call is return reverse(string.substring(1)) + string.charAt(0). So does the stack actually look like
o,
l, o
l, lo
e, llo
h, ello
?
How does that get turned into the single string we expect?
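To make the unwinding visible, here is a Python transcription of the same recursion with tracing added. The `depth` and `trace` parameters are additions for illustration and are not part of the original Scala function. Each call waits for the reversed tail to come back, and only then appends its own first character, so the string is assembled on the way back up the stack.

```python
def reverse(s, trace, depth=0):
    """Recursively reverse s, recording every call and return in trace."""
    indent = "  " * depth
    trace.append(f"{indent}reverse({s!r})")
    if len(s) == 0:
        result = s
    else:
        # The concatenation runs only after the inner call has returned.
        result = reverse(s[1:], trace, depth + 1) + s[0]
    trace.append(f"{indent}-> {result!r}")
    return result


trace = []
reversed_text = reverse("hello", trace)
print("\n".join(trace))
```

Reading the trace from the deepest line outward, the innermost call returns the empty string, the call for "o" returns "" + "o", the call for "lo" returns "o" + "l", and so on until the outermost call returns "olle" + "h".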