I'm having trouble installing lxml with easy_install on Ubuntu 11.
When I run $ easy_install lxml
I get:
Searching for lxml
Reading http://pypi.python.org/simple/lxml/
Reading http://codespeak.net/lxml
Best match: lxml 2.3
Downloading http://lxml.de/files/lxml-2.3.tgz
Processing lxml-2.3.tgz
Running lxml-2.3/setup.py -q bdist_egg --dist-dir /tmp/easy_install-7UdQOZ/lxml-2.3/egg-dist-tmp-GacQGy
Building lxml version 2.3.
Building without Cython.
ERROR: /bin/sh: xslt-config: not found
** make sure the development packages of libxml2 and libxslt are installed **
Using build configuration of libxslt
In file included from src/lxml/lxml.etree.c:227:0:
src/lxml/etree_defs.h:9:31: fatal error: libxml/xmlversion.h: No such file or directory
compilation terminated.
It seems that libxslt or libxml2 is not installed. I have tried following http://www.techsww.com/tutorials/libraries/libxslt/installation/installing_libxslt_on_ubuntu_linux.php …
Here is my error:
(mysite)zjm1126@zjm1126-G41MT-S2:~/zjm_test/mysite$ pip install lxml
Downloading/unpacking lxml
Running setup.py egg_info for package lxml
Building lxml version 2.3.
Building without Cython.
ERROR: /bin/sh: xslt-config: not found
** make sure the development packages of libxml2 and libxslt are installed **
Using build configuration of libxslt
Installing collected packages: lxml
Running setup.py install for lxml
Building lxml version 2.3.
Building without Cython.
ERROR: /bin/sh: xslt-config: not found
** make sure the development packages of libxml2 and libxslt are installed **
Using build configuration …
I want to install lxml so that I can then install Scrapy.
When I updated my Mac today, it would not let me reinstall lxml, and I get the following error:
In file included from src/lxml/lxml.etree.c:314:
/private/tmp/pip_build_root/lxml/src/lxml/includes/etree_defs.h:9:10: fatal error: 'libxml/xmlversion.h' file not found
#include "libxml/xmlversion.h"
^
1 error generated.
error: command 'cc' failed with exit status 1
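I have tried using brew to install libxml2 and libxslt. Both installed fine, but I still cannot install lxml.
The last time I installed it I had to enable the developer tools in Xcode, but since the update to Xcode 5 it no longer gives me that option.
Does anyone know what I need to do?
I have converted my script from Python 2.7 to 3.2, and I have a bug.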
# -*- coding: utf-8 -*-
import time
from datetime import date
from lxml import etree
from collections import OrderedDict
# Create the root element
page = etree.Element('results')
# Make a new document tree
doc = etree.ElementTree(page)
# Add the subelements
pageElement = etree.SubElement(page, 'Country', Tim='Now',
                               name='Germany', AnotherParameter='Bye',
                               Code='DE',
                               Storage='Basic')
pageElement = etree.SubElement(page, 'City',
                               name='Germany',
                               Code='PZ',
                               Storage='Basic', AnotherParameter='Hello')
# For multiple attributes, use as shown above
# Save to XML file
outFile = open('output.xml', …
...
soup = BeautifulSoup(html, "lxml")
File "/Library/Python/2.7/site-packages/bs4/__init__.py", line 152, in __init__
% ",".join(features))
bs4.FeatureNotFound: Couldn't find a tree builder with the features you requested: lxml. Do you need to install a parser library?
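The output above is from my terminal. I'm on Mac OS 10.7.x with Python 2.7.1. I followed this tutorial to get Beautiful Soup and lxml; both installed successfully and work with a separate test file located here. In the Python script that causes this error, I have included this line: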
from pageCrawler import comparePages
In the pageCrawler file, I have included the following two lines:
from bs4 import BeautifulSoup
from urllib2 import urlopen
Any help in figuring out what the problem is and how to solve it would be much appreciated.
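One common cause of FeatureNotFound is that the interpreter running the script is not the one lxml was installed into. The sketch below is only a diagnostic under that assumption: it prints which interpreter is running and checks whether lxml is importable there, falling back to Beautiful Soup's built-in "html.parser" feature otherwise. The helper name pick_parser is hypothetical, not part of bs4.

```python
import importlib.util
import sys


def pick_parser(preferred="lxml", fallback="html.parser"):
    """Return `preferred` if that module is importable here, else `fallback`.

    pick_parser is a hypothetical helper for this sketch, not a bs4 API.
    """
    if importlib.util.find_spec(preferred) is not None:
        return preferred
    return fallback


# Shows which Python is actually executing the script.
print("Interpreter:", sys.executable)

parser_name = pick_parser()
print("Parser to pass to BeautifulSoup:", parser_name)
```

The returned name can be passed as the second argument, as in BeautifulSoup(html, parser_name). If the printed interpreter path differs from the one pip installed into, installing lxml for that interpreter is the more direct fix.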
I'm running the following command to install the packages listed in the file:
pip install -r requirements.txt --download-cache=~/tmp/pip-cache
requirements.txt contains packages like:
# Data formats
# ------------
PIL==1.1.7 #
html5lib==0.90
httplib2==0.7.4
lxml==2.3.1
# Documentation
# -------------
Sphinx==1.1
docutils==0.8.1
# Testing
# -------
behave==1.1.0
dingus==0.3.2
django-testscenarios==0.7.2
mechanize==0.2.5
mock==0.7.2
testscenarios==0.2
testtools==0.9.14
wsgi_intercept==0.5.1
When it tries to install the "lxml" package, I get the following error:
Requirement already satisfied (use --upgrade to upgrade): django-testproject>=0.1.1 in /usr/lib/python2.7/site-packages/django_testproject-0.1.1-py2.7.egg (from django-testscenarios==0.7.2->-r requirements.txt (line 33))
Installing collected packages: lxml, Sphinx, docutils, behave, dingus, mock, testscenarios, testtools, South
Running setup.py install for lxml
Building lxml version 2.3.1.
Building without Cython.
ERROR: /bin/sh: xslt-config: …
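I have an HTML file (from Newegg), and their HTML is organized as shown below. All the data in their spec table is in cells with class "desc", while the label for each entry is in a cell with class "name". Below are two examples of data from Newegg pages.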
<tr>
<td class="name">Brand</td>
<td class="desc">Intel</td>
</tr>
<tr>
<td class="name">Series</td>
<td class="desc">Core i5</td>
</tr>
<tr>
<td class="name">Cores</td>
<td class="desc">4</td>
</tr>
<tr>
<td class="name">Socket</td>
<td class="desc">LGA 1156</td>
<tr>
<td class="name">Brand</td>
<td class="desc">AMD</td>
</tr>
<tr>
<td class="name">Series</td>
<td class="desc">Phenom II X4</td>
</tr>
<tr>
<td class="name">Cores</td>
<td class="desc">4</td>
</tr>
<tr>
<td class="name">Socket</td>
<td class="desc">Socket AM3</td>
</tr>
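In the end, I would like to have a class for a CPU (already set up) that holds the Brand, Series, Cores, and Socket type, to store each piece of data. This is the only way I can think of doing it: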
if(parsedDocument.xpath(tr/td[@class="name"])=='Brand'):
CPU.brand = parsedDocument.xpath(tr/td[@class="name"]/nextsibling?).text
and then do the same for the rest of the values. How do I accomplish the nextsibling part, and is there an easier way?
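One way to avoid a next-sibling lookup altogether is to walk the rows and read the "name" and "desc" cells of each row as a pair. The sketch below assumes a small, well-formed sample wrapped in a table element (SAMPLE is made up from the first example above, and parse_specs is a hypothetical helper). It falls back to the standard library's ElementTree only so that it still runs where lxml is not installed.

```python
try:
    from lxml import etree  # the library used in the question
except ImportError:  # fallback so the sketch runs without lxml
    import xml.etree.ElementTree as etree

# Assumption: a well-formed fragment modeled on the first example above.
SAMPLE = """<table>
<tr><td class="name">Brand</td><td class="desc">Intel</td></tr>
<tr><td class="name">Series</td><td class="desc">Core i5</td></tr>
<tr><td class="name">Cores</td><td class="desc">4</td></tr>
<tr><td class="name">Socket</td><td class="desc">LGA 1156</td></tr>
</table>"""


def parse_specs(markup):
    """Map each row's "name" cell text to its "desc" cell text."""
    root = etree.fromstring(markup)
    specs = {}
    for row in root.iter("tr"):
        name = row.findtext("td[@class='name']")
        desc = row.findtext("td[@class='desc']")
        # Skip rows that do not have both cells.
        if name is not None and desc is not None:
            specs[name.strip()] = desc.strip()
    return specs


specs = parse_specs(SAMPLE)
print(specs["Brand"], specs["Socket"])
```

The resulting dictionary can then fill the CPU class, for example CPU.brand = specs["Brand"]. With lxml's full XPath support, the next cell can also be reached through the following-sibling axis, as in //td[@class="name"][text()="Brand"]/following-sibling::td[1]. A real Newegg page is unlikely to be well-formed XML, so lxml.html.fromstring would probably be the better parser there.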
I'm trying to parse XML that contains some non-ASCII characters.
The code looks like this:
from lxml import etree
from lxml import objectify
content = u'<?xml version="1.0" encoding="utf-8"?><div>Order date : 05/08/2013 12:24:28</div>'
content = content.replace(u'\xa0', u' ')
xml = etree.fromstring(content)
But it shows me an error on the 'content = ...' line:
SyntaxError: Non-ASCII character '\xc2' in file /home/projects/ztest/responce.py on line 3,
but no encoding declared; see http://www.python.org/peps/pep-0263.html for details
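It works in the terminal, but running it in the Eclipse IDE gives me an error.
I don't know how to get past this.
I'm trying to install lxml so that I can install Scrapy on my Mac (v 10.9.4)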
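The SyntaxError points at PEP 263: Python 2 needs an encoding declaration at the top of a source file that contains non-ASCII bytes. A minimal sketch follows, assuming the offending character is a non-breaking space (U+00A0) inside the string; its exact position is a guess, so it is written here as an escape. The string is also encoded to bytes before parsing, because lxml rejects unicode strings that carry an encoding declaration.

```python
# -*- coding: utf-8 -*-
# The line above is the PEP 263 declaration; Python 3 assumes UTF-8 anyway.
try:
    from lxml import etree  # the library used in the question
except ImportError:  # fallback so the sketch runs without lxml
    import xml.etree.ElementTree as etree

# Assumption: the non-ASCII character is a non-breaking space after "date".
content = (u'<?xml version="1.0" encoding="utf-8"?>'
           u'<div>Order date\u00a0: 05/08/2013 12:24:28</div>')

# str.replace returns a new string, so the result has to be kept.
cleaned = content.replace(u'\u00a0', u' ')

# Parse bytes, since the document declares its own encoding.
root = etree.fromstring(cleaned.encode('utf-8'))
text = root.text
print(text)
```

That it works in the terminal but not in Eclipse may simply mean the two use different interpreters or save the file with different encodings; that part is a guess and worth checking in the IDE's run configuration.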
ishaantaylor@Ishaans-MacBook-Pro.local ~
$ pip install lxml
Downloading/unpacking lxml
Downloading lxml-3.4.0.tar.gz (3.5MB): 3.5MB downloaded
Running setup.py (path:/private/var/folders/8l/t7tcq67d34v7qq_4hp3s1dm80000gn/T/pip_build_ishaantaylor/lxml/setup.py) egg_info for package lxml
/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/distutils/dist.py:267: UserWarning: Unknown distribution option: 'bugtrack_url'
warnings.warn(msg)
Building lxml version 3.4.0.
Building without Cython.
Using build configuration of libxslt 1.1.28
warning: no previously-included files found matching '*.py'
Installing collected packages: lxml
Running setup.py install for lxml
/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/distutils/dist.py:267: UserWarning: Unknown distribution option: 'bugtrack_url'
warnings.warn(msg)
Building lxml version 3.4.0.
Building without Cython.
Using build configuration of libxslt 1.1.28
building 'lxml.etree' …
While running a Python script, I got this error:
from lxml import etree
ImportError: No module named lxml
So I tried to install lxml:
sudo easy_install lxml
But it gave me the following error:
Building lxml version 2.3.beta1.
NOTE: Trying to build without Cython, pre-generated 'src/lxml/lxml.etree.c' needs to be available.
ERROR: /bin/sh: xslt-config: not found
** make sure the development packages of libxml2 and libxslt are installed **
Using build configuration of libxslt
src/lxml/lxml.etree.c:4: fatal error: Python.h: No such file or directory
compilation terminated.
error: Setup script exited with error: command 'gcc' failed with exit status 1
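The build log above names two missing pieces: the xslt-config helper and the Python.h header. The sketch below only checks whether the current environment provides them; it is a diagnostic, not an installer, and check_build_prerequisites is a hypothetical helper. Including xml2-config in the check is an assumption based on the log's hint about the libxml2 development package.

```python
import os
import shutil
import sysconfig


def check_build_prerequisites():
    """Report whether the tools and header named in the log can be found."""
    # Directory where this interpreter expects its C headers.
    include_dir = sysconfig.get_paths()["include"]
    return {
        "xslt-config": shutil.which("xslt-config") is not None,
        "xml2-config": shutil.which("xml2-config") is not None,
        "Python.h": os.path.isfile(os.path.join(include_dir, "Python.h")),
    }


report = check_build_prerequisites()
for name, found in sorted(report.items()):
    print("%-12s %s" % (name, "found" if found else "MISSING"))
```

On Debian and Ubuntu these pieces normally come from the libxml2-dev, libxslt1-dev, and python-dev (or python3-dev) packages, so any MISSING entry suggests installing the matching package before retrying the lxml build.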