看起来很简单,但我无法在pandas DataFrame中绘制带有"点"的XY图表.我想在XY图表上显示子标记为"Mark",其中X为年龄,Y为fdg.
代码到目前为止
mydata = [{'subid': 'B14-111', 'age': 75, 'fdg': 3}, {'subid': 'B14-112', 'age': 22, 'fdg': 2}, {'subid': 'B14-112', 'age': 40, 'fdg': 5}]
df = pandas.DataFrame(mydata)
DataFrame.plot(df,x="age",y="fdg")
show()
Run Code Online (Sandbox Code Playgroud)

我试图在HDFS上为Impala上传csv文件并且失败很多次.我不遵循指南,不知道这里有什么问题.而且csv也在HDFS上.
CREATE EXTERNAL TABLE gc_imp
(
asd INT,
full_name STRING,
sd_fd_date STRING,
ret INT,
ftyu INT,
qwerINT
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY','
LOCATION '/user/hadoop/Gc_4';
Run Code Online (Sandbox Code Playgroud)
我得到的错误.而我正在使用Hue.
> TExecuteStatementResp(status=TStatus(errorCode=None,
> errorMessage='MetaException: hdfs://nameservice1/user/hadoop/Gc_4 is
> not a directory or unable to create one', sqlState='HY000',
> infoMessages=None, statusCode=3), operationHandle=None)
Run Code Online (Sandbox Code Playgroud)
任何领导.
我正在尝试加入(垂直)一些元组,最好说我将这些元组插入到数据框中。但直到现在还做不到。问题出现了,因为我试图水平而不是垂直添加它们。
data_frame = pandas.DataFrame(columns=("A","B","C","D"))
str1 = "Doodles are the logo-incorporating works of art that Google regularly features on its homepage. They began in 1998 with a stick figure by Google co-founders Larry Page and Sergey Brin -- to indicate they were attending the Burning Man festival. Since then the doodles have become works of art -- some of them high-tech and complex -- created by a team of doodlers. Stay tuned here for more of this year's doodles"
aa = …Run Code Online (Sandbox Code Playgroud) 我已经尝试过像Lambda,List comprehension等序列生成器,但似乎我无法得到我真正想要的东西.我的最终目标是从像字符串[1:3]这样的字符串中打印单词序列
我在找什么:
a = [0,13,26,39]
b = [12,25,38,51]
str = 'If you are done with the file, move to the command area across from the file name in the RL screen and type'
read = str.split()
read[0:12]
['If', 'you', 'are', 'done', 'with', 'the', 'file,', 'move', 'to', 'the', 'command', 'area']
read[13:25]
['from', 'the', 'file', 'name', 'in', 'the', 'RL', 'screen', 'and', 'type']
Run Code Online (Sandbox Code Playgroud) 几次敲我的头后,我终于来到了这里.
问题:我正在尝试下载每个craiglist发布的内容.按内容我的意思是"发布机构",如手机的描述.寻找新的旧手机,因为iPhone完成了所有的兴奋.
代码是Michael Herman的精彩作品.
我的蜘蛛类
from scrapy.contrib.spiders import CrawlSpider, Rule
from scrapy.contrib.linkextractors.sgml import SgmlLinkExtractor
from scrapy.selector import *
from craig.items import CraiglistSampleItem
class MySpider(CrawlSpider):
name = "craigs"
allowed_domains = ["craigslist.org"]
start_urls = ["http://minneapolis.craigslist.org/moa/"]
rules = (Rule (SgmlLinkExtractor(allow=("index\d00\.html", ),restrict_xpaths=('//p[@class="nextpage"]',))
, callback="parse_items", follow= True),
)
def parse_items(self,response):
hxs = HtmlXPathSelector(response)
titles = hxs.select("//span[@class='pl']")
items = []
for titles in titles:
item = CraiglistSampleItem()
item ["title"] = titles.select("a/text()").extract()
item ["link"] = titles.select("a/@href").extract()
items.append(item)
return items
Run Code Online (Sandbox Code Playgroud)
和Item类
from scrapy.item import …Run Code Online (Sandbox Code Playgroud) 我目前正在尝试使用 pyodbc 从 python 程序连接 DB2。由于驱动程序尚未安装在服务器中,我从 IBM 网站下载了它,但我不确定如何使用 pyodbc 进行连接。
我使用的代码是:
cnx = pyodbc.connect(
'Driver={IBM DB2 ODBC Driver}; '
'Hostname=hostname; '
'Port=50100; '
'Protocol=TCPIP; '
'Database=db_name; '
'CurrentSchema=schema; '
'UID=user_id; '
'PWD = passw;'
)
Run Code Online (Sandbox Code Playgroud)
不确定如何将它与我刚刚下载的驱动程序和 CLI 连接起来,任何提示都会非常有帮助。
这个问题是相关的:
通过 FTP 访问远程服务器时,出现以下错误。不确定它的问题是什么,所以我可以解决。任何线索都会有帮助。
代码:
import ftplib
from ftplib import FTP
ftp = ftplib.FTP("server_name")
Run Code Online (Sandbox Code Playgroud)
错误:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/lib64/python2.6/ftplib.py", line 116, in __init__
self.connect(host)
File "/usr/lib64/python2.6/ftplib.py", line 131, in connect
self.sock = socket.create_connection((self.host, self.port), self.timeout)
File "/usr/lib64/python2.6/socket.py", line 553, in create_connection
for res in getaddrinfo(host, port, 0, SOCK_STREAM):
socket.gaierror: [Errno -2] Name or service not known
Run Code Online (Sandbox Code Playgroud) 我试图将SAS数据集的大小放入宏变量中.虽然无法获得良好的提示.我正在考虑的选项如下:
proc sql;
select *
from dictionary.tables
where libname = "REPOS"
;
quit;
proc contents data = repos.ajk;
run;
Run Code Online (Sandbox Code Playgroud) 我正在尝试从Wikipedia页面获取重要图像,而不是缩略图或其他gif,并使用以下代码。但是,“ img”的长度为“ 0”。关于如何纠正它的任何建议。
代码:
import urllib
import urllib2
from bs4 import BeautifulSoup
import os
html = urllib2.urlopen("http://en.wikipedia.org/wiki/Main_Page")
soup = BeautifulSoup(html)
imgs = soup.findAll("div",{"class":"image"})
Run Code Online (Sandbox Code Playgroud)
另外,如果有人可以通过查看网页中的“源元素”来详细说明如何使用findAll。那太好了。
我试图划分的第一排与第二,第三排,第四,第五行与第六及以后的一个相当大的数据表.没有太多计算,有没有办法做到这一点.
输入
Name Month Income
John Jan 10000
John_County Jan 20000
Tanya Jan 20000
Tanya_County Jan 40000
Run Code Online (Sandbox Code Playgroud)
产量
Name Month Per_Income
John Jan 50%
Tanya Jan 50%
Run Code Online (Sandbox Code Playgroud)