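I am trying to use the code from "Python urllib2 Basic Auth Problem" to download the content of a web page from a URL that requires authentication. The code I am trying is: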
import urllib2, base64
request = urllib2.Request("http://api.foursquare.com/v1/user")
base64string = base64.encodestring('%s:%s' % (username, password)).replace('\n', '')
request.add_header("Authorization", "Basic %s" % base64string)
result = urllib2.urlopen(request)
It gives me:
./xx.py: line 1: import: command not found
./xx.py: line 3: syntax error near unexpected token `('
./xx.py: line 3: `request = urllib2.Request("http://api.foursquare.com/v1/user")'
What am I doing wrong? I am using Python 2.7.5. How can I download file content from a URL that requires authentication?
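The `import: command not found` message looks like output from the shell rather than from Python, which usually happens when a script is run directly (`./xx.py`) without a `#!` interpreter line at the top. Below is a minimal sketch of the same Basic Auth request with such a line added. It assumes Python 3 and `urllib.request` instead of the `urllib2` module from the question, and the helper name `build_basic_auth_request` is hypothetical.

```python
#!/usr/bin/env python3
# The line above tells the shell to run this file with Python
# (an assumption about the cause of the "import: command not found" error).
import base64
import urllib.request


def build_basic_auth_request(url, username, password):
    """Return a Request carrying an HTTP Basic Authorization header.

    Hypothetical helper name; the caller supplies the credentials.
    """
    credentials = ("%s:%s" % (username, password)).encode("utf-8")
    # b64encode does not append a newline, so no .replace('\n', '') is needed
    token = base64.b64encode(credentials).decode("ascii")
    request = urllib.request.Request(url)
    request.add_header("Authorization", "Basic %s" % token)
    return request
```

Passing the returned object to `urllib.request.urlopen` and calling `.read()` on the response would then fetch the page content, provided the server accepts Basic authentication.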
I have a CSV file with a header row. Given this test.csv file:
"A","B","C","D","E","F","timestamp"
611.88243,9089.5601,5133.0,864.07514,1715.37476,765.22777,1.291111964948E12
611.88243,9089.5601,5133.0,864.07514,1715.37476,765.22777,1.291113113366E12
611.88243,9089.5601,5133.0,864.07514,1715.37476,765.22777,1.291120650486E12
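I just want to load it as a matrix/ndarray with 3 rows and 7 columns, and I also want to access the column vectors by a given column name. If I use genfromtxt (as shown below), I get an ndarray with 3 rows (one per line) and no columns.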
r = np.genfromtxt('test.csv',delimiter=',',dtype=None, names=True)
print r
print r.shape
[ (611.88243, 9089.5601000000006, 5133.0, 864.07514000000003, 1715.3747599999999, 765.22776999999996, 1291111964948.0)
(611.88243, 9089.5601000000006, 5133.0, 864.07514000000003, 1715.3747599999999, 765.22776999999996, 1291113113366.0)
(611.88243, 9089.5601000000006, 5133.0, 864.07514000000003, 1715.3747599999999, 765.22776999999996, 1291120650486.0)]
(3,)
I can get a column vector by its column name, like this:
print r['A']
[ 611.88243 611.88243 611.88243]
If I use loadtxt, I get an array with 3 rows and 7 columns, but I cannot access the columns by column name (as shown below).
numpy.loadtxt(open("test.csv","rb"),delimiter=",",skiprows=1)
I get:
[ [611.88243,9089.5601,5133.0,864.07514,1715.37476,765.22777,1.291111964948E12]
[611.88243,9089.5601,5133.0,864.07514,1715.37476,765.22777,1.291113113366E12]
[611.88243,9089.5601,5133.0,864.07514,1715.37476,765.22777,1.291120650486E12] ]
Is there a way in Python to satisfy both requirements at once (access columns by column name …
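One possible way to get both views is to load the file once as a structured array (for name-based access) and stack its fields into a plain 2-D array (for matrix access). The sketch below assumes NumPy is available, reads the data from an in-memory string instead of test.csv, and writes the header without quotes for simplicity.

```python
import io

import numpy as np

# Same rows as test.csv; the header is unquoted here (an assumption for the sketch).
CSV_TEXT = """A,B,C,D,E,F,timestamp
611.88243,9089.5601,5133.0,864.07514,1715.37476,765.22777,1.291111964948E12
611.88243,9089.5601,5133.0,864.07514,1715.37476,765.22777,1.291113113366E12
611.88243,9089.5601,5133.0,864.07514,1715.37476,765.22777,1.291120650486E12
"""

# Structured array: one record per line, fields addressable by column name.
records = np.genfromtxt(io.StringIO(CSV_TEXT), delimiter=",", names=True)

# Plain 2-D float array built by stacking each named field as one column.
matrix = np.column_stack([records[name] for name in records.dtype.names])

column_a = records["A"]   # column vector by name
print(matrix.shape)       # (3, 7)
print(column_a)
```

The column order of `matrix` follows `records.dtype.names`, so a name can be mapped to a column index with `records.dtype.names.index("A")` when only the 2-D array is kept.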
I am playing with scikit-learn to find tf-idf values.
I have a set of documents like:
D1 = "The sky is blue."
D2 = "The sun is bright."
D3 = "The sun in the sky is bright."
I want to create a matrix like this:
Docs   blue        bright      sky         sun
D1     tf-idf      0.0000000   tf-idf      0.0000000
D2     0.0000000   tf-idf      0.0000000   tf-idf
D3     0.0000000   tf-idf      tf-idf      tf-idf
So, my Python code is:
import nltk
import string
from sklearn.feature_extraction.text import TfidfVectorizer
from nltk.corpus import stopwords
train_set = ["sky is blue", "sun is bright", "sun in the sky is bright"]
stop_words …
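Since the code above is cut off, here is a separate minimal sketch of how such a document-by-term tf-idf matrix could be built. It assumes scikit-learn is installed and uses its built-in English stop-word list rather than the NLTK one imported in the question; it is not a completion of the truncated code.

```python
from sklearn.feature_extraction.text import TfidfVectorizer

docs = [
    "The sky is blue.",
    "The sun is bright.",
    "The sun in the sky is bright.",
]

# stop_words="english" drops words such as "the", "is" and "in"
# (scikit-learn's built-in list, assumed here in place of NLTK's).
vectorizer = TfidfVectorizer(stop_words="english")
tfidf = vectorizer.fit_transform(docs)

# Terms ordered by their column index in the matrix.
terms = sorted(vectorizer.vocabulary_, key=vectorizer.vocabulary_.get)
matrix = tfidf.toarray()

print("Docs", *terms)
for label, row in zip(["D1", "D2", "D3"], matrix):
    print(label, *("%.7f" % value for value in row))
```

Terms that do not occur in a document get the value 0, which matches the layout of the desired matrix.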
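I am very new to digital image processing and am stuck on the following problem: I need to write a C program that loads a PPM image file and performs line detection using a convolution kernel. Any kind of help would be appreciated.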
I am trying to read each line of a txt file and write each line to a different file. Suppose I have a text file like this:
How are you? I am good.
Wow, that's great.
This is a text file.
......
Now, I want filename1.txt to contain the following:
How are you? I am good.
filename2.txt should contain:
Wow, that's great.
And so on.
My code is:
#!/usr/bin/python
for i in range(1, 4):  # this range should increase with the number of lines
    with open('testdata.txt', 'r') as input:
        with open('filename%i.txt' % i, 'w') as output:
            for line in input:
                output.write(line)
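What I get is that every output file contains all the lines of the input file. I want each file to contain only one line, as described above.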
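The behaviour described above follows from re-reading the whole input inside the outer loop. A minimal sketch of one way around it is to read the input once and open a new output file per line, with `enumerate` providing the file number. The function name `split_lines_into_files` and its parameters are hypothetical.

```python
import os


def split_lines_into_files(input_path, output_dir=".", prefix="filename"):
    """Write line N of input_path to <prefix>N.txt and return the paths written."""
    written = []
    with open(input_path, "r") as source:
        # enumerate counts lines from 1, so no range() has to be sized in advance
        for index, line in enumerate(source, start=1):
            out_path = os.path.join(output_dir, "%s%d.txt" % (prefix, index))
            with open(out_path, "w") as target:
                target.write(line)
            written.append(out_path)
    return written
```

Calling `split_lines_into_files("testdata.txt")` would create filename1.txt, filename2.txt, and so on in the current directory, one per input line.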
I have two files, doc1.txt and doc2.txt. Their contents are as follows:
#doc1.txt
very good, very bad, you are great
#doc2.txt
very bad, good restaurent, nice place to visit
I want to split my corpus on commas (,) so that my final DocumentTermMatrix becomes:
       terms
docs   very good   very bad   you are great   good restaurent   nice place to visit
doc1   tf-idf      tf-idf     tf-idf          0                 0
doc2   0           tf-idf     0               tf-idf            tf-idf
I know how to compute a DocumentTermMatrix of individual words (using http://scikit-learn.org/stable/modules/feature_extraction.html), but I don't know how to compute a DocumentTermMatrix of strings in Python.
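One way this might be approached is to give `TfidfVectorizer` a custom tokenizer that splits on commas, so that each comma-separated phrase becomes one term. The sketch below assumes scikit-learn is installed, holds the two documents in memory instead of reading doc1.txt and doc2.txt, and uses a hypothetical helper named `split_on_commas`.

```python
from sklearn.feature_extraction.text import TfidfVectorizer


def split_on_commas(text):
    """Treat each comma-separated phrase as a single term (hypothetical helper)."""
    return [phrase.strip() for phrase in text.split(",") if phrase.strip()]


# Contents of doc1.txt and doc2.txt, inlined for the sketch.
docs = {
    "doc1": "very good, very bad, you are great",
    "doc2": "very bad, good restaurent, nice place to visit",
}

# token_pattern=None because the custom tokenizer replaces the default word regex.
vectorizer = TfidfVectorizer(tokenizer=split_on_commas, token_pattern=None)
tfidf = vectorizer.fit_transform(list(docs.values()))

# Terms ordered by their column index in the matrix.
terms = sorted(vectorizer.vocabulary_, key=vectorizer.vocabulary_.get)
matrix = tfidf.toarray()

for name, row in zip(docs, matrix):
    print(name, dict(zip(terms, row.round(4))))
```

The columns come out in alphabetical order of the phrases rather than in the order shown in the desired table, and a phrase missing from a document gets the value 0.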