Posts by use*_*422

./xx.py: line 1: import: command not found

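I am trying to use the code from "Python urllib2 Basic Auth Problem" to download the content of a web page from a URL that requires authentication. The code I am trying is: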

 import urllib2, base64

request = urllib2.Request("http://api.foursquare.com/v1/user")
base64string = base64.encodestring('%s:%s' % (username, password)).replace('\n', '')
request.add_header("Authorization", "Basic %s" % base64string)   
result = urllib2.urlopen(request)

It tells me:

./xx.py: line 1: import: command not found
./xx.py: line 3: syntax error near unexpected token `('
./xx.py: line 3: `request = urllib2.Request("http://api.foursquare.com/v1/user")'

What am I doing wrong? I am using Python 2.7.5. How can I download the contents of a file from a URL that requires authentication?
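The messages above come from the shell rather than from Python: `import: command not found` suggests the file was executed as a shell script, which typically happens when it has no `#!` interpreter line and is run as `./xx.py`. A minimal sketch of a script with such a line follows. It assumes Python 3, so `urllib.request` and `base64.b64encode` stand in for the question's `urllib2` and `base64.encodestring`; the helper name `build_basic_auth_request` and the sample credentials are hypothetical.

```python
#!/usr/bin/env python3
# The interpreter line above must be the very first line of the file;
# without it, ./xx.py is handed to the shell, which cannot parse "import".
import base64
import urllib.request


def build_basic_auth_request(url, username, password):
    """Return a Request that carries an HTTP Basic Authorization header."""
    credentials = "{}:{}".format(username, password).encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    request = urllib.request.Request(url)
    request.add_header("Authorization", "Basic " + token)
    return request


if __name__ == "__main__":
    # Hypothetical credentials, for illustration only.
    request = build_basic_auth_request(
        "http://api.foursquare.com/v1/user", "user", "secret"
    )
    print(request.get_header("Authorization"))
    # The download itself would then be:
    # body = urllib.request.urlopen(request).read()
```

Running the original file as `python xx.py` should avoid the shell errors as well, since the interpreter is then named explicitly.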

python url

Score: 19
Answers: 4
Views: 80k

Load a csv file into numpy and access columns by name

I have a csv file with a header row, like this test.csv file:

"A","B","C","D","E","F","timestamp"
611.88243,9089.5601,5133.0,864.07514,1715.37476,765.22777,1.291111964948E12
611.88243,9089.5601,5133.0,864.07514,1715.37476,765.22777,1.291113113366E12
611.88243,9089.5601,5133.0,864.07514,1715.37476,765.22777,1.291120650486E12

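I simply want to load it as a matrix/ndarray with 3 rows and 7 columns, and I also want to access the column vectors by a given column name. If I use genfromtxt (as shown below), I get an ndarray with 3 rows (one per line) and no columns.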

import numpy as np

r = np.genfromtxt('test.csv', delimiter=',', dtype=None, names=True)
print(r)
print(r.shape)

[ (611.88243, 9089.5601000000006, 5133.0, 864.07514000000003, 1715.3747599999999, 765.22776999999996, 1291111964948.0)
 (611.88243, 9089.5601000000006, 5133.0, 864.07514000000003, 1715.3747599999999, 765.22776999999996, 1291113113366.0)
 (611.88243, 9089.5601000000006, 5133.0, 864.07514000000003, 1715.3747599999999, 765.22776999999996, 1291120650486.0)]
(3,)

I can get a column vector from its column name like this:

print(r['A'])
  [ 611.88243  611.88243  611.88243]

If I use loadtxt, I get an array with 3 rows and 7 columns, but I cannot access the columns by column name (as shown below).

numpy.loadtxt(open("test.csv","rb"),delimiter=",",skiprows=1)

I get:

  [ [611.88243,9089.5601,5133.0,864.07514,1715.37476,765.22777,1.291111964948E12]
    [611.88243,9089.5601,5133.0,864.07514,1715.37476,765.22777,1.291113113366E12]
    [611.88243,9089.5601,5133.0,864.07514,1715.37476,765.22777,1.291120650486E12] ]

Is there a way in Python to meet both requirements at the same time (access columns by column name …
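One possible way to get both views is to keep the structured array from genfromtxt for name-based access and derive a plain 2-D array from it. The sketch below is only an illustration under a few assumptions: it reads the data from an in-memory string via `io.StringIO` instead of test.csv, it uses an unquoted header line, it relies on `structured_to_unstructured` from `numpy.lib.recfunctions` (available in NumPy 1.16 and later), and the helper `column_of` is a hypothetical name.

```python
import io

import numpy as np
from numpy.lib import recfunctions as rfn

# Assumption: same values as test.csv, but with an unquoted header line.
CSV_TEXT = """A,B,C,D,E,F,timestamp
611.88243,9089.5601,5133.0,864.07514,1715.37476,765.22777,1.291111964948E12
611.88243,9089.5601,5133.0,864.07514,1715.37476,765.22777,1.291113113366E12
611.88243,9089.5601,5133.0,864.07514,1715.37476,765.22777,1.291120650486E12
"""

# Structured array: one record per row, fields addressable by name.
structured = np.genfromtxt(io.StringIO(CSV_TEXT), delimiter=",", names=True)

# Plain float matrix with one row per record and one column per field.
matrix = rfn.structured_to_unstructured(structured)


def column_of(matrix, structured, name):
    """Return the matrix column that corresponds to the named field."""
    return matrix[:, structured.dtype.names.index(name)]


print(structured.shape)   # shape of the structured array
print(matrix.shape)       # shape of the 2-D matrix
print(structured["A"])    # column vector by name
print(column_of(matrix, structured, "timestamp"))
```

The field names stay with the structured array, so `structured.dtype.names` serves as the lookup table from column name to matrix column index.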

python csv arrays numpy

Score: 9
Answers: 1
Views: 10k

Difference in tf-idf matrix values computed with scikit-learn and by hand

I am playing with scikit-learn to find tf-idf values.

I have a set of documents like:

D1 = "The sky is blue."
D2 = "The sun is bright."
D3 = "The sun in the sky is bright."

I want to create a matrix like this:

   Docs      blue       bright     sky        sun
   D1        tf-idf     0.0000000  tf-idf     0.0000000
   D2        0.0000000  tf-idf     0.0000000  tf-idf
   D3        0.0000000  tf-idf     tf-idf     tf-idf

So, my Python code is:

import nltk
import string

from sklearn.feature_extraction.text import TfidfVectorizer
from nltk.corpus import stopwords

train_set = ["sky is blue", "sun is bright", "sun in the sky is bright"]
stop_words …
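A frequent source of such differences is that TfidfVectorizer, with its default settings, uses a smoothed idf, ln((1 + n) / (1 + df)) + 1, and then L2-normalises each row, whereas textbook formulas often use log(n / df) without normalisation. The sketch below reproduces that smoothed variant by hand in plain Python so the two can be compared. The stop-word set is an assumption standing in for the truncated `stop_words` line, and the function names are hypothetical.

```python
import math

DOCUMENTS = {
    "D1": "The sky is blue.",
    "D2": "The sun is bright.",
    "D3": "The sun in the sky is bright.",
}
# Assumption: a stand-in for the truncated stop_words definition.
STOP_WORDS = {"the", "is", "in"}


def tokenize(text):
    """Lower-case, strip trailing punctuation, and drop stop words."""
    words = (word.strip(".,").lower() for word in text.split())
    return [word for word in words if word and word not in STOP_WORDS]


def smoothed_tfidf(documents):
    """Return (vocabulary, {doc: row}) using smoothed idf and L2 row norms."""
    tokens = {name: tokenize(text) for name, text in documents.items()}
    vocabulary = sorted({word for words in tokens.values() for word in words})
    n_docs = len(documents)
    idf = {}
    for term in vocabulary:
        df = sum(1 for words in tokens.values() if term in words)
        idf[term] = math.log((1 + n_docs) / (1 + df)) + 1
    matrix = {}
    for name, words in tokens.items():
        row = [words.count(term) * idf[term] for term in vocabulary]
        norm = math.sqrt(sum(value * value for value in row)) or 1.0
        matrix[name] = [value / norm for value in row]
    return vocabulary, matrix


vocabulary, matrix = smoothed_tfidf(DOCUMENTS)
print(vocabulary)
for name in sorted(matrix):
    print(name, ["%.7f" % value for value in matrix[name]])
```

Swapping the idf line for `math.log(n_docs / df)` and skipping the normalisation step gives the unsmoothed textbook variant, which makes the gap between the two sets of numbers easy to inspect.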

python machine-learning matrix tf-idf

Score: 6
Answers: 1
Views: 1763

Digital image processing

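I am very new to digital image processing and am stuck on the following problem: I need to write a C program that loads a ppm image file and performs line detection using a convolution kernel. Any kind of help would be greatly appreciated.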
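The convolution step can be sketched independently of the file format. The example below is written in Python rather than the C the question asks for, and is only meant to illustrate the technique: it applies the common 3x3 horizontal line-detection mask to a small hand-made grayscale grid. Loading the ppm file is left out, and the function name `convolve3x3` and the test image are hypothetical.

```python
# Common 3x3 mask that responds strongly to one-pixel-wide horizontal lines.
HORIZONTAL_LINE_KERNEL = [
    [-1, -1, -1],
    [2, 2, 2],
    [-1, -1, -1],
]


def convolve3x3(image, kernel):
    """Convolve a 2-D list of gray values with a 3x3 kernel.

    Border pixels have no full neighbourhood and are left at 0.
    """
    height, width = len(image), len(image[0])
    out = [[0] * width for _ in range(height)]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            total = 0
            for ky in range(3):
                for kx in range(3):
                    # Convolution flips the kernel, hence the 2 - k indices.
                    weight = kernel[2 - ky][2 - kx]
                    total += weight * image[y + ky - 1][x + kx - 1]
            out[y][x] = total
    return out


# Hypothetical 5x5 test image: black background, one bright horizontal line.
image = [[0] * 5 for _ in range(5)]
image[2] = [255] * 5

response = convolve3x3(image, HORIZONTAL_LINE_KERNEL)
for row in response:
    print(row)
```

Pixels on the line get a large positive response and their vertical neighbours a negative one, so thresholding the response is one simple way to mark line pixels. The same nested loops translate directly to C arrays.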

c image-processing

Score: 3
Answers: 1
Views: 4553

Writing the output of a for loop to multiple files

I am trying to read each line of a txt file and write each line to a different file. Suppose I have a text file as follows:

How are you? I am good.
Wow, that's great.
This is a text file.
......

Now, I want filename1.txt to contain:

How are you? I am good.

and filename2.txt to contain:

Wow, that's great.

and so on.

My code is:

#!/usr/bin/python

for i in range(1,4): # this range should increase with number of lines 
   with open('testdata.txt', 'r') as input:
       with open('filename%i.txt' %i, 'w') as output:
          for line in input:
            output.write(line)

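What I get is that every output file contains all the lines of the input file. I want each file to contain only one line, as described above.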
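In the code above the input file is reopened and read in full for every value of `i`, which would explain why each output file ends up with all the lines. A minimal sketch of one alternative is to open the input once and let `enumerate` number the lines, so the range no longer has to be known in advance. The function name `split_lines_into_files` and its parameters are hypothetical.

```python
import os
import tempfile


def split_lines_into_files(input_path, output_dir, prefix="filename"):
    """Write line N of input_path to <prefix>N.txt; return the created paths."""
    created = []
    with open(input_path, "r") as source:
        # enumerate numbers the lines from 1, one output file per line.
        for number, line in enumerate(source, start=1):
            target = os.path.join(output_dir, "%s%d.txt" % (prefix, number))
            with open(target, "w") as output:
                output.write(line)
            created.append(target)
    return created


if __name__ == "__main__":
    # Demo in a temporary directory so no existing files are touched.
    work_dir = tempfile.mkdtemp()
    sample = os.path.join(work_dir, "testdata.txt")
    with open(sample, "w") as handle:
        handle.write("How are you? I am good.\nWow, that's great.\n")
    for path in split_lines_into_files(sample, work_dir):
        print(path)
```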

python file-io file

Score: 2
Answers: 1
Views: 9817

Computing tf-idf of strings

I have 2 documents, doc1.txt and doc2.txt. The contents of these two documents are:

 #doc1.txt
 very good, very bad, you are great

 #doc2.txt
 very bad, good restaurent, nice place to visit

I want my corpus to be split on `,` so that my final DocumentTermMatrix becomes:

      terms
 docs       very good    very bad    you are great    good restaurent    nice place to visit
 doc1       tf-idf       tf-idf      tf-idf           0                  0
 doc2       0            tf-idf      0                tf-idf             tf-idf

I know how to compute the DocumentTermMatrix for individual words (using http://scikit-learn.org/stable/modules/feature_extraction.html), but I don't know how to compute a DocumentTermMatrix for whole strings in Python.
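One way to treat each comma-separated phrase as a single term is to replace the word tokenizer with a function that splits on commas. The sketch below shows only that tokenization step and the resulting term-count matrix in plain Python; the tf-idf weighting would be applied on top of these counts. The function name `split_on_commas` is hypothetical, and the document texts are inlined instead of being read from doc1.txt and doc2.txt.

```python
def split_on_commas(text):
    """Return the comma-separated phrases of text, stripped of whitespace."""
    return [term.strip() for term in text.split(",") if term.strip()]


# Assumption: contents of doc1.txt and doc2.txt inlined for the example.
documents = {
    "doc1": "very good, very bad, you are great",
    "doc2": "very bad, good restaurent, nice place to visit",
}

tokenized = {name: split_on_commas(text) for name, text in documents.items()}

# Vocabulary in first-seen order, so the columns follow the documents.
terms = []
for tokens in tokenized.values():
    for term in tokens:
        if term not in terms:
            terms.append(term)

# One row of phrase counts per document.
count_matrix = {
    name: [tokens.count(term) for term in terms]
    for name, tokens in tokenized.items()
}

print(terms)
for name, row in count_matrix.items():
    print(name, row)
```

scikit-learn's TfidfVectorizer accepts a callable through its `tokenizer` parameter, so a function like `split_on_commas` could be passed there to obtain the weighted matrix directly.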

python tf-idf scikit-learn

Score: 0
Answers: 1
Views: 3219