我正在尝试使用Eclipse作为远程编辑器.我想编辑的文件是一个SFTP服务器,我通常连接到使用私人.pem密钥(这是一个OpenStack的节点).我是否可以通过Remote System Explorer最终用户运行时 Eclipse插件阅读Eclipse本身支持FTP和SSH .如何配置它以使用私钥连接到SFTP .pem?
我没有任何密码,因为我使用私钥证明我的身份.pem,所以当我尝试在没有输入任何密码的情况下SSH到服务器时:

它抱怨说:

任何的想法?
如何最佳地(就财务成本而言)使用boto清空DynamoDB表?(就像我们可以在SQL中使用一条truncate语句一样。)
boto.dynamodb2.table.delete()或boto.dynamodb2.layer1.DynamoDBConnection.delete_table()删除整个表格,而boto.dynamodb2.table.delete_item() boto.dynamodb2.table.BatchTable.delete_item()只删除指定的项目。
在scikit-learn中,是否有任何内置函数可以使二进制概率分类器获得最大的准确性?
例如,要获得我最大的F1分数:
# AUCPR
precision, recall, thresholds = sklearn.metrics.precision_recall_curve(y_true, y_score)
auprc = sklearn.metrics.auc(recall, precision)
max_f1 = 0
for r, p, t in zip(recall, precision, thresholds):
if p + r == 0: continue
if (2*p*r)/(p + r) > max_f1:
max_f1 = (2*p*r)/(p + r)
max_f1_threshold = t
Run Code Online (Sandbox Code Playgroud)
我可以用类似的方式计算最大精度:
accuracies = []
thresholds = np.arange(0,1,0.1)
for threshold in thresholds:
y_pred = np.greater(y_score, threshold).astype(int)
accuracy = sklearn.metrics.accuracy_score(y_true, y_pred)
accuracies.append(accuracy)
accuracies = np.array(accuracies)
max_accuracy = accuracies.max()
max_accuracy_threshold = thresholds[accuracies.argmax()]
Run Code Online (Sandbox Code Playgroud)
但我想知道是否有任何内置功能。
scipy.sparse.hstack((1, [2]))并且scipy.sparse.hstack((1, [2]))工作得很好,但不是scipy.sparse.hstack(([1], [2])).为什么会这样?
以下是我系统上发生的情况:
C:\Anaconda>python
Python 2.7.10 |Anaconda 2.3.0 (64-bit)| (default, May 28 2015, 16:44:52) [MSC v.
1500 64 bit (AMD64)] on win32
>>> import scipy.sparse
>>> scipy.sparse.hstack((1, [2]))
<1x2 sparse matrix of type '<type 'numpy.int32'>'
with 2 stored elements in COOrdinate format>
>>> scipy.sparse.hstack((1, 2))
<1x2 sparse matrix of type '<type 'numpy.int32'>'
with 2 stored elements in COOrdinate format>
>>> scipy.sparse.hstack(([1], [2]))
Traceback (most recent call last):
File "<stdin>", line 1, in <module> …Run Code Online (Sandbox Code Playgroud) 我使用sklearn.feature_extraction.text.CountVectorizer来计算n-gram.例:
import sklearn.feature_extraction.text # FYI http://scikit-learn.org/stable/install.html
ngram_size = 4
string = ["I really like python, it's pretty awesome."]
vect = sklearn.feature_extraction.text.CountVectorizer(ngram_range=(ngram_size,ngram_size))
vect.fit(string)
print('{1}-grams: {0}'.format(vect.get_feature_names(), ngram_size))
Run Code Online (Sandbox Code Playgroud)
输出:
4-grams: [u'like python it pretty', u'python it pretty awesome', u'really like python it']
Run Code Online (Sandbox Code Playgroud)
删除标点符号:如何将它们作为单独的标记包含在内?
我想知道如何从命令行使用Apache cTAKES.
例如:
\apache-ctakes-3.2.2-bin\apache-ctakes-3.2.2\desc\ctakes-clinical-pipeline\desc\analysis_engine\AggregatePlaintextUMLSProcessor.xml如何使用命令行获取分析引擎的输出(即注释)(即不使用UIMA CAS Visual Debugger或Collection Processing Engine等图形用户界面)?我更喜欢使用提供的JAR文件而不是编译代码.
问题很简单,但我无法在cTAKES的README或Confluence中找到这些信息 .
使用时sklearn.cross_decomposition.PLSRegression:
import numpy as np
import sklearn.cross_decomposition
pls2 = sklearn.cross_decomposition.PLSRegression()
xx = np.random.random((5,5))
yy = np.zeros((5,5) )
yy[0,:] = [0,1,0,0,0]
yy[1,:] = [0,0,0,1,0]
yy[2,:] = [0,0,0,0,1]
#yy[3,:] = [1,0,0,0,0] # Uncommenting this line solves the issue
pls2.fit(xx, yy)
Run Code Online (Sandbox Code Playgroud)
我明白了:
C:\Anaconda\lib\site-packages\sklearn\cross_decomposition\pls_.py:44: RuntimeWarning: invalid value encountered in divide
x_weights = np.dot(X.T, y_score) / np.dot(y_score.T, y_score)
C:\Anaconda\lib\site-packages\sklearn\cross_decomposition\pls_.py:64: RuntimeWarning: invalid value encountered in less
if np.dot(x_weights_diff.T, x_weights_diff) < tol or Y.shape[1] == 1:
C:\Anaconda\lib\site-packages\sklearn\cross_decomposition\pls_.py:67: UserWarning: Maximum number of iterations reached
warnings.warn('Maximum number …Run Code Online (Sandbox Code Playgroud) 如何查看TensorFlow会话的默认配置选项?
我知道config初始化会话时可以通过参数传递配置选项,例如:
# Launch the graph in a session that allows soft device placement and
# logs the placement decisions.
sess = tf.Session(config=tf.ConfigProto(allow_soft_placement=True,
log_device_placement=True))
Run Code Online (Sandbox Code Playgroud)
但是我不知道如何查看默认配置选项。 tensorflow/core/protobuf/config.proto看起来不像默认选项。
我正在尝试将R中保存为RData的数据帧导入到pandas数据帧中.我怎么能这样做?我没有成功尝试使用rpy2如下:
import pandas as pd
from rpy2.robjects import r
from rpy2.robjects import pandas2ri
pandas2ri.activate()
# I use iris for convenience but I could have done r.load('my_data.RData')
print(r.data('iris'))
print(r['iris'].head())
print(type(r.data('iris')))
print(pandas2ri.ri2py_dataframe(r.data('iris')))
print(pandas2ri.ri2py(r.data('iris')))
print(pd.DataFrame(r.data('iris')))
Run Code Online (Sandbox Code Playgroud)
输出:
[1] "iris"
Sepal.Length Sepal.Width Petal.Length Petal.Width Species
1 5.1 3.5 1.4 0.2 setosa
2 4.9 3.0 1.4 0.2 setosa
3 4.7 3.2 1.3 0.2 setosa
4 4.6 3.1 1.5 0.2 setosa
5 5.0 3.6 1.4 0.2 setosa
<class 'rpy2.robjects.vectors.StrVector'>
0 1 2 3
0 i r i …Run Code Online (Sandbox Code Playgroud) 如何在pandas的数据框中检索k个最高值?
例如,给定DataFrame:
b d e
Utah 1.624345 -0.611756 -0.528172
Ohio -1.072969 0.865408 -2.301539
Texas 1.744812 -0.761207 0.319039
Oregon -0.249370 1.462108 -2.060141
Run Code Online (Sandbox Code Playgroud)
生成:
import numpy as np
import pandas as pd
np.random.seed(1)
frame = pd.DataFrame(np.random.randn(4, 3), columns=list('bde'),
index=['Utah', 'Ohio', 'Texas', 'Oregon'])
print(frame)
Run Code Online (Sandbox Code Playgroud)
数据框中的3个最高值是: