meo*_*ver 5 python numpy python-3.x pandas
I am getting the following error:
TypeError                                 Traceback (most recent call last)
C:\Users\levanim\Desktop\Levani Predictive\cosinesimilarity1.py in <module>()
     39
     40 for i in meowmix_nearest_neighbors.index:
---> 41     top_ten = pd.DataFrame(similarity_matrix.ix[i,]).sort([i], ascending=False[1:6]).index.values
     42     meowmix_nearest_neighbors.ix[i,:] = top_ten
     43
TypeError: 'bool' object is not subscriptable
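It comes from the code below. I am new to Python and not sure how to change the syntax (if this is a Python 3 syntax issue). Has anyone run into this? I think it has to do with the ascending=False[1:6] part, and I have spent some time banging my head against the wall. Hopefully it is a simple fix, but I don't know enough to find it.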
import numpy as np
import pandas as pd
from scipy.spatial.distance import cosine

enrollments = pd.read_csv(r'C:\Users\levanim\Desktop\Levani Predictive\smallsample.csv')
meowmix = enrollments.fillna(0)
meowmix.ix[0:5,0:5]

def getCosine(x,y) :
    cosine = np.sum(x*y) / (np.sqrt(np.sum(x*x)) * np.sqrt(np.sum(y*y)))
    return cosine

print("done creating cosine function")

similarity_matrix = pd.DataFrame(index=meowmix.columns, columns=meowmix.columns)
similarity_matrix = similarity_matrix.fillna(np.nan)
similarity_matrix.ix[0:5,0:5]
print("done creating a matrix placeholder")

for i in similarity_matrix.columns:
    for j in similarity_matrix.columns:
        similarity_matrix.ix[i,j] = getCosine(meowmix[i].values, meowmix[j].values)

print("done looping through each column and filling in placeholder with cosine similarities")

meowmix_nearest_neighbors = pd.DataFrame(index=meowmix.columns,
                                         columns=['top_'+str(i+1) for i in range(5)])
meowmix_nearest_neighbors = meowmix_nearest_neighbors.fillna(np.nan)
print("done creating a nearest neighbor placeholder for each item")

for i in meowmix_nearest_neighbors.index:
    top_ten = pd.DataFrame(similarity_matrix.ix[i,]).sort([i], ascending=False[1:6]).index.values
    meowmix_nearest_neighbors.ix[i,:] = top_ten

print("done creating the top 5 neighbors for each item")
meowmix_nearest_neighbors.head()
Instead of
top_ten = pd.DataFrame(similarity_matrix.ix[i,]).sort([i], ascending=False[1:6]).index.values
use
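top_ten = pd.DataFrame(similarity_matrix.ix[i,]).sort([i], ascending=False)[1:6].index.values
(That is, insert a ) right after False.)
False is the value of the ascending parameter of sort() and means "do not sort in ascending order", i.e. sort in descending order. The argument list of sort() therefore has to be closed with ) right after False.
[1:6] is then a slice applied to the sorted frame that sort() returns, not to False. It skips row 0 (normally the item itself, since an item is most similar to itself) and keeps the next five rows, whose index values become the five nearest neighbors.
Note that this line targets the older pandas used in the question. DataFrame.sort() and .ix have since been removed from pandas; sort_values() and .loc/.iloc are their replacements.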
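For readers on a current pandas release, the same loop can be sketched with sort_values() and .loc/.iloc. This is only an illustration under assumptions: the small random similarity_matrix and the names items, nearest and ranked are hypothetical stand-ins, not the asker's data or variable names.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the question's similarity_matrix:
# a small symmetric item-by-item table with 1.0 on the diagonal.
items = ["a", "b", "c", "d", "e", "f", "g"]
rng = np.random.default_rng(0)
raw = rng.random((len(items), len(items)))
sym = (raw + raw.T) / 2
np.fill_diagonal(sym, 1.0)
similarity_matrix = pd.DataFrame(sym, index=items, columns=items)

# Placeholder for the five nearest neighbors of each item.
nearest = pd.DataFrame(index=items,
                       columns=["top_" + str(k + 1) for k in range(5)])

for i in nearest.index:
    # Sort this item's similarities in descending order ...
    ranked = similarity_matrix.loc[i].sort_values(ascending=False)
    # ... then slice the sorted result: skip position 0 (the item itself)
    # and keep the labels of the next five rows.
    nearest.loc[i, :] = ranked.iloc[1:6].index.values

print(nearest.head())
```

The slice sits on the value returned by sort_values(), which is the point of the fix: the closing parenthesis comes before [1:6], not after it.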
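Right, you can't do False[1:6]. False is a boolean, which means it can only be one of two values (False or True), and a boolean cannot be indexed or sliced.
Changing False[1:6] to plain False gets rid of the error. If you still want rows 1 through 5, apply [1:6] to the value that sort() returns instead.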
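The difference can be reproduced without pandas. The sketch below uses a made-up list of scores (values, flag and top_five are illustrative names only): slicing a bool raises the same TypeError as in the traceback, while slicing the sorted result works.

```python
values = [0.2, 0.9, 0.5, 0.7, 0.1, 0.4, 0.8]  # made-up similarity scores

flag = False
try:
    flag[1:6]  # a bool supports neither indexing nor slicing
except TypeError as exc:
    message = str(exc)
    print(message)  # 'bool' object is not subscriptable

# Slice the sorted list instead of the keyword argument's value.
top_five = sorted(values, reverse=True)[1:6]
print(top_five)  # [0.8, 0.7, 0.5, 0.4, 0.2]
```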
The [1:6] construct is meant for lists (and other sequences). For example, if you have:
theList = ["a", "b", "c", "d", "e", "f", "g", "h", "i", "j", "k", "l"]
print(theList)       # prints the whole list
print(theList[1])    # "b"
print(theList[1:6])  # ['b', 'c', 'd', 'e', 'f']
In Python this is called "slicing", and it is very useful.
You can also do the following:
print(theList[6:])  # everything in the list after "f"
print(theList[:6])  # everything in the list up to and including "f"
I encourage you to play with this in a Jupyter Notebook and, of course, to read the documentation.
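A few more slices to try in a notebook cell. This is a small sketch with illustrative variable names; it shows that negative indices and a step are allowed, and that strings and tuples can be sliced the same way as lists.

```python
letters = list("abcdefghijkl")

last_three = letters[-3:]            # negative index counts from the end
every_other = letters[::2]           # third slice field is the step
from_string = "abcdefghijkl"[1:6]    # strings slice like lists
from_tuple = tuple(letters)[:3]      # so do tuples

print(last_three)   # ['j', 'k', 'l']
print(every_other)  # ['a', 'c', 'e', 'g', 'i', 'k']
print(from_string)  # bcdef
print(from_tuple)   # ('a', 'b', 'c')
```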