When I run the mean_acc() method in my program, I get a "% (min_groups, self.n_splits)), Warning)" error...
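import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import MultinomialNB
from sklearn.svm import LinearSVC

def mean_acc():
    # features and labels are assumed to be defined elsewhere in the program
    models = [
        RandomForestClassifier(n_estimators=200, max_depth=3, random_state=0),
        LinearSVC(),
        MultinomialNB(),
        LogisticRegression(random_state=0)]
    CV = 6  # number of cross-validation folds
    entries = []
    for model in models:
        model_name = model.__class__.__name__
        accuracies = cross_val_score(model, features, labels, scoring='accuracy', cv=CV)
        for fold_idx, accuracy in enumerate(accuracies):
            entries.append((model_name, fold_idx, accuracy))
    cv_df = pd.DataFrame(entries, columns=['model_name', 'fold_idx', 'accuracy'])
    print(cv_df.groupby('model_name').accuracy.mean())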
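These are the errors shown when I run the program with the mean_acc() method. How can I resolve the errors below? Please help me find what in the code above causes them, thanks!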
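One reading of this warning: when cv is an integer and the model is a classifier, cross_val_score splits the data into stratified folds, and scikit-learn warns when some class in labels has fewer members than the number of folds (here CV = 6, while the least populated class has 1 member). Below is a minimal sketch, using only the standard library and hypothetical toy labels, of finding the samples whose class is populated enough. The helper populated_indices is illustrative and not part of scikit-learn.

```python
from collections import Counter

def populated_indices(labels, n_splits):
    """Return indices of samples whose class has at least n_splits members."""
    counts = Counter(labels)
    return [i for i, label in enumerate(labels) if counts[label] >= n_splits]

# Hypothetical toy labels: class "c" has a single member
labels = ["a", "a", "a", "b", "b", "b", "c"]
kept = populated_indices(labels, n_splits=3)
print(kept)  # prints [0, 1, 2, 3, 4, 5]
```

The returned indices could then be used to subset features and labels before calling cross_val_score. Lowering CV, or collecting more samples for the rare classes, are alternatives.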
% (min_groups, self.n_splits)), Warning)
C:\Users\L31307\PycharmProjects\FYP\venv\lib\site-packages\sklearn\model_selection\_split.py:626: Warning: The least populated class in y has only 1 members, which is too few. The minimum number of members in any …

Below is the code I use to try to read the text from a text file, in a method called check_keyword():
def check_keyword():
    # The file name must be a string literal
    with open("unknown.txt", "r") as text_file:
        unknown = text_file.readlines()
    return unknown
This is how I call the method:
dataanalysis.category_analysis.check_keyword()
The text in the text file:
Hello this is a new text file
The method above produces no output :((
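A likely reason nothing appears: check_keyword() returns the list of lines, but the call shown discards that return value, so nothing is ever printed. Below is a minimal sketch of printing the result. The path parameter and the temporary stand-in for unknown.txt are assumptions added so the example runs on its own.

```python
import os
import tempfile

def check_keyword(path):
    """Return the lines of the text file at `path` (path is an added parameter)."""
    with open(path, "r") as text_file:
        return text_file.readlines()

# Hypothetical stand-in for unknown.txt, created in a temporary directory
tmp_dir = tempfile.mkdtemp()
path = os.path.join(tmp_dir, "unknown.txt")
with open(path, "w") as f:
    f.write("Hello this is a new text file\n")

lines = check_keyword(path)  # calling the function alone displays nothing
print(lines)  # prints ['Hello this is a new text file\n']
```

In the original program the equivalent would be to wrap the call in print(), or to assign the return value to a variable and use it.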
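When I run the code below, it gives me an error: AttributeError: 'float' object has no attribute 'split'.
I would like to know why this error occurs. Please help me look at the code below, thanks :((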
import pandas as pd

pd.options.display.max_colwidth = 10000
# output (the CSV path) and stop_words are assumed to be defined elsewhere
df = pd.read_csv(output, sep='|')

def text_processing(df):
    # === Lower case ===
    # First step is to transform comments into lower case
    df['content'] = df['content'].apply(lambda x: " ".join(x.lower() for x in x.split() if x not in stop_words))
    # === Removal of stop words ===
    df['content'] = df['content'].apply(lambda x: " ".join(x for x in x.split() if x not in stop_words))
    # === Removal of Punctuation ===
    df['content'] = df['content'].str.replace(r'[^\w\s]', '', regex=True)
    # === Removal of Numeric ===
    df['content'] = df['content'].str.replace(r'[0-9]', '', regex=True) …
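A probable cause of the AttributeError: pandas reads empty cells as NaN, which is a float, so x.split() fails on rows where content is missing. Below is a minimal sketch in plain Python of a cleaning function that guards against non-string values. The name clean_text and the small stop_words set are hypothetical stand-ins for the ones used in the question.

```python
# Hypothetical stop-word set, standing in for the one used in the question
stop_words = {"is", "a", "the"}

def clean_text(value):
    """Lower-case the text and drop stop words; treat missing values as empty."""
    if not isinstance(value, str):
        return ""  # NaN from an empty CSV cell is a float, not a str
    return " ".join(w for w in value.lower().split() if w not in stop_words)

rows = ["This is a Test", float("nan")]
cleaned = [clean_text(v) for v in rows]
print(cleaned)  # prints ['this test', '']
```

With a DataFrame, the same function could be passed to df['content'].apply(clean_text). Calling df['content'].fillna('') before the first apply would be another way to avoid the missing values.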