我正在使用 SMOTE 重新采样我的数据(多类)。
sm = SMOTE(random_state=1)
X_res, Y_res = sm.fit_resample(X_train, Y_train)
Run Code Online (Sandbox Code Playgroud)
但是,我收到此属性错误。任何人都可以帮忙吗?
我最近在 jupyter 中安装了 imblearn 包
!pip show imbalanced-learn
Run Code Online (Sandbox Code Playgroud)
但我无法导入这个包。
from tensorflow.keras import backend
from imblearn.over_sampling import SMOTE
Run Code Online (Sandbox Code Playgroud)
我收到以下错误
---------------------------------------------------------------------------
ModuleNotFoundError Traceback (most recent call last)
<ipython-input-20-f19c5a0e54af> in <module>
1 # from sklearn.utils import resample
2 from tensorflow.keras import backend
----> 3 from imblearn.over_sampling import SMOTE
4
5
~/.virtualenvs/p3/lib/python3.6/site-packages/imblearn/__init__.py in <module>
32 Module which allowing to create pipeline with scikit-learn estimators.
33 """
---> 34 from . import combine
35 from . import ensemble
36 from . import exceptions
~/.virtualenvs/p3/lib/python3.6/site-packages/imblearn/combine/__init__.py in …Run Code Online (Sandbox Code Playgroud) 我正在尝试从swagger(使用drf-yasg)进行上传文件输入,但是当我使用MultiPartParser类时,它给了我以下错误:
drf_yasg.errors.SwaggerGenerationError: FileField is supported only in a formData Parameter or response Schema
Run Code Online (Sandbox Code Playgroud)
我的看法:
drf_yasg.errors.SwaggerGenerationError: FileField is supported only in a formData Parameter or response Schema
Run Code Online (Sandbox Code Playgroud)
我的序列化器:
class AddExperience(generics.CreateAPIView):
parser_classes = [MultiPartParser]
permission_classes = [IsAuthenticated]
serializer_class = DoctorExperienceSerializer
Run Code Online (Sandbox Code Playgroud)
我也尝试过FormParser,但它仍然给我同样的错误。另外:FileUploadParser解析器但它的工作原理如下JsonParser:
我的文字如图所示:
list1 = ["My name is xyz", "My name is pqr", "I work in abc"]
Run Code Online (Sandbox Code Playgroud)
以上将是使用 kmeans 聚类文本的训练集。
list2 = ["My name is xyz", "I work in abc"]
Run Code Online (Sandbox Code Playgroud)
以上是我的测试集。
我构建了一个矢量化器和模型,如下所示:
vectorizer = TfidfVectorizer(min_df = 0, max_df=0.5, stop_words = "english", charset_error = "ignore", ngram_range = (1,3))
vectorized = vectorizer.fit_transform(list1)
km=KMeans(n_clusters=2, init='k-means++', n_init=10, max_iter=1000, tol=0.0001, precompute_distances=True, verbose=0, random_state=None, copy_x=True, n_jobs=1)
km.fit(vectorized)
Run Code Online (Sandbox Code Playgroud)
如果我尝试预测“list2”测试集的集群:
km.predict(list2)
Run Code Online (Sandbox Code Playgroud)
我收到以下错误:
ValueError: Incorrect number of features. Got 2 features, expected 5
Run Code Online (Sandbox Code Playgroud)
有人告诉我用它Pipeline来解决这个问题。所以我写了下面的代码:
pipe = Pipeline([('vect', vectorizer), ('vectorized', …Run Code Online (Sandbox Code Playgroud) python machine-learning k-means scikit-learn scikit-learn-pipeline
当我导入 Excel 文件时,列中的某些数字是浮点型,有些不是。如何将所有内容转换为浮点数?里面的空间3 000,00给我带来了问题。
df['column']:
column
0 3 000,00
1 156.00
2 0
Run Code Online (Sandbox Code Playgroud)
我在尝试:
df['column']:
column
0 3 000,00
1 156.00
2 0
Run Code Online (Sandbox Code Playgroud)
但它不起作用。我会在之后做.astype(float),但无法到达那里。有什么解决办法吗?1已经是一个浮点数,但是0是一个字符串。
我无法导入 Transformer 类的管道函数,因为我的 jupyter 内核一直死机。尝试使用 Transformer-4.15.0 和 4.16.2。有人遇到过这个问题吗?
我尝试将类导入到新笔记本中,如图所示,它不断杀死内核。

python jupyter-notebook jupyter-lab huggingface-transformers
我试过:
from lazypredict.Supervised import LazyClassifier
Run Code Online (Sandbox Code Playgroud)
但得到以下回溯:
---------------------------------------------------------------------------
ImportError Traceback (most recent call last)
<ipython-input-1-f518cae57501> in <module>
10 from sklearn.linear_model import LogisticRegression
11 from sklearn.ensemble import RandomForestClassifier
---> 12 from lazypredict.Supervised import LazyClassifier
13 from sklearn.model_selection import GridSearchCV
14 from sklearn.metrics import accuracy_score
~\AppData\Roaming\Python\Python38\site-packages\lazypredict\Supervised.py in <module>
14 from sklearn.preprocessing import StandardScaler, OneHotEncoder, OrdinalEncoder
15 from sklearn.compose import ColumnTransformer
---> 16 from sklearn.utils.testing import all_estimators
17 from sklearn.base import RegressorMixin
18 from sklearn.base import ClassifierMixin
S:\anaconda\lib\site-packages\sklearn\utils\testing.py in <module>
5 from . import …Run Code Online (Sandbox Code Playgroud) 我有点困惑,我想使用 texthero 库进行一些 PCA 分析。但是当我尝试运行我的代码时:
import texthero as hero
import pandas as pd
df['pca']=(df['clean_tweet'].pipe(hero.clean).pipe(hero.do_tfidf).pipe(hero.do_pca))
hero.scatterplot(df, col='pca', color='topic', title="PCA BBC Sport news")
Run Code Online (Sandbox Code Playgroud)
我收到错误:
ModuleNotFoundError: No module named 'gensim.sklearn_api
Run Code Online (Sandbox Code Playgroud)
但是当我输入 !pip 显示 gensim 时。我有
Name: gensim
Version: 4.0.1
Summary: Python framework for fast Vector Space Modelling
Home-page: http://radimrehurek.com/gensim
Run Code Online (Sandbox Code Playgroud) 所以我一直在致力于fake_useragent一个网络抓取项目:
from fake_useragent import UserAgent
ua = UserAgent()
headers = {
"User-Agent":ua.random
}
Run Code Online (Sandbox Code Playgroud)
但我最近在 Python 3 中收到了这个错误:
Error occurred during loading data. Trying to use cache server file https://useragent.melroy.org/cache.json
Traceback (most recent call last):
File "/usr/local/lib/python3.10/dist-packages/fake_useragent/utils.py", line 64, in get
urlopen(
File "/usr/lib/python3.10/urllib/request.py", line 216, in urlopen
return opener.open(url, data, timeout)
File "/usr/lib/python3.10/urllib/request.py", line 525, in open
response = meth(req, response)
File "/usr/lib/python3.10/urllib/request.py", line 634, in http_response
response = self.parent.error(
File "/usr/lib/python3.10/urllib/request.py", line 563, in error
return self._call_chain(*args) …Run Code Online (Sandbox Code Playgroud) 我已经重新安装了最新的 scikit-learn 和不平衡学习。我还检查了所有其他库,以确保它们与不平衡学习兼容。
我只想运行一个简单的RandomOverSample(),但我收到以下导入错误消息:
import imblearn
from imblearn.over_sampling import RandomOverSampler
ros = RandomOverSampler()
X_ros, y_ros = ros.fit_sample(x, y)
Run Code Online (Sandbox Code Playgroud)
错误信息:
---------------------------------------------------------------------------
ImportError Traceback (most recent call last)
<ipython-input-122-0bf7409a8688> in <module>
----> 1 import imblearn
2 from imblearn.over_sampling import RandomOverSampler
3
4 ros = RandomOverSampler()
5 X_ros, y_ros = ros.fit_sample(x, y)
~/.local/lib/python3.5/site-packages/imblearn/__init__.py in <module>
32 Module which allowing to create pipeline with scikit-learn estimators.
33 """
---> 34 from . import combine
35 from . import ensemble
36 from . …Run Code Online (Sandbox Code Playgroud) python ×9
scikit-learn ×5
imblearn ×3
python-3.x ×3
pandas ×2
django ×1
drf-yasg ×1
gensim ×1
github ×1
graph ×1
jupyter-lab ×1
k-means ×1
linux ×1
string ×1
themes ×1
user-agent ×1