I want to hide this warning:
UserWarning: pandas only support SQLAlchemy connectable(engine/connection) ordatabase string URI or sqlite3 DBAPI2 connectionother DBAPI2 objects are not tested, please consider using SQLAlchemy
I have already tried:
import warnings
warnings.simplefilter(action='ignore', category=UserWarning)
import pandas
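But the warning is still displayed.
My Python script reads data from a database. I use pandas.read_sql for the SQL query and psycopg2 for the database connection.
I would also like to know which line triggers the warning.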
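One way to approach both parts of the question with only the standard-library warnings module is sketched below. noisy_read is a hypothetical stand-in for the pandas.read_sql call (it only emits a UserWarning whose text starts like the real one), so the function name and the message pattern are assumptions for illustration, not pandas API.

```python
import warnings


def noisy_read():
    # Hypothetical stand-in for pandas.read_sql with a psycopg2 connection:
    # it only emits a UserWarning whose text starts like the real one.
    warnings.warn("pandas only support SQLAlchemy connectable", UserWarning)
    return "rows"


# 1) Find out where the warning comes from: record it instead of printing it.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    noisy_read()
# File and line number the warning is attributed to.
origin = (caught[0].filename, caught[0].lineno)

# 2) Silence only this warning, and only around the call that raises it.
#    record=True is used here just so the sketch can check nothing got through.
with warnings.catch_warnings(record=True) as silenced:
    warnings.simplefilter("always")
    warnings.filterwarnings(
        "ignore",
        message="pandas only support SQLAlchemy",  # regex matched at message start
        category=UserWarning,
    )
    result = noisy_read()

print(origin, len(silenced))
```

Another option for locating the trigger is to run the script with python -W error::UserWarning, which turns the warning into an exception so the traceback shows the line involved.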
I want to do a one-sample test for proportion in Python. I found this document, "one sample proportion ztest example", but I don't understand how to use it. For example, what are count and nobs? Of the two examples, example 1 gives a single number for count and nobs, whereas example 2 gives two numbers.
For the result, I'd like to know the p-value for the hypothesis that the event rate is higher than 60%.
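In tests of this kind, count is usually the number of times the event happened and nobs is the number of trials; one number each describes a single sample, while two numbers each describe two samples being compared. The sketch below is a hand-rolled normal-approximation z-test, not the function from the linked document, and the data (70 events in 100 trials) is made up. It uses the null proportion in the standard error, so a library routine with a different variance estimate may give slightly different numbers.

```python
import math


def one_sample_prop_ztest(count, nobs, p0):
    """One-sided z-test of H0: p = p0 against H1: p > p0 (normal approximation)."""
    p_hat = count / nobs                   # observed proportion
    se = math.sqrt(p0 * (1 - p0) / nobs)   # standard error under H0
    z = (p_hat - p0) / se
    p_value = 0.5 * math.erfc(z / math.sqrt(2))  # upper-tail normal probability
    return z, p_value


# Made-up data: the event happened in 70 of 100 trials; is the rate above 60%?
z, p_value = one_sample_prop_ztest(count=70, nobs=100, p0=0.6)
print(f"z = {z:.3f}, p-value = {p_value:.4f}")
```

A small p-value here would be read as evidence that the true rate exceeds 60%.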
Example1
>>> count = 5
>>> nobs = 83
>>> value = …

I want to convert a dataframe to fasttext format
My dataframe
text                                                          label
Fan bake vs bake                                              baking
What's the purpose of a bread box?                            storage-method
Michelin Three Star Restaurant; but if the chef is not there  restaurant
fasttext format
__label__baking Fan bake vs bake
__label__storage-method What's the purpose of a bread box?
__label__restaurant Michelin Three Star Restaurant; but if the chef is not there
I tried df['label'].apply(lambda x: '__label__' + x).add_suffix(df['text']), but it did not work as I expected. How should I change my code?
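A minimal sketch of one possible approach, assuming both columns hold plain strings: Series.add_suffix renames index labels rather than appending to the values, so element-wise string concatenation is used instead. The two-row dataframe is a shortened copy of the sample above.

```python
import pandas as pd

df = pd.DataFrame({
    "text": ["Fan bake vs bake", "What's the purpose of a bread box?"],
    "label": ["baking", "storage-method"],
})

# Element-wise string concatenation: prefix, label, a space, then the text.
fasttext_lines = "__label__" + df["label"] + " " + df["text"]

# One example per line, which is the layout shown above.
output = "\n".join(fasttext_lines)
print(output)
```

The joined string could then be written to a text file for training.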
Hi, I have a dataframe column that contains text. I want to use a fasttext model to make predictions. I can do this by passing an array of text to the fasttext model.
import fasttext
import pandas as pd  # needed for pd.DataFrame below

d = {'id':[1, 2, 3], 'name':['a', 'b', 'c']}
df = pd.DataFrame(data=d)
I removed "\n" from the series
name_list = df['name'].tolist()
name_list = [name.strip() for name in name_list]
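and made the prediction with model.predict(name_list)
However, I got ValueError: predict processes one line at a time (remove '\n')
There is no '\n' in my list, and '\n' in name_list returns False
I also found a post about a similar problem, but I still get the same error.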
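Two details of the checks above may be worth a second look: strip() only removes whitespace at the ends of a string, and '\n' in name_list tests whether some element equals '\n', not whether any string contains one. The sketch below illustrates this with made-up strings; clean_for_predict is a hypothetical helper, the str() cast is an assumption for non-string values, and none of this is confirmed to be the cause of the ValueError.

```python
def clean_for_predict(texts):
    """Return one single-line string per input item (hypothetical helper)."""
    # Replace every line break inside the text with a space, then trim the ends.
    return [" ".join(str(t).splitlines()).strip() for t in texts]


name_list = ["a\n", "b", "first line\nsecond line"]

# Membership on a list compares whole elements, so this is False even though
# two of the strings contain a newline.
print("\n" in name_list)

# Checking each string is what actually detects embedded newlines.
print(any("\n" in name for name in name_list))

cleaned = clean_for_predict(name_list)
print(cleaned)
```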
predictions = []
for line in df['name']:
    pred_label = model.predict(line, k=-1, threshold=0.5)[0][0]
    predictions.append(pred_label)
df['prediction'] = predictions

I want to remove dataframe rows whose index is larger than 13491.
I tried
df.drop(df.index > [13491])
but received this error:
KeyError: 'labels [False False False ... True True True] not contained in axis'
This one works fine
df= df[df.index < 13492]
But how do I remove the filtered rows from the dataframe?
Can someone give me some suggestions? Thank you in advance!
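A minimal sketch of one way to do this with drop, assuming an integer index: drop expects index labels, so the boolean mask is first used to select the labels to remove. The six-row dataframe and the cutoff of 3 are stand-ins for the real data and 13491.

```python
import pandas as pd

df = pd.DataFrame({"value": range(6)})  # small stand-in; index is 0..5
cutoff = 3                              # plays the role of 13491

# drop() expects index labels, so select the labels with the boolean mask first.
trimmed = df.drop(df.index[df.index > cutoff])

# The same rows can be removed in place instead of reassigning.
df.drop(df.index[df.index > cutoff], inplace=True)

print(trimmed.index.tolist(), df.index.tolist())
```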
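Hi, I have a table like the one shown below, and I want to create cohorts by grouping on date_contact and user_id. I get an error message saying that "cohort_month" is not a valid name.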

SELECT user_id, CONVERT(VARCHAR(7), min(date_contact), 120) AS cohort_month
from cohort
group by user_id, cohort_month
Any suggestions? Thanks!
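A column alias defined in SELECT is generally not visible to GROUP BY in SQL Server, and because cohort_month is built from the aggregate MIN(), grouping by user_id alone is likely enough. The sketch below shows that shape of query using Python's built-in sqlite3 as a stand-in for SQL Server; the rows are made up, and strftime('%Y-%m', ...) replaces CONVERT(VARCHAR(7), ..., 120), which SQLite does not have.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cohort (user_id INTEGER, date_contact TEXT)")
conn.executemany(
    "INSERT INTO cohort VALUES (?, ?)",
    [(1, "2020-01-15"), (1, "2020-03-02"), (2, "2020-02-20")],  # made-up rows
)

# Group only by user_id; cohort_month is computed from the aggregate MIN(),
# so it does not belong in GROUP BY.
rows = conn.execute(
    """
    SELECT user_id, strftime('%Y-%m', MIN(date_contact)) AS cohort_month
    FROM cohort
    GROUP BY user_id
    ORDER BY user_id
    """
).fetchall()
print(rows)
```

In the original query, the corresponding change would be to keep the CONVERT expression and remove cohort_month from the GROUP BY clause.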
Hi, I have a dataframe, and I want to select the column with the highest percentage from a frequency table.
import pandas as pd

d = {'c1':['a', 'a', 'b', 'b', 'c', 'c'], 'c2':['Low', 'High', 'Low', 'High', 'High', 'High']}
dd = pd.DataFrame(data=d)
dd.groupby('c1')['c2'].value_counts(normalize=True).mul(100)
It returns a frequency table:
c1  c2
a   High     50.0
    Low      50.0
b   High     50.0
    Low      50.0
c   High    100.0
Name: c2, dtype: float64
I want to print c, which has the highest percentage, 100.0.
I can use max() to print 100.0, but I don't know how to print c.
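A minimal sketch of one option: idxmax() returns the index label of the largest value rather than the value itself. Because the frequency table has a two-level index, that label comes back as a (c1, c2) tuple. The variable names freq, top_group and top_level are introduced here for illustration.

```python
import pandas as pd

dd = pd.DataFrame({
    "c1": ["a", "a", "b", "b", "c", "c"],
    "c2": ["Low", "High", "Low", "High", "High", "High"],
})
freq = dd.groupby("c1")["c2"].value_counts(normalize=True).mul(100)

# idxmax() returns the index label of the largest value; with a two-level
# index that label is a (c1, c2) tuple.
top_group, top_level = freq.idxmax()
print(top_group, top_level, freq.max())
```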
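I have a dataframe that I want to use to draw a treemap with squarify. I want to display both country_name and counts on the chart by editing the label parameter, but it seems to accept only one value.
Sample data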
import squarify
import pandas as pd
from matplotlib import pyplot as plt
d = {'country_name':['USA', 'UK', 'Germany'], 'counts':[100, 200, 300]}
dd = pd.DataFrame(data=d)
import matplotlib  # needed for matplotlib.colors and matplotlib.cm below

fig = plt.gcf()
ax = fig.add_subplot()
fig.set_size_inches(16, 4.5)
norm = matplotlib.colors.Normalize(vmin=min(dd.counts), vmax=max(dd.counts))
colors = [matplotlib.cm.Blues(norm(value)) for value in dd.counts]
squarify.plot(label=dd.country_name, sizes=dd.counts, alpha=.7, color=colors)
plt.axis('off')
plt.show()
The expected output shows both counts and country_name on the chart.
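Since the label argument takes one string per rectangle, one possible approach is to merge the two columns into a single list of strings before plotting. The sketch below uses plain lists that mirror the sample data; the name labels and the newline separator are illustrative choices.

```python
country_names = ["USA", "UK", "Germany"]
counts = [100, 200, 300]

# One string per rectangle: name on the first line, count on the second.
labels = [f"{name}\n{count}" for name, count in zip(country_names, counts)]
print(labels)

# Assumed usage with the call from the question, not run here:
# squarify.plot(label=labels, sizes=counts, alpha=.7, color=colors)
```

With the dataframe, zip(dd.country_name, dd.counts) would play the same role as the two lists.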
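I want to change all the values of a dictionary to 1 (a float). I searched online, but it seems people rarely have this kind of random need.
The dictionary has thousands of entries; here is part of it: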
{
'2015': [2.8216107792591907],
'2016': [2.3686578052627687],
'2017': [2.03069274701226]
}
Can someone give me some ideas? Thanks!
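Two hedged variants are sketched below, since the question does not say whether each value should stay a one-element list or become a bare float. The name scores is made up for the example dictionary.

```python
scores = {
    "2015": [2.8216107792591907],
    "2016": [2.3686578052627687],
    "2017": [2.03069274701226],
}

# Variant A (assumption: keep the list shape of each value).
as_lists = {year: [1.0] * len(values) for year, values in scores.items()}

# Variant B (assumption: a bare float per key is wanted).
as_floats = dict.fromkeys(scores, 1.0)

print(as_lists)
print(as_floats)
```

Both variants build a new dictionary and leave the original untouched.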