Here is the snippet:
test = pd.DataFrame({'days': [0,31,45]})
test['range'] = pd.cut(test.days, [0,30,60])
Output:
days range
0 0 NaN
1 31 (30, 60]
2 45 (30, 60]
I was surprised that 0 does not fall into (0, 30]. What should I do to have 0 categorized into that first bin?
jez*_*ael 30
test['range'] = pd.cut(test.days, [0,30,60], include_lowest=True)
print (test)
days range
0 0 (-0.001, 30.0]
1 31 (30.0, 60.0]
2 45 (30.0, 60.0]
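To see why the original call produced NaN, it may help to compare the two calls side by side. By default pd.cut builds intervals that are open on the left and closed on the right, so 0 sits exactly on the excluded edge of (0, 30]. The following is a minimal sketch; the variable names are illustrative and not part of the original answer.

```python
import pandas as pd

days = pd.Series([0, 31, 45])
edges = [0, 30, 60]

# Default: intervals are (0, 30] and (30, 60], so 0 matches no bin.
default_bins = pd.cut(days, edges)
# include_lowest=True makes the first interval also cover its left edge.
with_lowest = pd.cut(days, edges, include_lowest=True)

print(default_bins.isna().tolist())  # [True, False, False]
print(with_lowest.isna().tolist())   # [False, False, False]

# Integer category codes show bin membership; -1 means "no bin".
print(default_bins.cat.codes.tolist())  # [-1, 1, 1]
print(with_lowest.cat.codes.tolist())   # [0, 1, 1]
```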
See the difference:
test = pd.DataFrame({'days': [0,20,30,31,45,60]})
test['range1'] = pd.cut(test.days, [0,30,60], include_lowest=True)
#30 value is in [30, 60) group
test['range2'] = pd.cut(test.days, [0,30,60], right=False)
#30 value is in (0, 30] group
test['range3'] = pd.cut(test.days, [0,30,60])
print (test)
days range1 range2 range3
0 0 (-0.001, 30.0] [0, 30) NaN
1 20 (-0.001, 30.0] [0, 30) (0, 30]
2 30 (-0.001, 30.0] [30, 60) (0, 30]
3 31 (30.0, 60.0] [30, 60) (30, 60]
4 45 (30.0, 60.0] [30, 60) (30, 60]
5 60 (30.0, 60.0] NaN (30, 60]
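The bracket notation in the table maps onto the closed side of each interval. As a small illustration, membership can be checked directly with pd.Interval objects; these are constructed by hand here and are only an assumption about how one might explore the behavior, not something the answer itself does.

```python
import pandas as pd

right_closed = pd.Interval(0, 30, closed="right")  # written as (0, 30]
left_closed = pd.Interval(0, 30, closed="left")    # written as [0, 30)

# The edge on the closed side belongs to the interval, the other edge does not.
print(0 in right_closed, 30 in right_closed)  # False True
print(0 in left_closed, 30 in left_closed)    # True False
```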
Or use numpy.searchsorted, which returns integer bin positions instead of intervals (the array of bin edges must be sorted):
arr = np.array([0,30,60])
test['range1'] = arr.searchsorted(test.days)
test['range2'] = arr.searchsorted(test.days, side='right') - 1
print (test)
days range1 range2
0 0 0 0
1 20 1 0
2 30 1 1
3 31 2 1
4 45 2 1
5 60 2 2
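A short sketch of the same idea on deliberately unsorted input, to show that only the edge array needs to be in order. It also checks the result against numpy.digitize, which is brought in here purely for comparison and is not used in the original answer.

```python
import numpy as np

edges = np.array([0, 30, 60])              # bin edges, must be sorted
days = np.array([45, 0, 60, 20, 30, 31])   # values to place, any order

# side='left' (default): index of the first edge >= value.
left = edges.searchsorted(days)
# side='right' minus 1: index of the last edge <= value.
right = edges.searchsorted(days, side="right") - 1

print(left.tolist())   # [2, 0, 2, 1, 1, 2]
print(right.tolist())  # [1, 0, 2, 0, 1, 1]

# np.digitize gives the same positions for increasing edges.
assert (np.digitize(days, edges, right=True) == left).all()
assert (np.digitize(days, edges) - 1 == right).all()
```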
piR*_*red 15
See the pd.cut documentation.
Pass the parameter right=False:
test = pd.DataFrame({'days': [0,31,45]})
test['range'] = pd.cut(test.days, [0,30,60], right=False)
test
days range
0 0 [0, 30)
1 31 [30, 60)
2 45 [30, 60)
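One thing to keep in mind with right=False is that the problem moves to the other end: the top edge (60 here) is now excluded. A possible workaround, sketched below as an assumption rather than as part of the answer, is to append an open-ended last edge such as np.inf.

```python
import numpy as np
import pandas as pd

days = pd.Series([0, 30, 60])

# Left-closed bins [0, 30) and [30, 60): the value 60 matches no bin.
left_closed = pd.cut(days, [0, 30, 60], right=False)
# Adding an infinite upper edge creates a third bin that starts at 60.
open_ended = pd.cut(days, [0, 30, 60, np.inf], right=False)

print(left_closed.cat.codes.tolist())  # [0, 1, -1]
print(open_ended.cat.codes.tolist())   # [0, 1, 2]
```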
小智 10
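You can also use labels with pd.cut(). The following example contains student grades in the range 0-10. We add a new column called "grade_cat" to categorize the grades.
bins holds the interval edges: with the default right=True the intervals are (0, 4], (4, 6] and (6, 10], and they map to the labels "poor", "normal" and "excellent" in that order. Passing include_lowest=True makes the first interval include 0, so a grade of exactly 0 is labeled "poor" instead of becoming NaN.
bins = [0, 4, 6, 10]
labels = ["poor","normal","excellent"]
# student is assumed to be a DataFrame with a numeric 'grade' column
student['grade_cat'] = pd.cut(student['grade'], bins=bins, labels=labels, include_lowest=True)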
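For a version that can be run as is, here is a minimal sketch with a made-up student DataFrame (the grades are hypothetical sample data). It passes include_lowest=True so that a grade of 0 is not left as NaN.

```python
import pandas as pd

# Hypothetical sample data, chosen to hit every bin edge.
student = pd.DataFrame({"grade": [0, 3, 4, 5, 6, 9, 10]})

bins = [0, 4, 6, 10]
labels = ["poor", "normal", "excellent"]

# Bins are right-closed; include_lowest=True also admits the value 0.
student["grade_cat"] = pd.cut(
    student["grade"], bins=bins, labels=labels, include_lowest=True
)

print(student["grade_cat"].tolist())
# ['poor', 'poor', 'poor', 'normal', 'normal', 'excellent', 'excellent']
```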
小智 5
An example of how .cut works:
s=pd.Series([168,180,174,190,170,185,179,181,175,169,182,177,180,171])
pd.cut(s,3)
#To add labels to bins
pd.cut(s,3,labels=["Small","Medium","Large"])
This can be applied directly to a range of values.
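When bins is an integer, pd.cut splits the span between the minimum and maximum into that many equal-width bins. To inspect the edges it actually chose, retbins=True can be passed. The sketch below reuses the series from this answer; the variable names sizes and edges are illustrative.

```python
import pandas as pd

s = pd.Series([168, 180, 174, 190, 170, 185, 179, 181, 175, 169, 182, 177, 180, 171])
labels = ["Small", "Medium", "Large"]

# retbins=True returns the computed bin edges alongside the categories.
sizes, edges = pd.cut(s, 3, labels=labels, retbins=True)

# Three bins need four edges; the lowest edge is nudged slightly below
# the minimum so that the minimum value itself is included.
print(edges)

# Number of values that fell into each labeled bin.
counts = [int((sizes == name).sum()) for name in labels]
print(counts)  # [6, 6, 2]
```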