Given a DataFrame of building information as follows:
id floor type
0 1 13 office
1 2 12 office
2 3 9 office
3 4 9 office
4 5 7 office
5 6 6 office
6 7 9 office
7 8 5 office
8 9 5 office
9 10 5 office
10 11 4 retail
11 12 3 retail
12 13 2 retail
13 14 1 retail
14 15 -1 parking
15 16 -2 parking
16 17 13 office
I want to check whether any floors are missing from the floor column (except floor 0, which does not exist by default).
Code: …
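A minimal sketch of one way to run this check (not the asker's elided code). It assumes the floors should form a continuous run from the lowest to the highest value in the floor column, with 0 skipped; the names expected and missing_floors are made up for illustration.

```python
import pandas as pd

# Rebuild the sample data from the question (the type column is not needed here).
df = pd.DataFrame({
    "id": range(1, 18),
    "floor": [13, 12, 9, 9, 7, 6, 9, 5, 5, 5, 4, 3, 2, 1, -1, -2, 13],
})

present = set(df["floor"].tolist())

# Assumption: every floor between the lowest and highest one should exist, except 0.
expected = set(range(min(present), max(present) + 1)) - {0}

missing_floors = sorted(expected - present)
print(missing_floors)  # [8, 10, 11]
```

Duplicate floors (such as the three rows for floor 5) do not affect the result, because the comparison is done on sets.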
I have a pandas DataFrame like this:
import pandas as pd
import numpy as np
data = {
"Type": ["A", "A", "B", "B", "B"],
"Project": ["X123", "X123", "X21", "L31", "L31"],
"Number": [100, 300, 100, 200, 500],
"Status": ['Y', 'Y', 'N', 'Y', 'N']
}
df = pd.DataFrame.from_dict(data)
I want to group by Type and get the count and sum under several conditions, giving a result like this:
Type Total_Count Total_Number Count_Status=Y Number_Status=Y Count_Status=N Number_Status=N
A 2 400 2 400 0 0
B 3 800 1 200 2 600
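A minimal sketch of one way such a table might be built, using groupby with named aggregation. The loop over the two status values and the variable names result and subset are illustrative assumptions, not part of the question.

```python
import pandas as pd

data = {
    "Type": ["A", "A", "B", "B", "B"],
    "Project": ["X123", "X123", "X21", "L31", "L31"],
    "Number": [100, 300, 100, 200, 500],
    "Status": ["Y", "Y", "N", "Y", "N"],
}
df = pd.DataFrame(data)

# Overall row count and sum of Number per Type.
result = df.groupby("Type").agg(
    Total_Count=("Number", "size"),
    Total_Number=("Number", "sum"),
)

# Count and sum per Type for each status; the assignment aligns on the Type index.
for status in ["Y", "N"]:
    subset = df[df["Status"] == status].groupby("Type")["Number"]
    result[f"Count_Status={status}"] = subset.size()
    result[f"Number_Status={status}"] = subset.sum()

# Types with no rows for a status come out as NaN, so fill them with 0.
result = result.fillna(0).astype(int).reset_index()
print(result)
```

For the sample data this gives a Total_Count of 3 for type B, since B has three rows.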
I tried the following, but it is not quite what I need. Please share any ideas you may have. Thanks!
df1 = pd.pivot_table(df, index = 'Type', values = 'Number', aggfunc = np.sum)
df2 = pd.pivot_table(df, index = …
I have the following DataFrame:
import pandas as pd
import numpy as np
df1 = pd.DataFrame({'A': 'foo bar foo bar foo bar foo foo'.split(),
'B': 'one one two three two two one three'.split(),
'C': np.arange(8), 'D': np.arange(8) * 2})
print(df1)
A B C D
0 foo one 0 0
1 bar one 1 2
2 foo two 2 4
3 bar three 3 6
4 foo two 4 8
5 bar two 5 10
6 foo one 6 12
7 foo three 7 14 …
I have two lists as follows:
list1 = ['bj-100-cy','bj-101-hd','sh-200-pd','sh-201-hp']
list2 = [100, 200]
I want to filter list1 by substring, using the elements of list2, and get the expected output as follows:
outcome = ['bj-100-cy', 'sh-200-pd']
When I do:
list1 = str(list1)
list2 = str(list2)
outcome = [x for x in list2 if [y for y in list1 if x in y]]
I get a result like this: ['[', '1', '0', '0', ',', ' ', '2', '0', '0', ']']. How can I filter it correctly? Thanks.
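The attempt above turns each whole list into a single string, so the comprehension iterates over the characters of str(list2). A minimal sketch of one fix is to convert each number in list2 to a string separately and keep the items of list1 that contain any of them; the name keys is an illustrative assumption.

```python
list1 = ['bj-100-cy', 'bj-101-hd', 'sh-200-pd', 'sh-201-hp']
list2 = [100, 200]

# Convert each number on its own, instead of calling str() on the whole list.
keys = [str(n) for n in list2]

# Keep every string from list1 that contains at least one key as a substring.
outcome = [s for s in list1 if any(k in s for k in keys)]
print(outcome)  # ['bj-100-cy', 'sh-200-pd']
```

This is a plain substring test, so a key such as 100 would also match an item like 'bj-1000-cy'. If that matters, comparing against s.split('-') would be stricter.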
Given a data frame as follows:
df <- data.frame(city = c("bj", "sh", "gz", "sz"),
price = c(12, 7, 5, 6),
pct = c(-2.3, 5, -4, 4), stringsAsFactors=FALSE)
Out:
city price pct
0 bj 12 -2.3
1 sh 7 5.0
2 gz 5 -4.0
3 sz 6 4.0
I want to draw a plot with ggplot: a bar chart for city and points for pct, but I want to use different colors for negative and positive values.
How can I do this in ggplot2?
Code:
ggplot(df, aes(fill = city, y = price, x = city)) +
geom_bar(position = "dodge", stat = "identity", alpha = 0.5, fill = "#FF6666") +
geom_point(data = …
Suppose we have students' grade data df1 and credit data df2 as follows:
df1:
stu_id major Python English C++
0 U202010521 computer 56 81 82
1 U202010522 management 92 56 64
2 U202010523 management 95 88 81
3 U202010524 BigData&AI 79 53 74
4 U202010525 computer 53 71 -1
5 U202010526 computer 78 96 53
6 U202010527 BigData&AI 69 63 74
7 U202010528 BigData&AI 86 57 82
8 U202010529 BigData&AI 81 100 85
9 U202010530 BigData&AI 79 67 80
df2:
class credit
0 Python 2
1 English 4 …
For the following toy data dd, I am trying to group by the langue column and reorder the char column according to the order of the vector char_order:
library(dplyr)
dd <- data.frame(langue = c('English', 'French', 'English', 'French'),
char = c('world', 'monde', 'hello', 'bonjour'),
x = c(8, 3, 9, 9),
y = c(1, 1, 1, 2))
dd
char_order <- c('hello', 'world', 'bonjour', 'monde')
dd %>%
group_by(langue) %>% arrange(.by_group = TRUE)
Out:
langue char x y
<chr> <chr> <dbl> <dbl>
1 English world 8 1
2 English hello 9 1
3 French monde 3 1
4 French bonjour 9 2
But I would like to get a result like this:
langue char x …
My goal is to learn from a notebook. Its recall is as high as 97%, while I am struggling with an F1 score of 77.9% for "churned customers". The problem is that the notebook uses LightGBM, and I cannot install LightGBM.
What I tried:
pip install lightgbm -> it throws the error "python setup.py egg_info did not run successfully."
pip install wheel -> now it throws the error "python setup.py bdist_wheel did not run successfully."
pip install Cmake, pip install --upgrade pip setuptools, brew install libomp -> the error persists.
Full error:
  × python setup.py bdist_wheel did not run successfully.
  │ exit code: 1
  ╰─> [80 lines of output]
      INFO:root:running bdist_wheel
      /Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/setuptools/command/install.py:34: SetuptoolsDeprecationWarning: setup.py install …
I have a pandas DataFrame like this:
id fruits
01 Apple, Apricot
02 Apple, Banana, Clementine, Pear
03 Orange, Pineapple, Pear
How can I get a list of fruits like this, with duplicates removed?
['Apple','Apricot','Banana','Clementine','Orange','Pear','Pineapple']
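A minimal sketch of one approach: split each cell on the separator, flatten with explode, and collect the unique values. It assumes the fruits are always separated by a comma followed by a single space, as in the sample; the name unique_fruits is made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "id": ["01", "02", "03"],
    "fruits": [
        "Apple, Apricot",
        "Apple, Banana, Clementine, Pear",
        "Orange, Pineapple, Pear",
    ],
})

# Split each cell into a list, put one fruit per row, then drop duplicates.
unique_fruits = sorted(df["fruits"].str.split(", ").explode().unique())
print(unique_fruits)
# ['Apple', 'Apricot', 'Banana', 'Clementine', 'Orange', 'Pear', 'Pineapple']
```

The sorted call only serves to match the alphabetical order of the expected list; without it, the fruits appear in order of first occurrence.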
I have the following DataFrame:
import pandas as pd
import numpy as np
data = {
"index": [1, 2, 3, 4, 5],
"A": [11, 17, 5, 9, 10],
"B": [8, 6, 16, 17, 9],
"C": [10, 17, 12, 13, 15],
"target": [12, 13, 8, 6, 12]
}
df = pd.DataFrame.from_dict(data)
print(df)
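I want to find, among columns A, B, and C, the value closest to the target column, and put those values into a result column. As far as I know, I need to use the abs() and argmin() functions. Here is my expected output: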
index A B C target result
0 1 11 8 10 12 11
1 2 17 6 17 13 17
2 3 5 16 12 8 5
3 4 9 17 13 …
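A minimal sketch along the lines the question suggests, using abs() and argmin() on the underlying NumPy array. The helper names cols, values, distance, and closest are illustrative assumptions. When two columns are equally close, argmin picks the leftmost one.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "index": [1, 2, 3, 4, 5],
    "A": [11, 17, 5, 9, 10],
    "B": [8, 6, 16, 17, 9],
    "C": [10, 17, 12, 13, 15],
    "target": [12, 13, 8, 6, 12],
})

cols = ["A", "B", "C"]
values = df[cols].to_numpy()

# Absolute distance of every candidate value to the target of its row.
distance = np.abs(values - df["target"].to_numpy()[:, None])

# Position (0, 1 or 2) of the closest candidate in each row.
closest = distance.argmin(axis=1)

# Pick that candidate from each row.
df["result"] = values[np.arange(len(df)), closest]
print(df)
```

For the first three rows this yields 11, 17, and 5, matching the visible part of the expected output.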