I'm trying to list the latest destination (the one with the most recent departure time) for each train in a table, for example:
Train Dest Time
1 HK 10:00
1 SH 12:00
1 SZ 14:00
2 HK 13:00
2 SH 09:00
2 SZ 07:00
The desired result should be:
Train Dest Time
1 SZ 14:00
2 HK 13:00
I tried using
SELECT Train, Dest, MAX(Time)
FROM TrainTable
GROUP BY Train
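I got an "ORA-00979: not a GROUP BY expression" error, which says I have to include 'Dest' in my GROUP BY clause. But that is surely not what I want...
Is it possible to do this in a single SQL statement?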
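One common way to get the row holding each group's maximum is to join the table against a grouped subquery. The sketch below is only an illustration: it uses Python's built-in `sqlite3` as a stand-in for Oracle, and it assumes `Time` is stored as zero-padded `HH:MM` text so that `MAX` compares correctly. The alias names `latest` and `MaxTime` are made up for the example.

```python
import sqlite3

# In-memory SQLite database standing in for the Oracle table (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE TrainTable (Train INTEGER, Dest TEXT, Time TEXT)")
conn.executemany(
    "INSERT INTO TrainTable VALUES (?, ?, ?)",
    [
        (1, "HK", "10:00"), (1, "SH", "12:00"), (1, "SZ", "14:00"),
        (2, "HK", "13:00"), (2, "SH", "09:00"), (2, "SZ", "07:00"),
    ],
)

# The subquery finds the latest time per train; the join then picks up
# the full row (including Dest) that carries that time.
query = """
SELECT t.Train, t.Dest, t.Time
FROM TrainTable t
JOIN (SELECT Train, MAX(Time) AS MaxTime
      FROM TrainTable
      GROUP BY Train) latest
  ON latest.Train = t.Train AND latest.MaxTime = t.Time
ORDER BY t.Train
"""
latest_rows = conn.execute(query).fetchall()
print(latest_rows)  # [(1, 'SZ', '14:00'), (2, 'HK', '13:00')]
```

If two rows of the same train share the maximum time, this form returns both of them.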
Just curious about SQL syntax. So if I have
SELECT
itemName as ItemName,
substring(itemName, 1,1) as FirstLetter,
Count(itemName)
FROM table1
GROUP BY itemName, FirstLetter
This is incorrect, because
GROUP BY itemName, FirstLetter
should really be
GROUP BY itemName, substring(itemName, 1,1)
But why can't we simply use the former, for convenience?
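In standard SQL the GROUP BY clause is logically evaluated before the SELECT list, so an alias defined in SELECT does not exist yet when grouping happens (some engines accept it anyway as an extension). A workaround that does not rely on that extension is to define the alias in a derived table and group in the outer query. The sketch below uses Python's `sqlite3` purely for illustration; the sample rows, the `derived` alias, and SQLite's `substr` (in place of `substring`) are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (itemName TEXT)")
conn.executemany(
    "INSERT INTO table1 VALUES (?)",
    [("apple",), ("apple",), ("avocado",), ("banana",)],  # made-up sample rows
)

# The inner query names the expression once; the outer query can then
# refer to FirstLetter in both SELECT and GROUP BY.
query = """
SELECT ItemName, FirstLetter, COUNT(ItemName) AS n
FROM (SELECT itemName AS ItemName,
             substr(itemName, 1, 1) AS FirstLetter
      FROM table1) AS derived
GROUP BY ItemName, FirstLetter
ORDER BY ItemName
"""
grouped_rows = conn.execute(query).fetchall()
print(grouped_rows)  # [('apple', 'a', 2), ('avocado', 'a', 1), ('banana', 'b', 1)]
```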
I'm working with this data frame:
Fruit Date Name Number
Apples 10/6/2016 Bob 7
Apples 10/6/2016 Bob 8
Apples 10/6/2016 Mike 9
Apples 10/7/2016 Steve 10
Apples 10/7/2016 Bob 1
Oranges 10/7/2016 Bob 2
Oranges 10/6/2016 Tom 15
Oranges 10/6/2016 Mike 57
Oranges 10/6/2016 Bob 65
Oranges 10/7/2016 Tony 1
Grapes 10/7/2016 Bob 1
Grapes 10/7/2016 Tom 87
Grapes 10/7/2016 Bob 22
Grapes 10/7/2016 Bob 12
Grapes 10/7/2016 Tony 15
I want to aggregate this by name and then by fruit, to get the total number of each fruit per name.
Bob,Apples,16 ( for example )
I tried grouping by Name and Fruit, but how do I get the total number of fruit?
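A minimal pandas sketch of the grouping described above, assuming the data frame's columns are named `Fruit`, `Name`, and `Number` as in the listing (the `Date` column is left out here because it plays no part in the total):

```python
import pandas as pd

# Same rows as the question's data frame, minus the Date column.
df = pd.DataFrame({
    "Fruit": ["Apples"] * 5 + ["Oranges"] * 5 + ["Grapes"] * 5,
    "Name": ["Bob", "Bob", "Mike", "Steve", "Bob",
             "Bob", "Tom", "Mike", "Bob", "Tony",
             "Bob", "Tom", "Bob", "Bob", "Tony"],
    "Number": [7, 8, 9, 10, 1,
               2, 15, 57, 65, 1,
               1, 87, 22, 12, 15],
})

# Group on both keys, sum the Number column, and flatten the result
# back into ordinary columns.
totals = df.groupby(["Name", "Fruit"])["Number"].sum().reset_index()
print(totals)
```

The row for Bob and Apples comes out as 16 (7 + 8 + 1), matching the example in the question.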
How do I access the corresponding groupby dataframe in a groupby object by its key? With the following groupby:
rand = np.random.RandomState(1)
df = pd.DataFrame({'A': ['foo', 'bar'] * 3,
'B': rand.randn(6),
'C': rand.randint(0, 20, 6)})
gb = df.groupby(['A'])
I can iterate through it to get the keys and groups:
In [11]: for k, gp in gb:
             print('key=' + str(k))
             print(gp)
key=bar
A B C
1 bar -0.611756 18
3 bar -1.072969 10
5 bar -2.301539 18
key=foo
A B C
0 foo 1.624345 5
2 foo -0.528172 11
4 foo 0.865408 14
I'd like to be able to do something like
In [12]: gb['foo']
Out[12]:
A B C
0 foo 1.624345 5
2 foo …
I want to group a data frame by two columns and then sort the aggregated results within the groups.
In [167]:
df
Out[167]:
count job source
0 2 sales A
1 4 sales B
2 6 sales C
3 3 sales D
4 7 sales E
5 5 market A
6 3 market B
7 2 market C
8 4 market D
9 1 market E
In [168]:
df.groupby(['job','source']).agg({'count':sum})
Out[168]:
count
job source
market A 5
B 3
C 2
D 4
E 1
sales A 2
B 4
C 6
D 3
E 7
I now want to sort the count column in descending order within each group, and then take only the top three rows, to get something like:
count
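job …
Suppose I want to calculate the proportion of different values within each group. For example, using the mtcars data, how do I calculate the relative frequency of the number of gears by am (automatic/manual) in one go with dplyr?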
library(dplyr)
data(mtcars)
mtcars <- tbl_df(mtcars)
# count frequency
mtcars %>%
group_by(am, gear) %>%
summarise(n = n())
# am gear n
# 0 3 15
# 0 4 4
# 1 4 8
# 1 5 5
What I'd like to achieve:
am gear n rel.freq
0 3 15 0.7894737
0 4 4 0.2105263
1 4 8 0.6153846
1 5 5 0.3846154
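The question asks for dplyr, but the arithmetic behind the relative frequency can be sketched in Python with pandas: divide each count by the total of its `am` group. This is only an illustration of the calculation, starting from the counts already shown above rather than from the raw mtcars data; the column name `rel_freq` is chosen for the example.

```python
import pandas as pd

# Counts per (am, gear) pair, copied from the summarise() output above.
counts = pd.DataFrame({
    "am": [0, 0, 1, 1],
    "gear": [3, 4, 4, 5],
    "n": [15, 4, 8, 5],
})

# transform("sum") broadcasts each am group's total back onto its rows,
# so the division yields the share of each gear within its am group.
counts["rel_freq"] = counts["n"] / counts.groupby("am")["n"].transform("sum")
print(counts)
```

Within each `am` group the resulting shares add up to 1, for example 15/19 and 4/19 for `am == 0`.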
How do I write this query in LINQ (VB.NET)?
select B.Name
from Company B
group by B.Name
having COUNT(1) > 1
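This is not LINQ, but the logic the query expresses (group by name, keep the names that occur more than once) can be sketched in Python to make the intent concrete. The company names below are hypothetical sample data.

```python
from collections import Counter

# Hypothetical contents of Company.Name, one entry per row.
company_names = ["Acme", "Globex", "Acme", "Initech", "Globex", "Acme"]

# Counter plays the role of GROUP BY Name with COUNT(1);
# the filter plays the role of HAVING COUNT(1) > 1.
name_counts = Counter(company_names)
duplicated = [name for name, n in name_counts.items() if n > 1]
print(duplicated)  # ['Acme', 'Globex']
```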
How can I count the number of records returned by a GROUP BY query?
For example:
select count(*)
from temptable
group by column_1, column_2, column_3, column_4
gives me:
1
1
2
I need to count the records above, i.e. get 1 + 1 + 1 = 3.
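One way to count the groups is to wrap the grouped query in an outer `COUNT(*)`. The sketch below runs that against Python's `sqlite3` with made-up rows that reproduce the 1, 1, 2 result; the database engine in the question is not stated, and the alias `grouped` is an assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE temptable (column_1, column_2, column_3, column_4)")
conn.executemany(
    "INSERT INTO temptable VALUES (?, ?, ?, ?)",
    [("a", 1, 1, 1), ("b", 1, 1, 1), ("c", 1, 1, 1), ("c", 1, 1, 1)],  # sample rows
)

# The original query: one count per group.
per_group = [row[0] for row in conn.execute("""
    SELECT COUNT(*)
    FROM temptable
    GROUP BY column_1, column_2, column_3, column_4
    ORDER BY column_1
""")]
print(per_group)  # [1, 1, 2]

# Wrapping the grouped query counts the groups themselves.
group_count = conn.execute("""
    SELECT COUNT(*)
    FROM (SELECT column_1
          FROM temptable
          GROUP BY column_1, column_2, column_3, column_4) AS grouped
""").fetchone()[0]
print(group_count)  # 3
```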
I want to list all sales, grouped by day.
Sales (saleID INT, amount INT, created DATETIME)
Update: I'm using SQL Server 2005.
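The general idea is to strip the time part from `created` and group on the resulting day. The sketch below shows it with Python's `sqlite3`, where the `date()` function does the truncation. Treat it as an illustration only: SQL Server 2005 has no `date()` function or DATE type, so there the truncation would need a different expression (a commonly used one is `DATEADD(day, DATEDIFF(day, 0, created), 0)`). The sample rows and the per-day count and total columns are assumptions about what "list" should show.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Sales (saleID INTEGER, amount INTEGER, created TEXT)")
conn.executemany(
    "INSERT INTO Sales VALUES (?, ?, ?)",
    [
        (1, 100, "2011-01-01 09:30:00"),  # made-up sample rows
        (2, 250, "2011-01-01 17:45:00"),
        (3, 75, "2011-01-02 11:00:00"),
    ],
)

# date(created) drops the time of day, so all sales from the same
# calendar day fall into the same group.
daily = conn.execute("""
    SELECT date(created) AS day, COUNT(*) AS sales, SUM(amount) AS total
    FROM Sales
    GROUP BY date(created)
    ORDER BY day
""").fetchall()
print(daily)  # [('2011-01-01', 2, 350), ('2011-01-02', 1, 75)]
```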
Suppose I have a set of data pairs, where index 0 is the value and index 1 is the type:
input = [
('11013331', 'KAT'),
('9085267', 'NOT'),
('5238761', 'ETH'),
('5349618', 'ETH'),
('11788544', 'NOT'),
('962142', 'ETH'),
('7795297', 'ETH'),
('7341464', 'ETH'),
('9843236', 'KAT'),
('5594916', 'ETH'),
('1550003', 'ETH')
]
I want to group them by type (by the string at index 1):
result = [
{
type:'KAT',
items: ['11013331', '9843236']
},
{
type:'NOT',
items: ['9085267', '11788544']
},
{
type:'ETH',
items: ['5238761', '962142', '7795297', '7341464', '5594916', '1550003']
}
]
How can I achieve this in an efficient way?
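A single pass with `collections.defaultdict` is one straightforward way to do this. In the sketch below the list is named `pairs` rather than `input` to avoid shadowing the built-in, and `group_by_type` is a helper name chosen for the example.

```python
from collections import defaultdict

pairs = [
    ('11013331', 'KAT'), ('9085267', 'NOT'), ('5238761', 'ETH'),
    ('5349618', 'ETH'), ('11788544', 'NOT'), ('962142', 'ETH'),
    ('7795297', 'ETH'), ('7341464', 'ETH'), ('9843236', 'KAT'),
    ('5594916', 'ETH'), ('1550003', 'ETH'),
]

def group_by_type(pairs):
    """Collect the values of each type in one pass over the pairs."""
    grouped = defaultdict(list)
    for value, kind in pairs:
        grouped[kind].append(value)
    # Dicts keep insertion order, so types appear in first-seen order.
    return [{'type': kind, 'items': items} for kind, items in grouped.items()]

result = group_by_type(pairs)
for entry in result:
    print(entry)
```

Each pair is visited once, so the work grows linearly with the number of pairs and no prior sorting is required.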