Posts by fun*_*ion

np.sort behaves strangely when sorting a pandas DataFrame

When I run data[genres].sum(), I get the following result:

Action        1891
Adult            9
Adventure     1313
Animation      314
Biography      394
Comedy        3922
Crime         1867
Drama         5697
Family         754
Fantasy        916
Film-Noir       40
History        358
Horror        1215
Music          371
Musical        260
Mystery       1009
News             1
Reality-TV       1
Romance       2441
Sci-Fi         897
Sport          288
Thriller      2832
War            512
Western        235
dtype: int64


But when I try to sort the sums with np.sort:

genre_count = np.sort(data[genres].sum())[::-1]
pd.DataFrame({'Genre Count': genre_count})


I get the following result, with the genre names gone:

Out[19]:
    Genre Count
0   5697
1   3922
2   2832
3   2441
4   1891
5   1867
6   1313
7 …

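A minimal sketch of why the labels disappear, assuming a small hypothetical Series in place of data[genres].sum() (only four of the totals above are used, and the variable names are illustrative). np.sort hands back a plain NumPy array, which has no index, while Series.sort_values keeps each label attached to its value:

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for data[genres].sum(): a few of the totals shown above.
genre_totals = pd.Series({"Action": 1891, "Adult": 9, "Comedy": 3922, "Drama": 5697})

# np.sort returns a plain NumPy array, so the genre labels are dropped.
as_array = np.sort(genre_totals)[::-1]
print(type(as_array))

# Series.sort_values sorts the values and carries the index labels along.
sorted_totals = genre_totals.sort_values(ascending=False)

# Turn the sorted Series into a one-column DataFrame indexed by genre.
genre_count = sorted_totals.rename("Genre Count").to_frame()
print(genre_count)
```

With sort_values, the resulting frame is indexed by genre name rather than by 0, 1, 2, and so on.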

python sorting numpy dataframe pandas

fun*_*ion

2017 01-02

7
votes

1
answer

776
views

"\n","\ t"如何分别添加新行和制表符？

In a programming language, if I use "\n", it adds a newline character.

Can someone explain how "\n" gets converted into a newline, and likewise how "\t" becomes a tab?
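A small illustration of the idea, written in Python for consistency with the other posts on this page rather than in the C, C++, or Java the question is tagged with. The point it shows is that the two source characters backslash and n are replaced by a single character when the string literal is processed, which is also how escape sequences work in those languages:

```python
# In an ordinary string literal the escape sequence is one character.
newline = "\n"
tab = "\t"
print(len(newline), ord(newline))  # 1 10  (line feed)
print(len(tab), ord(tab))          # 1 9   (horizontal tab)

# A raw string keeps the backslash and the letter as two separate characters.
raw = r"\n"
print(len(raw), [ord(c) for c in raw])  # 2 [92, 110]
```

The terminal or editor that receives character code 10 or 9 is what actually moves to the next line or tab stop. The program only stores and emits the code.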

c c++ java

fun*_*ion

2016 02-08

6
votes

1
answer

2205
views

Indexing a Pandas GroupBy DataFrame gives an error

I have a Pandas GroupBy DataFrame called ratings_by_title, which looks like this:

title
$1,000,000 Duck (1971)                37
'Night Mother (1986)                  70
'Til There Was You (1997)             52
'burbs, The (1989)                   303
...And Justice for All (1979)        199
1-900 (1994)                           2
10 Things I Hate About You (1999)    700
101 Dalmatians (1961)                565
101 Dalmatians (1996)                364
12 Angry Men (1957)                  616


I want to keep only the titles with at least 250 ratings, so I tried the following:

active_titles = ratings_by_title.index[ratings_by_title >= 250]

However, this raises an error in IPython:

AttributeError: Cannot access attribute 'index' of 'DataFrameGroupBy' objects, try using the 'apply' method

Can someone help me understand what is going on?
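A minimal sketch of one likely cause, assuming a tiny hypothetical ratings table (the column names and the threshold of 2 are made up for the example, where the question uses 250). The result of groupby alone is a lazy DataFrameGroupBy object with no index of its own. Aggregating it, for instance with size(), produces a Series indexed by title that can then be filtered:

```python
import pandas as pd

# Hypothetical ratings table: one row per rating; column names are assumptions.
ratings = pd.DataFrame({
    "title": ["A", "A", "A", "B", "C", "C"],
    "rating": [5, 4, 3, 2, 5, 1],
})

# groupby alone gives a DataFrameGroupBy, which has no usable .index.
grouped = ratings.groupby("title")

# Aggregating with size() yields a Series of counts indexed by title.
ratings_by_title = grouped.size()

# Boolean-mask the index to keep titles that meet the threshold.
active_titles = ratings_by_title.index[ratings_by_title >= 2]
print(list(active_titles))  # ['A', 'C']
```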

pandas

fun*_*ion

2015 05-06

4
votes

1
answer

5502
views

Populate a column in a data frame based on a column in another data frame in R

I have a data frame of comments that looks like this (df1):

Comments
Apple laptops are really good for work,we should buy them
Apple Iphones are too costly,we can resort to some other brands
Google search is the best search engine 
Android phones are great these days
I lost my visa card today


I have another data frame of merchant names that looks like this (df2):

Merchant_Name
Google
Android
Geoni
Visa
Apple
MC
WallMart


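If a Merchant_Name from df2 appears in a comment in df1, I want to put that merchant name in a second column of df1, using R. The match does not have to be exact; approximate matching is what I need. Also, df1 contains about 500K rows! My final output df might look like this: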

Comments                                                        Merchant
Apple laptops are really good for work,we should buy them       Apple
Apple Iphones are too costly,we can resort to some other brands Apple
Google search …

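A rough sketch of the matching step, written in Python with pandas rather than R to stay consistent with the other examples on this page. It rebuilds the two sample frames above and covers only case-insensitive whole-word matching, not true approximate (fuzzy) matching. Building a single combined pattern is an assumption made here to avoid looping over merchants for each of the 500K rows:

```python
import re
import pandas as pd

df1 = pd.DataFrame({"Comments": [
    "Apple laptops are really good for work,we should buy them",
    "Apple Iphones are too costly,we can resort to some other brands",
    "Google search is the best search engine",
    "Android phones are great these days",
    "I lost my visa card today",
]})
df2 = pd.DataFrame({"Merchant_Name": [
    "Google", "Android", "Geoni", "Visa", "Apple", "MC", "WallMart",
]})

# One alternation pattern with word boundaries, escaped so names are literal.
names = df2["Merchant_Name"]
pattern = r"\b(" + "|".join(re.escape(m) for m in names) + r")\b"

# Extract the first merchant mentioned in each comment, ignoring case.
found = df1["Comments"].str.extract(pattern, flags=re.IGNORECASE, expand=False)

# Map the matched text back to the spelling used in df2 (e.g. "visa" -> "Visa").
canonical = {m.lower(): m for m in names}
df1["Merchant"] = found.str.lower().map(canonical)
print(df1)
```

Comments that mention no listed merchant end up with a missing value in the Merchant column. Only the first match per comment is kept.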

r data-analysis

fun*_*ion

lucky-day

3
votes

1
answer

237
views