Posts by use*_*235

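Efficiently selecting mutually exclusive pairs

This is a problem that can be solved with some kind of brute-force algorithm, but I am wondering whether there is an efficient way to do it.

Suppose we have the following pairs of integers: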

 (1, 3), (2, 5), (4, 7), (2, 7), (10, 9)


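We want to find the maximum number of mutually exclusive pairs.

By mutually exclusive pairs, I mean pairs that do not share any integer.

For example, we cannot choose both (2, 5) and (2, 7), because both pairs contain 2.

In the example above, 4 is the answer, because we can choose the following mutually exclusive pairs: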

      (1, 3), (2, 5), (4, 7), (10, 9)


So there are 4 pairs in total.

I am wondering whether there is an efficient way to do this.
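One way to frame the problem: treat every integer as a vertex and every pair as an edge. A set of mutually exclusive pairs is then a matching, and the question asks for a maximum matching, for which polynomial-time algorithms exist (Edmonds' blossom algorithm handles general graphs). The sketch below is not that algorithm. It is a small exact backtracking search, exponential in the worst case, meant only to pin down the problem on the example from the question. The function name `max_disjoint_pairs` is made up for illustration.

```python
def max_disjoint_pairs(pairs):
    """Return a largest subset of pairs in which no integer appears twice.

    Exact but exponential-time backtracking: for each pair, try taking it
    (when it shares no integer with the pairs chosen so far) and skipping it.
    """
    best = []

    def search(i, used, chosen):
        nonlocal best
        # Prune: even taking every remaining pair cannot beat the best so far.
        if len(chosen) + (len(pairs) - i) <= len(best):
            return
        if i == len(pairs):
            best = list(chosen)
            return
        a, b = pairs[i]
        if a not in used and b not in used:
            chosen.append((a, b))
            search(i + 1, used | {a, b}, chosen)
            chosen.pop()
        search(i + 1, used, chosen)

    search(0, frozenset(), [])
    return best


pairs = [(1, 3), (2, 5), (4, 7), (2, 7), (10, 9)]
result = max_disjoint_pairs(pairs)
print(len(result), result)
```

For inputs of any real size, a library implementation of maximum matching would be the better choice; this version only makes the definition concrete.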

algorithm

use*_*235

lucky-day

9
score

1
answer

760
views

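Aggregating column values in a pandas GroupBy into a dictionary

This is a question I was also asked in an earlier interview.

Our input data has the following columns:

language, product ID, shelf ID, rank

For example, the input has the following format: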

English, 742005, 4560, 10.2 
English, 6000075389352, 4560, 49
French, 899883993, 4560, 32
French, 731317391, 7868, 81


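We want to group by the language and shelf ID columns and sort each group's product list by rank in descending order, which produces output in the following format:

language, shelf_id, {product_id:rank1, product_id:rank2 ....}

for each record.

For the given input, the output is: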

English, 4560, {6000075389352:49, 742005:10.2}
French, 4560, 899883993:32
French, 7868, 731317391:81


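I solved this by building a dictionary keyed on the combination of language and shelf ID, and inserting the product ID and rank under each key.

My approach works, but it looks like there is a simpler way to do this with the python pandas library. I have read some references, but I am still not sure whether there is a better approach than mine (building a key from the language and shelf ID and keeping a dictionary under that key).

Any help would be appreciated.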
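A minimal pandas sketch of the grouping described above, assuming the data is already in a DataFrame whose column names (`language`, `product_id`, `shelf_id`, `rank`) are chosen here for illustration. It sorts by rank first and then builds one dictionary per (language, shelf ID) group, relying on the fact that `groupby` keeps the row order within each group.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "language": ["English", "English", "French", "French"],
        "product_id": [742005, 6000075389352, 899883993, 731317391],
        "shelf_id": [4560, 4560, 4560, 7868],
        "rank": [10.2, 49, 32, 81],
    }
)

# Sort by rank descending first; dicts preserve insertion order (Python 3.7+),
# so each group's dictionary comes out ordered by rank.
ordered = df.sort_values("rank", ascending=False)

result = {
    key: dict(zip(group["product_id"], group["rank"]))
    for key, group in ordered.groupby(["language", "shelf_id"])
}

for (language, shelf_id), ranks in result.items():
    print(language, shelf_id, ranks)
```

Because the `rank` column holds a mix of integers and decimals, pandas stores it as floats, so 49 comes back as 49.0.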

python dictionary dataframe pandas pandas-groupby

use*_*235

2019 06-28

5
score

1
answer

5364
views

Counting the words two columns have in common in python pandas

Suppose I have the following table in python pandas:

friend_description  friend_definition
    James is dumb      dumb dude
    Jacob is smart     smart guy
    Jane is pretty     she looks pretty
    Susan is rich      she is rich


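Here, in the first row, the word 'dumb' appears in both columns. In the second row, 'smart' appears in both columns. In the third row, 'pretty' appears in both columns, and in the last row, 'is' and 'rich' appear in both columns. I want to create the following columns: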

friend_description  friend_definition      word_overlap    overlap_count
    James is dumb      dumb dude              dumb             1
    Jacob is smart     smart guy              smart            1
    Jane is pretty     she looks pretty       pretty           1
    Susan is rich      she is rich            is rich          2


I could use a for loop to build the new columns by hand, but I am wondering whether pandas has a function that makes this kind of operation smoother.
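A possible sketch for the two new columns. It uses a plain comprehension over the two columns rather than a dedicated pandas function, and it assumes that words are separated by whitespace and compared case-sensitively. The helper name `shared_words` is made up for illustration.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "friend_description": ["James is dumb", "Jacob is smart",
                               "Jane is pretty", "Susan is rich"],
        "friend_definition": ["dumb dude", "smart guy",
                              "she looks pretty", "she is rich"],
    }
)


def shared_words(description, definition):
    """Words of `description` that also occur in `definition`, in original order."""
    definition_words = set(definition.split())
    return [word for word in description.split() if word in definition_words]


overlaps = [
    shared_words(a, b)
    for a, b in zip(df["friend_description"], df["friend_definition"])
]
df["word_overlap"] = [" ".join(words) for words in overlaps]
df["overlap_count"] = [len(words) for words in overlaps]
print(df)
```

A word repeated in the description is counted once per occurrence here; deduplicating first would change that.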

python pandas

use*_*235

lucky-day

5
score

1
answer

440
views

Split sentence into words and non-white characters for POS Tagging

This was the question I got from an onsite interview with a tech firm, and one that I think ultimately killed my chances.

You're given a sentence, and a dictionary that has words as keys and parts of speech as values.

The goal is to write a function that, given a sentence, replaces each word with its part of speech from the dictionary, in order. We can assume that everything in the sentence is present …

python nlp

use*_*235

2019 05-22

3
score

1
answer

68
views

Computing transition probabilities in R

Suppose we have the following 4 states: (A, B, C, D)

My table has the following format:

old   new 
A      B
A      A
B      C
D      B
C      D
.      .
.      .
.      .
.      .


I want to compute the following probabilities from the data in the table:

P(new=A | old=A)
P(new=B | old=A)
P(new=C | old=A)
P(new=D | old=A)
P(new=A | old=B)
.
.
.
.
P(new=C | old=D)
P(new=D | old=D)


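I could do this by hand, counting how often each transition occurs and dividing by the number of rows, but I am wondering whether R has a built-in function that computes these probabilities, or at least helps speed up the calculation.

Any help or input would be appreciated. If there is no such function, oh well.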
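The question is about R, where `prop.table(table(df$old, df$new), margin = 1)` is a common way to get row-wise proportions (`df` is a placeholder name for the table). The sketch below is written in Python only to make the arithmetic concrete, and it uses just the five rows shown in the post as sample data. Each conditional probability is the count of an (old, new) transition divided by the number of rows with that `old` state, not by the total number of rows.

```python
from collections import Counter

# Sample data: only the five rows visible in the post.
transitions = [("A", "B"), ("A", "A"), ("B", "C"), ("D", "B"), ("C", "D")]
states = ["A", "B", "C", "D"]

pair_counts = Counter(transitions)
old_counts = Counter(old for old, _ in transitions)

# P(new | old) = count(old -> new) / count(old). An `old` state that never
# appears in the data is left out rather than guessed.
probabilities = {
    old: {new: pair_counts[(old, new)] / old_counts[old] for new in states}
    for old in states
    if old_counts[old] > 0
}

for old, row in probabilities.items():
    print(old, row)
```

Each inner dictionary sums to 1, which is a quick sanity check on any version of this calculation.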

use*_*235

lucky-day

2
score

1
answer

1514
views

I want to create a function that reads each line of the input, computes its sum, and saves it as sum.txt, using C++

Suppose I have the following input:


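stored in wow.txt.

I want to create a function that reads each line of the input, computes its sum, and saves the result as sum.txt, using C++.

The input file has the following properties:

1) We do not know the length of each line, but it contains at most 10 integers.
2) The integers are separated by spaces.

So I started with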

ifstream inFile;
inFile.open("wow.txt");
ofstream outFile;
outFile.open("sum.txt");


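and I am not sure what to do next.

Some of my friends suggested using getline, tokenizing each line, and then converting the strings to integers, but I am wondering whether there is a simpler way to do this without converting back and forth between types (int to string, string to int).

Any help would be appreciated.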

c++ iostream

use*_*235

2016 03-26

1
score

1
answer

10k
views

Changing a column's values based on a comparison of two other columns in pandas

For the following data table created in pandas,

Date        Score    Study_Date
02/2011      70       11/2012   
03/2011      72       11/2012   
10/2011      60       11/2012
12/2011      50       11/2012
01/2012      40       11/2012
02/2012      60       11/2012
03/2012      75       11/2012
11/2012      70       11/2012
12/2012      70       11/2012
01/2013      30       11/2012
02/2013      20       11/2012
04/2013      60       11/2012
06/2013      80       11/2012


I want to replace with 0 the score of every row whose date is earlier than the study date.

I tried the following:

df[df.Date < df.Study_Date, 'Score']=0


But I get:

TypeError: 'Series' objects are mutable, thus they cannot be hashed

Any help would be appreciated.
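A likely cause of the error is that plain `df[...]` treats `(mask, 'Score')` as a single key and tries to hash it, whereas `.loc` accepts a row mask and a column label together. The sketch below assumes both date columns are stored as `MM/YYYY` strings, which the post does not confirm, and parses them first so the comparison is chronological rather than alphabetical. It uses four of the rows from the table above.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "Date": ["02/2011", "11/2012", "12/2012", "01/2013"],
        "Score": [70, 70, 70, 30],
        "Study_Date": ["11/2012"] * 4,
    }
)

# Parse MM/YYYY strings so the comparison is chronological, not alphabetical.
date = pd.to_datetime(df["Date"], format="%m/%Y")
study_date = pd.to_datetime(df["Study_Date"], format="%m/%Y")

# .loc takes (row mask, column label); rows where the mask is True are updated.
df.loc[date < study_date, "Score"] = 0
print(df)
```

If the columns already hold datetime values, the two `pd.to_datetime` calls can be dropped and the columns compared directly.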

python indexing time-series pandas

use*_*235

2018 05-05

1
score

1
answer

1173
views

Converting a data frame into a table

I have a data frame in the following format:

 state1     state2     score
   A          A          3
   A          B          13
   A          C          5
   B          A          1
   B          B          0
   B          C          0
   C          A          5
   C          B          6
   C          C          3


I want to convert it into a table:

      A     B     C
A     3     13    5 
B     1     0     0
C     5     6     3


Is there a simple way to do this other than by hand?
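The post does not say which language the data frame lives in. Assuming a pandas DataFrame purely for illustration, this long-to-wide reshape is what `DataFrame.pivot` does, provided each (state1, state2) combination appears at most once, as it does in the sample.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "state1": list("AAABBBCCC"),
        "state2": list("ABCABCABC"),
        "score": [3, 13, 5, 1, 0, 0, 5, 6, 3],
    }
)

# One row per state1, one column per state2, cells filled from score.
table = df.pivot(index="state1", columns="state2", values="score")
print(table)
```

If a combination could appear more than once, `pivot` raises an error and an aggregating reshape such as `pivot_table` would be needed instead.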

use*_*235

2019 03-12

-1
score

1
answer

5075
views

Tag statistics

python ×4

pandas ×3

r ×2

algorithm ×1

c++ ×1

dataframe ×1

dictionary ×1

indexing ×1

iostream ×1

nlp ×1

pandas-groupby ×1

time-series ×1
