Posts by Sam*_*Sam

How to get the quarter start date in Python

I want to get the quarter start date for a given date

import pandas as pd

x="2018-02-07"
x=pd.to_datetime(x)
x=x-pd.offsets.QuarterBegin()
print(x)
# output: 2017-12-01 00:00:00


This is wrong; it should be "2018-01-01 00:00:00".

Can anyone help me see where I went wrong?
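For context, `pd.offsets.QuarterBegin()` defaults to `startingMonth=3`, so its quarters begin in March, June, September and December, which is why the subtraction lands on 2017-12-01. A minimal sketch of two ways to get the calendar-quarter start instead, assuming calendar quarters (January, April, July, October) are what is wanted; the variable names are illustrative only:

```python
import pandas as pd

x = pd.to_datetime("2018-02-07")

# Option 1: convert to a quarterly period and take the period's first instant.
start_via_period = x.to_period("Q").start_time

# Option 2: anchor QuarterBegin on January and roll back to the nearest
# quarter start (a date already on a quarter start is left unchanged).
start_via_offset = pd.offsets.QuarterBegin(startingMonth=1).rollback(x)

print(start_via_period)  # 2018-01-01 00:00:00
print(start_via_offset)  # 2018-01-01 00:00:00
```

Unlike subtracting the offset, `rollback` does not jump to the previous quarter when the input already falls on a quarter start.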

python datetime date python-3.x pandas

Sam*_*Sam

2018-11-07

5
Score

1
Answers

2408
Views

Model inference with a PyTorch dataset on GPU

I am running t5-base-grammar-correction for grammar correction on a dataframe with a text column

from happytransformer import HappyTextToText
from happytransformer import TTSettings
from tqdm.notebook import tqdm
tqdm.pandas()

happy_tt = HappyTextToText("T5",  "./t5-base-grammar-correction")
beam_settings =  TTSettings(num_beams=5, min_length=1, max_length=30)
def grammer_pipeline(text):
    text = "gec: " + text
    result = happy_tt.generate_text(text, args=beam_settings)
    
    return result.text

df['new_text'] =  df['original_text'].progress_apply(grammer_pipeline)


The pandas apply function runs and gives the desired result, but it is quite slow.

Also, I get the following warning when executing the code

/home/.local/lib/python3.6/site-packages/transformers/pipelines/base.py:908: UserWarning: You seem to be using the pipelines sequentially on GPU. In order to maximize efficiency please use a dataset
  UserWarning,


I have access to a GPU. Can anyone offer some guidance on speeding up execution and making full use of the GPU?
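One common way to address the "sequentially on GPU" warning is to send the texts through the model in batches rather than one row at a time. The sketch below is not the happytransformer API: it assumes the local `./t5-base-grammar-correction` directory is a standard seq2seq checkpoint that can be loaded directly with `transformers`, and the helper names (`iter_batches`, `correct_texts`, `make_t5_generate_batch`) and the batch size of 32 are hypothetical choices for illustration.

```python
from typing import Callable, Iterator, List, Sequence


def iter_batches(items: Sequence[str], batch_size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of `items` with at most `batch_size` elements."""
    for start in range(0, len(items), batch_size):
        yield list(items[start:start + batch_size])


def correct_texts(
    texts: Sequence[str],
    generate_batch: Callable[[List[str]], List[str]],
    batch_size: int = 32,
    prefix: str = "gec: ",
) -> List[str]:
    """Apply `generate_batch` to prefixed texts, one batch at a time."""
    results: List[str] = []
    for batch in iter_batches(texts, batch_size):
        results.extend(generate_batch([prefix + text for text in batch]))
    return results


def make_t5_generate_batch(model_dir: str, num_beams: int = 5, max_length: int = 30):
    """Build a batched generator (assumption: model_dir holds a seq2seq checkpoint)."""
    # Imported lazily so the batching helpers above work without these packages.
    import torch
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    device = "cuda" if torch.cuda.is_available() else "cpu"
    tokenizer = AutoTokenizer.from_pretrained(model_dir)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_dir).to(device).eval()

    def generate_batch(batch: List[str]) -> List[str]:
        # Pad to the longest text in the batch so it forms one tensor.
        encoded = tokenizer(
            batch, return_tensors="pt", padding=True, truncation=True
        ).to(device)
        with torch.no_grad():
            output_ids = model.generate(
                **encoded, num_beams=num_beams, min_length=1, max_length=max_length
            )
        return tokenizer.batch_decode(output_ids, skip_special_tokens=True)

    return generate_batch
```

With these helpers, the column could be filled with something like `df['new_text'] = correct_texts(df['original_text'].tolist(), make_t5_generate_batch("./t5-base-grammar-correction"))`. The best batch size depends on GPU memory and text length, so it is worth tuning.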

- - - - - …

gpu dataset pytorch huggingface-transformers

Sam*_*Sam

2022-05-13

5
Score

0
Answers

3400
Views

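Create a new column by finding exact words in a string column

I want to create a new column containing 1 or 0, depending on whether any word in a list exactly matches a word in the dataframe's string column.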

list_provided=["mul","the"]
#how my dataframe looks
id  text
a    simultaneous there the
b    simultaneous there
c    mul why


Expected output

id  text                     found
a    simultaneous there the   1
b    simultaneous there       0
c    mul why                  1


The second row is assigned 0 because neither "mul" nor "the" is an exact match in the string column "text"

Code tried so far

#For exact match I am using the below code
data["Found"]=np.where(data["text"].str.contains(r'(?:\s|^)penalidades(?:\s|$)'),1,0)


How can I loop over the provided list to find exact matches for all of its words?
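A minimal sketch of one way to do this without an explicit loop, assuming "exact match" means a whole word delimited by regex word boundaries: join the escaped words into a single alternation and wrap it in `\b`. A plain `'|'.join(...)` pattern without boundaries would also match substrings such as "mul" inside "simultaneous". The lowercase column name `found` follows the expected output shown earlier.

```python
import re

import pandas as pd

data = pd.DataFrame(
    {
        "id": ["a", "b", "c"],
        "text": ["simultaneous there the", "simultaneous there", "mul why"],
    }
)
list_provided = ["mul", "the"]

# Escape each word, then require a word boundary on both sides.
# The non-capturing group (?:...) keeps the boundaries applied to every word.
pattern = r"\b(?:" + "|".join(re.escape(word) for word in list_provided) + r")\b"

data["found"] = data["text"].str.contains(pattern, regex=True).astype(int)
print(data)
```

Here rows a and c get 1 and row b gets 0, because "there" and "simultaneous" only contain the listed words as substrings.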

Edit: If I use str.contains(pattern) as suggested by Georgey, all rows of data["Found"] become 1

import numpy as np
import pandas as pd

data=pd.DataFrame({"id":("a","b","c","d"), "text":("simultaneous there the","simultaneous there","mul why","mul")})
list_of_word=["mul","the"]
pattern = '|'.join(list_of_word)
data["Found"]=np.where(data["text"].str.contains(pattern),1,0)

Output:
id  text                     found
a    simultaneous …


python string dataframe python-3.x pandas

Sam*_*Sam

2018-04-18

3
Score

1
Answers

4030
Views

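How to merge different Excel files into one workbook with different sheet names in Python

I have two Excel workbooks.

One has 3 sheets and the other has only one. I am trying to merge the two into a single workbook. That workbook should have 4 sheets.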

import glob
import os

import pandas as pd
from pandas import ExcelWriter

writer = ExcelWriter("Sample.xlsx")

for filename in glob.glob("*.xlsx"):
    df_excel = pd.read_excel(filename)

    (_, f_name) = os.path.split(filename)
    (f_short_name, _) = os.path.splitext(f_name)

    df_excel.to_excel(writer, f_short_name, index=False)

writer.save()


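Doing this gives me a workbook, but with only two sheets: the first sheet of the first workbook and the second sheet of the second workbook.

How can I get all 4 sheets in one workbook?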
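By default `pd.read_excel(filename)` reads only the first sheet of a workbook, which would explain ending up with one sheet per file. Passing `sheet_name=None` returns a dict of every sheet instead. The sketch below builds on that; the helper names `sheet_label` and `merge_workbooks` and the `file_sheet` naming scheme are hypothetical, and an Excel engine such as openpyxl is assumed to be installed.

```python
import os

import pandas as pd


def sheet_label(path, sheet_name, used):
    """Build a unique sheet name of at most 31 characters (Excel's limit)."""
    stem = os.path.splitext(os.path.basename(path))[0]
    base = f"{stem}_{sheet_name}"[:31]
    label, counter = base, 1
    while label in used:
        # Append a numeric suffix, trimming the base so the limit still holds.
        suffix = f"_{counter}"
        label = base[:31 - len(suffix)] + suffix
        counter += 1
    used.add(label)
    return label


def merge_workbooks(paths, out_path):
    """Copy every sheet of every workbook in `paths` into one new workbook."""
    used = set()
    with pd.ExcelWriter(out_path) as writer:
        for path in paths:
            # sheet_name=None loads all sheets as {sheet name: DataFrame}.
            sheets = pd.read_excel(path, sheet_name=None)
            for sheet_name, frame in sheets.items():
                frame.to_excel(
                    writer,
                    sheet_name=sheet_label(path, sheet_name, used),
                    index=False,
                )
```

A call such as `merge_workbooks(["first.xlsx", "second.xlsx"], "Sample.xlsx")` (file names illustrative) would then write all four sheets. Passing the input paths explicitly, rather than globbing `*.xlsx`, also avoids picking up the output file on a second run.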

python excel python-3.x pandas

Sam*_*Sam

2018-08-03

2
Score

1
Answers

1609
Views