I have a DataFrame whose txt column contains a list. I want to clean the txt column using the function clean_text().
import re
import pandas

data = {'value':['abc.txt', 'cda.txt'], 'txt':['[''2019/01/31-11:56:23.288258 1886 7F0ED4CDC704 asfasnfs: remove datepart'']',
'[''2019/02/01-11:56:23.288258 1886 7F0ED4CDC704 asfasnfs: remove datepart'']']}
df = pandas.DataFrame(data=data)

def clean_text(text):
    """
    :param text: it is the plain text
    :return: cleaned text
    """
    patterns = [r"^{53}",
                r"[A-Za-z]+[\d]+[\w]*|[\d]+[A-Za-z]+[\w]*",
                r"[-=/':,?${}\[\]-_()>.~" ";+]"]
    for p in patterns:
        text = re.sub(p, '', text)
    return text
My solution:
df['txt'] = df['txt'].apply(lambda x: clean_text(x))
But I get the following error:
sre_constants.error: nothing to repeat at position 1
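The error most likely comes from the first pattern: in `^{53}` the quantifier `{53}` has nothing to repeat, because `^` is only an anchor. Assuming the intent was to drop the first 53 characters of each string (my guess, the question does not say so), a `.` before the quantifier gives it something to repeat. A minimal sketch of that change, with the third pattern written as one raw string equal to the two concatenated literals above:

```python
import re

def clean_text(text):
    """Return text with the leading 53 characters, mixed
    letter/digit tokens and punctuation removed."""
    patterns = [
        r"^.{53}",  # assumption: "any 53 characters at the start"
        r"[A-Za-z]+[\d]+[\w]*|[\d]+[A-Za-z]+[\w]*",
        r"[-=/':,?${}\[\]-_()>.~;+]",
    ]
    for p in patterns:
        text = re.sub(p, '', text)
    return text

# Made-up input: 53 filler characters followed by plain words.
sample = "x" * 53 + "hello world"
cleaned = clean_text(sample)
```

With this version `df['txt'].apply(clean_text)` should run without the "nothing to repeat" error; whether 53 is the right prefix length for the real data is something to check against the actual strings.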
I am trying to load data from an S3 bucket into an Amazon RDS database. I know this is not a programming question, but I would really appreciate some help. I used the command below:
aws rds restore-db-instance-from-s3 ^
--allocated-storage 250 ^
--db-instance-identifier myidentifier ^
--db-instance-class db.m4.large ^
--engine mysql ^
--master-user-name masterawsuser ^
--master-user-password masteruserpassword ^
--s3-bucket-name mybucket ^
--s3-ingestion-role-arn arn:aws:iam::account-number:role/rolename ^
--s3-prefix bucketprefix ^
--source-engine mysql ^
--source-engine-version 5.6.27
But I get the following error, even though I supplied the correct ARN:
“An error occurred (InvalidParameterValue) when calling the RestoreDBInstanceFromS3 operation: IAM role ARN value is invalid or does not include the required permissions for: S3_SNAPSHOT_INGESTION”
Any comments on this?
Thanks
I want to sort this list so that the .log file comes first and the .gz files follow in descending order.
my_list = [
'/abc/a.log.1.gz',
'/abc/a.log',
'/abc/a.log.30.gz',
'/abc/a.log.2.gz',
'/abc/a.log.5.gz',
'/abc/a.log.3.gz',
'/abc/a.log.6.gz',
'/abc/a.log.4.gz',
'/abc/a.log.12.gz',
'/abc/a.log.10.gz',
'/abc/a.log.8.gz',
'/abc/a.log.14.gz',
'/abc/a.log.29.gz'
]
Expected result:
my_list = ['/abc/a.log',
'/abc/a.log.30.gz',
'/abc/a.log.29.gz',
'/abc/a.log.14.gz',
'/abc/a.log.12.gz',
'/abc/a.log.10.gz',
'/abc/a.log.8.gz',
'/abc/a.log.6.gz',
'/abc/a.log.5.gz',
'/abc/a.log.4.gz',
'/abc/a.log.3.gz',
'/abc/a.log.2.gz',
'/abc/a.log.1.gz']
reversed(my_list) did not give me the desired result either.
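reversed() only flips the existing order, and a plain string sort compares the numbers character by character (so "10" sorts before "2"). One possible approach is a sort key that pulls out the rotation number as an integer. This is a sketch; the helper name `log_sort_key` is my own, and it assumes every rotated file ends in `.log.<number>.gz`:

```python
import re

my_list = [
    '/abc/a.log.1.gz', '/abc/a.log', '/abc/a.log.30.gz',
    '/abc/a.log.2.gz', '/abc/a.log.5.gz', '/abc/a.log.3.gz',
    '/abc/a.log.6.gz', '/abc/a.log.4.gz', '/abc/a.log.12.gz',
    '/abc/a.log.10.gz', '/abc/a.log.8.gz', '/abc/a.log.14.gz',
    '/abc/a.log.29.gz',
]

def log_sort_key(path):
    """Hypothetical helper: plain .log first, then .gz by number, descending."""
    match = re.search(r"\.log\.(\d+)\.gz$", path)
    if match is None:
        return (0, 0)                    # the un-rotated .log file goes first
    return (1, -int(match.group(1)))     # negate so larger numbers come earlier

sorted_list = sorted(my_list, key=log_sort_key)
```

The tuple key keeps the two groups apart, so the `.log` entry stays in front regardless of the numbers in the `.gz` names.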
I have a list of lists as given below, and I want to convert it into a data frame in the desired format.
myList:
[[1]]
NULL
[[2]]
[[2]]$`file`
[1] "ABC"
[[2]]$New
[1] 21
[[2]]$Old
[1] 42
[[3]]
[[3]]$`file`
[1] "CDF"
[[3]]$NEW
[1] 206
[[3]]$Old
[1] 84
I want to convert this list of lists into a data frame in the following format:
file New Old
ABC 21 42
CDF 206 84
Thanks in advance!
I have a DataFrame with two columns. One column contains the categories and the other contains the values.
import pandas as pd
data={"category":["Topic1","Topic2","Topic3","Topic2","Topic1","Topic3"], "value":["hello","hey","hi","name","valuess","python"]}
df=pd.DataFrame(data=data)
I want the distinct categories turned into columns, as shown below.
Current input:
category value
Topic1 hello
Topic2 hey
Topic3 hi
Topic2 name
Topic1 valuess
Topic3 python
Desired output:
Topic1 Topic2 Topic3
hello hey hi
valuess name python
I tried transposing the DataFrame but did not get the expected result.
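Transposing only swaps rows and columns; it does not group values by category. One way that may work is to number each value within its category and then pivot on that number. This is a sketch: the helper column name `row` is my own, and it assumes every category has the same number of values (otherwise the gaps come out as NaN):

```python
import pandas as pd

data = {"category": ["Topic1", "Topic2", "Topic3", "Topic2", "Topic1", "Topic3"],
        "value": ["hello", "hey", "hi", "name", "valuess", "python"]}
df = pd.DataFrame(data=data)

# 0 for the first value seen in a category, 1 for the second, and so on.
numbered = df.assign(row=df.groupby("category").cumcount())

# One column per category, one row per occurrence number.
wide = numbered.pivot(index="row", columns="category", values="value")

# Drop the leftover axis names so only Topic1..Topic3 show as headers.
wide = wide.rename_axis(index=None, columns=None)
```

`wide` then has the columns Topic1, Topic2 and Topic3, with the values in the order they appeared in the input.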
I have a column in a DataFrame that contains string values, as shown below:
sortdf=pd.DataFrame(data= {'col1':["hello are you","what happenend","hello you there","issue is in our program","whatt is your name"]})
I want to sort the words within each element alphabetically.
Desired output:
col1
0 are hello you
1 happenend what
2 hello there you
3 in is issue our program
4 is name whatt your
I tried to do this with the following code:
sortdf['col1'].sort()
But this code does not work.
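Sorting the column would reorder the rows, not the words inside each string. A sketch of one alternative: split each string into words, sort them, and join them back (this assumes the words are separated by whitespace):

```python
import pandas as pd

sortdf = pd.DataFrame(data={'col1': ["hello are you", "what happenend",
                                     "hello you there",
                                     "issue is in our program",
                                     "whatt is your name"]})

# Split on whitespace, sort the words, and re-join with single spaces.
sortdf['col1'] = sortdf['col1'].apply(lambda s: " ".join(sorted(s.split())))
```

Note that sorted() compares strings case-sensitively, so capitalised words would sort ahead of lowercase ones; passing `key=str.lower` is an option if that matters.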
I have a list 'abc' (of strings), and I am trying to remove the words in the list 'stop', as well as all numbers, from the strings in 'abc'.
abc=[ 'issues in performance 421',
'how are you doing',
'hey my name is abc, 143 what is your name',
'attention pleased',
'compliance installed 234']
stop=['attention', 'installed']
I am using a list comprehension to remove them, but the code below fails to remove the words.
new_word=[word for word in abc if word not in stop ]
Result (note that the words are still there):
['issues in performance',
'how are you doing',
'hey my name is abc, what is your name',
'attention pleased',
'compliance installed']
Desired output:
['issues in performance',
'how are you doing',
'hey my name is abc, what is your name',
'pleased',
'compliance']
Thanks
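The comprehension compares each whole sentence against `stop`, so nothing ever matches. A sketch of one way around that is to split every sentence into words and filter at the word level. The name `cleaned` is my own, and the code assumes numbers appear as separate whitespace-delimited tokens:

```python
abc = ['issues in performance 421',
       'how are you doing',
       'hey my name is abc, 143 what is your name',
       'attention pleased',
       'compliance installed 234']
stop = ['attention', 'installed']

cleaned = []
for sentence in abc:
    # Keep a word only if it is not a stop word and not made up of digits.
    kept = [w for w in sentence.split() if w not in stop and not w.isdigit()]
    cleaned.append(" ".join(kept))
```

For a long stop list, turning `stop` into a set would make the membership test faster without changing the result.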