mvx*_*mvx 5 dataframe python-3.x pandas
EDITED AS PER COMMENTS
Background: Here is what the current dataframe looks like. The row labels are descriptive text in the original Excel file, but I hope this small reproduction of the data is enough to work out a solution. The actual file has about 100 columns and 200 rows.
The column headers and the row 0 values repeat in the pattern shown below, except that the title text (Sales, Validation, and so on) changes at every column that has an actual title.
There is one more column before Sales with text in each row. The x positions below were made up for this test. Unfortunately, I found no elegant way to display that text as part of the output below.
Sales Unnamed: 2 Unnamed: 3 Validation Unnamed: 5 Unnamed: 6
0 Commented No comment Commented No comment
1 x x
2 x x
3 x x
Expected Output: Each x is replaced with 0, 1, or 2, depending on which column it is in (Commented / No comment).
Sales Unnamed: 2 Unnamed: 3 Validation Unnamed: 5 Unnamed: 6
0 Commented No comment Commented No comment
1 0 1
2 2 0
3 1 2
Possible Code: I assume the loop would look something like this:
while in row 9:
    if column value = "commented":
        replace all "x" with 1
    elif column value = "no comment":
        replace all "x" with 2
    else:
        replace all "x" with 0
But being a Python novice, I am not sure how to turn this into working code. I'd appreciate any help.
Here is one way to do it:
import re

def replaceX(col):
    cond = ~((col == "x") | (col == "X"))
    # Check if the name of the column is undefined
    if not re.match(r'Unnamed: \d+', col.name):
        return col.where(cond, 0)
    else:
        # Check what is the value of the first row
        if col.iloc[0] == "Commented":
            return col.where(cond, 1)
        elif col.iloc[0] == "No comment":
            return col.where(cond, 2)
    return col
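Alternatively, if the first row never contains "Commented" or "No comment" in the titled columns, you can use a solution without a regular expression: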
def replaceX(col):
    cond = ~((col == "x") | (col == "X"))
    # Check what is the value of the first row
    if col.iloc[0] == "Commented":
        return col.where(cond, 1)
    elif col.iloc[0] == "No comment":
        return col.where(cond, 2)
    return col.where(cond, 0)
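To make the difference between the two variants concrete, here is a small sketch. The names `replace_x_by_name` and `replace_x_by_first_row` are just labels chosen here for the two versions of `replaceX`, and the `Sales` column whose first row reads "Commented" is a made-up edge case, not data from the question. The Series is created with object dtype on the assumption that the real columns hold mixed text and numbers.

```python
import re
import pandas as pd

def replace_x_by_name(col):
    # First variant: a titled column always maps x to 0, whatever row 0 says.
    cond = ~((col == "x") | (col == "X"))
    if not re.match(r"Unnamed: \d+", col.name):
        return col.where(cond, 0)
    if col.iloc[0] == "Commented":
        return col.where(cond, 1)
    if col.iloc[0] == "No comment":
        return col.where(cond, 2)
    return col

def replace_x_by_first_row(col):
    # Second variant: only the value in row 0 decides the replacement.
    cond = ~((col == "x") | (col == "X"))
    if col.iloc[0] == "Commented":
        return col.where(cond, 1)
    if col.iloc[0] == "No comment":
        return col.where(cond, 2)
    return col.where(cond, 0)

# Hypothetical edge case: a titled column whose first row reads "Commented".
col = pd.Series(["Commented", "x", ""], name="Sales", dtype=object)

by_name = replace_x_by_name(col).tolist()
by_first_row = replace_x_by_first_row(col).tolist()
print(by_name)       # ['Commented', 0, '']
print(by_first_row)  # ['Commented', 1, '']
```

If that situation cannot occur in your file, both variants give the same result for titled, "Commented", and "No comment" columns.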
# Apply the function to every column (axis defaults to 0, so it runs column by column).
# apply returns a new DataFrame, so assign the result to keep it.
df = df.apply(replaceX)
Output:
  title  Unnamed: 2  Unnamed: 3
0         Commented  No comment
1
2     0                       2
3                 1
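For anyone who wants to try this on the six-column layout from the question, here is a minimal self-contained sketch. The x positions are made up, since the question does not show which cells hold an x, and empty strings stand in for blank cells. It also assumes the columns have object dtype; with a dedicated string dtype, `where` may refuse an integer replacement value.

```python
import pandas as pd

def replaceX(col):
    # True where the cell is not an x marker; those cells are kept unchanged.
    cond = ~((col == "x") | (col == "X"))
    # The value in the first row decides which number replaces the x marks.
    if col.iloc[0] == "Commented":
        return col.where(cond, 1)
    elif col.iloc[0] == "No comment":
        return col.where(cond, 2)
    return col.where(cond, 0)

# Made-up sample in the shape of the question's table (x positions are assumptions).
df = pd.DataFrame(
    {
        "Sales":      ["", "x", "", ""],
        "Unnamed: 2": ["Commented", "", "", "x"],
        "Unnamed: 3": ["No comment", "", "x", ""],
        "Validation": ["", "", "x", ""],
        "Unnamed: 5": ["Commented", "x", "", ""],
        "Unnamed: 6": ["No comment", "", "", "x"],
    },
    dtype=object,
)

# Runs replaceX once per column and returns a new DataFrame.
result = df.apply(replaceX)
print(result)
```

The header row (row 0) is left in place, so the "Commented" / "No comment" labels survive. If only the data rows are needed afterwards, `result.iloc[1:]` drops it.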