Replace values in multiple untitled columns with 0, 1, or 2 depending on the column

mvx*_*mvx 5 dataframe python-3.x pandas

EDITED AS PER COMMENTS

Background: Here is what the current DataFrame looks like. The row labels are informational text in the original Excel file, but I hope this small reproduction of the data is enough for a solution. The actual file has about 100 columns and 200 rows.

The column headers and the row 0 values repeat in the pattern shown below, except that the Sales or Validation text changes at each occurrence of a titled column.

There is one more column before Sales, with text in each row. The x marks were placed for this test. Unfortunately, I found no elegant way to display that text as part of the output below.

 Sales Unnamed: 2  Unnamed: 3  Validation Unnamed: 5 Unnamed: 6
0       Commented  No comment             Commented  No comment                                   
1     x                                             x                        
2                            x          x                                                
3                x                                             x             

Expected Output: Each x replaced with 0, 1, or 2 depending on which column it is in (1 for Commented, 2 for No comment, 0 otherwise).

 Sales Unnamed: 2  Unnamed: 3  Validation Unnamed: 5 Unnamed: 6
0       Commented  No comment             Commented  No comment                                   
1     0                                            1                        
2                            2          0                                                
3                1                                             2  

Possible Code: I assume the loop would look something like this:

while in row 9:
    if column value == "commented":
        replace all "x" with 1
    elif column value == "no comment":
        replace all "x" with 2
    else:
        replace all "x" with 0

But being a Python novice, I am not sure how to convert this into working code. I'd appreciate any help.

Smi*_*rod 1

Here is one way to do it:

  1. Define a function to replace the x:
import re

def replaceX(col):
    cond = ~((col == "x") | (col == "X"))
    # Check if the name of the column is undefined
    if not re.match(r'Unnamed: \d+', col.name):
        return col.where(cond, 0)
    else:
        # Check what is the value of the first row
        if col.iloc[0] == "Commented":
            return col.where(cond, 1)
        elif col.iloc[0] == "No comment":
            return col.where(cond, 2)
    return col

Alternatively, if the first row never contains "Commented" or "No comment" for the titled columns, you can use a solution without regex:

def replaceX(col):
    cond = ~((col == "x") | (col == "X"))
    # Check what is the value of the first row
    if col.iloc[0] == "Commented":
        return col.where(cond, 1)
    elif col.iloc[0] == "No comment":
        return col.where(cond, 2)
    return col.where(cond, 0)
  2. Apply this function on the DataFrame:
# Apply the function to every column (axis is not specified, so it defaults to 0).
# apply returns a new DataFrame, so assign the result back.
df = df.apply(replaceX)

Output:

  title Unnamed: 2  Unnamed: 3
0        Commented  No comment
1                             
2     0                      2
3                1            
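For anyone who wants to try this end to end, here is a minimal sketch that wires the two steps together. The sample frame is hypothetical: it only imitates the layout described in the question (titled columns followed by "Unnamed: N" columns whose first row holds "Commented" / "No comment"), and the name replace_x and the use of isin are illustrative choices rather than part of the answer above.

```python
import pandas as pd

# Hypothetical sample imitating the question's layout; dtype=object keeps
# the cells as plain Python objects so strings and ints can share a column.
df = pd.DataFrame(
    {
        "Sales":      ["", "x", "", ""],
        "Unnamed: 2": ["Commented", "", "", "x"],
        "Unnamed: 3": ["No comment", "", "x", ""],
        "Validation": ["", "", "x", ""],
        "Unnamed: 5": ["Commented", "x", "", ""],
        "Unnamed: 6": ["No comment", "", "", "x"],
    },
    dtype=object,
)

def replace_x(col):
    # True where the cell is NOT an x marker; where() keeps those cells
    keep = ~col.isin(["x", "X"])
    # The first row decides which number replaces the x markers
    if col.iloc[0] == "Commented":
        return col.where(keep, 1)
    if col.iloc[0] == "No comment":
        return col.where(keep, 2)
    return col.where(keep, 0)

# apply passes each column to replace_x as a Series and rebuilds the frame
result = df.apply(replace_x)
print(result)
```

The first row is left untouched because "Commented" and "No comment" are not x markers, so only the marked cells change.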

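Documentation:

  • apply: applies a function to each column/row, depending on the axis
  • where: checks a condition on a Series and keeps the values where it holds; where it does not hold, replaces them with the specified value.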
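Because where keeps values where the condition is True and replaces the rest, the condition has to be written as "is not an x", which is easy to get backwards. A small sketch on a made-up Series (the values are illustrative only) shows the behavior, along with mask, which is the mirror image:

```python
import pandas as pd

# Made-up values for illustration only
s = pd.Series(["x", "", "X", "note"], dtype=object)

# where: keep values where the condition is True, replace the others with 1
keep = ~s.isin(["x", "X"])
replaced = s.where(keep, 1)
print(replaced.tolist())  # [1, '', 1, 'note']

# mask: replace values where the condition is True, so no negation is needed
same = s.mask(s.isin(["x", "X"]), 1)
print(same.tolist())  # [1, '', 1, 'note']
```

Either form works inside the function above; mask may read more naturally when the condition describes the cells to be replaced.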