How to iterate over the columns in a row to find the first column that meets a condition

Question

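I need to iterate over the columns in each row of a DataFrame to find the first cell in that row that is entirely uppercase. I need to repeat this for every row and end up with a DataFrame containing a single column, where each row holds the first uppercase string from the corresponding input row.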

For example, this could be the input DataFrame:

+-----+--------+--------+--------+------+
|  0  |   1    |   2    |   3    |  4   |
+-----+--------+--------+--------+------+
| a   | Amount | SEQ    | LTOTAL | None |
| BBc | LCALC  | None   | None   | None |
| c   | LCALC  | None   | None   | None |
| Dea | RYR    | LTOTAL | None   | None |
+-----+--------+--------+--------+------+

I need the following output in a separate DataFrame:

+-------+
| SEQ   |
| LCALC |
| LCALC |
| RYR   |
+-------+

Answer 1

jez*_*ael 6

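If you need to test every value with isupper and replace the non-matching values with NaN using where, you can then back fill the missing values along each row with bfill and select the first column with iloc: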

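# isinstance guards against None/NaN cells, which have no isupper method
# (in pandas >= 2.1, applymap is deprecated in favor of DataFrame.map)
df = df.where(df.applymap(lambda x: isinstance(x, str) and x.isupper())).bfill(axis=1).iloc[:, 0].to_frame('col')
print(df)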
     col
0    SEQ
1  LCALC
2  LCALC
3    RYR
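
To try the mask-and-back-fill idea on the question's sample data, here is a minimal self-contained sketch. It assumes the None cells are real Python None values rather than the string 'None'. The helper name is_upper_str and the variable first_upper are made up for illustration, and the mask is built column by column with Series.map so that it does not depend on which of applymap / DataFrame.map the installed pandas provides.

```python
import pandas as pd

# Sample data from the question; None cells are assumed to be real None values.
df = pd.DataFrame([
    ["a", "Amount", "SEQ", "LTOTAL", None],
    ["BBc", "LCALC", None, None, None],
    ["c", "LCALC", None, None, None],
    ["Dea", "RYR", "LTOTAL", None, None],
])

def is_upper_str(x):
    # Hypothetical helper: True only for strings that are entirely uppercase.
    return isinstance(x, str) and x.isupper()

# Boolean mask with the same shape as df, built one column at a time.
mask = df.apply(lambda col: col.map(is_upper_str))

# Keep uppercase cells, turn everything else into NaN, pull the first
# surviving value of each row into column 0, and keep only that column.
first_upper = df.where(mask).bfill(axis=1).iloc[:, 0].to_frame("col")
print(first_upper)
```

A row with no uppercase cell at all would come out as NaN here, since there is nothing to back fill from.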

Edit:

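Create df1 with columns ordered by the position of the matched values, so the first column holds the first uppercase value in each row, the second column the second one, and so on: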

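# Reshape to long form with stack; dropna covers pandas versions where stack keeps None/NaN entries
# Then remove the second level of the MultiIndex (the original column labels)
s = df.stack().dropna().reset_index(level=1, drop=True)
# Keep only the uppercase values and convert to a DataFrame
df1 = s[s.str.isupper()].rename_axis('idx').reset_index(name='val')
# Counter column: 0 for the first uppercase value in a row, 1 for the second, ...
df1['g'] = df1.groupby('idx').cumcount()
# Pivot to wide form (keyword arguments are required in pandas >= 2.0),
# then reindex to restore any rows that had no uppercase values
df1 = df1.pivot(index='idx', columns='g', values='val').reindex(df.index)
print(df1)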
g      0       1
0    SEQ  LTOTAL
1  LCALC     NaN
2  LCALC     NaN
3    RYR  LTOTAL

first = df1[0].to_frame('col')
second = df1[1].to_frame('col')
print (first)
    col
0    SEQ
1  LCALC
2  LCALC
3    RYR

print (second)
      col
0  LTOTAL
1     NaN
2     NaN
3  LTOTAL
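
If only the first uppercase value is needed, a plain row-wise loop is another option. The sketch below is not the approach used in this answer, just an illustrative alternative that follows the question's wording more literally. It assumes the None cells are real None values, and the function name first_upper_in_row is hypothetical.

```python
import pandas as pd

# Sample data from the question; None cells are assumed to be real None values.
df = pd.DataFrame([
    ["a", "Amount", "SEQ", "LTOTAL", None],
    ["BBc", "LCALC", None, None, None],
    ["c", "LCALC", None, None, None],
    ["Dea", "RYR", "LTOTAL", None, None],
])

def first_upper_in_row(row):
    # Hypothetical helper: walk the cells left to right and return the first
    # all-uppercase string, or None if the row contains no such cell.
    return next((v for v in row if isinstance(v, str) and v.isupper()), None)

# axis=1 passes each row to the helper as a Series.
out = df.apply(first_upper_in_row, axis=1).to_frame("col")
print(out)
```

This stops scanning a row at the first match, but it runs Python code once per row, so the vectorized versions are likely to scale better on large frames.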
