Iterating over a list by index in a loop to reformat strings

Mat*_*ien 16 python loops list

I have a list that looks like this, extracted from a poorly formatted CSV file:

DF = [['Customer Number: 001 '],
 ['Notes: Bought a ton of stuff and was easy to deal with'],
 ['Customer Number: 666 '],
 ['Notes: acted and looked like Chris Farley on that hidden decaf skit from SNL'],
 ['Customer Number: 103 '],
 ['Notes: bought a ton of stuff got a free keychain'],
 ['Notes: gave us a referral to his uncles cousins hairdresser'],
 ['Notes: name address birthday social security number on file'],
 ['Customer Number: 007 '],
 ['Notes: looked a lot like James Bond'],
 ['Notes: came in with a martini']]

I would like to end up with a new structure like this:

['Customer Number: 001 Notes: Bought a ton of stuff and was easy to deal with',
 'Customer Number: 666 Notes: acted and looked like Chris Farley on that hidden decaf skit from SNL',
 'Customer Number: 103 Notes: bought a ton of stuff got a free keychain',
 'Customer Number: 103 Notes: gave us a referral to his uncles cousins hairdresser',
 'Customer Number: 103 Notes: name address birthday social security number on file',
 'Customer Number: 007 Notes: looked a lot like James Bond',
 'Customer Number: 007 Notes: came in with a martini']

After that I can split it further, strip it, and so on.

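So I used the following facts:

  • A customer number always starts with "Customer Number"
  • Notes are always longer
  • There are never more than 5 Notes

to write a solution that is obviously absurd, even though it works.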

DF = [item for sublist in DF for item in sublist]
DF = DF + ['stophere']
DF2 = []

for record in DF:
    if (record[0:17]=="Customer Number: ") & (record !="stophere"):
        DF2.append(record + DF[DF.index(record)+1])
        if len(DF[DF.index(record)+2]) >21:
            DF2.append(record + DF[DF.index(record)+2])
            if len(DF[DF.index(record)+3]) >21:
                DF2.append(record + DF[DF.index(record)+3])
                if len(DF[DF.index(record)+4]) >21:
                    DF2.append(record + DF[DF.index(record)+4])
                    if len(DF[DF.index(record)+5]) >21:
                        DF2.append(record + DF[DF.index(record)+5])

Would anyone mind recommending a more robust and smarter solution for this kind of problem?

Pad*_*ham 13

Just keep track of when we find a new customer:

from pprint import pprint as pp

out = []
for sub in DF:
    if sub[0].startswith("Customer Number"):
        cust = sub[0]
    else:
        out.append(cust + sub[0])
pp(out)

Output:

['Customer Number: 001 Notes: Bought a ton of stuff and was easy to deal with',
 'Customer Number: 666 Notes: acted and looked like Chris Farley on that '
 'hidden decaf skit from SNL',
 'Customer Number: 103 Notes: bought a ton of stuff got a free keychain',
 'Customer Number: 103 Notes: gave us a referral to his uncles cousins '
 'hairdresser',
 'Customer Number: 103 Notes: name address birthday social security number '
 'on file',
 'Customer Number: 007 Notes: looked a lot like James Bond',
 'Customer Number: 007 Notes: came in with a martini']
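Since the question mentions splitting and stripping the strings afterwards, the same "remember the current customer" loop can do that in one pass. The following is only a sketch: the function name iter_customer_notes is hypothetical, and it assumes every row is a one-element list and that rows appearing before the first customer can be skipped.

```python
def iter_customer_notes(rows):
    """Yield (customer_number, note) pairs from rows shaped like DF."""
    number = None  # no customer seen yet
    for row in rows:
        text = row[0].strip()
        if text.startswith("Customer Number:"):
            # remember the current customer, keeping only the number part
            number = text.split(":", 1)[1].strip()
        elif number is not None:
            # drop the leading "Notes:" label if it is present
            if text.startswith("Notes:"):
                text = text.split(":", 1)[1].strip()
            yield number, text
        # rows before the first customer are skipped (an assumption)


sample = [['Customer Number: 103 '],
          ['Notes: bought a ton of stuff got a free keychain'],
          ['Notes: gave us a referral to his uncles cousins hairdresser'],
          ['Customer Number: 007 '],
          ['Notes: came in with a martini']]

pairs = list(iter_customer_notes(sample))
```

Here pairs holds tuples such as ('103', 'bought a ton of stuff got a free keychain'), which may be easier to load into a table than the concatenated strings.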

If a customer can appear again later and you want their entries grouped together, use a dict:

from collections import defaultdict
d = defaultdict(list)
for sub in DF:
    if sub[0].startswith("Customer Number"):
        cust = sub[0]
    else:
        d[cust].append(cust + sub[0])
print(d)

Output:

pp(d)

{'Customer Number: 001 ': ['Customer Number: 001 Notes: Bought a ton of '
                           'stuff and was easy to deal with'],
 'Customer Number: 007 ': ['Customer Number: 007 Notes: looked a lot like '
                           'James Bond',
                           'Customer Number: 007 Notes: came in with a '
                           'martini'],
 'Customer Number: 103 ': ['Customer Number: 103 Notes: bought a ton of '
                           'stuff got a free keychain',
                           'Customer Number: 103 Notes: gave us a referral '
                           'to his uncles cousins hairdresser',
                           'Customer Number: 103 Notes: name address '
                           'birthday social security number on file'],
 'Customer Number: 666 ': ['Customer Number: 666 Notes: acted and looked '
                           'like Chris Farley on that hidden decaf skit '
                           'from SNL']}
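The sample data never actually repeats a customer, so here is a small sketch of the repeated case. The rows below are made up for illustration, and stripping the header before using it as a key is my own assumption (it avoids the trailing space in the keys).

```python
from collections import defaultdict

# made-up rows: customer 001 shows up twice, with 007 in between
rows = [['Customer Number: 001 '],
        ['Notes: first visit'],
        ['Customer Number: 007 '],
        ['Notes: came in with a martini'],
        ['Customer Number: 001 '],
        ['Notes: second visit']]

notes_by_customer = defaultdict(list)
cust = None  # no customer seen yet
for sub in rows:
    if sub[0].startswith("Customer Number"):
        cust = sub[0].strip()
    elif cust is not None:
        # both visits of 001 end up in the same list
        notes_by_customer[cust].append(sub[0])

grouped = dict(notes_by_customer)
```

With a plain list the two 001 blocks would stay apart in the output. With the dict they are merged under one key, in the order the notes were seen.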

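Based on your comment and the error, it seems you have lines before the first actual customer, so we can attach them to the first customer in the list: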

# added ["foo"] before we see any customer

DF = [["foo"],['Customer Number: 001 '],
 ['Notes: Bought a ton of stuff and was easy to deal with'],
 ['Customer Number: 666 '],
 ['Notes: acted and looked like Chris Farley on that hidden decaf skit from SNL'],
 ['Customer Number: 103 '],
 ['Notes: bought a ton of stuff got a free keychain'],
 ['Notes: gave us a referral to his uncles cousins hairdresser'],
 ['Notes: name address birthday social security number on file'],
 ['Customer Number: 007 '],
 ['Notes: looked a lot like James Bond'],
 ['Notes: came in with a martini']]


from pprint import pprint as pp

from itertools import takewhile, islice

# find lines up to first customer
start = list(takewhile(lambda x: "Customer Number:" not in x[0], DF))

out = []
ln = len(start)
# if we had data before we actually found a customer this will be True
if start: 
    # so set cust to first customer in list and start adding to out
    cust = DF[ln][0]
    for sub in start:
        out.append(cust + sub[0])
# ln will either be 0 if start is empty else we start at first customer
for sub in islice(DF, ln, None):
    if sub[0].startswith("Customer Number"):
        cust = sub[0]
    else:
        out.append(cust + sub[0])

Which outputs:

['Customer Number: 001 foo',
 'Customer Number: 001 Notes: Bought a ton of stuff and was easy to deal with',
 'Customer Number: 666 Notes: acted and looked like Chris Farley on that '
 'hidden decaf skit from SNL',
 'Customer Number: 103 Notes: bought a ton of stuff got a free keychain',
 'Customer Number: 103 Notes: gave us a referral to his uncles cousins '
 'hairdresser',
 'Customer Number: 103 Notes: name address birthday social security number '
 'on file',
 'Customer Number: 007 Notes: looked a lot like James Bond',
 'Customer Number: 007 Notes: came in with a martini']

I presumed you would consider any lines that appear before the first customer to actually belong to that first customer.
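A possibly simpler variant of the same idea, offered as a sketch rather than a replacement for the code above: look up the first customer header with next() and use it as the starting value of cust, so a single loop handles the leading lines too. The shortened DF and the choice to return an empty list when no customer exists are assumptions of this example.

```python
DF = [["foo"],
      ['Customer Number: 001 '],
      ['Notes: Bought a ton of stuff and was easy to deal with'],
      ['Customer Number: 007 '],
      ['Notes: came in with a martini']]

# first customer header in the list, or None if there is no customer at all
cust = next((sub[0] for sub in DF if sub[0].startswith("Customer Number")), None)

out = []
if cust is not None:
    for sub in DF:
        if sub[0].startswith("Customer Number"):
            cust = sub[0]
        else:
            # lines before the first header are attached to the first customer
            out.append(cust + sub[0])
```

This scans the leading part of the list twice (once in next(), once in the loop), which should not matter for data of this size.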