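I want to loop over the contents of a text file, do a search and replace on some lines, and write the result back to the file. I could first load the whole file into memory and then write it back, but that is probably not the best way to do it.
What is the best way to do this within the following code?
f = open(file)
for line in f:
    if 'foo' in line:
        newline = line.replace('foo', 'bar')
        # how to write this newline back to the file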
Eli*_*sky 248
The shortest way would probably be to use the fileinput module. For example, the following adds line numbers to a file, in place:
import fileinput
for line in fileinput.input("test.txt", inplace=True):
    print('{} {}'.format(fileinput.filelineno(), line), end='')  # for Python 3
    # print "%d: %s" % (fileinput.filelineno(), line),  # for Python 2
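What happens here is that, inside the loop, standard output is redirected to the original file, so any print statements write back into the original file.
fileinput has more bells and whistles. For example, it can automatically operate on all files in sys.argv[1:], without your having to iterate over them explicitly. Starting with Python 3.2 it also provides a convenient context manager for use in a with statement.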
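As a rough sketch of the two features just mentioned (operating on every file named on the command line, and the context manager), something like the following could work. The function name number_lines_in_place and its argv parameter are hypothetical, added only so the example can be called without touching the real command line.

```python
import fileinput
import sys


def number_lines_in_place(argv=None):
    """Prefix every line of each named file with its line number, in place."""
    # fileinput falls back to sys.argv[1:] when no files are given; the list is
    # passed explicitly here (hypothetical argv parameter) to keep this testable.
    files = sys.argv[1:] if argv is None else argv
    with fileinput.input(files=files, inplace=True) as stream:
        for line in stream:
            # filelineno() restarts at 1 for each file; print() goes to the file.
            print('{} {}'.format(stream.filelineno(), line), end='')


if __name__ == '__main__' and len(sys.argv) > 1:
    number_lines_in_place()
```

Saved as a script, this would presumably be run as python script.py first.txt second.txt, rewriting both files.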
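While fileinput is great for throwaway scripts, I would be wary of using it in real code, because admittedly it is not very readable or familiar. In real (production) code it is worthwhile to spend a few more lines of code to make the process explicit and thus make the code readable.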
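There are two options: if the file is not too large, read it entirely into memory and write the modified contents back; otherwise, write the modified lines to a temporary file and then put that file in place of the original.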
Tho*_*dal 181
I guess something like this should do it. It basically writes the content to a new file and replaces the old file with the new file:
from tempfile import mkstemp
from shutil import move
from os import fdopen, remove

def replace(file_path, pattern, subst):
    # Create temp file
    fh, abs_path = mkstemp()
    with fdopen(fh, 'w') as new_file:
        with open(file_path) as old_file:
            for line in old_file:
                new_file.write(line.replace(pattern, subst))
    # Remove original file
    remove(file_path)
    # Move new file
    move(abs_path, file_path)
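One thing to watch with the temp-file approach: mkstemp creates the file with restrictive permissions, and removing the original before moving the new file leaves a short window in which no file exists. A minimal sketch that tries to address both, assuming Python 3.3+ (for os.replace) and a hypothetical function name replace_atomically:

```python
import os
import shutil
import tempfile


def replace_atomically(file_path, pattern, subst):
    """Replace every literal occurrence of pattern, keeping the file mode."""
    # Put the temp file next to the original so the final rename stays on
    # the same filesystem.
    directory = os.path.dirname(os.path.abspath(file_path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, 'w') as new_file, open(file_path) as old_file:
            for line in old_file:
                new_file.write(line.replace(pattern, subst))
        # Copy the permission bits of the original onto the new file.
        shutil.copymode(file_path, tmp_path)
        # os.replace overwrites the destination in a single step.
        os.replace(tmp_path, file_path)
    except BaseException:
        # Clean up the temp file if anything went wrong before the rename.
        os.remove(tmp_path)
        raise
```

The trade-off is that the directory containing the file must be writable, since the temporary file is created there.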
Jas*_*son 73
Here is another tested example that searches for a string and replaces it:
import fileinput
import sys

def replaceAll(file, searchExp, replaceExp):
    # With inplace=True, whatever is written to stdout goes into the file
    for line in fileinput.input(file, inplace=True):
        if searchExp in line:
            line = line.replace(searchExp, replaceExp)
        sys.stdout.write(line)
Example use (the search string is matched literally, not as a regular expression):
replaceAll("/fooBar.txt", "Hello World!", "Goodbye World.")
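The function above relies on the in operator and str.replace, both of which match text literally, so a pattern such as Hello\sWorld!$ would not be treated as a regular expression. If regex matching is what is wanted, a variant along these lines might do; replace_all_regex is a hypothetical name, and the substitution is still applied one line at a time:

```python
import fileinput
import re
import sys


def replace_all_regex(file, search_pattern, replacement):
    """Apply a regular-expression substitution to each line, in place."""
    regex = re.compile(search_pattern)
    with fileinput.input(file, inplace=True) as stream:
        for line in stream:
            # Lines without a match are written back unchanged.
            sys.stdout.write(regex.sub(replacement, line))
```

For example, replace_all_regex("/fooBar.txt", r"Hello\sWorld!$", "Goodbye World.") would only rewrite lines that end with "Hello World!".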
Kin*_*lan 61
This should work (in-place editing):
import fileinput

# Handles a list of files (files is a filename or a list of filenames), and
# redirects STDOUT to the file in question
for line in fileinput.input(files, inplace=True):
    print(line.replace("foo", "bar"), end='')  # Python 2: print line.replace("foo", "bar"),
Thi*_*ijs 23
Based on the answer by Thomas Watnedal. However, this does not exactly answer the line-by-line part of the original question, although the function can still replace on a line-by-line basis.
This implementation replaces the file contents without using a temporary file, so the file permissions remain unchanged.
Also, it uses re.sub instead of replace, which allows regex replacement rather than plain-text replacement only.
Reading the file as a single string instead of line by line allows multiline matching and replacement.
import re

def replace(file, pattern, subst):
    # Read contents from file as a single string
    with open(file, 'r') as file_handle:
        file_string = file_handle.read()
    # Use the re module to allow for (multiline) regex replacement
    file_string = re.sub(pattern, subst, file_string)
    # Write contents back to the file.
    # Opening with mode 'w' truncates the file.
    with open(file, 'w') as file_handle:
        file_handle.write(file_string)
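To illustrate why reading the whole file as one string matters, here is a small sketch on an in-memory string. The marker comments and sample text are made up for the example; re.DOTALL is the standard flag that lets "." match newlines.

```python
import re

# Hypothetical sample text with a block that spans several lines.
text = "start\n<!-- begin -->\nold line 1\nold line 2\n<!-- end -->\nfinish\n"

# With re.DOTALL, '.' also matches newlines, so one pattern can span lines.
block = re.compile(r"<!-- begin -->.*?<!-- end -->", re.DOTALL)

# Whole-string substitution sees the complete block and replaces it.
whole = block.sub("<!-- begin -->\nnew line\n<!-- end -->", text)

# Line-by-line substitution never sees both markers on one line,
# so nothing is replaced.
by_line = "".join(block.sub("X", line) for line in text.splitlines(keepends=True))

print(whole)
print(by_line == text)
```

The cost of the whole-string approach is that the entire file has to fit in memory.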
ham*_*mcn 11
As lassevk suggests, write out the new file as you go. Here is some example code:
# Read from a.txt and write the modified lines to b.txt
with open("a.txt") as fin, open("b.txt", "wt") as fout:
    for line in fin:
        fout.write(line.replace('foo', 'bar'))
sta*_*t64 11
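If you want a generic function that replaces any text with some other text, this is likely the best way to go, particularly if you are a fan of regular expressions:
import re

def replace(filePath, text, subs, flags=0):
    with open(filePath, "r+") as file:
        fileContents = file.read()
        # re.escape makes the search text literal; flags (e.g. re.IGNORECASE) still apply
        textPattern = re.compile(re.escape(text), flags)
        fileContents = textPattern.sub(subs, fileContents)
        # Rewind and empty the file before writing the new contents
        file.seek(0)
        file.truncate()
        file.write(fileContents)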
A more Pythonic way is to use context managers, as in the code below:
from tempfile import mkstemp
from shutil import move
from os import fdopen, remove

def replace(source_file_path, pattern, substring):
    fh, target_file_path = mkstemp()
    # Wrap the descriptor returned by mkstemp so that it is closed properly
    with fdopen(fh, 'w') as target_file:
        with open(source_file_path, 'r') as source_file:
            for line in source_file:
                target_file.write(line.replace(pattern, substring))
    remove(source_file_path)
    move(target_file_path, source_file_path)
You can find the full snippet here.
fileinput is quite straightforward, as mentioned in previous answers:
import fileinput

def replace_in_file(file_path, search_text, new_text):
    with fileinput.input(file_path, inplace=True) as f:
        for line in f:
            new_line = line.replace(search_text, new_text)
            print(new_line, end='')
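Explanation:
fileinput can accept multiple files, but I prefer to close each file as soon as it has been processed, so a single file_path is placed in the with statement.
The print call prints nothing to the console when inplace=True, because STDOUT is being forwarded to the original file.
end='' in the print call eliminates the blank lines that would otherwise appear between lines.
It can be used as follows:
file_path = '/path/to/my/file'
replace_in_file(file_path, 'old-text', 'new-text')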
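Since in-place editing overwrites the original, it may be worth keeping a copy. fileinput.input accepts a backup argument giving the extension for a backup file that is kept afterwards. A minimal sketch, where the function name replace_with_backup and the '.bak' default are assumptions made for the example:

```python
import fileinput


def replace_with_backup(file_path, search_text, new_text, backup_suffix='.bak'):
    """In-place replacement that keeps the original as file_path + suffix."""
    with fileinput.input(file_path, inplace=True, backup=backup_suffix) as f:
        for line in f:
            # print() writes into the file being edited, not to the console.
            print(line.replace(search_text, new_text), end='')
```

After a call such as replace_with_backup('/path/to/my/file', 'old-text', 'new-text'), the untouched original should be found at /path/to/my/file.bak.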
Expanding on @Kiran's answer (which I agree is more succinct and Pythonic), this adds codecs to support reading and writing UTF-8:
import codecs
from tempfile import mkstemp
from shutil import move
from os import close, remove

def replace(source_file_path, pattern, substring):
    fh, target_file_path = mkstemp()
    # mkstemp returns an open descriptor; close it, since codecs.open reopens the file by path
    close(fh)
    with codecs.open(target_file_path, 'w', 'utf-8') as target_file:
        with codecs.open(source_file_path, 'r', 'utf-8') as source_file:
            for line in source_file:
                target_file.write(line.replace(pattern, substring))
    remove(source_file_path)
    move(target_file_path, source_file_path)
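On Python 3 the built-in open takes an encoding argument, so the codecs module is not strictly needed. A possible equivalent, assuming Python 3.3+ and using the hypothetical name replace_utf8; passing newline='' is an extra choice made here so that existing line endings pass through untouched:

```python
import os
import tempfile


def replace_utf8(source_file_path, pattern, substring):
    """Literal search and replace on a UTF-8 file via a temporary file."""
    directory = os.path.dirname(os.path.abspath(source_file_path))
    fd, target_file_path = tempfile.mkstemp(dir=directory)
    # open() accepts the descriptor from mkstemp and closes it on exit.
    with open(fd, 'w', encoding='utf-8', newline='') as target_file, \
            open(source_file_path, 'r', encoding='utf-8', newline='') as source_file:
        for line in source_file:
            target_file.write(line.replace(pattern, substring))
    # Put the rewritten file in place of the original.
    os.replace(target_file_path, source_file_path)
```

Like the version above, this does not copy the original file's permissions onto the new file.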