I have a log file containing data like this:
2020-07-28 10:07:01 (pool-3-thread-5id) DEBUG ResourceLoaderHelper: 10 - Trying to upload data
2020-07-28 10:07:01 (pool-3-thread-5id) DEBUG ResourceLoaderHelper: 66 - Trying to upload data
2020-07-28 10:07:01 (pool-3-thread-5id) DEBUG ValidationXmlParser: 127 - No META-Only annotation
2020-07-28 14:48:00 (pool-2-thread-1id) DEBUG MessageWriter: 55 - Send message ErrorOutputMessage(super=NotificationOutputMessage(super=OutputMessage(type=null, messageId=116345, reqId=af24112))), error=ErrorOutputMessage.Error(code=400, text={
"errors": [
"Message type error"
]
})) to exchange FOS
2020-07-28 10:07:01 (pool-3-thread-5id) DEBUG ValidatorFactoryImpl: 578 - Scoped message interpolator.
I tried reading the file this way:
data <- readr::read_lines(file = "log_data.log", progress = FALSE)
log_df <- data.table::setDT(tibble::enframe(data, name = NULL))
But the resulting data frame looks like this:
value
1 2020-07-28 10:07:01 (pool-3-thread-5id) DEBUG ResourceLoaderHelper: 10 - Trying to upload data
2 2020-07-28 10:07:01 (pool-3-thread-5id) DEBUG ResourceLoaderHelper: 66 - Trying to upload data
3 2020-07-28 10:07:01 (pool-3-thread-5id) DEBUG ValidationXmlParser: 127 - No META-Only annotation
4 2020-07-28 14:48:00 (pool-2-thread-1id) DEBUG MessageWriter: 55 - Send message ErrorOutputMessage(super=NotificationOutputMessage(super=OutputMessage(type=null, messageId=116345, reqId=af24112))), error=ErrorOutputMessage.Error(code=400, text={
5 "errors": [
6 "Message type error"
7 ]
8 })) to exchange FOS
9 2020-07-28 10:07:01 (pool-3-thread-5id) DEBUG ValidatorFactoryImpl: 578 - Scoped message interpolator.
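As you can see, the entry that starts on row 4 is split across rows 4 through 8, although it should be treated as a single row. How can I read this log file so that every row starts with a timestamp? Should I use a regular expression somehow?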
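One possible way to do this with a regular expression is sketched below. It is not a definitive solution: it assumes every log entry begins with a `YYYY-MM-DD HH:MM:SS` timestamp and that any line without one continues the previous entry. The sample vector is an abbreviated stand-in for the output of `readr::read_lines()`, and the names `ts_pattern`, `log_dt` and `entry` are made up for illustration.

```r
library(data.table)

# Abbreviated sample standing in for readr::read_lines("log_data.log")
data <- c(
  '2020-07-28 10:07:01 (pool-3-thread-5id) DEBUG ValidationXmlParser: 127 - No META-Only annotation',
  '2020-07-28 14:48:00 (pool-2-thread-1id) DEBUG MessageWriter: 55 - Send message ErrorOutputMessage(code=400, text={',
  '"errors": [',
  '"Message type error"',
  ']',
  '})) to exchange FOS',
  '2020-07-28 10:07:01 (pool-3-thread-5id) DEBUG ValidatorFactoryImpl: 578 - Scoped message interpolator.'
)

# Assumption: a new entry always starts with "YYYY-MM-DD HH:MM:SS"
ts_pattern <- "^[0-9]{4}-[0-9]{2}-[0-9]{2} [0-9]{2}:[0-9]{2}:[0-9]{2}"

log_dt <- data.table(value = data)

# cumsum() over the "starts with a timestamp" flag gives each continuation
# line the same id as the timestamped line that precedes it
log_dt[, entry := cumsum(grepl(ts_pattern, value))]

# Collapse every group of lines into one row, keeping the line breaks
log_df <- log_dt[, .(value = paste(value, collapse = "\n")), by = entry]
```

With this sample, `log_df` has one row per timestamped entry, and the multi-line `MessageWriter` message ends up in a single `value` cell. Lines that appear before the first timestamp would all fall into a group with `entry == 0`, so they may need separate handling.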