xTw*_*eDx 4 regex parsing swift
Suppose I have a log file that I have already split into an array of strings. For example, I have these lines here.
123.4.5.1 - - [03/Sep/2013:18:38:48 -0600] "GET /products/car/ HTTP/1.1" 200 3327 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_8_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/29.0.1547.65 Safari/537.36"
123.4.5.6 - - [03/Sep/2013:18:38:58 -0600] "GET /jobs/ HTTP/1.1" 500 821 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.8; rv:23.0) Gecko/20100101 Firefox/23.0"
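I could parse these with ordinary string manipulation, but I think there is a better way to do it with a regex. I tried to follow a similar pattern that someone used in Python, but I can't figure it out. This is my attempt.
Here is the pattern: ([(\d\.)]+) - - \[(.*?)\] "(.*?)" (\d+) - "(.*?)" "(.*?)" When I try to use it, I get no matches.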
let lines = contents.split(separator: "\n")
let pattern = "([(\\d\\.)]+) - - \\[(.*?)\\] \"(.*?)\" (\\d+) - \"(.*?)\" \"(.*?)\""
let regex = try! NSRegularExpression(pattern: pattern, options: [])
for line in lines {
    let range = NSRange(location: 0, length: line.utf16.count)
    let parsedData = regex.firstMatch(in: String(line), options: [], range: range)
    print(parsedData)
}
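One way to see where the attempt diverges from the samples is to compare it with a minimally adjusted pattern. The sketch below is only an illustration: the helper `matches` and the `adjusted` pattern are hypothetical names introduced here. It rests on the observation that the sample lines have two numbers followed by a quoted "-" after the request, whereas the attempted pattern expects a single number followed by an unquoted " - ".

```swift
import Foundation

// One of the sample lines from the question.
let line = "123.4.5.6 - - [03/Sep/2013:18:38:58 -0600] \"GET /jobs/ HTTP/1.1\" 500 821 \"-\" \"Mozilla/5.0 (Macintosh; Intel Mac OS X 10.8; rv:23.0) Gecko/20100101 Firefox/23.0\""

// Hypothetical helper: true if `pattern` matches anywhere in `text`.
func matches(_ pattern: String, _ text: String) -> Bool {
    guard let regex = try? NSRegularExpression(pattern: pattern) else { return false }
    let range = NSRange(text.startIndex..., in: text)
    return regex.firstMatch(in: text, options: [], range: range) != nil
}

// The pattern from the question, written as a raw string so no double escaping is needed.
let attempted = #"([(\d\.)]+) - - \[(.*?)\] "(.*?)" (\d+) - "(.*?)" "(.*?)""#

// Assumed adjustment: a second number instead of " - "; the quoted "-" is then
// captured by the next "(.*?)" group.
let adjusted = #"([\d.]+) - - \[(.*?)\] "(.*?)" (\d+) (\d+) "(.*?)" "(.*?)""#

print(matches(attempted, line))
print(matches(adjusted, line))
```

Under these assumptions the first check should report no match and the second a match, which points at the `(\d+) - ` portion of the attempt as the mismatch.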
It would be best if I could extract the data into a model. I need the code to be performant and fast, since I should expect thousands of lines.
let someResult: (String, String, String, String, String, String)
// or
let someObject: LogFile = LogFile(String, String, String...)
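I'm looking for each parsed line to be broken down into its individual parts: IP, OS, OS version, browser, browser version, and so on. Any real parsing of the data would be sufficient.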
With your shown samples, could you please try the following.
^((?:\d+\.){3}\d+).*?\[([^]]*)\].*?"([^"]*)"\s*(\d+)\s*(\d+)\s*"-"\s*"([^"]*)"$
Explanation: a detailed explanation of the regex above.
^(                   ##Anchor at the start of the value and open the 1st capturing group.
(?:\d+\.){3}\d+      ##Non-capturing group: one or more digits followed by a dot, repeated 3 times, then one or more digits.
)                    ##Close the 1st capturing group.
.*?\[                ##Non-greedy match up to the first [.
([^]]*)              ##2nd capturing group: everything up to the next ].
\].*?"               ##Match ] and then non-greedily up to the next ".
([^"]*)              ##3rd capturing group: everything up to the next ".
"\s*                 ##Match " followed by zero or more whitespace characters.
(\d+)                ##4th capturing group: one or more digits.
\s*                  ##Zero or more whitespace characters.
(\d+)                ##5th capturing group: one or more digits.
\s*"-"\s*"           ##Zero or more whitespace characters, the literal "-", zero or more whitespace characters, then an opening ".
([^"]*)              ##6th capturing group: everything up to the next ".
"$                   ##Match the closing " at the end of the value.
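As a rough sketch of how the regex above could feed the kind of model the question asks for: the `LogEntry` and `LogParser` types and all of their field names are hypothetical, chosen here only for illustration. The regex is compiled once and reused for every line, which is usually the main cost to avoid when parsing thousands of lines. The character class is written as `[^\]]` (an explicitly escaped equivalent of `[^]]`) to avoid relying on how the engine treats a leading `]`.

```swift
import Foundation

// Hypothetical model; the field names are assumptions based on the sample lines.
struct LogEntry {
    let ip: String
    let timestamp: String
    let request: String
    let status: Int
    let size: Int
    let userAgent: String
}

// Hypothetical wrapper that compiles the regex once and reuses it.
struct LogParser {
    private let regex: NSRegularExpression

    init() throws {
        // Same pattern as above, as a raw string, with `]` escaped inside the class.
        let pattern = #"^((?:\d+\.){3}\d+).*?\[([^\]]*)\].*?"([^"]*)"\s*(\d+)\s*(\d+)\s*"-"\s*"([^"]*)"$"#
        regex = try NSRegularExpression(pattern: pattern)
    }

    // Returns nil when the line does not have the expected shape.
    func parse(_ line: String) -> LogEntry? {
        let range = NSRange(line.startIndex..., in: line)
        guard let match = regex.firstMatch(in: line, options: [], range: range) else {
            return nil
        }
        // Capture groups are numbered from 1; group 0 is the whole match.
        func group(_ index: Int) -> String? {
            guard let r = Range(match.range(at: index), in: line) else { return nil }
            return String(line[r])
        }
        guard let ip = group(1),
              let timestamp = group(2),
              let request = group(3),
              let status = group(4).flatMap({ Int($0) }),
              let size = group(5).flatMap({ Int($0) }),
              let userAgent = group(6) else {
            return nil
        }
        return LogEntry(ip: ip, timestamp: timestamp, request: request,
                        status: status, size: size, userAgent: userAgent)
    }
}

let parser = try LogParser()
let sample = "123.4.5.1 - - [03/Sep/2013:18:38:48 -0600] \"GET /products/car/ HTTP/1.1\" 200 3327 \"-\" \"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_8_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/29.0.1547.65 Safari/537.36\""
if let entry = parser.parse(sample) {
    print(entry.ip, entry.status, entry.size)
}
```

With the array from the question, something like `lines.compactMap { parser.parse(String($0)) }` would collect the entries that match. The user agent is kept as one string here; splitting it into OS and browser fields would need a further step that this sketch does not attempt.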