Gle*_*len 5 java regex unix command-line
Basically, I'm being passed a string, and I need to tokenize it the same way a *nix shell tokenizes command-line options.
Say I have the following string:
"Hello\" World" "Hello Universe" Hi
How can I turn it into a 3-element list?
Below is my first attempt, but it has a lot of problems.
Code:
public void test() {
    String str = "\"Hello\\\" World\" \"Hello Universe\" Hi";
    List<String> list = split(str);
}

public static List<String> split(String str) {
    Pattern pattern = Pattern.compile(
        "\"[^\"]*\"" +    /* double quoted token */
        "|'[^']*'" +      /* single quoted token */
        "|[A-Za-z']+"     /* everything else */
    );
    List<String> opts = new ArrayList<String>();
    Scanner scanner = new Scanner(str).useDelimiter(pattern);
    String token;
    while ((token = scanner.findInLine(pattern)) != null) {
        opts.add(token);
    }
    return opts;
}
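So the code above produces incorrect output.
EDIT: I am completely open to a non-regex solution. Regex was just the first solution that came to mind.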
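If you decide to forgo regex and parse the string instead, there are a couple of options. If you are willing to accept either the double quote or the single quote (but not both) as the quoting character, you can solve this easily with StreamTokenizer: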
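import java.io.*;
import java.util.*;

public static List<String> tokenize(String s) throws IOException {
    List<String> opts = new ArrayList<String>();
    StreamTokenizer st = new StreamTokenizer(new StringReader(s));
    st.resetSyntax();            // drop the defaults (number parsing, '/' comments), which leave sval null for tokens like "-n" or "5"
    st.wordChars(0x21, 0xFF);    // treat every non-blank character as part of a word
    st.whitespaceChars(0, ' ');  // whitespace separates tokens
    st.quoteChar('"');           // double quote delimits a quoted token
    while (st.nextToken() != StreamTokenizer.TT_EOF) {
        opts.add(st.sval);
    }
    return opts;
}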
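To make the StreamTokenizer approach easy to try, here is a minimal self-contained sketch. The class name StreamTokenizerDemo is my own, and wrapping the IOException is just a convenience for the demo. It assumes double quotes are the only quoting character and resets the tokenizer's default syntax so that tokens such as -n, 5, or /tmp/file come back as plain words:

```java
import java.io.IOException;
import java.io.StreamTokenizer;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class; the name is not part of any library.
class StreamTokenizerDemo {

    static List<String> tokenize(String s) {
        List<String> opts = new ArrayList<>();
        StreamTokenizer st = new StreamTokenizer(new StringReader(s));
        st.resetSyntax();            // clear defaults: number parsing, '/' comments, quote chars
        st.wordChars(0x21, 0xFF);    // every non-blank character is a word character
        st.whitespaceChars(0, ' ');  // control characters and space separate tokens
        st.quoteChar('"');           // double quote delimits a quoted token
        try {
            while (st.nextToken() != StreamTokenizer.TT_EOF) {
                opts.add(st.sval);   // sval is set for both words and quoted strings
            }
        } catch (IOException e) {
            // A StringReader should not fail here; rethrow unchecked to keep the demo simple.
            throw new UncheckedIOException(e);
        }
        return opts;
    }

    public static void main(String[] args) {
        // prints [Hello" World, Hello Universe, Hi]
        System.out.println(tokenize("\"Hello\\\" World\" \"Hello Universe\" Hi"));
        // prints [-n, 5, /tmp/file]
        System.out.println(tokenize("-n 5 /tmp/file"));
    }
}
```

One difference from a real shell: a word that directly touches a quoted string, as in foo"bar", comes back as two tokens here rather than one.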
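If you must support both kinds of quotes, here is a naive implementation that should work (note that a string like '"blah \" blah"blah' will yield something like 'blah " blahblah'. If that's not OK, you will need to make some changes):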
import java.io.*;
import java.util.*;

public static List<String> splitSSV(String in) throws IOException {
    List<String> out = new ArrayList<String>();
    StringReader r = new StringReader(in);
    StringBuilder b = new StringBuilder();
    int inQuote = -1;        // the currently open quote char, or -1 when outside quotes
    boolean escape = false;
    int c;
    // read each character
    while ((c = r.read()) != -1) {
        if (escape) { // if the previous char was the escape char, add the current char as-is
            b.append((char) c);
            escape = false;
            continue;
        }
        switch (c) {
        case '\\': // deal with escape char
            escape = true;
            break;
        case '"':
        case '\'': // deal with quote chars
            if (inQuote == -1) {
                inQuote = c;            // not in a quote: now we are
            } else if (inQuote == c) {
                inQuote = -1;           // matching quote char closes the quote
            } else {
                b.append((char) c);     // the other quote char inside a quote is literal
            }
            break;
        case ' ':
            if (inQuote != -1) {
                b.append((char) c);     // inside a quote, the space is part of the token
            } else if (b.length() > 0) {
                out.add(b.toString());  // outside a quote, the space ends the token
                b.setLength(0);         // (runs of spaces no longer add empty tokens)
            }
            break;
        default:
            b.append((char) c); // append all other chars to current token
        }
    }
    if (b.length() > 0) {
        out.add(b.toString()); // add final token to list
    }
    return out;
}
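Since the question started from a regex, here is a rough sketch of how the regex route could be made to handle the escaped quote as well. This is only one possible pattern, not a full shell grammar, and the class name RegexShellSplit is made up for the example. It assumes backslash escapes are only meaningful inside double quotes, that single-quoted text is taken literally, and that quoted and unquoted pieces are separated by whitespace:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class; the name is not part of any library.
class RegexShellSplit {

    // group 1: body of a double-quoted token (a backslash escapes the next char)
    // group 2: body of a single-quoted token (no escapes)
    // group 3: a bare word, i.e. a run of non-whitespace
    private static final Pattern TOKEN = Pattern.compile(
        "\"((?:\\\\.|[^\"\\\\])*)\""
        + "|'([^']*)'"
        + "|(\\S+)",
        Pattern.DOTALL);

    static List<String> split(String str) {
        List<String> out = new ArrayList<>();
        Matcher m = TOKEN.matcher(str);
        while (m.find()) {
            if (m.group(1) != null) {
                // strip the backslash from each escaped character
                out.add(m.group(1).replaceAll("(?s)\\\\(.)", "$1"));
            } else if (m.group(2) != null) {
                out.add(m.group(2));
            } else {
                out.add(m.group(3));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // prints [Hello" World, Hello Universe, Hi]
        System.out.println(split("\"Hello\\\" World\" \"Hello Universe\" Hi"));
    }
}
```

Unlike the character-by-character loop, this sketch does not join adjacent pieces such as foo"bar baz" into one token, and an unterminated quote falls through to the bare-word branch, so treat it as a starting point only.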