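Assuming that "every START_PUNCTUATION has an equivalent END_PUNCTUATION 1-3 code points after it, if it has one", which seems to be true, this snippet should list them for every possible character: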
public class EndPunct {
    private static final int UNICODE_MAX = Character.MAX_CODE_POINT;

    public static void main(String[] args) {
        // Walk every code point and report each START_PUNCTUATION (Ps) character.
        for (int i = 0; i <= UNICODE_MAX; i++) {
            if (!Character.isValidCodePoint(i)) {
                continue;
            }
            if (Character.getType(i) == Character.START_PUNCTUATION) {
                Character.UnicodeBlock currentBlock = Character.UnicodeBlock.of(i);
                boolean found = false;
                // Check the next 1-3 code points (inclusive upper bound).
                for (int newchar = i + 1; newchar <= Math.min(UNICODE_MAX, i + 3); newchar++) {
                    // Stop at a block boundary. UnicodeBlock.of may return null for
                    // code points outside any block, so compare by reference.
                    if (Character.UnicodeBlock.of(newchar) != currentBlock) {
                        break;
                    }
                    if (Character.getType(newchar) == Character.END_PUNCTUATION) {
                        System.out.println(toChar(i) + " matches " + toChar(newchar)
                                + " (codepoints u+" + Integer.toHexString(i)
                                + " and u+" + Integer.toHexString(newchar) + ")");
                        found = true;
                        break;
                    }
                }
                if (!found) {
                    System.out.println("NOT FOUND for " + toChar(i)
                            + " [position u+" + Integer.toHexString(i) + "]");
                }
            }
        }
    }

    // Converts a code point (possibly outside the BMP) to a printable String.
    public static String toChar(int codePoint) {
        return new String(Character.toChars(codePoint));
    }
}
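From its output, you can see that this seems to hold for every character except two (a ? presumably stands in for a character the console could not display):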
( matches ) (codepoints u+28 and u+29)
[ matches ] (codepoints u+5b and u+5d)
{ matches } (codepoints u+7b and u+7d)
? matches ? (codepoints u+f3a and u+f3b)
? matches ? (codepoints u+f3c and u+f3d)
? matches ? (codepoints u+169b and u+169c)
NOT FOUND for ‚ [position u+201a]
NOT FOUND for „ [position u+201e]
? matches ? (codepoints u+2045 and u+2046)
? matches ? (codepoints u+207d and u+207e)
? matches ? (codepoints u+208d and u+208e)
? matches ? (codepoints u+2329 and u+232a)
? matches ? (codepoints u+2768 and u+2769)
? matches ? (codepoints u+276a and u+276b)
? matches ? (codepoints u+276c and u+276d)
? matches ? (codepoints u+276e and u+276f)
? matches ? (codepoints u+2770 and u+2771)
? matches ? (codepoints u+2772 and u+2773)
? matches ? (codepoints u+2774 and u+2775)
? matches ? (codepoints u+27c5 and u+27c6)
? matches ? (codepoints u+27e6 and u+27e7)
? matches ? (codepoints u+27e8 and u+27e9)
? matches ? (codepoints u+27ea and u+27eb)
? matches ? (codepoints u+27ec and u+27ed)
? matches ? (codepoints u+27ee and u+27ef)
? matches ? (codepoints u+2983 and u+2984)
? matches ? (codepoints u+2985 and u+2986)
? matches ? (codepoints u+2987 and u+2988)
? matches ? (codepoints u+2989 and u+298a)
? matches ? (codepoints u+298b and u+298c)
? matches ? (codepoints u+298d and u+298e)
? matches ? (codepoints u+298f and u+2990)
? matches ? (codepoints u+2991 and u+2992)
? matches ? (codepoints u+2993 and u+2994)
? matches ? (codepoints u+2995 and u+2996)
? matches ? (codepoints u+2997 and u+2998)
? matches ? (codepoints u+29d8 and u+29d9)
? matches ? (codepoints u+29da and u+29db)
? matches ? (codepoints u+29fc and u+29fd)
? matches ? (codepoints u+2e22 and u+2e23)
? matches ? (codepoints u+2e24 and u+2e25)
? matches ? (codepoints u+2e26 and u+2e27)
? matches ? (codepoints u+2e28 and u+2e29)
? matches ? (codepoints u+3008 and u+3009)
? matches ? (codepoints u+300a and u+300b)
? matches ? (codepoints u+300c and u+300d)
? matches ? (codepoints u+300e and u+300f)
? matches ? (codepoints u+3010 and u+3011)
? matches ? (codepoints u+3014 and u+3015)
? matches ? (codepoints u+3016 and u+3017)
? matches ? (codepoints u+3018 and u+3019)
? matches ? (codepoints u+301a and u+301b)
? matches ? (codepoints u+301d and u+301e)
? matches ? (codepoints u+fd3e and u+fd3f)
? matches ? (codepoints u+fe17 and u+fe18)
? matches ? (codepoints u+fe35 and u+fe36)
? matches ? (codepoints u+fe37 and u+fe38)
? matches ? (codepoints u+fe39 and u+fe3a)
? matches ? (codepoints u+fe3b and u+fe3c)
? matches ? (codepoints u+fe3d and u+fe3e)
? matches ? (codepoints u+fe3f and u+fe40)
? matches ? (codepoints u+fe41 and u+fe42)
? matches ? (codepoints u+fe43 and u+fe44)
? matches ? (codepoints u+fe47 and u+fe48)
? matches ? (codepoints u+fe59 and u+fe5a)
? matches ? (codepoints u+fe5b and u+fe5c)
? matches ? (codepoints u+fe5d and u+fe5e)
? matches ? (codepoints u+ff08 and u+ff09)
? matches ? (codepoints u+ff3b and u+ff3d)
? matches ? (codepoints u+ff5b and u+ff5d)
? matches ? (codepoints u+ff5f and u+ff60)
? matches ? (codepoints u+ff62 and u+ff63)
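U+201A is the single low-9 quotation mark and U+201E is the double low-9 quotation mark. For those two, there is no matching character. For all the others the approach seems to work, so it appears to hold for every character that has a match. However, there is probably no guarantee of this.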
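If you want to use this observation for a lookup rather than a listing, the same heuristic can be wrapped in a small helper. This is only a minimal sketch under the assumption described above (a closing character sits 1-3 code points after the opening one, in the same Unicode block). The class name `ClosingPunctuation` and the method `findClosing` are made up for illustration and are not part of the JDK.

```java
import java.util.OptionalInt;

public class ClosingPunctuation {

    // Hypothetical helper: returns the END_PUNCTUATION code point found
    // 1-3 code points after `open` in the same Unicode block, if any.
    // This relies on the observed layout, which is not a documented guarantee.
    static OptionalInt findClosing(int open) {
        if (!Character.isValidCodePoint(open)
                || Character.getType(open) != Character.START_PUNCTUATION) {
            return OptionalInt.empty();
        }
        Character.UnicodeBlock block = Character.UnicodeBlock.of(open);
        int last = Math.min(Character.MAX_CODE_POINT, open + 3);
        for (int candidate = open + 1; candidate <= last; candidate++) {
            // Reference comparison is null-safe: of() may return null.
            if (Character.UnicodeBlock.of(candidate) != block) {
                break;
            }
            if (Character.getType(candidate) == Character.END_PUNCTUATION) {
                return OptionalInt.of(candidate);
            }
        }
        return OptionalInt.empty();
    }

    public static void main(String[] args) {
        int[] samples = {'(', '[', '{', 0x201A};
        for (int cp : samples) {
            OptionalInt closing = findClosing(cp);
            // Print code points in hex to avoid console encoding problems.
            System.out.println("u+" + Integer.toHexString(cp) + " -> "
                    + (closing.isPresent()
                        ? "u+" + Integer.toHexString(closing.getAsInt())
                        : "no match"));
        }
    }
}
```

Returning an `OptionalInt` makes the "no match" case (as for U+201A and U+201E) explicit, so callers have to decide what to do when the heuristic finds nothing.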