I have been trying to find text where uppercase letters ended up in the middle or at the end of a word (e.g. wOrd instead of word; lesS instead of less).
Thanks!
Find: \b(?!(?:[A-Z]+|eV|mW|kJ)\b)([A-Z]?)(\w+)
Replace with: $1\L$2
Explanation:

    \b              # word boundary
    (?!             # negative lookahead: make sure the word is not one of the following
      (?:           # non-capturing group
        [A-Z]+      # the whole word is uppercase
        |           # OR
        eV          # literally eV (electronvolt)
        |           # OR
        mW          # literally mW (milliwatt)
        |           # OR
        kJ          # literally kJ (kilojoule); add any other exceptions you want here
      )\b
    )
    ([A-Z]?)        # group 1: an optional uppercase letter at the beginning of the word
    (\w+)           # group 2: one or more word characters

Replacement:

    $1              # content of group 1 (the optional uppercase letter)
    \L$2            # group 2, lowercased

Given:

    WORD, WoRd, Word, IS, tHE, iT, eV, mW, kJ

Result for the given example:

    WORD, Word, Word, IS, the, it, eV, mW, kJ
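
The same find/replace can be tried outside an editor. The following is a minimal sketch in Python, not part of the original answer. Python's `re` module does not support `\L` in replacement strings, so a callback lowercases group 2 instead. The function name `fix_case` is a hypothetical name chosen for illustration.

```python
import re

# Same pattern as above: skip all-uppercase words and the listed unit symbols.
PATTERN = re.compile(r"\b(?!(?:[A-Z]+|eV|mW|kJ)\b)([A-Z]?)(\w+)")

def fix_case(text: str) -> str:
    """Keep an optional leading capital and lowercase the rest of each word."""
    # Stand-in for the "$1\L$2" replacement: group 1 is kept as is,
    # group 2 is lowercased by the callback.
    return PATTERN.sub(lambda m: m.group(1) + m.group(2).lower(), text)

print(fix_case("WORD, WoRd, Word, IS, tHE, iT, eV, mW, kJ"))
# prints: WORD, Word, Word, IS, the, it, eV, mW, kJ
```

Words in the exception list are skipped because the negative lookahead fails at their start, and no other position inside them is a word boundary.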

A more general pattern that also leaves SI-prefixed unit symbols untouched:

    \b(?!(?:[A-Z]+|(?:[yzafpnµmcdhk]|da)(?:[ACFJKLNSTVW]|Bq|Gy|Hz|Pa|Sv|Wb|eV))\b)([A-Z]?)(\w+)

Where:

    [yzafpnµmcdhk]  # y(octo), z(epto), a(tto), f(emto), p(ico), n(ano), µ(micro), m(illi), c(enti), d(eci), h(ecto), k(ilo)
    da              # deca

    [ACFJKLNSTVW]   # A(mpère), C(oulomb), F(arad), J(oule), K(elvin), L(iter), N(ewton), S(iemens), T(esla), V(olt), W(att)
    Bq              # Becquerel
    Gy              # Gray
    Hz              # Hertz
    Pa              # Pascal
    Sv              # Sievert
    Wb              # Weber
    eV              # Electronvolt