6 c string algorithm optimization
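Suppose I have a very long column of strings and I want to find out whether the column is allLower, allUpper, or mixedCase. For example, take the following column: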
text
"hello"
"New"
"items"
"iTem12"
"-3nXy"
The text would be mixedCase. A naive algorithm to determine this might be:
int is_mixed_case, is_all_lower, is_all_upper;
int has_lower = 0;
int has_upper = 0;
char c;
// for each row...for each column...
for (int i = 0; (c = s[i]) != '\0'; i++) {
    if (c >= 'a' && c <= 'z') {
        has_lower = 1;
        if (has_upper) break;   // both cases seen: mixed, stop early
    }
    else if (c >= 'A' && c <= 'Z') {
        has_upper = 1;
        if (has_lower) break;
    }
}
is_all_lower = has_lower && !has_upper;
is_all_upper = has_upper && !has_lower;
is_mixed_case = has_lower && has_upper;
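
To make the "for each row" part concrete, here is a minimal sketch of the same naive check applied to a whole column. The names `classify_column` and `enum column_case` are hypothetical and only serve the illustration. It assumes an ASCII-compatible encoding and returns as soon as both cases have been seen, so later rows are never scanned.

```c
#include <stddef.h>

/* Hypothetical result codes, chosen so that UPPER | LOWER == MIXED. */
enum column_case { CASE_NONE = 0, CASE_UPPER = 1, CASE_LOWER = 2, CASE_MIXED = 3 };

/* Scan every string in the column; stop at the first point where
   both an upper-case and a lower-case ASCII letter have been seen. */
static enum column_case classify_column(const char *const *rows, size_t n_rows) {
    int has_lower = 0;
    int has_upper = 0;
    for (size_t r = 0; r < n_rows; r++) {
        for (const char *p = rows[r]; *p != '\0'; p++) {
            if (*p >= 'a' && *p <= 'z') {
                has_lower = 1;
            } else if (*p >= 'A' && *p <= 'Z') {
                has_upper = 1;
            }
            if (has_lower && has_upper) {
                return CASE_MIXED;   /* no need to look at the remaining rows */
            }
        }
    }
    return (enum column_case)((has_upper ? CASE_UPPER : 0) |
                              (has_lower ? CASE_LOWER : 0));
}
```

For the example column above, the scan would stop inside "New", the first row that contributes an upper-case letter after "hello" has already supplied lower-case ones.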
However, I'm sure there is a more efficient way to do this. What is the most efficient way to perform this check?
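If you know which character encoding will be used (I used ISO/IEC 8859-15 in the code example), a lookup table is probably the fastest solution. It also lets you decide which characters of the extended character set, such as µ or ß, you count as uppercase, lowercase, or non-alphabetic.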
char test_case(const char *s) {
    // One entry per byte value: 0 = not a letter, 1 = upper, 2 = lower (ISO/IEC 8859-15)
    static const char alphabet[] = {
        0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
        0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
        0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
        0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
        0,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1, // ABCDEFGHIJKLMNO
        1,1,1,1,1,1,1,1,1,1,1,0,0,0,0,0, // PQRSTUVWXYZ
        0,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2, // abcdefghijklmno
        2,2,2,2,2,2,2,2,2,2,2,0,0,0,0,0, // pqrstuvwxyz
        0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
        0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
        0,0,0,0,0,0,1,0,2,0,2,0,0,0,0,0, // Š (0xA6), š (0xA8), ª (0xAA)
        0,0,0,0,1,2,0,0,2,0,2,0,1,2,1,0, // Ž µ (0xB4-0xB5), ž (0xB8), º (0xBA), Œ œ Ÿ (0xBC-0xBE)
        1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1, // ÀÁÂÃÄÅÆÇÈÉÊËÌÍÎÏ
        1,1,1,1,1,1,1,0,1,1,1,1,1,1,1,1, // ÐÑÒÓÔÕÖ ØÙÚÛÜÝÞß (ß is counted as upper here)
        2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2, // àáâãäåæçèéêëìíîï
        2,2,2,2,2,2,2,0,2,2,2,2,2,2,2,2}; // ðñòóôõö øùúûüýþÿ
    char cases = 0;
    // OR the class bits together; stop at the terminator or once both bits are set
    while (*s && cases != 3) {
        cases |= alphabet[(unsigned char) *s++];
    }
    return cases; // 0 = none, 1 = upper, 2 = lower, 3 = mixed
}
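As suggested in a comment by chux, you can set the value of alphabet[0] to 4; the while loop then needs only the single condition cases < 3, because reaching the terminating '\0' pushes cases to 4 or above. The returned value then has to be masked with 3 to drop that sentinel bit.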
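
A minimal sketch of that sentinel variant follows. To keep it short, the table is filled at run time and covers ASCII letters only; `init_alphabet` and `test_case_sentinel` are hypothetical names, and the code assumes 'A'..'Z' and 'a'..'z' are contiguous, as they are in ASCII and ISO/IEC 8859-15.

```c
/* Class table: 1 = upper, 2 = lower, 4 = sentinel for the '\0' terminator.
   ASCII-only for brevity; a full ISO/IEC 8859-15 table would work the same way. */
static unsigned char alphabet[256];

/* Must be called once before test_case_sentinel(). */
static void init_alphabet(void) {
    for (int c = 'A'; c <= 'Z'; c++) alphabet[c] = 1;
    for (int c = 'a'; c <= 'z'; c++) alphabet[c] = 2;
    alphabet[0] = 4;   /* reading the terminator sets bit 2 and ends the loop */
}

/* Returns 0 = none, 1 = upper, 2 = lower, 3 = mixed. */
static int test_case_sentinel(const char *s) {
    unsigned cases = 0;
    /* Single loop condition: exits when both case bits are set (cases == 3)
       or when the terminator has been read (cases >= 4). */
    while (cases < 3) {
        cases |= alphabet[(unsigned char) *s++];
    }
    return (int)(cases & 3);   /* drop the sentinel bit */
}
```

The loop never reads past the terminator: once the '\0' byte has been looked up, cases is at least 4 and the condition fails. Whether dropping the second test is measurably faster depends on the compiler and the data, so it is worth benchmarking before relying on it.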