xpe*_*oni 75
前几天我遇到了这个问题的一半编码.对可用选项不满意,在看了这个C示例代码之后,我决定推出自己的C++ url-encode函数:
#include <cctype>
#include <iomanip>
#include <sstream>
#include <string>
using namespace std;
string url_encode(const string &value) {
ostringstream escaped;
escaped.fill('0');
escaped << hex;
for (string::const_iterator i = value.begin(), n = value.end(); i != n; ++i) {
string::value_type c = (*i);
// Keep alphanumeric and other accepted characters intact
if (isalnum(c) || c == '-' || c == '_' || c == '.' || c == '~') {
escaped << c;
continue;
}
// Any other characters are percent-encoded
escaped << uppercase;
escaped << '%' << setw(2) << int((unsigned char) c);
escaped << nouppercase;
}
return escaped.str();
}
Run Code Online (Sandbox Code Playgroud)
解码功能的实现留给读者练习.:P
use*_*593 70
回答我自己的问题......
libcurl有curl_easy_escape用于编码.
对于解码,curl_easy_unescape
小智 12
string urlDecode(string &SRC) {
string ret;
char ch;
int i, ii;
for (i=0; i<SRC.length(); i++) {
if (int(SRC[i])==37) {
sscanf(SRC.substr(i+1,2).c_str(), "%x", &ii);
ch=static_cast<char>(ii);
ret+=ch;
i=i+2;
} else {
ret+=SRC[i];
}
}
return (ret);
}
Run Code Online (Sandbox Code Playgroud)
不是最好的,但工作正常;-)
Yur*_*kiy 10
cpp-netlib有功能
namespace boost {
namespace network {
namespace uri {
inline std::string decoded(const std::string &input);
inline std::string encoded(const std::string &input);
}
}
}
Run Code Online (Sandbox Code Playgroud)
它们允许非常容易地编码和解码URL字符串.
通常在编码时将'%'添加到char的int值将不起作用,该值应该是十六进制等效值.例如'/'是'%2F'而不是'%47'.
我认为这是url编码和解码的最佳和简洁的解决方案(没有太多的头依赖性).
string urlEncode(string str){
string new_str = "";
char c;
int ic;
const char* chars = str.c_str();
char bufHex[10];
int len = strlen(chars);
for(int i=0;i<len;i++){
c = chars[i];
ic = c;
// uncomment this if you want to encode spaces with +
/*if (c==' ') new_str += '+';
else */if (isalnum(c) || c == '-' || c == '_' || c == '.' || c == '~') new_str += c;
else {
sprintf(bufHex,"%X",c);
if(ic < 16)
new_str += "%0";
else
new_str += "%";
new_str += bufHex;
}
}
return new_str;
}
string urlDecode(string str){
string ret;
char ch;
int i, ii, len = str.length();
for (i=0; i < len; i++){
if(str[i] != '%'){
if(str[i] == '+')
ret += ' ';
else
ret += str[i];
}else{
sscanf(str.substr(i + 1, 2).c_str(), "%x", &ii);
ch = static_cast<char>(ii);
ret += ch;
i = i + 2;
}
}
return ret;
}
Run Code Online (Sandbox Code Playgroud)
[Necromancer模式开启]
在寻找快速,现代,平台独立和优雅的解决方案时偶然发现了这个问题.不像上面的任何一个,cpp-netlib将成为赢家,但它在"解码"功能中具有可怕的内存漏洞.所以我提出了提升的精神qi/karma解决方案.
namespace bsq = boost::spirit::qi;
namespace bk = boost::spirit::karma;
bsq::int_parser<unsigned char, 16, 2, 2> hex_byte;
template <typename InputIterator>
struct unescaped_string
: bsq::grammar<InputIterator, std::string(char const *)> {
unescaped_string() : unescaped_string::base_type(unesc_str) {
unesc_char.add("+", ' ');
unesc_str = *(unesc_char | "%" >> hex_byte | bsq::char_);
}
bsq::rule<InputIterator, std::string(char const *)> unesc_str;
bsq::symbols<char const, char const> unesc_char;
};
template <typename OutputIterator>
struct escaped_string : bk::grammar<OutputIterator, std::string(char const *)> {
escaped_string() : escaped_string::base_type(esc_str) {
esc_str = *(bk::char_("a-zA-Z0-9_.~-") | "%" << bk::right_align(2,0)[bk::hex]);
}
bk::rule<OutputIterator, std::string(char const *)> esc_str;
};
Run Code Online (Sandbox Code Playgroud)
以上用法如下:
std::string unescape(const std::string &input) {
std::string retVal;
retVal.reserve(input.size());
typedef std::string::const_iterator iterator_type;
char const *start = "";
iterator_type beg = input.begin();
iterator_type end = input.end();
unescaped_string<iterator_type> p;
if (!bsq::parse(beg, end, p(start), retVal))
retVal = input;
return retVal;
}
std::string escape(const std::string &input) {
typedef std::back_insert_iterator<std::string> sink_type;
std::string retVal;
retVal.reserve(input.size() * 3);
sink_type sink(retVal);
char const *start = "";
escaped_string<sink_type> g;
if (!bk::generate(sink, g(start), input))
retVal = input;
return retVal;
}
Run Code Online (Sandbox Code Playgroud)
[死灵法师模式关闭]
EDIT01:修复了零填充的东西 - 特别感谢Hartmut Kaiser
EDIT02:在CoLiRu上生活
在 win32 c++ 应用程序中搜索用于解码 url 的 api 时,我最终遇到了这个问题。由于这个问题并没有完全指定平台,假设 windows 不是一件坏事。
InternetCanonicalizeUrl 是 Windows 程序的 API。更多信息在这里
LPTSTR lpOutputBuffer = new TCHAR[1];
DWORD dwSize = 1;
BOOL fRes = ::InternetCanonicalizeUrl(strUrl, lpOutputBuffer, &dwSize, ICU_DECODE | ICU_NO_ENCODE);
DWORD dwError = ::GetLastError();
if (!fRes && dwError == ERROR_INSUFFICIENT_BUFFER)
{
delete lpOutputBuffer;
lpOutputBuffer = new TCHAR[dwSize];
fRes = ::InternetCanonicalizeUrl(strUrl, lpOutputBuffer, &dwSize, ICU_DECODE | ICU_NO_ENCODE);
if (fRes)
{
//lpOutputBuffer has decoded url
}
else
{
//failed to decode
}
if (lpOutputBuffer !=NULL)
{
delete [] lpOutputBuffer;
lpOutputBuffer = NULL;
}
}
else
{
//some other error OR the input string url is just 1 char and was successfully decoded
}
Run Code Online (Sandbox Code Playgroud)
InternetCrackUrl (这里) 似乎也有标志来指定是否解码 url
受到xperroni的启发,我写了一个解码器.谢谢你的指针.
#include <iostream>
#include <sstream>
#include <string>
using namespace std;
char from_hex(char ch) {
return isdigit(ch) ? ch - '0' : tolower(ch) - 'a' + 10;
}
string url_decode(string text) {
char h;
ostringstream escaped;
escaped.fill('0');
for (auto i = text.begin(), n = text.end(); i != n; ++i) {
string::value_type c = (*i);
if (c == '%') {
if (i[1] && i[2]) {
h = from_hex(i[1]) << 4 | from_hex(i[2]);
escaped << h;
i += 2;
}
} else if (c == '+') {
escaped << ' ';
} else {
escaped << c;
}
}
return escaped.str();
}
int main(int argc, char** argv) {
string msg = "J%C3%B8rn!";
cout << msg << endl;
string decodemsg = url_decode(msg);
cout << decodemsg << endl;
return 0;
}
Run Code Online (Sandbox Code Playgroud)
编辑:删除了不需要的cctype和iomainip包含.