Pau*_*aul 28 c++ serialization design-patterns
Whenever I find myself needing to serialize objects in a C++ program, I fall back to this kind of pattern:
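class Serializable {
public:
    virtual ~Serializable() {}

    static Serializable *deserialize(std::istream &is) {
        int id;
        is >> id;
        switch(id) {
        case EXAMPLE_ID:
            return new ExampleClass(is);
        //...
        default:
            return NULL; // unknown class ID
        }
    }

    void serialize(std::ostream &os) {
        os << getClassID();
        serializeMe(os);
    }

protected:
    // Both hooks must be virtual to be declared pure (=0)
    virtual int getClassID()=0;
    virtual void serializeMe(std::ostream &os)=0;
};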
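The above works pretty well in practice. However, I've heard that this kind of switching over class IDs is evil and an antipattern; what's the standard, OO way of handling serialization in C++?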
Yac*_*oby 27
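Use something like Boost Serialization. While by no means a standard, it is a (for the most part) very well written library that does the grunt work for you.
The last time I had to manually parse a predefined record structure with a clear inheritance tree, I ended up using the factory pattern with registrable classes (i.e. a map from keys to (template) creator functions rather than a lot of switch statements) to try to avoid the issue you are having.
EDIT
A basic C++ implementation of the object factory mentioned in the paragraph above.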
/**
 * A class for creating objects, with the type of object created based on a key
 *
 * @param K the key
 * @param T the super class that all created classes derive from
 */
template<typename K, typename T>
class Factory {
private:
    typedef T *(*CreateObjectFunc)();

    /**
     * Maps keys (K) to functions (CreateObjectFunc).
     * When creating a new type, we simply call the function with the required key
     */
    std::map<K, CreateObjectFunc> mObjectCreator;

    /**
     * Pointers to this function are inserted into the map and called when creating objects
     *
     * @param S the type of class to create
     * @return an object with the type of S
     */
    template<typename S>
    static T* createObject(){
        return new S();
    }

public:
    /**
     * Registers a class so that it can be created via createObject()
     *
     * @param S the class to register, this must be a subclass of T
     * @param id the id to associate with the class. This ID must be unique
     */
    template<typename S>
    void registerClass(K id){
        if (mObjectCreator.find(id) != mObjectCreator.end()){
            //your error handling here
        }
        // Naming the pointer type first avoids passing explicit template
        // arguments to std::make_pair, which does not compile since C++11
        CreateObjectFunc creator = &createObject<S>;
        mObjectCreator.insert( std::make_pair(id, creator) );
    }

    /**
     * Returns true if a given key exists
     *
     * @param id the id to check exists
     * @return true if the id exists
     */
    bool hasClass(K id){
        return mObjectCreator.find(id) != mObjectCreator.end();
    }

    /**
     * Creates an object based on an id. It will return null if the key doesn't exist
     *
     * @param id the id of the object to create
     * @return the new object or null if the object id doesn't exist
     */
    T* createObject(K id){
        //Don't use hasClass here as doing so would involve two lookups
        typename std::map<K, CreateObjectFunc>::iterator iter = mObjectCreator.find(id);
        if (iter == mObjectCreator.end()){
            return NULL;
        }
        //calls the required createObject() function
        return ((*iter).second)();
    }
};
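To tie the factory back to the question's deserialize(), here is a condensed sketch of how a registry of creators might replace the switch. The names Shape, Circle, Square, ShapeFactory and the numeric ids are hypothetical and chosen only for illustration; the text format (an id followed by one integer field) is likewise an assumption.

```cpp
#include <cassert>
#include <istream>
#include <map>
#include <memory>
#include <sstream>
#include <string>

// Hypothetical base class; each subclass reads its own fields from the stream.
struct Shape {
    virtual ~Shape() {}
    virtual void load(std::istream& is) = 0;
    virtual std::string name() const = 0;
};

struct Circle : Shape {
    int radius = 0;
    void load(std::istream& is) override { is >> radius; }
    std::string name() const override { return "Circle"; }
};

struct Square : Shape {
    int side = 0;
    void load(std::istream& is) override { is >> side; }
    std::string name() const override { return "Square"; }
};

// Condensed version of the registrable factory: id -> creator function.
class ShapeFactory {
    typedef Shape* (*Creator)();
    std::map<int, Creator> creators_;

    template <typename S>
    static Shape* create() { return new S(); }

public:
    // Returns false (and changes nothing) if the id is already registered.
    template <typename S>
    bool registerClass(int id) {
        Creator creator = &create<S>;
        return creators_.insert(std::make_pair(id, creator)).second;
    }

    // Replaces the switch: read the id, look up the creator, let the object load itself.
    std::unique_ptr<Shape> deserialize(std::istream& is) const {
        int id = 0;
        if (!(is >> id)) return nullptr;
        std::map<int, Creator>::const_iterator it = creators_.find(id);
        if (it == creators_.end()) return nullptr;  // unknown class id
        std::unique_ptr<Shape> obj(it->second());
        obj->load(is);
        return obj;
    }
};
```

Adding a new class then means one registerClass call rather than another case label in a central switch.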
Mat*_* M. 20
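Serialization is a touchy topic in C++...
Quick question:
Both (serialization and messaging) are useful, and each has its uses.
Boost.Serialization is usually the most recommended library for serialization, although the odd choice of operator&, which serializes or deserializes depending on const-ness, is really an abuse of operator overloading.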
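For readers who have not met the idiom, the sketch below shows the general idea of a single operator& that either writes or reads. It is a hand-rolled toy, not Boost's implementation or API: Writer, Reader, Point and roundTrip are hypothetical names, and the direction here is selected by the archive type passed in.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical output archive: operator& writes a value.
class Writer {
    std::ostream& os_;
public:
    explicit Writer(std::ostream& os) : os_(os) {}
    template <typename T>
    Writer& operator&(const T& value) { os_ << value << ' '; return *this; }
};

// Hypothetical input archive: the same operator& reads a value.
class Reader {
    std::istream& is_;
public:
    explicit Reader(std::istream& is) : is_(is) {}
    template <typename T>
    Reader& operator&(T& value) { is_ >> value; return *this; }
};

struct Point {
    int x;
    int y;
    std::string label;  // must not contain spaces in this toy text format

    // One function describes the layout for both directions.
    template <typename Archive>
    void serialize(Archive& ar) { ar & x & y & label; }
};

// Round-trips a Point through an in-memory text stream.
inline Point roundTrip(Point p) {
    std::stringstream buffer;
    Writer w(buffer);
    p.serialize(w);

    Point q = Point();
    Reader r(buffer);
    q.serialize(r);
    return q;
}
```

The appeal is that the field list is written once; the cost, as noted above, is that `&` no longer reads like a bitwise operator.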
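For messaging, I would rather suggest Google Protocol Buffers. They offer a clean syntax for describing messages and generate encoders and decoders for a wide variety of languages. There is another advantage when performance matters: they allow lazy deserialization (i.e. only part of the blob at a time) by design.
Moving on
Now, as for the details of the implementation, it really depends on what you want.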
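A tag + factory system: this is only necessary for polymorphic classes. You then need one factory per inheritance tree (kind)... and the code can of course be templatized!
For pointers, I give each object of a kind an id that is unique within that kind, and I serialize the id rather than the pointer. Some frameworks handle this for you, as long as you have no circular dependencies and serialize the pointed-to/referenced objects first.
Personally, I try as much as I can to separate the serialization/deserialization code from the actual code that runs the class. In particular, I try to isolate it in the source files so that changes to this part of the code do not break binary compatibility.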
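The id-instead-of-pointer idea above can be sketched as follows. Node, saveNodes and loadNodes are hypothetical names, and the text layout is an assumption made for the example. This version resolves ids in a second pass, so it also copes with cycles; that is one possible design, not necessarily what any particular framework does.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <sstream>
#include <vector>

// Hypothetical object that refers to another object of the same kind.
struct Node {
    int id;      // unique among Nodes, never 0
    int value;
    Node* next;  // may be null
};

// Writes "count", then "id value nextId" per node; 0 stands for a null pointer.
inline void saveNodes(const std::vector<Node*>& nodes, std::ostream& os) {
    os << nodes.size() << ' ';
    for (std::size_t i = 0; i < nodes.size(); ++i) {
        const Node* n = nodes[i];
        os << n->id << ' ' << n->value << ' ' << (n->next ? n->next->id : 0) << ' ';
    }
}

// Two passes: first create every node, then turn stored ids back into pointers.
// The caller owns the returned nodes.
inline std::vector<Node*> loadNodes(std::istream& is) {
    std::size_t count = 0;
    is >> count;
    std::vector<Node*> nodes;
    std::vector<int> nextIds;
    std::map<int, Node*> byId;
    for (std::size_t i = 0; i < count; ++i) {
        Node* n = new Node();
        int nextId = 0;
        is >> n->id >> n->value >> nextId;
        n->next = NULL;
        nodes.push_back(n);
        nextIds.push_back(nextId);
        byId[n->id] = n;
    }
    for (std::size_t i = 0; i < count; ++i) {
        std::map<int, Node*>::const_iterator it = byId.find(nextIds[i]);
        if (it != byId.end()) nodes[i]->next = it->second;
    }
    return nodes;
}
```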
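On versioning
I usually try to keep the serialization and deserialization code for one version close together. It is easier to check that they are truly symmetric. I also try to abstract the version handling directly into my serialization framework, plus a few other things, because DRY should be adhered to :)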
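As a rough illustration of keeping each version's writer and reader side by side, here is a minimal sketch. The Person record, its two versions, and the text format are all assumptions made up for the example.

```cpp
#include <cassert>
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical record: version 1 stored only `name`, version 2 added `age`.
struct Person {
    std::string name;  // single word in this toy text format
    int age;
};

// Version 1: writer and reader sit next to each other so symmetry is easy to check.
inline void saveV1(const Person& p, std::ostream& os) { os << p.name << ' '; }
inline void loadV1(Person& p, std::istream& is)       { is >> p.name; p.age = 0; }

// Version 2
inline void saveV2(const Person& p, std::ostream& os) { os << p.name << ' ' << p.age << ' '; }
inline void loadV2(Person& p, std::istream& is)       { is >> p.name >> p.age; }

// Always writes the newest version, prefixed with its number.
inline void save(const Person& p, std::ostream& os) {
    os << 2 << ' ';
    saveV2(p, os);
}

// Reads the version number first and dispatches to the matching reader.
inline Person load(std::istream& is) {
    int version = 0;
    is >> version;
    Person p;
    switch (version) {
    case 1: loadV1(p, is); break;
    case 2: loadV2(p, is); break;
    default:
        throw std::runtime_error("No version " + std::to_string(version) +
                                 " for object Person: only 1 and 2");
    }
    return p;
}
```

Old streams stay readable because loadV1 is kept, while new data is always written in the latest layout.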
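On error handling
To ease error detection, I usually use a pair of "markers" (special bytes) to separate one object from another. This lets me throw immediately during deserialization, because I can detect that the stream has become desynchronized (i.e. something ate too many bytes, or not enough).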
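A minimal sketch of the marker idea, assuming a fixed four-character payload to keep it short. The marker values and the function names writeObject and readObject are hypothetical.

```cpp
#include <cassert>
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical marker bytes; any values unlikely to appear by accident would do.
const char kBegin = '\x02';
const char kEnd   = '\x03';

// Writes one object as: BEGIN, exactly 4 payload characters, END.
inline void writeObject(std::ostream& os, const std::string& fourChars) {
    if (fourChars.size() != 4)
        throw std::invalid_argument("payload must be exactly 4 characters");
    os.put(kBegin);
    os.write(fourChars.data(), 4);
    os.put(kEnd);
}

// Reads one object and throws as soon as a marker is not where it should be.
inline std::string readObject(std::istream& is) {
    if (is.get() != kBegin)
        throw std::runtime_error("Stream is corrupted: missing begin marker");
    char payload[4];
    is.read(payload, 4);
    if (!is || is.get() != kEnd)
        throw std::runtime_error("Stream is corrupted: missing end marker");
    return std::string(payload, 4);
}
```

If a byte goes missing inside one object, the end marker is no longer where the reader expects it, so the failure is reported at that object instead of surfacing later as garbage data.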
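If you want permissive deserialization, i.e. deserializing the rest of the stream even if something failed earlier, you will have to move to a byte count: each object is preceded by its byte count and can only eat that many bytes (and is expected to eat them all). This approach is nice because it allows partial deserialization: you can save the part of the stream needed for an object and only deserialize it if necessary.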
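The byte-count approach might look like the sketch below. The record layout (a decimal count, one space, then the raw payload) and the names writeRecord, readRecord and loadInts are assumptions for illustration; a real format would also cap the count to guard against corrupt or hostile input.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Writes one record as "<byte count> <payload>"; the count is text here for readability.
inline void writeRecord(std::ostream& os, const std::string& payload) {
    os << payload.size() << ' ';
    os.write(payload.data(), static_cast<std::streamsize>(payload.size()));
}

// Reads the raw bytes of the next record without interpreting them.
// Returns false at end of stream or if the record is truncated.
inline bool readRecord(std::istream& is, std::string& payload) {
    std::size_t size = 0;
    if (!(is >> size)) return false;
    is.get();  // consume the single space after the count
    payload.resize(size);
    if (size == 0) return true;
    is.read(&payload[0], static_cast<std::streamsize>(size));
    return static_cast<std::size_t>(is.gcount()) == size;
}

// Permissive loading: a record whose payload fails to parse is skipped,
// and the reader is still positioned at the start of the next record.
inline std::vector<int> loadInts(std::istream& is) {
    std::vector<int> values;
    std::string payload;
    while (readRecord(is, payload)) {
        std::istringstream body(payload);
        int v = 0;
        if (body >> v) values.push_back(v);  // otherwise skip the broken record
    }
    return values;
}
```

Because readRecord hands back an uninterpreted blob, the caller can also store it and parse it later, which is the partial (lazy) deserialization mentioned above.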
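Tagging (your class IDs) is useful here, not (only) for dispatching, but simply to check that you are actually deserializing the right type of object. It also allows for pretty error messages.
Here are some error messages / exceptions you may want: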
No version X for object TYPE: only Y and Z
Stream is corrupted: here are the next few bytes BBBBBBBBBBBBBBBBBBB
TYPE (version X) was not completely deserialized
Trying to deserialize a TYPE1 in TYPE2
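Note that, as far as I remember, both Boost.Serialization and protobuf really help with error/version handling.
protobuf also has some perks because of its ability to nest messages.
The counterpart is that handling polymorphism is harder because of the fixed format of the messages. You have to design them carefully.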
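Yacoby's answer can be extended further.
I believe serialization can be implemented in a way similar to managed languages if one actually implements a reflection system.
We have been using an automated approach for years.
I am one of the implementers of a working C++ postprocessor and reflection library: the LSDC tool and the Linderdaum Engine Core (iObject + RTTI + Linker/Loader). See the sources at http://www.linderdaum.com
The class factory abstracts the process of class instantiation.
To initialize specific members, you can add some intrusive RTTI and autogenerate the load/save procedures for them.
Suppose you have the iObject class at the top of your hierarchy.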
// Assumes <map>, <string>, <vector> and `using namespace std;`
class iMetaClass;
class iProperty;

// Base class with intrusive RTTI
class iObject
{
public:
    iObject(): FMetaClass(NULL) {}
    // Virtual destructor makes the hierarchy polymorphic (needed for dynamic_cast)
    virtual ~iObject() {}
    // const: NativeMetaClass::Construct() is const and stores `this` here
    const iMetaClass* FMetaClass;
};

/// The iMetaClass stores the list of properties and provides the Construct() method:
// List of properties
class iMetaClass: public iObject
{
public:
    virtual iObject* Construct() const = 0;
    /// List of all the properties (excluding the ones from base class)
    vector<iProperty*> FProperties;
    /// Support the hierarchy
    iMetaClass* FSuperClass;
    /// Name of the class
    string FName;
};

// The NativeMetaClass<T> template implements the Construct() method.
template <class T> class NativeMetaClass: public iMetaClass
{
public:
    virtual iObject* Construct() const
    {
        iObject* Res = new T();
        Res->FMetaClass = this;
        return Res;
    }
};

// mlNode is the representation of the markup language: xml, json or whatever else.
// The hierarchy might have come from the XML file or JSON or some custom script
class mlNode {
public:
    string FName;
    string FValue;
    vector<mlNode*> FChildren;
};

class iProperty: public iObject {
public:
    /// Name of the property (set by the registration code)
    string FName;
    /// Load the property from internal tree representation
    virtual void Load( iObject* TheObject, mlNode* Node ) const = 0;
    /// Serialize the property to some internal representation
    virtual mlNode* Save( iObject* TheObject ) const = 0;
};

/// function to save a single field
typedef mlNode* ( *SaveFunction_t )( iObject* Obj );
/// function to load a single field from mlNode
typedef void ( *LoadFunction_t )( iObject* Obj, mlNode* Node );

// The implementation for a scalar/iObject field
// The array-based property requires somewhat different implementation
// Load/Save functions are autogenerated by some tool.
class clFieldProperty : public iProperty {
public:
    clFieldProperty() {}
    virtual ~clFieldProperty() {}
    /// Load single field of an object
    virtual void Load( iObject* TheObject, mlNode* Node ) const {
        FLoadFunction(TheObject, Node);
    }
    /// Save single field of an object
    virtual mlNode* Save( iObject* TheObject ) const {
        return FSaveFunction(TheObject);
    }
public:
    // these pointers are set in property registration code
    LoadFunction_t FLoadFunction;
    SaveFunction_t FSaveFunction;
};

// The Loader class stores the list of metaclasses
class Loader: public iObject {
public:
    void RegisterMetaclass(iMetaClass* C) { FClasses[C->FName] = C; }
    // No error checking: ClassName must have been registered
    iObject* CreateByName(const string& ClassName) { return FClasses[ClassName]->Construct(); }
    /// The implementation is an almost trivial iteration of all the properties
    /// in the metaclass and calling the iProperty's Load/Save methods for each field
    void LoadFromNode(mlNode* Source, iObject** Result);
    /// Create the tree-based representation of the object
    mlNode* Save(iObject* Source);
    map<string, iMetaClass*> FClasses;
};
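When you define a ConcreteClass derived from iObject, you use some extension and a code generator tool to produce the list of save/load procedures and the registration code.
Let us look at the code for this sample.
Somewhere in the framework we have an empty formal define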
#define PROPERTY(...)

/// vec3 is a custom type with implementation omitted for brevity
/// ConcreteClass2 is also omitted
class ConcreteClass: public iObject {
public:
    ConcreteClass(): FInt(10), FString("Default"), FEmbedded(NULL) {}

    /// Inform the tool about our properties
    PROPERTY(Name=Int, Type=int, FieldName=FInt)
    /// We can also provide get/set accessors
    PROPERTY(Name=Pos, Type=vec3, Getter=GetPos, Setter=SetPos)
    /// And the other field
    PROPERTY(Name=Str, Type=string, FieldName=FString)
    /// And the embedded object
    PROPERTY(Name=Embedded, Type=ConcreteClass2, FieldName=FEmbedded)

    /// public field
    int FInt;
    /// public field
    string FString;
    /// public embedded object
    ConcreteClass2* FEmbedded;

    /// Getter
    vec3 GetPos() const { return FPos; }
    /// Setter
    void SetPos(const vec3& Pos) { FPos = Pos; }
private:
    vec3 FPos;
};
The autogenerated registration code would be:
// Forward declarations of the autogenerated field accessors defined below
void Load_ConcreteClass_FInt_Field(iObject* Dest, mlNode* Val);
mlNode* Save_ConcreteClass_FInt_Field(iObject* Source);

/// Call this to add everything to the loader
void Register_ConcreteClass(Loader* L) {
    iMetaClass* C = new NativeMetaClass<ConcreteClass>();
    C->FName = "ConcreteClass";

    clFieldProperty* P = new clFieldProperty();
    P->FName = "Int";
    P->FLoadFunction = &Load_ConcreteClass_FInt_Field;
    P->FSaveFunction = &Save_ConcreteClass_FInt_Field;
    C->FProperties.push_back(P);
    // ... same for FString and GetPos/SetPos

    C->FSuperClass = L->FClasses["iObject"];
    L->RegisterMetaclass(C);
}

// The autogenerated loaders (no error checking for brevity):
void Load_ConcreteClass_FInt_Field(iObject* Dest, mlNode* Val) {
    dynamic_cast<ConcreteClass*>(Dest)->FInt = Str2Int(Val->FValue);
}

mlNode* Save_ConcreteClass_FInt_Field(iObject* Source) {
    mlNode* Res = new mlNode();
    Res->FValue = Int2Str( dynamic_cast<ConcreteClass*>(Source)->FInt );
    return Res;
}
/// similar code for FString and GetPos/SetPos pair with obvious changes
Now, if you have a JSON-like hierarchical script
Object("ConcreteClass") {
    Int 50
    Str 10
    Pos 1.5 2.2 3.3
    Embedded("ConcreteClass2") {
        SomeProp Value
    }
}
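The Loader object (the linker) then resolves all the classes and properties in its Save/Load methods.
Sorry for the long post; the implementation grows even larger once all the error handling comes in.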
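Perhaps I'm not clever, but I think that ultimately the same kind of code you've written gets written anyway, simply because C++ doesn't have the runtime mechanisms to do anything different. The question is whether it is written bespoke by a developer, generated via template metaprogramming (which is what I suspect boost.serialization does), or generated by an external tool such as an IDL compiler/code generator.
Which of those three mechanisms to use (and maybe there are other possibilities, too) is something that should be evaluated on a per-project basis.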