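sub*_*sub 7 c++ interpreter typing
"Introduction"
I'm fairly new to C++. I have worked through all the basics and managed to build two or three simple interpreters for my programming languages.
The first thing that gave me a headache: implementing my language's type system in C++.
Think about it: Ruby, Python, PHP and co. have lots of built-in types that are obviously implemented in C. So what I tried first was to give my language three possible types: Int, String and Nil.
I came up with this: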
enum ValueType
{
    Int, String, Nil
};
class Value
{
public:
    ValueType type;
    int intVal;
    string stringVal;
};
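Yeah, wow, I know. Passing this class around was extremely slow, because the string allocator had to be called all the time.
Next I tried something like this: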
enum ValueType
{
    Int, String, Nil
};
extern string stringTable[255];
class Value
{
public:
    ValueType type;
    int index;
};
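I would store all strings in stringTable and write their position into index. If the type of the Value was Int, I just stored the integer itself in index; it wouldn't make any sense to use an int index to access another int, would it?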
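For illustration, the table idea described above could be sketched roughly as follows. This is only a sketch under assumptions: `StringTable`, `intern` and `lookup` are hypothetical names that are not in my original code, and a growable `std::vector` plus a `std::map` stand in for the fixed 255-entry array.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical helper: stores each distinct string once and hands out
// a small integer handle that is cheap to copy inside a Value.
class StringTable
{
public:
    // Returns the handle for s, adding s to the table on first use.
    std::size_t intern (const std::string &s)
    {
        std::map<std::string, std::size_t>::const_iterator it = index_.find (s);
        if (it != index_.end ())
            return it->second;
        strings_.push_back (s);
        index_[s] = strings_.size () - 1;
        return strings_.size () - 1;
    }

    // Looks a handle up again; the handle must come from intern().
    const std::string &lookup (std::size_t handle) const
    {
        assert (handle < strings_.size ());
        return strings_[handle];
    }

    // Number of distinct strings stored so far.
    std::size_t size () const { return strings_.size (); }

private:
    std::vector<std::string> strings_;
    std::map<std::string, std::size_t> index_;
};
```

With something like this, a Value of type String would only carry the handle, and equal strings would share a single table entry.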
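Anyway, the above gave me a headache too. After a while, accessing a string from the table here, referencing it there and copying it somewhere else grew over my head, and I lost control. I had to put the interpreter draft down.
Now: OK, so C and C++ are statically typed.
How do the main implementations of the languages mentioned above handle the different types in their programs (fixnums, bignums, nums, strings, arrays, resources, ...)?
What should I do to get maximum speed with many different types available?
How do the solutions compare to my simplified versions above?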
One obvious solution is to define a type hierarchy:
class Type
{
};
class Int : public Type
{
};
class String : public Type
{
};
and so on. As a complete example, let us write an interpreter for a tiny language. The language allows declaring variables like this:
var a 10
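This creates an Int object, assigns it the value 10 and stores it in the variable table under the name a. Operations can be invoked on variables. For instance, the addition operation on two Int values looks like this: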
+ a b
Here is the complete code for the interpreter:
#include <iostream>
#include <string>
#include <vector>
#include <sstream>
#include <cstdlib>
#include <map>
// The base Type object from which all data types are derived.
class Type
{
public:
    typedef std::vector<Type*> TypeVector;
    virtual ~Type () { }
    // Some functions that you may want all types of objects to support:
    // Returns the string representation of the object.
    virtual const std::string toString () const = 0;
    // Returns true if other_obj is the same as this.
    virtual bool equals (const Type &other_obj) = 0;
    // Invokes an operation on this object with the objects in args
    // as arguments.
    virtual Type* invoke (const std::string &opr, const TypeVector &args) = 0;
};
// An implementation of Type to represent an integer. The C++ int is
// used to actually store the value. As a consequence this type is
// machine dependent, which might not be what you want for a real
// high-level language.
class Int : public Type
{
public:
    Int () : value_ (0), ret_ (NULL) { }
    Int (int v) : value_ (v), ret_ (NULL) { }
    Int (const std::string &v) : value_ (std::atoi (v.c_str ())), ret_ (NULL) { }
    virtual ~Int ()
    {
        delete ret_;
    }
    virtual const std::string toString () const
    {
        std::ostringstream out;
        out << value_;
        return out.str ();
    }
    virtual bool equals (const Type &other_obj)
    {
        if (&other_obj == this)
            return true;
        // The pointer form of dynamic_cast yields NULL on a type mismatch,
        // so no exception handling is needed.
        const Int *i = dynamic_cast<const Int*> (&other_obj);
        return i != NULL && value_ == i->value_;
    }
    // As of now, Int supports only addition, represented by '+'.
    virtual Type* invoke (const std::string &opr, const TypeVector &args)
    {
        if (opr == "+")
        {
            return add (args);
        }
        return NULL;
    }
private:
    Type* add (const TypeVector &args)
    {
        // Addition needs one operand, and it must be an Int.
        if (args.empty ()) return NULL;
        const Int *arg = dynamic_cast<const Int*> (args[0]);
        if (arg == NULL) return NULL;
        // The result object is owned by this Int and reused on every call,
        // so the returned pointer is only valid until the next addition.
        if (ret_ == NULL) ret_ = new Int;
        ret_->value_ = value_ + arg->value_;
        return ret_;
    }
    int value_;
    Int *ret_;
};
// We use std::map as a symbol (or variable) table.
typedef std::map<std::string, Type*> VarsTable;
typedef std::vector<std::string> Tokens;
// A simple tokenizer for our language. Takes a line and
// tokenizes it based on whitespaces.
static void
tokenize (const std::string &line, Tokens &tokens)
{
    std::istringstream in (line);
    std::string token;
    // Testing the extraction itself (rather than eof) avoids pushing an
    // empty token when the line ends in whitespace.
    while (in >> token)
        tokens.push_back (token);
}
// Maps varName to an Int object in the symbol table. To support
// other Types, we need a more complex interpreter that actually infers
// the type of object by looking at the format of value.
static void
setVar (const std::string &varName, const std::string &value,
        VarsTable &vars)
{
    // operator[] creates a NULL entry for a new name; deleting NULL is
    // harmless, and for an existing name the old object is released.
    Type *&slot = vars[varName];
    delete slot;
    slot = new Int (value);
}
// Returns a previously mapped value from the symbol table.
static Type *
getVar (const std::string &varName, const VarsTable &vars)
{
    VarsTable::const_iterator iter = vars.find (varName);
    if (iter == vars.end ())
    {
        std::cout << "Variable " << varName
                  << " not found." << std::endl;
        return NULL;
    }
    return iter->second;
}
// Invokes opr on the object mapped to the name var01.
// opr should represent a binary operation. var02 will
// be pushed to the args vector. The string representation of
// the result is printed to the console.
static void
invoke (const std::string &opr, const std::string &var01,
        const std::string &var02, const VarsTable &vars)
{
    Type::TypeVector args;
    Type *arg01 = getVar (var01, vars);
    if (arg01 == NULL) return;
    Type *arg02 = getVar (var02, vars);
    if (arg02 == NULL) return;
    args.push_back (arg02);
    Type *ret = NULL;
    if ((ret = arg01->invoke (opr, args)) != NULL)
        std::cout << "=> " << ret->toString () << std::endl;
    else
        std::cout << "Failed to invoke " << opr << " on "
                  << var01 << std::endl;
}
// A simple REPL for our language. Type 'quit' to exit
// the loop.
int
main ()
{
    VarsTable vars;
    std::string line;
    while (std::getline (std::cin, line))
    {
        if (line == "quit")
            break;
        Tokens tokens;
        tokenize (line, tokens);
        if (tokens.size () != 3)
        {
            std::cout << "Invalid expression." << std::endl;
            continue;
        }
        if (tokens[0] == "var")
            setVar (tokens[1], tokens[2], vars);
        else
            invoke (tokens[0], tokens[1], tokens[2], vars);
    }
    // Release the objects owned by the symbol table.
    for (VarsTable::iterator it = vars.begin (); it != vars.end (); ++it)
        delete it->second;
    return 0;
}
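As a rough illustration of how a second type could plug into the same hierarchy, here is a sketch of a String type. It is not part of the interpreter above: the base class is repeated in trimmed form so that the sketch compiles on its own, and the choice that '+' means concatenation and returns a freshly allocated object is an assumption made for this example.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Trimmed-down copy of the base class, repeated so the sketch compiles alone.
class Type
{
public:
    typedef std::vector<Type*> TypeVector;
    virtual ~Type () { }
    virtual const std::string toString () const = 0;
    virtual Type* invoke (const std::string &opr, const TypeVector &args) = 0;
};

// Hypothetical String type: '+' appends the string form of the operand.
class String : public Type
{
public:
    explicit String (const std::string &v) : value_ (v) { }

    virtual const std::string toString () const { return value_; }

    // In this sketch the result is a new object that the caller must delete.
    virtual Type* invoke (const std::string &opr, const TypeVector &args)
    {
        if (opr != "+" || args.empty () || args[0] == NULL)
            return NULL;
        return new String (value_ + args[0]->toString ());
    }

private:
    std::string value_;
};
```

Because the operand is only used through toString(), this String can be concatenated with any other Type without knowing its concrete class.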
A sample interaction with the interpreter:
/home/me $ ./mylang
var a 10
var b 20
+ a b
=> 30
+ a c
Variable c not found.
quit
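There are a few different things you can do here. Different solutions have come up over time, and most of them require dynamic allocation of the actual datum (boost::variant can avoid dynamically allocated memory for small objects; thanks @MSalters).
Pure C approach:
Store the type information (usually an enum) together with a void pointer to memory that has to be interpreted according to that type information: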
#include <stdlib.h>
typedef enum type {
    integer,
    string,
    null
} type_t;
typedef struct variable {
    type_t type;
    void * datum;
} variable_t;
void init_int_variable( variable_t * var, int value )
{
    var->type = integer;
    var->datum = malloc( sizeof(int) );
    *((int *)var->datum) = value; // cast to int*, not int; assumes malloc succeeded
}
void fini_variable( variable_t var ) // optionally by pointer
{
    free( var.datum );
}
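In C++ you can improve this approach by using classes to simplify its use, but more importantly you can go for more complex solutions and use existing libraries such as boost::any or boost::variant, which offer different solutions to the same problem.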
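A minimal sketch of what "using classes" could mean here, assuming only integers, strings and a null value are needed. The class name `Variable` and its members are hypothetical. The class owns its datum, so construction and destruction replace the hand-paired init/fini calls, and only strings are allocated on the heap.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <utility>

// Hypothetical C++ wrapper around the tagged-pointer idea.
class Variable
{
public:
    enum Kind { Integer, String, Null };

    Variable () : kind_ (Null), int_ (0), str_ (0) { }
    explicit Variable (int v) : kind_ (Integer), int_ (v), str_ (0) { }
    explicit Variable (const std::string &s)
        : kind_ (String), int_ (0), str_ (new std::string (s)) { }

    // Deep copy, so two Variables never share one heap string.
    Variable (const Variable &o)
        : kind_ (o.kind_), int_ (o.int_),
          str_ (o.str_ ? new std::string (*o.str_) : 0) { }

    // Copy-and-swap: the by-value parameter is the copy, its destructor
    // releases whatever this object held before.
    Variable &operator= (Variable o)
    {
        std::swap (kind_, o.kind_);
        std::swap (int_, o.int_);
        std::swap (str_, o.str_);
        return *this;
    }

    ~Variable () { delete str_; }

    Kind kind () const { return kind_; }
    int asInt () const { assert (kind_ == Integer); return int_; }
    const std::string &asString () const { assert (kind_ == String); return *str_; }

private:
    Kind kind_;
    int int_;            // used when kind_ == Integer
    std::string *str_;   // owned; non-null only when kind_ == String
};
```

The accessors assert on the tag, which catches the "interpreted according to the wrong type" mistakes that the void pointer version allows silently.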
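boost::any stores the value in dynamically allocated memory, behind a pointer to a virtual class in a small hierarchy, and uses operators that reinterpret (downcast) it to the concrete type. boost::variant instead keeps the value in storage embedded in the variant object itself and records which of its alternative types is currently active.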
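To make the "pointer to a virtual class plus downcast" idea concrete, here is a miniature type-erasing container. It is a sketch written for this answer, not Boost's actual code: the names `Any`, `HolderBase`, `Holder` and `get` are hypothetical, and many features of boost::any are left out.

```cpp
#include <cstddef>
#include <string>

// Hypothetical miniature of the boost::any idea.
class Any
{
public:
    Any () : held_ (NULL) { }

    // Wraps a copy of value in a heap-allocated Holder<T>.
    template <typename T>
    Any (const T &value) : held_ (new Holder<T> (value)) { }

    Any (const Any &other)
        : held_ (other.held_ ? other.held_->clone () : NULL) { }

    Any &operator= (const Any &other)
    {
        if (this != &other)
        {
            HolderBase *copy = other.held_ ? other.held_->clone () : NULL;
            delete held_;
            held_ = copy;
        }
        return *this;
    }

    ~Any () { delete held_; }

    bool empty () const { return held_ == NULL; }

    // Returns a pointer to the stored T, or NULL if another type is held.
    template <typename T>
    const T *get () const
    {
        const Holder<T> *h = dynamic_cast<const Holder<T>*> (held_);
        return h ? &h->value : NULL;
    }

private:
    // The virtual base that erases the concrete type.
    struct HolderBase
    {
        virtual ~HolderBase () { }
        virtual HolderBase *clone () const = 0;
    };

    template <typename T>
    struct Holder : HolderBase
    {
        explicit Holder (const T &v) : value (v) { }
        virtual HolderBase *clone () const { return new Holder (value); }
        T value;
    };

    HolderBase *held_;
};
```

Every stored value costs one heap allocation and every access costs a dynamic_cast, which is the trade-off against a variant-style design that fixes the set of types up front.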