Hos*_*ein 8 c++ compiler-construction assembly design-patterns
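I'm writing an 8051 assembler.
At the bottom of everything is a tokenizer that reads the next token, sets error flags, recognizes EOF, and so on.
Above it is the assembler's main loop, which reads the next token and checks it against the valid mnemonics: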
mnemonic = NextToken();
if (mnemonic.Error)
{
    // throw some error
}
else if (mnemonic.Text == "ADD")
{
    ...
}
else if (mnemonic.Text == "ADDC")
{
    ...
}
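And it continues like that for many more cases. Worse is the code inside each case, which checks for valid parameters and then converts them to compiled code. Right now it looks like this: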
if (mnemonic.Text == "MOV")
{
    arg1 = NextToken();
    if (arg1.Error) { /* throw error */ break; }
    arg2 = NextToken();
    if (arg2.Error) { /* throw error */ break; }
    if (arg1.Text == "A")
    {
        if (arg2.Text == "B")
            output << 0x1234; // Example compiled code
        else if (arg2.Text == "@B")
            output << 0x5678; // Example compiled code
        else
        { /* throw "Invalid parameters" */ }
    }
    else if (arg1.Text == "B")
    {
        if (arg2.Text == "A")
            output << 0x9ABC; // Example compiled code
        else if (arg2.Text == "@A")
            output << 0x0DEF; // Example compiled code
        else
        { /* throw "Invalid parameters" */ }
    }
}
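For each mnemonic I have to check for valid parameters and then produce the correct compiled code. Very similar parameter-checking code is repeated in the case for every mnemonic.
So is there a design pattern that would improve this code?
Or simply a simpler way to implement it?
Edit: I accepted plinth's answer, thanks to him. If you have other ideas on this, I'd still be happy to learn about them. Thanks, all.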
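I've written a number of assemblers over the years with hand-written parsers, and frankly, you're probably better off using a grammar language and a parser generator.
Here's why: a typical line of assembly will probably look something like this: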
[label:] [instruction|directive][newline]
where an instruction would be:
plain-mnemonic|mnemonic-withargs
and a directive would be:
plain-directive|directive-withargs
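and so on.
With a decent parser generator like Gold, you should be able to knock out a grammar for the 8051 in a few hours. The advantage of this over hand parsing is that you can allow fairly complicated expressions in your assembly code, like: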
.define kMagicNumber 0xdeadbeef
CMPA #(2 * kMagicNumber + 1)
That can be a real bear to do by hand.
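To give a feel for what the hand-written route involves, here is a minimal recursive-descent sketch that evaluates an operand expression such as the one above. Everything in it is an assumption made for illustration: the names `SymbolTable`, `ExprParser` and `EvaluateExpression`, the choice of 64-bit values, and the idea that the caller has already stripped the leading `#` and stored `.define` names in the symbol table. It handles only `+ - * /`, parentheses, numbers and symbols; unary minus, shifts and forward references are left out.

```cpp
#include <cctype>
#include <cstddef>
#include <cstdint>
#include <map>
#include <stdexcept>
#include <string>

// Hypothetical symbol table: names introduced by .define mapped to values.
using SymbolTable = std::map<std::string, std::int64_t>;

class ExprParser {
public:
    ExprParser(const std::string& text, const SymbolTable& symbols)
        : text_(text), symbols_(symbols), pos_(0) {}

    // Parses the whole string and rejects anything left over.
    std::int64_t ParseAll() {
        std::int64_t value = ParseSum();
        SkipSpaces();
        if (pos_ != text_.size())
            throw std::runtime_error("trailing characters in expression");
        return value;
    }

private:
    // sum := product (('+' | '-') product)*
    std::int64_t ParseSum() {
        std::int64_t value = ParseProduct();
        for (;;) {
            if (Accept('+')) value += ParseProduct();
            else if (Accept('-')) value -= ParseProduct();
            else return value;
        }
    }

    // product := primary (('*' | '/') primary)*
    std::int64_t ParseProduct() {
        std::int64_t value = ParsePrimary();
        for (;;) {
            if (Accept('*')) {
                value *= ParsePrimary();
            } else if (Accept('/')) {
                std::int64_t divisor = ParsePrimary();
                if (divisor == 0) throw std::runtime_error("division by zero");
                value /= divisor;
            } else {
                return value;
            }
        }
    }

    // primary := number | symbol | '(' sum ')'
    std::int64_t ParsePrimary() {
        if (Accept('(')) {
            std::int64_t value = ParseSum();
            if (!Accept(')')) throw std::runtime_error("missing ')'");
            return value;
        }
        SkipSpaces();
        std::size_t start = pos_;
        while (pos_ < text_.size() &&
               (std::isalnum(static_cast<unsigned char>(text_[pos_])) ||
                text_[pos_] == '_'))
            ++pos_;
        if (start == pos_)
            throw std::runtime_error("expected a number or a symbol");
        std::string word = text_.substr(start, pos_ - start);
        if (std::isdigit(static_cast<unsigned char>(word[0]))) {
            std::size_t used = 0;
            // Base 0 accepts decimal and 0x-prefixed hex (a leading 0 means octal).
            std::int64_t value = std::stoll(word, &used, 0);
            if (used != word.size())
                throw std::runtime_error("malformed number: " + word);
            return value;
        }
        SymbolTable::const_iterator it = symbols_.find(word);
        if (it == symbols_.end())
            throw std::runtime_error("undefined symbol: " + word);
        return it->second;
    }

    // Consumes c (after optional spaces) if it is the next character.
    bool Accept(char c) {
        SkipSpaces();
        if (pos_ < text_.size() && text_[pos_] == c) { ++pos_; return true; }
        return false;
    }

    void SkipSpaces() {
        while (pos_ < text_.size() &&
               std::isspace(static_cast<unsigned char>(text_[pos_])))
            ++pos_;
    }

    std::string text_;
    const SymbolTable& symbols_;
    std::size_t pos_;
};

std::int64_t EvaluateExpression(const std::string& text,
                                const SymbolTable& symbols) {
    return ExprParser(text, symbols).ParseAll();
}
```

With `kMagicNumber` defined as `0xdeadbeef` in the table, `EvaluateExpression("(2 * kMagicNumber + 1)", symbols)` walks the three grammar levels once. Even this small subset is a fair amount of code, which is the point of the comparison with a generated parser.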
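Now, if you want to do it by hand, make a table of all your mnemonics that also records the addressing modes each one supports and, for each addressing mode, the number of bytes that variant takes and its opcode. Something like this: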
#include <string.h> /* strlen, strncmp, NULL */

enum AddressingMode {
    Implied = 1, Direct = 2, Extended = 4, Indexed = 8 // etc
};
/* for a 4 char mnemonic, this struct will be 5 bytes. A typical small processor
 * has on the order of 100 instructions, making this table come in at ~500 bytes when all
 * is said and done.
 * The time to binary search that will be, worst case, about 7 compares on the mnemonic.
 * I claim that I/O will take way more time than look up.
 * You will also need a table and/or a routine that given a mnemonic and addressing mode
 * will give you the actual opcode.
 */
struct InstructionInfo {
    char Mnemonic[4];     /* not NUL-terminated when all 4 chars are used */
    char AddressingModes; /* bitwise OR of the AddressingMode values supported */
};
/* order them by mnemonic */
static InstructionInfo instrs[] = {
    { {'A', 'D', 'D', '\0'}, Direct|Extended|Indexed },
    { {'A', 'D', 'D', 'A'}, Direct|Extended|Indexed },
    { {'S', 'U', 'B', '\0'}, Direct|Extended|Indexed },
    { {'S', 'U', 'B', 'A'}, Direct|Extended|Indexed }
}; /* etc */
static int nInstrs = sizeof(instrs)/sizeof(InstructionInfo);
InstructionInfo *GetInstruction(const char *mnemonic) {
    /* binary search for mnemonic; returns NULL when it is not in the table */
    if (strlen(mnemonic) > 4) return NULL;
    int lo = 0, hi = nInstrs - 1;
    while (lo <= hi) {
        int mid = lo + (hi - lo) / 2;
        int cmp = strncmp(mnemonic, instrs[mid].Mnemonic, 4);
        if (cmp == 0) return &instrs[mid];
        if (cmp < 0) hi = mid - 1; else lo = mid + 1;
    }
    return NULL;
}
int InstructionSize(AddressingMode mode)
{
    switch (mode) {
    case Implied: return 1;
    /* etc */
    default: return 0; /* placeholder for modes not listed yet */
    }
}
Then you will have a list of every instruction, each of which in turn carries the list of all its addressing modes.
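As a rough illustration of the table-driven idea applied to the question's own `MOV` example, the nested `if` chains can collapse into one lookup. This is only a sketch under stated assumptions: `MakeKey`, `EncodingTable` and `Encode` are hypothetical names, the key is simply the mnemonic and operand text joined into a string, and the 16-bit values are the placeholder codes from the question, not real 8051 opcodes.

```cpp
#include <cstdint>
#include <map>
#include <string>

// Hypothetical key format: "MNEMONIC arg1,arg2".
static std::string MakeKey(const std::string& mnemonic,
                           const std::string& arg1,
                           const std::string& arg2) {
    return mnemonic + " " + arg1 + "," + arg2;
}

// Placeholder encodings copied from the question; not real 8051 opcodes.
static const std::map<std::string, std::uint16_t>& EncodingTable() {
    static const std::map<std::string, std::uint16_t> table = {
        {"MOV A,B",  0x1234},
        {"MOV A,@B", 0x5678},
        {"MOV B,A",  0x9ABC},
        {"MOV B,@A", 0x0DEF},
    };
    return table;
}

// Returns true and stores the encoding in `code` when the combination is
// valid; returns false for "Invalid parameters".
bool Encode(const std::string& mnemonic, const std::string& arg1,
            const std::string& arg2, std::uint16_t& code) {
    const std::map<std::string, std::uint16_t>& table = EncodingTable();
    std::map<std::string, std::uint16_t>::const_iterator it =
        table.find(MakeKey(mnemonic, arg1, arg2));
    if (it == table.end()) return false;
    code = it->second;
    return true;
}
```

Adding an instruction then means adding a row rather than another `else if` branch, and the "Invalid parameters" error is raised in exactly one place. A real table would key on addressing modes rather than literal operand text, since operands such as immediates and addresses vary.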
So your parser becomes something like this:
char *line = ReadLine();
int nextStart = 0;
int labelLen;
char *label = GetLabel(line, &labelLen, nextStart, &nextStart); // may be empty
int mnemonicLen;
char *mnemonic = GetMnemonic(line, &mnemonicLen, nextStart, &nextStart); // may be empty
if (IsOpcode(mnemonic, mnemonicLen)) {
    AddressingModeInfo info = GetAddressingModeInfo(line, nextStart, &nextStart);
    if (IsValidInstruction(mnemonic, info)) {
        GenerateCode(mnemonic, info);
    }
    else throw BadInstructionException(mnemonic, info); // throw by value in C++
}
else if (IsDirective()) { /* etc. */ }
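The parser above leaves open how the addressing mode is worked out from the operand text. One possible starting point, assuming the usual 8051 operand spellings (`A`, `R0` to `R7`, `@...` for indirect, `#...` for immediate, anything else treated as a direct address), is a small classifier like the one below. `OperandKind` and `ClassifyOperand` are hypothetical names, and the categories are a simplification rather than a complete list of 8051 modes.

```cpp
#include <string>

// Hypothetical operand categories; a real assembler would need more
// (bit addresses, DPTR, @A+DPTR, relative jump targets, ...).
enum class OperandKind { Accumulator, Register, Indirect, Immediate, Direct };

OperandKind ClassifyOperand(const std::string& operand) {
    if (operand == "A") return OperandKind::Accumulator;
    if (!operand.empty() && operand[0] == '#') return OperandKind::Immediate;
    if (!operand.empty() && operand[0] == '@') return OperandKind::Indirect;
    // R0..R7 are the only register names accepted here.
    if (operand.size() == 2 && operand[0] == 'R' &&
        operand[1] >= '0' && operand[1] <= '7')
        return OperandKind::Register;
    // Everything else is assumed to be a direct address or a symbol.
    return OperandKind::Direct;
}
```

The result of classifying each operand is what you would check against the per-mnemonic table of allowed addressing modes before generating code.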