Generating assembly from C code in memory using libclang

Ger*_*cia 3 c++ llvm clang libclang

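I need to implement a library that compiles C code to eBPF bytecode using LLVM/Clang as the backend. The code will be read from memory, and I need the resulting assembly code to be stored in memory as well.

So far I have been able to compile to LLVM IR using the following code: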

#include <string>
#include <vector>

#include <clang/Frontend/CompilerInstance.h>
#include <clang/Basic/DiagnosticOptions.h>
#include <clang/Frontend/TextDiagnosticPrinter.h>
#include <clang/CodeGen/CodeGenAction.h>
#include <clang/Basic/TargetInfo.h>
#include <llvm/Support/TargetSelect.h>

using namespace std;
using namespace clang;
using namespace llvm;

int main() {

    constexpr auto testCodeFileName = "test.cpp";
    constexpr auto testCode = "int test() { return 2+2; }";

    // Prepare compilation arguments
    vector<const char *> args;
    args.push_back(testCodeFileName);

    // Prepare DiagnosticEngine 
    DiagnosticOptions DiagOpts;
    TextDiagnosticPrinter *textDiagPrinter =
            new clang::TextDiagnosticPrinter(errs(),
                                         &DiagOpts);
    IntrusiveRefCntPtr<clang::DiagnosticIDs> pDiagIDs;
    DiagnosticsEngine *pDiagnosticsEngine =
            new DiagnosticsEngine(pDiagIDs,
                                         &DiagOpts,
                                         textDiagPrinter);

    // Initialize CompilerInvocation
    CompilerInvocation *CI = new CompilerInvocation();
    CompilerInvocation::CreateFromArgs(*CI, &args[0], &args[0] + args.size(), *pDiagnosticsEngine);

    // Map code filename to a memoryBuffer
    StringRef testCodeData(testCode);
    unique_ptr<MemoryBuffer> buffer = MemoryBuffer::getMemBufferCopy(testCodeData);
    CI->getPreprocessorOpts().addRemappedFile(testCodeFileName, buffer.get());


    // Create and initialize CompilerInstance
    CompilerInstance Clang;
    Clang.setInvocation(CI);
    Clang.createDiagnostics();

    // Set target (I guess I can initialize only the BPF target, but I don't know how)
    InitializeAllTargets();
    const std::shared_ptr<clang::TargetOptions> targetOptions = std::make_shared<clang::TargetOptions>();
    targetOptions->Triple = string("bpf");
    TargetInfo *pTargetInfo = TargetInfo::CreateTargetInfo(*pDiagnosticsEngine,targetOptions);
    Clang.setTarget(pTargetInfo);

    // Create and execute action
    // CodeGenAction *compilerAction = new EmitLLVMOnlyAction();
    CodeGenAction *compilerAction = new EmitAssemblyAction();
    Clang.ExecuteAction(*compilerAction);

    buffer.release();
}

To compile it I use the following CMakeLists.txt:

cmake_minimum_required(VERSION 3.3.2)
project(clang_backend CXX)

set(CMAKE_CXX_COMPILER "clang++")

execute_process(COMMAND llvm-config --cxxflags OUTPUT_VARIABLE LLVM_CONFIG OUTPUT_STRIP_TRAILING_WHITESPACE)
execute_process(COMMAND llvm-config --libs OUTPUT_VARIABLE LLVM_LIBS OUTPUT_STRIP_TRAILING_WHITESPACE)

set(CMAKE_CXX_FLAGS ${LLVM_CONFIG})

set(CLANG_LIBS clang clangFrontend clangDriver clangSerialization clangParse
    clangCodeGen  clangSema clangAnalysis clangEdit clangAST clangLex
    clangBasic )

add_executable(clang_backend main.cpp)
target_link_libraries(clang_backend ${CLANG_LIBS})
target_link_libraries(clang_backend ${LLVM_LIBS})

If I understand correctly, I should be able to generate assembly code by changing the compiler action to EmitAssemblyAction(), but I am probably not initializing something, because I get a segmentation fault in llvm::TargetPassConfig::addPassesToHandleExceptions (this=this@entry=0x6d8d30) at /tmp/llvm-3.7.1.src/lib/CodeGen/Passes.cpp:419

The code at that line is:

switch (TM->getMCAsmInfo()->getExceptionHandlingType()) {

Does anyone have an example or know what I am missing?

Mat*_*son 5

If you build LLVM with assertions enabled, the error is much clearer, and it actually tells you what you need to do:

x: .../src/llvm/lib/CodeGen/LLVMTargetMachine.cpp:63: 
void llvm::LLVMTargetMachine::initAsmInfo(): 
Assertion `TmpAsmInfo && "MCAsmInfo not initialized. " 
"Make sure you include the correct TargetSelect.h" 
"and that InitializeAllTargetMCs() is being invoked!"' failed.

(I added some line breaks, since it is printed as one long line.)

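After adding the required InitializeAllTargetMCs() at the beginning of main, I got another error. Looking at the object-file generation in my own compiler, I "guessed" that this was a problem with another InitializeAll* call. A bit of testing showed that you also need InitializeAllAsmPrinters(); which makes sense, since you want to generate assembly code.

I'm not entirely sure how to "see" the result from within your code, but adding those two calls at the beginning of main makes it run to completion rather than asserting, exiting with an error, or crashing, which is usually a good step in the right direction.

So this is what main looks like in "my" code: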
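To make the failure mode a bit more concrete, here is a toy model (not LLVM's real code) of a registry in which each target component is registered by its own initializer. All names in it (ToyRegistry, ToyTarget, ToyAsmInfo, canEmitAssembly, toyDemo) are made up for illustration. The assumption it encodes is that the asm-info factory and the asm printer are separate slots that stay empty until their own initializer has run, which would explain why registering only the targets leaves a null MCAsmInfo behind.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <memory>
#include <string>

// Toy stand-ins for illustration only; these are NOT LLVM classes.
struct ToyAsmInfo { std::string commentPrefix; };

struct ToyTarget {
    // Each component has its own slot, empty until its initializer runs.
    std::function<std::unique_ptr<ToyAsmInfo>()> asmInfoFactory;
    bool hasAsmPrinter = false;

    // Returns nullptr when no factory was registered (the "segfault" case).
    std::unique_ptr<ToyAsmInfo> createAsmInfo() const {
        if (!asmInfoFactory) return nullptr;
        return asmInfoFactory();
    }
};

class ToyRegistry {
public:
    // Plays the role of InitializeAllTargets(): the target becomes known.
    void initializeTarget(const std::string& name) {
        targets_.insert({name, ToyTarget()});
    }
    // Plays the role of InitializeAllTargetMCs(): fills the asm-info slot.
    void initializeTargetMC(const std::string& name) {
        targets_[name].asmInfoFactory = [] {
            return std::unique_ptr<ToyAsmInfo>(new ToyAsmInfo{"#"});
        };
    }
    // Plays the role of InitializeAllAsmPrinters(): fills the printer slot.
    void initializeAsmPrinter(const std::string& name) {
        targets_[name].hasAsmPrinter = true;
    }
    const ToyTarget* lookup(const std::string& name) const {
        auto it = targets_.find(name);
        return it == targets_.end() ? nullptr : &it->second;
    }
private:
    std::map<std::string, ToyTarget> targets_;
};

// Assembly output needs the target, its asm info, and its asm printer.
bool canEmitAssembly(const ToyRegistry& reg, const std::string& name) {
    const ToyTarget* t = reg.lookup(name);
    return t != nullptr && t->createAsmInfo() != nullptr && t->hasAsmPrinter;
}

// Walks through the same sequence of fixes as described above.
void toyDemo() {
    ToyRegistry reg;
    reg.initializeTarget("bpf");
    assert(reg.lookup("bpf")->createAsmInfo() == nullptr); // first failure
    reg.initializeTargetMC("bpf");
    assert(!canEmitAssembly(reg, "bpf"));                  // second failure
    reg.initializeAsmPrinter("bpf");
    assert(canEmitAssembly(reg, "bpf"));                   // now complete
}
```

In this sketch, calling toyDemo() goes through the same three states: target known but no asm info, asm info present but no printer, and finally everything registered.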

int main() {

    constexpr auto testCodeFileName = "test.cpp";
    constexpr auto testCode = "int test() { return 2+2; }";

    InitializeAllTargetMCs();
    InitializeAllAsmPrinters();

    // Prepare compilation arguments
    vector<const char *> args;
    args.push_back(testCodeFileName);

    // Prepare DiagnosticEngine 
    DiagnosticOptions DiagOpts;
    TextDiagnosticPrinter *textDiagPrinter =
            new clang::TextDiagnosticPrinter(errs(),
                                         &DiagOpts);
    IntrusiveRefCntPtr<clang::DiagnosticIDs> pDiagIDs;
    DiagnosticsEngine *pDiagnosticsEngine =
            new DiagnosticsEngine(pDiagIDs,
                                         &DiagOpts,
                                         textDiagPrinter);

    // Initialize CompilerInvocation
    CompilerInvocation *CI = new CompilerInvocation();
    CompilerInvocation::CreateFromArgs(*CI, &args[0], &args[0] + args.size(), *pDiagnosticsEngine);

    // Map code filename to a memoryBuffer
    StringRef testCodeData(testCode);
    unique_ptr<MemoryBuffer> buffer = MemoryBuffer::getMemBufferCopy(testCodeData);
    CI->getPreprocessorOpts().addRemappedFile(testCodeFileName, buffer.get());


    // Create and initialize CompilerInstance
    CompilerInstance Clang;
    Clang.setInvocation(CI);
    Clang.createDiagnostics();

    // Set target (I guess I can initialize only the BPF target, but I don't know how)
    InitializeAllTargets();
    const std::shared_ptr<clang::TargetOptions> targetOptions = std::make_shared<clang::TargetOptions>();
    targetOptions->Triple = string("bpf");
    TargetInfo *pTargetInfo = TargetInfo::CreateTargetInfo(*pDiagnosticsEngine,targetOptions);
    Clang.setTarget(pTargetInfo);

    // Create and execute action
    // CodeGenAction *compilerAction = new EmitLLVMOnlyAction();
    CodeGenAction *compilerAction = new EmitAssemblyAction();
    Clang.ExecuteAction(*compilerAction);

    buffer.release();
}
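
If you want to develop with Clang and LLVM, I strongly recommend building a debug version of both. It helps in tracking down the "why", and it catches problems earlier and in places where they are more obvious. Pass -DCMAKE_BUILD_TYPE=Debug to cmake to get that flavour.

Full script for getting a build of LLVM and Clang: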

export CC=clang
export CXX=clang++ 
cmake -DCMAKE_BUILD_TYPE=Debug -DCMAKE_INSTALL_PREFIX=/usr/local/llvm-debug -DLLVM_TARGETS_TO_BUILD=X86 ../llvm

[I used a late pre-release of 3.8 to test this, but I very much doubt it differs much from 3.7.1 in this respect.]