The Kaleidoscope Language: LLVM IR

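Once parsing is done, we can convert the AST into LLVM IR. Converting to IR is not strictly required: an interpreted language can walk the AST and evaluate it directly, as in Evaluating Expressions.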

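A compiled language, however, has to be lowered to some intermediate form first. Bytecode, for example, is one kind of IR. In this post we convert the AST into LLVM IR, which LLVM can then run.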
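To make the contrast concrete, here is a minimal sketch of the interpreter route: a tree-walking evaluator that computes a value straight from the AST and never produces any IR. The names `Node`, `Num`, `Bin` and `evalExample` are hypothetical and exist only for this illustration; they are not part of the Kaleidoscope sources.

```cpp
#include <memory>
#include <utility>

// Hypothetical mini AST, used only for this sketch.
struct Node {
  virtual ~Node() = default;
  // Interpreter route: compute the value directly, no IR involved.
  virtual double eval() const = 0;
};

struct Num : Node {
  double Val;
  explicit Num(double V) : Val(V) {}
  double eval() const override { return Val; }
};

struct Bin : Node {
  char Op;
  std::unique_ptr<Node> LHS, RHS;
  Bin(char O, std::unique_ptr<Node> L, std::unique_ptr<Node> R)
      : Op(O), LHS(std::move(L)), RHS(std::move(R)) {}

  double eval() const override {
    double L = LHS->eval();
    double R = RHS->eval();
    switch (Op) {
    case '+': return L + R;
    case '-': return L - R;
    case '*': return L * R;
    case '<': return L < R ? 1.0 : 0.0; // comparison yields 0.0 or 1.0
    default:  return 0.0;               // unknown operator: the sketch just returns 0
    }
  }
};

// Evaluates 1 + 2 * 3 by walking the tree.
double evalExample() {
  auto Mul = std::make_unique<Bin>('*', std::make_unique<Num>(2.0),
                                   std::make_unique<Num>(3.0));
  Bin Add('+', std::make_unique<Num>(1.0), std::move(Mul));
  return Add.eval();
}
```

The `codegen` methods in the rest of this post have the same recursive shape, except that each node returns a handle to an emitted instruction instead of a computed number.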

Base class

/// ExprAST - Base class for all expression nodes.
class ExprAST {
public:
  virtual ~ExprAST() = default;
  // New: pure virtual method that emits LLVM IR for this node; implemented by each subclass.
  virtual Value *codegen() = 0;
};

For example, the codegen function of NumberExprAST is very simple:

Value *NumberExprAST::codegen() {
  return ConstantFP::get(*TheContext, APFloat(Val));
}

Binary operators are handled as follows:

Value *BinaryExprAST::codegen() {
  Value *L = LHS->codegen();
  Value *R = RHS->codegen();
  if (!L || !R)
    return nullptr;

  switch (Op) {
  case '+':
    return Builder->CreateFAdd(L, R, "addtmp");
  case '-':
    return Builder->CreateFSub(L, R, "subtmp");
  case '*':
    return Builder->CreateFMul(L, R, "multmp");
  case '<':
    L = Builder->CreateFCmpULT(L, R, "cmptmp");
    // Convert bool 0/1 to double 0.0 or 1.0
    return Builder->CreateUIToFP(L, Type::getDoubleTy(*TheContext),
                                 "booltmp");
  default:
    return LogErrorV("invalid binary operator");
  }
}
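The recursion above can be tried out without linking against LLVM. The sketch below swaps the real builder for a hypothetical `TextBuilder` that just records instruction text; `Expr`, `Number`, `Variable`, `Binary` and `emitFooBody` are likewise made-up names. The renaming scheme (append a running number when a name is already taken) is an assumption chosen to mirror names such as `multmp1` and `addtmp4`, not a description of how LLVM implements it. The `<` operator and error reporting are left out.

```cpp
#include <cstdio>
#include <memory>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the IR builder: collects instructions as text.
struct TextBuilder {
  std::vector<std::string> Insts;
  std::set<std::string> Used;
  unsigned NextSuffix = 0;

  // Returns Hint if it is unused, otherwise Hint plus a running number.
  std::string uniqueName(const std::string &Hint) {
    std::string Name = Hint;
    while (!Used.insert(Name).second)
      Name = Hint + std::to_string(++NextSuffix);
    return Name;
  }

  // Appends "%name = op double lhs, rhs" and returns "%name".
  std::string emit(const std::string &Op, const std::string &L,
                   const std::string &R, const std::string &Hint) {
    std::string Name = "%" + uniqueName(Hint);
    Insts.push_back(Name + " = " + Op + " double " + L + ", " + R);
    return Name;
  }
};

struct Expr {
  virtual ~Expr() = default;
  // Returns the operand text for this node's value; empty string on error.
  virtual std::string codegen(TextBuilder &B) const = 0;
};

struct Number : Expr {
  double Val;
  explicit Number(double V) : Val(V) {}
  std::string codegen(TextBuilder &) const override {
    char Buf[32];
    std::snprintf(Buf, sizeof Buf, "%e", Val); // constants emit no instruction
    return Buf;
  }
};

struct Variable : Expr {
  std::string Name;
  explicit Variable(std::string N) : Name(std::move(N)) {}
  std::string codegen(TextBuilder &) const override { return "%" + Name; }
};

struct Binary : Expr {
  char Op;
  std::unique_ptr<Expr> LHS, RHS;
  Binary(char O, std::unique_ptr<Expr> L, std::unique_ptr<Expr> R)
      : Op(O), LHS(std::move(L)), RHS(std::move(R)) {}

  std::string codegen(TextBuilder &B) const override {
    std::string L = LHS->codegen(B); // left operand is emitted first
    std::string R = RHS->codegen(B);
    if (L.empty() || R.empty())
      return "";
    switch (Op) {
    case '+': return B.emit("fadd", L, R, "addtmp");
    case '-': return B.emit("fsub", L, R, "subtmp");
    case '*': return B.emit("fmul", L, R, "multmp");
    default:  return ""; // unsupported operator in this sketch
    }
  }
};

// Builds a*a + 2*a*b + b*b and returns the emitted instruction list.
std::vector<std::string> emitFooBody() {
  auto Var = [](const char *N) { return std::make_unique<Variable>(N); };
  auto MakeBin = [](char Op, std::unique_ptr<Expr> L, std::unique_ptr<Expr> R) {
    return std::make_unique<Binary>(Op, std::move(L), std::move(R));
  };
  auto AA = MakeBin('*', Var("a"), Var("a"));
  auto TwoAB = MakeBin(
      '*', MakeBin('*', std::make_unique<Number>(2.0), Var("a")), Var("b"));
  auto BB = MakeBin('*', Var("b"), Var("b"));
  auto Sum = MakeBin('+', MakeBin('+', std::move(AA), std::move(TwoAB)),
                     std::move(BB));

  TextBuilder B;
  std::string Result = Sum->codegen(B);
  B.Insts.push_back("ret double " + Result);
  return B.Insts;
}
```

Because each node emits its left operand before its right one, `emitFooBody()` yields the instructions in the same order as the body of `foo` in the session at the end of this post.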

The call expression node is converted as follows:

Value *CallExprAST::codegen() {
  // Look up the name in the global module table.
  Function *CalleeF = TheModule->getFunction(Callee);
  if (!CalleeF)
    return LogErrorV("Unknown function referenced");

  // If argument mismatch error.
  if (CalleeF->arg_size() != Args.size())
    return LogErrorV("Incorrect # arguments passed");

  std::vector<Value *> ArgsV;
  for (unsigned i = 0, e = Args.size(); i != e; ++i) {
    ArgsV.push_back(Args[i]->codegen());
    if (!ArgsV.back())
      return nullptr;
  }

  return Builder->CreateCall(CalleeF, ArgsV, "calltmp");
}
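The two checks that come before the call is emitted (does the callee exist, and does the argument count match) can be exercised in isolation. The sketch below is a stand-in only: it models the module's function table as a `std::map` from name to parameter count, and `FunctionTable` and `checkCall` are hypothetical names that do not appear in LLVM or in the tutorial sources.

```cpp
#include <cstddef>
#include <map>
#include <string>

// Hypothetical stand-in for the module's function table: name -> parameter count.
using FunctionTable = std::map<std::string, std::size_t>;

// Returns an empty string when the call is valid, otherwise the error message.
std::string checkCall(const FunctionTable &Funcs, const std::string &Callee,
                      std::size_t NumArgs) {
  auto It = Funcs.find(Callee);
  if (It == Funcs.end())
    return "Unknown function referenced"; // no such function declared
  if (It->second != NumArgs)
    return "Incorrect # arguments passed"; // arity mismatch
  return "";
}
```

With a table holding only `foo` with two parameters, `checkCall(Table, "foo", 2)` returns an empty string, while calling an undeclared `bar` or passing three arguments to `foo` returns the corresponding message.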

Final result

ready> def foo(a b) a*a + 2*a*b + b*b;
ready> Read function definition:define double @foo(double %a, double %b) {
entry:
  %multmp = fmul double %a, %a
  %multmp1 = fmul double 2.000000e+00, %a
  %multmp2 = fmul double %multmp1, %b
  %addtmp = fadd double %multmp, %multmp2
  %multmp3 = fmul double %b, %b
  %addtmp4 = fadd double %addtmp, %multmp3
  ret double %addtmp4
}

The generated IR is quite readable.
