Kaleidoscope Language: User-Defined Operators

A high-level language gets a lot cooler when users can define their own operators. This chapter implements that feature.

The syntax we are aiming for looks like this:

# Logical unary not.
def unary!(v)
  if v then
    0
  else
    1;

# Define > with the same precedence as <.
def binary> 10 (LHS RHS)
  RHS < LHS;

# Binary "logical or", (note that it does not "short circuit")
def binary| 5 (LHS RHS)
  if LHS then
    1
  else if RHS then
    1
  else
    0;

# Define = with slightly lower precedence than relationals.
def binary= 9 (LHS RHS)
  !(LHS < RHS | LHS > RHS);
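The precedence numbers in these definitions decide how a mixed expression groups. As a rough illustration, here is a minimal precedence-climbing sketch that parenthesizes a flat expression. `GroupingParser` and `groupExpression` are hypothetical names, operands are assumed to be single letters, and the table assumes `<` has precedence 10, as the comment on `binary>` implies.

```cpp
#include <cctype>
#include <cstddef>
#include <map>
#include <string>
#include <utility>

// Hypothetical helper, not part of the compiler: it parenthesizes a flat
// expression made of single-letter operands and one-character operators.
class GroupingParser {
    std::map<char, int> Prec; // operator -> precedence, higher binds tighter
    std::string Src;          // input with whitespace removed
    std::size_t Pos = 0;

    char peek() const { return Pos < Src.size() ? Src[Pos] : '\0'; }

    // Unknown characters (including end of input) get precedence -1.
    int precedenceOf(char C) const {
        auto It = Prec.find(C);
        return It == Prec.end() ? -1 : It->second;
    }

    std::string parsePrimary() {
        char C = peek();
        ++Pos;
        return std::string(1, C);
    }

    // Consume operators whose precedence is at least MinPrec.
    std::string parseRHS(int MinPrec, std::string LHS) {
        while (true) {
            char Op = peek();
            int OpPrec = precedenceOf(Op);
            if (OpPrec < MinPrec)
                return LHS;
            ++Pos; // eat the operator
            std::string RHS = parsePrimary();
            // If the next operator binds tighter, it takes RHS first.
            if (precedenceOf(peek()) > OpPrec)
                RHS = parseRHS(OpPrec + 1, RHS);
            LHS = "(" + LHS + " " + Op + " " + RHS + ")";
        }
    }

public:
    explicit GroupingParser(std::map<char, int> Table)
        : Prec(std::move(Table)) {}

    std::string parse(const std::string& Text) {
        Src.clear();
        for (char C : Text)
            if (!std::isspace(static_cast<unsigned char>(C)))
                Src += C;
        Pos = 0;
        return parseRHS(0, parsePrimary());
    }
};

// Precedences taken from the definitions above: | is 5, > is 10 (same as <).
std::string groupExpression(const std::string& Text) {
    GroupingParser P({{'|', 5}, {'<', 10}, {'>', 10}});
    return P.parse(Text);
}
```

With this table, `groupExpression("a < b | a > b")` groups both comparisons before the `|`, which is what the body of `binary=` relies on.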

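Under the hood, an operator is just a function call. The only extra work is that the parser must be able to recognize the operator in the code that follows its definition.

So the first step is to parse the operator definition.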
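Before the parser can see an operator definition, the lexer has to turn the words `unary` and `binary` into the `tok_unary` and `tok_binary` tokens that the parser switches on. A minimal sketch of that keyword check follows. The helper `classifyWord` is a hypothetical name, and the numeric token values are an assumption that follows the convention of the LLVM tutorial.

```cpp
#include <string>

// Token values are assumed (LLVM tutorial convention); only the names
// tok_identifier, tok_number, tok_unary and tok_binary appear in the parser.
enum Token {
    tok_eof = -1,
    tok_def = -2,
    tok_extern = -3,
    tok_identifier = -4,
    tok_number = -5,
    tok_binary = -11,
    tok_unary = -12,
};

// Hypothetical helper: map an already-scanned word to its token.
// Anything that is not a keyword is an ordinary identifier.
int classifyWord(const std::string& Word) {
    if (Word == "def")
        return tok_def;
    if (Word == "extern")
        return tok_extern;
    if (Word == "binary")
        return tok_binary;
    if (Word == "unary")
        return tok_unary;
    return tok_identifier;
}
```

Because keywords are negative and plain characters such as `!` or `>` are returned as their own ASCII value, the parser can tell the operator character apart from every keyword.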

Parser

// prototype
//   ::= id '(' id* ')'
//   ::= unary LETTER '(' id ')'
//   ::= binary LETTER number? '(' id id ')'
std::unique_ptr<PrototypeAST> ParsePrototype() {
    std::string FnName;

    unsigned Kind = 0; // 0 = identifier, 1 = unary, 2 = binary.
    unsigned BinaryPrecedence = 30;

    switch (CurTok) {
        default:
            return LogErrorP("Expected function name in prototype");
        case tok_identifier:
            FnName = IdentifierStr;
            Kind = 0;
            getNextToken();
            break;
        case tok_unary: // user-defined unary operator (one operand)
            getNextToken(); // eat unary.
            if (!isascii(CurTok))
                return LogErrorP("Expected unary operator");
            FnName = "unary";
            FnName += (char)CurTok; // append the operator character
            Kind = 1;
            getNextToken();
            break;
        case tok_binary: // user-defined binary operator (two operands)
            getNextToken(); // eat binary.
            if (!isascii(CurTok))
                return LogErrorP("Expected binary operator");
            FnName = "binary";
            FnName += (char)CurTok; // append the operator character
            Kind = 2;
            getNextToken();

            // A binary operator may specify a precedence; read it if present.
            if (CurTok == tok_number) {
                if (NumVal < 1 || NumVal > 100)
                    return LogErrorP("Invalid precedence: must be 1..100");
                BinaryPrecedence = (unsigned)NumVal;
                getNextToken();
            }
            break;
    }

    // From here on, parsing is the same as for an ordinary function prototype.
    if (CurTok != '(')
        return LogErrorP("Expected '(' in prototype");

    std::vector<std::string> ArgNames;
    while (getNextToken() == tok_identifier)
        ArgNames.push_back(IdentifierStr);
    if (CurTok != ')')
        return LogErrorP("Expected ')' in prototype");

    // success.
    getNextToken(); // eat ')'.

    // Verify right number of names for operator.
    if (Kind && ArgNames.size() != Kind)
        return LogErrorP("Invalid number of operands for operator");

    return std::make_unique<PrototypeAST>(
        FnName,
        std::move(ArgNames),
        Kind != 0,
        BinaryPrecedence);
}
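`ParsePrototype()` passes four arguments to `PrototypeAST`, so the node has to remember whether it is an operator and, for binary operators, its precedence. Below is a sketch of how such a node might look, modeled on the LLVM tutorial. The member names and the `registerOperator` helper are assumptions, and the real class would also carry a `codegen()` method.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Sketch of a prototype node that accepts the four arguments used by
// ParsePrototype(). Member names are assumptions.
class PrototypeAST {
    std::string Name;
    std::vector<std::string> Args;
    bool IsOperator;
    unsigned Precedence; // only meaningful for binary operators

public:
    PrototypeAST(std::string Name, std::vector<std::string> Args,
                 bool IsOperator = false, unsigned Prec = 0)
        : Name(std::move(Name)), Args(std::move(Args)),
          IsOperator(IsOperator), Precedence(Prec) {}

    const std::string& getName() const { return Name; }
    bool isUnaryOp() const { return IsOperator && Args.size() == 1; }
    bool isBinaryOp() const { return IsOperator && Args.size() == 2; }

    // The operator is the last character of "unary!" or "binary>".
    char getOperatorName() const {
        assert(isUnaryOp() || isBinaryOp());
        return Name.back();
    }

    unsigned getBinaryPrecedence() const { return Precedence; }
};

// Hypothetical helper: make a newly defined binary operator known to the
// expression parser by adding it to the precedence table.
void registerOperator(const PrototypeAST& Proto,
                      std::map<char, int>& BinopPrecedence) {
    if (Proto.isBinaryOp())
        BinopPrecedence[Proto.getOperatorName()] =
            static_cast<int>(Proto.getBinaryPrecedence());
}
```

Registering the precedence once the definition has been seen is what lets later expressions use the new operator. Unary operators need no table entry because they carry no precedence.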

CodeGen

In the code generation for binary expressions, we also need to emit a call for user-defined operators:

llvm::Value* BinaryExprAST::codegen() {
    llvm::Value* L = LHS->codegen();
    llvm::Value* R = RHS->codegen();
    if (!L || !R)
        return nullptr;

    switch (Op) {
        case '+':
            return Builder->CreateFAdd(L, R, "addtmp");
        case '-':
            return Builder->CreateFSub(L, R, "subtmp");
        case '*':
            return Builder->CreateFMul(L, R, "multmp");
        case '<':
            L = Builder->CreateFCmpULT(L, R, "cmptmp");
            // Convert bool 0/1 to double 0.0 or 1.0
            return Builder->CreateUIToFP(L, llvm::Type::getDoubleTy(*TheContext), "booltmp");
        default:
            break;
    }

    // Not a built-in operator, so it must be user-defined: emit a call to its function.
    llvm::Function* F = getFunction(std::string("binary") + Op);
    assert(F && "binary operator not found!");

    llvm::Value* Ops[2] = {L, R};
    return Builder->CreateCall(F, Ops, "binop");
}

Unary operators are handled the same way, so the code is not repeated here.
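To make the lookup concrete without pulling in LLVM, here is a small stand-in for the unary case. It is not the actual codegen: a `std::map` of callables plays the role of the module's function table, and `FunctionTable` and `applyUnary` are hypothetical names. The real version would presumably use `getFunction` and `Builder->CreateCall` with a single operand, as the binary case does.

```cpp
#include <functional>
#include <map>
#include <stdexcept>
#include <string>

// Stand-in for the module's function table (an assumption for this sketch).
using FunctionTable = std::map<std::string, std::function<double(double)>>;

// Hypothetical helper: resolve a unary operator the same way the binary
// codegen does, by looking up the function named "unary" + operator.
double applyUnary(const FunctionTable& Functions, char Opcode, double Operand) {
    auto It = Functions.find(std::string("unary") + Opcode);
    if (It == Functions.end())
        throw std::runtime_error(std::string("Unknown unary operator: ") + Opcode);
    return It->second(Operand);
}
```

Registering a callable under `"unary!"` that returns 1.0 for zero and 0.0 otherwise mirrors the `unary!` definition at the top of this chapter, and `applyUnary(Functions, '!', 0.0)` then dispatches to it by name.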
