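For a high-level language, letting users define their own operators makes it that much more appealing. In this chapter we implement that feature.
The syntax we are aiming for looks like this: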
# Logical unary not.
def unary!(v)
  if v then
    0
  else
    1;

# Define > with the same precedence as <.
def binary> 10 (LHS RHS)
  RHS < LHS;

# Binary "logical or", (note that it does not "short circuit")
def binary| 5 (LHS RHS)
  if LHS then
    1
  else if RHS then
    1
  else
    0;

# Define = with slightly lower precedence than relationals.
def binary= 9 (LHS RHS)
  !(LHS < RHS | LHS > RHS);
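Under the hood, an operator is just a function call. The only extra work is making sure the parser can recognize the operator when it shows up later in the source.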
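To make the "operator is just a function call" idea concrete, here is a minimal sketch that leaves LLVM out entirely. `FunctionTable`, `BinaryOperatorName`, `DefineGreaterThan`, and `CallUserBinaryOperator` are hypothetical names introduced only for illustration. The table stands in for the module's function list, and the lambda plays the role of `def binary> 10 (LHS RHS) RHS < LHS;`.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>

// Hypothetical stand-in for the module's function table: name -> callable.
using BinaryFn = std::function<double(double, double)>;
static std::map<std::string, BinaryFn> FunctionTable;

// Build the function name used for lookup, e.g. '>' becomes "binary>".
std::string BinaryOperatorName(char Op) {
  return std::string("binary") + Op;
}

// Register the equivalent of `def binary> 10 (LHS RHS) RHS < LHS;`.
void DefineGreaterThan() {
  FunctionTable[BinaryOperatorName('>')] = [](double LHS, double RHS) {
    return RHS < LHS ? 1.0 : 0.0;
  };
}

// Evaluate `LHS Op RHS` for a user-defined operator by calling its function.
double CallUserBinaryOperator(char Op, double LHS, double RHS) {
  auto It = FunctionTable.find(BinaryOperatorName(Op));
  assert(It != FunctionTable.end() && "binary operator not found!");
  return It->second(LHS, RHS);
}
```

After `DefineGreaterThan()` has run, evaluating `3 > 2` amounts to looking up `"binary>"` and calling it with the two operands.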
So the first step is to parse the operator definition.
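Before the parser can see `tok_unary` and `tok_binary`, the lexer has to produce them for the `unary` and `binary` keywords. The lexer is not shown in this chapter, so the following is only a sketch: `ClassifyWord` is a hypothetical helper, and the numeric token values are assumptions (all that matters is that they are negative and distinct, so they never collide with ASCII characters).

```cpp
#include <cassert>
#include <string>

// Token values are negative so they cannot be confused with ASCII characters.
// The exact numbers are assumptions made for this sketch.
enum Token {
  tok_identifier = -4,
  tok_binary = -11,
  tok_unary = -12,
};

// Classify a word the lexer has already scanned: either one of the two
// operator keywords or a plain identifier.
int ClassifyWord(const std::string &Word) {
  assert(!Word.empty() && "the lexer never produces an empty word");
  if (Word == "binary")
    return tok_binary;
  if (Word == "unary")
    return tok_unary;
  return tok_identifier;
}
```

The operator character itself (`!`, `>`, `|`, ...) needs no dedicated token: it reaches the parser as its own ASCII value, which is why the parser checks it with `isascii`.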
Parser
// ::= binary LETTER number? (id, id)
std::unique_ptr<PrototypeAST> ParsePrototype() {
  std::string FnName;
  unsigned Kind = 0; // 0 = identifier, 1 = unary, 2 = binary.
  unsigned BinaryPrecedence = 30;
  switch (CurTok) {
  default:
    return LogErrorP("Expected function name in prototype");
  case tok_identifier:
    FnName = IdentifierStr;
    Kind = 0;
    getNextToken();
    break;
  case tok_unary: // User-defined operator with a single operand.
    getNextToken(); // eat unary.
    if (!isascii(CurTok))
      return LogErrorP("Expected unary operator");
    FnName = "unary";
    FnName += (char)CurTok; // Append the operator character.
    Kind = 1;
    getNextToken();
    break;
  case tok_binary: // User-defined operator with two operands.
    getNextToken(); // eat binary.
    if (!isascii(CurTok))
      return LogErrorP("Expected binary operator");
    FnName = "binary";
    FnName += (char)CurTok; // Append the operator character.
    Kind = 2;
    getNextToken();
    // A binary operator may declare a precedence; read it if present.
    if (CurTok == tok_number) {
      if (NumVal < 1 || NumVal > 100)
        return LogErrorP("Invalid precedence: must be 1..100");
      BinaryPrecedence = (unsigned)NumVal;
      getNextToken();
    }
    break;
  }
  // From here on, parsing is the same as for an ordinary function prototype.
  if (CurTok != '(')
    return LogErrorP("Expected '(' in prototype");
  std::vector<std::string> ArgNames;
  while (getNextToken() == tok_identifier)
    ArgNames.push_back(IdentifierStr);
  if (CurTok != ')')
    return LogErrorP("Expected ')' in prototype");
  // success.
  getNextToken(); // eat ')'.
  // Verify right number of names for operator.
  if (Kind && ArgNames.size() != Kind)
    return LogErrorP("Invalid number of operands for operator");
  return std::make_unique<PrototypeAST>(
      FnName,
      std::move(ArgNames),
      Kind != 0,
      BinaryPrecedence);
}
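The `std::make_unique<PrototypeAST>(...)` call above passes two extra arguments, an "is operator" flag and a precedence, so `PrototypeAST` has to store them. The class is not shown in this chapter. The sketch below is one plausible shape for it, modeled on the LLVM Kaleidoscope tutorial; the member and accessor names are assumptions rather than the author's exact code.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Sketch of a prototype node that also records operator metadata.
class PrototypeAST {
  std::string Name;
  std::vector<std::string> Args;
  bool IsOperator;
  unsigned Precedence; // Only meaningful for binary operators.

public:
  PrototypeAST(std::string Name, std::vector<std::string> Args,
               bool IsOperator = false, unsigned Prec = 0)
      : Name(std::move(Name)), Args(std::move(Args)),
        IsOperator(IsOperator), Precedence(Prec) {}

  const std::string &getName() const { return Name; }

  // The operand count tells unary and binary operators apart.
  bool isUnaryOp() const { return IsOperator && Args.size() == 1; }
  bool isBinaryOp() const { return IsOperator && Args.size() == 2; }

  // The operator character is the last character of "unary!" or "binary>".
  char getOperatorName() const {
    assert(isUnaryOp() || isBinaryOp());
    return Name[Name.size() - 1];
  }

  unsigned getBinaryPrecedence() const { return Precedence; }
};
```

For `def binary| 5 (LHS RHS)`, the parser would construct the node as `PrototypeAST("binary|", {"LHS", "RHS"}, true, 5)`, and `getOperatorName()` would then return `'|'`.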
CodeGen
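In the code generation for binary expressions, we also need to emit a call to the user-defined operator whenever the operator is not one of the built-in ones.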
llvm::Value* BinaryExprAST::codegen() {
  llvm::Value* L = LHS->codegen();
  llvm::Value* R = RHS->codegen();
  if (!L || !R)
    return nullptr;
  switch (Op) {
  case '+':
    return Builder->CreateFAdd(L, R, "addtmp");
  case '-':
    return Builder->CreateFSub(L, R, "subtmp");
  case '*':
    return Builder->CreateFMul(L, R, "multmp");
  case '<':
    L = Builder->CreateFCmpULT(L, R, "cmptmp");
    // Convert bool 0/1 to double 0.0 or 1.0
    return Builder->CreateUIToFP(L, llvm::Type::getDoubleTy(*TheContext), "booltmp");
  default:
    break;
  }
  // Not a built-in operator, so emit a call to the user-defined operator function.
  llvm::Function* F = getFunction(std::string("binary") + Op);
  assert(F && "binary operator not found!");
  llvm::Value* Ops[2] = {L, R};
  return Builder->CreateCall(F, Ops, "binop");
}
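For the expression parser to treat a newly defined character as a binary operator at all, its precedence has to be recorded once the definition has been processed. The sketch below shows that bookkeeping in isolation. `RegisterBinaryOperator` is a hypothetical helper, and the built-in precedence values other than `<` = 10 are assumptions. In the LLVM Kaleidoscope tutorial the equivalent assignment happens inside `FunctionAST::codegen`.

```cpp
#include <cassert>
#include <map>

// Precedence table consulted by the expression parser. Higher binds tighter.
// '<' = 10 matches the example above; the other values are assumed.
static std::map<char, int> BinopPrecedence = {
    {'<', 10}, {'+', 20}, {'-', 20}, {'*', 40}};

// Record a user-defined binary operator so the parser will recognize it.
void RegisterBinaryOperator(char Op, unsigned Precedence) {
  assert(Precedence >= 1 && Precedence <= 100);
  BinopPrecedence[Op] = static_cast<int>(Precedence);
}

// Return the precedence of a token, or -1 if it is not a binary operator.
int GetTokPrecedence(int Tok) {
  if (Tok < 0 || Tok > 127)
    return -1; // Not an ASCII character, so not an operator.
  auto It = BinopPrecedence.find(static_cast<char>(Tok));
  if (It == BinopPrecedence.end() || It->second <= 0)
    return -1;
  return It->second;
}
```

Until `RegisterBinaryOperator('|', 5)` has run, `GetTokPrecedence('|')` reports -1 and the parser stops at `|` instead of treating it as an operator.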
Unary operators are handled in much the same way, so the code is not shown here.
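One part of the unary case that has no binary counterpart is the parsing rule `unary ::= primary | OP unary`, which allows prefix operators to stack, as in `!-x`. The following is a deliberately simplified sketch of that rule. It is an illustration under assumptions rather than the author's parser: tokens are single characters, a primary expression is a single letter or digit, and the function returns a string naming the calls that would be generated instead of building AST nodes.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Parse a unary expression from single-character tokens in Src, starting at
// Pos, and return a parenthesized rendering of the resulting calls.
//   unary ::= primary | OP unary
std::string ParseUnary(const std::string &Src, std::size_t &Pos) {
  assert(Pos < Src.size() && "unexpected end of input");
  char Tok = Src[Pos];
  // A letter or digit is a primary expression (one character in this sketch).
  if (std::isalnum(static_cast<unsigned char>(Tok))) {
    ++Pos;
    return std::string(1, Tok);
  }
  // Anything else is treated as a unary operator applied to what follows.
  ++Pos; // eat the operator character.
  return std::string("unary") + Tok + "(" + ParseUnary(Src, Pos) + ")";
}
```

Parsing `!-x` this way yields `unary!(unary-(x))`: each prefix operator becomes a call to the function named `"unary"` plus the operator character, mirroring the `"binary"` lookup used in `BinaryExprAST::codegen`.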