Introduction to LLVM
From: https://www.jianshu.com/p/1367dad95445
What is LLVM?
The LLVM project is a collection of modular, reusable compiler and toolchain technologies.
The Association for Computing Machinery (ACM) gave LLVM its 2012 Software System Award. Earlier recipients include Java, Apache, Mosaic, the World Wide Web, Smalltalk, UNIX, and Eclipse. LLVM's founder, Chris Lattner, is also the father of Swift.
An interesting fact: Chris Lattner originally set out to write a low-level virtual machine, which is where the name LLVM (Low Level Virtual Machine) comes from, by analogy with Java's JVM. LLVM was never actually used as a virtual machine, however. By then the name was already well known, so people kept calling it LLVM; today the name works more like a trademark and has nothing to do with virtual machines. The official description is as follows:
The name “LLVM” itself is not an acronym; it is the full name of the project.
Compiler architecture
Traditional Compiler Architecture
- Frontend: lexical analysis, syntax analysis, semantic analysis, intermediate code generation
- Optimizer: intermediate code optimization
- Backend: machine code generation
LLVM Architecture
- Different front ends and back ends share a unified intermediate code: the LLVM Intermediate Representation (LLVM IR)
- To support a new programming language, you only need to implement a new front end
- To support a new hardware device, you only need to implement a new back end
- The optimization phase is generic: it operates on the unified LLVM IR, so it needs no changes whether a new programming language or a new hardware device is being supported
- In contrast, GCC's front end and back end are not cleanly separated but coupled together, which makes it very difficult for GCC to support a new language or a new target platform
- LLVM is now used as a common infrastructure for implementing a variety of statically compiled and runtime-compiled languages (the GCC family, Java, .NET, Python, Ruby, Scheme, Haskell, D, etc.)
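The shared-IR idea in the list above can be sketched in a few lines of Python. This is a toy model and not LLVM's API: the two "front ends", the tuple-based IR, the optimizer, and the two "back ends" are all hypothetical names invented for illustration. What it tries to show is that any front end can be paired with any back end, because both sides only agree on the IR.

```python
# Toy model of a shared-IR compiler design. Every name here is hypothetical;
# none of it is LLVM's real API.

def frontend_infix(source):
    """Front end 1: turn text like '2 + 3' into the toy IR."""
    left, op, right = source.split()
    return [("push", int(left)), ("push", int(right)), ("op", op)]

def frontend_postfix(source):
    """Front end 2: turn text like '2 3 +' into the same toy IR."""
    left, right, op = source.split()
    return [("push", int(left)), ("push", int(right)), ("op", op)]

def optimize(ir):
    """Shared optimizer: fold 'push x, push y, op' into a single push."""
    if len(ir) == 3 and ir[0][0] == ir[1][0] == "push" and ir[2][0] == "op":
        x, y, op = ir[0][1], ir[1][1], ir[2][1]
        folded = {"+": x + y, "-": x - y, "*": x * y}[op]
        return [("push", folded)]
    return ir

def backend_stack_machine(ir):
    """Back end 1: emit text for an imaginary stack machine."""
    return [f"{kind.upper()} {arg}" for kind, arg in ir]

def backend_register_machine(ir):
    """Back end 2: emit text for an imaginary register machine."""
    names = {"+": "add", "-": "sub", "*": "mul"}
    lines, next_reg = [], 0
    for kind, arg in ir:
        if kind == "push":
            lines.append(f"mov r{next_reg}, {arg}")
            next_reg += 1
        else:
            # Combine the two most recent registers into the older one.
            next_reg -= 1
            lines.append(f"{names[arg]} r{next_reg - 1}, r{next_reg}")
    return lines

def compile_source(source, frontend, backend):
    """Any front end can be combined with any back end through the IR."""
    return backend(optimize(frontend(source)))
```

In this model, adding a third source syntax means writing one more `frontend_*` function, and adding a third target means writing one more `backend_*` function; `optimize` stays untouched in both cases.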
What is Clang
A subproject of the LLVM project, a C/C++/Objective-C compiler front-end based on the LLVM architecture.
*Compared to GCC, Clang has the following advantages*
- Fast compilation speed: on some platforms Clang compiles significantly faster than GCC (compiling OC in Debug mode is 3 times faster than with GCC)
- Small memory usage: the memory occupied by the AST generated by Clang is about one-fifth of that of GCC
- Modular design: Clang adopts library-based modular design, which is easy for IDE integration and reuse for other purposes
- Highly readable diagnostics: during compilation, Clang creates and retains a large amount of detailed metadata, which helps with debugging and error reporting
- The design is clear and simple, easy to understand, and easy to extend and enhance
In the overall LLVM architecture, Clang serves as the front end. In the broad sense, LLVM refers to the entire LLVM architecture; in the narrow sense, it usually refers to the LLVM back end (code optimization and target code generation).
Source code (C/C++) -> Clang -> intermediate code (which goes through a series of optimizations, each implemented as a Pass) -> machine code
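The notion that optimization is a series of passes over the intermediate code can be illustrated with a small sketch. The instruction format `(dest, op, lhs, rhs)` and the two pass functions below are hypothetical and far simpler than LLVM's real pass infrastructure; they only show the shape of the idea, where each pass takes intermediate code in and hands intermediate code on to the next pass.

```python
# Toy pass pipeline. Instructions are (dest, op, lhs, rhs) tuples; operands are
# either ints or the names of earlier results. The format is made up here.

def constant_fold(instrs):
    """Replace an operation on two integer constants with a 'const'."""
    ops = {"add": lambda x, y: x + y, "sub": lambda x, y: x - y}
    out = []
    for dest, op, lhs, rhs in instrs:
        if op in ops and isinstance(lhs, int) and isinstance(rhs, int):
            out.append((dest, "const", ops[op](lhs, rhs), None))
        else:
            out.append((dest, op, lhs, rhs))
    return out

def dead_code_elimination(instrs, live_out):
    """Drop instructions whose result is never used."""
    live = set(live_out)
    kept = []
    # Walk backwards so that uses are seen before definitions.
    for dest, op, lhs, rhs in reversed(instrs):
        if dest in live:
            kept.append((dest, op, lhs, rhs))
            live.update(x for x in (lhs, rhs) if isinstance(x, str))
    kept.reverse()
    return kept

def run_passes(instrs, passes):
    """Feed the output of each pass into the next one."""
    for run_pass in passes:
        instrs = run_pass(instrs)
    return instrs

program = [
    ("t0", "add", 1, 2),        # both operands constant: foldable
    ("t1", "sub", "a", 3),      # depends on the input 'a'
    ("t2", "add", "t1", "t0"),  # the value we care about
    ("t3", "add", 4, 5),        # never used
]
passes = [constant_fold, lambda ir: dead_code_elimination(ir, live_out={"t2"})]
result = run_passes(program, passes)
```

After both passes, `t0` has become a constant and the unused `t3` has disappeared, while `t1` and `t2` are left alone.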
Compilation process of OC source files
Use Xcode to create a Test project, then cd to the directory that contains main.m.
View the compilation phases from the command line:
$ clang -ccc-print-phases main.m
0: input, "main.m", objective-c
1: preprocessor, {0}, objective-c-cpp-output
2: compiler, {1}, ir
3: backend, {2}, assembler
4: assembler, {3}, object
5: linker, {4}, image
6: bind-arch, "x86_64", {5}, image
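- input: locate the main.m source file
- preprocessor: process #include, #import, and macro definitions
- compiler: compile the preprocessed source into IR (intermediate code)
- backend: generate assembly code from the IR
- assembler: assemble that code into an object file
- linker: link in the required dynamic and static libraries to produce an image
- bind-arch: bind the image to a specific architecture (x86_64 here)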
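Each line of the phase listing names a phase, the earlier phase it consumes (in braces), and the kind of output it produces. A small parser makes that chain explicit. This is only a sketch: the regular expression is fitted to the sample output shown above and is an assumption about its layout, not a documented clang interface, and `parse_phases` is a hypothetical helper name.

```python
import re

# The sample output from `clang -ccc-print-phases main.m` shown above.
SAMPLE_OUTPUT = '''\
0: input, "main.m", objective-c
1: preprocessor, {0}, objective-c-cpp-output
2: compiler, {1}, ir
3: backend, {2}, assembler
4: assembler, {3}, object
5: linker, {4}, image
6: bind-arch, "x86_64", {5}, image
'''

# index: name, <arguments>, output-type
PHASE_LINE = re.compile(r'^(\d+): ([\w-]+), (.*), ([\w-]+)$')

def parse_phases(text):
    """Turn the phase listing into a list of dictionaries."""
    phases = []
    for line in text.splitlines():
        match = PHASE_LINE.match(line.strip())
        if match is None:
            continue  # skip anything that does not look like a phase line
        index, name, arguments, output = match.groups()
        # Numbers in braces refer to the phases whose output is consumed.
        inputs = [int(n) for n in re.findall(r'\{(\d+)\}', arguments)]
        phases.append({"index": int(index), "name": name,
                       "inputs": inputs, "output": output})
    return phases

phases = parse_phases(SAMPLE_OUTPUT)
chain = " -> ".join(phase["name"] for phase in phases)
```

Here `chain` reads `input -> preprocessor -> compiler -> backend -> assembler -> linker -> bind-arch`, which is the same order as the list above.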
View the preprocessor output: $ clang -E main.m
Running this command prints a large amount of output to the terminal, roughly as follows:
# 1 "main.m"
# 1 "<built-in>" 1
# 1 "<built-in>" 3
# 353 "<built-in>" 3
# 1 "<command line>" 1
# 1 "<built-in>" 2
# 1 "main.m" 2
.
.
.
int main(int argc, const char * argv[]) {
    @autoreleasepool {
        NSLog(@"Hello, World!");
    }
    return 0;
}
Lexical Analysis
Lexical analysis generates tokens: $ clang -fmodules -E -Xclang -dump-tokens main.m
It divides the code into small units (tokens).
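To make "dividing code into tokens" concrete, here is a rough Python sketch of a tokenizer for a tiny C-like subset. It is not how Clang's lexer is implemented: the token kinds only loosely imitate the names Clang prints, and the patterns and keyword set are simplifications chosen for this illustration.

```python
import re

# Token kinds and patterns for a tiny C-like subset (an assumption of this
# sketch, not Clang's actual token definitions).
TOKEN_SPEC = [
    ("numeric_constant", r"\d+"),
    ("identifier", r"[A-Za-z_]\w*"),
    ("punctuator", r"[(){},;=+\-*/]"),
    ("whitespace", r"\s+"),
]
KEYWORDS = {"void", "int", "return"}
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))

def tokenize(source):
    """Split source text into (kind, text) pairs, skipping whitespace."""
    tokens = []
    pos = 0
    while pos < len(source):
        match = MASTER.match(source, pos)
        if match is None:
            raise ValueError(f"unexpected character {source[pos]!r} at offset {pos}")
        kind, text = match.lastgroup, match.group()
        pos = match.end()
        if kind == "whitespace":
            continue
        if kind == "identifier" and text in KEYWORDS:
            kind = "keyword"
        tokens.append((kind, text))
    return tokens

tokens = tokenize("int c = a + b - 3;")
```

The statement comes back as nine tokens, starting with `("keyword", "int")` and ending with `("punctuator", ";")`. Clang's own lexer reports more detail, such as the source location of each token.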
For example:
void test(int a, int b) {
    int c = a + b - 3;
}
void 'void' [StartOfLine] Loc=<main.m:18:1>
identifier 'test' [LeadingSpace] Loc=<main.m:18:6>
l_paren '(' Loc=<main.m:18:10>
int 'int' Loc=<main.m:18:11>
identifier 'a' [LeadingSpace] Loc=<main.m:18:15>
comma ',' Loc=<main.m:18:16>
int 'int' [LeadingSpace] Loc=<main.m:18:18>
identifier 'b' [LeadingSpace] Loc=<main.m:18:22>
r_paren ')' Loc=<main.m:18:23>
l_brace '{' Loc=<main.m:18:24>
int 'int' [StartOfLine] [LeadingSpace] Loc=<main.m:19:5>
identifier 'c' [LeadingSpace] Loc=<main.m:19:9>
equal '=' [LeadingSpace] Loc=<main.m:19:11>
identifier 'a' [LeadingSpace] Loc=<main.m:19:13>
plus '+' [LeadingSpace] Loc=<main.m:19:15>
identifier 'b' [...]
[...] i32, i32* %4, align 4
%8 = add nsw i32 %6, %7
%9 = sub nsw i32 %8, 3
store i32 %9, i32* %5, align 4
ret void
}
IR Basic Grammar
Comments start with a semicolon (;)
Global identifiers start with @; local identifiers start with %
alloca: allocates memory in the current function's stack frame
i32: a 32-bit (4-byte) integer
align: memory alignment
store: writes data to memory
load: reads data from memory
Official language reference: https://llvm.org/docs/LangRef.html
Application and Practice
Development here builds on the LLVM source code, so we first need to download and compile it.
Source code download
# Download LLVM
$ git clone https://git.llvm.org/git/llvm.git/
# Download clang
$ cd llvm/tools
$ git clone https://git.llvm.org/git/clang.git/
# Remarks: clang is a subproject of llvm, but its source code lives in a separate repository, so we need to place clang in the llvm/tools directory.
Source code compilation
The clang we have been running in the terminal so far is the default clang compiler bundled with Xcode. To do our own LLVM development, we need to build our own clang compiler.
# First install cmake and ninja (install brew first, https://brew.sh/)
$ brew install cmake
$ brew install ninja
# If the installation of ninja fails, you can directly get the release version from github and put it in [/usr/local/bin]
# https://github.com/ninja-build/ninja/releases
# Create a new [llvm_build] directory in the same level directory as the LLVM source code (finally generate [build.
$ cd llvm_build
$ cmake -G Ninja ../llvm -DCMAKE_INSTALL_PREFIX=<LLVM installation path>
# Remarks: if build.ninja is generated, the configuration step succeeded. -DCMAKE_INSTALL_PREFIX sets the path the built files are installed to, and -D defines a CMake parameter.
# For more cmake related options, please refer to: https://llvm.org/docs/CMake.html
Next, execute the compilation and installation instructions in sequence
$ ninja
# After compiling, the [llvm_build] directory is about 21.05 G (this is really big)
$ ninja install
At this point the build is complete.
Another way is to compile through Xcode, generate an Xcode project and then compile, but the speed is very slow (may take more than 1 hour).
# The method is as follows:
# Create a new [llvm_xcode] directory under the same level directory of llvm
$ cd llvm_xcode
$ cmake -G Xcode ../llvm
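For repeated builds, the Ninja-based configure, build, and install sequence above can be written down as data and printed or executed from a script. The sketch below only assembles the commands and does not run anything. The helper names and the example paths (`llvm_build`, `/opt/llvm`) are assumptions made for illustration; the command-line options are the ones used above.

```python
import shlex

def configure_command(source_dir, install_prefix, generator="Ninja"):
    """Build the cmake invocation used in the configuration step."""
    return ["cmake", "-G", generator, source_dir,
            f"-DCMAKE_INSTALL_PREFIX={install_prefix}"]

def build_plan(source_dir, build_dir, install_prefix):
    """Return (working directory, command) pairs in the order they run."""
    return [
        (build_dir, configure_command(source_dir, install_prefix)),
        (build_dir, ["ninja"]),
        (build_dir, ["ninja", "install"]),
    ]

def describe(plan):
    """Render the plan as shell-style lines, e.g. for a dry run."""
    return [f"(cd {shlex.quote(cwd)} && {shlex.join(cmd)})" for cwd, cmd in plan]

# Hypothetical paths: adjust them to your own checkout and install location.
plan = build_plan("../llvm", "llvm_build", "/opt/llvm")
lines = describe(plan)
# To actually run a step: subprocess.run(cmd, cwd=cwd, check=True)
```

Keeping the steps as lists rather than one shell string avoids quoting problems if a path contains spaces, and makes it easy to swap the generator for Xcode by passing `generator="Xcode"` to `configure_command`.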
Application and Practice Reference
- libclang, libTooling
Official reference: https://clang.llvm.org/docs/Tooling.html
Application: syntax tree analysis, language conversion, etc.
- Clang Plugin Development
Official references:
1. https://clang.llvm.org/docs/ClangPlugins.html
2. https://clang.llvm.org/docs/ExternalClangExamples.html
3. https://clang.llvm.org/docs/RAVFrontendAction.html
Application: code inspection (naming conventions, coding standards), etc.
- Pass development
Official reference: https://llvm.org/docs/WritingAnLLVMPass.html
Application: code optimization, code obfuscation, etc.
- Develop a new programming language
1. https://llvm-tutorial-cn.readthedocs.io/en/latest/index.html
2. https://kaleidoscope-llvm-tutorial-zh-cn.readthedocs.io/zh_CN/latest/
Reference:
https://juejin.im/post/5bfba01df265da614273939a