
In-depth explanation of LLVM_Adenialzz’s blog


Introduction to LLVM

From: https://www.jianshu.com/p/1367dad95445

What is LLVM?

The LLVM project is a collection of modular, reusable compiler and toolchain technologies.

The Association for Computing Machinery (ACM) gave LLVM its 2012 Software System Award. Earlier recipients include Java, Apache, Mosaic, the World Wide Web, Smalltalk, UNIX, and Eclipse. LLVM's founder, Chris Lattner, is also the father of Swift.

An interesting fact: Chris Lattner originally set out to write a low-level virtual machine, which is where the name LLVM comes from, by analogy with Java's JVM. LLVM has never really been used as a virtual machine, but by then the name was already well known, so people kept it. Today "LLVM" works more like a trademark and has nothing to do with virtual machines. The official description is as follows:
The name “LLVM” itself is not an acronym; it is the full name of the project.

Compiler architecture

Traditional Compiler Architecture


  • Frontend
    Lexical analysis, syntax analysis, semantic analysis, intermediate code generation
  • Optimizer
    Intermediate code optimization
  • Backend
    Machine code generation

LLVM Architecture


  • Different front ends and back ends share one intermediate code, the LLVM Intermediate Representation (LLVM IR)

  • To support a new programming language, you only need to implement a new front end

  • To support a new hardware device, you only need to implement a new back end

  • The optimization phase is generic: it works only on LLVM IR, so supporting a new programming language or a new hardware device does not require changing it

  • By contrast, GCC's front end and back end are tightly coupled, which makes it very difficult for GCC to support a new language or a new target platform

  • LLVM is now used as common infrastructure for implementing a wide range of statically and runtime compiled languages (the GCC family of languages, Java, .NET, Python, Ruby, Scheme, Haskell, D, etc.)
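
The decoupling described in the bullets above can be sketched in a few lines of C. This is only a toy model, not LLVM's real IR or API: the ToyInst struct, the two "languages", and the two output formats are all hypothetical. What it shows is that either front end can be paired with either back end because both sides only know about the shared intermediate form.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical miniature IR: dst = lhs (+|-) rhs. Not LLVM's real IR. */
typedef struct {
    char op;                      /* '+' or '-' */
    char dst[8], lhs[8], rhs[8];  /* operand names, at most 7 characters */
} ToyInst;

/* Front end 1: toy infix language, e.g. "c = a + b". Returns 1 on success. */
int frontend_infix(const char *src, ToyInst *out) {
    assert(src && out);
    if (sscanf(src, "%7s = %7s %c %7s",
               out->dst, out->lhs, &out->op, out->rhs) != 4)
        return 0;
    return out->op == '+' || out->op == '-';
}

/* Front end 2: toy prefix language, e.g. "add c a b". Returns 1 on success. */
int frontend_prefix(const char *src, ToyInst *out) {
    char name[4];
    assert(src && out);
    if (sscanf(src, "%3s %7s %7s %7s",
               name, out->dst, out->lhs, out->rhs) != 4)
        return 0;
    if (strcmp(name, "add") == 0) out->op = '+';
    else if (strcmp(name, "sub") == 0) out->op = '-';
    else return 0;
    return 1;
}

/* Back end 1: three-operand, register-machine style text. */
void backend_register(const ToyInst *in, char *buf, size_t cap) {
    assert(in && buf);
    snprintf(buf, cap, "%s %s, %s, %s",
             in->op == '+' ? "add" : "sub", in->dst, in->lhs, in->rhs);
}

/* Back end 2: stack-machine style text. */
void backend_stack(const ToyInst *in, char *buf, size_t cap) {
    assert(in && buf);
    snprintf(buf, cap, "push %s; push %s; %s; pop %s",
             in->lhs, in->rhs, in->op == '+' ? "add" : "sub", in->dst);
}
```

Feeding "c = a + b" to frontend_infix and "add c a b" to frontend_prefix yields the same ToyInst, so both back ends produce identical output for the two inputs. Adding a third language or a third target means writing one more function, without touching the existing ones.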

What is Clang

Clang is a subproject of the LLVM project: a C/C++/Objective-C compiler front end based on the LLVM architecture.

Compared to GCC, Clang has the following advantages:

  • Fast compilation: on some platforms Clang compiles significantly faster than GCC (compiling Objective-C in Debug mode is 3 times faster than with GCC)
  • Low memory usage: the AST generated by Clang takes about one fifth of the memory used by GCC's
  • Modular design: Clang uses a library-based modular design, which makes it easy to integrate into IDEs and to reuse for other purposes
  • Readable diagnostics: during compilation Clang creates and retains a large amount of detailed metadata, which helps with debugging and error reporting
  • Clear, simple design that is easy to understand, extend, and enhance


In the overall LLVM architecture, Clang serves as the front end. In the broad sense, LLVM refers to the entire LLVM architecture; in the narrow sense, it usually refers to the LLVM back end (including code optimization and object code generation).

Source code (C/C++) -> Clang -> intermediate code (run through a series of optimizations, each implemented as a Pass) -> machine code
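
The idea that optimization is a sequence of passes over the intermediate code can be illustrated with a minimal C sketch. Everything here is an assumption made for illustration: the Inst layout, the two pass functions, and run_passes are hypothetical and are not LLVM's Pass API. Each pass takes the same instruction array, rewrites it in place, and reports how many instructions it changed.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy IR. OP_CONST holds its value in a; OP_ADD and OP_SUB
 * hold in a and b the indexes of two EARLIER instructions. The last
 * instruction is treated as the result of the whole sequence. */
typedef enum { OP_NOP, OP_CONST, OP_ADD, OP_SUB } Op;
typedef struct { Op op; int a, b; } Inst;

/* A pass rewrites the code in place and returns the number of changes. */
typedef int (*Pass)(Inst *code, int n);

/* Pass 1: constant folding. Replaces add/sub of two constants by a constant. */
int fold_constants(Inst *code, int n) {
    int changed = 0;
    for (int i = 0; i < n; i++) {
        Inst *in = &code[i];
        if (in->op != OP_ADD && in->op != OP_SUB) continue;
        const Inst *l = &code[in->a], *r = &code[in->b];
        if (l->op == OP_CONST && r->op == OP_CONST) {
            int v = in->op == OP_ADD ? l->a + r->a : l->a - r->a;
            in->op = OP_CONST;
            in->a = v;
            in->b = 0;
            changed++;
        }
    }
    return changed;
}

/* Pass 2: dead code elimination. Turns instructions that the result does
 * not depend on into OP_NOP. */
int eliminate_dead(Inst *code, int n) {
    int live[64] = {0}, changed = 0;
    assert(n > 0 && n <= 64);
    live[n - 1] = 1;
    for (int i = n - 1; i >= 0; i--) {
        if (!live[i]) continue;
        if (code[i].op == OP_ADD || code[i].op == OP_SUB) {
            live[code[i].a] = 1;
            live[code[i].b] = 1;
        }
    }
    for (int i = 0; i < n; i++) {
        if (!live[i] && code[i].op != OP_NOP) {
            code[i].op = OP_NOP;
            changed++;
        }
    }
    return changed;
}

/* Runs each pass once, in order, and returns the total number of changes. */
int run_passes(const Pass *passes, size_t count, Inst *code, int n) {
    int total = 0;
    for (size_t i = 0; i < count; i++)
        total += passes[i](code, n);
    return total;
}
```

Running the pipeline { fold_constants, eliminate_dead } over the five instructions for (1 + 2) - 3 leaves a single constant 0 as the result and turns the other four instructions into no-ops. Because every pass has the same signature, passes can be added, removed, or reordered without changing the others.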

Compilation process of Objective-C source files

Use Xcode to create a Test project, then cd to the directory that contains main.m.
View the compilation phases from the command line:

$ clang -ccc-print-phases main.m

 0: input, "main.m", objective-c
 1: preprocessor, {0}, objective-c-cpp-output
 2: compiler, {1}, ir
 3: backend, {2}, assembler
 4: assembler, {3}, object
 5: linker, {4}, image
 6: bind-arch, "x86_64", {5}, image
 
  1. Input: find the main.m file
  2. Preprocessor: process include, import, and macro definitions
  3. Compiler: compile into IR intermediate code
  4. Backend: generate assembly code from the IR
  5. Assembler: assemble into an object file
  6. Linker: link other dynamic and static libraries into an image
  7. Bind-arch: bind the image to a specific architecture (x86_64 here)

View the preprocessor output: $ clang -E main.m
This command prints a lot of information to the terminal, roughly as follows:

# 1 "main.m"
 # 1 "<built-in>" 1
 # 1 "<built-in>" 3
 # 353 "<built-in>" 3
 # 1 "<command line>" 1
 # 1 "<built-in>" 2
 # 1 "main.m" 2
 .
 .
 .
 int main(int argc, const char * argv[]) {
     @autoreleasepool {
         NSLog(@"Hello, World!");
     }
     return 0;
 }
 

Lexical Analysis

Lexical analysis generates tokens: $ clang -fmodules -E -Xclang -dump-tokens main.m
It splits the code into small units (tokens).
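
To make "splitting the code into small units" concrete, here is a minimal tokenizer sketch in C. It is far simpler than Clang's lexer and is not based on it: the Token type, the three token kinds, and the tokenize function are hypothetical, keywords such as int are treated as plain identifiers, and every punctuation character becomes its own token.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

typedef enum { TOK_IDENT, TOK_NUMBER, TOK_PUNCT } TokKind;
typedef struct { TokKind kind; char text[32]; } Token;

/* Splits src into tokens, skipping whitespace. Writes at most max tokens
 * to out and returns how many were written. */
size_t tokenize(const char *src, Token *out, size_t max) {
    size_t n = 0;
    const char *p = src;
    assert(src && out);
    while (*p && n < max) {
        if (isspace((unsigned char)*p)) { p++; continue; }
        const char *start = p;
        TokKind kind;
        if (isalpha((unsigned char)*p) || *p == '_') {
            kind = TOK_IDENT;                 /* letters, digits, underscores */
            while (isalnum((unsigned char)*p) || *p == '_') p++;
        } else if (isdigit((unsigned char)*p)) {
            kind = TOK_NUMBER;                /* a run of decimal digits */
            while (isdigit((unsigned char)*p)) p++;
        } else {
            kind = TOK_PUNCT;                 /* any other single character */
            p++;
        }
        size_t len = (size_t)(p - start);
        if (len >= sizeof out[n].text) len = sizeof out[n].text - 1;
        memcpy(out[n].text, start, len);      /* copy, truncating long tokens */
        out[n].text[len] = '\0';
        out[n].kind = kind;
        n++;
    }
    return n;
}
```

For the input "int c = a + b - 3;" this sketch produces nine tokens: int, c, =, a, +, b, -, 3, and the semicolon. Clang's real token dump additionally records the token's exact kind, flags such as StartOfLine and LeadingSpace, and the source location.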

For example:

void test(int a, int b){
     int c = a + b - 3;
 }
 
void 'void' [StartOfLine] Loc=<main.m:18:1>
 identifier 'test' [LeadingSpace] Loc=<main.m:18:6>
 l_paren '(' Loc=<main.m:18:10>
 int 'int' Loc=<main.m:18:11>
 identifier 'a' [LeadingSpace] Loc=<main.m:18:15>
 comma ',' Loc=<main.m:18:16>
 int 'int' [LeadingSpace] Loc=<main.m:18:18>
 identifier 'b' [LeadingSpace] Loc=<main.m:18:22>
 r_paren ')' Loc=<main.m:18:23>
 l_brace '{' Loc=<main.m:18:24>
 int 'int' [StartOfLine] [LeadingSpace] Loc=<main.m:19:5>
 identifier 'c' [LeadingSpace] Loc=<main.m:19:9>
 equal '=' [LeadingSpace] Loc=<main.m:19:11>
 identifier 'a' [LeadingSpace] Loc=<main.m:19:13>
 plus '+' [LeadingSpace] Loc=<main.m:19:15>
 identifier 'b' ...

The LLVM IR generated for test ends as follows:

 .
 .
 .
   %7 = load i32, i32* %4, align 4
   %8 = add nsw i32 %6, %7
   %9 = sub nsw i32 %8, 3
   store i32 %9, i32* %5, align 4
   ret void
 }
 

IR Basic Grammar

Comments start with a semicolon (;)
Global identifiers start with @; local identifiers start with %
alloca: allocates memory in the current function's stack frame
i32: a 32-bit (4-byte) integer
align: memory alignment
store: writes data
load: reads data
Official language reference: https://llvm.org/docs/LangRef.html
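
One way to read the instructions listed above is to write out in C what unoptimized IR for int c = a + b - 3; does step by step. This is only an illustrative sketch: the function name test_like_ir and the slot_* variables are hypothetical, and the function returns the result (the original test returns void) so that it can be checked.

```c
#include <assert.h>
#include <stdint.h>

/* Step-by-step C rendering of the alloca / store / load / add / sub pattern.
 * int32_t plays the role of i32. */
int32_t test_like_ir(int32_t a, int32_t b) {
    int32_t slot_a, slot_b, slot_c;   /* alloca: three i32 slots in this stack frame */

    slot_a = a;                       /* store: write each argument to its slot */
    slot_b = b;

    int32_t lhs = slot_a;             /* load: read the values back from the slots */
    int32_t rhs = slot_b;

    /* "nsw" (no signed wrap) means the IR assumes the operation does not
     * overflow; these asserts make that assumption explicit. */
    assert(!(rhs > 0 && lhs > INT32_MAX - rhs));
    assert(!(rhs < 0 && lhs < INT32_MIN - rhs));
    int32_t sum = lhs + rhs;          /* add nsw i32 */

    assert(sum >= INT32_MIN + 3);
    int32_t diff = sum - 3;           /* sub nsw i32 */

    slot_c = diff;                    /* store: write the result to c's slot */
    return slot_c;                    /* returned only so the result can be checked */
}
```

The extra loads and stores look redundant because they are: at this stage the compiler has not optimized anything, and removing such round trips through the stack is left to later optimization passes.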

Application and Practice

Developing with LLVM means working from its source code, so we first need to download and compile the source.

Source code download

# Download LLVM
 $ git clone https://git.llvm.org/git/llvm.git/

 # Download clang
 $ cd llvm/tools
 $ git clone https://git.llvm.org/git/clang.git/

 # Note: clang is a subproject of llvm, but its source code lives in a separate repository, so it needs to be cloned into the llvm/tools directory.
 

Source code compilation

The clang we have been running in the terminal is the one bundled with Xcode. To develop with LLVM ourselves, we need to build our own clang compiler.

# First install cmake and ninja (install brew first, https://brew.sh/)
 $ brew install cmake
 $ brew install ninja

 # If the installation of ninja fails, you can directly get the release version from github and put it in [/usr/local/bin]
 # https://github.com/ninja-build/ninja/releases

 # Create a new [llvm_build] directory in the same level directory as the LLVM source code (finally generate [build.

 $ cd llvm_build
 $ cmake -G Ninja ../llvm -DCMAKE_INSTALL_PREFIX=<LLVM installation path>

 # Note: if build.ninja is generated, the configuration step succeeded. -DCMAKE_INSTALL_PREFIX sets the directory where the built files are installed, and -D defines a CMake variable.

 # For more cmake related options, please refer to: https://llvm.org/docs/CMake.html
 

Next, run the build and install commands in order:

$ ninja
 # After the build, the [llvm_build] directory is about 21.05 GB (this is really big)
 $ ninja install
 

At this point the build is complete.

Alternatively, you can generate an Xcode project and build through Xcode, but this is very slow (it may take more than an hour).

# The method is as follows:
 # Create a new [llvm_xcode] directory alongside the llvm directory
 $ cd llvm_xcode
 $ cmake -G Xcode ../llvm
 

Application and Practice Reference

Reference:
https://juejin.im/post/5bfba01df265da614273939a


This article is from the internet and does not represent the position of 1024programmer. Please indicate the source when reprinting: https://www.1024programmer.com/in-depth-explanation-of-llvm_adenialzzs-blog/

author: admin
