Guide: Many architects want to understand and master high-performance server development. One way to do so is to read good source code, and nginx is an industry-renowned implementation of a high-performance web server. How can you read and understand nginx effectively? This article introduces core modules such as HTTP and answers common questions that come up when reading the nginx source code, to help readers better understand how the key nginx modules are implemented.
Chen Ke, with ten years of experience in the industry, has worked as a development engineer & architect in Zhejiang Telecom, Alibaba, Huawei, and Wuba Tongcheng. He is currently responsible for the back-end architecture and operation and maintenance of Helijia. Blog address: http://www.dumpcache.com/wiki/doku.php
How the HTTP module works
Eleven phase definitions of HTTP processing
After Nginx parses the request line and request headers, the request passes through a total of eleven phases, which are defined and introduced as follows.
typedef enum {
    NGX_HTTP_POST_READ_PHASE = 0,     // Read request content phase
    NGX_HTTP_SERVER_REWRITE_PHASE,    // Server request address rewriting phase
    NGX_HTTP_FIND_CONFIG_PHASE,       // Configuration lookup phase
    NGX_HTTP_REWRITE_PHASE,           // Location request address rewriting phase
    NGX_HTTP_POST_REWRITE_PHASE,      // Request address rewrite submission phase
    NGX_HTTP_PREACCESS_PHASE,         // Access permission check preparation phase
    NGX_HTTP_ACCESS_PHASE,            // Access permission check phase
    NGX_HTTP_POST_ACCESS_PHASE,       // Access permission check submission phase
    NGX_HTTP_TRY_FILES_PHASE,         // Configuration item try_files processing phase
    NGX_HTTP_CONTENT_PHASE,           // Content generation phase
    NGX_HTTP_LOG_PHASE                // Log module processing phase
} ngx_http_phases;
1. Read request content stage
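There is no default handler at this stage. It runs right after the request line and request headers have been read, so it suits work that must happen before any rewriting; for example, ngx_http_realip_module registers its handler here when it is compiled in.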
2. Server request address rewriting stage
This stage mainly handles the global (server block) rewrite rules.
3. Configuration search phase
This stage finds the matching location by URI and associates the request with that location's configuration. The main processing logic at this stage is in the checker function, and custom handlers cannot be mounted.
4. Location request address rewriting stage
This stage mainly handles the rewrite rules of the location block.
5. Request address rewrite submission stage
Post rewrite: this stage mainly performs verification and finishing work after the rewrite (for example, if the URI was changed, the request goes back to the configuration search phase) so that the request can be handed over to the following modules. Custom handlers cannot be mounted in this phase.
6. Access permission check preparation stage
Relatively coarse-grained access control is performed here; rate limiting and connection limiting, for example, are placed in this phase.
7. Access permission check stage
Fine-grained access control is performed here, for example access control and permission verification. Generally speaking, acting on the result is left to the following phase.
8. Access permission check submission stage
Generally speaking, after the access modules above have set access_code, this phase acts on that access_code. Custom handlers cannot be mounted in this phase.
9. Configuration item try_files processing stage
This phase corresponds to the try_files directive in the configuration file, and custom handlers cannot be mounted in it. It checks for the existence of the listed files in order and uses the first one found. A trailing slash indicates a directory, for example $uri/. If none of the files is found, an internal redirect is made to the last parameter.
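The lookup order described above can be sketched in a few lines of C. This is only an illustration under stated assumptions, not nginx's implementation: the function name `try_files` and its signature are hypothetical, candidates are treated as plain filesystem paths rather than being resolved against a configured root, and the last entry is returned unchecked as the fallback.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/stat.h>

/*
 * Hypothetical helper mirroring the try_files idea: test each candidate in
 * order and return the first one that exists. A trailing '/' means the
 * candidate must be a directory; otherwise it must be a regular file.
 * The last entry is the fallback and is returned without being checked.
 */
const char *try_files(const char *const candidates[], size_t n)
{
    assert(n >= 1);                          /* at least the fallback */

    for (size_t i = 0; i + 1 < n; i++) {
        const char *path = candidates[i];
        size_t len = strlen(path);
        struct stat st;

        if (len == 0 || stat(path, &st) != 0) {
            continue;                        /* missing: try the next one */
        }

        int want_dir = (path[len - 1] == '/');
        if (want_dir ? S_ISDIR(st.st_mode) : S_ISREG(st.st_mode)) {
            return path;                     /* first match wins */
        }
    }

    return candidates[n - 1];                /* nothing matched: fallback */
}
```

Calling it with a missing file, an existing directory written with a trailing slash, and a fallback name returns the directory entry; if every candidate is missing, the fallback comes back instead.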
10. Content generation stage
The content handler generates the response content. If the request is for PHP, it is passed to php-cgi over FastCGI; if it is a proxied request, it is forwarded to the corresponding back-end server.
11. Log module processing stage
The log handlers always run at the end of each request. They are used to write the access log.
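The eleven phases can be pictured as an array of handler lists that is walked in order. The following is a minimal standalone sketch of that idea, not nginx's actual phase engine: all names (`phase_t`, `register_handler`, `run_phases`, the `H_*` codes) are hypothetical, and the return-code rules are simplified so that "declined" tries the next handler in the same phase while "ok" moves on to the next phase.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for the eleven phases; names are hypothetical. */
typedef enum {
    PH_POST_READ = 0, PH_SERVER_REWRITE, PH_FIND_CONFIG, PH_REWRITE,
    PH_POST_REWRITE, PH_PREACCESS, PH_ACCESS, PH_POST_ACCESS,
    PH_TRY_FILES, PH_CONTENT, PH_LOG, PH_COUNT
} phase_t;

enum { H_OK = 0, H_DECLINED = 1, H_ERROR = -1 };
#define MAX_HANDLERS 8

typedef struct { int visited[PH_COUNT]; } request_t;
typedef int (*handler_pt)(request_t *r, phase_t ph);

static handler_pt handlers[PH_COUNT][MAX_HANDLERS];
static size_t     nhandlers[PH_COUNT];

/* Phases whose logic lives in the engine and that accept no custom handler. */
static int phase_is_closed(phase_t ph)
{
    return ph == PH_FIND_CONFIG || ph == PH_POST_REWRITE
        || ph == PH_POST_ACCESS || ph == PH_TRY_FILES;
}

/* Append a handler to a phase; fails for closed or full phases. */
int register_handler(phase_t ph, handler_pt h)
{
    assert(ph < PH_COUNT);
    if (phase_is_closed(ph) || nhandlers[ph] == MAX_HANDLERS) {
        return H_ERROR;
    }
    handlers[ph][nhandlers[ph]++] = h;
    return H_OK;
}

/* Walk the phases in order: H_DECLINED tries the next handler of the same
 * phase, H_OK advances to the next phase, H_ERROR aborts the request. */
int run_phases(request_t *r)
{
    for (int ph = 0; ph < PH_COUNT; ph++) {
        for (size_t i = 0; i < nhandlers[ph]; i++) {
            int rc = handlers[ph][i](r, (phase_t) ph);
            if (rc == H_ERROR) return H_ERROR;
            if (rc == H_OK) break;
        }
    }
    return H_OK;
}

/* Example handlers: one records the phase it ran in, one always declines. */
int mark_phase(request_t *r, phase_t ph) { r->visited[ph]++; return H_OK; }
int decline(request_t *r, phase_t ph) { (void) r; (void) ph; return H_DECLINED; }
```

Registering `mark_phase` on `PH_FIND_CONFIG` fails in this sketch, mirroring the phases that cannot mount custom handlers, while a handler registered on `PH_CONTENT` always runs after any handler registered on `PH_ACCESS`.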
Custom handlers can sometimes be mounted in different phases and still run normally. If a custom handler depends on the result of a certain phase, however, it must be mounted in a phase that comes after that phase.
Multiple worker processes each manage their own epoll event pool.
In this way, the CPU is used as fully as possible. The whole design of ngx revolves around asynchronous, non-blocking processing, with an event mechanism similar to that of Redis.
6. Doesn’t Apache have an advantage over Nginx for static file processing?
I ran a performance test with ab before. For a 1M page, ngx performed better than Apache, so I don't know where the conclusion that Apache has the advantage over ngx comes from. In theory, the epoll model has less overhead than the multi-threaded model.
7. In response to question 5, does it mean that each worker process should set up a thread pool?
Each worker process does not need a thread pool. epoll is an event pool: once an event is ready, the callback function you registered earlier is invoked.
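The "register a callback, then let epoll tell you when it is ready" pattern can be sketched as follows. This is a minimal Linux-only illustration, not nginx's event module: `event_t`, `event_add`, `process_events`, and `on_readable` are hypothetical names, and only read readiness is handled.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/epoll.h>
#include <unistd.h>

/* Hypothetical event record: the fd plus the callback to run when ready. */
typedef struct event_s event_t;
struct event_s {
    int   fd;
    void (*handler)(event_t *ev);
    int   fired;                    /* how many times the callback ran */
};

/* Register ev for readability; the event pointer rides in epoll user data. */
int event_add(int epfd, event_t *ev)
{
    assert(ev != NULL && ev->handler != NULL);

    struct epoll_event ee = {0};
    ee.events = EPOLLIN;
    ee.data.ptr = ev;
    return epoll_ctl(epfd, EPOLL_CTL_ADD, ev->fd, &ee);
}

/* One loop iteration: wait for ready events, then invoke their callbacks.
 * Returns the number of events dispatched, or -1 if epoll_wait failed. */
int process_events(int epfd, int timeout_ms)
{
    struct epoll_event ready[16];
    int n = epoll_wait(epfd, ready, 16, timeout_ms);

    for (int i = 0; i < n; i++) {
        event_t *ev = ready[i].data.ptr;
        ev->handler(ev);            /* call back what was registered */
    }
    return n;
}

/* Example callback: consume one byte and count the invocation. */
void on_readable(event_t *ev)
{
    char c;
    if (read(ev->fd, &c, 1) == 1) {
        ev->fired++;
    }
}
```

A single thread that calls `process_events` in a loop can serve many descriptors this way, which is why a worker process does not need a thread per connection.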
8. If the number of worker processes matches the number of CPU cores, and the worker processes work asynchronously without blocking, is it still meaningful to build a thread pool for each worker process, given that the CPU cores are all busy and have no idle time?
I think it is still meaningful. After all, the CPU is not reserved exclusively for those processes; time slices are still switched, and a thread pool can improve throughput. Of course, this needs to be tested, and it depends on the usage scenario.
9. For question 3, which thread pool should be used?
Nginx introduced a thread pool to address the performance degradation caused by calls that block for a long time. The thread pool can be enabled with a build option at compile time (--with-threads).
10. What is the most important design decision behind Nginx's high performance and high throughput? Asynchronous non-blocking?
Asynchronous, non-blocking processing is definitely a core point. In addition, ngx is very frugal wherever memory is used, and memory management is handled by the framework. Memory is not allocated unless it is needed, so when you write a module yourself, you take memory from the pool, and ngx also releases it for you. Moreover, all of ngx's data structures are carefully designed for their scenario; taken outside the ngx context, they feel awkward everywhere else.
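The "take memory from the pool and let the framework release it" idea can be illustrated with a small bump allocator. This is a sketch under simplifying assumptions and is far simpler than nginx's own ngx_pool_t: the names `pool_t`, `pool_create`, `pool_alloc`, `pool_block_count`, and `pool_destroy` are hypothetical, there are no cleanup handlers, and individual allocations are never freed on their own.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* One chunk of pool memory; the payload follows the header. */
typedef struct pool_block_s pool_block_t;
struct pool_block_s {
    pool_block_t *next;
    size_t        used;
    size_t        size;
    max_align_t   data[];           /* payload, aligned for any type */
};

typedef struct {
    pool_block_t *head;             /* most recently added block */
    size_t        block_size;       /* default payload size per block */
} pool_t;

pool_t *pool_create(size_t block_size)
{
    pool_t *p = malloc(sizeof(*p));
    if (p == NULL) return NULL;
    p->head = NULL;
    p->block_size = block_size;
    return p;
}

/* Carve n bytes out of the current block, adding a new block if needed. */
void *pool_alloc(pool_t *p, size_t n)
{
    assert(p != NULL);

    size_t align = _Alignof(max_align_t);
    n = (n + align - 1) / align * align;    /* round up to keep alignment */

    pool_block_t *b = p->head;
    if (b == NULL || b->size - b->used < n) {
        size_t size = n > p->block_size ? n : p->block_size;
        b = malloc(sizeof(*b) + size);
        if (b == NULL) return NULL;
        b->next = p->head;
        b->used = 0;
        b->size = size;
        p->head = b;
    }

    void *mem = (char *) b->data + b->used;
    b->used += n;
    return mem;
}

/* Number of blocks currently owned by the pool. */
size_t pool_block_count(const pool_t *p)
{
    size_t count = 0;
    for (const pool_block_t *b = p->head; b != NULL; b = b->next) count++;
    return count;
}

/* Release everything at once; callers never free single allocations. */
void pool_destroy(pool_t *p)
{
    if (p == NULL) return;
    pool_block_t *b = p->head;
    while (b != NULL) {
        pool_block_t *next = b->next;
        free(b);
        b = next;
    }
    free(p);
}
```

In this sketch a module would call `pool_alloc` as often as it likes during a request, and a single `pool_destroy` at the end of the request releases all of it, which is the division of labor the answer above describes.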