Web container design
Developing a web container draws on technologies at several levels, from the communication layer to the programming language itself, and a usable web container is a fairly large system that takes a long time to explain in full. This article introduces how to design one: it discusses the ideas behind the implementation and does not go far into specific implementations. The container is broken into several modules and components, each responsible for a different function. The following figure lists some basic components, and the sections below introduce each one.
Connection receiver
Its main responsibility is to listen for client socket connections, accept each socket, and hand it to the task executor (a thread pool) for processing. It keeps accepting sockets from the operating system, does as little processing as possible, and passes them to the thread pool. Why emphasize doing as little as possible? Because it affects performance: extra work in the receiver seriously reduces throughput. There is generally only one receiver (a single thread is responsible for accepting sockets), so the time spent on each accept directly limits how quickly the next connection can be accepted. The receiver therefore does only a small amount of simple work: maintaining a few state variables, incrementing the counter of the flow-control gate, calling accept on the serverSocket, setting some properties on the accepted socket, submitting the accepted socket to the thread pool, and handling some exceptions. Any logic that takes longer is handed to the thread pool, such as reading the socket's underlying data, parsing HTTP messages, and responding to the client.
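The accept loop described above can be sketched roughly as follows. This is a minimal illustration, not a definitive implementation: the class name `Acceptor`, the `handler` callback, and the choice of `setTcpNoDelay` as the socket property are all assumptions made for the example.

```java
import java.io.IOException;
import java.net.ServerSocket;
import java.net.Socket;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.RejectedExecutionException;
import java.util.function.Consumer;

// Hypothetical receiver: accepts sockets and immediately hands them to a thread pool.
class Acceptor implements Runnable {
    private final ServerSocket serverSocket;
    private final ExecutorService executor;
    private final Consumer<Socket> handler;   // the slow work, executed on a pool thread
    private volatile boolean running = true;

    Acceptor(ServerSocket serverSocket, ExecutorService executor, Consumer<Socket> handler) {
        this.serverSocket = serverSocket;
        this.executor = executor;
        this.handler = handler;
    }

    @Override
    public void run() {
        while (running) {
            Socket socket = null;
            try {
                socket = serverSocket.accept();      // blocks until a client connects
                socket.setTcpNoDelay(true);          // set a property on the accepted socket
                final Socket accepted = socket;
                executor.execute(() -> handler.accept(accepted));
            } catch (RejectedExecutionException e) {
                closeQuietly(socket);                // pool refused the task: drop the connection
            } catch (IOException e) {
                closeQuietly(socket);                // accept() also fails once stop() closes the server socket
            }
        }
    }

    // Stops the loop; closing the server socket unblocks a pending accept().
    void stop() {
        running = false;
        try {
            serverSocket.close();
        } catch (IOException ignored) {
            // nothing useful to do during shutdown
        }
    }

    private static void closeQuietly(Socket s) {
        if (s != null) {
            try {
                s.close();
            } catch (IOException ignored) {
                // best effort
            }
        }
    }
}
```

Note that the loop itself never reads from the socket; everything after `accept()` and the property setup runs on a pool thread.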
Connection number controller
For any machine, request traffic has peak periods and the server has physical limits. To keep the web server from being overwhelmed, we need some protective measures. "Traffic" here means the number of socket connections: traffic is controlled by controlling the number of socket connections. An effective method is flow control, which is like adding a gate at the inlet of the flow. The size of the gate determines the size of the flow; once the maximum is reached, the gate closes and no more connections are accepted until a slot becomes free. The counter can be implemented with the JDK's AQS (AbstractQueuedSynchronizer) framework.
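As an illustration of the gate, here is a minimal sketch of a connection counter built on `AbstractQueuedSynchronizer` in shared mode. The class name `ConnectionGate` and its method names are assumptions for this example; a production counter would need more care (for instance, guarding against releases without a matching acquire).

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.locks.AbstractQueuedSynchronizer;

// Hypothetical flow-control gate: at most `limit` connections may be counted at once.
class ConnectionGate {
    private final AtomicLong count = new AtomicLong(0);
    private final long limit;
    private final Sync sync = new Sync();

    ConnectionGate(long limit) {
        this.limit = limit;
    }

    private final class Sync extends AbstractQueuedSynchronizer {
        @Override
        protected int tryAcquireShared(int ignored) {
            long newCount = count.incrementAndGet();
            if (newCount > limit) {
                count.decrementAndGet();   // over the limit: undo and report failure
                return -1;                 // gate is closed, caller is queued by AQS
            }
            return 1;
        }

        @Override
        protected boolean tryReleaseShared(int ignored) {
            count.decrementAndGet();
            return true;                   // wake up a waiting acquirer, if any
        }
    }

    // Blocks while the gate is closed; called by the receiver before accept().
    void countUpOrAwait() throws InterruptedException {
        sync.acquireSharedInterruptibly(1);
    }

    // Non-blocking variant: returns false if the limit has been reached.
    boolean tryCountUp() {
        return sync.tryAcquireShared(1) >= 0;
    }

    // Called when a connection has been fully processed and closed.
    void countDown() {
        sync.releaseShared(1);
    }

    long getCount() {
        return count.get();
    }
}
```

In this sketch the receiver would call `countUpOrAwait()` before accepting, and the task that finishes a connection would call `countDown()`.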
Socket factory
Different scenarios may require different levels of security. For payment-related transactions, for example, information must be encrypted before it is sent, which also involves key negotiation, while in ordinary scenarios the messages need no encryption. At the application layer, this is the choice between http and https.
Simply put, the TLS/SSL protocol provides three services for each communication: ① authentication, which verifies the identity of the entities in the session; ② encryption, where a strong encryption mechanism ensures that messages cannot be deciphered in transit; ③ tamper protection, where a hash algorithm is used to sign messages and verifying the signature ensures that the content has not been altered.
The http protocol corresponds to Socket, while https corresponds to SSLSocket. Creating Sockets and SSLSockets is delegated to the socket factory.
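One possible shape for such a factory on the server side, using the JDK's own `javax.net.ServerSocketFactory` and `javax.net.ssl.SSLServerSocketFactory`. The helper class `SocketFactories` and its `forScheme` method are hypothetical names; the default SSL factory also assumes a keystore has been configured through the usual JSSE system properties before any socket is actually created.

```java
import javax.net.ServerSocketFactory;
import javax.net.ssl.SSLServerSocketFactory;

// Hypothetical helper that picks a server socket factory by protocol scheme.
class SocketFactories {
    static ServerSocketFactory forScheme(String scheme) {
        if ("https".equalsIgnoreCase(scheme)) {
            // Produces SSLServerSocket instances, which accept SSLSocket connections.
            return SSLServerSocketFactory.getDefault();
        }
        if ("http".equalsIgnoreCase(scheme)) {
            // Produces plain ServerSocket instances.
            return ServerSocketFactory.getDefault();
        }
        throw new IllegalArgumentException("Unsupported scheme: " + scheme);
    }
}
```

The receiver then only calls `createServerSocket(port)` on whatever factory it was given and does not need to know whether the connector is plain or encrypted.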
Task definition
This component defines the task to be executed and tells the thread pool what to run. A task has three main steps: processing the socket and responding to the client, decrementing the connection counter, and closing the socket. Processing the socket is the most important and most complex step. It includes reading the underlying byte stream of the socket, parsing the HTTP request message (the request line, request headers, request body, and so on), using the path from the request line to locate the resource of the web project on the corresponding host, and assembling an HTTP response message from the processing result and writing it to the client.
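The three steps can be sketched as a `Runnable` whose `finally` block guarantees that the counter is decremented and the socket is closed even when processing fails. The class name `SocketTask` is an assumption, an `AtomicInteger` stands in for the connection counter, and `process` is only a placeholder that reads some bytes and writes a fixed empty response.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.Socket;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical task: process the socket, then always release the slot and close.
class SocketTask implements Runnable {
    private final Socket socket;
    private final AtomicInteger connectionCount;

    SocketTask(Socket socket, AtomicInteger connectionCount) {
        this.socket = socket;
        this.connectionCount = connectionCount;
    }

    @Override
    public void run() {
        try {
            process(socket);                       // step 1: handle the request
        } catch (IOException e) {
            // a real container would write this to the run log
        } finally {
            connectionCount.decrementAndGet();     // step 2: free the connection slot
            try {
                socket.close();                    // step 3: close the socket
            } catch (IOException ignored) {
                // best effort
            }
        }
    }

    // Placeholder for reading, parsing, resource lookup and response assembly.
    private void process(Socket s) throws IOException {
        InputStream in = s.getInputStream();
        in.read(new byte[8192]);                   // stand-in for reading and parsing the request
        OutputStream out = s.getOutputStream();
        String response = "HTTP/1.1 200 OK\r\nContent-Length: 0\r\nConnection: close\r\n\r\n";
        out.write(response.getBytes(StandardCharsets.ISO_8859_1));
        out.flush();
    }
}
```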
Task executor
The task executor is a thread pool with a maximum and minimum number of threads. It is called a "task executor" because the pool can be seen as a set of threads that keep polling a task queue and execute any task they find there. Its configuration includes the maximum and minimum thread counts, how long idle surplus threads are kept before being reclaimed, and the rejection action the pool takes when the maximum number of threads is exceeded.
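These settings map directly onto the parameters of the JDK's `ThreadPoolExecutor`. A minimal sketch, in which the factory class `TaskExecutors`, the bounded queue, and the choice of `AbortPolicy` as the rejection action are assumptions for illustration:

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical factory for the container's task executor.
class TaskExecutors {
    static ThreadPoolExecutor create(int minThreads, int maxThreads,
                                     long keepAliveSeconds, int queueCapacity) {
        return new ThreadPoolExecutor(
                minThreads,                               // threads kept alive even when idle
                maxThreads,                               // upper bound on threads
                keepAliveSeconds, TimeUnit.SECONDS,       // idle time before surplus threads are reclaimed
                new ArrayBlockingQueue<>(queueCapacity),  // bounded task queue
                new ThreadPoolExecutor.AbortPolicy());    // reject by throwing RejectedExecutionException
    }
}
```

With this configuration a task is rejected only when all `maxThreads` threads are busy and the queue is full; other rejection policies (such as running the task on the caller's thread) can be swapped in.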
Message reading
Reads messages sent by the client from the underlying operating system and provides a buffering mechanism. The message is copied into a destination buffer, desBuf.
Message output
Writes messages produced by the web container to the operating system and provides a buffering mechanism. The message in outputBuf is written to the operating system through the buffer.
Input filter
During reading, some additional processing is often needed, and that processing may differ depending on conditions. For decoupling and extensibility, filters are introduced: data reaches desBuf only after passing through the layers of filters. The process is like adding checkpoints; the corresponding operation is performed at each checkpoint, and at the end the source data has been turned into the destination data.
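A layered read path of this kind might look like the sketch below. All names here (`InputSource`, `StreamSource`, `ContentLengthFilter`, `CountingFilter`) are hypothetical; the two filters were chosen only as simple examples of a layer that limits the data and a layer that observes it.

```java
import java.io.IOException;
import java.io.InputStream;

// Hypothetical common interface: anything that can fill desBuf with bytes.
interface InputSource {
    int read(byte[] desBuf, int off, int len) throws IOException;
}

// Innermost layer: reads from the socket's input stream.
class StreamSource implements InputSource {
    private final InputStream in;

    StreamSource(InputStream in) {
        this.in = in;
    }

    @Override
    public int read(byte[] desBuf, int off, int len) throws IOException {
        return in.read(desBuf, off, len);
    }
}

// Filter that stops after a fixed number of body bytes (for example, Content-Length).
class ContentLengthFilter implements InputSource {
    private final InputSource next;
    private long remaining;

    ContentLengthFilter(InputSource next, long contentLength) {
        this.next = next;
        this.remaining = contentLength;
    }

    @Override
    public int read(byte[] desBuf, int off, int len) throws IOException {
        if (remaining <= 0) {
            return -1;                                   // body fully consumed
        }
        int n = next.read(desBuf, off, (int) Math.min(len, remaining));
        if (n > 0) {
            remaining -= n;
        }
        return n;
    }
}

// Filter that counts the bytes passing through (for example, for the access log).
class CountingFilter implements InputSource {
    private final InputSource next;
    private long count;

    CountingFilter(InputSource next) {
        this.next = next;
    }

    @Override
    public int read(byte[] desBuf, int off, int len) throws IOException {
        int n = next.read(desBuf, off, len);
        if (n > 0) {
            count += n;
        }
        return n;
    }

    long getCount() {
        return count;
    }
}
```

Filters are stacked by wrapping, for example `new CountingFilter(new ContentLengthFilter(new StreamSource(in), length))`, so a new processing step can be added without changing the reader or the other filters.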
Output filter
Its function is similar to the input filter's, and it is applied when messages are output.
Message parser
Provides the ability to parse the various parts of an HTTP message.
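As a small illustration, the sketch below parses the request line and headers from the head of an HTTP/1.x request that has already been read into a string. The names `ParsedRequest` and `HttpMessageParser` are hypothetical, and the sketch ignores details a real parser must handle, such as parsing incrementally from a byte buffer and enforcing size limits.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical holder for the parsed parts of a request head.
class ParsedRequest {
    String method;
    String uri;
    String protocol;
    final Map<String, String> headers = new LinkedHashMap<>();
}

class HttpMessageParser {
    // Parses "METHOD SP URI SP PROTOCOL" followed by "Name: value" lines.
    static ParsedRequest parseHead(String head) {
        String[] lines = head.split("\r\n");
        String[] parts = lines[0].split(" ");
        if (parts.length != 3) {
            throw new IllegalArgumentException("Malformed request line: " + lines[0]);
        }
        ParsedRequest request = new ParsedRequest();
        request.method = parts[0];
        request.uri = parts[1];
        request.protocol = parts[2];
        for (int i = 1; i < lines.length; i++) {
            if (lines[i].isEmpty()) {
                break;                                  // blank line ends the header section
            }
            int colon = lines[i].indexOf(':');
            if (colon <= 0) {
                throw new IllegalArgumentException("Malformed header: " + lines[i]);
            }
            // Header names are case-insensitive, so store them in lower case.
            String name = lines[i].substring(0, colon).trim().toLowerCase(Locale.ROOT);
            String value = lines[i].substring(colon + 1).trim();
            request.headers.put(name, value);
        }
        return request;
    }
}
```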
Request generator
Following object-oriented design, the attributes and protocol fields related to a request are abstracted into a Request object. It has three parts: the request line, the request headers, and the request body. Any value needed during processing can be read directly from the Request object, which also makes it easier to implement the servlet specification.
Response generator
Corresponding to the request, a response object generator is also needed. The Response object has three parts: the response line (status line), the response headers, and the response body. Values from the processing result can be set directly on the Response object, which also makes it easier to implement the servlet specification.
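A response object with these three parts could be sketched as follows. The class `SimpleResponse` and its methods are hypothetical and far simpler than a servlet response; the sketch always appends a `Content-Length` header computed from the body.

```java
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical response object: status line, headers, body.
class SimpleResponse {
    private int status = 200;
    private String reason = "OK";
    private final Map<String, String> headers = new LinkedHashMap<>();
    private byte[] body = new byte[0];

    void setStatus(int status, String reason) {
        this.status = status;
        this.reason = reason;
    }

    void setHeader(String name, String value) {
        headers.put(name, value);
    }

    void setBody(String text) {
        this.body = text.getBytes(StandardCharsets.UTF_8);
    }

    // Assembles the HTTP response message that is written to the client.
    byte[] toBytes() {
        StringBuilder head = new StringBuilder();
        head.append("HTTP/1.1 ").append(status).append(' ').append(reason).append("\r\n");
        for (Map.Entry<String, String> h : headers.entrySet()) {
            head.append(h.getKey()).append(": ").append(h.getValue()).append("\r\n");
        }
        head.append("Content-Length: ").append(body.length).append("\r\n\r\n");
        byte[] headBytes = head.toString().getBytes(StandardCharsets.ISO_8859_1);
        byte[] out = new byte[headBytes.length + body.length];
        System.arraycopy(headBytes, 0, out, 0, headBytes.length);
        System.arraycopy(body, 0, out, headBytes.length, body.length);
        return out;
    }
}
```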
Address Mapper
The address mapper is a router between requests and resources. Each request is mapped by its path to the resource that will produce the response for the requesting client.
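One common way to implement such a mapping is longest-prefix matching on the context path, sketched below. The class `AddressMapper`, the use of plain strings as the mapped "resource", and the convention that the empty string is the root context are assumptions for this example.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical mapper: picks the web application whose context path is the
// longest prefix of the request path.
class AddressMapper {
    private final Map<String, String> contexts = new HashMap<>();

    // contextPath is "" for the root application, otherwise like "/shop".
    void addContext(String contextPath, String appName) {
        contexts.put(contextPath, appName);
    }

    // Returns the matching application name, or null if nothing matches.
    String map(String requestPath) {
        String best = null;
        for (String prefix : contexts.keySet()) {
            // Match whole path segments only, so "/shop" does not match "/shopping".
            boolean matches = requestPath.equals(prefix) || requestPath.startsWith(prefix + "/");
            if (matches && (best == null || prefix.length() > best.length())) {
                best = prefix;
            }
        }
        return best == null ? null : contexts.get(best);
    }
}
```

A real mapper would continue in the same way inside the selected application, from the remaining path down to a servlet or static resource.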
Life cycle
To modularize further: the container has many components, which may need to react to different events at different moments, so a life cycle is needed to manage all components in a unified way. For example, starting, stopping, and shutting down every component can be handled through unified life-cycle management, which makes those components easier to manage. And what if you want to do something before or after a component enters a certain state? Adding a life-cycle listener achieves this gracefully.
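A minimal sketch of this idea: a base class drives the state changes and notifies listeners before and after each one. The names `LifecycleListener`, `LifecycleBase`, and the event strings are hypothetical.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical listener notified around each life-cycle transition.
interface LifecycleListener {
    void onEvent(String event, Object source);
}

// Hypothetical base class shared by all container components.
abstract class LifecycleBase {
    private final List<LifecycleListener> listeners = new CopyOnWriteArrayList<>();

    void addListener(LifecycleListener listener) {
        listeners.add(listener);
    }

    // Template method: listeners run before and after the component's own logic.
    final void start() {
        fire("before_start");
        startInternal();
        fire("after_start");
    }

    final void stop() {
        fire("before_stop");
        stopInternal();
        fire("after_stop");
    }

    protected abstract void startInternal();

    protected abstract void stopInternal();

    private void fire(String event) {
        for (LifecycleListener listener : listeners) {
            listener.onEvent(event, this);
        }
    }
}
```

A parent component can start its children simply by calling `start()` on each of them inside its own `startInternal()`, which is how one call at the top can bring up the whole container.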
JMX Manager
It monitors and manages the system's running state: server performance, collection of server-related parameters, JVM load, the number of web connections, thread pools, database connection pools, cache management, configuration file reloading, and so on. It can provide remote, visual management with near real-time data, and it also offers a solution for managing distributed systems.
Web Loader
WebLoader loads web application projects. A web container may host several web applications. To isolate their libs and servlets from one another, each web application must use its own ClassLoader, and these class loaders must not be in a parent-child relationship with each other. This achieves class isolation: the lib of one web application cannot be used by another.
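The sibling relationship between the loaders can be sketched with the JDK's `URLClassLoader`: every application gets its own loader, and all of them share one common parent instead of pointing at each other. The class `WebLoader` and its `loaderFor` method are hypothetical names, and the classpath URLs (for example an application's classes directory and jar files) are supplied by the caller.

```java
import java.net.URL;
import java.net.URLClassLoader;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical loader registry: one class loader per web application.
class WebLoader {
    private final ClassLoader commonParent;
    private final Map<String, URLClassLoader> loaders = new ConcurrentHashMap<>();

    WebLoader(ClassLoader commonParent) {
        this.commonParent = commonParent;
    }

    // Returns the loader of the given application, creating it on first use.
    // All loaders are siblings: each has commonParent as its parent, so classes
    // loaded from one application's classpath are invisible to the others.
    URLClassLoader loaderFor(String appName, URL... classpath) {
        return loaders.computeIfAbsent(appName,
                name -> new URLClassLoader(classpath, commonParent));
    }
}
```

Classes that every application needs, such as the servlet API, would be placed on the common parent so that they are loaded once and shared.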
Session Manager
The session manager mainly manages sessions, including: ① Generating session IDs. When neither the cookie nor the URL of a request carries a jsessionid value, a new session ID must be generated. ② Cleaning up expired sessions. The server holds sessions for many clients, so timed-out sessions must be cleaned up regularly to avoid wasting server memory. ③ Persisting sessions. Some important sessions can be persisted to disk and reloaded into memory when needed.
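The first two responsibilities can be sketched as below. The class `SessionManager` and its methods are hypothetical; the sketch stores only a last-access timestamp per session, takes the current time as a parameter so that the expiry logic is easy to follow, and leaves out persistence entirely.

```java
import java.security.SecureRandom;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical session manager: ID generation and timeout-based cleanup.
class SessionManager {
    private static final char[] HEX = "0123456789ABCDEF".toCharArray();
    private final SecureRandom random = new SecureRandom();
    private final Map<String, Long> lastAccess = new ConcurrentHashMap<>();
    private final long timeoutMillis;

    SessionManager(long timeoutMillis) {
        this.timeoutMillis = timeoutMillis;
    }

    // Generates a random 128-bit ID, rendered as 32 hex characters.
    String createSession(long now) {
        byte[] bytes = new byte[16];
        random.nextBytes(bytes);
        StringBuilder id = new StringBuilder(32);
        for (byte b : bytes) {
            id.append(HEX[(b >> 4) & 0xF]).append(HEX[b & 0xF]);
        }
        lastAccess.put(id.toString(), now);
        return id.toString();
    }

    // Records an access; returns false if the session does not exist.
    boolean touch(String sessionId, long now) {
        return lastAccess.replace(sessionId, now) != null;
    }

    // Removes sessions idle for longer than the timeout; returns how many were removed.
    int expire(long now) {
        int removed = 0;
        for (Map.Entry<String, Long> entry : lastAccess.entrySet()) {
            if (now - entry.getValue() > timeoutMillis
                    && lastAccess.remove(entry.getKey(), entry.getValue())) {
                removed++;
            }
        }
        return removed;
    }

    int activeCount() {
        return lastAccess.size();
    }
}
```

In a running container `expire` would be called periodically by a background thread with `System.currentTimeMillis()` as the argument.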
Run log
Records warnings, exceptions, and errors that occur at runtime.
Access log
The access log generally records information about each client access, including the client IP, request time, request protocol, request method, number of request bytes, response code, session ID, processing time, and so on. From the access log one can derive statistics such as the number of visiting users, the distribution of access times, and user preferences, and this data can help a company make decisions about its operating strategy.
Security Manager
A web project runs on the web container platform, much like an application embedded in a host platform. For the embedded program to run properly, the platform itself must run safely and normally, and the platform should be protected from embedded applications as far as possible, so that the two are isolated from each other to some degree. At startup, the policy file is specified with -Djava.security.manager -Djava.security.policy==web.policy, and that file defines the various permissions.
Operation monitoring & remote management
Provides a platform for monitoring the running status of the web container in real time and managing it remotely.
Cluster
There are generally two types of clusters: ① Load-balancing clusters, which use a distribution algorithm to spread access traffic evenly across the machines in the cluster. ② High-availability clusters, in which cluster communication connects several machines. This kind of cluster focuses on keeping the cluster as a whole available to the outside through automatic failover or traffic transfer after one machine fails.
Web requests are generally stateless and can be clustered directly, but sessions are stateful and require cluster communication to replicate them between machines. The related technologies include multicast and unicast.
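As an example of a distribution algorithm for the load-balancing case, here is a minimal round-robin sketch. Round-robin is only one possible choice, not necessarily the one a given cluster uses, and the class `RoundRobinBalancer` with its node names as plain strings is hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical balancer: hands out cluster nodes in rotation.
class RoundRobinBalancer {
    private final List<String> nodes;
    private final AtomicInteger next = new AtomicInteger();

    RoundRobinBalancer(List<String> nodes) {
        if (nodes.isEmpty()) {
            throw new IllegalArgumentException("At least one node is required");
        }
        this.nodes = new ArrayList<>(nodes);   // defensive copy
    }

    // Thread-safe; floorMod keeps the index valid even after the counter overflows.
    String pick() {
        int index = Math.floorMod(next.getAndIncrement(), nodes.size());
        return nodes.get(index);
    }
}
```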
Servlet Engine
The servlet engine uses reflection to instantiate the servlets and JSPs of a web application, places the instances in a servlet object pool, and calls the appropriate methods as needed. The web application puts its business logic in the doPost or doGet method. When the web container handles a request, it runs the logic defined there and sends the response to the client.
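The mechanism can be illustrated with a deliberately simplified sketch. It does not use the real servlet API: `MiniServlet` is a hypothetical stand-in whose `doGet` and `doPost` take and return plain strings, and `ServletEngine` and `HelloServlet` are likewise invented for the example.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical, heavily simplified stand-in for a servlet base class.
abstract class MiniServlet {
    protected String doGet(String path) {
        return "405 Method Not Allowed";
    }

    protected String doPost(String path, String body) {
        return "405 Method Not Allowed";
    }
}

// Example application servlet: business logic lives in doGet.
class HelloServlet extends MiniServlet {
    @Override
    protected String doGet(String path) {
        return "Hello from " + path;
    }
}

// Hypothetical engine: instantiates servlets by reflection and pools them.
class ServletEngine {
    private final Map<String, MiniServlet> pool = new ConcurrentHashMap<>();

    // Creates the servlet on first use and reuses the pooled instance afterwards.
    MiniServlet load(String className) throws ReflectiveOperationException {
        MiniServlet cached = pool.get(className);
        if (cached != null) {
            return cached;
        }
        Object instance = Class.forName(className).getDeclaredConstructor().newInstance();
        MiniServlet servlet = (MiniServlet) instance;
        MiniServlet existing = pool.putIfAbsent(className, servlet);
        return existing != null ? existing : servlet;
    }

    // Dispatches to doGet or doPost according to the HTTP method.
    String service(String className, String method, String path, String body)
            throws ReflectiveOperationException {
        MiniServlet servlet = load(className);
        if ("GET".equals(method)) {
            return servlet.doGet(path);
        }
        if ("POST".equals(method)) {
            return servlet.doPost(path, body);
        }
        return "405 Method Not Allowed";
    }
}
```

In a real container the class name would come from the application's deployment configuration, and the class would be loaded through that application's own class loader rather than `Class.forName`.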
JSP compiler
According to the specification, a JSP is ultimately compiled into a servlet and executed as one, so the jsp file must be compiled as the specification requires. The JSP compiler translates the jsp syntax into the corresponding servlet code.
A web container basically consists of the components introduced above. By implementing each component module, you can build a web container on which your web applications can run.