In the previous lecture, we introduced what CGI programs do. Today we move on to Part 2: the concepts behind CGI programming. The content of this lecture is the basis for an in-depth understanding of CGI programs.
This series of lectures uses Delphi to write CGI programs, and Delphi already encapsulates the details covered here in its classes, so you may think that this lecture is unnecessary. However, one of the advantages of CGI is that many development languages are available (see section 2.2 below), and the content of this lecture applies to any of them, Delphi included. To take full advantage of CGI, this material is therefore still necessary.
2. CGI specifications:
Typically, a WEB server is a powerful computer whose processing power is rarely fully used. CGI lets people put that processing power to work, providing interesting, dynamic content to remote clients. The CGI specification applies to WEB servers and to the applications that run on them. It is not part of the HTTP protocol, but most WEB servers support it, such as NCSA httpd, CERN httpd, Apache httpd, IIS and the OmniHTTPD we use.
2.1, CGI Overview
CGI defines a set of rules that WEB servers, browsers and applications follow when they interoperate. One example is querying a remote database system through a WEB browser.
2.2. Language:
CGI programs can be written in any language that can be executed on the WEB server. You should choose the language you are most familiar with and that best suits the job at hand. For example: Perl is well suited to string and file processing, C is better suited to large and complex programs, Visual Basic and Delphi are well suited to database processing, and so on. The following are commonly used CGI programming languages:
C
C++
Perl
Tcl
Python
Shell Scripts
Visual Basic
Delphi
AppleScript
2.3, CGI method:
The way to call CGI is called a CGI method. There are three main CGI methods:
2.3.1, GET method:
The GET method is the method a browser normally uses to make a request to the WEB server. With this method, the input data is appended to the URL, and the CGI program obtains it from the environment variable QUERY_STRING. To obtain the individual input parameters, the CGI program must parse this environment variable. When the data to be transmitted is very long, the POST method should be used instead.
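As a minimal sketch of the parsing step described above, the following Python fragment reads QUERY_STRING and splits it into parameters. The function name read_get_params and the sample request are hypothetical and used only for illustration; the same steps apply in Delphi or any other language.

```python
import os
from urllib.parse import parse_qs


def read_get_params(environ=os.environ):
    """Return GET parameters as a dict mapping each name to a list of values."""
    query = environ.get("QUERY_STRING", "")
    # parse_qs splits on "&", separates name=value pairs and undoes URL encoding.
    return parse_qs(query, keep_blank_values=True)


# Hypothetical request: /cgi-bin/query.cgi?name=Li+Ming&city=Beijing
sample_env = {"QUERY_STRING": "name=Li+Ming&city=Beijing"}
params = read_get_params(sample_env)
print(params["name"][0])  # prints: Li Ming
print(params["city"][0])  # prints: Beijing
```

Each value is a list because the same parameter name may appear more than once in a query string.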
2.3.2, POST method:
When using the POST method, CGI programs get input data from stdin (standard input). Since the server is not required to signal EOF (End Of File) at the end of the input data, the CGI program must use the value of the environment variable CONTENT_LENGTH to read exactly the right number of bytes. The biggest advantage of this method is that it can transmit a large amount of data, whereas the GET method is limited by the maximum URL length (often around 1024 bytes, depending on the server and browser). For large inputs, the POST method is the only option.
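The reading rule above can be sketched in Python as follows. The function name read_post_body is hypothetical, and an in-memory stream stands in for the real standard input so that the example can run outside a WEB server.

```python
import io
import os
import sys


def read_post_body(environ=os.environ, stdin=None):
    """Read exactly CONTENT_LENGTH bytes of POST data from standard input."""
    if stdin is None:
        stdin = sys.stdin.buffer  # binary view of the real standard input
    try:
        length = int(environ.get("CONTENT_LENGTH") or 0)
    except ValueError:
        length = 0  # treat a malformed value as "no data"
    # Never read until EOF: the server may not close the stream.
    return stdin.read(length) if length > 0 else b""


# Simulated request: the stream holds more bytes than CONTENT_LENGTH announces.
fake_stdin = io.BytesIO(b"name=Li+Ming<<bytes the program must not read>>")
body = read_post_body({"CONTENT_LENGTH": "12"}, fake_stdin)
print(body)  # prints: b'name=Li+Ming'
```

Reading a fixed number of bytes, rather than reading to the end of the stream, is what keeps the program from blocking when no EOF arrives.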
2.3.3, HEAD method:
The HEAD method is basically the same as the GET method, except that the WEB server returns only the HTTP header information to the browser and omits the document body.
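The difference between the three methods can be sketched from the script's point of view. In the Python fragment below, build_response is a hypothetical helper, not part of any library, and it assumes the script itself chooses to skip the body for HEAD requests (many servers would also discard the body on the script's behalf).

```python
def build_response(method, body):
    """Return the CGI output for a request made with the given method."""
    headers = "Content-type: text/html\n\n"  # header block ends with a blank line
    if method.upper() == "HEAD":
        # HEAD: same headers as GET, but no document body.
        return headers
    return headers + body


page = "<html><body>Hello</body></html>"
print(repr(build_response("HEAD", page)))
print(repr(build_response("GET", page)))
```

In a real CGI program, the method argument would come from the REQUEST_METHOD environment variable.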
2.4. Interface specifications:
The following will introduce the four main methods for WEB servers to communicate with CGI programs: environment variables, command line, standard input and standard output. (Based on CGI Version 1.1)
2.4.1. Environment variables:
AUTH_TYPE: If the server supports authentication and the script is protected, gives the type of authentication used.
CONTENT_LENGTH: Gives the length of data transmitted using the POST method in bytes. The variable is empty when using the GET method.
CONTENT_TYPE: Gives the MIME type of the data transmitted when using the POST method, such as: application/x-www-form-urlencoded. The variable is empty when using the GET method.
GATEWAY_INTERFACE: Gives the CGI specification name and version number, such as: CGI/1.1.
PATH_INFO: gives the additional path information after the CGI program name in the URL.
PATH_TRANSLATED: The physical path that corresponds to PATH_INFO after the server applies its virtual-to-physical mapping, usually the WEB root directory followed by the additional path information.
QUERY_STRING: The information after the "?" character in the URL. This environment variable gives the input data when using the GET method.
REMOTE_ADDR: The IP of the remote computer making the request.
REMOTE_HOST: The name of the remote computer making the request.
REMOTE_IDENT: Gives the remote username retrieved using the identification protocol defined in RFC 931, if the server supports it.
Note: RFC 931 is an Internet RFC (Request for Comments) document that describes a method for identifying the user of a TCP connection. Documentation at: http://sunsite.auc.dk/RFC/rfc/rfc931.html.
REMOTE_USER: Gives the authenticated username of the client making the request.
REQUEST_METHOD: The method used to make the request, which can be GET, HEAD or POST.
SCRIPT_NAME: The virtual path of the CGI program being executed, such as: /cgi-bin/query.cgi.
SERVER_NAME: The domain name or IP address of the computer running the WEB server software, such as: www.chinabyte.com.
SERVER_PORT: The port number of the WEB server, the default value is 80.
SERVER_PROTOCOL: The protocol name and version number used by the WEB server, such as: HTTP/1.0.
SERVER_SOFTWARE: The name of the WEB server that executes CGI programs. The format is "server name/version number", such as: NCSA/1.5b5.
HTTP_ACCEPT: The content of the "Accept:" header line sent by the client, listing the MIME types that the client can handle, in the format "type/subtype, type/subtype, ...", such as: */*, image/gif, image/jpeg.
HTTP_REFERER: The content of the "Referer:" header line, containing the URL of the page (for example, the form) from which the CGI request was made, such as: http://www.chinabyte.com/register.form.
HTTP_USER_AGENT: The name of the client browser that made the request, such as: Mozilla/1.2N (Windows;I;32bit).
You can see the above environment variables using the demonstration program in the previous lecture.
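For readers who want to inspect these variables without Delphi, here is a minimal Python sketch that collects the variables listed above from the process environment. The names CGI_VARIABLES and collect_cgi_variables are hypothetical, and the sample values are invented for the demonstration.

```python
import os

# The environment variables described in this section (CGI Version 1.1).
CGI_VARIABLES = [
    "AUTH_TYPE", "CONTENT_LENGTH", "CONTENT_TYPE", "GATEWAY_INTERFACE",
    "PATH_INFO", "PATH_TRANSLATED", "QUERY_STRING", "REMOTE_ADDR",
    "REMOTE_HOST", "REMOTE_IDENT", "REMOTE_USER", "REQUEST_METHOD",
    "SCRIPT_NAME", "SERVER_NAME", "SERVER_PORT", "SERVER_PROTOCOL",
    "SERVER_SOFTWARE", "HTTP_ACCEPT", "HTTP_REFERER", "HTTP_USER_AGENT",
]


def collect_cgi_variables(environ=os.environ):
    """Return a dict of the CGI variables; unset ones map to an empty string."""
    return {name: environ.get(name, "") for name in CGI_VARIABLES}


# Invented sample environment, standing in for what a WEB server would set.
sample = {"REQUEST_METHOD": "GET", "SCRIPT_NAME": "/cgi-bin/query.cgi"}
for name, value in collect_cgi_variables(sample).items():
    print(f"{name}={value}")
```

When run under a real WEB server, calling collect_cgi_variables() with no argument reads the actual environment of the CGI process.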
2.4.2, command line:
The CGI command line is only used for ISINDEX queries. An ISINDEX query is a special kind of query issued from a page that contains the <ISINDEX> tag, with <BASE HREF=".."> naming the URL that receives it. The command line can carry multiple parameters.
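To illustrate where the command-line parameters come from, the sketch below mimics what a server is generally expected to do with an ISINDEX query under CGI 1.1: split the query string on "+" and decode each word, unless the string contains an unencoded "=" (in which case it is treated as form data and no command line is passed). The function name command_line_args is hypothetical; a CGI program itself would simply read the resulting words from its argument list (sys.argv[1:] in Python).

```python
from urllib.parse import unquote


def command_line_args(query_string):
    """Derive ISINDEX-style command-line words from a query string."""
    if not query_string or "=" in query_string:
        # Form data (name=value pairs) is not passed on the command line.
        return []
    # Search words are separated by "+"; each word is percent-decoded.
    return [unquote(word) for word in query_string.split("+")]


# Hypothetical ISINDEX request: /cgi-bin/search.cgi?delphi+cgi
print(command_line_args("delphi+cgi"))    # prints: ['delphi', 'cgi']
print(command_line_args("name=Li+Ming"))  # prints: []
```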
2.4.3. Standard input:
When using the POST method, the CGI program gets the transmitted data from stdin. As mentioned before, the values of the CONTENT_TYPE and CONTENT_LENGTH environment variables must be used to interpret and read it. Note that the data is URL-encoded: for example, spaces are replaced by plus signs, ~ is replaced by %7E, and so on.
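The decoding step can be sketched in Python as follows. The function name decode_form is hypothetical, and the sketch assumes the body has already been read from stdin and only handles the application/x-www-form-urlencoded type mentioned earlier.

```python
from urllib.parse import parse_qs, unquote_plus


def decode_form(body, content_type):
    """Decode a URL-encoded POST body (bytes) into a dict of value lists."""
    # Ignore optional parameters such as "; charset=..." after the MIME type.
    mime = content_type.split(";")[0].strip().lower()
    if mime != "application/x-www-form-urlencoded":
        raise ValueError("unsupported CONTENT_TYPE: " + content_type)
    return parse_qs(body.decode("ascii"), keep_blank_values=True)


# Decoding a single value: "+" becomes a space and %7E becomes "~".
print(unquote_plus("Li+Ming%7E"))  # prints: Li Ming~

# Decoding a whole (invented) form submission.
fields = decode_form(b"name=Li+Ming&home=%7Eli",
                     "application/x-www-form-urlencoded")
print(fields["home"][0])  # prints: ~li
```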
2.4.4. Standard output:
CGI programs send data to the browser through standard output, either as content for the browser or as commands (headers) that the WEB server interprets. A CGI program can also talk to the browser directly, without the WEB server analyzing its output; the name of such a program must start with "nph-", which stands for non-parsed headers. In that case the CGI program is responsible for the correctness of the HTTP header information returned to the browser.
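As a rough illustration of what "responsible for the HTTP header information" means, the Python sketch below builds the complete output of a hypothetical nph- program, including the HTTP status line that the server would otherwise add. The function name nph_response is an assumption for this example, and a real response would usually carry more headers than shown.

```python
def nph_response(body, status="200 OK"):
    """Build the complete HTTP response an nph- program must emit itself."""
    lines = [
        "HTTP/1.0 " + status,       # status line, normally added by the server
        "Content-type: text/html",  # MIME type of the body
        "",                         # blank line ends the header block
        body,
    ]
    return "\r\n".join(lines)


print(repr(nph_response("<html><body>Hello</body></html>")))
```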
When the program is not an nph- program, the server looks for three special headers that the CGI program may return:
Content-type: The MIME type of the returned data. For example, when outputting HTML, "Content-type: text/html" is commonly used.
Location: Tells the server that the response refers to another document instead of carrying one. The server either redirects the client or sends the document content itself, depending on whether the value is a full URL or a relative (virtual) path.
Status: The status line that the server sends to the client. The format is: nnn XXXXX, where nnn is a three-digit code and XXXXX is the corresponding description text, such as: 404 Not Found.
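The three headers can be put together in a small Python sketch. The helper cgi_response is hypothetical; it assumes that a response with a Location header carries no body, and that every header block ends with a blank line before any content.

```python
def cgi_response(body="", content_type="text/html", status=None, location=None):
    """Build the output of an ordinary (parsed-header) CGI program."""
    headers = []
    if status is not None:
        headers.append("Status: " + status)      # e.g. "404 Not Found"
    if location is not None:
        headers.append("Location: " + location)  # redirect, no body follows
        body = ""
    else:
        headers.append("Content-type: " + content_type)
    # A blank line separates the headers from the body.
    return "\n".join(headers) + "\n\n" + body


print(repr(cgi_response("<p>Hello</p>")))
print(repr(cgi_response("<p>No such page</p>", status="404 Not Found")))
print(repr(cgi_response(location="http://www.chinabyte.com/")))
```

A real CGI program would write the returned string to standard output, where the WEB server reads the headers and completes the HTTP response.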