The Internet has, I believe, become an indispensable part of people's lives. Rich-client technologies such as Ajax and Flex let people enjoy, right in the browser, many features that used to be possible only in C/S desktop applications. Google, for example, has moved even the most basic office applications onto the web. Convenient as all this is, it undeniably makes pages slower and slower. I work in front-end development, and in terms of performance, according to Yahoo's research, the back end accounts for only 5% of the response time while the front end accounts for as much as 95%, of which 88% can be optimized.
The diagram above shows the life cycle of a Web 2.0 page. The engineers vividly describe it as four stages: pregnancy, birth, graduation, and marriage. If we think of clicking a link as setting off this whole process rather than as a simple request and response, we can dig out many details where performance can be improved. Today I listened to a talk by Xiao Ma Ge of Taobao on the Yahoo development team's research into web performance; I felt I learned a great deal and wanted to share it on this blog.
I believe many people have heard of the 14 rules for optimizing website performance. More information can be found at developer.yahoo.com.
1. Reduce the number of HTTP requests as much as possible [content]
2. Use CDN (Content Delivery Network) [server]
3. Add Expires header (or Cache-control) [server]
4. Gzip component [server]
5. Put stylesheets at the top of the page [css]
6. Move scripts to the bottom of the page (inline scripts included) [javascript]
7. Avoid using Expressions in CSS [css]
8. Separate JavaScript and CSS into external files [javascript] [css]
9. Reduce DNS queries [content]
10. Compress JavaScript and CSS (including inline) [javascript] [css]
11. Avoid redirects [server]
12. Remove duplicate scripts [javascript]
13. Configure Entity Tags (ETags) [server]
14. Make Ajax cacheable [content]
There is a Firefox plug-in called YSlow, integrated into Firebug, with which you can easily see how your own site performs on each of these points.
This is the result of evaluating my site Xifengfang with YSlow. Sadly it only scores 51 points, hehe. The major Chinese sites don't score high either; I just tested Sina and NetEase and both came out at 31 points. Yahoo (US), on the other hand, really does score 97 points! Clearly Yahoo has put real effort into this. Looking at the 14 rules they have summarized, and the details added since, some of the practices even seem a bit extreme.
Article 1: Minimize the number of HTTP requests as much as possible (Make Fewer HTTP Requests)
HTTP requests carry overhead, so finding ways to make fewer of them naturally speeds the page up. Common approaches are merging CSS and JS (combining a page's CSS files into one, and likewise its JS files), image maps, and CSS sprites. Of course, CSS and JS are often split into multiple files for reasons of structure and sharing. The Alibaba Chinese site's approach is to develop them as separate files and merge the JS and CSS on the back end at release time; the browser still makes only one request, while during development everything can stay split up for easier management and reuse. Yahoo even recommends writing the home page's CSS and JS directly into the page file instead of referencing external files, because the home page gets so many visits that doing so saves two requests. Many domestic portals do the same.
CSS sprites means merging the background images on a page into a single image and then using the CSS background-position property to display just the part you need. Taobao and the Alibaba Chinese site both do this at the moment; if you're interested, take a look at their background images.
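As a minimal sketch of the technique (the file name icons.png, the class names, and the pixel offsets are all made up for illustration):

<style>
/* hypothetical sprite: icons.png holds two 16x16 icons stacked vertically */
.icon        { display: inline-block; width: 16px; height: 16px; background: url(icons.png) no-repeat; }
.icon-home   { background-position: 0 0; }       /* top icon in the sprite */
.icon-search { background-position: 0 -16px; }   /* shift up 16px to show the second icon */
</style>
<span class="icon icon-home"></span> <span class="icon icon-search"></span>

Both icons come out of the same image, so only one HTTP request is needed for the pair.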
http://www.cssssprites.com/ is a tool site that automatically merges the images you upload and gives you the corresponding background-position coordinates, outputting the result in PNG or GIF format.
Article 2: Use a Content Delivery Network
To be honest, I don't know much about CDNs. Simply put, a CDN adds a layer of network architecture on top of the existing Internet and publishes the site's content to the cache servers closest to its users; DNS load balancing then works out where a user is coming from and sends them to the nearest cache server for the content they need. A user in Hangzhou fetches content from a server near Hangzhou, and a user in Beijing from one near Beijing. This effectively cuts the time data spends travelling across the network and improves speed. For more detail, see the entry on CDN in Baidu Baike. Yahoo found that distributing static content to a CDN reduced user response time by 20% or more.
CDN technology diagram:
CDN networking diagram:
Article 3: Add an Expires/Cache-Control header (Add an Expires Header)
More and more images, scripts, CSS and Flash are embedded in our pages, and visiting them inevitably means making many HTTP requests. We can make these files cacheable by setting an Expires header, which tells the browser, through the response headers, how long a given type of file may be kept in its cache. Most images don't need to change after they're published; once they're cached, the browser no longer downloads them from the server at all and reads them straight from the cache, which makes revisiting the page much faster. A typical set of HTTP/1.1 response headers looks like this:
HTTP/1.1 200 OK
Date: Fri, 30 Oct 1998 13:19:41 GMT
Server: Apache/1.3.3 (Unix)
Cache-Control: max-age=3600, must-revalidate
Expires: Fri, 30 Oct 1998 14:19:41 GMT
Last-Modified: Mon, 29 Jun 1998 02:28:12 GMT
ETag: 3e86-410-3596fbbc
Content-Length: 1040
Content-Type: text/html
Cache-Control and Expires can be set through server-side scripts. For example, to expire content after 30 days in PHP:
<?php
header("Cache-Control: must-revalidate");
$offset = 60 * 60 * 24 * 30;
$ExpStr = "Expires: " . gmdate("D, d M Y H:i:s", time() + $offset) . " GMT";
header($ExpStr);
?>
It can also be done in the server's own configuration; I'm not very familiar with that side of things, haha. If you want to know more, see http://www.web-caching.com/
As far as I know, the Expires time on the Alibaba Chinese site is 30 days. There have been problems with it, though: the expiry time for scripts in particular needs careful thought, because after a script's functionality is updated it may otherwise take a long time before clients actually notice the change. I ran into exactly this problem earlier on the suggest project. So think carefully about what should be cached and what shouldn't.
Article 4: Enable Gzip compression (Gzip Components)
The idea of gzip is to compress files on the server before transferring them, which significantly cuts the size of what travels over the wire; once the transfer completes, the browser decompresses the content and uses it. Current browsers support gzip well, and not only browsers but the major crawlers recognize it too, so SEO folks can relax. The compression ratio is also large, typically around 85%, so a 100K page on the server can be squeezed to roughly 25K before being sent to the client. For the details of how gzip compression works, see the article "Gzip compression algorithm" on CSDN. Yahoo particularly emphasizes that all text content should be gzipped: html (php), js, css, xml, txt... Our site does well here and scores an A. Our home page used not to, because it carried a lot of third-party advertising JS, and those advertisers' scripts weren't gzipped, which dragged our page down as well.
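For reference, the negotiation happens through two plain HTTP headers; simplified (not captured from any real site), the exchange looks like this:

Request (browser):  Accept-Encoding: gzip, deflate
Response (server):  Content-Encoding: gzip

The browser announces which encodings it accepts, and the server marks the compressed response so the browser knows to decompress it.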
The three rules above are mostly server-side matters that I understand only superficially; if anything is wrong, corrections are welcome.
Article 5: Put Stylesheets at the Top
Why put CSS at the top of the page? Because browsers such as IE will not render anything until all of the CSS has arrived. The reason is as simple as Brother Ma said: CSS stands for Cascading Style Sheets, and cascading means a later rule can override an earlier one, and a higher-priority rule can override a lower one. (This hierarchy is discussed in the css !important post at the bottom of this article; here it's enough to know that CSS can be overridden.) Since later rules can override earlier ones, it is perfectly reasonable for the browser to wait until the CSS is fully loaded before rendering. In many browsers, IE among them, the problem with putting stylesheets at the bottom of the page is that it blocks the progressive display of content: the browser holds off rendering to avoid having to repaint page elements, so the user sees only a blank page. Firefox doesn't block rendering, but that means some elements may have to be repainted once the stylesheet arrives, which causes flickering. So we should get the CSS loaded as early as possible.
Following this line of thought, there are actually places that could be optimized if we look carefully. For example, this site includes two CSS files: <link rel="stylesheet" rev="stylesheet" href="http://www.space007.com/themes/google/style/google.css" type="text/css" media="screen" /> and <link rel="stylesheet" rev="stylesheet" href="http://www.space007.com/css/print.css" type="text/css" media="print" />. From the media attribute you can see that the first stylesheet is for the screen and the second is the print stylesheet. In terms of user behavior, printing a page can only happen after the page has been displayed, so a better approach would be to add the print CSS to the page dynamically once the page has loaded, which should improve speed. (Ha ha)
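A rough sketch of that idea (just an illustration; assigning window.onload directly is the simplest case and would clobber any other load handler):

<script type="text/javascript">
// once the page has finished loading, append the print stylesheet dynamically
window.onload = function () {
    var link = document.createElement('link');
    link.rel = 'stylesheet';
    link.type = 'text/css';
    link.media = 'print';
    link.href = 'http://www.space007.com/css/print.css';
    document.getElementsByTagName('head')[0].appendChild(link);
};
</script>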
Article 6: Put Scripts at the Bottom
There are two reasons for putting scripts at the bottom of the page.
1. Script execution blocks the rendering of the page. While the page is loading, when the browser reaches a js statement it will interpret and execute it completely before reading the content that follows. If you don't believe it, write an endless js loop and see whether the parts of the page below it ever appear. (setTimeout and setInterval behave a little like multithreading: rendering of the following content continues until the corresponding delay fires.) The browser's logic is that the js might call location.href or some other function that interrupts the page's processing at any moment, so naturally it must wait for the script to finish before continuing. Putting scripts at the end of the page therefore effectively shortens the time it takes for the page's visual elements to appear.
2. Scripts block parallel downloads. The HTTP/1.1 specification suggests that a browser download no more than two components in parallel per hostname (IE enforces 2, and other browsers such as Firefox also default to 2, although the new IE8 can go up to 6). So if you spread your image files across several hosts, you can get more than two parallel downloads. While a script file is downloading, however, the browser will not start any other parallel downloads at all. A combined sketch of rules 5 and 6 follows below.
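Putting rules 5 and 6 together, a minimal page skeleton might look like this (the file names are placeholders):

<html>
<head>
  <!-- stylesheets at the top, so the page can render progressively -->
  <link rel="stylesheet" type="text/css" href="page.css" />
</head>
<body>
  <div id="content">... page content ...</div>
  <!-- scripts at the bottom, so they block neither rendering nor parallel downloads -->
  <script type="text/javascript" src="page.js"></script>
</body>
</html>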
Of course, whether it's feasible to put every script at the bottom of the page is another question for each individual site. The pages of the Alibaba Chinese site, for instance, have inline js in many places, and the page display depends heavily on it. I admit this is a long way from the idea of unobtrusive scripting, but many historical problems aren't so easy to solve.
Article 7: Avoid using Expressions in CSS (Avoid CSS Expressions)
CSS expressions (IE's expression() syntax) embed JavaScript inside a style rule, and the expression is re-evaluated extremely often (on render, resize, scroll and even mouse movement), which is poison for performance, so they should be avoided. The alternative, however, adds two more meaningless levels of nesting, which is definitely not good; a better solution is still needed.
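For reference, an IE-only expression looks roughly like this (a made-up illustration; the selector and the 800px breakpoint are invented):

<style>
/* IE-only expression(): re-evaluated on every render, resize, scroll and mouse move */
#main { width: expression(document.body.clientWidth > 800 ? "800px" : "auto"); }
</style>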
Article 8: Put both JavaScript and CSS in external files (Make JavaScript and CSS External)
I think this one is easy to understand. It's done not only for performance but also for ease of code maintenance. Writing CSS and JS into the page itself saves two requests but increases the page size, and if the external CSS and JS are cached there are no extra HTTP requests on later visits anyway. Of course, as mentioned earlier, for certain special pages developers may still choose to inline the CSS and JS.
Article 9: Reduce DNS Lookups
Domain names and IP addresses on the Internet correspond to one another. A domain name (kuqin.com) is easy for people to remember, but computers don't recognize it; machines identify each other by IP address, and every computer on the network has its own, so the domain name must be translated into an IP address. That translation is called domain name resolution, or a DNS lookup. A DNS lookup takes 20-120 milliseconds, and until it finishes the browser will not download anything under that domain name, so reducing DNS lookups speeds up page loading. Yahoo recommends keeping the number of distinct hostnames on a page to around 2-4, which requires planning the page as a whole. We aren't doing well here at the moment; the various ad-serving systems drag us down.
Article 10: Compress JavaScript and CSS (Minify JavaScript)
The point of compressing JS and CSS is obvious: fewer bytes in the page. With a smaller payload the page naturally loads faster. Besides reducing size, compression also offers a degree of protection, since the code becomes hard to read. We do this well. Common compression tools include JSMin and the YUI Compressor, and sites like http://dean.edwards.name/packer/ provide a very convenient online compressor. On the jQuery site you can see the size difference between the compressed and uncompressed versions of the same js file:
Of course, one downside of compression is that the code's readability is gone. I'm sure many front-end friends have run into this: Google's effects look very cool, but open the source and it's a pile of characters crammed together, with even the function names replaced. Sweat! Wouldn't your own code be a pain to maintain if it looked like that? The current practice across the Alibaba Chinese site is to compress js and css on the server side at release time, so we can still comfortably maintain our own readable code.
Article 11: Avoid Redirects
Not long ago I saw the article "Internet Explorer and Connection Limits" on the IEBlog. To give an example of a redirect: when you enter http://www.kuqin.com without the trailing slash, the server automatically responds with a 301 and redirects you to http://www.kuqin.com/; you can see it in the browser's address bar. This kind of redirect naturally costs time too. That is only one example, of course; redirects happen for many reasons, but what never changes is that every redirect adds another HTTP round trip, so they should be kept to a minimum.
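Simplified, such a 301 response looks roughly like this; the browser then has to issue a second request for the URL in the Location header:

HTTP/1.1 301 Moved Permanently
Location: http://www.kuqin.com/
Content-Type: text/html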
Article 12: Remove Duplicate Scripts
There isn't much I want to say about this one; it's right both from a performance perspective and from a code-standards perspective. But I have to admit that we often add duplicate code simply because we're in a hurry. Perhaps a unified CSS framework and JS framework would solve the problem better. Xiaozhu's point is right: code should not only avoid duplication, it should be reusable.
Article 13: Configure Entity Tags (Configure ETags)
I don't really understand this one either, haha. I found a detailed explanation on InfoQ, "Using ETags to reduce the bandwidth and load of web applications"; interested readers can take a look.
Article 14: Make Ajax Cacheable
Does Ajax need caching too? Don't we often deliberately append a timestamp to Ajax requests precisely to avoid caching? It's important to remember that "asynchronous" does not imply "instantaneous". Even though Ajax responses are generated dynamically and may apply to only a single user, they can still be cached.
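A rough sketch of the idea (the URL, the lastModified token, and the function names are all made up for illustration): instead of appending new Date().getTime() to every request, append a token that changes only when the underlying data changes, so a cached response can be reused.

<script type="text/javascript">
// hypothetical: "lastModified" is supplied by the server and changes only when the data changes
function fetchMailList(lastModified, callback) {
    var xhr = new XMLHttpRequest();
    // same token, same URL: a cached response (served with a far-future Expires header) can be reused
    xhr.open('GET', '/ajax/maillist?v=' + lastModified, true);
    xhr.onreadystatechange = function () {
        if (xhr.readyState === 4 && xhr.status === 200) {
            callback(xhr.responseText);
        }
    };
    xhr.send(null);
}
</script>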