Traffic statistics services provide a traffic source feature. Traffic source is a visit-level concept: when a visit is established, the traffic source of the landing page becomes the traffic source of the visit. Although there are many types of traffic source, JS currently offers only two ways to obtain one: document.referrer and window.opener. Worse, window.opener suits few scenarios, and document.referrer is so weak that in many scenarios it cannot determine the source of traffic accurately.
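As a minimal sketch of the two sources named above, the helper below reads both values from a window-like object. The name readTrafficSourceHints and the shape of its return value are illustrative assumptions, not part of any statistics service; taking the window as a parameter only makes the sketch usable outside a browser.

```javascript
// Hypothetical helper: collects the two hints JS can see about where a visit came from.
// `win` is expected to look like a browser `window` (has `document.referrer` and `opener`).
function readTrafficSourceHints(win) {
  const referrer = (win.document && win.document.referrer) || "";
  return {
    // URL of the referring page, or "" when the browser supplied none.
    referrer: referrer,
    // True when this page was opened by another window (e.g. via window.open).
    hasOpener: win.opener != null,
  };
}

// In a browser: readTrafficSourceHints(window)
```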
Overview of document.referrer
document.referrer is meant to track how the browser arrived at a page. When page A is opened, the action that opened it comes either from a user operation or from JS code.
Let’s first take a look at the actions that users may perform when opening page A:
| 1 | Enter the address of A directly in the address bar |
| 2 | Left click link A from page B and jump to page A |
| 3 | Right-click link A from page B to open in a new window |
| 4 | Right-click link A from page B and open it in a new tab |
| 5 | Drag link A to address bar |
| 6 | Drag link A to the tab bar |
| 7 | Use the browser's forward and back buttons |
Note that a link here means a plain <A> tag; a link with an event handler or a target attribute may behave differently.
Ways to open a page from JS:
| 1 | Modify window.location |
| 2 | Use window.open |
| 3 | Click flash |
The lists above cover the ways a client can open the page. In addition, a server-side redirect can also present page A to the visitor.
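For reference, the two script-driven ways of opening a page could be sketched as follows. The wrapper names are made up for illustration, and win stands in for the browser window so the sketch can be exercised with a stub.

```javascript
// Illustrative wrappers around the two script-driven ways of opening page A.
// `win` stands in for the browser `window`; the function names are hypothetical.

// Way 1: modify window.location, replacing the current page with `url`.
function openByLocation(win, url) {
  win.location.href = url;
}

// Way 2: use window.open, which opens `url` in a new window or tab
// and returns a reference to it (or null if the browser blocked it).
function openByWindowOpen(win, url) {
  return win.open(url, "_blank");
}

// In a browser: openByLocation(window, "a.html") or openByWindowOpen(window, "a.html")
```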
Here are the results of a concrete browser test, showing what document.referrer returns in each of the situations above:
| No. | Scenario | IE8.0 | FF3.6 | FF4.0 | Chrome |
| 1 | Enter the address of A directly in the address bar | " " | " " | " " | " " |
| 2 | Left-click link A on page B; page A replaces page B (target='_self') | √ | √ | √ | √ |
| 3 | Left-click link A on page B; A opens in a new window (target='_blank') | √ | √ | √ | √ |
| 4 | Right-click link A on page B and open it in a new window | √ | √ | √ | " " |
| 5 | Right-click link A on page B and open it in a new tab | √ | √ | √ | " " |
| 6 | Drag link A to the address bar | / | " " | " " | " " |
| 7 | Drag link A to the tab bar | " " | " " | " " | " " |
| 8 | Use the browser's forward and back buttons | Keep | Keep | Keep | Keep |
| 9 | Modify window.location to open page A (same domain) | " " | √ | √ | √ |
| 10 | Open page A using window.open | " " | √ | √ | √ |
| 11 | Click flash to open page A | | | | |
| 12 | Server redirect to page A | " " | " " | " " | " " |
Here " " means an empty string, √ means the source page is reported correctly, and Keep means the referrer of a page does not change when the page is revisited with the forward and back buttons. The table shows that document.referrer covers about half of the cases. However, it cannot properly handle some fairly common operations, such as dragging a link to the tab bar or moving forward and back.
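The practical consequence of the table can be sketched in code: an empty referrer cannot be read as "the user typed the address", because several other scenarios also produce an empty string. The classifier below is a hypothetical illustration; its name and its three labels are assumptions, not a standard scheme.

```javascript
// Hypothetical classifier for a visit's traffic source, based only on the referrer string.
// `referrer` is the value of document.referrer; `pageUrl` is the URL of the landing page.
function classifyTrafficSource(referrer, pageUrl) {
  if (referrer === "") {
    // Typed address, dragged link, server redirect, etc. all look the same here.
    return "direct-or-unknown";
  }
  let refHost;
  try {
    refHost = new URL(referrer).host;
  } catch (e) {
    return "direct-or-unknown"; // unparsable referrer: treat like a missing one
  }
  const pageHost = new URL(pageUrl).host;
  return refHost === pageHost ? "internal" : "external";
}

// In a browser: classifyTrafficSource(document.referrer, window.location.href)
```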
Source of document.referrer
When the browser requests page A from the server, it sends an HTTP request whose header can carry a Referer field. After receiving the request, the server can read the Referer from the header to determine which page the visitor came from.
Generally, whatever Referer the browser sends in the header when requesting A becomes the value of document.referrer once page A is loaded. For example, if the header of the request for page A carries http://localhost/Test/b.html as its Referer, the document.referrer of A is http://localhost/Test/b.html.
If the header contains no Referer, document.referrer is an empty string.
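The mapping from request header to document.referrer can be sketched from the server's point of view. The function name is an assumption for illustration; the headers argument is assumed to be a plain object with lowercased keys, which is how Node's http module exposes request headers.

```javascript
// Sketch: derive the value page A would see as document.referrer from the request headers.
// `headers` is assumed to be a plain object with lowercased keys (as in Node's req.headers).
function expectedDocumentReferrer(headers) {
  // The HTTP header is spelled "Referer", unlike the DOM property document.referrer.
  const value = headers["referer"];
  // A missing header corresponds to an empty document.referrer.
  return typeof value === "string" ? value : "";
}
```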
About HTTPS Request
If you click an HTTPS link on an ordinary HTTP page, the Referer is attached to the HTTPS request header, and the HTTPS page can still use document.referrer to obtain the URL of the HTTP page.
Similarly, if you click an HTTPS link on another HTTPS page, the Referer is attached to the request header.
However, if you click an HTTP link on an HTTPS page, the HTTP request header unfortunately carries no information about the HTTPS page, probably as a protection measure for the HTTPS page.
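The protocol cases can be summarized in a small predicate, assuming the default behavior in which only a request from an HTTPS page to a plain HTTP URL loses its Referer. The function name is hypothetical, and the sketch ignores any browser or page settings that change this default.

```javascript
// Models only the protocol cases described in this section; it ignores
// browser or page settings that can alter the default. The name is hypothetical.
function refererSentAcrossProtocols(fromUrl, toUrl) {
  const from = new URL(fromUrl).protocol; // "http:" or "https:"
  const to = new URL(toUrl).protocol;
  // Only the downgrade from a secure page to a non-secure one drops the Referer.
  return !(from === "https:" && to === "http:");
}
```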
Forged Referer information
As described above, document.referrer is derived from the Referer in the request header. So to change the value of document.referrer, in theory you only need to modify the request header: replace the existing Referer with the value you want, or add a Referer if there was none.
On the client side, tampering with the header is very easy. Before a page's HTTP request is sent, a packet-interception tool can intercept it, inspect the header, and modify the Referer.
A quick search shows that in Firefox the Referer can easily be modified with the RefControl plug-in. In short, faking a traffic source is a breeze.
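To illustrate how little effort a forged Referer takes outside a browser, the sketch below builds request options for Node's http.request with a Referer chosen freely by the client. The function name and the example URLs are assumptions for illustration, and the sketch handles plain http URLs only.

```javascript
// Sketch: build options for Node's http.request with a Referer chosen by the client.
// The function name is illustrative; only plain http:// targets are handled.
function buildForgedRequestOptions(targetUrl, fakeReferer) {
  const target = new URL(targetUrl);
  return {
    hostname: target.hostname,
    port: Number(target.port) || 80, // empty port means the http default
    path: target.pathname + target.search,
    method: "GET",
    // Nothing ties this value to a page the client actually visited.
    headers: { Referer: fakeReferer },
  };
}

// Usage (not executed here):
//   const http = require("http");
//   const options = buildForgedRequestOptions("http://localhost/Test/a.html",
//                                             "http://example.com/fake.html");
//   http.request(options).end();
```

The server receiving such a request has no way to tell this Referer from a genuine one.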
Forced Page Refresh
Shortly after finishing this article, I realized one way of jumping to a page was missing: forcing a refresh to a specified page with a meta tag in the HTML. For example, write the following in b.html:
<meta http-equiv="Refresh" content="5;URL=a.html">
After 5 seconds, the browser automatically requests a.html from the server.
In testing, IE8 and FF3.6 through FF4.0 send no Referer in this case, whereas Chrome adds b.html to the header as the Referer.
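To make the content attribute of the meta tag concrete, the sketch below splits it into the delay and the target URL. The parser name is hypothetical and it only handles the simple "seconds;URL=target" form used above, not every variant browsers accept.

```javascript
// Hypothetical parser for the content attribute of <meta http-equiv="Refresh">.
// Handles the simple "seconds;URL=target" form; returns null for anything else.
function parseMetaRefresh(content) {
  const match = /^\s*(\d+)\s*;\s*url\s*=\s*(.+?)\s*$/i.exec(content);
  if (!match) return null;
  return { delaySeconds: Number(match[1]), url: match[2] };
}

// parseMetaRefresh("5;URL=a.html") yields a 5 second delay and the target "a.html".
```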