Front-end engineers all know that JavaScript has basic exception handling. We can throw new Error() ourselves, and the browser also throws exceptions when an API call goes wrong. But most front-end engineers have probably never considered collecting this exception information.
After all, as long as a JavaScript error cannot be reproduced after a refresh, the user can solve the problem by refreshing; the browser does not crash, so we can pretend nothing happened. This assumption held before Single Page Apps became popular. Today's Single Page Apps reach an extremely complex state after running for a while. The user may have gone through many input operations to get there, so how can they simply refresh? Wouldn't all the previous work have to be redone? So we still need to capture and analyze this exception information, and then modify the code so that the user experience is not affected.
How to catch exceptions
When we write throw new Error() ourselves, we can certainly catch it if we want to, because we know exactly where the throw is. Exceptions raised by browser API calls are not necessarily so easy to catch. Some APIs are specified in the standard to throw exceptions, while others throw only in individual browsers because of implementation differences or defects. The former we can catch with try-catch; the latter we can only catch by listening for global exceptions.
try-catch
If a browser API is known to throw exceptions, we need to put the call in try-catch so that an error does not leave the whole program in an illegal state. window.localStorage is one such API: it throws an exception when written data exceeds the capacity limit, and the same happens in Safari's private browsing mode.
try {
  localStorage.setItem('date', Date.now());
} catch (error) {
  reportError(error);
}
Another common scenario for try-catch is callbacks. Because the callback's code is outside our control, we don't know how good it is or whether it calls other APIs that throw exceptions. So that an error in a callback does not prevent the code after the call from running, the callback must be invoked inside try-catch.
listeners.forEach(function(listener) {
  try {
    listener();
  } catch (error) {
    reportError(error);
  }
});
window.onerror
For places that try-catch cannot cover, an exception can only be caught through window.onerror.
window.onerror = function(errorMessage, scriptURI, lineNumber) {
  reportError({
    message: errorMessage,
    script: scriptURI,
    line: lineNumber
  });
};
Don't try to be clever and use window.addEventListener or window.attachEvent to listen for the window error event. Many browsers implement only window.onerror, or only their window.onerror implementation follows the standard. Since the standard draft also defines window.onerror, using window.onerror is all we need.
Missing properties
Suppose we have a reportError function that collects caught exceptions and sends them in batches to the server for storage, querying, and analysis. What information do we want to collect? The useful pieces include: error type (name), error message (message), script file address (script), line number (line), column number (column), and stack trace (stack). If an exception is caught through try-catch, all of this information is on the Error object (supported by mainstream browsers), so reportError can collect it. But if it is caught through window.onerror, the handler has only 3 parameters, so everything beyond those 3 parameters is lost.
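As a rough illustration of what such a reportError function might look like, here is a minimal sketch. The createReporter factory, its send callback, and the batchSize threshold are hypothetical names introduced for this example; the article itself does not define how batching works.

```javascript
// Hypothetical sketch: createReporter, send and batchSize are assumptions
// made for illustration, not part of any browser API.
function createReporter(send, batchSize) {
  var queue = [];

  // Hand the collected records to `send` and start a new batch.
  function flush() {
    if (queue.length === 0) return;
    var batch = queue;
    queue = [];
    send(batch);
  }

  // Copy only the fields we care about; missing ones stay undefined.
  function reportError(error) {
    queue.push({
      name: error.name,
      message: error.message,
      script: error.script,
      line: error.line,
      column: error.column,
      stack: error.stack
    });
    if (queue.length >= batchSize) flush();
  }

  reportError.flush = flush;
  return reportError;
}
```

In a browser, send would typically post the batch to the server, and flush could also be called on a timer so that a partially filled batch is not held back indefinitely.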
Serializing messages
If we create the Error object ourselves, then error.message is under our control. Basically, whatever we put into error.message becomes the first parameter (message) of window.onerror. (The browser actually modifies it slightly, for example by adding the 'Uncaught Error: ' prefix.) Therefore, we can serialize the properties we care about (for example with JSON.stringify), store them in error.message, and then read and deserialize them in window.onerror. Of course, this only works for Error objects we create ourselves.
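The serialization idea can be sketched as follows. The helper names createDetailedError and parseErrorMessage are made up for this example, and the parser assumes that any prefix the browser adds does not itself contain a '{' character.

```javascript
// Pack custom fields into error.message as JSON (hypothetical helper).
function createDetailedError(message, details) {
  return new Error(JSON.stringify({ message: message, details: details }));
}

// Recover the fields from the first argument of window.onerror.
// Browsers may prepend text such as 'Uncaught Error: ', so parsing starts
// at the first '{'. Messages that are not JSON are returned unchanged.
function parseErrorMessage(errorMessage) {
  var start = errorMessage.indexOf('{');
  if (start === -1) return { message: errorMessage };
  try {
    return JSON.parse(errorMessage.slice(start));
  } catch (e) {
    return { message: errorMessage };
  }
}
```

Inside window.onerror, calling parseErrorMessage(errorMessage) would then give back the original message and details for errors created this way, and a plain message object for everything else.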
The fifth parameter
Browser vendors also know the limitations people face when using window.onerror, so they started adding new parameters to it. Having a line number but no column number looked asymmetrical, so IE was the first to add the column number as the fourth parameter. What everyone cares about more, however, is getting the complete stack, so Firefox proposed putting the stack in the fifth parameter. Chrome countered that it would be better to put the entire Error object in the fifth parameter, so that any property can be read, including custom ones. Because Chrome moved faster and implemented the new window.onerror signature in Chrome 30, the standard draft ended up with the following signature.
window.onerror = function(
  errorMessage,
  scriptURI,
  lineNumber,
  columnNumber,
  error
) {
  if (error) {
    reportError(error);
  } else {
    reportError({
      message: errorMessage,
      script: scriptURI,
      line: lineNumber,
      column: columnNumber
    });
  }
};
Normalizing properties
The Error property names discussed so far follow Chrome's naming. However, different browsers name Error properties differently. For example, the script file address is called script in Chrome but fileName in Firefox. Therefore, we also need a dedicated function to normalize the Error object, that is, to map the different property names to one unified set. For the specifics, please refer to this article. Browser implementations will change over time, but maintaining such a mapping table by hand is not too difficult.
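A mapping table of this kind might look like the sketch below. The alias lists are illustrative assumptions rather than a complete survey of browsers, and normalizeError is a hypothetical name; the table should be checked against the browsers you actually support.

```javascript
// Illustrative alias table (an assumption, to be verified per browser):
// each unified name maps to the property names it may appear under.
var PROPERTY_ALIASES = {
  script: ['script', 'fileName', 'sourceURL'],
  line: ['line', 'lineNumber'],
  column: ['column', 'columnNumber']
};

// Return a plain object that uses the unified property names.
function normalizeError(error) {
  var normalized = {
    name: error.name,
    message: error.message,
    stack: error.stack
  };
  Object.keys(PROPERTY_ALIASES).forEach(function (target) {
    var aliases = PROPERTY_ALIASES[target];
    for (var i = 0; i < aliases.length; i++) {
      if (error[aliases[i]] !== undefined) {
        normalized[target] = error[aliases[i]];
        break; // the first alias that is present wins
      }
    }
  });
  return normalized;
}
```

Keeping the aliases in data rather than in branching code means a new browser quirk only requires adding one string to the table.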
The stack trace format is a similar case. This property stores the stack at the moment the exception occurred as plain text. Since each browser uses a different text format, we also need to maintain regular expressions that extract the function name (identifier), file (script), line number (line), and column number (column) of each frame from the text.
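To make the idea concrete, here is a minimal sketch that handles two common frame shapes: "at name (file:line:column)" as produced by Chrome, and "name@file:line:column" as produced by Firefox. The parseStack name and both patterns are assumptions for illustration; they cover only simple frames and skip anything they do not recognize, such as eval frames or the leading message line.

```javascript
// Simple frame patterns (assumptions covering only the common cases).
var V8_FRAME = /^\s*at (?:(.+?) \()?(.+?):(\d+):(\d+)\)?\s*$/;
var GECKO_FRAME = /^(.*?)@(.+?):(\d+):(\d+)\s*$/;

// Turn a plain-text stack into an array of frame objects.
function parseStack(stack) {
  var frames = [];
  String(stack).split('\n').forEach(function (text) {
    var match = V8_FRAME.exec(text) || GECKO_FRAME.exec(text);
    if (!match) return; // unrecognized line, e.g. 'Error: message'
    frames.push({
      identifier: match[1] || '', // empty for anonymous frames
      script: match[2],
      line: Number(match[3]),
      column: Number(match[4])
    });
  });
  return frames;
}
```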
Security Limitations
If you have ever run into an error whose message is 'Script error.', you will know what I am talking about: it is the browser's restriction on script files from a different origin. The reason for this security restriction is as follows. Suppose the HTML an online bank returns to a logged-in user differs from the HTML an anonymous user sees. A third-party website could put the bank's URI into a script.src attribute. HTML cannot be parsed as JS, of course, so the browser throws an exception, and the third-party website could tell whether the user is logged in by analyzing where the exception occurred. For this reason, the browser filters all exceptions thrown by script files from a different origin, leaving only a fixed message, 'Script error.'; all other properties disappear.
For websites of a certain scale, it is normal to serve script files from a CDN on a different origin. Nowadays even a small site you build yourself can reference common frameworks such as jQuery and Backbone directly from a public CDN to speed up downloads. So this security restriction does cause trouble: the exception information we collect from Chrome and Firefox is all the useless 'Script error.'.
CORS
To get around this restriction, we only need to make sure the script files and the page itself share the same origin. But wouldn't serving script files from a server without CDN acceleration slow down the user's downloads? One solution is to keep the script files on the CDN, download their content with XMLHttpRequest through CORS, and then create a <script> tag to inject it into the page. Code embedded in the page is of course same-origin.
This is easy to say, but there are many details to get right. Take a simple example:
<script src="http://cdn.com/step1.js"></script>
<script>
(function step2() {})();
</script>
<script src="http://cdn.com/step3.js"></script>
We all know that if step1, step2, and step3 depend on one another, they must execute strictly in this order, or errors may occur. The browser can request the step1 and step3 files in parallel, but it guarantees the order of execution. If we fetch the contents of step1 and step3 ourselves with XMLHttpRequest, we have to guarantee the correct order ourselves. Also, don't forget step2. Once step1 is downloaded in a non-blocking way, step2 could run before it, so we must intervene in step2 as well and make it wait for step1 to finish.
If we already have a complete set of tools that generates the <script> tags for the site's pages, we need to adjust those tools to rewrite the <script> tags like this:
<script>
  scheduleRemoteScript('http://cdn.com/step1.js');
</script>
<script>
  scheduleInlineScript(function code() {
    (function step2() {})();
  });
</script>
<script>
  scheduleRemoteScript('http://cdn.com/step3.js');
</script>
We need to implement the two functions scheduleRemoteScript and scheduleInlineScript, make sure they are defined before the first <script> tag that references an external script file, and then rewrite the remaining <script> tags into the form above. Note that the step2 function, which used to execute immediately, is now wrapped in a larger code function. The code function is never executed; it is only a container that lets the original step2 code be kept as-is without escaping, while preventing it from running immediately.
Next, we need a complete mechanism that ensures the file contents scheduleRemoteScript downloads from each address and the code scheduleInlineScript receives directly are executed one by one in the correct order. I won't give the detailed code here. If you are interested, you can implement it yourself.
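For readers who want a starting point, one possible shape of the ordering mechanism is sketched below. This is not the author's implementation. The createScheduler factory and its two injected callbacks are assumptions: fetchText(url, callback) stands in for a CORS XMLHttpRequest that delivers the file text, and run(source, label) stands in for appending a <script> element whose text is source. The sketch also assumes the inline container is an ordinary function expression whose first '{' opens its body.

```javascript
// Hypothetical sketch: createScheduler, fetchText and run are assumptions.
function createScheduler(fetchText, run) {
  var queue = []; // entries in the order the tags appear in the page
  var next = 0;   // index of the next entry to execute

  // Execute entries strictly in order, stopping at the first one
  // whose source has not arrived yet.
  function drain() {
    while (next < queue.length && queue[next].ready) {
      var entry = queue[next++];
      run(entry.source, entry.label);
    }
  }

  function scheduleRemoteScript(url) {
    var entry = { label: url, source: null, ready: false };
    queue.push(entry);
    fetchText(url, function (source) {
      entry.source = source;
      entry.ready = true;
      drain();
    });
  }

  function scheduleInlineScript(code) {
    // `code` is only a container: take its body as text, never call it.
    var text = code.toString();
    var body = text.slice(text.indexOf('{') + 1, text.lastIndexOf('}'));
    queue.push({ label: 'inline', source: body, ready: true });
    drain();
  }

  return {
    scheduleRemoteScript: scheduleRemoteScript,
    scheduleInlineScript: scheduleInlineScript
  };
}
```

Downloads still run in parallel, because every request starts as soon as it is scheduled; only execution is serialized. Injecting fetchText and run keeps the ordering logic independent of the DOM, so it can be exercised outside a browser.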
Reverse line number lookup
Fetching content through CORS and injecting the code into the page gets around the security restriction, but it introduces a new problem: line number conflicts. Originally, error.script identified a unique script file, and error.line then identified a unique line. Now that all the code is embedded in the page, multiple <script> tags can no longer be told apart by error.script, and the line numbers inside each <script> tag start from 1, so the exception information can no longer tell us where in the source the error occurred.
To avoid line number conflicts, we can waste some line numbers so that the ranges used by the actual code in each <script> tag do not overlap. For example, assuming the actual code in each <script> tag does not exceed 1000 lines, the code in the first <script> tag can occupy lines 1 to 1000, the code in the second lines 1001 to 2000 (with 1000 empty lines inserted before it), the code in the third lines 2001 to 3000 (with 2000 empty lines inserted before it), and so on. We then use data-* attributes to record this information for later lookup.
<script
  data-src="http://cdn.com/step1.js"
  data-line-start="1"
>
// code for step 1
</script>
<script data-line-start="1001">
// '\n' * 1000
// code for step 2
</script>
<script
  data-src="http://cdn.com/step3.js"
  data-line-start="2001"
>
// '\n' * 2000
// code for step 3
</script>
After this processing, if an error has error.line equal to 2005, the actual error.script should be 'http://cdn.com/step3.js' and the actual error.line should be 5. We can perform this reverse lookup in the reportError function mentioned earlier.
Of course, we cannot guarantee that every script file has at most 1000 lines, and some script files may be far shorter than 1000 lines, so there is no need to assign a fixed 1000-line range to every <script> tag. We can assign ranges based on the actual number of lines in each script, as long as the ranges used by the <script> tags do not overlap.
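Both halves of this technique, padding the code and mapping a reported line back, can be sketched in a few lines. The names padToLine and resolveLine and the shape of the ranges array are assumptions for this example; ranges is expected to be sorted by lineStart and to mirror what the data-* attributes record, with src set to null for inline scripts.

```javascript
// Prefix `code` with empty lines so that its first line lands on `lineStart`.
// new Array(n).join('\n') yields n - 1 newline characters.
function padToLine(code, lineStart) {
  return new Array(lineStart).join('\n') + code;
}

// Map a page-level line number back to its script and original line.
// `ranges` is an array of { src, lineStart } sorted by lineStart.
function resolveLine(ranges, line) {
  var found = null;
  for (var i = 0; i < ranges.length; i++) {
    if (ranges[i].lineStart <= line) found = ranges[i]; // last range starting at or before `line`
  }
  if (!found) return null;
  return { script: found.src, line: line - found.lineStart + 1 };
}
```

Because the lookup only relies on where each range starts, it works equally well with fixed 1000-line ranges and with ranges sized to the actual length of each script.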
crossorigin attribute
The security restrictions browsers impose on content from a different origin are of course not limited to <script> tags. Since XMLHttpRequest can get past this restriction through CORS, why shouldn't resources referenced directly through tags be able to do the same? They certainly can.
The restriction on <script> tags referencing script files from a different origin also applies to <img> tags referencing image files from a different origin. If an <img> tag's image comes from a different origin, then once it is drawn onto a <canvas>, the <canvas> becomes write-only, which ensures that a website cannot use JavaScript to steal unauthorized image data from another origin. Later, the <img> tag solved this problem by introducing the crossorigin attribute: crossorigin="anonymous" is equivalent to anonymous CORS, and crossorigin="use-credentials" is equivalent to CORS with credentials.
Since the <img> tag can do this, why can't the <script> tag? So browser vendors added the same crossorigin attribute to the <script> tag to lift the security restriction described above. Chrome and Firefox now support this attribute without problems. Safari, however, treats crossorigin="anonymous" as crossorigin="use-credentials", so if the server supports only anonymous CORS, Safari treats the request as an authentication failure. Because CDN servers are designed to return only static content for performance reasons, they cannot dynamically return the HTTP headers required for credentialed CORS based on each request. In effect, Safari cannot use this feature to solve the problem.
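As a small illustration, a script tag with anonymous CORS can also be created from JavaScript by setting the element's crossOrigin property, which reflects the crossorigin attribute. The appendCrossOriginScript helper is a hypothetical name, and the document object is passed in as doc purely to keep the sketch self-contained. Exception details are only exposed if the CDN also answers with a suitable Access-Control-Allow-Origin response header.

```javascript
// Hypothetical helper: `doc` is the page's document object.
function appendCrossOriginScript(doc, url) {
  var script = doc.createElement('script');
  // Same effect as writing crossorigin="anonymous" on the tag.
  script.crossOrigin = 'anonymous';
  script.src = url;
  doc.head.appendChild(script);
  return script;
}
```

In a page this would be called as appendCrossOriginScript(document, 'http://cdn.com/step1.js').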
Summary
JavaScript exception handling looks simple and no different from other languages, but catching every exception and analyzing its properties is not that easy. Some third-party services now offer Google Analytics-style collection of JavaScript exceptions, but if you want to understand the details and principles, you still have to do it yourself.