If you were asked to optimize the Java code you write, where would you start? This article introduces four techniques that can improve both system performance and code readability. If that interests you, read on.
Most of our day-to-day programming amounts to applying the same set of techniques to different projects, and in most cases those techniques get the job done. Some projects, however, call for special techniques, so engineers have to dig deeper to find the simplest methods that are still effective. In a previous article, we discussed four special techniques that can be used when necessary to create better Java software; in this article we introduce some general design strategies and implementation techniques that help solve common problems, namely:
Only do purposeful optimization
Use enums for constants wherever possible
Override the equals() method in your classes
Use polymorphism as much as possible
Note that the techniques described in this article do not apply in every case. When and where to use them requires careful judgment.
1. Only do purposeful optimization
Large software systems have to care a great deal about performance. We would all like to write the most efficient code possible, but when it comes to optimizing, we often do not know where to start. For example, will the following code affect performance?
public void processIntegers(List<Integer> integers) {
    for (Integer value : integers) {
        for (int i = integers.size() - 1; i >= 0; i--) {
            value += integers.get(i);
        }
    }
}
It depends. The algorithm above is O(n²) (in big-O notation), where n is the size of the list. If n is only 5, there is no problem: only 25 iterations are performed. If n is 100,000, performance may suffer. Even then, we cannot be certain there will be a problem. Although the method requires 10 billion logical iterations, whether that actually affects performance remains open to discussion.
For example, if the client runs this code in its own thread and waits asynchronously for the calculation to complete, the execution time may be acceptable. Similarly, if the system is deployed in production but no client ever calls this method, there is no need to optimize it, because it contributes nothing to the system's overall running time. In that case, optimizing would make the system more complex without making it any faster.
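To make the trade-off concrete, here is a minimal sketch of what optimizing the nested loop above could look like. It assumes the intent is to add the list's total to every element and to return the results (the original method discards them); the class name SumAdder and both method names are hypothetical. The first method mirrors the quadratic loop, while the second precomputes the total once and runs in linear time.

```java
import java.util.ArrayList;
import java.util.List;

class SumAdder {
    // Quadratic: re-walks the whole list for every element, like the loop above.
    static List<Long> addTotalQuadratic(List<Integer> integers) {
        List<Long> result = new ArrayList<>();
        for (Integer value : integers) {
            long updated = value;
            for (int i = integers.size() - 1; i >= 0; i--) {
                updated += integers.get(i);
            }
            result.add(updated);
        }
        return result;
    }

    // Linear: computes the total once, then reuses the precomputed value.
    static List<Long> addTotalLinear(List<Integer> integers) {
        long total = 0;
        for (int value : integers) {
            total += value;
        }
        List<Long> result = new ArrayList<>();
        for (int value : integers) {
            result.add(value + total);
        }
        return result;
    }
}
```

Both methods return the same values; for the list [1, 2, 3] each returns [7, 8, 9]. Whether the rewrite is worth doing still depends on the performance requirements and on measurements, not on the big-O difference alone.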
There is no free lunch: to reduce running cost we usually reach for techniques such as caching, loop unrolling, or precomputed values, all of which increase the complexity of the system and reduce the readability of the code. If an optimization really improves performance, the added complexity is worth it, but before making that decision you must know two things:
What are the performance requirements?
Where is the performance bottleneck?
First, we need to know exactly what the performance requirements are. If the system already performs within those requirements and end users have raised no objections, there is no need to optimize. When new features are added or the data volume grows beyond a certain scale, however, the system must be optimized, or problems may arise.
Second, finding the bottleneck should not be left to intuition or code inspection. Even experienced developers such as Martin Fowler are prone to making the wrong optimizations, as he explains in his book Refactoring (page 70):
If you analyze enough programs, you find something interesting about performance: most of the time is spent in a small fraction of the code. If you optimize all the code equally, 90% of the optimizations are wasted, because the code you optimized is not run very often. Time spent optimizing without a target is wasted time.
As battle-hardened developers, we should take this view seriously. Optimizing on a first guess not only fails to improve the system's performance, it also means that 90% of the development time is completely wasted. Instead, we should run common use cases in production (or a pre-production environment) and profile the system to find out which parts consume resources during execution. If, for example, 10% of the code consumes most of the resources, then optimizing the other 90% is a waste of time.
Armed with the profiling results, we should start with the most common cases, because that ensures the effort actually improves the system's performance. After each optimization, the profiling step should be repeated. This not only confirms that performance has really improved, it also shows where the bottlenecks are now (once one bottleneck is resolved, others may account for a larger share of the system's resources). Note that the percentage of time spent in the remaining bottlenecks is likely to go up, since they are unchanged for the moment while the overall execution time shrinks as the targeted bottleneck is eliminated.
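The measure, optimize, and measure-again loop described above can be sketched with a very small timing helper. This is only a crude stand-in for a real profiler, and the class SectionTimer and all of its method names are hypothetical; it accumulates wall-clock time per named section and reports which section dominates.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Crude stand-in for a profiler: accumulates wall-clock time per named section. */
class SectionTimer {
    private final Map<String, Long> nanosBySection = new LinkedHashMap<>();

    /** Runs the task and adds its elapsed time to the named section. */
    void time(String section, Runnable task) {
        long start = System.nanoTime();
        task.run();
        record(section, System.nanoTime() - start);
    }

    /** Adds an already-measured duration to the named section. */
    void record(String section, long nanos) {
        nanosBySection.merge(section, nanos, Long::sum);
    }

    /** Fraction of all recorded time spent in the section (0.0 if nothing was recorded). */
    double share(String section) {
        long total = 0;
        for (long nanos : nanosBySection.values()) {
            total += nanos;
        }
        if (total == 0) {
            return 0.0;
        }
        long sectionNanos = nanosBySection.getOrDefault(section, 0L);
        return (double) sectionNanos / total;
    }

    /** Name of the section with the largest accumulated time, or null if empty. */
    String hottest() {
        String best = null;
        long bestNanos = -1;
        for (Map.Entry<String, Long> entry : nanosBySection.entrySet()) {
            if (entry.getValue() > bestNanos) {
                best = entry.getKey();
                bestNanos = entry.getValue();
            }
        }
        return best;
    }
}
```

A caller would wrap each candidate hotspot in time(...), optimize only the section that hottest() reports, and then run the same measurements again to see how the shares have shifted.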
Although a full treatment of profiling Java systems would take a great deal of space, there are some very common tools that can help find performance hotspots, including JMeter, AppDynamics, and YourKit. DZone's performance monitoring guide also has more information on optimizing the performance of Java programs.
Although performance is a very important aspect of many large software systems, and is often part of the automated test suite in the product delivery pipeline, it should not be optimized blindly and without purpose. Instead, optimizations should target performance bottlenecks that have actually been identified. This keeps us from adding needless complexity to the system and from wasting time on optimizations that do not pay off.
2. Try to use enums for constants
There are many scenarios in which we need to enumerate a set of predefined or constant values, such as the HTTP response codes a web application may encounter. One of the most common implementation techniques is to create a class containing a number of static final values, each of which should have a comment describing what it means:
public class HttpResponseCodes {
    public static final int OK = 200;
    public static final int NOT_FOUND = 404;
    public static final int FORBIDDEN = 403;
}

if (getHttpResponse().getStatusCode() == HttpResponseCodes.OK) {
    // Do something if the response code is OK
}
This is already a good start, but it still has some disadvantages:
Incoming integer values are not strictly validated
Because the status code is a primitive type, no methods can be called on it
In the first case, a constant is created merely to name a particular integer value, but nothing restricts the methods or variables that accept it, so the value actually used may fall outside the defined set. For example:
public class HttpResponseHandler {
    public static void printMessage(int statusCode) {
        System.out.println("Received status of " + statusCode);
    }
}

HttpResponseHandler.printMessage(15000);
Although 15000 is not a valid HTTP response code, nothing on the server side forces the client to supply a valid integer. In the second case, we have no way to define methods on the status code. For example, to check whether a given status code is a success code, we must define a separate function:
public class HttpResponseCodes {
    public static final int OK = 200;
    public static final int NOT_FOUND = 404;
    public static final int FORBIDDEN = 403;

    public static boolean isSuccess(int statusCode) {
        return statusCode >= 200 && statusCode < 300;
    }
}

if (HttpResponseCodes.isSuccess(getHttpResponse().getStatusCode())) {
    // Do something if the response code is a success code
}
To solve these problems, we need to change the constant's type from a primitive type to a custom type, and allow only specific instances of that custom type. This is exactly what Java enums are for. Using an enum, we can solve both problems at once:
public enum HttpResponseCodes {
    OK(200),
    FORBIDDEN(403),
    NOT_FOUND(404);

    private final int code;

    HttpResponseCodes(int code) {
        this.code = code;
    }

    public int getCode() {
        return code;
    }

    public boolean isSuccess() {
        return code >= 200 && code < 300;
    }
}

if (getHttpResponse().getStatusCode().isSuccess()) {
    // Do something if the response code is a success code
}
Likewise, a method can now require that the status code passed to it is a valid one:
public class HttpResponseHandler {
    public static void printMessage(HttpResponseCodes statusCode) {
        System.out.println("Received status of " + statusCode.getCode());
    }
}

HttpResponseHandler.printMessage(HttpResponseCodes.OK);
Note that this example shows that enums are the right choice for a fixed set of constants; it does not mean enums should be used in all circumstances. In some cases we want a constant to represent one particular value while other values are also allowed. For example, everyone knows about pi, and we can use a constant to capture this value (and reuse it):
public class NumericConstants {
    public static final double PI = 3.14;
    // Area of a circle with radius 1: PI * r * r with r = 1
    public static final double UNIT_CIRCLE_AREA = PI * 1 * 1;
}

public class Rug {
    private final double area;

    public Rug(double area) {
        this.area = area;
    }

    public double getCost() {
        return area * 2;
    }
}

// Create a rug that is 4 feet in diameter (radius of 2 feet);
// its area is 2 * 2 = 4 times the area of the unit circle
Rug fourFootRug = new Rug(4 * NumericConstants.UNIT_CIRCLE_AREA);
Therefore, the rule for using enums can be summarized as:
When all possible discrete values are known in advance, use an enum
Take the HTTP response codes above as an example: we know all the values an HTTP status code can take (they are listed in RFC 7231, which defines the HTTP 1.1 protocol), so an enum is used. With the rug, we cannot know all possible area values in advance (any double is valid), but we still want a constant for circular rugs to make the calculation simpler and easier to read; therefore a set of constants is defined.
If you cannot know all possible values in advance but still want fields or methods attached to each value, the simplest approach is to create a new class to represent the data. No rule says that enums are always or never appropriate; the key to deciding when to use one is whether all the values are known in advance and any other value should be prohibited.
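At a system boundary the status code usually still arrives as a raw integer, so it has to be converted to the enum somewhere. The sketch below shows one way to do that, assuming a lookup method named fromCode, which is hypothetical and not part of the original enum; unknown values such as 15000 are rejected at the boundary instead of travelling through the system.

```java
import java.util.Optional;

enum HttpResponseCodes {
    OK(200),
    FORBIDDEN(403),
    NOT_FOUND(404);

    private final int code;

    HttpResponseCodes(int code) {
        this.code = code;
    }

    int getCode() {
        return code;
    }

    boolean isSuccess() {
        return code >= 200 && code < 300;
    }

    /** Hypothetical helper: maps a raw integer to a constant, or empty if it is unknown. */
    static Optional<HttpResponseCodes> fromCode(int code) {
        for (HttpResponseCodes candidate : values()) {
            if (candidate.code == code) {
                return Optional.of(candidate);
            }
        }
        return Optional.empty();
    }
}
```

Here HttpResponseCodes.fromCode(200) yields OK, while HttpResponseCodes.fromCode(15000) yields an empty Optional that the caller has to handle explicitly.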
3. Override the equals() method in the class
Object identity can be a difficult problem: are two objects the same if they occupy the same location in memory? If their ids are the same? Or only if all their fields are equal? Although each class has its own identity logic, there are many places in a system that need to decide whether two objects are equal. For example, here is a class that represents a purchase order...
public class Purchase {
    private long id;

    public long getId() {
        return id;
    }

    public void setId(long id) {
        this.id = id;
    }
}
...and there are bound to be many places in the code that look like this:
Purchase originalPurchase = new Purchase();
Purchase updatedPurchase = new Purchase();

if (originalPurchase.getId() == updatedPurchase.getId()) {
    // Execute some logic for equal purchases
}
The more often this logic is repeated (which also violates the DRY principle), the more widely knowledge of the Purchase class's identity is scattered. If the identity logic of the Purchase class changes for some reason (for example, the type of the identifier changes), there will be many places where the identity logic has to be updated.
Rather than spreading the identity logic of the Purchase class throughout the system, we should internalize it inside the class. At first glance, we could create a new method, such as isSame, that takes a Purchase object as its parameter and compares the ids of the two objects:
public class Purchase {
    private long id;

    public long getId() {
        return id;
    }

    public boolean isSame(Purchase other) {
        return getId() == other.getId();
    }
}
Although this is a workable solution, it ignores functionality built into Java: the equals method. Every class in Java inherits from Object, even if only implicitly, and therefore inherits the equals method. By default, this method checks object identity (the same object in memory), as shown in the following snippet from the Object class definition in the JDK (version 1.8.0_131):
public boolean equals(Object obj) {
    return (this == obj);
}
The equals method is the natural place to put identity logic (by overriding the default implementation):
public class Purchase {
    private long id;

    public long getId() {
        return id;
    }

    public void setId(long id) {
        this.id = id;
    }

    @Override
    public boolean equals(Object other) {
        if (this == other) {
            return true;
        } else if (!(other instanceof Purchase)) {
            return false;
        } else {
            return ((Purchase) other).getId() == getId();
        }
    }
}
Although this equals method looks complicated, equals only accepts a parameter of type Object, so there are just three cases to consider:
The other object is the current object (i.e. originalPurchase.equals(originalPurchase)); by definition they are the same object, so return true
The other object is not a Purchase object; in this case we cannot compare Purchase ids, so the two objects are not equal
The other object is a different object but is an instance of Purchase; whether the two are equal then depends on whether the current Purchase's id equals the other Purchase's id
Now we can refactor our earlier condition as follows:
Purchase originalPurchase = new Purchase();
Purchase updatedPurchase = new Purchase();

if (originalPurchase.equals(updatedPurchase)) {
    // Execute some logic for equal purchases
}
Besides reducing duplication in the system, overriding the default equals method has other advantages. For example, if we build a list of Purchase objects and check whether the list contains another Purchase object with the same id (a different object in memory), we get true, because the two objects are considered equal:
List<Purchase> purchases = new ArrayList<>();
purchases.add(originalPurchase);
purchases.contains(updatedPurchase); // True
In general, wherever you need to decide whether two objects are equal, you should use the overridden equals method. If we want the identity check that the default equals inherited from Object performs, we can use the == operator instead:
if (originalPurchase == updatedPurchase) {
    // The two objects are the same object in memory
}
Note also that once equals is overridden, the hashCode method should be overridden as well. For more on the relationship between the two methods and how to define hashCode correctly, see this thread.
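As a minimal sketch of that pairing, the variant of Purchase below derives both equals and hashCode from the id. It assumes the id is fixed at construction time (the constructor and the final field are assumptions, not part of the class shown earlier), because changing a field used by hashCode while the object sits in a hash-based collection would make it hard to find again. PurchaseDemo is a hypothetical driver class.

```java
import java.util.HashSet;
import java.util.Set;

class Purchase {
    private final long id; // assumed immutable for this sketch

    Purchase(long id) {
        this.id = id;
    }

    long getId() {
        return id;
    }

    @Override
    public boolean equals(Object other) {
        if (this == other) {
            return true;
        }
        if (!(other instanceof Purchase)) {
            return false; // also covers other == null
        }
        return ((Purchase) other).id == id;
    }

    @Override
    public int hashCode() {
        // Equal ids must produce equal hash codes
        return Long.hashCode(id);
    }
}

class PurchaseDemo {
    public static void main(String[] args) {
        Set<Purchase> purchases = new HashSet<>();
        purchases.add(new Purchase(42L));
        // A different object with the same id is found in the set
        System.out.println(purchases.contains(new Purchase(42L))); // prints "true"
    }
}
```

Without the hashCode override, the HashSet lookup would generally fail even though equals reports the two purchases as equal, because the two objects would usually land in different hash buckets.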
As we have seen, overriding the equals method not only internalizes the identity logic inside the class and reduces the spread of that logic throughout the system, it also allows the Java language to make well-informed decisions about the class.
4. Use polymorphism as much as possible
Conditional statements are a very common construct in any programming language, and they exist for good reason: different combinations of conditions let us change the behavior of the system based on a given value or the current state of an object. Suppose we need to work with several kinds of bank accounts, each with its own interest rate; we could write the following code:
public enum BankAccountType {
    CHECKING,
    SAVINGS,
    CERTIFICATE_OF_DEPOSIT;
}

public class BankAccount {
    private final BankAccountType type;

    public BankAccount(BankAccountType type) {
        this.type = type;
    }

    public double getInterestRate() {
        switch (type) {
            case CHECKING: return 0.03; // 3%
            case SAVINGS: return 0.04; // 4%
            case CERTIFICATE_OF_DEPOSIT: return 0.05; // 5%
            default: throw new UnsupportedOperationException();
        }
    }

    public boolean supportsDeposits() {
        switch (type) {
            case CHECKING: return true;
            case SAVINGS: return true;
            case CERTIFICATE_OF_DEPOSIT: return false;
            default: throw new UnsupportedOperationException();
        }
    }
}
Although the code above meets the basic requirements, it has an obvious flaw: behavior is determined solely by checking the type of the account. Not only must the account type be checked before every decision, the same check has to be repeated each time a decision is made. In the design above, for example, the check appears in both methods. This can get out of hand, especially when a new account type has to be added.
Instead of distinguishing accounts by type, we can use polymorphism to make the decision implicitly. To do this, we turn the concrete BankAccount class into an interface and push the decisions down into a set of concrete classes, one for each type of bank account:
public interface BankAccount {
    public double getInterestRate();
    public boolean supportsDeposits();
}

public class CheckingAccount implements BankAccount {
    @Override
    public double getInterestRate() {
        return 0.03;
    }

    @Override
    public boolean supportsDeposits() {
        return true;
    }
}

public class SavingsAccount implements BankAccount {
    @Override
    public double getInterestRate() {
        return 0.04;
    }

    @Override
    public boolean supportsDeposits() {
        return true;
    }
}

public class CertificateOfDepositAccount implements BankAccount {
    @Override
    public double getInterestRate() {
        return 0.05;
    }

    @Override
    public boolean supportsDeposits() {
        return false;
    }
}
This not only encapsulates the information specific to each account in its own class, it also lets us change the design in two important ways. First, to add a new bank account type, we only need to create a new concrete class that implements the BankAccount interface and provides implementations of the two methods. In the conditional design, we would have to add a new value to the enum, add a new case to the switch statement in both methods, and insert the logic for the new account under each case.
Second, to add a new method to the BankAccount interface, we only need to add the new method to each concrete class. In the conditional design, we would have to copy the existing switch statement into our new method and add the logic for each account type under each case.
Mathematically, whether we create a new method or add a new type, we have to make the same number of logical changes in the polymorphic and the conditional design. For example, if we add a new method in the polymorphic design, we must add it to the concrete classes of all n bank account types; in the conditional design, we must add n new case statements to the new method. If we add a new account type in the polymorphic design, we must implement all m methods of the BankAccount interface; in the conditional design, we must add a new case statement to each of the m existing methods.
Although the number of changes is the same, their nature is completely different. In the polymorphic design, if we add a new account type and forget to implement a method, the compiler reports an error because the class does not implement every method of the BankAccount interface. In the conditional design there is no such check to ensure that every type has a case statement: if a new type is added, we can simply forget to update one of the switch statements. The more often the switch statement is repeated, the worse this problem becomes. We are human and we make mistakes, so whenever we can rely on the compiler to point out our errors, we should.
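The difference in how the two designs fail can be sketched as follows. The money market account type and its 4.5% rate are made up for illustration, and the conditional class is renamed ConditionalBankAccount here only so that both designs fit in one listing. In the polymorphic half, leaving out a method stops compilation; in the conditional half, the forgotten case compiles cleanly and only fails when the code runs.

```java
interface BankAccount {
    double getInterestRate();
    boolean supportsDeposits();
}

/** Hypothetical new account type; the 4.5% rate is an illustrative assumption. */
class MoneyMarketAccount implements BankAccount {
    @Override
    public double getInterestRate() {
        return 0.045;
    }

    // Deleting this method makes the class fail to compile,
    // so the omission is caught before the code ever runs.
    @Override
    public boolean supportsDeposits() {
        return true;
    }
}

enum BankAccountType {
    CHECKING, SAVINGS, CERTIFICATE_OF_DEPOSIT, MONEY_MARKET
}

class ConditionalBankAccount {
    private final BankAccountType type;

    ConditionalBankAccount(BankAccountType type) {
        this.type = type;
    }

    double getInterestRate() {
        switch (type) {
            case CHECKING: return 0.03;
            case SAVINGS: return 0.04;
            case CERTIFICATE_OF_DEPOSIT: return 0.05;
            // MONEY_MARKET was forgotten here; the code still compiles
            // and the mistake only surfaces at runtime.
            default: throw new UnsupportedOperationException();
        }
    }
}
```

Calling getInterestRate() on a ConditionalBankAccount created with BankAccountType.MONEY_MARKET throws UnsupportedOperationException, whereas the same oversight in MoneyMarketAccount would have been rejected by the compiler.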
The second important note about these two designs is that they are equivalent externally. For example, if we want to check the interest rate for a checking account, the conditional design will look like this:
BankAccount checkingAccount = new BankAccount(BankAccountType.CHECKING);
System.out.println(checkingAccount.getInterestRate()); // Output: 0.03
The polymorphic design, by contrast, looks like this:
BankAccount checkingAccount = new CheckingAccount();
System.out.println(checkingAccount.getInterestRate()); // Output: 0.03
From the outside, we are simply calling getInterestRate() on a BankAccount object. This becomes even more obvious if we abstract the creation process into a factory class:
public class ConditionalAccountFactory {
    public static BankAccount createCheckingAccount() {
        return new BankAccount(BankAccountType.CHECKING);
    }
}

public class PolymorphicAccountFactory {
    public static BankAccount createCheckingAccount() {
        return new CheckingAccount();
    }
}

// In both cases, we create the accounts using a factory
BankAccount conditionalCheckingAccount = ConditionalAccountFactory.createCheckingAccount();
BankAccount polymorphicCheckingAccount = PolymorphicAccountFactory.createCheckingAccount();

// In both cases, the call to obtain the interest rate is the same
System.out.println(conditionalCheckingAccount.getInterestRate()); // Output: 0.03
System.out.println(polymorphicCheckingAccount.getInterestRate()); // Output: 0.03
Replacing conditional logic with polymorphic classes is so common that methods for refactoring conditional statements into polymorphic classes have been published. Here is a simple example. In addition, Martin Fowler's Refactoring (p. 255) describes the process of performing this refactoring in detail.
As with the other techniques in this article, there is no hard and fast rule for when to move from conditional logic to polymorphic classes, and we do not recommend doing so in every situation. In Test-Driven Development: By Example, for instance, Kent Beck designs a simple currency system with the goal of using polymorphic classes, finds that this makes the design too complicated, and reworks it into a non-polymorphic style. Experience and sound judgment will determine when it is the right time to convert conditional code into polymorphic code.
Conclusion
As programmers, the conventional techniques we use day to day solve most problems, but sometimes we should break out of the routine and actively look for something better. After all, expanding the breadth and depth of our knowledge as developers not only lets us make smarter decisions, it also makes us smarter.