This article introduces Java generics. It aims to give readers a clear and accurate understanding of how they work, and it lays the foundation for the next article, "Re-understanding Java Reflection".
Introduction
Generics are an important part of Java and are used throughout the Java Collections Framework. In this article we look at the design of Java generics from the ground up, including wildcards and the often frustrating topic of type erasure.
Generic Basics
Generic Classes
Let's first define a simple Box class:
    public class Box {
        private String object;

        public void set(String object) { this.object = object; }
        public String get() { return object; }
    }

This is the most common approach. Its drawback is that Box can hold only String elements. If we later need to hold another type, such as Integer, we have to write a second Box class. The code cannot be reused, and generics solve this problem neatly:
    public class Box<T> {
        // T stands for "Type"
        private T t;

        public void set(T t) { this.t = t; }
        public T get() { return t; }
    }

Now our Box class can be reused, and we can replace T with any reference type we want:
    Box<Integer> integerBox = new Box<Integer>();
    Box<Double> doubleBox = new Box<Double>();
    Box<String> stringBox = new Box<String>();
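As a small illustration of what the type parameter buys us, the sketch below repeats a minimal copy of Box<T> and reads values back without any cast. The BoxDemo class and its roundTrip method are hypothetical names introduced only for this example; the diamond operator (<>) it uses requires Java 7 or later.

```java
// Minimal copy of the Box<T> class from above so the example compiles on its own.
class Box<T> {
    private T t;

    public void set(T t) { this.t = t; }
    public T get() { return t; }
}

// Hypothetical demo class, not part of the original article.
class BoxDemo {
    // Stores a value in a Box<Integer> and reads it back without a cast.
    static int roundTrip(int value) {
        Box<Integer> integerBox = new Box<>(); // diamond: T is inferred as Integer
        integerBox.set(value);                 // the int is autoboxed to Integer
        // integerBox.set("text");             // would not compile: String is not an Integer
        return integerBox.get();
    }

    public static void main(String[] args) {
        Box<String> stringBox = new Box<>();
        stringBox.set("hello");
        String s = stringBox.get();            // no cast needed
        System.out.println(s + " " + roundTrip(42));
    }
}
```

The commented-out line is the interesting part: the mistake is rejected at compile time instead of surfacing later as a failed cast.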
Generic Methods
Having looked at generic classes, let's turn to generic methods. Declaring a generic method is simple: put a type parameter list such as <K, V> before the return type:
    public class Util {
        public static <K, V> boolean compare(Pair<K, V> p1, Pair<K, V> p2) {
            return p1.getKey().equals(p2.getKey()) &&
                   p1.getValue().equals(p2.getValue());
        }
    }

    public class Pair<K, V> {
        private K key;
        private V value;

        public Pair(K key, V value) {
            this.key = key;
            this.value = value;
        }

        public void setKey(K key) { this.key = key; }
        public void setValue(V value) { this.value = value; }
        public K getKey() { return key; }
        public V getValue() { return value; }
    }

We can call a generic method like this:
    Pair<Integer, String> p1 = new Pair<>(1, "apple");
    Pair<Integer, String> p2 = new Pair<>(2, "pear");
    boolean same = Util.<Integer, String>compare(p1, p2);
Or we can leave out the explicit type arguments and let the compiler infer them (the diamond operator <> used here requires Java 7 or later):
    Pair<Integer, String> p1 = new Pair<>(1, "apple");
    Pair<Integer, String> p2 = new Pair<>(2, "pear");
    boolean same = Util.compare(p1, p2);
Bounded Type Parameters
Now suppose we want a function that counts the elements of a generic array that are greater than a given element. We might write it like this:
    public static <T> int countGreaterThan(T[] anArray, T elem) {
        int count = 0;
        for (T e : anArray)
            if (e > elem) // compiler error
                ++count;
        return count;
    }

But this is wrong. The > operator applies only to primitive numeric types such as short, int, double, long, float, byte, and char, so it cannot be used on an arbitrary type T, and the compiler reports an error. How do we solve this? The answer is to use a bounded type parameter.
The Comparable interface declares a single method:

    public interface Comparable<T> {
        public int compareTo(T o);
    }

A declaration like the following tells the compiler that the type parameter T stands for a class that implements Comparable<T>, and therefore that it has at least a compareTo method:
    public static <T extends Comparable<T>> int countGreaterThan(T[] anArray, T elem) {
        int count = 0;
        for (T e : anArray)
            if (e.compareTo(elem) > 0)
                ++count;
        return count;
    }

Wildcards
Before looking at wildcards, we must first clarify one point. Taking the Box class defined above, suppose we add a method like this:
    public void boxTest(Box<Number> n) { /* ... */ }

What arguments does Box<Number> n accept? Can we pass a Box<Integer> or a Box<Double>? The answer is no. Although Integer and Double are subclasses of Number, there is no subtype relationship between Box<Integer> or Box<Double> and Box<Number>: generic types are invariant. This is very important, and a complete example will make it concrete.
First, we define a few simple classes, and we will use them below:
    class Fruit {}
    class Apple extends Fruit {}
    class Orange extends Fruit {}

In the following example we create a generic class Reader. In f1(), the call Fruit f = fruitReader.readExact(apples); makes the compiler report an error, because there is no subtype relationship between List<Fruit> and List<Apple>.
    import java.util.Arrays;
    import java.util.List;

    public class GenericReading {
        static List<Apple> apples = Arrays.asList(new Apple());
        static List<Fruit> fruit = Arrays.asList(new Fruit());

        static class Reader<T> {
            T readExact(List<T> list) {
                return list.get(0);
            }
        }

        static void f1() {
            Reader<Fruit> fruitReader = new Reader<Fruit>();
            // Error: readExact(List<Fruit>) cannot be applied to List<Apple>.
            // Fruit f = fruitReader.readExact(apples);
        }

        public static void main(String[] args) {
            f1();
        }
    }

Intuitively there is a clear relationship between Apple and Fruit, yet the compiler does not carry it over to List<Apple> and List<Fruit>. How can we express that relationship in generic code? With wildcards:
    // These members go inside GenericReading, replacing f1() and main().
    static class CovariantReader<T> {
        T readCovariant(List<? extends T> list) {
            return list.get(0);
        }
    }

    static void f2() {
        CovariantReader<Fruit> fruitReader = new CovariantReader<Fruit>();
        Fruit f = fruitReader.readCovariant(fruit);
        Fruit a = fruitReader.readCovariant(apples);
    }

    public static void main(String[] args) {
        f2();
    }

This tells the compiler that the readCovariant method of fruitReader accepts a list of any subclass of Fruit (including Fruit itself), so the subclass relationship carries over to the list types.
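The same idea answers the earlier Box<Number> question: a parameter declared with an upper-bounded wildcard accepts every parameterization whose type argument is a subtype of the bound. The sketch below shows this with lists of numbers; WildcardSum and sum are hypothetical names chosen for the example.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical demo class, not part of the original article.
class WildcardSum {
    // Accepts List<Integer>, List<Double>, List<Number>, and so on.
    // A parameter of type List<Number> would accept only List<Number>.
    static double sum(List<? extends Number> numbers) {
        double total = 0.0;
        for (Number n : numbers) { // every element is at least a Number
            total += n.doubleValue();
        }
        return total;
    }

    public static void main(String[] args) {
        List<Integer> ints = Arrays.asList(1, 2, 3);
        List<Double> doubles = Arrays.asList(0.5, 1.5);
        System.out.println(sum(ints));
        System.out.println(sum(doubles));
    }
}
```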
PECS Principles
We saw <? extends T> above. It lets us get elements from the list, but can we also add elements to it? Let's try:
    import java.util.ArrayList;
    import java.util.List;

    public class GenericsAndCovariance {
        public static void main(String[] args) {
            // Wildcards allow covariance:
            List<? extends Fruit> flist = new ArrayList<Apple>();
            // Compile error: can't add any type of object:
            // flist.add(new Apple());
            // flist.add(new Orange());
            // flist.add(new Fruit());
            // flist.add(new Object());
            flist.add(null); // Legal but uninteresting
            // We know that it returns at least Fruit:
            Fruit f = flist.get(0);
        }
    }

The answer is no; the Java compiler does not allow it. Why? Consider the problem from the compiler's point of view. A List<? extends Fruit> flist could refer to any of these:
    List<? extends Fruit> flist = new ArrayList<Fruit>();
    List<? extends Fruit> flist = new ArrayList<Apple>();
    List<? extends Fruit> flist = new ArrayList<Orange>();
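The compiler cannot tell which of these flist actually refers to. Adding an Apple would be wrong if it were an ArrayList<Orange>, and adding a Fruit would be wrong if it were an ArrayList<Apple>. Therefore a collection declared with <? extends T> can only act as a Producer that supplies (get) elements; it cannot act as a Consumer that accepts (add) them.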
What should we do if we want to add elements? We can use <? super T>:
    import java.util.ArrayList;
    import java.util.List;

    public class GenericWriting {
        static List<Apple> apples = new ArrayList<Apple>();
        static List<Fruit> fruit = new ArrayList<Fruit>();

        static <T> void writeExact(List<T> list, T item) {
            list.add(item);
        }

        static void f1() {
            writeExact(apples, new Apple());
            writeExact(fruit, new Apple());
        }

        static <T> void writeWithWildcard(List<? super T> list, T item) {
            list.add(item);
        }

        static void f2() {
            writeWithWildcard(apples, new Apple());
            writeWithWildcard(fruit, new Apple());
        }

        public static void main(String[] args) {
            f1();
            f2();
        }
    }

This lets us add elements to the container. The drawback of super is that we can no longer get elements out as a useful type; they come back only as Object. The reason is simple. Again, consider the compiler's point of view. A List<? super Apple> list could refer to any of these:
    List<? super Apple> list = new ArrayList<Apple>();
    List<? super Apple> list = new ArrayList<Fruit>();
    List<? super Apple> list = new ArrayList<Object>();
If we try to read an Apple from list, what comes out may be any Fruit, such as an Orange, or even an arbitrary Object, so the compiler cannot treat it as an Apple.
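The asymmetry can be checked with a short sketch. It uses Integer and Number instead of the fruit classes so that it compiles on its own; SuperWildcardDemo and addAndReadBack are hypothetical names introduced for the example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class, not part of the original article.
class SuperWildcardDemo {
    // Adds two Integers through a "? super Integer" view, then reads the first element back.
    static Object addAndReadBack(List<? super Integer> sink) {
        sink.add(1);                    // OK: every List<? super Integer> can hold an Integer
        sink.add(2);
        // Integer first = sink.get(0); // would not compile: the element type is unknown
        Object first = sink.get(0);     // reading is allowed, but only as Object
        return first;
    }

    public static void main(String[] args) {
        List<Number> numbers = new ArrayList<>();
        List<Object> objects = new ArrayList<>();
        System.out.println(addAndReadBack(numbers));
        System.out.println(addAndReadBack(objects));
    }
}
```

Writing is fully typed while reading degrades to Object, which is the mirror image of the <? extends T> case.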
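Based on the examples above, we can summarize the rule known as PECS, "Producer Extends, Consumer Super": use <? extends T> when a collection only produces elements (you read from it), use <? super T> when it only consumes elements (you write to it), and use no wildcard when you need to do both.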
Reading the Java collections source code, we find that the two are often used together, as in the following:
    public class Collections {
        public static <T> void copy(List<? super T> dest, List<? extends T> src) {
            for (int i = 0; i < src.size(); i++)
                dest.set(i, src.get(i));
        }
    }

Type Erasure
Perhaps the most frustrating aspect of Java generics is type erasure, especially for programmers with C++ experience. Type erasure means that generics are used only for static type checking at compile time; the compiler then erases the corresponding type information from the code it generates. As a result, at run time the JVM does not know the specific type that a type parameter stands for. The reason is that generics were introduced in Java 1.5, and erasure keeps generic code backward compatible with earlier non-generic code. If you read the source code of the Java Collections Framework, you can see that some classes do not really use generics internally.
So what does erasure actually mean? Let's look at a simple example:
    public class Node<T> {
        private T data;
        private Node<T> next;

        public Node(T data, Node<T> next) {
            this.data = data;
            this.next = next;
        }

        public T getData() { return data; }
        // ...
    }

After the compiler finishes type checking, the code above is effectively converted to:
    public class Node {
        private Object data;
        private Node next;

        public Node(Object data, Node next) {
            this.data = data;
            this.next = next;
        }

        public Object getData() { return data; }
        // ...
    }

This means that whether we declare Node<String> or Node<Integer>, at run time the JVM sees only the raw type Node, whose data field is an Object. Can we influence this? We can set a bound ourselves and change the code to the following:
    public class Node<T extends Comparable<T>> {
        private T data;
        private Node<T> next;

        public Node(T data, Node<T> next) {
            this.data = data;
            this.next = next;
        }

        public T getData() { return data; }
        // ...
    }

This way the compiler replaces each occurrence of T with Comparable instead of the default Object:
    public class Node {
        private Comparable data;
        private Node next;

        public Node(Comparable data, Node next) {
            this.data = data;
            this.next = next;
        }

        public Comparable getData() { return data; }
        // ...
    }

The concept above is fairly easy to grasp, but erasure causes many more problems than this. Next, let's look systematically at some of them. Some of these problems may not arise with C++ templates, but in Java you need to be extra careful.
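Before moving on, the two erasures shown above can be observed directly through reflection. The sketch below is only an illustration: PlainNode, BoundedNode, and ErasureDemo are hypothetical stand-ins for the two Node variants, reduced to the single data field.

```java
import java.lang.reflect.Field;

// Stand-in for the unbounded Node<T>: T erases to Object.
class PlainNode<T> {
    private T data;
}

// Stand-in for Node<T extends Comparable<T>>: T erases to Comparable.
class BoundedNode<T extends Comparable<T>> {
    private T data;
}

// Hypothetical demo class, not part of the original article.
class ErasureDemo {
    // Returns the runtime (erased) type of the field named "data" in the given class.
    static Class<?> erasedFieldType(Class<?> cls) {
        try {
            Field field = cls.getDeclaredField("data");
            return field.getType();
        } catch (NoSuchFieldException e) {
            throw new IllegalArgumentException("no field named data in " + cls, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(erasedFieldType(PlainNode.class));   // the Object class
        System.out.println(erasedFieldType(BoundedNode.class)); // the Comparable interface
    }
}
```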
Question 1
Arrays of generic types cannot be created in Java. Code like the following makes the compiler report an error:
List<Integer>[] arrayOfLists = new List<Integer>[2]; // compile-time error
Why doesn't the compiler allow this? Once more, let's think about it from the compiler's point of view.
Let's first look at the following example:
    Object[] strings = new String[2];
    strings[0] = "hi";   // OK
    strings[1] = 100;    // An ArrayStoreException is thrown.
This code is easy to understand: a String array cannot store an integer. Such errors are detected only when the code runs; the compiler cannot catch them. Now let's see what would happen if Java allowed generic arrays to be created:
    Object[] stringLists = new List<String>[2];  // compiler error, but pretend it's allowed
    stringLists[0] = new ArrayList<String>();    // OK
    // An ArrayStoreException should be thrown, but the runtime can't detect it.
    stringLists[1] = new ArrayList<Integer>();
Suppose generic arrays could be created. Because type information is erased at run time, the JVM cannot tell new ArrayList<String>() from new ArrayList<Integer>() at all. If such an error occurred in a real application, it would be very hard to detect.
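The hazard can be reproduced today by getting around the restriction with an unchecked cast. The sketch below is a deliberately unsafe illustration, not a recommended technique; GenericArrayPollution and readFailsLate are hypothetical names. It shows that the bad store succeeds silently and the failure appears only later, at the point where an element is read.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class, not part of the original article.
class GenericArrayPollution {
    // Returns true if reading from the polluted array fails with a ClassCastException.
    @SuppressWarnings("unchecked")
    static boolean readFailsLate() {
        // Unchecked cast used as a workaround; at run time this is just a List[].
        List<String>[] stringLists = (List<String>[]) new List[1];
        Object[] objects = stringLists;       // array covariance, always legal

        List<Integer> ints = new ArrayList<>();
        ints.add(42);
        objects[0] = ints;                    // no ArrayStoreException: it is a List

        try {
            String s = stringLists[0].get(0); // the compiler-inserted cast to String fails here
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(readFailsLate());
    }
}
```

When an array of lists is not strictly required, a List<List<String>> avoids the problem entirely.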
If you are still skeptical of this, you can try running the following code:
    import java.util.ArrayList;

    public class ErasedTypeEquivalence {
        public static void main(String[] args) {
            Class<?> c1 = new ArrayList<String>().getClass();
            Class<?> c2 = new ArrayList<Integer>().getClass();
            System.out.println(c1 == c2); // true
        }
    }

Question 2
Let's continue with the Node class from above. For generic code, the Java compiler quietly generates a bridge method for us.
    public class Node<T> {
        public T data;

        public Node(T data) { this.data = data; }

        public void setData(T data) {
            System.out.println("Node.setData");
            this.data = data;
        }
    }

    public class MyNode extends Node<Integer> {
        public MyNode(Integer data) { super(data); }

        public void setData(Integer data) {
            System.out.println("MyNode.setData");
            super.setData(data);
        }
    }

After reading the analysis above, you might think that after type erasure the compiler turns Node and MyNode into the following:
    public class Node {
        public Object data;

        public Node(Object data) { this.data = data; }

        public void setData(Object data) {
            System.out.println("Node.setData");
            this.data = data;
        }
    }

    public class MyNode extends Node {
        public MyNode(Integer data) { super(data); }

        public void setData(Integer data) {
            System.out.println("MyNode.setData");
            super.setData(data);
        }
    }

In fact, this is not the case. Consider the following code first. Running it throws a ClassCastException, reporting that String cannot be cast to Integer:
    MyNode mn = new MyNode(5);
    Node n = mn;            // A raw type - compiler throws an unchecked warning
    n.setData("Hello");     // Causes a ClassCastException to be thrown.
    // Integer x = mn.data;

If the erased code above were what the compiler really generated, line 3 should not fail (note that line 4 is commented out). MyNode has no setData(String data) method, so the call could only resolve to setData(Object data) in the parent class Node. A String can certainly be treated as an Object, so how does the ClassCastException arise?
In fact, the Java compiler automatically adds a method to MyNode, so the class really looks like this:
    class MyNode extends Node {
        // Bridge method generated by the compiler
        public void setData(Object data) {
            setData((Integer) data);
        }

        public void setData(Integer data) {
            System.out.println("MyNode.setData");
            super.setData(data);
        }
        // ...
    }

This is why the error occurs: in setData((Integer) data), a String cannot be cast to Integer. So when the compiler gives the unchecked warning for line 2 above, we should not ignore it; otherwise the problem surfaces only at run time. Had we declared Node<Integer> n = mn in the first place, the compiler would have caught the error for us.
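The generated bridge method can be observed with reflection, since java.lang.reflect.Method has an isBridge() method. The sketch below uses trimmed copies of Node and MyNode without the print statements; BridgeMethodDemo and countBridgeMethods are hypothetical names introduced for the example.

```java
import java.lang.reflect.Method;

// Trimmed copy of the article's Node<T>.
class Node<T> {
    public T data;

    public Node(T data) { this.data = data; }

    public void setData(T data) { this.data = data; }
}

// Trimmed copy of the article's MyNode.
class MyNode extends Node<Integer> {
    public MyNode(Integer data) { super(data); }

    @Override
    public void setData(Integer data) { super.setData(data); }
}

// Hypothetical demo class, not part of the original article.
class BridgeMethodDemo {
    // Counts the compiler-generated bridge methods declared directly in the given class.
    static int countBridgeMethods(Class<?> cls) {
        int count = 0;
        for (Method m : cls.getDeclaredMethods()) {
            if (m.isBridge()) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // Lists both setData methods of MyNode and marks which one is the bridge.
        for (Method m : MyNode.class.getDeclaredMethods()) {
            System.out.println(m + " bridge=" + m.isBridge());
        }
    }
}
```

MyNode declares only one setData in source, yet reflection reports two: the one we wrote, taking Integer, and the synthetic one taking Object.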
Question 3
As mentioned above, Java generics largely provide only static type checking, after which the type information is erased. The compiler therefore rejects the following attempt to create an instance from a type parameter:
    public static <E> void append(List<E> list) {
        E elem = new E(); // compile-time error
        list.add(elem);
    }

But what should we do if we need to create an instance from a type parameter in some scenario? Reflection can solve this problem:
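    public static <E> void append(List<E> list, Class<E> cls) throws Exception {
        // Class.newInstance() is deprecated since Java 9; use the no-argument constructor instead.
        E elem = cls.getDeclaredConstructor().newInstance(); // OK
        list.add(elem);
    }

We can call it like this: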
    List<String> ls = new ArrayList<>();
    append(ls, String.class);
This problem can also be solved with the Factory and Template Method design patterns. Interested readers may want to look at the discussion of creating instances of types in Chapter 15 of Thinking in Java; we will not go into it here.
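As a rough idea of what the factory approach can look like, the sketch below passes a java.util.function.Supplier (Java 8 or later) instead of a Class object. This is an assumption about one possible shape of the pattern, not the version presented in Thinking in Java; FactoryAppend is a hypothetical name.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

// Hypothetical demo class, not part of the original article.
class FactoryAppend {
    // The caller supplies the factory, so no reflection and no checked exceptions are involved.
    static <E> void append(List<E> list, Supplier<E> factory) {
        E elem = factory.get();
        list.add(elem);
    }

    public static void main(String[] args) {
        List<StringBuilder> builders = new ArrayList<>();
        append(builders, StringBuilder::new);          // constructor reference as the factory

        List<String> strings = new ArrayList<>();
        append(strings, () -> "created by a lambda");  // any lambda returning E works

        System.out.println(builders.size() + " " + strings.get(0));
    }
}
```

Compared with the Class<E> version, the factory does not require E to have an accessible no-argument constructor, because the caller decides how the instance is built.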
Question 4
We cannot use the instanceof keyword directly with parameterized types, because the compiler erases all generic type information when it generates code. As we verified above, at run time the JVM cannot tell ArrayList<Integer> from ArrayList<String>:
    public static <E> void rtti(List<E> list) {
        if (list instanceof ArrayList<Integer>) { // compile-time error
            // ...
        }
    }

The set of parameterized types that could be passed to rtti is { ArrayList<Integer>, ArrayList<String>, LinkedList<Character>, ... }, and at run time the first two are indistinguishable. We can use an unbounded wildcard to solve this problem:
    public static void rtti(List<?> list) {
        if (list instanceof ArrayList<?>) { // OK; instanceof requires a reifiable type
            // ...
        }
    }

Summary
That is everything this article set out to cover about Java generics. I hope it is helpful to everyone.