Detailed explanation of Set, List, Map in Java by using more and learning more

Author：Eve Cole Update Time：2025-04-15 23:16:01

For a long time, the data list that is used more frequently in the code is mainly List, and they are all ArrayList. It feels like this thing is enough. ArrayList is a wrapper tool class used to implement dynamic arrays, so that when writing code, you can pull in and out, iterate and traverse, which is quite convenient.

I don’t know when I started to slowly start tool classes such as HashMap and HashSet often appear in the code. It should be said that HashMap is more common, and it is also a classic interview question, so I will read more of it in daily life. When I first started using it, I simply understood it as a key-value corresponding table, and it is more convenient to use keys to find data. After further investigation, I found out

This thing has some secrets, especially after the new version of JDK changes HashMap to a tree, the code is a bit complicated.

I started using Set less, but I accidentally found a TreeSet in a code. I found that this class can come with smoothness. It felt quite interesting. I gradually discovered that this is also a good tool.

If you write too much code, you will feel the importance of the basics, so here I write a short article to briefly organize some knowledge about the collection.

OK, let's sort it out briefly:

•List: That is, a list, which supports the functions of arrays and linked lists, and is generally linear.
•Map: It is a mapping table, which stores the corresponding relationship between keys and values.
•Set: means set, mainly used to sort data and sort

Let's take a look at List first

List is a window for storing linear data, such as: ArrayList for arrays and LinkedList for linked lists.

ArrayList

This is an array list, but it provides automatic expansion function to realize the List interface. External operations are accessed through the interface declaration method, which is safe and convenient.

The key to ArrayList is automatic capacity expansion. The initial capacity can be set when the object is initialized or the default capacity can be measured. If the array size is not particularly clear, the initial size can be not specified. If it is clear, a size can be specified, which reduces the lag caused by dynamic expansion. Speaking of this, we need to talk about how the expansion is implemented. Look at the following code:

 private void grow(int minCapacity) { // overflow-conscious code int oldCapacity = elementData.length; int newCapacity = oldCapacity + (oldCapacity >> 1); if (newCapacity - minCapacity < 0) newCapacity = minCapacity; if (newCapacity - MAX_ARRAY_SIZE > 0) newCapacity = hugeCapacity(minCapacity); // minCapacity is usually close to size, so this is a win: elementData = Arrays.copyOf(elementData, newCapacity); }

grow is a method that triggers ArrayList when adding elements or some easy checks. Main process:

1. Get the length of the array and move it right, which is equivalent to oldCapacity/2, and get the new length

2. If this length is less than the minimum capacity, it is easy to use it directly.

3. If it is greater than the maximum, take a maximum value. A hugeCapacity method will be called here, mainly to compare minCapacity with MAX_ARRAY_SIZE. If minCapacity is greater than MAX_ARRAY_SIZE, take Integer.MAX_VALUE, otherwise take MAX_ARRAY_SIZE. Interestingly, MAX_ARRAY_SIZE takes Integer.MAX_VALUE - 8; I don’t know what the meaning of doing this is

4. Finally, call a copy method to copy the existing number into a new array.

Because of this copying process, if the array is relatively large, then the expansion will certainly cause lag. So if you know the maximum value from the beginning and it is easy to grow to this value, then specifying the size when starting initialization will have a certain effect.

LinkedList

This is a tool class for linked lists. The advantages of linked lists are that they add and delete things faster, but they will find them slower.

As for the code, it seems that nothing special is that it is linked together by a string of pointers. Of course, Java uses objects instead to create a Node object. The Node itself points to the previous Node and the next Node. This is the structure of the linked list:

 private static class Node<E> { E item; Node<E> next; Node<E> prev; Node(Node<E> prev, E element, Node<E> next) { this.item = element; this.next = next; this.prev = prev; } }

Then use two Nodes to point to the head and tail and it is done. The following code:

 /** * Pointer to first node. * Invariant: (first == null && last == null) || * (first.prev == null && first.item != null) */ transient Node<E> first; /** * Pointer to last node. * Invariant: (first == null && last == null) || * (last.next == null && last.item != null) */ transient Node<E> last;

See an add operation:

 /** * Links e as last element. */ void linkLast(E e) { final Node<E> l = last; final Node<E> newNode = new Node<>(l, e, null); last = newNode; if (l == null) first = newNode; else l.next = newNode; size++; modCount++; }

The past is:

1. Get the last Node and put it in l

2. Create a new Node and fetch the data into this Node. The creation process will point the prev of the new Node to l, so that it will be connected to the chain.

3. Then point last to this new Node

4. Then determine whether l is null. If null is null, it means that it is an empty linked list. The new node is the first element. In this way, the first must also point to newNode

5. If it is not empty, point the next of l to newNode

6. Accumulation counter

The deletion operation is also the front and back Node pointing to the move operation of this Node.

Let's take a look at the map

Map is an application of a mapping table for keys and values. The main implementation classes: HashMap, HashTable, TreeMap

HashMap and HashTable

HashMap is the one that uses the hash algorithm for key-value mapping. HashTable is a thread-safe class with synchronization. This is the main difference between them. The principle is similar, and they are all achieved through bucket + chain combination. The bucket is used to store keys, and the value needs to be stored in a linked list due to the Hash collision.

•The significance of bucket lies in efficiency, and can be positioned in one step through Hash calculations.
•The meaning of linked lists is to access the data of repeated hash

The specific principle I wrote a "Study Notes: Hashtable and HashMap" before

But I just saw that JDK1.8’s HashMap has changed its storage structure and adopts a red and black tree structure. This may solve the efficiency of linked list search? No detailed study was conducted.

TreeMap

After reading the code of TreeMap, I found that I still use the tree structure, red and black trees. Since the red and black trees are ordered, they naturally have the sorting function. Of course, you can also specify the comparison method through the comparator to achieve specific sorting.

Because the tree structure is used to store, it will be more troublesome to add and delete data. Let’s take a look at the put code:

 public V put(K key, V value) { Entry<K,V> t = root; if (t == null) { compare(key, key); // type (and possible null) check root = new Entry<>(key, value, null); size = 1; modCount++; return null; } int cmp; Entry<K,V> parent; // split comparator and comparable paths Comparator<? super K> cpr = comparator; if (cpr != null) { do { parent = t; cmp = cpr.compare(key, t.key); if (cmp < 0) t = t.left; else if (cmp > 0) t = t.right; else return t.setValue(value); } while (t != null); } else { if (key == null) throw new NullPointerException(); @SuppressWarnings("unchecked") Comparable<? super K> k = (Comparable<? super K>) key; do { parent = t; cmp = k.compareTo(t.key); if (cmp < 0) t = t.left; else if (cmp > 0) t = t.right; else return t.setValue(value); } while (t != null); } Entry<K,V> e = new Entry<>(key, value, parent); if (cmp < 0) parent.left = e; else parent.right = e; fixAfterInsertion(e); size++; modCount++; return null; }

1. First check whether the root node exists. If it does not exist, it means that it is the first piece of data and is directly used as the root of the tree.

2. Determine whether there is a comparator. If there is, use the comparator to find the storage location of the data. If the result of the comparator returns is less than 0, take the left, and take the right, otherwise replace the value of the current node directly.

3. If there is no comparator, the key is directly compared with the key of the node. The comparison is the same as the previous method.

4. The next step is to create a child node on the found parent and put it in the left or right child nodes.

5. fixAfterInsertion is to color nodes

6. Accumulator processing

It will also be a bit troublesome when removing the data. In addition to deleting the data, you also need to rebalance the red and black trees.

In addition, TreeMap implements the NavigableMap<K,V> interface, so it also provides some return operations on the data set.

Finally, take a look at Set

Set mainly has two types of applications: HashSet and TreeSet.

HashSet

The literal meaning is very clear, using a collection of Hash. The characteristic of this collection is to use the Hash algorithm to store data, so the data is not duplicated and the access is relatively fast. How did it be done?

 public boolean add(E e) { return map.put(e, PRESENT)==null; }

It turns out that there is a map object. Let’s look at what the map is?

 private transient HashMap<E,Object> map;

It is a HashMap. Those who know HashMap will understand that such data will not be repeated. Because the object itself is stored as a key when deposited, only one copy will exist in the HashMap.

After understanding this and other things, you will understand very well.

TreeSet

This set is used to sort the set, which means that in addition to the ability to sort heavy, it can also have its own sorting function. But after looking at the code of TreeSet, I found that it was implemented in the basics of TreeMap. More precisely, it should be a derived class of NavigableMap. TreeSet is based on TreeMap without specifying map by default.

 public TreeSet() { this(new TreeMap<E,Object>()); }

So, what we may pay more attention to here is how TreeSet is heavy-duty? Let's take a look at the method of add:

 public boolean add(E e) { return m.put(e, PRESENT)==null; }

It is somewhat similar to HashSet, both of which are based on the features of Map to achieve heavy loading. It's really simple and effective.

The above is the detailed explanation of Set, List, and Map in Java that the editor brings to you. I hope you can support Wulin.com more~