Analyzing the source code of the ArrayList and LinkedList list structures in Java

Author：Eve Cole Update Time：2025-03-28 06:00:02

1. ArrayList source code analysis (JDK7)

ArrayList maintains a dynamic Object array internally. Dynamically adding to and deleting from an ArrayList means dynamically adding to and deleting from this array.

1. ArrayList construction and initialization

ArrayList instance variables:

    // ArrayList default capacity
    private static final int DEFAULT_CAPACITY = 10;

    // Default empty Object array, used to define an empty ArrayList
    private static final Object[] EMPTY_ELEMENTDATA = {};

    // The Object array in which ArrayList stores its elements
    private transient Object[] elementData;

    // The number of elements in the ArrayList
    private int size;

ArrayList constructor:

No-argument constructor: constructs a list backed by an empty Object[]

    public ArrayList() {
        super();
        this.elementData = EMPTY_ELEMENTDATA;
    }

Constructor with a specified capacity:

    public ArrayList(int initialCapacity) {
        super();
        if (initialCapacity < 0)
            throw new IllegalArgumentException("Illegal Capacity: " + initialCapacity);
        this.elementData = new Object[initialCapacity];
    }

Constructor from a collection that implements the Collection interface:

    public ArrayList(Collection<? extends E> c) {
        elementData = c.toArray();
        size = elementData.length;
        // c.toArray might (incorrectly) not return Object[] (see 6260652)
        if (elementData.getClass() != Object[].class)
            elementData = Arrays.copyOf(elementData, size, Object[].class);
    }

This also explains the role of Collection, and the reason why the Java Collections Framework defines the Collection interface instead of directly using List, Set, and other interfaces.
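As a small illustration of why accepting a Collection is convenient, the sketch below seeds an ArrayList from a TreeSet. The class name CopyConstructorDemo and the helper toList are hypothetical names chosen for this example; only the ArrayList(Collection) constructor itself comes from the JDK.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;
import java.util.TreeSet;

final class CopyConstructorDemo {
    // Any Collection can seed the list; the elements arrive in the
    // order returned by the source collection's iterator.
    static List<Integer> toList(Collection<Integer> source) {
        return new ArrayList<>(source);
    }

    public static void main(String[] args) {
        Collection<Integer> sorted = new TreeSet<Integer>(Arrays.asList(3, 1, 2));
        System.out.println(toList(sorted)); // prints [1, 2, 3]
    }
}
```

The same call works for a Set, a Deque, or another List, because the constructor only relies on toArray() from the Collection interface.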

2. Capacity allocation mechanism of ArrayList

Capacity limit of ArrayList: the capacity of an ArrayList has an upper limit. In theory, a capacity of up to Integer.MAX_VALUE - 8 can be allocated. In practice, how much can be allocated depends on the heap settings, which are configured through VM parameters.

 private static final int MAX_ARRAY_SIZE = Integer.MAX_VALUE - 8;

The capacity is expanded when the add method is called:

    public boolean add(E e) {
        ensureCapacityInternal(size + 1);  // Increments modCount!!
        elementData[size++] = e;
        return true;
    }

The ensureCapacityInternal(int) method actually determines a minimum expansion size.

    private void ensureCapacityInternal(int minCapacity) {
        if (elementData == EMPTY_ELEMENTDATA) {
            minCapacity = Math.max(DEFAULT_CAPACITY, minCapacity);
        }
        ensureExplicitCapacity(minCapacity);
    }

    private void ensureExplicitCapacity(int minCapacity) {
        modCount++;
        // overflow-conscious code
        if (minCapacity - elementData.length > 0)
            grow(minCapacity);
    }

About modCount: modCount is defined in the abstract class AbstractList. The source code comments explain its use: it counts the number of times the list has been structurally modified (changes to the number of elements), and iterators use it to detect such changes during traversal. It guards against one thread iterating while another thread modifies the structure of the list, and it also catches a single thread that modifies the list directly in the middle of its own iteration.
The grow method performs the actual expansion:

    private void grow(int minCapacity) {
        // overflow-conscious code
        int oldCapacity = elementData.length;
        int newCapacity = oldCapacity + (oldCapacity >> 1);
        if (newCapacity - minCapacity < 0)
            newCapacity = minCapacity;
        if (newCapacity - MAX_ARRAY_SIZE > 0)
            newCapacity = hugeCapacity(minCapacity);
        // minCapacity is usually close to size, so this is a win:
        elementData = Arrays.copyOf(elementData, newCapacity);
    }

There is also a hugeCapacity method, which decides the capacity when the computed value exceeds MAX_ARRAY_SIZE:

    private static int hugeCapacity(int minCapacity) {
        if (minCapacity < 0) // overflow
            throw new OutOfMemoryError();
        return (minCapacity > MAX_ARRAY_SIZE) ?
            Integer.MAX_VALUE :
            MAX_ARRAY_SIZE;
    }
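To make the growth arithmetic concrete, here is a minimal sketch that replays the formula from grow() without touching a real ArrayList. GrowthSketch, nextCapacity, and capacitiesFor are hypothetical names introduced for illustration, and the MAX_ARRAY_SIZE branch is deliberately left out, so this is a simplification of the JDK 7 logic rather than a copy of it.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper that mirrors the arithmetic of grow(); not part of the JDK. */
final class GrowthSketch {
    static final int DEFAULT_CAPACITY = 10;

    // Same formula as grow(): old capacity plus half of it, but at least minCapacity.
    // Assumption: the MAX_ARRAY_SIZE / hugeCapacity branch is omitted here.
    static int nextCapacity(int oldCapacity, int minCapacity) {
        int newCapacity = oldCapacity + (oldCapacity >> 1);
        if (newCapacity - minCapacity < 0) {
            newCapacity = minCapacity;
        }
        return newCapacity;
    }

    // Capacities an initially empty list passes through while n elements are added one by one.
    static List<Integer> capacitiesFor(int n) {
        List<Integer> steps = new ArrayList<>();
        int capacity = 0; // the empty list starts with a zero-length array
        for (int size = 0; size < n; size++) {
            int minCapacity = size + 1;
            if (capacity == 0) {
                minCapacity = Math.max(DEFAULT_CAPACITY, minCapacity);
            }
            if (minCapacity - capacity > 0) {
                capacity = nextCapacity(capacity, minCapacity);
                steps.add(capacity); // each step corresponds to one array copy
            }
        }
        return steps;
    }

    public static void main(String[] args) {
        System.out.println(capacitiesFor(100)); // prints [10, 15, 22, 33, 49, 73, 109]
    }
}
```

Adding 100 elements one at a time therefore costs seven allocations under this model, whereas constructing the list with an initial capacity of 100 would need only one.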

Summarize:
Each expansion is accompanied by a copy of the array, so giving the right capacity at a time will improve performance a little.
The following figure is the entire expansion process I summarized:

3. ArrayList iterator

ArrayList has two main iterators, Itr and ListItr, and an ArrayListSpliterator was also added in JDK 1.8. Let's look at the source code of Itr and ListItr in turn.

(1) Itr: can only traverse forward

    private class Itr implements Iterator<E> {
        int cursor;       // index of next element to return
        int lastRet = -1; // index of last element returned; -1 if no such
        // expectedModCount is a copy of modCount
        int expectedModCount = modCount;

        public boolean hasNext() {
            return cursor != size;
        }

        @SuppressWarnings("unchecked")
        public E next() {
            checkForComodification();
            // Record the current position
            int i = cursor;
            if (i >= size)
                throw new NoSuchElementException();
            Object[] elementData = ArrayList.this.elementData;
            if (i >= elementData.length)
                throw new ConcurrentModificationException();
            // The position of the next element
            cursor = i + 1;
            return (E) elementData[lastRet = i];
        }

        // Use the iterator's remove method
        public void remove() {
            if (lastRet < 0)
                throw new IllegalStateException();
            checkForComodification();
            try {
                // Note how the inner class calls the outer class
                ArrayList.this.remove(lastRet);
                // After removing, the pointers need to be re-adjusted
                cursor = lastRet;
                lastRet = -1;
                expectedModCount = modCount;
            } catch (IndexOutOfBoundsException ex) {
                throw new ConcurrentModificationException();
            }
        }

        final void checkForComodification() {
            if (modCount != expectedModCount)
                throw new ConcurrentModificationException();
        }
    }

The source code shows that Itr is a forward iterator, which provides a next method to obtain the elements of the ArrayList.
checkForComodification implements the fail-fast error detection mechanism of the Java Collections Framework. Operating on the same collection from several threads may trigger the fail-fast mechanism and throw a ConcurrentModificationException.

The Itr iterator defines expectedModCount, which records a copy of modCount. When the ArrayList performs an operation that changes its structure, such as add, remove, or clear, the value of modCount changes.

The Itr source code shows that calling the next and remove methods triggers the fail-fast check. If another thread performs an operation that changes the structure of the collection while it is being traversed, the check fails and an exception is thrown.
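The check can be observed even without a second thread. The following sketch (FailFastDemo and its two methods are hypothetical names) contrasts removing through the list during a for-each loop with removing through the iterator, which resynchronizes expectedModCount as shown in Itr.remove.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.ConcurrentModificationException;
import java.util.Iterator;
import java.util.List;

final class FailFastDemo {
    // Removes through the list while a for-each loop is iterating over it.
    static boolean removeViaListThrows() {
        List<Integer> list = new ArrayList<>(Arrays.asList(1, 2, 3, 4));
        try {
            for (Integer value : list) {
                if (value == 2) {
                    // Structural change made behind the iterator's back:
                    // modCount no longer equals expectedModCount.
                    list.remove(value);
                }
            }
            return false;
        } catch (ConcurrentModificationException expected) {
            return true; // thrown by the next call to next()
        }
    }

    // Removes through the iterator, which keeps expectedModCount in sync.
    static List<Integer> removeViaIterator() {
        List<Integer> list = new ArrayList<>(Arrays.asList(1, 2, 3, 4));
        for (Iterator<Integer> it = list.iterator(); it.hasNext(); ) {
            if (it.next() % 2 == 0) {
                it.remove();
            }
        }
        return list;
    }

    public static void main(String[] args) {
        System.out.println(removeViaListThrows()); // prints true
        System.out.println(removeViaIterator());   // prints [1, 3]
    }
}
```

Note that the check runs in next() and remove() but not in hasNext(), so a modification that happens to leave cursor equal to size can end the loop silently instead of throwing.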

(2) ListItr: supports both forward and backward traversal. Let's take a look at the source code of ListItr:

    private class ListItr extends Itr implements ListIterator<E> {
        ListItr(int index) {
            super();
            cursor = index;
        }

        public boolean hasPrevious() {
            return cursor != 0;
        }

        public int nextIndex() {
            return cursor;
        }

        public int previousIndex() {
            return cursor - 1;
        }

        @SuppressWarnings("unchecked")
        public E previous() {
            checkForComodification();
            // The position of the previous element of the ArrayList
            int i = cursor - 1;
            if (i < 0)
                throw new NoSuchElementException();
            Object[] elementData = ArrayList.this.elementData;
            if (i >= elementData.length)
                throw new ConcurrentModificationException();
            cursor = i;
            return (E) elementData[lastRet = i];
        }

        // This iterator adds the set method
        public void set(E e) {
            if (lastRet < 0)
                throw new IllegalStateException();
            checkForComodification();
            try {
                ArrayList.this.set(lastRet, e);
            } catch (IndexOutOfBoundsException ex) {
                throw new ConcurrentModificationException();
            }
        }

        // This iterator adds the add method
        public void add(E e) {
            checkForComodification();
            try {
                int i = cursor;
                ArrayList.this.add(i, e);
                // Re-adjust the pointer positions
                cursor = i + 1;
                lastRet = -1;
                expectedModCount = modCount;
            } catch (IndexOutOfBoundsException ex) {
                throw new ConcurrentModificationException();
            }
        }
    }

The implementation of ListItr is basically the same as Itr. It adds the methods for backward traversal, as well as the add and set methods.
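A short usage sketch of these extra methods follows. ListIteratorDemo, reversed, and rewrite are hypothetical names; the calls on ListIterator (hasPrevious, previous, set, add) are the public API that ListItr implements.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.ListIterator;

final class ListIteratorDemo {
    // Walks backward by starting the iterator at the end of the list.
    static List<String> reversed(List<String> source) {
        List<String> out = new ArrayList<>();
        ListIterator<String> it = source.listIterator(source.size());
        while (it.hasPrevious()) {
            out.add(it.previous());
        }
        return out;
    }

    // Replaces every element with set() and inserts a marker after "b" with add().
    static List<String> rewrite(List<String> source) {
        List<String> copy = new ArrayList<>(source);
        ListIterator<String> it = copy.listIterator();
        while (it.hasNext()) {
            String value = it.next();
            it.set(value + "!");      // set() must come before add(): add() resets lastRet to -1
            if (value.equals("b")) {
                it.add("b2");         // inserted at the cursor; the cursor moves past it
            }
        }
        return copy;
    }

    public static void main(String[] args) {
        List<String> letters = Arrays.asList("a", "b", "c");
        System.out.println(reversed(letters)); // prints [c, b, a]
        System.out.println(rewrite(letters));  // prints [a!, b!, b2, c!]
    }
}
```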

(3) Use CopyOnWriteArrayList in java.util.concurrent to solve the fail-fast problem

CopyOnWriteArrayList is thread-safe. For details, let’s take a look at its add method source code:

    public boolean add(E e) {
        final ReentrantLock lock = this.lock;
        lock.lock();
        try {
            Object[] elements = getArray();
            int len = elements.length;
            Object[] newElements = Arrays.copyOf(elements, len + 1);
            newElements[len] = e;
            setArray(newElements);
            return true;
        } finally {
            lock.unlock();
        }
    }

CopyOnWriteArrayList is a copy-on-write ArrayList. Each write operation first creates a new array with Arrays.copyOf and modifies that copy, so read operations on the old array are not affected.
The cost is extra memory and potential performance problems: every write to a CopyOnWriteArrayList creates a copy of the array in memory while the original array still exists.
CopyOnWriteArrayList cannot guarantee real-time consistency of the data; it can only guarantee eventual consistency. It suits concurrent scenarios with many reads and few writes, such as caches.
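The snapshot behaviour can be seen in a single thread. In the sketch below (CopyOnWriteDemo and iterateWhileAdding are hypothetical names), the loop adds to the list while iterating over it; with a CopyOnWriteArrayList the iterator keeps reading the array that existed when the iteration started, so no exception is thrown and the new elements are not visited.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

final class CopyOnWriteDemo {
    // Adds to the list during iteration and returns the elements the iterator saw.
    // Intended to be called with a CopyOnWriteArrayList; a plain ArrayList would
    // throw ConcurrentModificationException on the second call to next().
    static List<Integer> iterateWhileAdding(List<Integer> list) {
        List<Integer> seen = new ArrayList<>();
        for (Integer value : list) {
            seen.add(value);
            list.add(value * 10); // the write goes to a fresh copy of the array
        }
        return seen;
    }

    public static void main(String[] args) {
        List<Integer> cow = new CopyOnWriteArrayList<>(Arrays.asList(1, 2, 3));
        System.out.println(iterateWhileAdding(cow)); // prints [1, 2, 3]
        System.out.println(cow);                     // prints [1, 2, 3, 10, 20, 30]
    }
}
```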

(4) Other methods source code of ArrayList:

A private method batchRemove(Collection<?>c, boolean complement), that is, batch removal operation

    private boolean batchRemove(Collection<?> c, boolean complement) {
        // The reason for using final is explained below
        final Object[] elementData = this.elementData;
        int r = 0, w = 0;
        boolean modified = false;
        try {
            // Traverse the elements in the list and check each one
            for (; r < size; r++)
                if (c.contains(elementData[r]) == complement)
                    elementData[w++] = elementData[r];
        } finally {
            // If an exception occurred in try, keep the data consistent
            // by copying the elements that were not yet processed
            if (r != size) {
                System.arraycopy(elementData, r, elementData, w, size - r);
                w += size - r;
            }
            // Clear the unused slots so that the GC can reclaim them
            if (w != size) {
                // clear to let GC do its work
                for (int i = w; i < size; i++)
                    elementData[i] = null;
                modCount += size - w;
                size = w;
                modified = true;
            }
        }
        return modified;
    }

Declaring the local variable final keeps it pointing at the same array for the whole method, which keeps the later data handling consistent.
In this method, the complement value is true when you want to retain the elements that are in Collection c, and false when you want to remove the elements that are in c. These two cases back the retainAll and removeAll methods respectively.
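The read-pointer and write-pointer idea behind batchRemove can be sketched on a bare array. CompactionSketch and compact are hypothetical names, and the exception-safety handling of the real method (the finally block) is intentionally omitted, so treat this as an illustration of the technique only.

```java
import java.util.Arrays;
import java.util.Collection;

final class CompactionSketch {
    // Keeps the elements whose membership in c equals complement,
    // compacting them to the front of data, and returns the new size.
    static int compact(Object[] data, int size, Collection<?> c, boolean complement) {
        int w = 0; // write pointer: next slot for a surviving element
        for (int r = 0; r < size; r++) { // read pointer
            if (c.contains(data[r]) == complement) {
                data[w++] = data[r];
            }
        }
        for (int i = w; i < size; i++) {
            data[i] = null; // drop stale references so they can be collected
        }
        return w;
    }

    public static void main(String[] args) {
        Object[] retained = {1, 2, 3, 4, 5};
        int kept = compact(retained, 5, Arrays.asList(2, 4), true);   // retainAll-style
        System.out.println(kept + " " + Arrays.toString(retained));   // prints 2 [2, 4, null, null, null]

        Object[] removed = {1, 2, 3, 4, 5};
        int left = compact(removed, 5, Arrays.asList(2, 4), false);   // removeAll-style
        System.out.println(left + " " + Arrays.toString(removed));    // prints 3 [1, 3, 5, null, null]
    }
}
```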

swap: swaps the elements at two positions in the ArrayList

2. LinkedList source code analysis (JDK7)

LinkedList is a linked list. Compared with a sequential (array-based) list, a linked list does not need contiguous memory to store its data. This reduces the element moving caused by modifying the container structure, and sequential access is relatively efficient.

1. Definition of Node

LinkedList in the JDK is a doubly linked list: each node stores a reference to the previous node and to the next node. Its definition is as follows:

    private static class Node<E> {
        E item;
        Node<E> next;
        Node<E> prev;

        Node(Node<E> prev, E element, Node<E> next) {
            this.item = element;
            this.next = next;
            this.prev = prev;
        }
    }

2. LinkedList construction and initialization

Members: LinkedList maintains 3 member variables, which record the number of nodes in the linked list, the first node, and the last node

    transient int size = 0;
    transient Node<E> first;
    transient Node<E> last;

Constructor: The default constructor is to construct an empty LinkedList

 public LinkedList() {}

Alternatively, construct the list from another container. Later we will write our own constructor that builds an ordered linked list.

    public LinkedList(Collection<? extends E> c) {
        this();
        addAll(c);
    }

A small aside: for the difference between the generic bounds ? super T and ? extends T, see the article on the difference between super T and extends T in generics.

3. Structural operation of LinkedList

Header insertion method: that is, insert an element in the header of the linked list

    private void linkFirst(E e) {
        final Node<E> f = first;
        final Node<E> newNode = new Node<>(null, e, f);
        first = newNode;
        // Judge whether it is an empty linked list
        if (f == null)
            last = newNode;
        else
            f.prev = newNode;
        size++;
        modCount++;
    }

Tail insertion method: that is, insert an element at the end of the linked list

    void linkLast(E e) {
        final Node<E> l = last;
        final Node<E> newNode = new Node<>(l, e, null);
        last = newNode;
        if (l == null)
            first = newNode;
        else
            l.next = newNode;
        size++;
        modCount++;
    }

Insert before a given node: first find the predecessor of that node

    void linkBefore(E e, Node<E> succ) {
        // The caller must make sure that succ is not null
        final Node<E> pred = succ.prev;
        final Node<E> newNode = new Node<>(pred, e, succ);
        succ.prev = newNode;
        // Determine whether the current node is the first node
        if (pred == null)
            first = newNode;
        else
            pred.next = newNode;
        size++;
        modCount++;
    }

Header deletion method: Delete the first node of the linked list

    private E unlinkFirst(Node<E> f) {
        // assert f == first && f != null;
        final E element = f.item;
        final Node<E> next = f.next;
        f.item = null;
        f.next = null; // help GC
        first = next;
        if (next == null)
            last = null;
        else
            next.prev = null;
        size--;
        modCount++;
        return element;
    }

Tail deletion method: delete the last node of the linked list

    private E unlinkLast(Node<E> l) {
        // Make sure l == last and l != null
        final E element = l.item;
        final Node<E> prev = l.prev;
        l.item = null;
        l.prev = null; // help GC
        last = prev;
        if (prev == null)
            first = null;
        else
            prev.next = null;
        size--;
        modCount++;
        return element;
    }

4. Maintain consistency between List interface and Deque

The List interface allows elements to be accessed by index (random access), which is easy to implement for arrays. For linked lists, the JDK uses the node count of the linked list to implement index-based access:

    Node<E> node(int index) {
        // Ensure the correctness of index
        if (index < (size >> 1)) {
            Node<E> x = first;
            for (int i = 0; i < index; i++)
                x = x.next;
            return x;
        } else {
            Node<E> x = last;
            for (int i = size - 1; i > index; i--)
                x = x.prev;
            return x;
        }
    }

If index falls in the first half, the search starts from the head; if it falls in the second half, the search starts from the tail. This makes full use of the characteristics of a doubly linked list.
With this helper, methods such as add(int index, E element), get(int index), and set(int index, E element) can be implemented easily.
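To see how many pointer moves the two-ended search needs, here is a stripped-down doubly linked list that counts them. TinyLinkedList, its hops field, and the range factory are hypothetical names written for this example; the node lookup follows the same first-half or second-half rule as node(int) above.

```java
/** Hypothetical minimal doubly linked list; not the JDK class. */
final class TinyLinkedList<E> {
    private static final class Node<E> {
        E item;
        Node<E> next;
        Node<E> prev;

        Node(Node<E> prev, E item, Node<E> next) {
            this.prev = prev;
            this.item = item;
            this.next = next;
        }
    }

    private Node<E> first;
    private Node<E> last;
    private int size;
    int hops; // pointer moves made by the most recent get()

    // Same linking steps as linkLast in the JDK source.
    void addLast(E e) {
        Node<E> l = last;
        Node<E> newNode = new Node<>(l, e, null);
        last = newNode;
        if (l == null) {
            first = newNode;
        } else {
            l.next = newNode;
        }
        size++;
    }

    // Index lookup: walk from the head for the first half, from the tail otherwise.
    E get(int index) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException("Index: " + index + ", Size: " + size);
        }
        hops = 0;
        Node<E> x;
        if (index < (size >> 1)) {
            x = first;
            for (int i = 0; i < index; i++, hops++) {
                x = x.next;
            }
        } else {
            x = last;
            for (int i = size - 1; i > index; i--, hops++) {
                x = x.prev;
            }
        }
        return x.item;
    }

    // Convenience factory: a list holding 0, 1, ..., n - 1.
    static TinyLinkedList<Integer> range(int n) {
        TinyLinkedList<Integer> list = new TinyLinkedList<>();
        for (int i = 0; i < n; i++) {
            list.addLast(i);
        }
        return list;
    }

    public static void main(String[] args) {
        TinyLinkedList<Integer> list = range(10);
        System.out.println(list.get(1) + " after " + list.hops + " hop(s)"); // prints 1 after 1 hop(s)
        System.out.println(list.get(8) + " after " + list.hops + " hop(s)"); // prints 8 after 1 hop(s)
        System.out.println(list.get(5) + " after " + list.hops + " hop(s)"); // prints 5 after 4 hop(s)
    }
}
```

The worst case is an index near the middle, which still costs about size / 2 moves, so index-based access stays linear in the size of the list.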

LinkedList implements the Deque interface, which means it provides the methods of a double-ended queue container. Here is a summary of some of the API.
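A few of the Deque methods in use, as a minimal sketch (DequeDemo and demo are hypothetical names; the Deque calls themselves are standard java.util API):

```java
import java.util.ArrayList;
import java.util.Deque;
import java.util.LinkedList;
import java.util.List;

final class DequeDemo {
    // Exercises both ends of a LinkedList through the Deque interface and
    // returns the elements in the order they were taken out.
    static List<Integer> demo() {
        Deque<Integer> deque = new LinkedList<>();
        deque.addFirst(2);  // [2]
        deque.addFirst(1);  // [1, 2]
        deque.addLast(3);   // [1, 2, 3]
        deque.push(0);      // stack-style, same end as addFirst: [0, 1, 2, 3]
        deque.offer(4);     // queue-style, same end as addLast: [0, 1, 2, 3, 4]

        List<Integer> taken = new ArrayList<>();
        taken.add(deque.pollFirst()); // 0
        taken.add(deque.pollLast());  // 4
        taken.add(deque.pop());       // 1, stack-style removal from the head
        taken.addAll(deque);          // what is left, head to tail: 2, 3
        return taken;
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints [0, 4, 1, 2, 3]
    }
}
```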

5. LinkedList traversal

Since LinkedList is a doubly linked list, it can naturally be traversed in both directions. Like ArrayList, LinkedList also has the fail-fast problem when several threads operate on the container.
The fail-fast issue has already been explained earlier, so it is not repeated here.

Regarding iterators, LinkedList has a bidirectional iterator returned by listIterator, and a reverse iterator, DescendingIterator. Both are very simple, so their source code is not analyzed here.

If you traverse the elements by index, the cost is relatively high, because every random access has to walk the list.

3. LinkedList, ArrayList, Vector summary

1. LinkedList and ArrayList

ArrayList is a data structure based on a dynamic array, and LinkedList is a data structure based on a linked list.

For random access with get and set, ArrayList performs better than LinkedList, because LinkedList has to move a pointer along the nodes.

For the add and remove operations, LinkedList has the advantage, because ArrayList needs to move data. This depends on the actual situation, though. If only a single element is inserted or deleted, ArrayList is faster than LinkedList. However, if data is inserted at random positions in batches, LinkedList is much faster than ArrayList, because every insertion into an ArrayList has to move the element at the insertion point and all the elements after it.

2. ArrayList and Vector

Vector is synchronized, so it is thread-safe, while ArrayList is not synchronized and is therefore not thread-safe. If thread safety does not need to be taken into account, ArrayList is generally more efficient.

If the number of elements exceeds the length of the internal array, Vector by default grows by 100% of the current array length, while ArrayList grows by 50% of the current array length. When the collection holds relatively large amounts of data, using Vector has certain advantages.
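Vector exposes its capacity through capacity(), so the doubling can be observed directly. In the sketch below, VectorGrowthDemo, vectorCapacityAfter, and arrayListNextCapacity are hypothetical names. ArrayList has no public capacity accessor, so its side is shown only as the arithmetic taken from grow().

```java
import java.util.Vector;

final class VectorGrowthDemo {
    // Capacity of a Vector (default capacityIncrement of 0) after the given number of adds.
    static int vectorCapacityAfter(int initialCapacity, int adds) {
        Vector<Integer> vector = new Vector<>(initialCapacity);
        for (int i = 0; i < adds; i++) {
            vector.add(i);
        }
        return vector.capacity();
    }

    // The ArrayList growth formula from grow(), for comparison only.
    static int arrayListNextCapacity(int oldCapacity) {
        return oldCapacity + (oldCapacity >> 1);
    }

    public static void main(String[] args) {
        System.out.println(vectorCapacityAfter(10, 10)); // prints 10 (still fits)
        System.out.println(vectorCapacityAfter(10, 11)); // prints 20 (doubled once)
        System.out.println(vectorCapacityAfter(10, 21)); // prints 40 (doubled twice)
        System.out.println(arrayListNextCapacity(10));   // prints 15
    }
}
```

Doubling means fewer array copies for a fast-growing collection, at the price of more unused slots. A Vector constructed with a positive capacityIncrement grows by that fixed amount instead.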

To look up the element at a specified position, Vector and ArrayList take the same time, O(1), so either can be used. If the concern is the time spent moving data when inserting or removing at a specified position, which is O(n-i) where n is the total length, you should consider LinkedList: it takes O(1) to move the data at a specified position, while locating the element at that position takes O(i).

ArrayList and Vector both use arrays to store data. The array is larger than the number of elements actually stored, which leaves room for adding and inserting elements. Both allow elements to be accessed directly by index. However, inserting data involves moving array elements and other memory operations, so indexing is fast and insertion is slow. Vector uses synchronized methods (thread-safe), so its performance is worse than that of ArrayList. LinkedList uses a doubly linked list to store data. Accessing data by index requires forward or backward traversal, but inserting an element only requires updating the links of the neighboring nodes, so insertion is faster.