Source code analysis under java.util.HashMap jdk1.8

Based on jdk1.8, this paper analyzes the source code, analyzes the implementation of the code and the possible misuse

HashMap: hash hash hash table. It exists in the form of key value key value pairs. It is unordered. The time dimension and space dimension of the search are both O(1). It enables us to quickly collect some data

Catalog

1. See the figure below for class structure:

1.1 initialization parameters

2. Analysis of common methods

2.1 initialization method

2.2 put method

2.3 get method

2.4 replace method

2.5 remove method

3 capacity expansion mechanism

4 existing problems

 

1. See the figure below for class structure:

1.1 initialization parameters

static final int DEFAULT_INITIAL_CAPACITY = 1 << 4; // aka 16

The default initial capacity is 16, which adopts binary displacement. It is equivalent to math.pow (2,4) in java;

 static final int MAXIMUM_CAPACITY = 1 << 30;

The maximum capacity is the power of 30 of 2. If it is greater than this value during initialization or expansion, the power of 30 of 2 is taken, which is equivalent to Math.pow(2, 30) in java;

static final float DEFAULT_LOAD_FACTOR = 0.75f;

Default load factor. When the capacity in use reaches the current capacity * 0.75f, capacity expansion can be considered as a pre expansion mechanism

static final int TREEIFY_THRESHOLD = 8;

When the threshold value is reached, the link list will be converted to a red black tree, which is mainly to optimize and reduce the query link

static final int UNTREEIFY_THRESHOLD = 6;

Turning node threshold: when the storage of the first bucket is a red black tree, its red black tree node is less than this threshold and turns back to the node data structure, which should be less than the tree [threshold]. If the two conflict may cause the conversion to a red black tree and then add elements, it will turn back to the node structure

static final int MIN_TREEIFY_CAPACITY = 64;

The default minimum red black tree capacity defined is generally set to tree [threshold * 4] to reduce frequent expansion, because red black trees also have initial capacity

static class Node<K,V> implements Map.Entry<K,V> {
        final int hash;
        final K key;
        V value;
        Node<K,V> next;
// .....

}

Defined inner class, one-way linked list structure. The current object contains the reference of the next object. It is used to store the data in the bucket

transient Node<K,V>[] table;

Data set

transient Set<Map.Entry<K,V>> entrySet;

Set of key value pairs

transient int size;

Number of key value pair mappings for the current hash table

transient int modCount;

The number of changes to the current hash table structure

int threshold;

capacity

final float loadFactor;

Load factor

2. Analysis of common methods

2.1 initialization method

public HashMap(int initialCapacity, float loadFactor) {
        if (initialCapacity < 0)
            throw new IllegalArgumentException("Illegal initial capacity: " +
                                               initialCapacity);
        if (initialCapacity > MAXIMUM_CAPACITY)
            initialCapacity = MAXIMUM_CAPACITY;
        if (loadFactor <= 0 || Float.isNaN(loadFactor))
            throw new IllegalArgumentException("Illegal load factor: " +
                                               loadFactor);
        this.loadFactor = loadFactor;
        this.threshold = tableSizeFor(initialCapacity);
    }

When the user-defined initial capacity and load factor are passed in during initialization, the validity is verified. The initialization capacity cannot be less than 0. If the initialization capacity is greater than the maximum allowed capacity, i.e. 2 ^ 30, then 2 ^ 30 is taken. If the load factor is less than or equal to 0 or not a number, an exception is returned. The default load factor and capacity are taken by the nonparametric constructor

2.2 put method

    public V put(K key, V value) {
        return putVal(hash(key), key, value, false, true);
    }
    
    // Take the hash of the key value, and use the hashCode method of the key to take the exclusive or of the high 16 bits and the low 16 bits
    static final int hash(Object key) {
        int h;
        return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
    }
    final V putVal(int hash, K key, V value, boolean onlyIfAbsent,
                   boolean evict) {
        Node<K,V>[] tab; Node<K,V> p; int n, i;
        if ((tab = table) == null || (n = tab.length) == 0)
            // If the hash is null or the capacity is 0, call the resize() method to initialize
            n = (tab = resize()).length;
        if ((p = tab[i = (n - 1) & hash]) == null)
            // If there is no element in the current bucket, the head element of the linked list is judged here. If there is no element, it is empty and inserted directly
            tab[i] = newNode(hash, key, value, null);
        else {
            Node<K,V> e; K k;
            if (p.hash == hash &&
                ((k = p.key) == key || (key != null && key.equals(k))))
                // It already exists. Judge whether the key value is equal or not. Put it into e if it exists
                e = p;
            else if (p instanceof TreeNode)
                // If it is not equal, determine whether it is a red black tree structure. If it is, insert the red black tree
                e = ((TreeNode<K,V>)p).putTreeVal(this, tab, hash, key, value);
            else {
                // When it is node structure
                for (int binCount = 0; ; ++binCount) {
                    if ((e = p.next) == null) {
                        // Insert if the next element of the list does not exist
                        p.next = newNode(hash, key, value, null);
                        // Determine whether the number of elements has reached the threshold value of turning into red black tree, and turn the node into red black tree when it reaches the threshold value
                        if (binCount >= TREEIFY_THRESHOLD - 1) // -1 for 1st
                            treeifyBin(tab, hash);
                        break;
                    }
                    if (e.hash == hash &&
                        ((k = e.key) == key || (key != null && key.equals(k))))
                        // To judge whether it already exists, do not operate if it exists. The head element has been judged before, so the head element may exist for one more verification
                        break;
                    p = e;
                }
            }
            if (e != null) { // existing mapping for key
                // The inserted key value already exists. Onlyifasent passes in false when put ting, and is ignored here. When the previously existing key value pair value is empty, the new value is assigned
                V oldValue = e.value;
                if (!onlyIfAbsent || oldValue == null)
                    e.value = value;
                // End insert current element
                afterNodeAccess(e);
                return oldValue;
            }
        }
        // Structural change record + 1
        ++modCount;
        if (++size > threshold)
            // If the current length exceeds the set length, expand the capacity
            resize();
        afterNodeInsertion(evict);
        return null;
    }

2.3 get method

    public V get(Object key) {
        Node<K,V> e;
        // hash method is the same as put
        return (e = getNode(hash(key), key)) == null ? null : e.value;
    }
    final Node<K,V> getNode(int hash, Object key) {
        Node<K,V>[] tab; Node<K,V> first, e; int n; K k;
        // Get the bucket index through the key hash and the array length, and then assign the first value
        if ((tab = table) != null && (n = tab.length) > 0 &&
            (first = tab[(n - 1) & hash]) != null) {
            if (first.hash == hash && // always check first node
                ((k = first.key) == key || (key != null && key.equals(k))))
                // If the first element matches, the result is returned
                return first;
            if ((e = first.next) != null) {
                // Next element is not empty, judge whether it is a red black tree
                if (first instanceof TreeNode)
                    return ((TreeNode<K,V>)first).getTreeNode(hash, key);
                do {
                    // The maximum number of times to find is 7
                    if (e.hash == hash &&
                        ((k = e.key) == key || (key != null && key.equals(k))))
                        return e;
                } while ((e = e.next) != null);
            }
        }
        return null;
    }

2.4 replace method

    @Override
    public boolean replace(K key, V oldValue, V newValue) {
        Node<K,V> e; V v;
        // Get the element of the specified key through getNode, and then judge whether it is equal to oldValue. If it is equal to modify, give up. An implementation of cas
        if ((e = getNode(hash(key), key)) != null &&
            ((v = e.value) == oldValue || (v != null && v.equals(oldValue)))) {
            e.value = newValue;
            afterNodeAccess(e);
            return true;
        }
        return false;
    }

2.5 remove method

    public V remove(Object key) {
        Node<K,V> e;
        return (e = removeNode(hash(key), key, null, false, true)) == null ?
            null : e.value;
    }

    final Node<K,V> removeNode(int hash, Object key, Object value,
                               boolean matchValue, boolean movable) {
        Node<K,V>[] tab; Node<K,V> p; int n, index;
        if ((tab = table) != null && (n = tab.length) > 0 &&
            (p = tab[index = (n - 1) & hash]) != null) {
            Node<K,V> node = null, e; K k; V v;
            if (p.hash == hash &&
                ((k = p.key) == key || (key != null && key.equals(k))))
                node = p;
            else if ((e = p.next) != null) {
                if (p instanceof TreeNode)
                    node = ((TreeNode<K,V>)p).getTreeNode(hash, key);
                else {
                    do {
                        if (e.hash == hash &&
                            ((k = e.key) == key ||
                             (key != null && key.equals(k)))) {
                            node = e;
                            break;
                        }
                        p = e;
                    } while ((e = e.next) != null);
                }
            }

            // Element found
            if (node != null && (!matchValue || (v = node.value) == value ||
                                 (value != null && value.equals(v)))) {
                if (node instanceof TreeNode)
                    ((TreeNode<K,V>)node).removeTreeNode(this, tab, movable);
                else if (node == p)
                    // If it is a head element, the next element set to the head is the head element
                    tab[index] = node.next;
                else
                    p.next = node.next;
                ++modCount;
                --size;
                afterNodeRemoval(node);
                return node;
            }
        }
        return null;
    }

 

3 capacity expansion mechanism

    final Node<K,V>[] resize() {
        Node<K,V>[] oldTab = table;
        // cap is array capacity
        int oldCap = (oldTab == null) ? 0 : oldTab.length;
        int oldThr = threshold;
        int newCap, newThr = 0;
        if (oldCap > 0) {
            // If the length of the array has reached the set default maximum, set the length of the total number of elements to int maximum
            if (oldCap >= MAXIMUM_CAPACITY) {
                threshold = Integer.MAX_VALUE;
                return oldTab;
            }
            // After the expansion, the array length is less than the default maximum and greater than the default length, and the element capacity is expanded
            else if ((newCap = oldCap << 1) < MAXIMUM_CAPACITY &&
                     oldCap >= DEFAULT_INITIAL_CAPACITY)
                newThr = oldThr << 1; // double threshold
        }
        // If the original load factor is greater than 0, the value is assigned
        else if (oldThr > 0) // initial capacity was placed in threshold
            newCap = oldThr;
        else {               // zero initial threshold signifies using defaults
            // Otherwise, the initialization parameter is assigned
            newCap = DEFAULT_INITIAL_CAPACITY;
            newThr = (int)(DEFAULT_LOAD_FACTOR * DEFAULT_INITIAL_CAPACITY);
        }
        if (newThr == 0) {
            float ft = (float)newCap * loadFactor;
            newThr = (newCap < MAXIMUM_CAPACITY && ft < (float)MAXIMUM_CAPACITY ?
                      (int)ft : Integer.MAX_VALUE);
        }
        threshold = newThr;
        @SuppressWarnings({"rawtypes","unchecked"})
            Node<K,V>[] newTab = (Node<K,V>[])new Node[newCap];
        table = newTab;
        if (oldTab != null) {
            for (int j = 0; j < oldCap; ++j) {
                Node<K,V> e;
                if ((e = oldTab[j]) != null) {
                    oldTab[j] = null;
                    if (e.next == null)
                        newTab[e.hash & (newCap - 1)] = e;
                    else if (e instanceof TreeNode)
                        ((TreeNode<K,V>)e).split(this, newTab, j, oldCap);
                    else { // preserve order
                        Node<K,V> loHead = null, loTail = null;
                        Node<K,V> hiHead = null, hiTail = null;
                        Node<K,V> next;
                        do {
                            next = e.next;
                            if ((e.hash & oldCap) == 0) {
                                if (loTail == null)
                                    loHead = e;
                                else
                                    loTail.next = e;// Easy to form a ring
                                loTail = e;
                            }
                            else {
                                if (hiTail == null)
                                    hiHead = e;
                                else
                                    hiTail.next = e;// Easy to form a ring
                                hiTail = e;
                            }
                        } while ((e = next) != null);
                        if (loTail != null) {
                            loTail.next = null;
                            newTab[j] = loHead;
                        }
                        if (hiTail != null) {
                            hiTail.next = null;
                            newTab[j + oldCap] = hiHead;
                        }
                    }
                }
            }
        }
        return newTab;
    }

Each put will check whether the current Capacity reaches the Capacity expansion threshold, where the threshold is Capacity * loadFactor, and the Capacity expansion mechanism is one bit of left displacement, i.e. double Capacity expansion, initial 16

 

4 existing problems

In the process of capacity expansion, if multi-threaded access causes a circular list, if you query a non-existent value, it will always be in a loop, forming a dead loop

When a large amount of data is inserted and the initial capacity is not set large enough, frequent capacity expansion will occur. However, each capacity expansion will rewrite the previous data, which is extremely time-consuming and takes up memory

Published 6 original articles, won praise 2, visited 10000+
Private letter follow

Tags: less Java

Posted on Wed, 11 Mar 2020 00:41:47 -0700 by etherboo