Search This Blog

11 August 2021

Internal Implementation of ConcurrentHashMap in Java

Internal Implementation of ConcurrentHashMap in Java

Internal Implementation of ConcurrentHashMap in Java

ConcurrentHashMap is a part of the java.util.concurrent package and is designed to handle concurrent access to the map without compromising thread safety or performance. This article provides an in-depth look at the internal implementation of ConcurrentHashMap in Java.

1. Introduction to ConcurrentHashMap

ConcurrentHashMap is a thread-safe variant of HashMap designed for concurrent access. It provides high concurrency with performance optimized for multi-threaded environments. Unlike Hashtable, ConcurrentHashMap does not lock the entire map but uses finer-grained locking mechanisms to allow concurrent reads and writes.

2. Key Concepts and Data Structures

The internal implementation of ConcurrentHashMap involves several key concepts and data structures:

2.1 Segments

In earlier versions of Java (prior to Java 8), ConcurrentHashMap was divided into segments, each acting as a separate hash table. This segmentation allowed finer-grained locking. However, in Java 8, the segmentation strategy was replaced with a more optimized approach using a single array of nodes.

2.2 Node

The Node class represents an entry in the ConcurrentHashMap. Each node contains a key-value pair, the hash of the key, and a reference to the next node in the chain (for handling collisions).

static class Node implements Map.Entry {
    final int hash;
    final K key;
    volatile V val;
    volatile Node next;

    Node(int hash, K key, V val, Node next) {
        this.hash = hash;
        this.key = key;
        this.val = val;
        this.next = next;
    }

    public final K getKey() { return key; }
    public final V getValue() { return val; }
    public final int hashCode() { return key.hashCode() ^ val.hashCode(); }
    public final String toString() { return key + "=" + val; }
    public final V setValue(V value) { throw new UnsupportedOperationException(); }
    public final boolean equals(Object o) { ... }
}

2.3 TreeBin

When the number of nodes in a bin exceeds a certain threshold, the bin is converted to a balanced tree (TreeBin) to improve performance. This structure is similar to a red-black tree.

static final class TreeBin extends Node {
    TreeNode root;
    volatile TreeNode first;
    volatile Thread waiter;
    volatile int lockState;
    // other tree-related fields and methods
}

3. Locking Mechanism

ConcurrentHashMap uses a variety of locking mechanisms to ensure thread safety while maintaining high performance. In Java 8, the primary techniques are:

3.1 CAS (Compare-And-Swap)

CAS operations are used extensively in ConcurrentHashMap to achieve lock-free reads and writes. CAS is a low-level atomic instruction that compares the current value with an expected value and, if they match, updates the value atomically.

3.2 Synchronized Blocks

For certain operations where CAS is not sufficient, synchronized blocks are used to ensure thread safety. These blocks are used sparingly to minimize contention and performance overhead.

4. Internal Operations

Let's explore some key internal operations of ConcurrentHashMap, such as get, put, and remove.

4.1 get Operation

The get operation is lock-free and uses volatile reads to ensure visibility of changes made by other threads. It traverses the bin list or tree to find the matching key.

public V get(Object key) {
    Node[] tab; Node e, p; int n, eh; K ek;
    int h = spread(key.hashCode());
    if ((tab = table) != null && (n = tab.length) > 0 &&
        (e = tabAt(tab, (n - 1) & h)) != null) {
        if ((eh = e.hash) == h) {
            if ((ek = e.key) == key || (ek != null && key.equals(ek)))
                return e.val;
        }
        else if (eh < 0)
            return (p = e.find(h, key)) != null ? p.val : null;
        while ((e = e.next) != null) {
            if (e.hash == h &&
                ((ek = e.key) == key || (ek != null && key.equals(ek))))
                return e.val;
        }
    }
    return null;
}

4.2 put Operation

The put operation uses CAS to insert a new node if the bin is empty. If the bin is not empty, it locks the bin and inserts the new node, converting the bin to a tree if necessary.

final V putVal(K key, V value, boolean onlyIfAbsent) {
    int hash = spread(key.hashCode());
    int binCount = 0;
    for (Node[] tab = table;;) {
        Node f; int n, i, fh; K fk;
        if (tab == null || (n = tab.length) == 0)
            tab = initTable();
        else if ((f = tabAt(tab, i = (n - 1) & hash)) == null) {
            if (casTabAt(tab, i, null,
                         new Node(hash, key, value, null)))
                break;                   // no lock when adding to empty bin
        }
        else if ((fh = f.hash) == MOVED)
            tab = helpTransfer(tab, f);
        else {
            V oldVal = null;
            synchronized (f) {
                if (tabAt(tab, i) == f) {
                    if (fh >= 0) {
                        binCount = 1;
                        for (Node e = f;; ++binCount) {
                            K ek;
                            if (e.hash == hash &&
                                ((ek = e.key) == key || (ek != null && key.equals(ek)))) {
                                oldVal = e.val;
                                if (!onlyIfAbsent)
                                    e.val = value;
                                break;
                            }
                            Node pred = e;
                            if ((e = e.next) == null) {
                                pred.next = new Node(hash, key,
                                                         value, null);
                                break;
                            }
                        }
                    }
                    else if (f instanceof TreeBin) {
                        Node p;
                        binCount = 2;
                        if ((p = ((TreeBin)f).putTreeVal(hash, key,
                                                              value)) != null) {
                            oldVal = p.val;
                            if (!onlyIfAbsent)
                                p.val = value;
                        }
                    }
                }
            }
            if (binCount != 0) {
                if (binCount >= TREEIFY_THRESHOLD)
                    treeifyBin(tab, i);
                if (oldVal != null)
                    return oldVal;
                break;
            }
        }
    }
    addCount(1L, binCount);
    return null;
}

4.3 remove Operation

The remove operation also uses synchronized blocks to ensure thread safety when removing a node. It traverses the bin list or tree to find and remove the matching node.

public V remove(Object key) {
    return replaceNode(key, null, null);
}

final V replaceNode(Object key, V value, Object cv) {
    int hash = spread(key.hashCode());
    for (Node[] tab = table;;) {
        Node f; int n, i, fh;
        if (tab == null || (n = tab.length) == 0 ||
            (f = tabAt(tab, i = (n - 1) & hash)) == null)
            break;
        else if ((fh = f.hash) == MOVED)
            tab = helpTransfer(tab, f);
            else {
V oldVal = null;
boolean validated = false;
synchronized (f) {
if (tabAt(tab, i) == f) {
if (fh >= 0) {
validated = true;
for (Node e = f, pred = null;;) {
K ek;
if (e.hash == hash &&
((ek = e.key) == key || (ek != null && key.equals(ek)))) {
V ev = e.val;
if (cv == null || cv == ev ||
(ev != null && cv.equals(ev))) {
oldVal = ev;
if (value != null)
e.val = value;
else if (pred != null)
pred.next = e.next;
else
setTabAt(tab, i, e.next);
}
break;
}
pred = e;
if ((e = e.next) == null)
break;
}
}
else if (f instanceof TreeBin) {
validated = true;
TreeBin t = (TreeBin)f;
TreeNode r, p;
if ((r = t.root) != null &&
(p = r.findTreeNode(hash, key, null)) != null) {
V pv = p.val;
if (cv == null || cv == pv ||
(pv != null && cv.equals(pv))) {
oldVal = pv;
if (value != null)
p.val = value;
else if (t.removeTreeNode(p))
setTabAt(tab, i, untreeify(t.first));
}
}
}
}
}
if (validated) {
if (oldVal != null) {
if (value == null)
addCount(-1L, -1);
return oldVal;
}
break;
}
}
}
return null;
}

Conclusion

ConcurrentHashMap is a powerful and efficient implementation of a thread-safe hash map in Java. Its internal design, including the use of CAS operations, synchronized blocks, and tree bins, allows it to handle high concurrency with minimal performance overhead. Understanding the internal workings of ConcurrentHashMap can help developers make better use of this data structure in their concurrent applications.

No comments:

Post a Comment