Tuesday, March 31, 2015

Concurrent collections: CopyOnWriteArrayList


Working with collections in a multi-thread application is a challenge. Just imagine the list which is accessible by a few threads and every thread is seeking for a chance to change the list data, that’s a typical showcase of the ConcurrentModificationException. In order to get rid of this and other problems JDK (since version 1.5) provides improved mechanisms for storing ‘Iterable’ data in multithread application.
This article is an overview of widely used concurrent collection such as CopyOnWriteArrayList

The name of the collection is very straightforward, every ‘write’ operation (add, remove, set) causes the copying and creation of the modified collection. It allows us to prevent ConcurrentModificationException as long as every thread’s iterator will have its own copy of the collection. The official Oracle documentation names such iterators as “snapshot” style iterator; this iterator “uses a reference to the state of the array at the point that the iterator was created”

This is the add operation for CopyOnWriteArrayList


public boolean add(E e) {
    final ReentrantLock lock = this.lock;
    lock.lock();
    try {
        Object[] elements = getArray();
        int len = elements.length;
        Object[] newElements = Arrays.copyOf(elements, len + 1);
        newElements[len] = e;
        setArray(newElements);
        return true;
    } finally {
        lock.unlock();
    }
}

Every time you modify collection the new copy of the whole collection is created – it’s quite expensive operation and that’s why it’s not recommended to use CopyOnWriteArrayList for frequently modified data. Another interesting method from the listing is setArray(newElements); it updates actual array and every new thread’s iterator will have updated version of the array – other thread iterators (which run in parallel) won’t be affected using their own local copies of the array.
Question: two or more threads modify the CopyOnArrayList simultaneously, what will be the result of their job?

Answer: the result depends on the thread order accessing to the collection.
Check the following code:


final CopyOnWriteArrayList<String> cwal = new CopyOnWriteArrayList<String>(); 
cwal.add("How are you?");  

Thread thA = new Thread(new Runnable() {   
 public void run() {
  System.out.println("Start processing " + Thread.currentThread().getName());    
  cwal.add("Hi!");
  System.out.println("Finish processing " + Thread.currentThread().getName());
 }
}, "Thread-A");

Thread thB = new Thread(new Runnable() {   
 public void run() {
  System.out.println("Start processing " + Thread.currentThread().getName());    
  cwal.clear();
  cwal.add("Bye!");
  System.out.println("Finish processing " + Thread.currentThread().getName());    
 }
}, "Thread-B");
thA.start();  
thB.start();
Thread.sleep(1000); // Give some time to threads to finish the jobs
for (String element : cwal) {
 System.out.print(element);
}
!!

The output will be:
Start processing Thread-B
Start processing Thread-A
Finish processing Thread-B
Finish processing Thread-A
Bye!Hi!

Alternatives
How about thread-safe collections like Vector or SynchronizedList, what’s wrong with them and why do we need CopyOnWriteArrayList? The answer key word is iterator. Vector and SynchronizedList modification methods (add, set, remove…) are thread safe, but iterator is not. The possible failure scenario:
1.  Thread-A prints the collection elements one-by-one using iterator
2. Meanwhile Thread-B removes some element from the collection which causes ConcurrentModificationException

To solve that issue Oracle Developers added CopyOnWriteArrayList to a Java Development Kit.

No comments:

Post a Comment