Friday, October 3, 2014

Merge Sort in Java (+JavaScript)

Merge sort algorithm has a worst-case performance of O(n log n) - that's a good performance of comparison-based sorting algorithm, moreover this algorithm is easy to understand, remember and repeat after while.
For implementation I'm going to use mergeSort method with the following signature:
<T extends Comparable<T>> void mergeSort(T[] arrayToSort, T[]resArray)
arrayToSort - array to be sorted
resArray - empty array with the same length as arrayToSort which will have all arrayToSort's items in a sorted order

In step one of merge sort we need to copy the content of arrayToSort into resArray and specify array's start and end indicies.


int lo = 0;
int hi = arrayToSort.length - 1;
for (int iter = 0; iter < arrayToSort.length; ++iter) {
 resArray[iter] = arrayToSort[iter];
} 

Now it's time to split an array in a central element and get two subarrays
int mid = lo + (hi - lo) / 2; // this is a central element index of array
1-st subarray elements will be from lo to mid indicies
2-nd subarray elements will be from mid + 1 to hi indicies


The following function can break up into subarrays recursively:

<T extends Comparable<T>> void sortMerge(T[] arrayToSort, T[] resArray, int lo, int hi) {
    if (lo >= hi)
  return;
 int mid = lo + (hi - lo) / 2;
 sortMerge(arrayToSort, resArray, lo, mid);
 sortMerge(arrayToSort, resArray, mid + 1, hi);
 //merge(arrayToSort, resArray, lo, mid, hi);
}

Using these subarrays, comparing their elements one-by-one we need to push elements into resArray in accordance with the comparison rule. This is gonna be implemented in commented 'merge' method but firstly let's implement a method for checking whether a subarray has been already sorted

<T extends Comparable<T>> boolean isSorted(T[] array, int lo, int hi) {
 for (int iter = lo + 1; iter <= hi; ++iter) {
  if (array[iter].compareTo(array[iter - 1]) < 0) { // order is up to you
   return false;
  }
 }
 return true;
}

Merge method requires that both subarrays are sorted, on a first step it's reached by our recursion method which breaks up target array for a single element subarrays (one element per one array)

In a merge method we need to pass our arrays and indicies of subarrays: lo, mid, hi. Within these indicies we'll get two subarrays array1[lo .. mid] (from lo to mid indicies), array2[mid+1 .. hi]. resArray starts from lo and ends on hi: [lo..hi]
Three incremental parameters should be used correspondingly for indexing between three arrays: iter1 for array1, iter2 for array2, iterRes for resArray.
int iter1 = lo; // increment for array1
int iter2 = mid + 1; // increment for array2
for (int iterRes = lo; iterRes <= hi; ++iterRes) // increment through resArray

Inside for loop we need to compare elements of two subarrays one-by-one.
[pseudocode]

if (array1[iter1] > array2[iter2]) {
 arrayRes[iterRes] = array2[iter2];
 iter2 = iter2 + 1; // move to the next element of array2
} else if (array1[iter1] <= array2[iter2]) {
 arrayRes[iterRes] = array1[iter1];
 iter1 = iter1 + 1; // move to the next element of array1
} else if (iter1 > mid) { // array1 processing is completed
 arrayRes[iterRes] = array2[iter2]; // fill the arrayRes with rest of array2 values
 iter2 = iter2 + 1;
} else if (iter2 > hi) { // array1 processing is completed
 arrayRes[iterRes] = array1[iter1]; // fill the arrayRes with rest of array1 values
 iter1 = iter1 + 1;
}

Animation from wikipedia article for better understanding.

Complete source code:

 public <T extends Comparable<T>> void mergeSort(T[] arrayToSort,
   T[] resArray) {
  for (int iter = 0; iter < arrayToSort.length; ++iter) {
   resArray[iter] = arrayToSort[iter];
  }
  if (arrayToSort.length < 2) {
   resArray = arrayToSort;
  }
  int lo = 0;
  int hi = arrayToSort.length - 1;
  mergeSort(arrayToSort, resArray, lo, hi);
 }

 <T extends Comparable<T>> void mergeSort(T[] arrayToSort, T[] resArray,
   int lo, int hi) {
  if (lo >= hi) {
   return;
  }
  int mid = lo + (hi - lo) / 2;
  mergeSort(arrayToSort, resArray, lo, mid);
  mergeSort(arrayToSort, resArray, mid + 1, hi);
  merge(arrayToSort, resArray, lo, mid, hi);
 }

 <T extends Comparable<T>> void merge(T[] arrayToSort, T[] resArray, int lo,
   int mid, int hi) {
  if (!isSorted(resArray, lo, mid)) {
   return;
  }
  if (!isSorted(resArray, mid + 1, hi)) {
   return;
  }
  for (int iterRes = lo; iterRes <= hi; ++iterRes) {
   arrayToSort[iterRes] = resArray[iterRes];
  }
  int iter1 = lo;
  int iter2 = mid + 1;
  for (int iterRes = lo; iterRes <= hi; ++iterRes) {
   if (iter1 > mid) {
    resArray[iterRes] = arrayToSort[iter2];
    ++iter2;
   } else if (iter2 > hi) {
    resArray[iterRes] = arrayToSort[iter1];
    ++iter1;
   } else if (arrayToSort[iter1].compareTo(arrayToSort[iter2]) <= 0) {
    resArray[iterRes] = arrayToSort[iter1];
    ++iter1;
   } else {
    resArray[iterRes] = arrayToSort[iter2];
    ++iter2;
   }
  }
 }

 <T extends Comparable<T>> boolean isSorted(T[] array, int lo, int hi) {
  for (int iter = lo + 1; iter <= hi; ++iter) {
   if (array[iter].compareTo(array[iter - 1]) < 0) {
    return false;
   }
  }
  return true;
 }

Is it used by Java?
In accordance with Java 7 specification, merge sort algorithm is used by Collectons.sort(List<T>) as well as by Arrays.sort(Object[]) method. However  Arrays,sort(int[]) uses quicksort algorithm and other 'sort' methods which has array of primitives as parameter uses quicksort algorithm. Why? Looks like it's been implemented in that way because of memory. It's assumed that if software developer uses primitives in his/her code developer is aware of memory usage - quicksort doesn't need to create any additional structure to store temporary values unlike merge sort.

JavaScript
In a JavaScript source code MergeSort looks much better, thanks to slice and concat methods

function merge(left, right){
    var resArr  = [];
    var iter1 = 0;
    var iter2 = 0;

    while (iter1 < left.length && iter2 < right.length){
        if (left[iter1] < right[iter2]){
            resArr.push(left[iter1]);
            ++iter1;
        } else {
            resArr.push(right[iter2]);
            ++iter2;
        }
    }

    return resArr.concat(left.slice(iter1)).concat(right.slice(iter2));
}

function mergeSort(items){

    if (items.length < 2) {
        return items;
    }

    var middle = Math.floor(items.length / 2);
    var left = items.slice(0, middle);
    var right = items.slice(middle);

    return merge(mergeSort(left), mergeSort(right));
}

No comments:

Post a Comment