What is Merge Sort?

Merge Sort is one of the most widely used sorting algorithms built on the “divide and conquer” technique. The problem is divided into subproblems, each subproblem is sorted, and the sorted results are merged back together. Suppose we have an unsorted list of numbers, such as a class's results in a certain subject. To sort them in ascending order, we need to arrange them from lowest to highest. Merge Sort does this by dividing the list into smaller lists, sorting those, and then merging the sorted results.

Merge Sort in Java can be explained with the example array {6,9,8,2,4,1}; think of it as the results of a class test scored out of 10. The array is repeatedly divided into smaller chunks until each chunk has size 1. The chunks are then merged back together, and the numbers are put in order during each merge. The result runs from the lowest mark to the highest.

Figure: Merge Sort in Java - 1

The array is first divided into two arrays of 3 elements each, as shown in step 2 of the figure, and the division continues until every subarray holds a single element in step 4. The algorithm then starts to sort the numbers one step at a time (step 5) and merges them into progressively larger arrays in steps 6 and 7.
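
To make the divide and merge steps on {6,9,8,2,4,1} easier to follow, here is a minimal sketch that logs every split and every merge. The class name MergeSortTrace and the log format are assumptions made for this illustration, and the exact split points may differ from the ones drawn in the figure.

```java
import java.util.Arrays;

class MergeSortTrace {
    // Returns a sorted copy of a and appends one line per split and per merge to log.
    // indent grows with the recursion depth so the nesting is visible in the log.
    static int[] sort(int[] a, String indent, StringBuilder log) {
        if (a.length < 2) {
            return a; // a single element is already sorted
        }
        int mid = a.length / 2;
        int[] left = Arrays.copyOfRange(a, 0, mid);
        int[] right = Arrays.copyOfRange(a, mid, a.length);
        log.append(indent).append("split ").append(Arrays.toString(a))
           .append(" -> ").append(Arrays.toString(left))
           .append(" ").append(Arrays.toString(right)).append('\n');

        left = sort(left, indent + "  ", log);
        right = sort(right, indent + "  ", log);

        // Merge the two sorted halves into a new array.
        int[] merged = new int[a.length];
        int i = 0, j = 0, k = 0;
        while (i < left.length && j < right.length) {
            merged[k++] = (left[i] <= right[j]) ? left[i++] : right[j++];
        }
        while (i < left.length) {
            merged[k++] = left[i++];
        }
        while (j < right.length) {
            merged[k++] = right[j++];
        }
        log.append(indent).append("merge -> ").append(Arrays.toString(merged)).append('\n');
        return merged;
    }

    public static void main(String[] args) {
        StringBuilder log = new StringBuilder();
        sort(new int[]{6, 9, 8, 2, 4, 1}, "", log);
        System.out.print(log);
    }
}
```

Reading the log from top to bottom shows the array shrinking to single elements first and the sorted pieces growing back afterwards.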

Applications of Merge Sort

Merge Sort is widely used in various domains due to its efficiency and stability. Some of its key applications include:

  • Sorting Linked Lists: Merge Sort is particularly suitable for linked lists because it does not require random access to elements, unlike algorithms like Quick Sort.
  • Large Data Sets: Merge Sort is used for sorting large data sets because it guarantees a worst-case time complexity of O(n log n).
  • External Sorting: Merge Sort is ideal for external sorting scenarios, such as sorting data stored on disk, as it processes data in chunks that fit into memory.
  • Stable Sorting Needs: It is used in applications where stability (maintaining the relative order of equal elements) is critical, such as in database systems and libraries.
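
The external sorting case can be sketched as a merge of several runs that are already sorted. In the example below, in-memory int arrays stand in for sorted chunks that would normally live on disk; the class name ChunkMerge and the method mergeSortedRuns are hypothetical, and the sketch does not try to preserve the order of equal values across runs.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.PriorityQueue;

class ChunkMerge {
    // Merges any number of individually sorted runs into one sorted list.
    static List<Integer> mergeSortedRuns(List<int[]> runs) {
        // Each heap entry is {value, run index, position inside that run}.
        PriorityQueue<int[]> heap =
                new PriorityQueue<>((x, y) -> Integer.compare(x[0], y[0]));
        for (int r = 0; r < runs.size(); r++) {
            if (runs.get(r).length > 0) {
                heap.add(new int[]{runs.get(r)[0], r, 0});
            }
        }

        List<Integer> out = new ArrayList<>();
        while (!heap.isEmpty()) {
            int[] smallest = heap.poll();
            out.add(smallest[0]);
            // Refill the heap with the next element from the same run, if any.
            int[] run = runs.get(smallest[1]);
            int next = smallest[2] + 1;
            if (next < run.length) {
                heap.add(new int[]{run[next], smallest[1], next});
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<int[]> runs = Arrays.asList(
                new int[]{1, 4, 9}, new int[]{2, 6}, new int[]{3, 8});
        System.out.println(mergeSortedRuns(runs)); // prints [1, 2, 3, 4, 6, 8, 9]
    }
}
```

Only one element per run sits in the heap at a time, which is why this style of merge suits data that does not fit in memory all at once.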

Implementation

In this section we implement the Merge Sort algorithm in Java. In its general form the algorithm takes three parameters: the input array and the indices of the first and last elements of the range to sort. The pseudocode below shows the general working of the Merge Sort algorithm; the Java example later in this article uses a simpler signature that takes the array and its length.

Merge_Sort_Algo (Array, Beginning, End)
/** Three parameters required for the Merge Sort Algorithm
 * Array = the array to sort
 * Beginning = the index of the first element of the range
 * End = the index of the last element of the range */

    if (Beginning < End) // the range holds at least two elements

        set Middle = (Beginning + End) / 2 // integer division; index of the midpoint

        Merge_Sort_Algo (Array, Beginning, Middle) // sort the elements from Beginning to Middle

        Merge_Sort_Algo (Array, Middle + 1, End) // sort the elements from Middle + 1 to End

        Merge (Array, Beginning, Middle, End) // merge the two sorted halves

    end of if

End Merge_Sort_Algo

First, the if condition checks that the range from Beginning to End holds more than one element, and Beginning and End are used to compute Middle. Next, the range is split into two subarrays: one from Beginning to Middle and the other from Middle + 1 to End. The splitting continues recursively until each subarray has length 1. The Merge function then combines the sorted subarrays Beginning to Middle and Middle + 1 to End, level by level, until the whole array is sorted.
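
The pseudocode can be turned into Java fairly directly. The following is a minimal sketch, not the only possible implementation: the class name IndexMergeSort is an assumption, the merge step copies both halves into temporary arrays, and the midpoint is written as beginning + (end - beginning) / 2, which gives the same result as (Beginning + End) / 2 but cannot overflow for very large indices.

```java
import java.util.Arrays;

class IndexMergeSort {
    // Sorts array[beginning..end], both indices inclusive.
    static void mergeSort(int[] array, int beginning, int end) {
        if (beginning < end) {
            int middle = beginning + (end - beginning) / 2;
            mergeSort(array, beginning, middle);
            mergeSort(array, middle + 1, end);
            merge(array, beginning, middle, end);
        }
    }

    // Merges the sorted ranges array[beginning..middle] and array[middle+1..end].
    static void merge(int[] array, int beginning, int middle, int end) {
        int[] left = Arrays.copyOfRange(array, beginning, middle + 1);
        int[] right = Arrays.copyOfRange(array, middle + 1, end + 1);
        int i = 0, j = 0, k = beginning;
        while (i < left.length && j < right.length) {
            array[k++] = (left[i] <= right[j]) ? left[i++] : right[j++];
        }
        while (i < left.length) {
            array[k++] = left[i++];
        }
        while (j < right.length) {
            array[k++] = right[j++];
        }
    }

    public static void main(String[] args) {
        int[] scores = {6, 9, 8, 2, 4, 1};
        mergeSort(scores, 0, scores.length - 1);
        System.out.println(Arrays.toString(scores)); // prints [1, 2, 4, 6, 8, 9]
    }
}
```

Note that the first call passes 0 and length - 1, because End is the index of the last element rather than the length of the array.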

Example

The following code in Java explains the merge sort algorithm:

import java.util.Arrays;

class MergeSortExample {

    // Merges the sorted arrays leftHalf and rightHalf into array.
    // leftLength and rightLength are the number of elements in each half.
    public static void merge(int[] array, int[] leftHalf, int[] rightHalf,
                             int leftLength, int rightLength) {
        int i = 0, j = 0, k = 0; // positions in leftHalf, rightHalf and array

        // Copy the smaller front element until one half is used up.
        // Using <= takes from the left half on ties, which keeps the sort stable.
        while (i < leftLength && j < rightLength) {
            if (leftHalf[i] <= rightHalf[j]) {
                array[k++] = leftHalf[i++];
            } else {
                array[k++] = rightHalf[j++];
            }
        }

        // Copy whatever remains in the left half.
        while (i < leftLength) {
            array[k++] = leftHalf[i++];
        }

        // Copy whatever remains in the right half.
        while (j < rightLength) {
            array[k++] = rightHalf[j++];
        }
    }

    // Sorts the first length elements of array in ascending order.
    public static void mergeSort(int[] array, int length) {
        if (length < 2) { // zero or one element is already sorted
            return;
        }

        int middle = length / 2; // split point

        int[] leftHalf = new int[middle];           // first half
        int[] rightHalf = new int[length - middle]; // second half

        // Copy each half of the input into its own array.
        for (int i = 0; i < middle; i++) {
            leftHalf[i] = array[i];
        }
        for (int i = middle; i < length; i++) {
            rightHalf[i - middle] = array[i];
        }

        mergeSort(leftHalf, middle);           // sort the first half
        mergeSort(rightHalf, length - middle); // sort the second half

        // Merge the two sorted halves back into array.
        merge(array, leftHalf, rightHalf, middle, length - middle);
    }

    public static void main(String[] args) {
        int[] testScores = {6, 9, 8, 2, 4, 1};
        int size = testScores.length;

        System.out.println("Original Array " + Arrays.toString(testScores) + "\n");

        mergeSort(testScores, size);

        System.out.println("After Merge Sort " + Arrays.toString(testScores) + "\n");
    }
}

Output

Original Array [6, 9, 8, 2, 4, 1]

After Merge Sort [1, 2, 4, 6, 8, 9]

Advantages of Merge Sort

Merge Sort offers several advantages that make it a preferred choice in specific scenarios:

  • Stability: Merge Sort is a stable sorting algorithm, meaning that it preserves the relative order of equal elements. This is essential in applications where maintaining this order is necessary, such as sorting data with multiple criteria.
  • Suitability for Linked Lists: Unlike sorting algorithms that depend on random access to elements, Merge Sort works efficiently with linked lists. Splitting and merging only require following and relinking node pointers, so the list can be sorted without copying its elements into temporary arrays.
  • Consistent Performance: Merge Sort provides a guaranteed O(n log n) time complexity in all cases (best, average, and worst), making it predictable and reliable for performance-critical applications.
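
Stability is easiest to see with records that share a key. The sketch below sorts test results by score only; the Result class, the student names and the method sortByScore are made up for this illustration. Because the merge step takes from the left half whenever two scores are equal, students with the same score stay in their original order.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class StableMergeDemo {
    // A hypothetical record of one student's test score.
    static final class Result {
        final String student;
        final int score;

        Result(String student, int score) {
            this.student = student;
            this.score = score;
        }
    }

    // Returns a new list sorted by score in ascending order.
    static List<Result> sortByScore(List<Result> items) {
        if (items.size() < 2) {
            return new ArrayList<>(items);
        }
        int mid = items.size() / 2;
        List<Result> left = sortByScore(items.subList(0, mid));
        List<Result> right = sortByScore(items.subList(mid, items.size()));

        List<Result> merged = new ArrayList<>(items.size());
        int i = 0, j = 0;
        while (i < left.size() && j < right.size()) {
            // <= prefers the left element on ties, which preserves input order.
            if (left.get(i).score <= right.get(j).score) {
                merged.add(left.get(i++));
            } else {
                merged.add(right.get(j++));
            }
        }
        while (i < left.size()) {
            merged.add(left.get(i++));
        }
        while (j < right.size()) {
            merged.add(right.get(j++));
        }
        return merged;
    }

    public static void main(String[] args) {
        List<Result> results = Arrays.asList(
                new Result("Ana", 8), new Result("Ben", 6),
                new Result("Cal", 8), new Result("Dee", 6));
        for (Result r : sortByScore(results)) {
            System.out.println(r.score + " " + r.student);
        }
        // prints 6 Ben, 6 Dee, 8 Ana, 8 Cal (one per line)
    }
}
```

Changing the comparison from <= to < would still sort by score, but equal scores could then swap places, and the sort would no longer be stable.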

Example: Merge Sort with a Linked List

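// A node of a singly linked list of ints.
class Node {
    int value;
    Node next;

    Node(int value) {
        this.value = value;
        this.next = null;
    }
}

public class MergeSortLinkedList {
    // Sorts the list that starts at head and returns the head of the sorted list.
    public static Node mergeSort(Node head) {
        // An empty or single-node list is already sorted.
        if (head == null || head.next == null) {
            return head;
        }

        // Cut the list into two halves right after the middle node.
        Node middle = getMiddle(head);
        Node nextOfMiddle = middle.next;
        middle.next = null;

        Node left = mergeSort(head);
        Node right = mergeSort(nextOfMiddle);

        return merge(left, right);
    }

    // Finds the middle node with slow and fast pointers.
    // For an even number of nodes it returns the first of the two middle nodes.
    private static Node getMiddle(Node head) {
        if (head == null) {
            return head;
        }
        Node slow = head, fast = head;
        while (fast.next != null && fast.next.next != null) {
            slow = slow.next;
            fast = fast.next.next;
        }
        return slow;
    }

    // Merges two sorted lists by relinking nodes; no new nodes are created.
    // The recursion depth grows with the combined length of the lists,
    // so very long lists may call for an iterative merge instead.
    private static Node merge(Node left, Node right) {
        Node result = null;
        if (left == null) return right;
        if (right == null) return left;

        // <= takes the left node on ties, which keeps the sort stable.
        if (left.value <= right.value) {
            result = left;
            result.next = merge(left.next, right);
        } else {
            result = right;
            result.next = merge(left, right.next);
        }
        return result;
    }

    public static void main(String[] args) {
        // Build the list 4 -> 2 -> 5 -> 1 -> 3.
        Node head = new Node(4);
        head.next = new Node(2);
        head.next.next = new Node(5);
        head.next.next.next = new Node(1);
        head.next.next.next.next = new Node(3);

        head = mergeSort(head);

        // Print the sorted values: 1 2 3 4 5
        while (head != null) {
            System.out.print(head.value + " ");
            head = head.next;
        }
    }
}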

Efficiency of Merge Sort in Industry Applications

Merge Sort's efficiency and stability make it a popular choice in industry applications. Here are some scenarios where it excels:

  • Big Data Processing: Merge Sort is used in distributed systems like Hadoop and Spark for sorting large datasets across multiple nodes.
  • Database Management: Many database engines use variants of Merge Sort, typically external merge sort, to sort data during query processing and index creation, because the data being sorted may not fit in memory.
  • Sorting in Libraries: The stable sorts in the standard libraries of Python and Java are derived from Merge Sort. Python's built-in sort and Java's sort for object arrays and lists use Timsort, a hybrid of Merge Sort and insertion sort.
  • Scientific Computing: Merge Sort is used in scientific applications where predictable performance and stability are essential for handling large numerical datasets.
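
As a small illustration of the library case, Java's Arrays.sort for object arrays is documented as a stable sort, so equal elements keep their input order. The class name LibraryStableSort and the sample words below are assumptions chosen for this example.

```java
import java.util.Arrays;
import java.util.Comparator;

class LibraryStableSort {
    // Returns a copy of words sorted by length only.
    // Words of equal length keep their original relative order.
    static String[] sortByLength(String[] words) {
        String[] copy = Arrays.copyOf(words, words.length);
        Arrays.sort(copy, Comparator.comparingInt(String::length));
        return copy;
    }

    public static void main(String[] args) {
        String[] words = {"pear", "fig", "kiwi", "plum", "date"};
        System.out.println(Arrays.toString(sortByLength(words)));
        // prints [fig, pear, kiwi, plum, date]
    }
}
```

The four-letter words come out in the same order they went in, which is the behavior a stable merge-based sort guarantees.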

Detailed Complexity Analysis of Merge Sort

Merge Sort divides an array into smaller subarrays, sorts them, and then merges the sorted subarrays. This process leads to predictable and efficient performance characteristics.

Time Complexity

  • Divide Step: The array is divided into two halves at each level of recursion. Finding the midpoint takes O(1) time. An array-based implementation that copies each half into a new array spends O(n) per level on copying, which does not change the overall bound.
  • Merge Step: Merging two sorted halves takes O(n) time, where n is the total number of elements being merged.
  • Recursive Calls: At each level, the array is divided into smaller subarrays until each subarray contains only one element. The depth of the recursion tree is about log₂(n).

Each of the roughly log₂(n) levels does O(n) work in total across all of its merges, so the overall time complexity of Merge Sort is O(n log n) in all cases.
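
The same bound can be derived from the recurrence for the running time. The derivation below is a sketch under simplifying assumptions: n is a power of two, c is a constant covering the per-element cost of dividing and merging, and c_0 is the constant cost of a one-element array. These symbols are introduced here for illustration only.

```latex
\begin{aligned}
T(n) &= 2\,T(n/2) + c\,n, \qquad T(1) = c_0 \\
     &= 4\,T(n/4) + 2\,c\,n \\
     &= 2^{k}\,T\!\left(n/2^{k}\right) + k\,c\,n \\
     &= n\,c_0 + c\,n\log_2 n \qquad \text{(taking } k = \log_2 n\text{)} \\
     &= O(n \log n)
\end{aligned}
```

Each unrolling step adds one more term of size c n, and there are log₂(n) such steps before the subproblems reach size 1.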

Space Complexity

  • Auxiliary Space: Merge Sort requires additional space to store the temporary arrays used during the merge process. The space complexity is O(n).
  • Recursive Stack: The recursion depth contributes to the stack space, which is O(log n).

Thus, the total space complexity of Merge Sort is O(n), because the O(n) auxiliary space dominates the O(log n) recursive stack space.
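
One way to make the O(n) auxiliary space explicit is to allocate a single buffer once and reuse it in every merge. The following is a minimal sketch of that idea; the class name BufferedMergeSort is hypothetical, and it is a variation on the array-based approach rather than the implementation shown earlier in this article.

```java
import java.util.Arrays;

class BufferedMergeSort {
    // Sorts a in ascending order using one shared buffer of the same length.
    static void sort(int[] a) {
        int[] buffer = new int[a.length]; // the only auxiliary allocation: O(n)
        sort(a, buffer, 0, a.length);
    }

    // Sorts the half-open range a[from..to).
    private static void sort(int[] a, int[] buffer, int from, int to) {
        if (to - from < 2) {
            return;
        }
        int mid = from + (to - from) / 2;
        sort(a, buffer, from, mid);
        sort(a, buffer, mid, to);

        // Copy the range into the buffer, then merge it back into a.
        System.arraycopy(a, from, buffer, from, to - from);
        int i = from, j = mid, k = from;
        while (i < mid && j < to) {
            a[k++] = (buffer[i] <= buffer[j]) ? buffer[i++] : buffer[j++];
        }
        while (i < mid) {
            a[k++] = buffer[i++];
        }
        while (j < to) {
            a[k++] = buffer[j++];
        }
    }

    public static void main(String[] args) {
        int[] scores = {6, 9, 8, 2, 4, 1};
        sort(scores);
        System.out.println(Arrays.toString(scores)); // prints [1, 2, 4, 6, 8, 9]
    }
}
```

The recursion still uses O(log n) stack frames, but no arrays are allocated inside the recursive calls.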

Recursive Nature of Merge Sort

Merge Sort is inherently recursive, breaking the problem into smaller subproblems until each subproblem is trivial (a single-element array). The recursive nature has the following implications:

  • Divide Step: Each recursion divides the array into two halves, so the subproblem size halves at every level and reaches 1 after about log₂(n) levels.
  • Conquer Step: Each recursion level sorts smaller subarrays and merges them into a sorted array.
  • Stack Usage: The recursive calls require stack space proportional to the recursion depth, which is O(log n).

Example of Recursion Tree

The recursion tree for an array of size 8:

 Level 0: [8 elements]
 Level 1: [4 elements] [4 elements]
 Level 2: [2 elements] [2 elements] [2 elements] [2 elements]
 Level 3: [1 element] [1 element] [1 element] [1 element] [1 element] [1 element] [1 element] [1 element]

At each level, the total number of elements remains constant, but the number of subarrays doubles, and their size halves.
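
The shape of the recursion tree can also be computed. The sketch below lists the subarray sizes at each level for a given array length, splitting each size into size / 2 and size - size / 2 as the array-based implementation does. The class name RecursionTree and the method levels are assumptions made for this illustration.

```java
import java.util.ArrayList;
import java.util.List;

class RecursionTree {
    // Returns the subarray sizes at each recursion level for an array of n elements.
    static List<List<Integer>> levels(int n) {
        List<List<Integer>> result = new ArrayList<>();
        List<Integer> current = new ArrayList<>();
        current.add(n);
        while (true) {
            result.add(current);
            List<Integer> next = new ArrayList<>();
            for (int size : current) {
                if (size >= 2) { // subarrays of size 0 or 1 are not split further
                    next.add(size / 2);
                    next.add(size - size / 2);
                }
            }
            if (next.isEmpty()) {
                break;
            }
            current = next;
        }
        return result;
    }

    public static void main(String[] args) {
        List<List<Integer>> tree = levels(8);
        for (int level = 0; level < tree.size(); level++) {
            System.out.println("Level " + level + ": " + tree.get(level));
        }
        // prints four levels: [8], [4, 4], [2, 2, 2, 2] and eight 1s
    }
}
```

For 8 elements the last level is level 3, which matches log₂(8) = 3 rounds of halving.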

Best, Average, and Worst-Case Time Complexities

Merge Sort performs consistently across different scenarios:

  • Best Case: O(n log n)
    • Occurs when the input array is already sorted. The algorithm still divides and merges the array.
  • Average Case: O(n log n)
    • Applies to random or unsorted input arrays. The performance remains consistent due to the divide-and-conquer approach.
  • Worst Case: O(n log n)
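    • No ordering of the input changes the bound. Reverse-sorted input is not especially costly for Merge Sort; a merge needs the most comparisons when the elements of its two halves interleave, and even then the total stays within O(n log n).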
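
To see how little the input order matters, the sketch below counts the element comparisons made by a top-down Merge Sort on three 8-element inputs. The class name ComparisonCount and the sample arrays are assumptions made for this illustration; the counts apply to this particular implementation.

```java
import java.util.Arrays;

class ComparisonCount {
    // Sorts a in place and returns the number of element comparisons made.
    static long sortAndCount(int[] a) {
        if (a.length < 2) {
            return 0;
        }
        int mid = a.length / 2;
        int[] left = Arrays.copyOfRange(a, 0, mid);
        int[] right = Arrays.copyOfRange(a, mid, a.length);
        long count = sortAndCount(left) + sortAndCount(right);

        int i = 0, j = 0, k = 0;
        while (i < left.length && j < right.length) {
            count++; // one comparison per loop iteration
            a[k++] = (left[i] <= right[j]) ? left[i++] : right[j++];
        }
        while (i < left.length) {
            a[k++] = left[i++];
        }
        while (j < right.length) {
            a[k++] = right[j++];
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(sortAndCount(new int[]{1, 2, 3, 4, 5, 6, 7, 8})); // prints 12
        System.out.println(sortAndCount(new int[]{8, 7, 6, 5, 4, 3, 2, 1})); // prints 12
        System.out.println(sortAndCount(new int[]{5, 1, 7, 3, 6, 2, 8, 4})); // prints 17
    }
}
```

All three counts stay below n log₂(n) = 8 × 3 = 24, and they differ only by a constant factor, which is why the best, average, and worst cases share the same O(n log n) bound.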

Conclusion

Merge Sort in Java is a straightforward way to obtain a sorted list from an unsorted list of numbers. It applies the ‘divide and conquer’ method: split the array, sort the parts, and merge them back into a sorted array.