# [Heap heap] Introduction to binary Heap, source code implementation and minimum Heap to solve the problem of TOPK

Thank you, little brother Love data structure It's a classic.
Thank you, little brother Love data structure It's a classic.
Thank you, little brother Love data structure It's a classic.

# Heap

## Emergence of heap

A data structure is designed to store integers. Three interfaces are required

• Get maximum
• Delete Max

Compare the time complexity with the learned data structure:

In order to meet this demand, a new data structure - heap

• Get the maximum value: O(logn)O(logn)O(logn)
• Delete Max: O(logn)O(logn)O(logn)

## Brief introduction of heap

Heap is a tree data structure. Common heap implementations include

• Binary Heap
• Multi fork heap (D-heap, D-ary Heap)
• Index Heap
• Binomial Heap
• Fibonacci Heap
• Leftist Heap
• Skew Heap

Important properties of heap: the value of any node is always ≥ \ geq ≥ (⩽ \ leqslant ⩽) the value of the child node

• If the value of any node is always ≥ \ geq ≥ the value of the child node, it is called: maximum heap, large root heap and large top heap
• If the value of any node is always the value of \ leqslant ⩽ child node, it is called: minimum heap, small root heap, small top heap

Elements in the heap must be comparable (like a binary search tree)

# Binary Heap

The logical structure of binary heap is a complete binary tree, so it is also called complete binary heap

In view of some characteristics of complete binary tree, the bottom layer (physical structure) of binary heap is usually realized by array

The law of index i (n is the number of elements)

• If i = 0, it is the root node
• If I > 0, the index of its parent node is floor ((I - 1) / 2)
• If 2i+1 ⩽ n − 12i+1 \ leqslant n - 12i+1 ⩽ n − 1, the index of its left child node is 2i+12i + 12i+1
• If 2I + 1 > n − 12I + 1 > n - 12I + 1 > n − 1, it has no left child node
• If 2i+2 ⩽ n − 12i+2 \ leqslant n - 12i+2 ⩽ n − 1, the index of its right child node is 2i+12i + 12i+1
• If 2I + 2 > n − 12I + 2 > n - 12I + 2 > n − 1, it has no right child node

## Get maximum

It mainly checks whether the array cannot be empty, and throws an exception if it is empty.

```public E get() {
emptyCheck();
return elements[0];
}
private void emptyCheck() {
if (size == 0) {
throw new IndexOutOfBoundsException("Heap is empty");
}
}
```

The train of thought is:

• Put the added node at the end of the array

• Loop up (Sift Up), that is:

• If node > parent node
Swap location with parent
• If node < = parent node, or node has no parent node
Exit loop
• Time complexity: O(logn)O(logn)O(logn)

```/**
*/
// Check that the passed in element is not empty
elementNotNullCheck(element);
// Expand the capacity to ensure that the capacity is greater than the current number of elements
ensureCapacity(size + 1);
// Put the element to be added at the end of the array
elements[size++] = element;
// Up filtration operation
siftUp(size - 1);
}
/**
* Let the elements in index position filter up
*/
private void siftUp(int index) {
E e = elements[index]; // Nodes to add
while (index > 0) { // Until the root node is compared
int pindex = (index - 1) >> 1; // Parent node index
E p = elements[pindex]; // Add parent of node
if (compare(e, p) <= 0) return;

// Exchange the contents of index and pindex locations
E tmp = elements[index];
elements[index] = elements[pindex];
elements[pindex] = tmp;

index = pindex;
}
}
```

## Maximum heap - add optimization

The difference from the above is:

• Not every time the value of the node > the value of the parent node is swapped directly
• Instead, the newly added nodes are backed up and the final location is determined

```/**
*/
// Check that the passed in element is not empty
elementNotNullCheck(element);
// Expand the capacity to ensure that the capacity is greater than the current number of elements
ensureCapacity(size + 1);
// Put the element to be added at the end of the array
elements[size++] = element;
// Up filtration operation
siftUp(size - 1);
}
/**
* Let the elements in index position filter up
*/
private void siftUp(int index) {

E element = elements[index];
while (index > 0) {
// Parent node index = (child node index-1) / 2
int parentIndex = (index - 1) >> 1;
E parent = elements[parentIndex];
if (compare(element, parent) <= 0) break;

// Store parent element in index location
elements[index] = parent;

index = parentIndex;
}
elements[index] = element;
}
```

## Maximum heap - delete

The train of thought is:

• Cover the root node with the last node
• Delete last node
• The loop performs a sift down operation, that is:
• If node < the largest child node
Swap location with the largest child node
• If node > = the largest child node, or node has no child nodes
Exit loop
• Time complexity: O(logn)O(logn)O(logn)
• Swap location operations can be optimized like adding

```/**
* Max heap - delete top element
*/
public E remove() {
// Detection array cannot be empty
emptyCheck();
// Get the index of the last element of the array
int lastIndex = --size;
// Get the node to delete the element
E root = elements[0];
elements[0] = elements[lastIndex];
elements[lastIndex] = null;

siftDown(0); // Filtration operation
return root;
}
/**
* Let the elements in index position filter down
*/
private void siftDown(int index) {
E element = elements[index];
int half = size >> 1; // Number of non leaf nodes
// Index of the first leaf = = number of non leaf nodes
// Index < index of the first leaf node
// It must be ensured that the index position is a non leaf node
while (index < half) {
// There are two types of nodes in index
// 1. Only the left child node
// 2. There are left and right child nodes at the same time

// By default, the left child node is compared with it
int childIndex = (index << 1) + 1;
E child = elements[childIndex];

// Right child node
int rightIndex = childIndex + 1;

// Select the one with the largest left and right child nodes
if (rightIndex < size && compare(elements[rightIndex], child) > 0) {
child = elements[childIndex = rightIndex];
}

if (compare(element, child) >= 0) break;

// Store the child nodes in the index location
elements[index] = child;
// Reset index
index = childIndex;
}
elements[index] = element;
}
```

## replace

```public E replace(E element) {
elementNotNullCheck(element);

E root = null;
if (size == 0) {
elements[0] = element;
size++;
} else {
root = elements[0];
elements[0] = element;
siftDown(0);
}
return root;
}
```

## Maximum heap - heapfy

• Top down filter
• Bottom up filter

### Efficiency contrast

• Top down time complexity of up filtering: O(nlogn)O(nlogn)O(nlogn)
• Bottom up filtering time complexity: O(nlogk)O(nlogk)O(nlogk)

# Binary heap source code

## Heap.java, the basic interface of heap

```public interface Heap<E> {
int size();	// Number of elements
boolean isEmpty();	// Is it empty?
void clear();	// empty
E get();	// Get the top element
E remove(); // Delete the top of heap element
E replace(E element); // Insert a new element while deleting the top of heap element
}
```

## Abstract class AbstractHeap.java

```@SuppressWarnings("unchecked")
public abstract class AbstractHeap<E> implements Heap<E> {
protected int size;
protected Comparator<E> comparator;

public AbstractHeap(Comparator<E> comparator) {
this.comparator = comparator;
}

public AbstractHeap() {
this(null);
}

@Override
public int size() {
return size;
}

@Override
public boolean isEmpty() {
return size == 0;
}

protected int compare(E e1, E e2) {
return comparator != null ? comparator.compare(e1, e2)
: ((Comparable<E>)e1).compareTo(e2);
}
}
```

## Binary heap.java

```/**
* Binary reactor (maximum reactor)
*/
@SuppressWarnings("unchecked")
public class BinaryHeap<E> extends AbstractHeap<E> {
private E[] elements;
private static final int DEFAULT_CAPACITY = 10;

public BinaryHeap(E[] elements, Comparator<E> comparator)  {
super(comparator);

if (elements == null || elements.length == 0) {
this.elements = (E[]) new Object[DEFAULT_CAPACITY];
} else {
size = elements.length;
int capacity = Math.max(elements.length, DEFAULT_CAPACITY);
this.elements = (E[]) new Object[capacity];
for (int i = 0; i < elements.length; i++) {
this.elements[i] = elements[i];
}
heapify();
}
}

public BinaryHeap(E[] elements)  {
this(elements, null);
}

public BinaryHeap(Comparator<E> comparator) {
this(null, comparator);
}

public BinaryHeap() {
this(null, null);
}

@Override
public void clear() {
for (int i = 0; i < size; i++) {
elements[i] = null;
}
size = 0;
}

@Override
elementNotNullCheck(element);
ensureCapacity(size + 1);
elements[size++] = element;
siftUp(size - 1);
}

@Override
public E get() {
emptyCheck();
return elements[0];
}

/**
* Delete the top of heap element
*/
public E remove() {
emptyCheck();

int lastIndex = --size;
E root = elements[0];
elements[0] = elements[lastIndex];
elements[lastIndex] = null;

siftDown(0);
return root;
}

@Override
public E replace(E element) {
elementNotNullCheck(element);

E root = null;
if (size == 0) {
elements[0] = element;
size++;
} else {
root = elements[0];
elements[0] = element;
siftDown(0);
}
return root;
}

/**
* Batch building
*/
private void heapify() {
// Top down filter
//		for (int i = 1; i < size; i++) {
//			siftUp(i);
//		}

// Bottom up filter
for (int i = (size >> 1) - 1; i >= 0; i--) {
siftDown(i);
}
}

/**
* Let the elements in index position filter down
* @param index
*/
private void siftDown(int index) {
E element = elements[index];
int half = size >> 1; // Number of non leaf nodes
// Index of the first leaf = = number of non leaf nodes
// Index < index of the first leaf node
// It must be ensured that the index position is a non leaf node
while (index < half) {
// There are two types of nodes in index
// 1. Only the left child node
// 2. There are left and right child nodes at the same time

// By default, the left child node is compared with it
int childIndex = (index << 1) + 1;
E child = elements[childIndex];

// Right child node
int rightIndex = childIndex + 1;

// Select the one with the largest left and right child nodes
if (rightIndex < size && compare(elements[rightIndex], child) > 0) {
child = elements[childIndex = rightIndex];
}

if (compare(element, child) >= 0) break;

// Store the child nodes in the index location
elements[index] = child;
// Reset index
index = childIndex;
}
elements[index] = element;
}

/**
* Let the elements in index position filter up
*/
private void siftUp(int index) {
//		E e = elements[index];
//		while (index > 0) {
//			int pindex = (index - 1) >> 1;
//			E p = elements[pindex];
//			if (compare(e, p) <= 0) return;
//
//			//Exchange the contents of index and pindex locations
//			E tmp = elements[index];
//			elements[index] = elements[pindex];
//			elements[pindex] = tmp;
//
//			index = pindex;
//		}
E element = elements[index];
while (index > 0) {
// Parent node index = (child node index-1) / 2
int parentIndex = (index - 1) >> 1;
E parent = elements[parentIndex];
if (compare(element, parent) <= 0) break;

// Store parent element in index location
elements[index] = parent;

index = parentIndex;
}
elements[index] = element;
}

private void ensureCapacity(int capacity) {
int oldCapacity = elements.length;
if (oldCapacity >= capacity) return;

// New capacity is 1.5 times the old capacity
int newCapacity = oldCapacity + (oldCapacity >> 1);
E[] newElements = (E[]) new Object[newCapacity];
for (int i = 0; i < size; i++) {
newElements[i] = elements[i];
}
elements = newElements;
}

private void emptyCheck() {
if (size == 0) {
throw new IndexOutOfBoundsException("Heap is empty");
}
}

private void elementNotNullCheck(E element) {
if (element == null) {
throw new IllegalArgumentException("element must not be null");
}
}
}
```

# Build a minimum heap

After writing the maximum heap, the implementation of the minimum heap does not need to modify the source code, but only needs to pass in the comparator which is the opposite way to the maximum heap when creating the heap.

```public static void main(String[] args) {
Integer[] data = {88, 44, 53, 41, 16, 6, 70, 18, 85, 98, 81, 23, 36, 43, 37};
BinaryHeap<Integer> heap = new BinaryHeap<>(data, new Comparator<Integer>() {
public int compare(Integer o1, Integer o2) {
return o2 - o1; // Opposite to the maximum heap
}
});
}
```

# TOP K problem

What is the TopK problem

• From the N integers, find the largest number of first k (k < n)
• For example: find the largest 100 integers out of 1 million integers

One of the solutions of TopK problem: it can be solved by data structure "heap"

The time complexity of O(nlogn)O(nlogn)O(nlogn) O (nlogn) is required if the sorting algorithm is used for full sorting

If binary heap is used, the time complexity of O(nlogk)O(nlogk)O(nlogk) can be used

• Create a new small top heap
• Scan n integers
• First, put the first k traversals into the heap
• Starting from the number k+1, if it is greater than the heap top element, use the replace operation
(delete the top element and add the number k+1 to the heap)
• After scanning, all that remains in the heap is the largest number of first k
```public static void main(String[] args) {
// Create a new small top heap
BinaryHeap<Integer> heap = new BinaryHeap<>(new Comparator<Integer>() {
public int compare(Integer o1, Integer o2) {
return o2 - o1;
}
});

// Find the largest number of first k
int k = 3;
Integer[] data = {51, 30, 39, 92, 74, 25, 16, 93,
91, 19, 54, 47, 73, 62, 76, 63, 35, 18,
90, 6, 65, 49, 3, 26, 61, 21, 48};
for (int i = 0; i < data.length; i++) {
if (heap.size() < k) { // First k added to small top heap
} else if (data[i] > heap.get()) { // If it is the k + 1 number, and it is greater than the top element of the heap
heap.replace(data[i]); // logk
}
}
// O(nlogk)
}
```

What if we find the smallest number of the first k?

• Use big top pile
• Use the replace operation if it is smaller than the heap top element
140 original articles published, 30 praised, 10000 visitors+

Tags: Java

Posted on Sat, 14 Mar 2020 06:21:44 -0700 by drifter