Huffman tree

1, Introduction to Huffman tree

1. Constructing Huffman tree

Calculate the probability of occurrence of each character, and use this probability as the ratio of weight.

The Huffman tree is constructed by using these weighted characters.

Specific construction steps:

Firstly, each character with weight is taken as root node to form forest, and each node stores corresponding weight.

Secondly, two trees with the least weight are selected as the left and right subtrees of a new tree, and the weight of the new tree is equal to the sum of the left and right subtrees.

③ remove the previous two trees from the forest and add new trees to the forest.

④ repeat steps 2 and 3 until only one tree left in the forest is Huffman tree.



2. Huffman code

Huffman coding is the basis of modern compression algorithm.

If you want to transfer the string ABC directly using ASCII code is a bit lengthy. In addition, for the other party to be able to decode, it can be encoded as follows:

In fact, it's a little wasteful. The essence of Huffman coding is to use the character with the lowest frequency to encode into the longest binary number, while the character with higher frequency to encode into the shorter binary number, which saves a lot of space.

Huffman coding is actually based on Huffman tree, which stipulates that left is 0 and right is 0. Finally, the Huffman code of the corresponding element is the digital combination from the root node to the corresponding leaf node.



3. Huffman tree summary

The Huffman tree constructed with n weighted characters has n leaf nodes. In other words, the last n characters in the Huffman tree are leaf nodes. Therefore, it is doomed that the Huffman code of each character cannot be the prefix of other character codes.

Huffman tree is the shortest tree with weight path,





2, Huffman tree realization

1, analysis

The implementation of Huffman tree needs to take out the two trees with the least weight from the forest, so it can be realized with priority queue or the minimum heap. The bottom layer of the priority queue of jdk source code suddenly finds that the minimum heap is used, so you need to pay attention to it when comparing!


2, implementation

	public class HuffmanTree<E> {
	    private int size = 0;
	    private Node<E> root;
	    private PriorityQueue<Node<E>> forest;
	    private E nodeName;
	
	    /**
	     * Name the generated node
	     *
	     * @param nodeName Redundant node name
	     */
	    public HuffmanTree(E nodeName) {
	        this.nodeName = nodeName;
	    }
	
	    public HuffmanTree() {
	        this(null);
	    }
	
	    private static class Node<E> implements Comparable {
	        int weight;
	        E element;
	        Node<E> parent;
	        Node<E> left;
	        Node<E> right;
	
	        public Node(int weight, E element, Node<E> parent) {
	            this.weight = weight;
	            this.element = element;
	            this.parent = parent;
	        }
	
	        @Override
	        public int compareTo(Object o) {
	            return this.weight - ((Node<E>) o).weight;
	        }
	    }
	
	    /**
	     * Generate forest from incoming HashMap
	     */
	    public void generateForest(HashMap<E, Integer> map) {
	        forest = new PriorityQueue<>();
	        Set<E> sets = map.keySet();
	        for (E e : sets) {
	            forest.offer(new Node<>(map.get(e), e, null));
	        }
	    }
	
	    public void doHuffmanTree() {
	        if (root == null)
	            root = new Node<E>(0, null, null);
	        this.size = (forest.size() << 1) - 1;
	        while (forest.size() != 1) {
	            Node<E> left = forest.remove();
	            Node<E> right = forest.remove();
	            if (left == null || right == null)
	                return;
	            Node<E> newNode = new Node<>(left.weight + right.weight, nodeName, null);
	            newNode.left = left;
	            newNode.right = right;
	            root = newNode;
	            forest.offer(newNode);
	        }
	    }
	
	    /**
	     * Order traversal in Huffman tree
	     */
	    public void traversal() {
	        inOrderTraversal(root);
	    }
	
	    private void inOrderTraversal(Node<E> node) {
	        if (node == null)
	            return;
	        inOrderTraversal(node.left);
	        System.out.print(node.element + " ");
	        inOrderTraversal(node.right);
	    }
	
	    public static void main(String[] args) {
	        HashMap<String, Integer> map = new HashMap<>();
	
	        map.put("B", 3);
	        map.put("C", 8);
	        map.put("D", 6);
	        map.put("A", 1);
	        map.put("E", 2);
	        HuffmanTree<String> huffmanTree = new HuffmanTree<>("tempNode");
	        huffmanTree.generateForest(map);
	        huffmanTree.doHuffmanTree();
	        System.out.println(huffmanTree.size);
	        huffmanTree.traversal();
	    }
	}
Published 49 original articles, won praise 5, visited 4301
Private letter follow

Tags: ascii JDK

Posted on Thu, 05 Mar 2020 05:28:40 -0800 by 01chris