Preface
I think friends who have learned data structures must know Huffman. This great god invented the famous "optimal binary tree". In order to commemorate him, we call it the "Huffman tree". The Huffman tree can be used for Huffman encoding, and the knowledge of encoding is very great, such as for compression, cryptography, etc. Let’s take a look at what the Huffman tree is today.
concept
Of course, one of the routines is that we need to understand some basic concepts.
1. Path length: The path that forms the two nodes from one node in the tree to another node. The number of branches on the path is called the path length.
2. Tree path length: the sum of path lengths from the tree root to each node. What we call a complete binary tree is the shortest path length of this kind.
3. The weighted path length of the tree: If a weighted value is assigned to each leaf node of the tree, the weighted path length of the tree is equal to the sum of the product of the path length from the root node to all leaf nodes and the weight of the leaf node.
So how do we determine whether a tree is an optimal binary tree? Let’s take a look at the following trees:
Their power lengths are:
WPL1: 7*2+5*2+2*2+4*2=36
WPL2: 7*3+5*3+2*1+4*2=46
WPL3: 7*1+5*2+2*3+4*3=35
It is obvious that the third tree has the shortest weight path (you don’t believe it, you can try it. If you can find a shorter one, you will probably win the Turing Award). This is what we call the "optimal binary tree (Hafman tree)". Its construction method is very simple. You select the node with the smallest weight and place it at the bottom of the tree, and place the two smallest connections into a new node. It should be noted that the weight of the formed new node should be equal to the sum of the weights of these two nodes. Then put this new node back into the node where we need to form the tree to continue sorting. In this way, all the nodes with information stored in the Hafman tree are on the leaf nodes.
After finishing the concept, I may be a little bit "unclear".
Let’s give an example to build it clearly.
There is a string: aaaaaaaabbbbaaaacccccccddddddfff
In the first step, we first count the number of times each character appears and call it the weight of the character. a 15 ,b 5, c 8, d 6, f 3.
The second step is to find the two characters with the smallest weight, b5 and f3, and build the node.
Then remove f3 and b5, now a15, c8, d6, fb8.
Step 3, repeat the second step until only one node is left is built.
It's now dfb14, a15, c8.
at last,
OK, so our Huffman tree is constructed.
Steps to build
According to the above logic, to summarize it, there are a few steps:
1. Statistics the number of characters and characters appearing in a string;
2. Create nodes according to the structure of the first step;
3. Sort the node weights ascending order;
4. Take out the two nodes with the smallest weight and generate a new parent node;
5. Delete the two nodes with the smallest weight and store the parent node in the list;
6. Repeat step 4 or 5 until there is a node left;
7. Assign the last node to the root node.
Java code
The principle is finished, and the next step is to implement the code.
First, there is a node class to store data.
package huffman;/** * Node class* @author yuxiu * */public class Node { public String code;// Hafman encoding of node public int codeSize;// length of node Huffman encoding public String data;// node data public int count;// node weight public Node lChild; public Node rChild; public Node() { } public Node(String data, int count) { this.data = data; this.count = count; } public Node(int count, Node lChild, Node rChild) { this.count = count; this.lChild = lChild; this.rChild = rChild; } public Node(String data, int count, Node lChild, Node rChild) { this.data = data; this.count = count; this.lChild = lChild; this.rChild = rChild; }}Then there is the process of implementation.
package huffman;import java.io.*;import java.util.*;public class Huffman { private String str;// Originally used for compression private String newStr = "";// The string connected by Huffman encoding private Node root;// The root node of the Huffman binary tree private boolean flag;// Whether the latest characters already exist private ArrayList<String> charList;// The queue for storing different characters is the same character in the same position private ArrayList<Node> NodeList;// The queue for storing nodes 15 16 /** * Build the Huffman tree* * @param str */ public void creatHfmTree(String str) { this.str = str; charList = new ArrayList<String>(); NodeList = new ArrayList<Node>(); // 1. Statistics the number of characters and characters appearing in a string// The basic idea is to put an unordered string such as ababccdebed into the charList, which is aa,bbb,cc,dd,ee // And the length of the string in the list is the corresponding weight for (int i = 0; i < str.length(); i++) { char ch = str.charAt(i); // Take out the character flag = true; for (int j = 0; j < charList.size(); j++) { if (charList.get(j).charAt(0) == ch) {// If the same character is found String s = charList.get(j) + ch; charList.set(j, s); flag = false; break; } } if (flag) { charList.add(charList.size(), ch + ""); } } // 2. Create a node according to the structure of the first step for (int i = 0; i < charList.size(); i++) { String data = charList.get(i).charAt(0) + ""; // Get the first character of each string in charList int count = charList.get(i).length(); // The length of the string in the list is the corresponding weight Node node = new Node(data, count); // Create node object NodeList.add(i, node); // Join to the node queue} // 3. Sort(NodeList); while (NodeList.size() > 1) {// When the number of nodes is greater than one // 4. Take out the two nodes with the smallest weight and generate a new parent node // 5. Delete the two nodes with the smallest weight and store the parent node in the list Node left = NodeList.remove(0); Node right = NodeList.remove(0); int parentWeight = left.count + right.count;// The weight of the parent node is equal to the sum of the weights of the child node Node parent = new Node(parentWeight, left, right); NodeList.add(0, parent); // Put the parent node at the first position} // 6. Repeat the fourth and fifth steps to be the while loop// 7. Assign the last node to the root node root = NodeList.get(0); } /** * Ascending order* * @param nodelist */ public void Sort(ArrayList<Node> nodelist) { for (int i = 0; i < nodelist.size() - 1; i++) { for (int j = i + 1; j < nodelist.size(); j++) { Node temp; if (nodelist.get(i).count > nodelist.get(j).count) { temp = nodelist.get(i); nodelist.set(i, nodelist.get(j)); nodelist.set(j, temp); } } } } /** * Traversal* * @param node * Node */ public void output(Node node) { if (node.lChild != null) { output(node.lChild); } System.out.print(node.count + " "); // In-order traversal if (node.rChild != null) { output(node.rChild); } } public void output() { output(root); }/** * Main method* * @param args */ public static void main(String[] args) { Huffman huff = new Huffman();//Create the Havalman object huff.creatHfmTree("sdfassvvvdfgsfdfsdfs");//Construct the tree}Summarize
The above is all about implementing the Huffman tree based on JAVA. I hope this article will be helpful to everyone to learn how to use JAVA. If you have any questions, please leave a message to discuss.