This article presents a Java implementation of cosine similarity for sparse vectors, shared here for reference. Each vector is passed in as a string of comma-separated index:value pairs. The code is as follows:
import java.util.HashMap;

public class MyUDF {
    /**
     * UDF evaluate interface.
     *
     * A UDF is one-to-one at the record level and one-to-one or many-to-one at
     * the field level: evaluate is called once per record, takes one or more
     * fields as input, and returns one field.
     */
    public Double evaluate(String a, String b) {
        // Each argument is a sparse vector encoded as "index:value,index:value,..."
        if (a == null || b == null) {
            return 0.0;
        }
        String[] temp1 = a.split(",");
        String[] temp2 = b.split(",");

        // Parse both strings into index -> value maps
        HashMap<String, Double> map1 = new HashMap<String, Double>();
        HashMap<String, Double> map2 = new HashMap<String, Double>();
        for (String temp : temp1) {
            String[] t = temp.split(":");
            map1.put(t[0], Double.parseDouble(t[1]));
        }
        for (String temp : temp2) {
            String[] t = temp.split(":");
            map2.put(t[0], Double.parseDouble(t[1]));
        }

        // fenzi: numerator (dot product over shared indices)
        // fenmu1: squared norm of the first vector
        double fenzi = 0;
        double fenmu1 = 0;
        for (String i : map1.keySet()) {
            double value = map1.get(i);
            if (map2.get(i) != null) {
                fenzi += value * map2.get(i);
            }
            fenmu1 += value * value;
        }

        // fenmu2: squared norm of the second vector
        double fenmu2 = 0;
        for (double i : map2.values()) {
            fenmu2 += i * i;
        }

        // fenmu: denominator (product of the two norms)
        double fenmu = Math.sqrt(fenmu1) * Math.sqrt(fenmu2);
        return fenzi / fenmu;
    }

    public static void main(String[] args) {
        String a = "12:500,14:100,20:200";
        String b = "12:500,14:100,30:100";
        MyUDF myUDF = new MyUDF();
        System.out.println(myUDF.evaluate(a, b));
    }
}

Running results:
0.9135468796041984
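The result can be checked by hand: the two vectors share indices 12 and 14, so the dot product is 500*500 + 100*100 = 260000, the squared norms are 300000 and 270000, and 260000 / (sqrt(300000) * sqrt(270000)) is approximately 0.9135. The listing in this article does not guard against empty strings, malformed entries, or a zero-length vector (which would divide by zero and yield NaN). The following is a minimal sketch of one way to add those guards. The class name SparseCosine, the helper parse, and the choice to return 0.0 for a zero vector and to skip malformed entries are assumptions made for illustration, not part of the original code.

```java
import java.util.HashMap;
import java.util.Map;

public class SparseCosine {
    // Hypothetical helper: parses "index:value,index:value" into a map.
    // Assumption: blank input gives an empty vector, malformed entries are skipped.
    static Map<String, Double> parse(String s) {
        Map<String, Double> vec = new HashMap<>();
        if (s == null || s.trim().isEmpty()) {
            return vec;
        }
        for (String entry : s.split(",")) {
            String[] kv = entry.split(":");
            if (kv.length != 2) {
                continue;
            }
            vec.put(kv[0].trim(), Double.parseDouble(kv[1].trim()));
        }
        return vec;
    }

    // Cosine similarity of two sparse vectors given in string form.
    static double cosine(String a, String b) {
        Map<String, Double> va = parse(a);
        Map<String, Double> vb = parse(b);

        double dot = 0;
        double normA = 0;
        for (Map.Entry<String, Double> e : va.entrySet()) {
            Double other = vb.get(e.getKey());
            if (other != null) {
                dot += e.getValue() * other; // only shared indices contribute
            }
            normA += e.getValue() * e.getValue();
        }

        double normB = 0;
        for (double v : vb.values()) {
            normB += v * v;
        }

        double denom = Math.sqrt(normA) * Math.sqrt(normB);
        // Assumption: treat similarity with a zero vector as 0.0 instead of NaN.
        return denom == 0 ? 0.0 : dot / denom;
    }

    public static void main(String[] args) {
        System.out.println(cosine("12:500,14:100,20:200", "12:500,14:100,30:100"));
        System.out.println(cosine("", "12:500"));
    }
}
```

With the sample vectors from the article, the first call should agree with the result shown above; the second call exercises the zero-vector guard.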
I hope this article is helpful for your Java programming.