This article shows how to match and merge two data sets in Java as a data preprocessing step. The details are as follows.
Data description
The program below merges two tables, which have the following formats.
Each row of the first table holds a user id followed by that user's features. Each user has exactly one feature vector, so the values in the first column are never repeated.
In the second table, the first column is the user's id, the second column is the movie the user watched, the third column is the user's rating of the movie on a 1-13 scale, and the fourth column is the user's rating of the movie again, this time on a 1-5 scale.
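To make the two row formats concrete, here is a minimal sketch of how a single row might be split. It assumes tab-separated columns; the class name `RowFormatSketch`, its helper methods, and the sample rows are hypothetical and not taken from the article's data files. Like the program's `MAPread` method, it uses `BigDecimal.toPlainString()` so that a value written in exponent notation comes out as a plain decimal.

```java
import java.math.BigDecimal;

class RowFormatSketch {

    // Returns the first column of a tab-separated row: the user id.
    static String userIdOf(String row) {
        return row.split("\t")[0];
    }

    // Joins every column after the first with single spaces, rewriting
    // each number without exponent notation.
    static String plainFeatures(String featureRow) {
        String[] cols = featureRow.split("\t");
        StringBuilder sb = new StringBuilder();
        for (int i = 1; i < cols.length; i++) {
            if (i > 1) {
                sb.append(' ');
            }
            sb.append(new BigDecimal(cols[i]).toPlainString());
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String featureRow = "u1\t0.5\t1.0E-3"; // hypothetical row of the first table
        String ratingRow = "u1\tm10\t13\t5";   // hypothetical row of the second table
        System.out.println(userIdOf(featureRow) + " -> " + plainFeatures(featureRow));
        System.out.println(userIdOf(ratingRow));
    }
}
```

Both tables share the user id in their first column, which is what makes the match possible.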
Problem description
During preprocessing, how do we add the user features to the second table? The method is simple: look up the user id of each row of the second table in the first table and append the matching features to that row. The merge result is shown in the figure below.
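The lookup described above can be sketched as a small in-memory join. This is only an illustration under stated assumptions: the class name `JoinSketch`, the method `appendFeatures`, and the sample rows are hypothetical, and the columns are assumed to be tab-separated.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class JoinSketch {

    // Appends the feature string of each row's user to that row.
    // Assumes tab-separated rows whose first column is the user id.
    static List<String> appendFeatures(List<String> ratingRows,
                                       Map<String, String> featuresByUser) {
        List<String> merged = new ArrayList<>();
        for (String row : ratingRows) {
            String userId = row.split("\t")[0];
            // get() returns null when the user id is not in the map
            String features = featuresByUser.get(userId);
            merged.add(row + "\t" + features);
        }
        return merged;
    }

    public static void main(String[] args) {
        // Hypothetical sample data, not taken from the article's files.
        Map<String, String> featuresByUser = new HashMap<>();
        featuresByUser.put("u1", "0.5 0.25");
        featuresByUser.put("u2", "0.1 0.9");

        List<String> ratingRows = new ArrayList<>();
        ratingRows.add("u1\tm10\t13\t5");
        ratingRows.add("u2\tm10\t7\t3");
        ratingRows.add("u1\tm42\t1\t1");

        for (String line : appendFeatures(ratingRows, featuresByUser)) {
            System.out.println(line);
        }
    }
}
```

Because user ids are unique in the first table, a hash map keyed by user id gives a constant-time lookup per rating row, so the whole merge is a single pass over the second table.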
Data processing program
package deal;

import java.io.BufferedReader;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.math.BigDecimal;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;

/*
 * author: Qian Yang, School of Management, Hefei University of Technology
 * email: [email protected]
 */
public class GetPUser {

    // Read every line of the rating file into a list.
    public static List<String> readDocs(String docsPath, String code) throws IOException {
        BufferedReader reader = new BufferedReader(
                new InputStreamReader(new FileInputStream(new File(docsPath)), code));
        String s = null;
        List<String> userproductscore = new ArrayList<String>();
        while ((s = reader.readLine()) != null) {
            userproductscore.add(s);
        }
        reader.close();
        return userproductscore;
    }

    // Read the user feature file into a map: user id -> feature string.
    public static HashMap<String, String> MAPread(String docsPath1, String code1) throws IOException {
        BufferedReader reader1 = new BufferedReader(
                new InputStreamReader(new FileInputStream(new File(docsPath1)), code1));
        String s1 = null;
        HashMap<String, String> userfeaturemap = new HashMap<String, String>();
        while ((s1 = reader1.readLine()) != null) {
            // Columns are tab-separated; the first column is the user id
            String arr[] = s1.split("\t");
            String feature = "";
            for (int i = 1; i < arr.length; i++) {
                // Write each feature value as a plain decimal (no exponent notation)
                BigDecimal db = new BigDecimal(arr[i]);
                String ii = db.toPlainString();
                feature += ii + " ";
            }
            userfeaturemap.put(arr[0], feature);
        }
        reader1.close();
        return userfeaturemap;
    }

    // Append each user's features to every rating row of that user.
    public static List<String> match(List<String> userproductscore,
            HashMap<String, String> userfeaturemap) throws IOException {
        List<String> userscoreandfeature = new ArrayList<>();
        for (int i = 0; i < userproductscore.size(); i++) {
            // Get user id
            String user_id = userproductscore.get(i).split("\t")[0];
            // Get user feature
            String userfeature = userfeaturemap.get(user_id);
            userscoreandfeature.add(userproductscore.get(i) + "\t" + userfeature);
            System.out.println(userproductscore.get(i) + "\t" + userfeature);
        }
        return userscoreandfeature;
    }

    public static void main(String[] args) throws IOException {
        // Read the two text files
        List<String> userproductscore = readDocs("data/train/ydata-ymovies-user-movie-ratings-train-v1_0.txt", "gbk");
        HashMap<String, String> userfeaturemap = MAPread("data/fileofuser/yahoo.txt", "utf-8");
        // Match result
        match(userproductscore, userfeaturemap);
    }
}
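The program above prints the merged rows to the console, and when a user id has no entry in the feature map, the string concatenation appends the text `null` to that row. As one possible variation, the following sketch skips such rows and writes the result to a file. The class name `MergeToFileSketch`, its methods, the skip policy, and the sample rows are all assumptions made for illustration; they are not part of the original program.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class MergeToFileSketch {

    // Joins rating rows with user features. Rows whose user id has no
    // entry in the map are skipped instead of receiving the text "null".
    static List<String> mergeKnownUsers(List<String> ratingRows,
                                        Map<String, String> featuresByUser) {
        List<String> merged = new ArrayList<>();
        for (String row : ratingRows) {
            String userId = row.split("\t")[0];
            String features = featuresByUser.get(userId);
            if (features == null) {
                continue; // unknown user: drop the row (one possible policy)
            }
            // trim() removes the trailing space left by the feature builder
            merged.add(row + "\t" + features.trim());
        }
        return merged;
    }

    // Writes one merged row per line. try-with-resources closes the
    // writer even if a write fails.
    static void writeLines(Path target, List<String> lines) throws IOException {
        try (BufferedWriter writer = Files.newBufferedWriter(target, StandardCharsets.UTF_8)) {
            for (String line : lines) {
                writer.write(line);
                writer.newLine();
            }
        }
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical sample data, not taken from the article's files.
        Map<String, String> featuresByUser = new HashMap<>();
        featuresByUser.put("u1", "0.5 0.25 ");

        List<String> ratingRows = new ArrayList<>();
        ratingRows.add("u1\tm10\t13\t5");
        ratingRows.add("u9\tm10\t7\t3"); // u9 has no features and is skipped

        Path target = Files.createTempFile("merged", ".txt");
        writeLines(target, mergeKnownUsers(ratingRows, featuresByUser));
        System.out.println("Wrote merged rows to " + target);
    }
}
```

Whether to skip, flag, or fill in rows for unknown users depends on how the merged data will be used afterwards, so the skip policy here is only one option.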
Summary
That is all for this example of matching and merging data in Java as a data preprocessing step. I hope it is helpful. If you find any shortcomings, please leave a comment to point them out.