Today a friend asked me this question: how do you find the duplicate data across multiple arrays in JavaScript?
Notes:
1. To be more precise: as long as a value appears in two or more of the arrays, it is a value I need.
2. A single array contains no duplicate values (of course, if it does, you can deduplicate it first).
3. Running time matters; this is very important.
Source code:
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="UTF-8">
  <title>Get duplicate data in multiple arrays</title>
</head>
<body>
<script type="text/javascript">
  // Format the time elapsed between two Date objects
  function useTime(date1, date2) {
    var date3 = date2.getTime() - date1.getTime(); // time difference in milliseconds
    // Number of whole days
    var days = Math.floor(date3 / (24 * 3600 * 1000));
    // Number of whole hours
    var leave1 = date3 % (24 * 3600 * 1000); // milliseconds left after removing the days
    var hours = Math.floor(leave1 / (3600 * 1000));
    // Number of whole minutes
    var leave2 = leave1 % (3600 * 1000); // milliseconds left after removing the hours
    var minutes = Math.floor(leave2 / (60 * 1000));
    // Number of whole seconds
    var leave3 = leave2 % (60 * 1000); // milliseconds left after removing the minutes
    var seconds = Math.floor(leave3 / 1000);
    return "Time:" + days + " " + hours + ":" + minutes + ":" + seconds + " " + leave3 % 1000;
  }

  // Return every integer from min to max; the length is max-min+1
  // (the data is fixed, but the order is random)
  function getArr(min, max) {
    var arr = [];
    var numToPush = min;
    for (var i = 0; i < max - min + 1; i++) {
      var len = arr.length;
      if (len == 0) {
        arr.push(numToPush++);
      } else {
        var randIndex = Math.floor(Math.random() * len);
        arr.push(numToPush++);
        // Swap a randomly chosen element with the one just pushed
        var tmp = arr[randIndex];
        arr[randIndex] = arr[len];
        arr[len] = tmp;
      }
    }
    return arr;
  }

  // Return num distinct random integers from min (inclusive) to max (exclusive)
  function randomArr(min, max, num) {
    var arr = [];
    // The range cannot supply more than max-min distinct values
    num = Math.min(num, max - min);
    while (arr.length < num) {
      var randomNumber = Math.floor(Math.random() * (max - min) + min);
      if (!inArr(randomNumber, arr)) {
        arr.push(randomNumber);
      }
    }
    return arr;
  }

  // Get the duplicate data from any number of arrays
  function getDumplicate() {
    var num = arguments.length;
    if (num < 2) {
      return { ret: [], container: [] };
    }
    var obj = {
      ret: [],      // values found in more than one array
      container: [] // every distinct value seen so far
    };
    for (var i = 0; i < num; i++) {
      // console.log(arguments[i]);
      var arr = arguments[i];
      obj = deal(arr, obj);
    }
    return obj;
  }

  // Process a single array: compare it with the data in the container and collect the duplicates
  // (problem: a large data volume leaves too much data in the container)
  function deal(arr, obj) {
    var conlen = obj.container.length; // container size before this array is processed
    if (conlen == 0) {
      obj.container = arr.slice(); // copy, so the caller's array is not modified
    } else {
      var arrlen = arr.length;
      for (var j = 0; j < arrlen; j++) { // walk the array, comparing each element with the container
        var found = false;
        for (var i = 0; i < conlen; i++) {
          if (arr[j] == obj.container[i]) {
            found = true;
            break;
          }
        }
        if (!found) {
          obj.container.push(arr[j]); // not seen before: add it to the container
        } else if (!inArr(arr[j], obj.ret)) {
          obj.ret.push(arr[j]); // seen in an earlier array: record it once
        }
      }
    }
    return obj;
  }

  // Check whether this value already exists in the array
  function inArr(obj, arr) {
    var len = arr.length;
    for (var i = 0; i < len; i++) {
      if (arr[i] == obj) {
        return true;
      }
    }
    return false;
  }

  //------------------------- Test -------------------------
  var date = new Date();
  var arr_a = getArr(1, 20);
  var arr_b = getArr(18, 35);
  var arr_c = getArr(34, 50);
  var dumpData = getDumplicate(arr_a, arr_b, arr_c);
  console.log(dumpData.ret); // contains 18, 19, 20, 34 and 35 (order varies)
  //console.log(dumpData.container);
  console.log(useTime(date, new Date()));
</script>
</body>
</html>
Result:
Next, we test with more data: 3 randomly generated arrays, 30,000 items in total
Result:
5 arrays, 50,000 items in total (distribution: 10,000 per array)
5 arrays, 100,000 items in total (distribution: 50,000, 40,000, 30,000, 20,000, 10,000)
10 arrays, 100,000 items in total (distribution: 10,000 per array)
100 arrays, 1,000,000 items in total (distribution: 10,000 per array)
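The article does not show how these larger data sets were generated and timed. A minimal sketch of such a harness, assuming each array holds distinct integers drawn from a shared range, might look like the following. The helpers `uniqueRandomArray` and `timeIt` are hypothetical names, not part of the original code, and the value range 1..100000 is an arbitrary choice.

```javascript
// Hypothetical helper: `count` distinct integers from [min, max], in random order.
function uniqueRandomArray(min, max, count) {
  const size = max - min + 1;
  if (count > size) {
    throw new RangeError("count exceeds the number of available values");
  }
  // Build the full range, then shuffle only the first `count` positions
  // (a partial Fisher-Yates shuffle).
  const pool = Array.from({ length: size }, (_, i) => min + i);
  for (let i = 0; i < count; i++) {
    const j = i + Math.floor(Math.random() * (size - i));
    [pool[i], pool[j]] = [pool[j], pool[i]];
  }
  return pool.slice(0, count);
}

// Hypothetical helper: call fn(...arrays) and report the elapsed milliseconds.
function timeIt(fn, arrays) {
  const start = Date.now();
  const result = fn(...arrays);
  return { result, elapsedMs: Date.now() - start };
}

// Example: 5 arrays of 10,000 items each, drawn from 1..100000 (assumed range).
const testArrays = [];
for (let k = 0; k < 5; k++) {
  testArrays.push(uniqueRandomArray(1, 100000, 10000));
}
// Stand-in for the function under test; it only counts the elements.
const countAll = (...arrs) => arrs.reduce((sum, a) => sum + a.length, 0);
const timing = timeIt(countAll, testArrays);
console.log(timing.result, "elements processed in", timing.elapsedMs, "ms");
```

To time the function from the listing instead, pass `getDumplicate` as `fn` in place of the stand-in.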
Conclusions:
1. How much time is spent depends on your algorithm.
2. With the total amount of data unchanged: prefer more arrays with less data in each, rather than a few very large arrays. Of course, this cannot be generalized.
3. In this test, 10,000 items in a single array is fine, 50,000 items will not kill you, and at 100,000 items please contact Hua Tuo (the legendary physician).
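To make the first conclusion concrete: the code in this article compares each element with every element in the container, so its cost grows roughly with the product of the two lengths. One common alternative, shown here as a sketch and not as the author's method, counts in a `Map` how many arrays contain each value, which needs roughly one pass over all the data. The name `findShared` is hypothetical.

```javascript
// Hypothetical alternative to the nested-loop comparison:
// return every value that occurs in at least two of the given arrays.
function findShared(...arrays) {
  const seenIn = new Map(); // value -> number of arrays that contain it
  for (const arr of arrays) {
    // A Set drops duplicates inside one array, so each array counts once.
    for (const value of new Set(arr)) {
      seenIn.set(value, (seenIn.get(value) || 0) + 1);
    }
  }
  const shared = [];
  for (const [value, count] of seenIn) {
    if (count >= 2) shared.push(value);
  }
  return shared;
}

console.log(findShared([1, 2, 3], [3, 4, 5], [5, 6, 1])); // logs [ 1, 3, 5 ]
```

Note that `Map` and `Set` compare keys without type coercion, so the number `1` and the string `"1"` are treated as different values, unlike the `==` comparison in the listing.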
Open issues:
1. The algorithm was written on the spot (in fact, there is hardly any algorithm ^_^) and needs to be improved.
2. The test code uses an array as the container for the non-duplicate data.
The problem: the more data there is, the more data ends up in the container, and then... you know.
3. The test data is randomly generated and consists only of numbers. For other kinds of objects, please test separately (mainly because such test data is hard to generate (⊙o⊙)…).
4. Multidimensional arrays were not tested (performance may be poor 0_0).
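Issues 3 and 4 were not tested in this article. One possible way to approach them, sketched here under stated assumptions, is to reduce every item to a comparable key before counting. The sketch assumes that "multidimensional" means arrays nested inside the input arrays, which `Array.prototype.flat(Infinity)` flattens first, and that plain objects can be compared by their `JSON.stringify` output (which depends on property order and cannot handle circular references). The name `findSharedBy` and its `keyOf` parameter are hypothetical.

```javascript
// Hypothetical sketch: find items that occur in at least two arrays,
// comparing items by a derived key instead of by identity.
function findSharedBy(arrays, keyOf = JSON.stringify) {
  const counts = new Map(); // key -> { item, count }
  for (const arr of arrays) {
    const keysInThisArray = new Set();
    // flat(Infinity) turns [[1, [2]], 3] into [1, 2, 3].
    for (const item of arr.flat(Infinity)) {
      const key = keyOf(item);
      if (keysInThisArray.has(key)) continue; // count each array only once
      keysInThisArray.add(key);
      const entry = counts.get(key);
      if (entry) {
        entry.count++;
      } else {
        counts.set(key, { item, count: 1 });
      }
    }
  }
  return [...counts.values()]
    .filter((entry) => entry.count >= 2)
    .map((entry) => entry.item);
}

const first = [{ id: 1 }, { id: 2 }, [3, [4]]];
const second = [{ id: 2 }, 4, { id: 5 }];
console.log(findSharedBy([first, second])); // logs [ { id: 2 }, 4 ]
```

When the items carry a natural identifier, a cheaper key such as `(item) => item.id` can be passed as `keyOf` instead of serializing the whole object.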