Using HTML5 canvas to crack a simple verification code: an application of the getImageData interface

Author: Eve Cole Update Time: 2025-05-31 22:32:01

Comment: The HTML5 canvas has a getImageData interface that can be used to read pixel data from a verification code image. Each pixel has four values: r, g, b and a. The r, g and b values are red, green and blue, and a is the alpha (transparency) channel. This article walks through the idea and the implementation code.

Our school's academic affairs management system (apparently used by more than just our school) crashes at course selection time, which goes without saying. On top of that, you sometimes have to enter the verification code over and over just to select a course. When I think of thousands of college students wasting their time typing verification codes, I feel I have an obligation to save humanity.

I searched and found an article from three years ago. Referring to its first half, I used the Tampermonkey extension to roughly achieve the desired effect. The script is available on Userscript and also on GitHub. The code is ugly; bug reports and advice are welcome.

Here is the idea: the HTML5 canvas has a getImageData interface that can be used to read pixel data from the verification code image. Each pixel has four values, r, g, b and a, each in the range 0 to 255. The r, g and b values are red, green and blue, and a is the alpha (transparency) channel.
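
To make the layout concrete: getImageData returns the pixels as one flat array, row by row from the top-left corner, with four consecutive entries per pixel. The following is a minimal sketch of how a single pixel can be looked up. The helper names pixelOffset and readPixel are made up for illustration, and the sample object is only a hand-built stand-in for a real ImageData so the snippet runs outside a browser.

```javascript
// Offset of pixel (x, y) in a flat RGBA array: 4 entries per pixel, row by row.
function pixelOffset(width, x, y) {
  return (y * width + x) * 4;
}

// Read one pixel as {r, g, b, a} from any object shaped like ImageData.
function readPixel(imageData, x, y) {
  var o = pixelOffset(imageData.width, x, y);
  var d = imageData.data;
  return { r: d[o], g: d[o + 1], b: d[o + 2], a: d[o + 3] };
}

// Hand-built 2x1 stand-in for ImageData (hypothetical data):
// an opaque red pixel followed by a half-transparent blue pixel.
var sample = {
  width: 2,
  height: 1,
  data: new Uint8ClampedArray([255, 0, 0, 255, 0, 0, 255, 128])
};

var second = readPixel(sample, 1, 0);
console.log(second.r, second.g, second.b, second.a);
```

In the browser the same lookup works on the object returned by ctx.getImageData, since it exposes the same width, height and data fields.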

Observation showed that the verification code of the academic affairs management system consists of 5 digits in a fixed font size. The background contains noise, but its color is clearly different from the font color, so I used a very crude method: the lighter a color, the larger its rgb values, and the darker a color, the smaller its rgb values. So I test every pixel: if the sum of r, g and b is less than 350 (a value found by measurement), the pixel belongs to a character. To make the result easy to inspect, the rgb values of character pixels are set to 255 and those of all other pixels to 0. This gives a picture with white characters on a black background.

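var ctx = canvas.getContext('2d');
ctx.drawImage(img, 0, 0);
var c = ctx.getImageData(0, 0, img.width, img.height);
for (var i = 0; i < c.height; i++) {
    for (var j = 0; j < c.width; j++) {
        // Offset of pixel (j, i): 4 entries (r, g, b, a) per pixel, row by row
        var x = (i * 4) * c.width + (j * 4);
        var r = c.data[x];
        var g = c.data[x + 1];
        var b = c.data[x + 2];
        if (r + g + b > 350) {
            // Light pixel: background, paint it black
            c.data[x] = c.data[x + 1] = c.data[x + 2] = 0;
        } else {
            // Dark pixel: part of a character, paint it white
            c.data[x] = c.data[x + 1] = c.data[x + 2] = 255;
        }
    }
}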
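
The threshold of 350 is described only as a measured value. One possible way to take such a measurement (an assumption, not necessarily what was done here) is to tally the r+g+b sum of every pixel into buckets and look for the gap between the dark character pixels and the lighter background. The helper name brightnessHistogram, the bucket size and the sample data below are all hypothetical.

```javascript
// Count how many pixels fall into each brightness bucket.
// The sum r+g+b ranges from 0 to 765, so bucket k covers
// sums from k*bucketSize up to (k+1)*bucketSize - 1.
function brightnessHistogram(imageData, bucketSize) {
  var buckets = new Array(Math.ceil(766 / bucketSize)).fill(0);
  var d = imageData.data;
  for (var o = 0; o < d.length; o += 4) {
    var sum = d[o] + d[o + 1] + d[o + 2];
    buckets[Math.floor(sum / bucketSize)]++;
  }
  return buckets;
}

// Hand-built 2x2 stand-in for ImageData (hypothetical data):
// two dark pixels (sum 90) and two light pixels (sum 600).
var sample = {
  width: 2,
  height: 2,
  data: new Uint8ClampedArray([
    30, 30, 30, 255,    200, 200, 200, 255,
    200, 200, 200, 255, 30, 30, 30, 255
  ])
};

var hist = brightnessHistogram(sample, 50);
hist.forEach(function (count, k) {
  if (count > 0) console.log('sums from ' + (k * 50) + ': ' + count + ' pixels');
});
```

On a real verification code image, two clusters separated by a run of empty buckets suggest that any threshold inside that run will separate characters from background.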

Then I enlarged the picture in a drawing tool and examined it. Each digit occupies a 12*8 pixel rectangle, and a given digit always has the same number of character pixels. Where two digits have the same count, I added a special check (for example, if there is a character pixel in the middle, it must be 8 rather than 0). After that it was just a matter of observing the coordinates of the rectangle for each digit and writing this function:

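function getNum(imgData, x1, y1, x2, y2) {
    var i, j, x;
    // Count the white (character) pixels inside the rectangle
    var num = 0;
    for (i = y1; i < y2; i++) {
        for (j = x1; j < x2; j++) {
            x = (i * 4) * imgData.width + (j * 4);
            if (imgData.data[x] == 255) num++;
        }
    }
    switch (num) {
        case 56: {
            // 0 and 8 have the same count; check the pixel at the center
            j = Math.floor((x1 + x2) / 2);
            i = Math.floor((y1 + y2) / 2);
            x = (i * 4) * imgData.width + (j * 4);
            if (imgData.data[x] == 255)
                return 8;
            else
                return 0;
        }
        case 30: return 1;
        case 50: return 2;
        case 51: return 3;
        case 48: return 4;
        case 57: return 5;
        case 58: {
            // 6 and 9 have the same count; check the leftmost pixel
            // in the second row from the bottom
            i = y2 - 2;
            j = x1;
            x = (i * 4) * imgData.width + (j * 4);
            if (imgData.data[x] == 255)
                return 9;
            else
                return 6;
        }
        case 37: return 7;
        default: return 0;
    }
}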
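
Reading the whole code then comes down to calling the per-digit function once for each digit rectangle and joining the results. The following is a minimal sketch, assuming the five digits are evenly spaced and each is 8 pixels wide and 12 pixels tall. The helper names digitBoxes and readCode and the offsets used below are made up for illustration; the real coordinates have to be read off the enlarged picture.

```javascript
// Build [x1, y1, x2, y2] rectangles for `count` equally sized digits
// laid out from left to right, `gap` pixels apart.
function digitBoxes(left, top, count, digitWidth, digitHeight, gap) {
  var boxes = [];
  for (var k = 0; k < count; k++) {
    var x1 = left + k * (digitWidth + gap);
    boxes.push([x1, top, x1 + digitWidth, top + digitHeight]);
  }
  return boxes;
}

// Run a per-digit classifier over every rectangle and join the digits.
// `classify` takes (imgData, x1, y1, x2, y2) and returns one digit.
function readCode(imgData, boxes, classify) {
  return boxes.map(function (b) {
    return classify(imgData, b[0], b[1], b[2], b[3]);
  }).join('');
}

// Hypothetical layout: first digit at (5, 4), 8x12 digits, 2-pixel gap.
var boxes = digitBoxes(5, 4, 5, 8, 12, 2);
console.log(boxes[0], boxes[4]);
```

In the userscript this would be called as readCode(c, boxes, getNum), where c is the binarized image data, and the returned string would be written into the verification code input field.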

The article I referred to uses a neural network for recognition, which greatly improves the accuracy. I don't know how to use one, so I left that part out…

Even so, the method described here recognizes more than 95% of verification codes correctly, which is enough for the time being. If you have time, the neural network approach is worth studying.

Students who need it are welcome to use it. In Chrome, install Tampermonkey first; in Firefox, install Greasemonkey; then install this script.