Many websites have now added verification codes (CAPTCHAs) to improve security and block automated programs. This, however, makes life harder for the many webmasters who want to promote their sites. So I decided to write this article about verification code recognition. There are bound to be shortcomings! I don't normally write articles, so please bear with my clumsy attempt today.
Webmasters promoting their own sites often post promotional ads. Doing this by hand is too slow and too expensive, so the ideal tool is mass-posting software. However, many sites now use verification codes, which have become the main technical obstacle for such software, and recognizing them is the hardest part of all. OK, enough chatter, back to the point!
The examples I give are verification codes that are harder to recognize; this article is not aimed at codes that do not deform, change font, change size, or rotate. I may not post the code here, only the ideas behind it. Programs I wrote following these ideas achieve a much higher recognition rate than the ones sold on the market. (If you are interested, just ask me; I don't want to advertise for anyone else here, haha~~~)
Let's start with numeric verification codes. Letters are more troublesome than digits, but once you have worked out how to recognize digits, letters become much easier.
A verification code is usually an image, typically of a 4-digit number. The processing goes like this: first split the image into 4 parts, then recognize each part one by one. Since segmentation is relatively simple, I won't cover it here and will only talk about recognition.
My method is to divide each image to be recognized into 5 rows and 3 columns, 15 blocks in total. Why 15 blocks? Look at the pictures first!
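For completeness, the simplest possible segmentation can be sketched as follows. This is only an illustration under a strong assumption: the image is already binarized (1 = character pixel, 0 = background) and the digits are evenly spaced with equal width. The function name `split_equal` is made up for this sketch; real verification codes may need gap detection instead.

```python
def split_equal(image, parts=4):
    """Split a binary image into `parts` vertical slices of (nearly) equal width.

    image: list of rows, each row a list of 0/1 values.
    Assumes evenly spaced characters, which is not true for every code.
    """
    width = len(image[0])
    # Integer arithmetic keeps the slice boundaries exact.
    bounds = [i * width // parts for i in range(parts + 1)]
    return [
        [row[bounds[i]:bounds[i + 1]] for row in image]
        for i in range(parts)
    ]
```

Each returned slice is itself a small binary image that can be recognized on its own.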
Digit 0:
○■○
■○■
■○■
■○■
○■○

Digit 1:
○■○
■■○
○■○
○■○
■■■

Digit 2:
■■■
○○■
■■■
■○○
■■■

Digit 3:
■■■
○○■
■■■
○○■
■■■
I'll give just these 4 examples; you can draw the rest yourself. If you have done verification code recognition before, you will quickly see why the image is divided into 15 blocks: this division is simply more sensible and improves the recognition rate.
To build such a diagram, divide the image to be recognized into 5 rows and 3 columns (15 blocks) and evaluate each block separately. If the share of effective pixels in a block exceeds a threshold percentage, mark the block ■; otherwise mark it ○ (I use ■ and ○ for display; in a program you can use 1 and 0). Note that the threshold can be 67%, 50%, 33%, or 20%, depending on how thick the font strokes are. Why these numbers? They correspond to the simple fractions 2/3, 1/2, 1/3, and 1/5, so the comparison can be done with integer arithmetic (for example, 3 × effective pixels > 2 × total pixels for the 67% case). That is fast and avoids the floating-point rounding errors that can creep in over a large number of calculations. Of course, you should choose whichever percentage suits your own verification code images!
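The block marking described above can be sketched in Python roughly like this. It is a minimal illustration, not the program mentioned in this article: the function name `block_pattern` and its parameters are made up, the input is assumed to be one already segmented and binarized character (1 = effective pixel), and the image is assumed to be at least 5 pixels high and 3 pixels wide. The threshold is passed as a fraction `num/den` (1/2 for 50%, 2/3 for 67%, and so on) so that the comparison uses integers only.

```python
def block_pattern(glyph, rows=5, cols=3, num=1, den=2):
    """Return `rows` strings of `cols` marks (■ or ○) for one character image.

    glyph: list of pixel rows, each a list of 0/1 (1 = effective pixel).
    A block is marked ■ when effective / total > num / den.
    """
    height, width = len(glyph), len(glyph[0])
    pattern = []
    for r in range(rows):
        top, bottom = r * height // rows, (r + 1) * height // rows
        line = ""
        for c in range(cols):
            left, right = c * width // cols, (c + 1) * width // cols
            total = (bottom - top) * (right - left)
            ink = sum(glyph[y][x]
                      for y in range(top, bottom)
                      for x in range(left, right))
            # Cross-multiplied comparison: no floating point involved.
            line += "■" if ink * den > total * num else "○"
        pattern.append(line)
    return pattern
```

Calling `block_pattern(glyph, num=2, den=3)` applies the 67% threshold, which may suit thicker fonts; which threshold works best depends on the images at hand.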
If the verification code does not deform, change font, change size, or rotate, the recognition work is basically finished at this point, because a fairly clean block diagram can already be obtained. That is enough for most forums. ^_^
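How a block diagram is then compared with the reference grids is not spelled out above. One simple possibility, assumed here purely for illustration, is to count the blocks that differ from each reference grid and pick the digit with the fewest mismatches. The names `TEMPLATES` and `classify` are hypothetical, and only three of the example grids are typed in; the remaining digits would be added in the same way.

```python
# Reference grids copied from the examples in this article (5 rows x 3 columns).
TEMPLATES = {
    "0": ["○■○", "■○■", "■○■", "■○■", "○■○"],
    "2": ["■■■", "○○■", "■■■", "■○○", "■■■"],
    "3": ["■■■", "○○■", "■■■", "○○■", "■■■"],
}


def classify(pattern, templates=TEMPLATES):
    """Return (digit, mismatches) for the closest reference grid.

    pattern: list of 5 strings of 3 marks each, in the same ■/○ form.
    Closeness is the number of differing blocks (an assumed metric).
    """
    def distance(a, b):
        return sum(x != y
                   for row_a, row_b in zip(a, b)
                   for x, y in zip(row_a, row_b))

    return min(((digit, distance(pattern, grid))
                for digit, grid in templates.items()),
               key=lambda item: item[1])
```

A clean diagram matches its grid with 0 mismatches; a diagram with one or two wrong blocks still lands on the nearest digit, and the mismatch count gives a rough idea of how much to trust the result.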