There are many ways to prevent collection (automated scraping of site content) at present. Let me first introduce the common anti-collection strategies, their drawbacks, and the countermeasures collectors use against them:
1. Count the number of visits an IP makes to the site's pages within a certain period of time; if it clearly exceeds normal browsing speed, deny that IP access.
Disadvantages:
1. This method only applies to dynamic pages (ASP/JSP/PHP, etc.). Static pages have no server-side code with which to count how many times a given IP visits the site within a given period.
2. This method seriously interferes with search-engine indexing, because spiders crawl quickly and with multiple threads; the method therefore rejects spiders just like collectors, and the site's pages do not get indexed.
Collection countermeasure: the collector can only slow its collection speed down, or give up collecting the site.
Suggestion: build an IP library of search-engine spiders, and allow only those IPs to browse the site's content at speed. Compiling such a library is not easy, though, since a given search engine's spider does not necessarily crawl from one fixed IP address. A minimal sketch of this rate-limit-plus-whitelist approach follows this item.
Comment: this method is fairly effective at preventing collection, but it hurts search-engine indexing.
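To make the idea concrete, here is a minimal TypeScript sketch of per-IP rate limiting with a spider whitelist. It is an illustration only: the window size, the hit threshold, and the sample spider IP are all invented for the example, and a real deployment would persist counters somewhere more durable than an in-memory Map.

```typescript
// Per-IP rate limiting with a search-engine spider whitelist (sketch).
// WINDOW_MS, MAX_HITS, and the sample IP are illustrative values.
const WINDOW_MS = 60_000; // measurement window: one minute
const MAX_HITS = 30;      // more hits per window than a human would make

const spiderIPs = new Set<string>([
  "66.249.66.1", // example entry; real spiders use many, changing IPs
]);

const hits = new Map<string, { count: number; windowStart: number }>();

function allowRequest(ip: string, now = Date.now()): boolean {
  if (spiderIPs.has(ip)) return true; // whitelisted spiders may crawl fast
  const entry = hits.get(ip);
  if (!entry || now - entry.windowStart > WINDOW_MS) {
    hits.set(ip, { count: 1, windowStart: now }); // start a fresh window
    return true;
  }
  entry.count += 1;
  return entry.count <= MAX_HITS; // deny once the threshold is exceeded
}
```

As the suggestion above warns, the hard part is keeping the whitelist accurate; verifying spiders by reverse DNS rather than a static IP list is one common refinement.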
2. Encrypt content pages with JavaScript
Disadvantage: this method suits static pages, but it seriously hurts search-engine indexing, because what the spider receives is the encrypted content too.
Collection countermeasure: better not to collect such a site at all; if you must, collect the decryption JS script along with the page and run it to recover the content.
Suggestion: no good improvement to offer at present.
Comment: webmasters who expect search engines to bring them traffic are advised not to use this method. A sketch of the encode/decode trick appears below.
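For illustration, here is a minimal browser-side sketch of the decode step, assuming the page ships its body as a string encoded with encodeURIComponent (one possible scheme; the article does not name a specific one) and that the target element id is "content":

```typescript
// The served page embeds only the encoded body; browsers decode it on
// load, while anything that does not run JavaScript sees only the blob.
const encoded = "%3Cp%3EArticle%20body%20goes%20here...%3C%2Fp%3E";

document.addEventListener("DOMContentLoaded", () => {
  const container = document.getElementById("content"); // assumed element id
  if (container) {
    container.innerHTML = decodeURIComponent(encoded); // decode in the browser
  }
});
```

This is exactly why the countermeasure works: a collector that grabs the decoding script, or runs the page in a scriptable browser, recovers the plain content.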
3. Replace specific markers on the content page with "the same marker + hidden copyright text"
Disadvantage: this method costs little; it only slightly increases the page file size, but it is also easy for collectors to defeat.
Collection countermeasure: strip the hidden copyright text out of the collected content, or replace it with your own copyright notice.
Suggestion: no good improvement to offer at present.
Comment: I don't find it very practical; even randomizing the hidden text adds little beyond extra page bulk. A sketch of the stamping idea appears below.
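For concreteness, a minimal sketch of the stamping step, assuming the "specific marker" is the closing paragraph tag; the hiding style and the copyright string are invented for the example:

```typescript
// Append a hidden copyright notice after every closing </p> tag, so raw
// copies of the HTML carry the notice along. Marker and style are
// illustrative assumptions, not the article's specific choices.
const COPYRIGHT =
  '<span style="display:none">Copyright example.com - do not reproduce</span>';

function stampParagraphs(html: string): string {
  return html.replace(/<\/p>/gi, (tag) => tag + COPYRIGHT);
}
```

A collector can strip a fixed string with one replace of its own, which is the countermeasure above and the reason the comment doubts this method's value.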
4. Only allow users to browse after logging in
Disadvantage: this method seriously blocks search-engine spiders from indexing the site.
Collection countermeasure: countermeasure write-ups already exist; for details see "How an ASP thief program uses XMLHTTP to submit forms and send cookies or sessions".
Suggestion: no good improvement to offer at present.
Comment: webmasters who expect search engines to bring them traffic are advised not to use this method either; that said, it is effective against run-of-the-mill collection programs. A sketch of a login gate appears below.
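Here is a minimal sketch of the gate itself, with the request/response shapes and the session field reduced to bare interfaces so the example stays self-contained; a real site would use its server framework's session support instead:

```typescript
// Serve content only to logged-in sessions; everyone else, spiders
// included, is redirected to the login page.
interface Req { url: string; session?: { userId?: string } }
interface Res { redirect(url: string): void; send(body: string): void }

function serveArticle(req: Req, res: Res): void {
  if (!req.session?.userId) {
    res.redirect("/login?next=" + encodeURIComponent(req.url)); // not logged in
    return;
  }
  res.send("<p>Members-only article body...</p>"); // authenticated reader
}
```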
5. Paginate with JavaScript or VBScript
Disadvantage: hurts search-engine indexing of the paginated content.
Collection countermeasure: analyze the JavaScript/VBScript, work out its paging rule, and build a matching paginated collection routine for the site.
Suggestion: no good improvement to offer at present. A sketch of script-driven pagination appears below.
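To show what script-driven pagination looks like, here is a minimal browser-side sketch; the fragment URL pattern and the container id are assumptions for the example:

```typescript
// Page links call loadPage(n) instead of pointing at crawlable URLs, so
// only script-running clients can follow the pagination.
async function loadPage(n: number): Promise<void> {
  const resp = await fetch(`/article/fragment?page=${n}`); // assumed endpoint
  const container = document.getElementById("content");    // assumed id
  if (container) {
    container.innerHTML = await resp.text(); // inject the fetched fragment
  }
}
// Markup then uses e.g. <a href="javascript:void(0)" onclick="loadPage(2)">2</a>
```

Reading this script is all a collector needs to reconstruct the fragment URLs, which is precisely the countermeasure described above.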