Syntax-highlighted version: http://gwx.showus.net/blog/article.asp?id=229
This is original work and took real effort to write; when reprinting, please cite the original link: http://gwx.showus.net/blog/article.asp?id=229
Web scraper? Web crawler? "Thief" program? Whatever you call it, this kind of program is widely used. This article does not discuss the copyright or ethical issues raised by using such a program; it only covers how to implement one in an ASP+VBScript environment :-)
Prerequisites: besides general ASP+VBScript knowledge, you also need to understand the xmlhttp object and the regular expression object. The xmlhttp object is at the heart of Ajax, which is currently in the limelight, and once you have learned regular expressions you no longer have to dread complex string handling.
A RegEx testing tool is very useful when writing and debugging regular expressions.
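As a quick refresher on the regular expression object, here is a minimal sketch that is not taken from the article itself: the sample HTML string, the pattern, and the variable names are all illustrative assumptions. It runs inside an ASP page and lists the `src` value of every `<img>` tag in a string.

```vbscript
' Minimal sketch (illustrative only): extract every <img> src from an HTML string.
' sampleHtml and the pattern below are assumptions made for this example.
Dim sampleHtml, re, matches, m
sampleHtml = "<p><img src=""a.gif""> text <IMG alt=""x"" SRC=""img/b.jpg""></p>"

Set re = New RegExp
re.Pattern = "<img[^>]*\ssrc=""([^""]*)"""   ' capture group 1 holds the src value
re.IgnoreCase = True                          ' match <IMG ...> and SRC= as well
re.Global = True                              ' find all matches, not just the first

Set matches = re.Execute(sampleHtml)
For Each m In matches
    ' SubMatches(0) is the first capture group of the current match
    Response.Write Server.HTMLEncode(m.SubMatches(0)) & "<br>"
Next
```

The pattern only handles double-quoted attributes; real pages may also use single quotes or no quotes at all, so a production pattern would likely need to be looser.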
Table of contents
Crawl a remote web page and save it locally
Improvement: Handling garbled text
Downloading the images (and other files) of a remote web page at the same time
Improvement: Detecting the real URL
Improvement: Avoiding repeated downloads
Practical example (taking **** as an example)
Analyzing the list page
Content page tips
Parsing the previous-page and next-page links in a content page
Advanced topic: UTF-8 and GB2312 conversion
More advanced topics: crawling after login, client forgery
The collection procedures you have
1. Crawl a remote web page and save it locally
' For debugging: intermediate results are inspected several times later on
Dim inDebug: inDebug = True
Sub D(Str)
    If inDebug = False Then Exit Sub
    Response.Write("<div style='color:#003399; border:solid 1px #003399; background:#EEF7FF; margin:1px; font-size:12px; padding:4px;'>")
    Response.Write(Str & "</div>")
    Response.Flush()
End Sub
' Procedure: Save2File
' Purpose:   Save text or a byte stream to a file
' Parameters: sContent    the content to save
'             sFile       the file to save to, e.g. "files/abc.htm"
'             bText       whether the content is text
'             bOverWrite  whether to overwrite an existing file
Sub Save2File(sContent, sFile, bText, bOverWrite)
    Call D("Save2File: " & sFile & " * Is text: " & bText)
    Dim SaveOption, TypeOption
    ' ADODB.Stream save options: 2 = overwrite existing file, 1 = create only if it does not exist
    If (bOverWrite = True) Then SaveOption = 2 Else SaveOption = 1
    ' ADODB.Stream types: 2 = text, 1 = binary
    If (bText = True) Then TypeOption = 2 Else TypeOption = 1
    Set Ads = Server.CreateObject("Adodb.Stream")
    With Ads
        .Type = TypeOption