I don't know why the major search engines now use different encodings. It is always either GB2312 or UTF-8, of course, but the encoding problem is a headache. It's so troublesome...
We usually obtain the search keywords by analyzing the URL of the referring page. For example:
http://www.google.com/search?hl=zh-CN&q=%E5%AD%A4%E7%8B%AC&lr=
As you all know, the query string here is URL-encoded (urlencode).
We need two steps to get the keyword. The first step is urldecode. When we read ordinary request parameters, ASP does this for us, but here we have to decode manually. The second step is to convert the decoded text to the encoding of our own page.
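As a rough illustration of the first step, here is a minimal sketch in Python (not ASP, chosen only to keep the example short). It pulls the still-encoded `q` value out of the example URL and percent-decodes it as UTF-8. The helper name `extract_raw_param` is hypothetical and is not part of the code in this post.

```python
from urllib.parse import urlsplit, unquote_plus


def extract_raw_param(url, name):
    """Return the still-encoded value of query parameter `name`, or None."""
    query = urlsplit(url).query
    for pair in query.split("&"):
        key, _, value = pair.partition("=")
        if key == name:
            return value
    return None


url = "http://www.google.com/search?hl=zh-CN&q=%E5%AD%A4%E7%8B%AC&lr="
raw = extract_raw_param(url, "q")              # the raw, percent-encoded value
keyword = unquote_plus(raw, encoding="utf-8")  # Google sends UTF-8 here
print(keyword)
```

Decoding as UTF-8 only works because this particular engine sends UTF-8. For an engine that sends GB2312, the same bytes have to be decoded with that encoding instead.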
There are many functions online, but they all handle GB2312 and UTF-8 keywords for a GB2312 page. In that case it is easy: decode first, then determine the encoding from the search engine, and if it is UTF-8, convert it to GB2312.
But my website is a UTF-8 page, and for UTF-8 pages I only found urldecode functions that handle UTF-8 characters. I was stuck here for a long time, and in the end I could only use the worst method: submit the extracted keyword via xmlhttp to an ASP page encoded in GB2312, get back the garbled text (GB2312), and then convert GB2312 to UTF-8.
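For comparison, one alternative to routing the keyword through a GB2312 page is to percent-decode to raw bytes, try strict UTF-8, and fall back to GB2312 if that fails. The sketch below is in Python and is not the method used in this post. It assumes the keyword is always in one of those two encodings, and `decode_keyword` is a hypothetical name.

```python
from urllib.parse import quote, unquote_to_bytes


def decode_keyword(raw):
    """Percent-decode `raw`, trying UTF-8 first and GB2312 second."""
    # '+' stands for a space in query strings
    data = unquote_to_bytes(raw.replace("+", " "))
    try:
        return data.decode("utf-8")  # strict: raises on invalid UTF-8
    except UnicodeDecodeError:
        # Assumption: anything that is not valid UTF-8 is GB2312
        return data.decode("gb2312", errors="replace")


utf8_sample = "%E5%AD%A4%E7%8B%AC"
gb_sample = quote("孤独".encode("gb2312"))  # the same word, GB2312-encoded
print(decode_keyword(utf8_sample))
print(decode_keyword(gb_sample))
```

This is only a heuristic: a few GB2312 byte sequences also happen to be valid UTF-8, so choosing the encoding by search engine, as the code in this post does, is still the more predictable approach.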
The main implementation code follows.
Public Function GetSearchKeyword(RefererUrl) ' Search keywords
    If RefererUrl = "" Or Len(RefererUrl) < 1 Then Exit Function
    On Error Resume Next
    Dim re
    Set re = New RegExp
    re.IgnoreCase = True
    re.Global = True
    Dim a, b, j
    ' Fuzzy search keywords, this method is faster and has a larger range
    re.Pattern = "(word=([^&]*)|q=([^&]*)|p=([^&]*)|query=([^&]*)|name=([^&]*)|_searchkey=([^&]*)|baidu.*?w=([^&]*))"
    Set a = re.Execute(RefererUrl)
    If a.Count > 0 Then
        ' Use the last match in the URL
        Set b = a(a.Count - 1).SubMatches
        ' SubMatches is zero-based: index 0 is the outer group,
        ' indexes 1 to Count - 1 hold the parameter values
        For j = 1 To b.Count - 1
            If Len(b(j)) > 0 Then
                If InStr(1, RefererUrl, "google", 1) Then
                    GetSearchKeyword = Trim(U8Decode(b(j)))
                ElseIf InStr(1, RefererUrl, "yahoo", 1) Then
                    GetSearchKeyword = Trim(U8Decode(b(j)))
                ElseIf InStr(1, RefererUrl, "yisou", 1) Then
                    GetSearchKeyword = Trim(getkey(b(j)))
                ElseIf InStr(1, RefererUrl, "3721", 1) Then
                    GetSearchKeyword = Trim(getkey(b(j)))
                Else
                    GetSearchKeyword = Trim(getkey(b(j)))
                End If
                Exit Function