Tracking Baiduspider Through Nginx Logs on Linux

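People new to running a website always have the same question: has Baidu actually looked at my site, and how can I check when Baiduspider crawled it?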

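The Nginx logs can answer this. They record details such as each visitor's source IP, the client (user agent) they used, and how often a given URL was requested.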

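Used well, the logs yield a lot of valuable information. This post briefly shows a convenient way to check how often Baiduspider visits, which pages it fetched, and the full details of each crawl.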
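As a rough illustration of pulling those fields out of a log, here is a minimal sketch. It assumes the log uses Nginx's default "combined" format, which the sample lines in this post appear to follow. The helper names `top_ips` and `top_agents` are made up for this example.

```shell
# Assumption: Nginx "combined" log format.
# top_ips and top_agents are hypothetical helper names.

# Requests per client IP (the first whitespace-separated field),
# busiest first.
top_ips() {
    awk '{ n[$1]++ } END { for (ip in n) print n[ip], ip }' "$1" | sort -rn
}

# Requests per user agent. Splitting each line on double quotes gives:
# field 2 = request line, field 4 = referer, field 6 = user agent.
top_agents() {
    awk -F'"' '{ n[$6]++ } END { for (ua in n) print n[ua], ua }' "$1" | sort -rn
}

# Usage (the path is a placeholder for your own log file):
#   top_ips /home/wwwlogs/yourdomain.log | head
#   top_agents /home/wwwlogs/yourdomain.log | head
```

Counting inside awk rather than with `uniq -c` keeps the output free of padding, so each line is simply the count followed by the value.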

Connect over SSH and locate the Nginx logs:

cd /home/wwwlogs

root@:/home/wwwlogs# ls
access.log  nginx_error.log  yourdomain.log

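Let's see how many lines mention baidu.com. The result is rather pitiful: only 172 for a new site that has been online for several months. Note that this count includes every line containing baidu.com, such as visitors referred from Baidu, not just the spider itself.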

root@:/home/wwwlogs# cat yourdomain.log | grep baidu.com | wc -l
172
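Because the pattern baidu.com also matches the referer field, the figure above mixes crawler hits with ordinary visitors who arrived from Baidu. One possible way to separate the two is to match on the user agent instead. The sketch below assumes the "combined" log format. The helper names `count_spider` and `count_baidu_referrals` are invented for illustration. A user agent string can be forged, so treat the spider count as an approximation.

```shell
# Hypothetical helpers; assume Nginx "combined" log format.

# Lines whose text contains the Baiduspider user agent token.
# grep -c prints the number of matching lines.
count_spider() {
    grep -c 'Baiduspider' "$1"
}

# Lines whose referer (4th field when splitting on double quotes)
# mentions baidu.com but whose user agent (6th field) is not the spider.
count_baidu_referrals() {
    awk -F'"' '$4 ~ /baidu\.com/ && $6 !~ /Baiduspider/ { n++ }
               END { print n + 0 }' "$1"
}

# Usage (placeholder path):
#   count_spider /home/wwwlogs/yourdomain.log
#   count_baidu_referrals /home/wwwlogs/yourdomain.log
```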

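The log shows that Baiduspider is not keen on visiting: it turns up about once a week at best, and at this rate it might as well be once a month.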
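To put numbers on how often the spider comes by, the hits can be grouped by day. This is a minimal sketch that assumes the "combined" log format, where the fourth whitespace-separated field looks like `[18/Apr/2021:00:34:22`. The function name `spider_visits_per_day` is hypothetical.

```shell
# Hypothetical helper; assumes Nginx "combined" log format.
# Prints "DD/Mon/YYYY count" for each day with Baiduspider hits,
# in the order the days first appear in the log.
spider_visits_per_day() {
    awk '/Baiduspider/ {
             day = substr($4, 2, 11)      # strip "[" and the time part
             if (!(day in n)) order[++k] = day
             n[day]++
         }
         END { for (i = 1; i <= k; i++) print order[i], n[order[i]] }' "$1"
}

# Usage (placeholder path):
#   spider_visits_per_day /home/wwwlogs/yourdomain.log
```

Long gaps between the dates printed are a sign that the spider visits rarely.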

cat yourdomain.log | grep baidu.com

108.162.215.* - - [17/Apr/2021:07:41:52 +0800] "GET / HTTP/1.1" 200 27399 "http://baidu.com/" "Mozilla/5.0 (Linux; U; Android 8.1.0; zh-CN; EML-AL00 Build/HUAWEIEML-AL00) AppleWebKit/537.36 (KHTML, like Gecko) Version/4.0 Chrome/57.0.2987.108 baidu.sogo.uc.UCBrowser/11.9.4.974 UWS/2.13.1.48 Mobile Safari/537.36 AliApp(DingTalk/4.5.11) com.alibaba.android.rimet/10487439 Channel/227200 language/zh-CN"
172.69.35.* - - [18/Apr/2021:00:34:22 +0800] "GET / HTTP/1.1" 200 25242 "-" "Mozilla/5.0 (compatible; Baiduspider/2.0; +http://www.baidu.com/search/spider.html)"
172.69.35.* - - [19/Apr/2021:12:47:27 +0800] "GET /sitemap.txt HTTP/1.1" 304 0 "-" "Mozilla/5.0 (compatible; Baiduspider/2.0; +http://www.baidu.com/search/spider.html)"
108.162.215.* - - [20/Apr/2021:08:31:34 +0800] "GET / HTTP/1.1" 200 28260 "http://baidu.com/" "Mozilla/5.0 (Linux; U; Android 8.1.0; zh-CN; EML-AL00 Build/HUAWEIEML-AL00) AppleWebKit/537.36 (KHTML, like Gecko) Version/4.0 Chrome/57.0.2987.108 baidu.sogo.uc.UCBrowser/11.9.4.974 UWS/2.13.1.48 Mobile Safari/537.36 AliApp(DingTalk/4.5.11) com.alibaba.android.rimet/10487439 Channel/227200 language/zh-CN"
108.162.215.* - - [20/Apr/2021:08:32:54 +0800] "GET / HTTP/1.1" 200 28076 "http://baidu.com/" "Mozilla/5.0 (Linux; U; Android 8.1.0; zh-CN; EML-AL00 Build/HUAWEIEML-AL00) AppleWebKit/537.36 (KHTML, like Gecko) Version/4.0 Chrome/57.0.2987.108 baidu.sogo.uc.UCBrowser/11.9.4.974 UWS/2.13.1.48 Mobile Safari/537.36 AliApp(DingTalk/4.5.11) com.alibaba.android.rimet/10487439 Channel/227200 language/zh-CN"
172.69.35.* - - [28/Apr/2021:16:45:48 +0800] "GET /?s=%E7%85%8E%E9%A5%BC%E4%BE%A0 HTTP/1.1" 200 9301 "https://tongji.baidu.com/" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/90.0.4430.93 Safari/537.36"
172.69.35.* - - [29/Apr/2021:17:18:46 +0800] "GET / HTTP/1.1" 200 14432 "-" "Mozilla/5.0 (compatible; Baiduspider/2.0; +http://www.baidu.com/search/spider.html)"
172.69.35.* - - [29/Apr/2021:17:19:02 +0800] "GET / HTTP/1.1" 200 15143 "-" "Mozilla/5.0 (Linux;u;Android 4.2.2;zh-cn;) AppleWebKit/534.46 (KHTML,like Gecko) Version/5.1 Mobile Safari/10600.6.3 (compatible; Baiduspider/2.0; +http://www.baidu.com/search/spider.html"
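To see which pages the spider actually fetched, the request path and status code can be tallied for spider lines only. The following sketch assumes the "combined" log format, where the 7th whitespace-separated field is the path and the 9th is the HTTP status. `spider_paths` is a hypothetical name.

```shell
# Hypothetical helper; assumes Nginx "combined" log format.
# Prints "count path status" for requests made by Baiduspider,
# most frequent first.
spider_paths() {
    awk '/Baiduspider/ { n[$7 " " $9]++ }
         END { for (k in n) print n[k], k }' "$1" | sort -rn
}

# Usage (placeholder path):
#   spider_paths /home/wwwlogs/yourdomain.log
```

Of the eight lines shown above, only four come from the spider: three fetches of `/` with status 200 and one of `/sitemap.txt` with status 304. The other four are ordinary visitors whose referer points at Baidu.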

Clearly, life is not easy for a new site. Persistence is what wins in the end.
