Funny requests on my webserver
Now and then I have a look at the logs of my webserver
(http://www.mavetju.org, currently hosted on topaz.mdcc.cx). Not
everything makes sense in how browsers identify themselves
and in what they request...
Fast downloading
Programs like GetRight use several separate TCP streams to make
downloading faster. But that doesn't always turn out to be smarter...
ngrep-lib-1.1.tar.gz is 660735 bytes, but this client downloaded
1661886 bytes, more than two and a half times the size.
203.162.44.68 - - [21/Oct/2001:07:18:49 +0200] "GET /download/ngrep-lib-1.1.tar.gz HTTP/1.0" 206 133459
203.162.44.68 - - [21/Oct/2001:07:20:48 +0200] "GET /download/ngrep-lib-1.1.tar.gz HTTP/1.0" 206 71359
203.162.44.68 - - [21/Oct/2001:07:20:56 +0200] "GET /download/ngrep-lib-1.1.tar.gz HTTP/1.0" 206 64767
203.162.44.68 - - [21/Oct/2001:07:21:36 +0200] "GET /download/ngrep-lib-1.1.tar.gz HTTP/1.0" 206 90680
203.162.44.68 - - [21/Oct/2001:07:21:36 +0200] "GET /download/ngrep-lib-1.1.tar.gz HTTP/1.0" 206 140359
203.162.44.68 - - [21/Oct/2001:07:21:49 +0200] "GET /download/ngrep-lib-1.1.tar.gz HTTP/1.0" 206 83779
203.162.44.68 - - [21/Oct/2001:07:22:28 +0200] "GET /download/ngrep-lib-1.1.tar.gz HTTP/1.0" 206 104479
203.162.44.68 - - [21/Oct/2001:07:23:25 +0200] "GET /download/ngrep-lib-1.1.tar.gz HTTP/1.0" 206 118279
203.162.44.68 - - [21/Oct/2001:07:23:32 +0200] "GET /download/ngrep-lib-1.1.tar.gz HTTP/1.0" 206 46335
203.162.44.68 - - [21/Oct/2001:07:24:23 +0200] "GET /download/ngrep-lib-1.1.tar.gz HTTP/1.0" 206 89299
203.162.44.68 - - [21/Oct/2001:07:24:39 +0200] "GET /download/ngrep-lib-1.1.tar.gz HTTP/1.0" 206 61699
203.162.44.68 - - [21/Oct/2001:07:24:41 +0200] "GET /download/ngrep-lib-1.1.tar.gz HTTP/1.0" 206 108619
203.162.44.68 - - [21/Oct/2001:07:24:44 +0200] "GET /download/ngrep-lib-1.1.tar.gz HTTP/1.0" 206 89299
203.162.44.68 - - [21/Oct/2001:07:25:24 +0200] "GET /download/ngrep-lib-1.1.tar.gz HTTP/1.0" 206 69979
203.162.44.68 - - [21/Oct/2001:07:25:34 +0200] "GET /download/ngrep-lib-1.1.tar.gz HTTP/1.0" 206 7423
203.162.44.68 - - [21/Oct/2001:07:25:46 +0200] "GET /download/ngrep-lib-1.1.tar.gz HTTP/1.0" 206 81019
203.162.44.68 - - [21/Oct/2001:07:26:58 +0200] "GET /download/ngrep-lib-1.1.tar.gz HTTP/1.0" 206 15615
203.162.44.68 - - [21/Oct/2001:07:27:32 +0200] "GET /download/ngrep-lib-1.1.tar.gz HTTP/1.0" 206 82399
203.162.44.68 - - [21/Oct/2001:07:27:34 +0200] "GET /download/ngrep-lib-1.1.tar.gz HTTP/1.0" 206 67219
203.162.44.68 - - [21/Oct/2001:07:27:43 +0200] "GET /download/ngrep-lib-1.1.tar.gz HTTP/1.0" 206 65840
203.162.44.68 - - [21/Oct/2001:07:30:01 +0200] "GET /download/ngrep-lib-1.1.tar.gz HTTP/1.0" 206 69980
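The total can be checked by adding up the last field of every 206 (Partial Content) line. Below is a minimal sketch in Python, assuming the log lines look like the ones above; the function name `partial_bytes` and the three-line sample are only there for illustration.

```python
import re

# Matches the '"METHOD path protocol" status bytes' part of a log line.
LINE = re.compile(
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) (?P<size>\d+|-)'
)

def partial_bytes(lines, path):
    """Sum the bytes sent in 206 (Partial Content) responses for one path."""
    total = 0
    for line in lines:
        m = LINE.search(line)
        if m and m["path"] == path and m["status"] == "206" and m["size"] != "-":
            total += int(m["size"])
    return total

# The first three lines of the excerpt above.
sample = [
    '203.162.44.68 - - [21/Oct/2001:07:18:49 +0200] "GET /download/ngrep-lib-1.1.tar.gz HTTP/1.0" 206 133459',
    '203.162.44.68 - - [21/Oct/2001:07:20:48 +0200] "GET /download/ngrep-lib-1.1.tar.gz HTTP/1.0" 206 71359',
    '203.162.44.68 - - [21/Oct/2001:07:20:56 +0200] "GET /download/ngrep-lib-1.1.tar.gz HTTP/1.0" 206 64767',
]

print(partial_bytes(sample, "/download/ngrep-lib-1.1.tar.gz"))  # → 269585
```

Run over all 21 lines of the excerpt, the same sum comes to the 1661886 bytes mentioned, against 660735 bytes for the file itself.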
And which version of Windows do you use?
It's friendly of some web browsers to tell which operating system
the client runs, but don't believe everything you see:
f-218-189.frankfurt.ipdial.viaginterkom.de - - [21/Oct/2001:14:49:20 +0200] "GET /networking/tools.phtml HTTP/1.0" 200 1568 "http://www.mavetju.org/networking/tools.phtml" "SpaceBison/0.01 [fu] (Win67; X; ShonenKnife)" 1 www.mavetju.org
29.61.226.200.in-addr.arpa.ig.com.br - - [02/Sep/2001:21:51:10 +0200] "GET /networking/programming.phtml HTTP/1.1" 200 915 "http://freshmeat.net/releases/56602/" "Mozilla/4.0 (compatible; MSIE 5.0; Linux 2.2.19 i686) Opera 5.0 [en]" 1 www.mavetju.org
206.33.106.47 - - [04/Sep/2001:16:18:45 +0200] "GET /networking/programming.phtml HTTP/1.0" 200 903 "'http://www.boston.com/'" "Mozilla/6.6.6 [en] (GNU/Hurd; i686)" 1 www.mavetju.org
gatelord1.hen.nl - - [24/Jan/2002:08:51:51 +0100] "GET /unix/general.php HTTP/1.0" 200 4574 "-" "Mozilla/4.0 (compatible; MSIE 5.5; CP/M; 64-bit high_encr)" 1 www.mavetju.org
rs2-pc05.externet.hu - - [24/Jan/2002:09:08:48 +0100] "GET /download/dnstracer-1.2.tar.gz HTTP/1.0" 200 79750 "-" "Nutscrape/R9.0 (CP/M; 8-bit)" 12 www.mavetju.org
Politically correct clients
Imagine seeing this browser identification coming from a .mil domain...
bu-wcs1-kelly.nipr.mil - - [06/Feb/2002:22:34:45 +0100] "GET /unix/general.php HTTP/1.0" 200 4660 "-" "Nuk-em/0.01- (RadarRange; 1-nibble)" 0 www.mavetju.org
Additional tools installed
It's so easy for Windows programs to add more stuff to the browser
identification field that it's not funny.
c1780508-a.grlnd1.tx.home.com - - [22/Oct/2001:05:32:34 +0200] "GET /programming/php.phtml HTTP/1.1" 200 2859 "http://www.hotscripts.com/Detailed/11677.html" "Mozilla/4.0 (compatible; MSIE 5.5; Windows 98; Win 9x 4.90)" 1 www.mavetju.org
student133108.resnet.potsdam.edu - - [22/Oct/2001:05:37:22 +0200] "GET /unix/whyIdontuselinux.phtml HTTP/1.1" 200 8765 "http://www.google.com/search?hl=en&q=DESQview+unix" "Mozilla/4.0 (compatible; MSIE 5.0; Windows 98; DigExt)" 3 www.mavetju.org
211.23.203.22 - - [22/Oct/2001:18:31:18 +0200] "GET / HTTP/1.1" 200 3008 "http://www.openfind.com.tw/cgi-bin/tw/webquery?query=%70%72%6f%62%6f%61%72%64&database=TW&group_by=site&pass=1" "Mozilla/4.0 (compatible; MSIE 5.5; Windows NT 5.0; COM+ 1.0.2204; .NET CLR 1.0.2914)" 4 www.mavetju.org
ac90a01b.ipt.aol.com - - [25/Oct/2001:20:53:44 +0200] "GET / HTTP/1.1" 200 3095 "http://the-infinite.org/lists/romlist/2001/10/msg00113.html" "Mozilla/4.0 (compatible; MSIE 5.01; Windows NT 5.0; MSOCD; ONDOWN3.2; AtHome021SI)" 1 www.mavetju.org
ppp-205-187.movi.com.ar - - [23/Aug/2001:07:58:47 +0200] "GET /download/php-radius-1.0.tar.gz HTTP/1.1" 200 2994 "http://www.hotscripts.com/New/2001-08-22/PHP.html" "Mozilla/4.0 (compatible; MSIE 5.01; Windows 98; Nanosoft, Laenciclopedia, onlinesoft)" 2 www.mavetju.org
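The additions all sit in the parenthesised part of the identification string, separated by semicolons. Here is a rough sketch in Python that lists them; the function name `ua_extras` is made up, and treating only "compatible", the MSIE version and the Windows version as the stock tokens is my own simplification.

```python
import re

def ua_extras(user_agent):
    """Return the tokens in the parenthesised part of a User-Agent string
    that are not 'compatible', an MSIE version or a Windows version."""
    m = re.search(r'\(([^)]*)\)', user_agent)
    if not m:
        return []
    tokens = [t.strip() for t in m.group(1).split(';')]
    # Assumption: these prefixes are what a stock MSIE would send.
    stock = ('compatible', 'MSIE ', 'Windows ')
    return [t for t in tokens if t and not t.startswith(stock)]

ua = ("Mozilla/4.0 (compatible; MSIE 5.01; Windows NT 5.0; "
      "MSOCD; ONDOWN3.2; AtHome021SI)")
print(ua_extras(ua))  # → ['MSOCD', 'ONDOWN3.2', 'AtHome021SI']
```

Anything left over was added by something other than the browser itself, or by the user.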
People with too much knowledge...
The trick above can of course also be used by people who know how
to hack their registry:
aberration.reticent.org - - [05/Jun/2001:17:48:17 +0200] "GET /unix/general.phtml HTTP/1.1" 200 1486 "http://piss.me.off" ""Mozilla/7.9 (compatible; MSIE 6.0; MS Sux Ass)"" 4 www.mavetju.org
host213-122-153-121.btinternet.com - - [23/Aug/2001:15:28:12 +0200] "GET /programming/php.phtml HTTP/1.1" 200 2013 "http://www.hotscripts.com/New/2001-08-22/PHP.html" "Mozilla/4.0 (compatible; MSIE 5.5; Windows NT 5.0; BTinternet V8.4; 133T Hax0r)" 0 www.mavetju.org
Hardware support
Some hardware is so happy that it wants the whole world to know it...
proxy1.fm.intel.com - - [17/Oct/2001:20:05:58 +0200] "GET / HTTP/1.0" 200 2996 "-" "Mozilla/4.0 (compatible; MSIE 5.01; Windows NT 5.0; DVD Owner)" 0 www.mavetju.org
Strange referer-fields
The ignore list on freshmeat holds all the projects a user has
chosen to ignore. So how come there are still visits referred from
it, despite the project being on that list?
v2test - - [16/Nov/2001:07:10:27 +0100] "GET /networking/ipfw-graph-changes.txt HTTP/1.1" 200 586 "http://freshmeat.net/filters/ignore/17888/" "Mozilla/4.0 (compatible; MSIE 5.5; Window NT 5.0)" 0 www.mavetju.org
Stupid webcrawlers
Webcrawlers are a service to the public. But not all webcrawlers
are smart: you would expect a crawler to need to download robots.txt
only once per archival run.
crawl8-public.alexa.com - - [04/Nov/2001:19:56:15 +0100] "GET /robots.txt HTTP/1.0" 200 126 "-" "ia_archiver" 0 mavetju.org
crawl8-public.alexa.com - - [04/Nov/2001:19:56:21 +0100] "GET /pics/wedding/photos/dsc00061.jpg HTTP/1.0" 200 596409 "-" "ia_archiver" 6 mavetju.org
crawl8-public.alexa.com - - [04/Nov/2001:20:33:35 +0100] "GET /robots.txt HTTP/1.0" 200 126 "-" "ia_archiver" 0 mavetju.org
crawl8-public.alexa.com - - [04/Nov/2001:20:33:36 +0100] "GET /pics/wedding/thumbs/dsc00001.jpg HTTP/1.0" 200 7783 "-" "ia_archiver" 0 mavetju.org
crawl8-public.alexa.com - - [04/Nov/2001:21:05:34 +0100] "GET /robots.txt HTTP/1.0" 200 126 "-" "ia_archiver" 0 mavetju.org
crawl8-public.alexa.com - - [04/Nov/2001:21:05:35 +0100] "GET /pics/wedding/thumbs/dsc00002.jpg HTTP/1.0" 200 7451 "-" "ia_archiver" 0 mavetju.org
crawl8-public.alexa.com - - [04/Nov/2001:21:37:01 +0100] "GET /robots.txt HTTP/1.0" 200 126 "-" "ia_archiver" 0 mavetju.org
crawl8-public.alexa.com - - [04/Nov/2001:21:37:02 +0100] "GET /pics/wedding/thumbs/dsc00003.jpg HTTP/1.0" 200 7642 "-" "ia_archiver" 0 mavetju.org
crawl8-public.alexa.com - - [04/Nov/2001:22:14:20 +0100] "GET /robots.txt HTTP/1.0" 200 126 "-" "ia_archiver" 0 mavetju.org
crawl8-public.alexa.com - - [04/Nov/2001:22:14:21 +0100] "GET /pics/wedding/thumbs/dsc00004.jpg HTTP/1.0" 200 7014 "-" "ia_archiver" 0 mavetju.org