Server logs: The source of site statistics
Every web server keeps a set of log files that record every request
for a file by a browser and the response the server gave to that
request. In other words, each time you click a link on a page located
on www.bu.edu, a line of text describing the request is added to
a log file.
Log files become very large, very quickly. Web server log files
on www.bu.edu generally reach about 300 MB per day and are compressed
and archived every day to conserve disk space. BU maintains log
archives going back to 2000. Even after log files have been removed,
their data is preserved through site reports. NIS has been providing
site reports to web developers since April 1999.
Two types of log files are used to produce site statistics reports.
Access logs contain information such as date, time, IP address,
browser, filename, file type, server status, and amount of data
transferred for every file request.
Below is an excerpt from a typical access log (spaces added between
entries for readability):
cc-ma003.mita.cc.keio.ac.jp - - [23/Jan/2002:00:10:48 -0500] "GET /search/graphics/banner-spacer.gif HTTP/1.1" 200 89 0.001222 0.000000 0.000000 "http://web.bu.edu/search/" "Mozilla/4.0 (compatible; MSIE 5.5; Windows NT 5.0)"
cc-ma003.mita.cc.keio.ac.jp - - [23/Jan/2002:00:10:48 -0500] "GET /search/graphics/banner-title.gif HTTP/1.1" 200 2067 0.001270 0.000000 0.000000 "http://web.bu.edu/search/" "Mozilla/4.0 (compatible; MSIE 5.5; Windows NT 5.0)"
pm510-15.dialip.mich.net - - [23/Jan/2002:00:10:49 -0500] "GET /law/graphics/second/law-logo.gif HTTP/1.0" 200 2880 0.001849 0.000000 0.000000 "http://www.bu.edu/law/alumni/index.html" "Mozilla/4.7 [en] (Win98; I)"
cc-ma003.mita.cc.keio.ac.jp - - [23/Jan/2002:00:10:49 -0500] "GET /search/graphics/banner-widespacer.gif HTTP/1.1" 200 104 0.001430 0.000000 0.000000 "http://web.bu.edu/search/" "Mozilla/4.0 (compatible; MSIE 5.5; Windows NT 5.0)"
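To make the layout of an access log entry concrete, here is a minimal Python sketch that splits one entry (joined onto a single line) into named fields. The field names, the regular expression, and the function name parse_access_line are illustrative assumptions, not part of the server's configuration. The text does not say what the three decimal numbers after the byte count measure, so the sketch keeps them as an unlabeled tuple called "extra".

```python
import re
from datetime import datetime

# Assumed layout, read off the excerpt: host, two identity fields, timestamp,
# request line, status code, bytes sent, three numeric fields, referring page,
# and browser string. "extra" is a hypothetical name for the three numbers,
# whose meaning the source does not state.
ACCESS_RE = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<protocol>[^"]+)" '
    r'(?P<status>\d{3}) (?P<bytes>\d+|-) '
    r'(?P<extra>[\d.]+ [\d.]+ [\d.]+) '
    r'"(?P<referrer>[^"]*)" "(?P<browser>[^"]*)"'
)


def parse_access_line(line):
    """Return a dict of fields for one access log entry, or None if no match."""
    match = ACCESS_RE.match(line.strip())
    if match is None:
        return None
    entry = match.groupdict()
    # Convert the text fields that are really dates or numbers.
    entry["time"] = datetime.strptime(entry["time"], "%d/%b/%Y:%H:%M:%S %z")
    entry["status"] = int(entry["status"])
    entry["bytes"] = 0 if entry["bytes"] == "-" else int(entry["bytes"])
    entry["extra"] = tuple(float(value) for value in entry["extra"].split())
    return entry


# First entry from the excerpt, joined onto one line.
sample = (
    'cc-ma003.mita.cc.keio.ac.jp - - [23/Jan/2002:00:10:48 -0500] '
    '"GET /search/graphics/banner-spacer.gif HTTP/1.1" 200 89 '
    '0.001222 0.000000 0.000000 "http://web.bu.edu/search/" '
    '"Mozilla/4.0 (compatible; MSIE 5.5; Windows NT 5.0)"'
)

entry = parse_access_line(sample)
print(entry["host"], entry["path"], entry["status"], entry["bytes"])
```

A report generator would run something like this over every line of a log and then tally the results, for example by counting requests per path or summing the bytes transferred.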
Error logs track failed requests, such as when you click a link on
a website and get a "File Not Found" message. In addition to "true"
errors such as incorrect HTML code or scripts containing bugs, errors
can be caused by visitors mistyping file names or leaving a page
before all of its associated files have finished loading.
Below is an excerpt from a typical error log:
[Wed Jan 23 11:02:51 2002] [error] [client 24.128.190.210] File does not exist: /afs/bu.edu/cwis/web/r/o/roybal/articles/graphics/about_us.gif
[Wed Jan 23 11:02:51 2002] [error] [client 24.128.190.210] File does not exist: /afs/bu.edu/cwis/web/r/o/roybal/articles/graphics/publications_button_on.gif
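Error log entries can be picked apart in a similar way. The sketch below assumes the bracketed layout seen in the excerpt (timestamp, severity, client address, then a free-text message) and pulls out the missing file's path when the message begins with "File does not exist:". The names parse_error_line and missing_path are hypothetical helpers chosen for illustration.

```python
import re
from datetime import datetime

# Assumed layout, read off the excerpt:
# [timestamp] [severity] [client address] message
ERROR_RE = re.compile(
    r'\[(?P<time>[^\]]+)\] \[(?P<level>[^\]]+)\] '
    r'\[client (?P<client>[^\]]+)\] (?P<message>.*)'
)

MISSING_PREFIX = "File does not exist: "


def parse_error_line(line):
    """Return a dict of fields for one error log entry, or None if no match."""
    match = ERROR_RE.match(line.strip())
    if match is None:
        return None
    entry = match.groupdict()
    entry["time"] = datetime.strptime(entry["time"], "%a %b %d %H:%M:%S %Y")
    return entry


def missing_path(entry):
    """Return the path of the missing file, or None for other kinds of error."""
    message = entry["message"]
    if message.startswith(MISSING_PREFIX):
        return message[len(MISSING_PREFIX):]
    return None


# First entry from the excerpt, joined onto one line.
sample = (
    '[Wed Jan 23 11:02:51 2002] [error] [client 24.128.190.210] '
    'File does not exist: '
    '/afs/bu.edu/cwis/web/r/o/roybal/articles/graphics/about_us.gif'
)

entry = parse_error_line(sample)
print(entry["client"], missing_path(entry))
```

Counting how often each missing path appears is one simple way to find the broken links that visitors run into most often.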
Other servers, such as the Real media server, also keep logs. The
format of these logs varies, but they can also be analyzed for
information about content usage.