Understanding Site Statistics
 
 
 
 

Course Outline

Offered by Networked Information Services

Instructor: TBD
Prerequisites: A web site on www.bu.edu is recommended.
Duration: 2 hours


Intro: What are site statistics and where do they come from? (10 minutes)

1. Site statistics are a summary of who has viewed what on your site and when they viewed it.
  1. Web servers generate and store log files.
  2. Log files are rotated on a daily basis.
  3. Log file archives go back to 2000.
2. Web servers keep two types of log files: access and error.
  1. Access log files contain information for every file request, such as date, time, browser, IP address of visitor, file requested, previous file requested, a status code, and amount of data transferred.
  2. Error logs track failed requests, such as when a visitor gets a File Not Found error page.
  3. These examples are of web server logs. Other types of servers, such as Real servers, also keep logs but those follow a different format.
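
The access log fields listed above can be pulled out of a log line with a short script. This is a minimal sketch, assuming the server writes the common NCSA "combined" log format; the actual format on www.bu.edu is not stated here, and the sample line is invented for illustration.

```python
import re

# Assumed layout: NCSA "combined" log format. The real server may differ.
COMBINED = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<when>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) \S+" '
    r'(?P<status>\d{3}) (?P<bytes>\d+|-) '
    r'"(?P<referrer>[^"]*)" "(?P<browser>[^"]*)"'
)


def parse_access_line(line):
    """Return the fields of one access log line as a dict, or None if it does not match."""
    match = COMBINED.match(line)
    if match is None:
        return None
    entry = match.groupdict()
    entry["status"] = int(entry["status"])
    # A "-" in the size field means no data was sent.
    entry["bytes"] = 0 if entry["bytes"] == "-" else int(entry["bytes"])
    return entry


# Hypothetical log line, not taken from a real server.
SAMPLE = ('192.0.2.10 - - [06/Feb/2007:10:15:32 -0500] '
          '"GET /dept/index.html HTTP/1.0" 200 51200 '
          '"http://www.bu.edu/" "Mozilla/4.0 (compatible; MSIE 6.0)"')

entry = parse_access_line(SAMPLE)
print(entry["host"], entry["path"], entry["status"], entry["bytes"])
```

Here "referrer" corresponds to the "previous file requested" field, and "browser" to the browser identification string.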

Producing Stats Reports: The basics (10 minutes)

1. Who gets them: All departmental (not personal) sites on www.bu.edu receive both a site stats report and a broken links report.

2. How they're generated: Reports are produced by two programs running on the server, Analog 5.1 and ReportMagic 2.10. Broken link reports are generated by Linklint.

3. When they're generated: These reports are generated the first week of every month.

4. Where to find them: The reports are stored at: http://www.bu.edu/reports/<your directory name>

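5. Limitations of stats reports: they should be interpreted judiciously, as there are inherent difficulties with collecting complete and accurate data. For example, a page served from a browser or proxy cache never reaches the web server, so that view does not appear in the logs.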

Hands-on exercise: Understanding a stats report (20 minutes)

1. Summary of report items

  1. successful server requests: all requests, including graphics
  2. successful requests for pages: web pages only, does not include graphics or any other types of files (much smaller number)
  3. failed requests: file not found, incorrect permissions
  4. redirected requests: the user was sent to a different file than the one requested. The most common cause is requesting a directory name without the trailing slash. The other common cause is the use of redirects in "click-thru" advertising banners.
  5. distinct files requested: potentially of interest if you know how many total files comprise your site, to determine what percent are actually being visited
  6. distinct hosts served: the best indicator of how many visitors you have, particularly when viewed in the daily breakdown.
  7. total data transferred: if you have a 50k file, and it's transferred 10 times, the total data transferred is 500k. This is useful for getting a sense of how much bandwidth your site is utilizing.
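
The summary items above are all simple counts over the access log. The sketch below shows one way they could be computed. It uses simplified rules (status 2xx counts as successful, 3xx as redirected, 4xx and 5xx as failed, and a "page" is a path ending in .html, .htm, or a slash); these are assumptions for illustration, not the exact definitions used by Analog. The entries are made up, and "50k" is treated as 50,000 bytes.

```python
PAGE_SUFFIXES = (".html", ".htm", "/")  # assumed definition of a "page"


def summarize(entries):
    """Count the summary items for a list of (host, path, status, size) tuples."""
    summary = {"successful": 0, "pages": 0, "failed": 0,
               "redirected": 0, "bytes": 0}
    files, hosts = set(), set()
    for host, path, status, size in entries:
        if 200 <= status < 300:
            summary["successful"] += 1
            summary["bytes"] += size
            files.add(path)
            hosts.add(host)
            if path.endswith(PAGE_SUFFIXES):
                summary["pages"] += 1
        elif 300 <= status < 400:
            summary["redirected"] += 1
        elif status >= 400:
            summary["failed"] += 1
    summary["distinct_files"] = len(files)
    summary["distinct_hosts"] = len(hosts)
    return summary


# Hypothetical data: a 50,000-byte page requested 10 times by 5 hosts,
# plus one graphic, one redirect, and one File Not Found.
entries = [("host%d.example.edu" % (i % 5), "/dept/report.html", 200, 50_000)
           for i in range(10)]
entries += [
    ("host0.example.edu", "/dept/logo.gif", 200, 2_000),
    ("host9.example.com", "/dept", 301, 0),
    ("host9.example.com", "/dept/missing.html", 404, 0),
]

summary = summarize(entries)
print(summary)
```

The ten page transfers alone account for 500,000 bytes, matching the 50k-times-10 arithmetic above; the graphic adds another 2,000.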

2. Directory and Request

3. Domain and Host

  1. The domain is the part of a web address that indicates the network visitors are connecting from. In the US, the primary types are .edu (educational institutions), .org (non-profits), .com (corporate entities), .mil (military), and .gov (government).
  2. There are also country domains. These range from fairly common (.uk and .ca) to quite obscure, for example, .tr (Turkey) and .er (Eritrea). It's very interesting to see how far-flung your audience is. Note that the presence of a country domain does not necessarily mean that the visitor is physically located in that country, just that he or she is connecting to the Internet via a server that is using that country's domain.
  3. Host name indicates the specific network and computer visitors are coming from. Some will look very familiar, such as xyz.bu.edu and xyz.aol.com.
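
To make the relationship between host names and domains concrete, here is a minimal sketch that takes the last label of each host name as its domain and tallies the result. The host names are hypothetical, and hosts that appear only as numeric IP addresses are treated as having no domain.

```python
from collections import Counter


def top_level_domain(hostname):
    """Return the final label of a host name (e.g. ".edu"), or None for a bare IP address."""
    host = hostname.strip().rstrip(".").lower()
    parts = host.split(".")
    # An unresolved numeric address has no domain to report.
    if len(parts) < 2 or parts[-1].isdigit():
        return None
    return "." + parts[-1]


# Hypothetical visitor hosts.
hosts = ["xyz.bu.edu", "abc.bu.edu", "xyz.aol.com",
         "host.example.co.uk", "192.0.2.10"]

domains = [top_level_domain(h) for h in hosts]
counts = Counter(d for d in domains if d is not None)
print(counts.most_common())
```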

4. Referring Site and Referring URL reports

  1. These reports indicate what site and web page visitors viewed prior to your site.
  2. The Referring URL is particularly useful because it provides specific information about the content that is drawing people to your site.
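
The Referring Site is simply the host portion of the Referring URL. A minimal sketch of that relationship, using invented URLs and Python's standard urllib.parse module:

```python
from collections import Counter
from urllib.parse import urlsplit


def referring_site(referring_url):
    """Return the host part of a referring URL, or None if it has none (e.g. "-")."""
    return urlsplit(referring_url).hostname


# Hypothetical referring URLs as they might appear in an access log.
referrers = [
    "http://www.example.org/links.html",
    "http://www.example.org/resources/web.html",
    "http://www.bu.edu/",
    "-",
]

sites = Counter(site for site in map(referring_site, referrers)
                if site is not None)
print(sites.most_common())
```

Grouping by site shows where visitors come from overall, while the full URL shows which specific page carried the link.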

5. Search Word

  1. These are terms that visitors typed into various search engines that brought them to your site.
  2. You can get an idea of what topics people are most interested in. You can also see if the words you'd expect to be most frequently searched for appear at the top of the list.
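
Search words are typically recovered from the query string of the referring URL. The sketch below assumes the search terms arrive in a parameter named q, p, or query; those names are assumptions that cover some common engines, and a real report generator would use a longer, engine-specific list. The URLs are invented.

```python
from collections import Counter
from urllib.parse import parse_qs, urlsplit

# Assumed parameter names that carry the search terms.
QUERY_PARAMS = ("q", "p", "query")


def search_words(referring_url):
    """Return the lowercased search words found in a referring URL, or an empty list."""
    query = parse_qs(urlsplit(referring_url).query)
    for name in QUERY_PARAMS:
        if name in query:
            return query[name][0].lower().split()
    return []


# Hypothetical referring URLs.
referrers = [
    "http://www.google.com/search?q=site+statistics&hl=en",
    "http://search.yahoo.com/search?p=broken%20links",
    "http://www.google.com/search?q=Site+reports",
    "http://www.bu.edu/",
]

word_counts = Counter(word for url in referrers for word in search_words(url))
print(word_counts.most_common(3))
```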

6. Other (file type, file size, status, browser, redirection, failure, failed referrer)

Hands-on exercise: Broken links report (10 minutes)

1. The broken link report identifies incorrect references to pages that result in File Not Found errors.

  1. Your statistics include two separate reports of broken links: one report of links that point to sites within www.bu.edu, another for links that point outside www.bu.edu. Each entry in each report lists the page where the broken links occur, followed by a list of the broken links on that page.
  2. You can correct links that point to pages within your own site yourself. Note that several of the links frequently point to the same missing file or incorrect URL, so make sure you correct all of them.
  3. To fix broken links that point to sites off www.bu.edu, first check to see if the file has been renamed or moved (you may need to contact the site's webmaster to find out). However, if the file has been removed you will have to remove your link as well.
2. Broken anchors (#)
  1. Same as broken link reports, but for internal page anchors.
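
The layout of the two reports (links within www.bu.edu versus links outside it, each grouped by the page where they occur) can be sketched as follows. This illustrates the report structure only; it is not how Linklint works internally, and the pages and links are invented.

```python
from collections import defaultdict
from urllib.parse import urljoin, urlsplit

SITE_HOST = "www.bu.edu"


def group_broken_links(broken):
    """Split (page, broken_link) pairs into internal and external reports, keyed by page."""
    internal, external = defaultdict(list), defaultdict(list)
    for page, link in broken:
        # Resolve relative links against the page that contains them.
        absolute = urljoin(page, link)
        is_internal = urlsplit(absolute).hostname == SITE_HOST
        report = internal if is_internal else external
        report[page].append(absolute)
    return dict(internal), dict(external)


# Hypothetical broken links found on two pages.
broken = [
    ("http://www.bu.edu/dept/index.html", "staff.html"),
    ("http://www.bu.edu/dept/index.html", "http://www.example.org/old.html"),
    ("http://www.bu.edu/dept/about.html", "staff.html"),
]

internal, external = group_broken_links(broken)
for page, links in internal.items():
    print(page, "->", links)
```

In this made-up data, two different pages link to the same missing staff.html, which is the situation where every occurrence needs to be corrected.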

Getting the most from search engines (20 minutes)

1. Evolution of search engine technology
  1. Search engines originally indexed and ranked sites by counting how many times a keyword appeared on a page, by reading the meta tags in the page heading, or even by manually adding and ranking pages.
  2. Search engines now use "spider" technology to continually find and add new sites to their databases. Rankings are based on the concept of importance (that is, the more sites linking to your site, and the more highly ranked those sites are, the better your own site's ranking will be).
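
The idea of importance described above can be illustrated with a small iterative calculation in which each page repeatedly passes a share of its score to the pages it links to. This is a simplified, PageRank-style sketch and not any search engine's actual algorithm; the damping value, the number of rounds, and the four-page link graph are all assumptions for illustration.

```python
def link_importance(links, damping=0.85, rounds=50):
    """Score pages by incoming links, weighting each link by the linking page's own score."""
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    score = {page: 1.0 / n for page in pages}
    for _ in range(rounds):
        # Every page starts each round with a small baseline score.
        new = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                share = damping * score[page] / len(targets)
                for target in targets:
                    new[target] += share
            else:
                # A page with no outgoing links spreads its score evenly.
                for target in pages:
                    new[target] += damping * score[page] / n
        score = new
    return score


# Hypothetical site graph: three pages link to "hub", and "hub" links back to "a".
links = {"a": ["hub"], "b": ["hub"], "c": ["hub"], "hub": ["a"]}

scores = link_importance(links)
for page, value in sorted(scores.items(), key=lambda item: -item[1]):
    print(page, round(value, 3))
```

In this graph "hub" ranks highest because the most pages link to it, and "a" ranks above "b" and "c" because its single incoming link comes from the highly ranked "hub".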

2. Common myths about search engines

  1. You need to submit your pages to search engines.
  2. You need to be indexed in every search engine out there.
  3. You must have a short URL and/or a .com domain name.
  4. You must use keywords, preferably lots of them.
  5. You must have meta tags on every page.

Increasing traffic to your site (40 minutes)

1. There are two factors that drive people to your site: quality and quantity.

  1. Quality content is relevant, useful, and unique. If your site contains information found nowhere else, people will seek it out.
  2. Quality also implies quality of experience. A well-designed interface, visually appealing graphics, working links, and attention to writing style and grammar make your site attractive.
  3. Quantity means comprehensive coverage of your site's topics.
  4. In other words, your goal is to have your site be the authority on its subject, and once the visitor is there, be extensive enough to answer all the visitor's needs.

2. Hands-on exercise: Use the information from site reports to improve your site

  1. Correct or remove broken links: people will stay longer and return more frequently to your site if their experience is error-free and dependable.
  2. Identify the most frequently visited pages and consider increasing the number of pages devoted to that content. Note whether those pages have different navigation, design, etc. that people may find easier to use.
  3. Find out what sites people are coming from via the referrers page.
  4. Use the host report to see which spiders have visited your site. Note that overall, the biggest referrers are Yahoo, Google, and Inktomi (in use at such sites as CNET, MSN and AOL).

3. Publicize your site

  1. Include your URL on any document or printed item
    1. Newsletters
    2. Letterhead and envelopes
    3. Business cards
    4. Event brochures
    5. T-shirts, caps, and mugs
  2. Submit your site to BU Features or What's New when appropriate
  3. Press release or other news announcement
  4. Paid listings in search engines

Resources (10 minutes)

  1. searchenginewatch.com
  2. spider-food.net
  3. searchterms.com
  4. inktomi.com
  5. google.com
  6. yahoo.com
  7. webmonkey.com

 

NIS  |  OIT  |  Boston University  |   February 6, 2007