Interpreting Webalizer Reports
The first page
On this page, you'll see a graph such as this:

This is a summary view of the past 12 months, for comparison purposes. It lets you see long-term trends in your web traffic at a glance.
The only other item of interest on the first page of your Webalizer report is a table summarizing each month for which you have run a report. It gives both daily averages and monthly totals for Hits, Files, Pages and Visits, as well as monthly totals for Sites and Kilobytes*.
Summary by Month

| Month | Hits (daily avg) | Files (daily avg) | Pages (daily avg) | Visits (daily avg) | Sites (monthly total) | KBytes (monthly total) | Visits (monthly total) | Pages (monthly total) | Files (monthly total) | Hits (monthly total) |
|---|---|---|---|---|---|---|---|---|---|---|
| Jul 2004 | 55 | 13 | 2 | 1 | 97 | 7627 | 42 | 64 | 423 | 1718 |
| Jun 2004 | 629 | 263 | 9 | 3 | 114 | 157292 | 105 | 274 | 7896 | 18876 |
| May 2004 | 49 | 22 | 2 | 1 | 112 | 40322 | 51 | 76 | 665 | 1498 |
| Apr 2004 | 5 | 3 | 1 | 0 | 51 | 4714 | 24 | 34 | 118 | 160 |
| Mar 2004 | 6 | 2 | 1 | 1 | 56 | 98 | 33 | 37 | 78 | 210 |
| Feb 2004 | 13 | 4 | 1 | 1 | 73 | 1434 | 35 | 40 | 126 | 384 |
| Jan 2004 | 12 | 9 | 1 | 1 | 55 | 466 | 35 | 49 | 294 | 380 |
| Dec 2003 | 24 | 4 | 2 | 1 | 50 | 1192 | 44 | 62 | 148 | 732 |
| Nov 2003 | 24 | 3 | 1 | 1 | 63 | 284 | 42 | 50 | 92 | 728 |
| Oct 2003 | 89 | 55 | 3 | 1 | 58 | 28154 | 56 | 112 | 1708 | 2788 |
| Sep 2003 | 17 | 14 | 1 | 1 | 19 | 450 | 37 | 47 | 422 | 516 |
| Aug 2003 | 8 | 6 | 1 | 1 | 24 | 2797 | 34 | 42 | 208 | 272 |
| Totals | | | | | | 244830 | 538 | 887 | 12178 | 28262 |

Each month in the left hand column is a link to a more detailed breakdown of that month's traffic.
The Details
The data being presented to you takes the form of Hits, Files, Visits, Sites, Pages, Kilobytes, URLs, Referrers, User Agents and Response Codes. All of these data are generated by interpreting a series of web transfers as logged by our Apache web servers. Apache creates a line such as this:
   
66.196.90.216 - - [12/Aug/2004:04:02:03 -0400] "GET http://www.arrgh.net/music/data.php?composer_name=Gorecki HTTP/1.0" 200 659 "-" "Mozilla/5.0 (compatible; Yahoo! Slurp; http://help.yahoo.com/help/us/ysearch/slurp)"
  
Each such line breaks down to the following data:
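- The Internet (IP) address of the machine visiting your site
 - Two fields shown here as "-": the remote identity (identd) and the authenticated user name, neither of which was recorded for this request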
 - The date, time, and time zone as an offset from GMT
 - The specific request. In most cases, this will be GET, but there may also be POST and HEAD requests.
 - The URL being requested
 - The HTTP protocol used for the request
 - The response code
 - The number of bytes transferred
 - The referring page
 - The user agent
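
As a rough illustration of the breakdown above, the following Python sketch splits the example line into its fields with a regular expression. The pattern and the names `LOG_PATTERN` and `parse_line` are assumptions made for this example; they are not part of Webalizer or Apache, and real logs may contain lines this simple pattern does not handle.

```python
import re

# Hypothetical pattern for one log line in the format shown above.
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<url>\S+) (?P<protocol>[^"]+)" '
    r'(?P<status>\d{3}) (?P<bytes>\d+|-) '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)

def parse_line(line):
    """Return a dict of the fields in one log line, or None if it does not match."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    fields = match.groupdict()
    fields["status"] = int(fields["status"])
    # A "-" in the bytes field means no bytes were sent.
    fields["bytes"] = 0 if fields["bytes"] == "-" else int(fields["bytes"])
    return fields

# The example line from this page, split across string literals for readability.
EXAMPLE = ('66.196.90.216 - - [12/Aug/2004:04:02:03 -0400] '
           '"GET http://www.arrgh.net/music/data.php?composer_name=Gorecki HTTP/1.0" '
           '200 659 "-" "Mozilla/5.0 (compatible; Yahoo! Slurp; '
           'http://help.yahoo.com/help/us/ysearch/slurp)"')

record = parse_line(EXAMPLE)
for name, value in record.items():
    print(f"{name}: {value}")
```

In this line the referrer is "-", meaning the visitor (a search engine crawler, judging by the user agent) did not report a referring page.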
 
From this data, Webalizer builds various views of your web traffic.
Definitions
- Hits measure the total number of requests made to the server
    during the given time period (month, day, hour, etc.).  Each separate
    item on a single page will produce a hit when the page is requested.
    For example, if you have a page with 3 graphics and some text, a request
    for that page will (usually) result in 4 hits.
 - Files measure the total number of hits (requests) that actually
    resulted in something being sent back to the user. Not all hits will
    send data, such as 404-Not Found requests and requests for pages that
    are already in the browser's cache.
 - A Site is a unique IP address (or hostname, if you are doing
    name resolution) that made requests to the server. This is less useful
    than it may appear, because many different computers can share a
    single address, and the same visitor can also visit from many addresses,
    so it should be used simply as a rough gauge as to the number of visitors
    to your server.
 - Pages are those URLs that would be considered the actual page
    being requested, and not all of the individual items that make it up
    (such as images and audio clips).  Webalizer's default is to consider
    any URL that has an extension of .htm, .html or .cgi as a Page.  If 
    you use PHP at Panix, you might wish to add lines to your
    webalizer.conf file to add .php to this list:
    
PageType .htm*
PageType .cgi
PageType .php
 - A Visit is recorded when some remote site makes a request for a 
    page on your server for the first time.   As long as the same site keeps
    making requests within a given timeout period, they will all be 
    considered part of the same visit. If the site makes a request to your
    server, and the length of time since the last request is greater than
    the specified timeout (default is 30 minutes), a new visit is counted. 
    Since only pages will trigger a visit, remote sites that link 
    to graphic and other non-page URLs will not be counted in the visit
    totals.
 - A KByte* (KB) is 1024 bytes (1 Kilobyte). Used to show the amount
    of data that was transferred between the server and remote machines, 
    based on the data found in the server log.  Note that at Panix, the 
    logs you should be using for webalizer do not accurately reflect the 
    total number of Kilobytes sent, so webalizer's Kilobyte count should 
    not be used for accounting purposes, such as double checking transfers 
    on your bill.  See below for more information on this.
 - URL - Uniform Resource Locator.  All requests made to a web 
    server need to request something. A URL is that something, and represents 
    an object somewhere on your server, that is accessible to the remote user, 
    or results in an error (e.g. 404 - Not Found). URLs can be of any file type.
 - Referrers are those URLs that lead a user to your site or caused 
    the browser to request something from your server. The vast majority of 
    requests are made from your own URLs, since most HTML pages contain links 
    to other objects such as graphics files. If one of your HTML pages contains 
    links to 5 images, then each request for the HTML page will produce 
    5 more hits with your page as referrer.
 - Search Strings are obtained by examining the referrer string 
    and looking for known patterns from various search engines. The search 
    engines and the patterns to look for can be specified by the user within 
    a configuration file. The default will catch most of the major ones.
 - User Agents are software programs which connect to the web server 
    and make requests.  Most User Agents are browsers, such as IE, Mozilla 
    or Netscape.  Each user agent reports itself in a unique way to your 
    server. Keep in mind, however, that many browsers allow the user to change 
    the reported name, so you might see some obviously fake names in the 
    listing.
 - Entry and Exit pages are those pages that were the first 
    requested in a visit (Entry), and the last requested (Exit). These pages 
    are calculated using the visits logic above. When a visit is first 
    triggered, the requested page is counted as an Entry page, and the 
    last URL requested in that visit is counted as an Exit page.
 - Countries are determined based on the top level domain of the
    requesting site. This is questionable however, as there is no longer 
    strong enforcement of domains as there was in the past. A .COM domain
    may reside in the US, or somewhere else. An .IL domain may actually be 
    in Israel, however it may also be located in the US or elsewhere. 
    A large percentage may also be shown as Unresolved/Unknown because a
    fairly large percentage of dialup and other customer access points do 
    not resolve IP addresses to a name, and so are left as an IP address.
    If you are not doing name resolution in your reports, all hits will be
    recorded as Unresolved/Unknown here.
 - Response Codes are defined as part of the HTTP/1.1 protocol (RFC 2068; See Chapter 10). These codes are generated by the web server and indicate the completion status of each request made to it.
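
To make the definitions of Hits, Files, Pages, Sites and Visits more concrete, here is a small Python sketch that tallies them from a handful of made-up requests. It is a simplification under stated assumptions, not Webalizer's actual code: a File is taken to be any hit with a 200 response, a Page is any URL ending in .htm*, .cgi or .php, and the function names, addresses and sample requests are all hypothetical.

```python
import posixpath
from datetime import datetime, timedelta
from urllib.parse import urlsplit

VISIT_TIMEOUT = timedelta(minutes=30)  # the default timeout described above

def is_page(url):
    """Mimic PageType .htm*, .cgi and .php (assumed configuration)."""
    ext = posixpath.splitext(urlsplit(url).path)[1].lower()
    return ext.startswith(".htm") or ext in (".cgi", ".php")

def tally(records):
    """Count hits, files, pages, sites and visits for a list of request dicts."""
    hits = files = pages = visits = 0
    last_seen = {}  # host -> time of its latest request inside a visit
    for rec in sorted(records, key=lambda r: r["time"]):
        hits += 1                          # every request is a hit
        if rec["status"] == 200:           # assumption: 200 means data was sent
            files += 1
        page = is_page(rec["url"])
        if page:
            pages += 1
        host = rec["host"]
        previous = last_seen.get(host)
        expired = previous is None or rec["time"] - previous > VISIT_TIMEOUT
        if expired:
            if page:                       # only a page request can start a visit
                visits += 1
                last_seen[host] = rec["time"]
        else:
            last_seen[host] = rec["time"]  # request continues the current visit
    sites = len({rec["host"] for rec in records})
    return {"hits": hits, "files": files, "pages": pages,
            "sites": sites, "visits": visits}

def at(hour, minute, second=0):
    return datetime(2004, 8, 12, hour, minute, second)

# Hypothetical requests from two made-up addresses.
SAMPLE = [
    {"host": "192.0.2.1", "time": at(10, 0, 0), "url": "/index.html", "status": 200},
    {"host": "192.0.2.1", "time": at(10, 0, 1), "url": "/logo.png", "status": 200},
    {"host": "192.0.2.1", "time": at(10, 0, 2), "url": "/missing.gif", "status": 404},
    {"host": "192.0.2.2", "time": at(10, 5, 0), "url": "/banner.jpg", "status": 200},
    {"host": "192.0.2.1", "time": at(10, 45, 0), "url": "/about.php", "status": 200},
]

result = tally(SAMPLE)
print(result)
```

With this sample, the first address produces two visits because its second page request arrives more than 30 minutes after its previous request, while the second address produces a hit and a file but no visit, since it only fetched an image.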
 
* You cannot use the KBytes figure reported by Webalizer to check your billed transfers. Panix uses web accelerators known as Squids. The Squids cache pages and, if the actual page has not changed, serve the page from their cache rather than from the web server. In the process, duplicate log entries are created for a File: one for the Squids and one for the web server. The web server log entry will not show any bytes transferred, however, so you need to get Squid logs as well as web logs to check on bytes transferred. This can be done using the '-a' switch to getlogs. We do not do this by default for Webalizer processing because the duplicate log entries would render the rest of Webalizer's statistics grossly inaccurate.
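
The effect on the KBytes figure can be sketched in a few lines of Python. The byte values below are invented purely for illustration, and `kbytes_transferred` is a hypothetical helper, not part of Webalizer or getlogs. It simply adds up the bytes field of each log entry and divides by 1024.

```python
def kbytes_transferred(byte_fields):
    """Sum the bytes field of log entries and convert to KBytes (1024 bytes).

    A "-" in the bytes field is treated as zero bytes sent.
    """
    total = sum(0 if field == "-" else int(field) for field in byte_fields)
    return total / 1024

# Hypothetical web server entries: two of the three show no bytes,
# because a cache answered those requests.
web_server_entries = ["2048", "-", "-"]

# Hypothetical cache-side entries for the two requests served from cache.
cache_entries = ["1024", "3072"]

web_only = kbytes_transferred(web_server_entries)
combined = kbytes_transferred(web_server_entries + cache_entries)
print(web_only, combined)
```

In this made-up case the web server log alone accounts for 2 KBytes, while the two logs together account for 6 KBytes, which is why a count based only on the web server log understates the total.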