Bitrix Site Manager

Terms and definitions

Below is the list of terms on which the Statistics module is based.

Advertising campaign
The Statistics module provides a mechanism for counting visitors of the site. Each visitor thread can be identified by a number of features:
  1. Presence of the two parameters referer1, referer2 in links to your site (or their synonyms defined in the module settings). Example: http://www.bitrixsoft.com/?referer1=bsm&referer2=doc.
  2. Search engines that bring you visitors.
  3. Referring page masks.
  4. Masks of pages of your site viewed by your visitors.
For each visitor thread, the following indices are counted separately:
  1. Sessions.
  2. Hits.
  3. Hosts.
  4. Visitors.
  5. New visitors.
  6. Visitors who added the site to the Favorites.
  7. All events initiated by visitors of this advertising campaign.
Most of these figures are reported as two separate values in the advertising campaign reports:
  • for direct hits under an advertising campaign;
  • for returns under an advertising campaign.

Note
An advertising campaign is identified only at the first hit of a session.

Advertising campaign identifiers (referer1, referer2)
Advertising campaign (AC) identifiers are two arbitrary strings that identify the AC in human-readable form rather than by a numerical ID, and that allow different ACs to be grouped in reports. These identifiers can be provided:
  • manually, in the advertising campaign editing form;
  • automatically: if, during the first hit of the session, the page parameters contain referer1 and referer2 (or their synonyms) and the option Automatically create Ad campaigns in the Statistics module settings is enabled, a new AC is created automatically with identifiers taken from the parameters referer1 and referer2.
Auxiliary AC event parameter (referer3)
You can define this parameter (referer3) in links to your site together with the advertising campaign identifiers. Example:

http://www.bitrixsoft.com/?referer1=bsm&referer2=doc&referer3=xxx
If a visitor clicks this link, they will be registered under the advertising campaign with identifier "bsm/doc" and the auxiliary parameter "xxx", which can be viewed in the Sessions report. This parameter helps personalize visits occurring under advertising campaigns.
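The following is a minimal sketch of how the three parameters could be read from such a link. The function name extractCampaignParams and the shape of the returned array are assumptions made for illustration; they are not part of the Statistics module API.

```php
<?php
// Sketch only: read referer1, referer2 and referer3 from a landing URL.
// extractCampaignParams() is a hypothetical helper, not a module function.
function extractCampaignParams(string $url): array
{
    // The query part, e.g. "referer1=bsm&referer2=doc&referer3=xxx"; null if absent.
    $query = parse_url($url, PHP_URL_QUERY);
    $params = [];
    if (is_string($query)) {
        parse_str($query, $params); // decode the query into an associative array
    }
    return [
        'referer1' => $params['referer1'] ?? null,
        'referer2' => $params['referer2'] ?? null,
        'referer3' => $params['referer3'] ?? null,
    ];
}

$p = extractCampaignParams('http://www.bitrixsoft.com/?referer1=bsm&referer2=doc&referer3=xxx');
echo $p['referer1'] . '/' . $p['referer2'] . ' (' . $p['referer3'] . ")\n"; // prints "bsm/doc (xxx)"
```

A link without these parameters yields null for each key, which a caller could treat as "no advertising campaign".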
Direct hit under advertising campaign
A direct hit under an advertising campaign is a session opened when a visitor is identified as the one who came under an advertising campaign. That is, if you click the link http://www.bitrix.ru/?referer1=bsm&referer2=doc and enter the site, you will have your session opened, and you will be identified as an advertising campaign visitor: r1: bsm; r2: doc. You are considered an advertising campaign direct-hit visitor as long as the session is open.
Return under advertising campaign
A return under an advertising campaign is a session opened for a visitor after they have made a direct-hit visit under this advertising campaign. In other words, if, after a direct visit under a specific advertising campaign, you have not been identified under any other advertising campaign, all your subsequent visits are registered as returns under the initial advertising campaign.
Advertising campaign traffic
This term is an aggregate of the following statistical parameters:
  1. Sessions.
  2. Hits.
  3. Hosts.
  4. Visitors.
  5. New visitors.
  6. Visitors who added the site to the Favorites.
  7. All events initiated by visitors of this advertising campaign.
Rate of advertising campaign visitors
Average hits per session. This figure is calculated individually for both direct-hit and return sessions.
Advertising campaign expenses
Amount of money spent on an advertising campaign. This value can be set in the advertising campaign settings.
Advertising campaign income
Amount of money which is the result of hits made by visitors under the advertising campaign.
Advertising campaign profit
Difference between income and expenses of the advertising campaign.
Advertising campaign profitability
Ratio of profit to expenses.
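As a worked illustration of the last two definitions, the sketch below computes profit and profitability from income and expenses. The helper name campaignFinance and the choice to return null when expenses are zero are assumptions, not documented module behaviour.

```php
<?php
// Sketch only: campaignFinance() is a hypothetical helper.
function campaignFinance(float $income, float $expenses): array
{
    $profit = $income - $expenses;
    // The ratio is undefined when nothing was spent, so null is returned (an assumption).
    $profitability = $expenses > 0 ? $profit / $expenses : null;
    return ['profit' => $profit, 'profitability' => $profitability];
}

$r = campaignFinance(1500.0, 1000.0);
printf("profit=%.2f profitability=%.0f%%\n", $r['profit'], $r['profitability'] * 100);
// prints "profit=500.00 profitability=50%"
```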
Event
An event is any action occurring within the site or outside of it. The Statistics module collects information only on events whose registration has been programmed in advance by the site developer (for how to register events, see the methods of the class CStatEvent). Examples of events are: file download, exit to an external payment system, order payment, order cancellation, banner click, exit to any other site, navigating to a page of your site.
Event type
Each registered event has its own type. The event type is an aggregate of the following notions:
  • event type identifiers (event1, event2);
  • name;
  • description;
  • sort index;
  • storage time of events of this type;
  • storage time of the daily dynamics of events of this type;
  • additional settings.
Event type identifiers (event1, event2)
The event type identifiers are two arbitrary string values that identify the event type in human-readable form rather than by a numerical ID, and that allow different event types to be grouped in reports. These identifiers can be provided:
  • manually, in the event type editing form;
  • automatically, when creating events using the methods CStatEvent::AddByEvents and CStatEvent::AddCurrent. If no event type with the identifiers passed to these methods exists, it is created automatically.
Auxiliary event parameter (event3)
This parameter can be provided when creating an event using the appropriate methods. Values of the auxiliary event parameter can be viewed in the Events report, which shows the event details. This parameter helps personalize each single event.
Special event parameter
Used to identify a visitor when the latter exits to another site (e.g. to an external payment system). Later, the special parameter can be used to register events occurring outside the site (e.g. payment for merchandise). For more details, see CStatEvent::Add and Creating Events. The value of this parameter is a string containing:
  • the portal short identifier (from the Statistics module settings);
  • the session ID;
  • the visitor ID;
  • the two-character country identifier;
  • the advertising campaign ID;
  • the flag indicating the direct hit or return under an advertising campaign;
  • the two-character site identifier.
Depending on the value of the parameter Encode additional parameter #EVENT_GID# for events (the Statistics module settings), the special parameter may have either human-readable or encoded form.
CSV file handler
Special PHP script serving to convert reports obtained from a payment system in any custom CSV format, to the standard CSV format accepted by the Statistics module for loading events. The standard CSV format allowed by the Statistics module contains:
Visitor
A visitor is the unique numerical identifier issued to a browser and stored in a cookie (a small file kept locally by the browser to store information supplied by web servers). If a browser (or any other software) does not support cookies (persistent and/or session), the system uses a special mechanism to identify visitors. It is based on an MD5 hash calculated over environment variables, including the IP address, browser settings, provider settings and other parameters that uniquely identify the visitor at the moment of entering the site.

In other words, a visitor is a browser (or any other software) that has entered the site.
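The cookie-less fallback can be pictured with the sketch below. The function name fallbackVisitorKey, the three fields used and the "|" separator are all assumptions chosen for illustration; the exact set of environment variables the module hashes is not listed here.

```php
<?php
// Sketch only: derive a fallback visitor key from request environment values.
// The selected fields and the separator are illustrative assumptions.
function fallbackVisitorKey(array $server): string
{
    $parts = [
        $server['REMOTE_ADDR'] ?? '',
        $server['HTTP_USER_AGENT'] ?? '',
        $server['HTTP_ACCEPT_LANGUAGE'] ?? '',
    ];
    return md5(implode('|', $parts));
}

$key = fallbackVisitorKey([
    'REMOTE_ADDR' => '206.191.49.66',
    'HTTP_USER_AGENT' => 'Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0)',
    'HTTP_ACCEPT_LANGUAGE' => 'en',
]);
echo strlen($key), "\n"; // prints 32: an MD5 hex digest is always 32 characters long
```

The same environment always produces the same key, so repeated hits from one browser map to one visitor, while a change in any of the hashed values produces a different key.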

The Statistics module uses different levels of visitor uniqueness. The Statistics menu offers unique visitors per day, as well as unique visitors for the whole period of keeping the statistics (from the moment of the module installation).

Advertising campaigns (as well as hosts) preserve visitor uniqueness within the advertising campaign. In both cases, new visitors are those who entered the site for the first time.
New visitor
A visitor from whom the unique identifier could not be acquired. Such visitors are assumed to be entering the site for the first time.
User
Synonyms for this term are "user account" and "user profile". A user is an aggregate of the following:
  • login;
  • password;
  • e-mail;
  • first name;
  • last name;
  • personal information;
  • work information;
  • administrator's notes.
At different times, a site visitor can be authorised as a different user.
Online visitors
Site visitors that generated at least one hit within the last N seconds. The N number can be specified in the Statistics module settings (the parameter On-line visitor interval).
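A minimal sketch of this rule, assuming the time of each visitor's latest hit is available as a Unix timestamp. The function name onlineVisitors and the input array layout are hypothetical.

```php
<?php
// Sketch only: $lastHits maps a visitor ID to the Unix timestamp of that visitor's latest hit.
function onlineVisitors(array $lastHits, int $now, int $intervalSeconds): array
{
    $online = [];
    foreach ($lastHits as $visitorId => $timestamp) {
        // A visitor is online if the latest hit falls within the last N seconds.
        if ($now - $timestamp <= $intervalSeconds) {
            $online[] = $visitorId;
        }
    }
    return $online;
}

// With N = 180, visitors 11 and 13 are online; visitor 12 was last seen 300 seconds ago.
print_r(onlineVisitors([11 => 990, 12 => 700, 13 => 1000], 1000, 180));
```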
UserAgent
A UserAgent is the contents of the HTTP header field User-Agent. The field is provided by the software communicating with a web server, such as web browsers, download managers, offline browsers and search robots. The software uses this field to identify itself during the communication. The contents of this field can often be changed easily; many applications provide an interface for doing so (for example, the download manager FlashGet).

Examples of UserAgent strings:
  • Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0; NetCaptor 7.0.2 Final; .NET CLR 1.0.3705) - NetCaptor.
  • ReGet SE version 1.3.1* - ReGet.
Session
A session in the Bitrix Site Manager is interpreted as the PHP session. A session object is created when the visitor enters the site, and is terminated after the browser window is closed. Also, a new session is opened when the user logs in and is closed when the user logs out. The terms "session" and "visit" are synonymous: a session is a visit to the site.
Host
A host is the IP address of a site visitor. The Bitrix Site Manager supports various levels of uniqueness for hosts in different reports.

The Statistics section displays the number of unique hosts (IP addresses) per day and the number of unique hosts for all the time of gathering the statistics (that is, since the module has been installed).

Advertising campaigns consider the IP address unique within a single advertising campaign only. All visitors under an advertising campaign and their IP addresses are stored in a special table. A visitor or the corresponding IP address is unique within this table only. Most existing statistical systems offer a maximum deviation between the number of hosts and visitors of 10%. The Statistics module of the Bitrix Site Manager offers an unprecedented deviation level of 3%.
Hit
A hit in the Bitrix Site Manager is interpreted as one page view. The following actions also generate a hit:
  • clicking a link and the subsequent page loading;
  • page reload by pressing F5 or Ctrl+F5;
  • opening a non-existing page (producing error 404).
Search phrase
If a visitor typed a search phrase on a search engine, found your site in the search results and clicked on it, the visitor would navigate to your site. In this case, the visitor entered your site using the search phrase. All page views via search phrases are tracked in the Statistics module and displayed in the Search keywords report. In addition to search phrases, the system stores the referring site and page, or the address of the search engine page containing search results.
Search engine
A search engine is any search engine on the Internet. For example: Google, Altavista, Yahoo, etc. Any search engine has the following features:
  • own UserAgent string identifying the search engine robot;
  • group of domains (addresses of sites providing search facilities);
  • variable containing the search phrase.
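The last two features can be combined to recover a search phrase from a referring URL, as in the sketch below. The function name searchPhrase and the small engine table are hypothetical; "q" is the query variable Google uses.

```php
<?php
// Sketch only: $engines maps a search engine domain to the name of the
// variable that carries the search phrase. The table is a made-up sample.
function searchPhrase(string $referrer, array $engines): ?string
{
    $host = parse_url($referrer, PHP_URL_HOST);
    $query = parse_url($referrer, PHP_URL_QUERY);
    if (!is_string($host) || !is_string($query) || !isset($engines[$host])) {
        return null; // not a known search engine domain, or no query string
    }
    parse_str($query, $params); // also URL-decodes the values
    $value = $params[$engines[$host]] ?? null;
    return is_string($value) ? $value : null;
}

$engines = ['www.google.com' => 'q', 'www.google.de' => 'q'];
echo searchPhrase('http://www.google.com/search?q=site+statistics&hl=en', $engines), "\n";
// prints "site statistics"
```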
Search engine domain
One of possibly many search engine domains. For example, the Google search engine offers many domains, to name a few:
  • www.google.com;
  • www.google.ru;
  • www.google.de, etc.
Search engine UserAgent
The UserAgent of special software called a search bot. Search bots index your site by downloading pages and registering their contents in the search engine database. These database records are later used for searches. Usually, each search engine has its own search bot with an individual UserAgent.
Search engine hit
A search engine hit is the indexing of a single page. All search engines have a set of tools for indexing resources on the Internet. The process of indexing is performed by so-called search robots. Each robot is uniquely identified by its UserAgent. When the robot attempts to index a page on your site, it sends a request to the web server, then loads the HTML code of the page and analyses its contents. The Statistics module registers such page downloads as search engine hits. The search robots are distinguished by their UserAgent strings.
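Distinguishing robots by UserAgent can be sketched as a substring check. The function name detectSearchBot and the marker table are hypothetical; "Googlebot" is the token that Google's crawler includes in its UserAgent.

```php
<?php
// Sketch only: $botMarkers maps a search engine name to a substring
// expected in its robot's UserAgent. The table is an illustrative assumption.
function detectSearchBot(string $userAgent, array $botMarkers): ?string
{
    foreach ($botMarkers as $engine => $marker) {
        if (stripos($userAgent, $marker) !== false) { // case-insensitive match
            return $engine;
        }
    }
    return null; // no marker found: treat the hit as an ordinary visitor hit
}

$markers = ['Google' => 'Googlebot'];
echo detectSearchBot('Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)', $markers), "\n";
// prints "Google"
```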
Stop list
A stop list is a set of records used to filter visitors. The purpose of a stop list is to restrict visitors' access to site resources or to redirect them to other pages or sites.
Stop list record
A set of parameters used to filter the visitor thread and apply the required action: redirect, show a message, etc.
Subnet mask (in stop lists)
In the TCP/IP terms, a subnet mask (or net mask) is a bit mask (set of flags) that specifies which bits of the IP address specify a particular IP network or a host within a subnetwork. For example, an IP address of 12.34.56.78 with a subnet mask of 255.255.0.0 specifies net 12.34.0.0.

To extract the network ID (network address) from an arbitrary IP address using an arbitrary subnet mask, IP uses a mathematical operation called a bitwise logical AND:

IP address:   00001100 00100010 00111000 01001110 (012.034.056.078)
Subnet mask:  11111111 11111111 11100000 00000000 (255.255.224.000)
-------------------------------------------------------------------
Network ID:   00001100 00100010 00100000 00000000 (012.034.032.000)

A stop list record contains two fields: network IP and subnet mask. A visitor matches a stop list record if applying the record's subnet mask to the visitor's IP address yields the record's network IP.

Example 1. Deny access to all visitors from IP of 206.191.49.66.

  1. Specify the following:
    • network IP = 206.191.49.66
    • subnet mask = 255.255.255.255
  2. When a visitor with IP of 206.191.49.66 enters the site, the following operation occurs:
    Visitor IP address: 11001110 10111111 00110001 01000010 (206.191.049.066)
    Subnet mask:        11111111 11111111 11111111 11111111 (255.255.255.255)
    -------------------------------------------------------------------------
    Result:             11001110 10111111 00110001 01000010 (206.191.049.066)
    
    If the result matches the network IP (of a stop list record), the visitor is found in the stop list. Thus, all visitors from 206.191.49.66 will match the stop list record.

Example 2. Deny access to all visitors from the subnet 206.191.49.xxx (IP addresses in the range from 206.191.49.0 to 206.191.49.255).

  1. Specify the following:
    • network IP = 206.191.49.0
    • subnet mask = 255.255.255.0
  2. When a visitor with IP of 206.191.49.76 enters the site, the following operation occurs:
    Visitor IP address: 11001110 10111111 00110001 01001100 (206.191.049.076)
    Subnet mask:        11111111 11111111 11111111 00000000 (255.255.255.000)
    -------------------------------------------------------------------------
    Result:             11001110 10111111 00110001 00000000 (206.191.049.000)
    
    The result matches the network IP (of a stop list record), and consequently, the visitor is found in the stop list. Thus, all visitors with IP addresses from 206.191.49.0 to 206.191.49.255 will not be allowed to access the server.
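The matching rule used in both examples can be sketched in a few lines. The function name matchesStopRecord is hypothetical, and the sketch handles IPv4 addresses only.

```php
<?php
// Sketch only: true when the visitor IP, masked with the record's subnet mask,
// equals the record's network IP. matchesStopRecord() is a hypothetical helper.
function matchesStopRecord(string $visitorIp, string $networkIp, string $subnetMask): bool
{
    $ip = ip2long($visitorIp);
    $net = ip2long($networkIp);
    $mask = ip2long($subnetMask);
    if ($ip === false || $net === false || $mask === false) {
        return false; // not a valid dotted-decimal IPv4 address
    }
    return ($ip & $mask) === $net; // bitwise AND, then compare with the network IP
}

var_dump(matchesStopRecord('206.191.49.66', '206.191.49.66', '255.255.255.255')); // bool(true)
var_dump(matchesStopRecord('206.191.49.76', '206.191.49.0', '255.255.255.0'));    // bool(true)
var_dump(matchesStopRecord('206.191.50.76', '206.191.49.0', '255.255.255.0'));    // bool(false)
```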
Logical AND
The logical AND operation is applied bit by bit; a result bit is 1 only when both operand bits are 1:

Operand 1:  0 1 1 0
Operand 2:  1 0 1 0
-------------------
Result:     0 0 1 0
Referring site (page)
A referring page or referring site is the URL of the previous webpage from which a link was followed; that is, any web resource having links to your site. If a visitor enters your site using the link on such a page, the referring site and page address are registered in the Statistics module as referring.
Full path
A full path is one or more pages of the site that a visitor opened consecutively by clicking links on these pages. If a visitor opens more than one page from a source page (for example, in new browser windows), the path forks, forming several different paths.
Path section
The Statistics module defines a path section as a set of pages consisting of the first page of the path and an arbitrary number of the subsequent pages.

For example, suppose you open page1, then click a link to page2, and from there a link to page3. This makes a path consisting of three sections:

  • page1
  • page1 -> page2
  • page1 -> page2 -> page3
If, from page3, you open page4 in a new window ("Shift" + left click in Internet Explorer), and then open page5 from page3 in the same manner, you form 2 paths, each consisting of 4 sections, the first 3 of which are identical.
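Since every section starts at the first page of the path, the sections are simply the prefixes of the page list. The sketch below lists them; the function name pathSections is hypothetical.

```php
<?php
// Sketch only: list all sections of a path, i.e. every prefix of the page list.
function pathSections(array $pages): array
{
    $sections = [];
    for ($i = 1; $i <= count($pages); $i++) {
        $sections[] = implode(' -> ', array_slice($pages, 0, $i));
    }
    return $sections;
}

foreach (pathSections(['page1', 'page2', 'page3']) as $section) {
    echo $section, "\n";
}
// prints:
// page1
// page1 -> page2
// page1 -> page2 -> page3
```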
IP address

Every machine on a network (a local network or the Internet) has a unique IP address, written as four numbers separated by periods, with up to three digits in each number. In other words, the IP address is an identifier of a single network connection.

An IP address consists of 32 bits (IPv4) or 128 bits (IPv6). Rather than working with 32 bits at a time, it is common practice to split the 32 bits of an IPv4 address into four 8-bit fields called octets. Each octet is converted to a decimal number in the range 0-255 and separated by a dot. This format is called dotted decimal notation. For example: decimal 128.10.2.30 for 10000000 00001010 00000010 00011110.
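The conversion between the two notations can be checked with a short sketch. The function name ipToBinaryOctets is hypothetical, and the input is assumed to be a valid dotted-decimal IPv4 address.

```php
<?php
// Sketch only: show each octet of a dotted-decimal IPv4 address as 8 binary digits.
function ipToBinaryOctets(string $ip): string
{
    $octets = [];
    foreach (explode('.', $ip) as $octet) {
        // decbin() drops leading zeros, so each octet is padded back to 8 bits.
        $octets[] = str_pad(decbin((int) $octet), 8, '0', STR_PAD_LEFT);
    }
    return implode(' ', $octets);
}

echo ipToBinaryOctets('128.10.2.30'), "\n"; // prints "10000000 00001010 00000010 00011110"
```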

IP addresses form the basic type of addressing used to identify each sender or receiver of information that is sent in packets across the Internet. On the server side, IP addresses are assigned by system administrators; on the client side, they are assigned automatically (or by users).

Each country has definite ranges of IP addresses assigned to it. This makes it possible to determine the visitor's country from the IP address.

IP address octet
Part of an IP address, in the range 0-255.
Browser language

Many browsers have an option to set the languages in which a user would like to see pages. For example, Internet Explorer offers this option in the menu:

"Tools" -> "Internet Options..." -> "General" tab -> click "Languages"

Error 404
The 404 error message is an HTTP standard response code indicating that the client was able to communicate with the server, but the server either could not find the file that was requested, or was unwilling to fulfill the request for it and did not wish to reveal the reason why.

You can customize the way this error is handled. For example, with Apache, create a file .htaccess at the site root (or in the directory where the requested page is located) with the following directive:

ErrorDocument 404 /404_handler.php
In this case, when a 404 error occurs, the server executes the script /404_handler.php, which lets you handle the situation.

To register 404 errors with the Statistics module, define the constant ERROR_404 with the value "Y" before the page epilogue:

define("ERROR_404", "Y");
HTTP request methods
There are several HTTP request methods: OPTIONS, GET, HEAD, POST, PUT, DELETE, TRACE, CONNECT. Most commonly used are: GET, POST, HEAD.
  • GET

    Information transferred using the GET method is appended to the end of the URI being requested. This is the most commonly used method. Example:

    http://www.mysite.com/script.php?site_id=en&date1=2

    Variables and values are passed as "variable=value" pairs separated by "&". Each value must be URL-encoded.

  • POST

    The data to transfer is included in the body of the request.

  • HEAD

    Asks for the response identical to the one that would correspond to a GET request, but without the response body. This is useful for retrieving meta-information written in response headers, without having to transport the entire content.
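For the GET method, the "variable=value" pairs and their URL encoding can be produced with the standard PHP function http_build_query(). In the sketch below, the parameter note and its value are hypothetical and serve only to show how a space and an ampersand are encoded.

```php
<?php
// Sketch only: build a GET URL from an array of parameters.
// "note" is a made-up parameter added to demonstrate encoding.
$params = ['site_id' => 'en', 'date1' => '2', 'note' => 'a b&c'];

// http_build_query() URL-encodes each value and joins the pairs with "&".
$url = 'http://www.mysite.com/script.php?' . http_build_query($params, '', '&');

echo $url, "\n";
// prints "http://www.mysite.com/script.php?site_id=en&date1=2&note=a+b%26c"
```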

Cookie
A cookie is a small piece of data which is sent from a web server to a web browser and stored locally on the user's machine. Commonly used to verify the user's identity and to store the time of the last visit, preferences, the shopping cart ID, etc.
Entrance point
The entrance point is the first hit (page viewed) in the session.
Exit point
The exit point is the last hit (page viewed) in the session.
Activity limit

In the Statistics module settings, you can specify the activity limit. The limit is exceeded when a visitor makes more than the maximum number of hits within a certain period of time. In that case, the HTTP status "503 Service Unavailable" is sent back to the visitor. This effect can be achieved by including the file /bitrix/activity_limit.php.

This function is generally used to lower the site load generated by search robots, offline browsers, etc. If a search bot receives the 503 status, it reduces the site indexing frequency. You can disable activity limit checks for a search engine in its settings. The activity limit can help repel DDoS attacks, but it is not a complete remedy, since server resources are still spent on checking the limit.
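The check itself can be pictured with the sketch below. It is not the contents of /bitrix/activity_limit.php; the function name exceedsActivityLimit and the way hit times are stored are assumptions made for illustration.

```php
<?php
// Sketch only: $hitTimes holds the Unix timestamps of one visitor's recent hits.
function exceedsActivityLimit(array $hitTimes, int $now, int $periodSeconds, int $maxHits): bool
{
    $recent = 0;
    foreach ($hitTimes as $t) {
        if ($now - $t < $periodSeconds) {
            $recent++; // this hit falls inside the checked period
        }
    }
    return $recent > $maxHits;
}

$hits = [100, 101, 102, 103, 104];
if (exceedsActivityLimit($hits, 105, 10, 3)) {
    // In a real request handler, this is where the 503 status would be sent,
    // for example with http_response_code(503).
    echo "503 Service Unavailable\n";
}
```

Here five hits fall within the 10-second period while only three are allowed, so the limit is exceeded.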

Favorites
Favorites is a folder in Internet Explorer and other browsers used to store shortcuts to web sites you wish to return to. When an entry is created in the Favorites, the client's browser sends a request for the file /favicon.ico, which contains an image to be associated with the URL. This makes it possible to register such events in the Statistics module.

There is a weak point in this technology. Many browsers (Firefox, NetCaptor, MyIE) request the favorite icon regardless of whether the user wants to add the site to the Favorites. This does not apply to Internet Explorer. Since most users run Internet Explorer, the resulting error is insignificant.