How to track internet traffic on a local network. Principles of organizing IP traffic accounting

There are quite a few programs for traffic accounting on a local network, both paid and free, and they differ widely in functionality. One of the most popular open source programs is SAMS. It runs on Linux in conjunction with Squid.

SAMS requires PHP5, so we will use Ubuntu Server 14.04. We will need the Squid, Apache2 and PHP5 packages along with their modules.

Internet traffic accounting on a Linux local network

Let's try to figure out how it works.

Squid acts as a proxy for Internet access, accepting requests on port 3128, and writes a detailed access.log as it does so. All management is done through the squid.conf file. Squid has a wide range of access control capabilities: access control by address, and bandwidth control for specific addresses, groups of addresses and networks.

SAMS works by analyzing the logs of the Squid proxy server. The traffic accounting system in the local network monitors the statistics of the proxy server and, in accordance with the specified policies, makes a decision to block, unblock or limit the speed for the Squid client.
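To make the idea of log-based accounting concrete, here is a minimal Python sketch that sums the bytes delivered to each client from access.log lines. It assumes Squid's native log layout (client address in the third field, reply size in the fifth); the function name and the sample lines are made up for illustration, and this is not how SAMS itself is implemented.

```python
from collections import defaultdict

def bytes_per_client(log_lines):
    """Sum the reply size (5th field) per client address (3rd field)."""
    totals = defaultdict(int)
    for line in log_lines:
        fields = line.split()
        if len(fields) < 5:
            continue  # skip blank or truncated lines
        client, size = fields[2], fields[4]
        if size.isdigit():
            totals[client] += int(size)
    return dict(totals)

# Made-up sample lines in Squid's native access.log layout.
sample = [
    "1400000000.123    250 192.168.0.15 TCP_MISS/200 5120 GET http://example.com/ - DIRECT/93.184.216.34 text/html",
    "1400000001.456     90 192.168.0.15 TCP_HIT/200 1024 GET http://example.com/logo.png - NONE/- image/png",
    "1400000002.789    310 192.168.0.20 TCP_MISS/200 2048 GET http://example.org/ - DIRECT/93.184.216.34 text/html",
]
print(bytes_per_client(sample))
```

A real accounting system would also remember how far it has read in the log and handle log rotation; the sketch leaves both out.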

SAMS Installation

Installing packages.

apt-get install apache2 php5 php5-mysql mysql-server php5-gd squid3

Download and install SAMS

wget https://github.com/inhab-magnus/sams2-deb/archive/master.zip

unzip master.zip

cd sams2-deb-master/

dpkg -i sams2_2.0.0-1.1_amd64.deb

Installing the web interface

dpkg -i apache2/sams2-web_2.0.0-1.1_all.deb

We make changes to the /etc/sams2.conf file.

DB_PASSWORD=/MySql password/

Launch SAMS

service sams2 start

Setting up Squid

Making changes to the /etc/squid3/squid.conf file

http_port 192.168.0.110:3128
cache_dir ufs /var/spool/squid3 2048 16 256

We enable logging and rotation of logs with storage for 31 days.

access_log daemon:/var/log/squid3/access.log squid

logfile_rotate 31

Stop Squid, create the cache directories and start it again.

service squid3 stop

squid3 -z

service squid3 start

To keep the test clean, configure one browser to use the proxy 192.168.0.110 on port 3128. The connection attempt is refused: Squid does not yet have a rule that allows this client to use the proxy.

SAMS initial setup

In another browser, open the following address (192.168.0.110 is the server address).

http://192.168.0.110/sams2

The page will report that it cannot connect to the database and will offer to run the installation.

Specify the database server (127.0.0.1) and the MySQL login and password.

The initial configuration of the traffic accounting system is completed. It remains only to configure the program.

Monitoring traffic in the local network

It is worth saying right away about user authorization.

In the Squid branch, open the proxy server and click the "Proxy server settings" button below.

The most important thing here is to specify your IP address in the addresses of folders and files, where necessary, otherwise the proxy server will not start.

All changes made in the SAMS settings end up in squid.conf. The sams2daemon process runs in the background and watches for settings changes that need to be written to the configuration file (the polling interval can also be set there).

Fill in the "User" and "IP address" fields. As the user name, use the same IP address (the IP of the client computer, not the server!). In the "Allowed traffic" field, enter "0", which means no limit. Leave all other fields empty.

A new acl for this IP address will be added, together with permission to work through Squid. If the config was not changed automatically, go to the proxy branch and click the "Reconfigure Squid" button to apply the changes to the config manually.

We try to open any URL in the browser. We check the access.log and see the requests processed by the proxy. To check the operation of SAMS, open the "Users" page, click the "Recalculate user traffic" button below.

The statistics buttons below give detailed information about the pages each user has visited.

This article will look at software solutions that will help control your traffic. Thanks to them, you can see a summary of the Internet connection consumption of a particular process and limit its priority. It is not necessary to view the recorded reports on a PC with special software installed in the OS - this can be done remotely. It will not be a problem to find out the cost of consumed resources and much more.

NetWorx

Software from SoftPerfect Research that lets you monitor the traffic you consume. Additional settings let you see the megabytes consumed on a specific day or week and during peak and off-peak hours. You can also see the incoming and outgoing speed and the amount of data received and sent.

The tool will be especially useful in cases where limited 3G or LTE is used, and, accordingly, restrictions are required. If you have more than one account, statistics about each individual user will be displayed.

DU Meter

An application for tracking Internet usage. The work area shows both incoming and outgoing traffic. By linking an account on the dumeter.net service offered by the developer, you can collect Internet usage statistics from all your PCs. Flexible settings help you filter the traffic and send reports to your email.

The settings let you specify limits on the use of the Internet connection. In addition, you can enter the cost of the service package from your provider. A user manual explains how to work with the program's functionality.

Network Traffic Monitor

A utility that displays network usage reports with a simple set of tools and needs no installation. The main window displays statistics and a summary of the connection that has Internet access. The application can block or limit traffic, with the user specifying their own values. In the settings, you can reset the recorded history. The statistics can be written to a log file. The functionality also covers recording download and upload speeds.

TrafficMonitor

The application is an excellent solution for metering traffic from the network. Many indicators show the amount of data received and sent, the speed, and the maximum and average values. The settings let you work out the cost of the data used so far.

The generated reports list the actions related to the connection. The graph is displayed in a separate window and updated in real time, and it stays on top of all the programs you work in. The solution is free and has a Russian-language interface.

NetLimiter

The program has a modern design and powerful functionality. Its distinguishing feature is that its reports summarize the traffic consumption of each process running on the PC. The statistics can be sorted by different periods, so it is easy to find the time span you need.

If NetLimiter is installed on another computer, you can connect to it and control its firewall and other features. To automate processes within the application, it uses rules that the user defines. In the scheduler, you can create your own limits on the use of the provider's services, as well as block access to global and local networks.

DUTraffic

What distinguishes this software is that it displays extended statistics: information about the connection through which the user went online, the sessions and their duration, the duration of use, and much more. All reports are accompanied by a chart showing traffic consumption over time. Almost any design element can be customized in the settings.

The graph displayed in its own area is updated every second. Unfortunately, the utility is no longer supported by the developer, but it has a Russian interface and is distributed free of charge.

BWMeter

The program monitors download and upload volumes and the speed of the current connection. Filters can raise an alert when processes in the OS consume network resources, and different filters serve a wide variety of tasks. The user can fully customize the displayed graphs.

Among other things, the interface shows the duration of traffic consumption, the download and upload speeds, and the minimum and maximum values. The utility can be configured to alert you on events such as a given number of megabytes downloaded or a given connection time. By entering a website address in the corresponding field, you can check its ping; the result is written to the log file.

BitMeter II

A solution that summarizes your use of the provider's services. Data is available in both tabular and graphical form. In the settings, alerts can be configured for events related to connection speed and the amount of traffic consumed. For convenience, BitMeter II can calculate how long it will take to download an amount of data that you enter in megabytes.

The program can show how much of the volume provided by the provider is still available, and when the limit is reached, a message is displayed in the taskbar. In addition, downloads can be limited in the settings tab, and statistics can be monitored remotely in a browser.

The presented software products will be indispensable in controlling the consumption of Internet resources. The functionality of the applications will help to compile detailed reports, and the reports sent by e-mail are available for viewing at any convenient time.

Hello friends! I meant to write about how to monitor traffic right after an earlier article, but somehow forgot. Now I have remembered, and I will tell you how to track how much traffic you use with the help of the free program NetWorx.

With an unlimited Internet connection you don’t really need to monitor traffic, except out of curiosity. Nowadays urban networks are usually unlimited, which cannot be said of 3G Internet, where tariffs are usually sky-high.

All this summer I have been using CDMA Internet from Intertelecom, and I know all these nuances of traffic and tariffs firsthand. I have already written about how to set up and improve the Internet from Intertelecom. Their “unlimited” tariff costs 150 hryvnia per month. I put the word unlimited in quotation marks because there is a speed limit. It applies only during the day, but that is little comfort: the speed is so terrible that you might as well use GPRS.

The most reasonable tariff is 5 hryvnia per day of connection: if you do not connect on a given day, you do not pay. But it is not unlimited; it is 1000 megabytes a day, until midnight. I am on this tariff now, and at least the speed is decent, with a real average speed of 200 Kbps. But 1000 MB per day is not much at such a speed, so in this case you simply have to control the traffic. Moreover, once the 1000 MB are used up, each megabyte costs 10 kopecks, which is not cheap.

As soon as I got this connection, I started looking for a good program that would keep track of my Internet traffic and let me set a warning for when the limit was nearly used up. I did not find it immediately; after trying a couple of things, I came across NetWorx, which is what the rest of this article is about.

NetWorx will monitor the traffic

Now I will tell you where to get the program and how to set it up.

1. So that you don't have to search for the program, I uploaded it to my hosting.

2. Run the downloaded file and install the program. I will not describe the installation process here.

3. If the program does not start by itself after installation, launch it from the shortcut on the desktop or in the Start menu.

4. That's it: the program is already counting your Internet traffic, sitting in the tray and quietly doing its job.

The program's working window displays Internet traffic for the current day and for the entire period since the program was installed. In fact, the program does not need any configuration. I will just show you how to set a quota in NetWorx, that is, a traffic limit, and how to make the tray icon display the activity of incoming and outgoing Internet traffic.

5. Now let's make the tray icon display Internet traffic activity.

Right-click the program icon in the tray and select "Settings".

On the “Graph” tab, configure the display options, then click “Apply” and “OK”. The NetWorx tray icon will now display Internet connection activity.

6. The last step in setting up the program is setting a quota. For example, Intertelecom gives me only 1000 MB per day, so in order not to exceed this allowance, I set up the program to warn me when I have used 80% of the traffic.

Right-click on the program's tray icon and select "Quota".

The window shows how much of the limit has been used (53% for me today); below there is a field where you can specify at what percentage to warn that the traffic is running out. Click the "Settings" button to set up the quota.

Everything is very simple here. First set the type of quota you have (mine is daily), then the traffic to count (I selected all traffic, that is, incoming and outgoing). Set "Clock" and "Units" (megabytes in my case). And of course, do not forget to specify the size of the quota; mine is 1000 megabytes. Click "OK" and that's it, the quota is set.
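The arithmetic behind such a quota is easy to sketch. The Python snippet below uses the figures from this article (1000 MB per day, a warning at 80%, 10 kopecks per megabyte over the limit); the function names are made up for illustration, and nothing here describes how NetWorx works internally.

```python
def quota_status(used_mb, quota_mb=1000, warn_percent=80):
    """Return (percent of quota used, whether the warning threshold is reached)."""
    percent = 100.0 * used_mb / quota_mb
    return percent, percent >= warn_percent

def overage_cost_kopecks(used_mb, quota_mb=1000, price_per_mb=10):
    """Cost of the megabytes used beyond the quota, in kopecks."""
    extra_mb = max(0, used_mb - quota_mb)
    return extra_mb * price_per_mb

print(quota_status(530))           # 53% used, no warning yet
print(quota_status(800))           # threshold reached
print(overage_cost_kopecks(1200))  # 200 MB over the limit
```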

That's it, the program is fully configured and ready to count your traffic. It will start along with the computer, and you only need to look in occasionally to see how much traffic you have already used. Good luck!

Internet access structure

In general, the network access structure looks like this:

External resources - the Internet, with all sites, servers, addresses and other things that do not belong to a network that you control.
An access device is a router (hardware or PC-based), switch, VPN server or hub.
Internal resources - a set of computers, subnets, subscribers, whose work in the network must be taken into account or controlled.
Management or accounting server - a device that runs specialized software. It can be functionally combined with a software router.

In this structure, network traffic flows from external resources to internal ones and back through the access device, which sends traffic information to the management server. The management server processes this information, stores it in a database, displays it and issues blocking commands. However, not all combinations of access devices (methods) and collection and management methods are compatible. The various options are discussed below.

Network traffic

To begin with, it is necessary to define what is meant by "network traffic" and what useful statistical information can be extracted from the user data stream.
IP version 4 remains the dominant internetworking protocol so far. IP corresponds to Layer 3 (L3) of the OSI model. Data exchanged between sender and recipient is packed into packets, each with a header and a "payload". The header specifies where the packet comes from and where it is going (the source and destination IP addresses), the packet size and the payload type. The bulk of network traffic consists of packets with UDP and TCP payloads, which are Layer 4 (L4) protocols. Beyond the addresses in the IP header, the headers of these two protocols carry port numbers, which identify the type of service (application) transmitting the data.

To transmit an IP packet over wires (or radio), network devices are forced to “wrap” (encapsulate) it into a Layer 2 (L2) protocol packet. The most common protocol of this type is Ethernet. The actual transfer "to the wire" is at the 1st level. Usually, the access device (router) does not analyze packet headers at a level higher than 4th (an exception is intelligent firewalls).
Information from the address, port, protocol and length fields of the L3 and L4 headers makes up the “source material” used in traffic accounting and management. The actual volume of information transmitted is given in the Length field of the IP header (which includes the length of the header itself). Incidentally, because of packet fragmentation caused by the MTU mechanism, the total amount of data transmitted is always greater than the payload size.

The IP and TCP/UDP header fields that interest us in this context make up 2...10% of the total packet length. If all this information were processed and stored packet by packet, there would not be enough resources. Fortunately, the vast majority of traffic consists of a set of "dialogues" between external and internal network devices, the so-called "flows". For example, within a single e-mail forwarding operation (SMTP protocol), a TCP session is opened between the client and the server. It is characterized by a constant set of parameters (source IP address, source TCP port, destination IP address, destination TCP port). Instead of processing and storing information packet by packet, it is much more convenient to store the flow parameters (addresses and ports) together with additional information: the number and total length of the packets transmitted in each direction and, optionally, the session duration, router interface indices, the ToS field value and so on. This approach works well for connection-oriented protocols (TCP), where the moment the session ends can be detected explicitly. However, even for connectionless protocols, a flow record can be aggregated and logically closed by, for example, a timeout. Our own billing system logs information about traffic flows in an SQL database in this way.
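To illustrate the aggregation of packets into flows, here is a minimal Python sketch. The packet dictionaries, their field names and the `aggregate_flows` function are assumptions made for the example (they are not the schema of any billing system), and the protocol is added to the key on top of the four parameters listed above. Each direction of a dialogue becomes its own flow record.

```python
from collections import defaultdict

def aggregate_flows(packets):
    """Group packets by (src_ip, src_port, dst_ip, dst_port, proto); sum packets and bytes."""
    flows = defaultdict(lambda: {"packets": 0, "bytes": 0})
    for p in packets:
        key = (p["src_ip"], p["src_port"], p["dst_ip"], p["dst_port"], p["proto"])
        flows[key]["packets"] += 1
        flows[key]["bytes"] += p["length"]  # IP total length of this packet
    return dict(flows)

# Made-up packets of one SMTP session: two from the client, one from the server.
packets = [
    {"src_ip": "192.168.0.15", "src_port": 51000, "dst_ip": "198.51.100.7",
     "dst_port": 25, "proto": "tcp", "length": 60},
    {"src_ip": "192.168.0.15", "src_port": 51000, "dst_ip": "198.51.100.7",
     "dst_port": 25, "proto": "tcp", "length": 1500},
    {"src_ip": "198.51.100.7", "src_port": 25, "dst_ip": "192.168.0.15",
     "dst_port": 51000, "proto": "tcp", "length": 52},
]
flows = aggregate_flows(packets)
for key, counters in flows.items():
    print(key, counters)
```

Three packets collapse into two records here; on real traffic the reduction is far larger, which is the point of storing flows instead of packets.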

A case worth noting is when the access device performs address translation (NAT, masquerading) to give computers on the local network access to the Internet through a single external public IP address. In this case, a special mechanism substitutes the IP addresses and TCP/UDP ports of traffic packets, replacing internal addresses (not routable on the Internet) according to its dynamic translation table. In this configuration, remember that in order to record data on internal network hosts correctly, statistics must be collected in a way and at a point where the translation result does not yet “anonymize” internal addresses.
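A toy model of a translation table shows why the collection point matters. The `Nat` class, the port numbering and the addresses below are assumptions made for the example; real NAT implementations track far more state.

```python
class Nat:
    """Toy source NAT: maps (internal ip, internal port) to a port on one public IP."""

    def __init__(self, public_ip, first_port=40000):
        self.public_ip = public_ip
        self.next_port = first_port
        self.table = {}  # (internal_ip, internal_port) -> public_port

    def translate_out(self, src_ip, src_port):
        key = (src_ip, src_port)
        if key not in self.table:
            self.table[key] = self.next_port
            self.next_port += 1
        return self.public_ip, self.table[key]

nat = Nat("203.0.113.1")
a = nat.translate_out("192.168.0.15", 51000)
b = nat.translate_out("192.168.0.20", 51000)
print(a, b)
```

After translation both hosts appear with the same source address, so statistics taken on the external side can no longer be attributed to 192.168.0.15 or 192.168.0.20 without access to the translation table.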

Methods for collecting information about traffic / statistics

You can capture and process information about passing traffic directly on the access device itself (PC router, VPN server), transferring it from this device to a separate server (NetFlow, SNMP), or “from the wire” (tap, SPAN). Let's analyze all the options in order.

PC router

Consider the simplest case - an access device (router) based on a PC with Linux OS.

Much has been written about how to set up such a server, including address translation and routing. We are interested in the next logical step: how to obtain information about the traffic passing through such a server. There are three common ways:

interception (copying) of packets passing through the server network card using the libpcap library
interception of packets passing through the built-in firewall
use of third-party tools for converting per-packet statistics (obtained by one of the two previous methods) into a stream of aggregated information netflow

libpcap

In the first case, a copy of the packet passing through the interface, after passing through the filter (man pcap-filter), can be requested by a client program on the server written using this library. The packet arrives with a Layer 2 (Ethernet) header. It is possible to limit the length of the captured information (if we are only interested in the information from its header). Examples of such programs are tcpdump and Wireshark. There is a Windows implementation of libpcap. In the case of using address translation on a PC router, such interception can only be performed on its internal interface connected to local users. On the external interface, after translation, IP packets do not contain information about the internal hosts of the network. However, with this method, it is impossible to take into account the traffic generated by the server itself on the Internet (which is important if a web or mail service is running on it).

libpcap requires support from the operating system, which currently comes down to installing a single library. The application (user) program that collects packets must:

open required interface
specify the filter through which to pass received packets, the size of the captured part (snaplen), the size of the buffer,
set the promisc parameter, which puts the network interface into capture mode for all packets passing by in general, and not just those addressed to the MAC address of this interface
set a function (callback) to be called on each received packet.

When a packet passes through the selected interface and matches the filter, this function receives a buffer containing the Ethernet, (VLAN), IP and other headers, up to snaplen bytes in total. Since the libpcap library only copies packets, it cannot be used to block their passage. The traffic collection and processing program therefore has to use alternative methods, for example calling a script that adds the given IP address to a traffic-blocking rule.
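What the callback does with such a buffer can be sketched in Python using only the standard library. The function below is an illustration, not a libpcap binding: it assumes the buffer starts with an Ethernet header (optionally followed by one 802.1Q VLAN tag) and extracts the IPv4 fields needed for accounting. The function name and the test frame are made up.

```python
import socket
import struct

def parse_ipv4_frame(buf):
    """Extract accounting fields from an Ethernet frame carrying IPv4, else return None."""
    if len(buf) < 14:
        return None
    ethertype = struct.unpack("!H", buf[12:14])[0]
    offset = 14
    if ethertype == 0x8100:  # one 802.1Q VLAN tag precedes the real EtherType
        if len(buf) < 18:
            return None
        ethertype = struct.unpack("!H", buf[16:18])[0]
        offset = 18
    if ethertype != 0x0800 or len(buf) < offset + 20:
        return None  # not IPv4, or the capture was cut off inside the IP header
    (ver_ihl, tos, total_len, ident, frag, ttl, proto, checksum,
     src, dst) = struct.unpack("!BBHHHBBH4s4s", buf[offset:offset + 20])
    return {
        "src": socket.inet_ntoa(src),
        "dst": socket.inet_ntoa(dst),
        "proto": proto,          # 6 = TCP, 17 = UDP
        "ip_length": total_len,  # Length field of the IP header
    }

# A made-up frame: zeroed MAC addresses, EtherType IPv4, minimal IP header.
ethernet = bytes(12) + struct.pack("!H", 0x0800)
ip_header = struct.pack("!BBHHHBBH4s4s", 0x45, 0, 40, 0, 0, 64, 6, 0,
                        socket.inet_aton("192.168.0.15"),
                        socket.inet_aton("93.184.216.34"))
parsed = parse_ipv4_frame(ethernet + ip_header)
print(parsed)
```

Because only the first 34 or 38 bytes are read, a small snaplen is enough for this kind of accounting.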

Firewall

Capturing data passing through the firewall allows you to take into account both the traffic of the server itself and the traffic of network users, even when address translation is running. The main thing in this case is to formulate the capture rule correctly and put it in the right place. This rule activates the transfer of the packet to the system library, from where the traffic accounting and management application can receive it. On Linux, iptables is used as the firewall, and the interception tools are ipq, netfilter_queue or ulog. On FreeBSD it is ipfw with rules such as tee or divert. In either case, the firewall mechanism is extended with the ability to work with a user program as follows:

A user program (the traffic handler) registers itself in the system using a system call or a library.
The user program or an external script sets a rule in the firewall that diverts the selected traffic (according to the rule) to the handler.
For each passing packet, the handler receives its contents as a memory buffer (with IP headers, etc.). After processing (accounting), the program must also tell the operating system kernel what to do next with the packet: discard it or pass it on. Alternatively, it can return a modified packet to the kernel.

Since the IP packet is not copied but handed over to the analysis software, it becomes possible to drop it and therefore to restrict, fully or partially, traffic of a certain type (for example, traffic to a selected local network subscriber). However, if the application stops reporting its decision to the kernel (because it hangs, for example), traffic through the server is simply blocked.
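The account-then-decide logic of such a handler can be sketched without touching the kernel. The Python class below is a simulation under stated assumptions: `QuotaHandler`, the verdict constants and the per-address byte limits are hypothetical names, and in a real deployment the verdict would be returned to the kernel through the queueing mechanism rather than printed.

```python
ACCEPT, DROP = "accept", "drop"

class QuotaHandler:
    """Counts bytes per source address and drops packets once a limit is exceeded."""

    def __init__(self, limits):
        self.limits = limits  # ip -> maximum number of bytes allowed
        self.used = {}        # ip -> bytes accounted so far

    def verdict(self, src_ip, length):
        limit = self.limits.get(src_ip)  # None means the host is unlimited
        used = self.used.get(src_ip, 0)
        if limit is not None and used + length > limit:
            return DROP                   # over quota: tell the kernel to discard
        self.used[src_ip] = used + length
        return ACCEPT                     # account the packet and let it pass

handler = QuotaHandler({"192.168.0.15": 3000})
verdicts = [handler.verdict("192.168.0.15", 1500) for _ in range(3)]
print(verdicts)
```

The third packet is dropped because it would push the host past its 3000-byte limit. The handler must answer for every packet, which is exactly why a hung handler stalls all diverted traffic.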
It should be noted that, with significant traffic volumes, the mechanisms described create an excessive load on the server because data is constantly copied from the kernel to the user program. Collecting statistics at the OS kernel level and handing aggregated statistics to the application over the NetFlow protocol does not have this drawback.

Netflow

This protocol was developed by Cisco Systems to export traffic information from routers for traffic accounting and analysis. Version 5, currently the most popular, delivers to the recipient a structured stream of UDP packets containing information about past traffic in the form of so-called flow records.

The volume of information about traffic is several orders of magnitude smaller than the traffic itself, which is especially important in large and distributed networks. Of course, it is impossible to block the transfer of information when collecting statistics on netflow (if you do not use additional mechanisms).
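As a minimal sketch of what a collector does with one export datagram, the Python code below unpacks a NetFlow version 5 header (24 bytes) and its flow records (48 bytes each) with the standard `struct` module. The function name and the output dictionary keys are choices made for this example, and the test datagram is constructed by hand rather than captured from a router.

```python
import socket
import struct

HEADER = struct.Struct("!HHIIIIBBH")                # 24-byte v5 header
RECORD = struct.Struct("!4s4s4sHHIIIIHHBBBBHHBBH")  # 48-byte v5 flow record

def parse_netflow_v5(datagram):
    """Return the flow records of one NetFlow v5 export datagram as dicts."""
    version, count = HEADER.unpack(datagram[:HEADER.size])[:2]
    if version != 5:
        raise ValueError("not a NetFlow v5 datagram")
    flows = []
    for i in range(count):
        start = HEADER.size + i * RECORD.size
        f = RECORD.unpack(datagram[start:start + RECORD.size])
        flows.append({
            "src": socket.inet_ntoa(f[0]),
            "dst": socket.inet_ntoa(f[1]),
            "src_port": f[9],
            "dst_port": f[10],
            "proto": f[13],
            "packets": f[5],  # dPkts
            "bytes": f[6],    # dOctets
        })
    return flows

# A hand-built datagram with a single record, used to exercise the parser.
header = HEADER.pack(5, 1, 0, 0, 0, 0, 0, 0, 0)
record = RECORD.pack(
    socket.inet_aton("192.168.0.15"), socket.inet_aton("198.51.100.7"),
    socket.inet_aton("0.0.0.0"), 1, 2, 10, 4200, 0, 0,
    51000, 25, 0, 0x1B, 6, 0, 0, 0, 0, 0, 0)
flows = parse_netflow_v5(header + record)
print(flows)
```

In a real collector the datagram would come from a UDP socket bound to whatever port the exporter is configured to send to.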
A further development of this protocol, version 9, which is based on a templated flow record structure, is now gaining popularity, as are comparable protocols for devices from other manufacturers (sFlow). The IPFIX standard has recently been adopted; it allows statistics to be exported for higher-layer protocols as well (for example, by application type).
The implementation of netflow sources (agents, probes) is available for PC routers, both in the form of utilities working according to the mechanisms described above (flowprobe, softflowd), and directly built into the OS kernel (FreeBSD: , Linux: ). For software routers, the netflow statistics stream can be received and processed locally on the router itself, or sent over the network (transmission protocol - over UDP) to the receiving device (collector).

The collector program can collect information from many sources at once, being able to distinguish between their traffic even with overlapping address spaces. With the help of additional tools, such as nprobe, it is also possible to carry out additional data aggregation, stream bifurcation or protocol conversion, which is important when managing a large and distributed network with dozens of routers.

NetFlow export is supported by routers from Cisco Systems, MikroTik and some others. Similar functionality (with other export protocols) is supported by all major network equipment manufacturers.

libpcap "outside"

Let's complicate the task a little. What if your access device is a third party hardware router? For example, D-Link, ASUS, Trendnet, etc. On it, most likely, it is impossible to put an additional software tool for retrieving data. Alternatively, you have an intelligent access device, but it is not possible to configure it (no rights, or it is controlled by your provider). In this case, it is possible to collect information about traffic directly at the junction point of the access device with the internal network, using the "hardware" means of copying packets. In this case, you will certainly need a separate server with a dedicated network card to receive copies of Ethernet packets.
The server must use the packet collection mechanism according to the libpcap method described above, and our task is to submit a data stream identical to the output from the access server to the input of the network card allocated for this. For this you can use:

Ethernet hub: a device that simply forwards packets between all of its ports indiscriminately. Nowadays one can only be found in some dusty warehouse, and this method is not recommended: it is unreliable and slow (there are no 1 Gbps hubs).
Ethernet switch with mirroring capability (SPAN ports): modern intelligent (and expensive) switches can copy all traffic (incoming, outgoing or both) of another physical interface or VLAN, including a remote one (RSPAN), to a specified port.
Hardware splitter, which may require two network cards instead of one for collection, in addition to the main system one.

Naturally, you can configure a SPAN port on the access device itself (router) if it allows it, for example Cisco Catalyst 6500 or Cisco ASA. Here is an example of such a configuration for a Cisco switch:
monitor session 1 source vlan 100 ! where we take packets from
monitor session 1 destination interface Gi6/3 ! where we send the packets

SNMP

What if there is no router under our control, we have no wish to deal with NetFlow, and we are not interested in the details of our users' traffic? They are simply connected to the network through a managed switch, and we only need a rough estimate of the traffic on each of its ports. As you know, network devices that support remote management maintain counters of the packets (bytes) passing through their network interfaces and can report them. The correct way to poll them is the standardized remote management protocol SNMP. With it you can quite easily obtain not only the values of these counters but also other parameters, such as the interface name and description, the MAC addresses visible through it and other useful information. This can be done with command-line utilities (snmpwalk), graphical SNMP browsers and more sophisticated network monitoring programs (rrdtool, Cacti, Zabbix, WhatsUp Gold, etc.). However, this method has two significant drawbacks:

traffic blocking can only be done by completely disabling the interface, using the same SNMP
traffic counters taken via SNMP refer to the sum of the lengths of Ethernet packets (with unicast, broadcast and multicast separately), while the rest of the tools described earlier give values relative to IP packets. This creates a noticeable discrepancy (especially on short packets) due to the overhead caused by the length of the Ethernet header (however, this can be approximately dealt with: L3_bytes = L2_bytes - L2_packets*38).
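The correction formula above is simple enough to check numerically. In the Python sketch below, the default of 38 bytes per frame is the figure given in the text (it equals 14 bytes of Ethernet header, 4 of checksum, 8 of preamble and 12 of inter-frame gap); whether a particular device's counters include all of these is an assumption to verify, so the overhead is a parameter. The function name and the sample counter values are made up.

```python
def l3_bytes(l2_bytes, l2_packets, overhead_per_frame=38):
    """Estimate IP-level bytes from interface byte and packet counters."""
    return l2_bytes - l2_packets * overhead_per_frame

# 1000 full-size frames: the correction is about 2.5% of the counter value.
print(l3_bytes(1_538_000, 1000))
# 1000 short frames: the correction removes almost half of the counter value.
print(l3_bytes(84_000, 1000))
```

The second case shows why the discrepancy is most noticeable on short packets.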

VPN

Separately, it is worth considering the case where users access the network by explicitly establishing a connection to an access server. A classic example is good old dial-up, whose modern analogue is VPN remote access services (PPTP, PPPoE, L2TP, OpenVPN, IPsec).

The access device not only routes user IP traffic, but also acts as a specialized VPN server and terminates logical tunnels (often encrypted) within which user traffic is transmitted.
To account for such traffic, you can use all the tools described above (they are well suited to in-depth analysis by port and protocol) as well as the additional mechanisms provided by VPN access control tools. First of all, this means the RADIUS protocol. How it works is a rather complex topic. Briefly, access to the VPN server (the RADIUS client) is controlled (authorized) by a special application (the RADIUS server), which has a database (text file, SQL, Active Directory) of valid users with their attributes (connection speed limits, assigned IP addresses). In addition to the authorization process, the client periodically sends the server accounting messages: information about the state of each currently running VPN session, including counters of transmitted bytes and packets.
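How per-user totals can be derived from such accounting messages is sketched below in Python. The records are plain dictionaries keyed by standard RADIUS accounting attribute names; the `usage_per_user` function, the sample sessions and the assumption that records arrive in chronological order are all made up for the example, and counter wrap-around on long sessions is ignored.

```python
def usage_per_user(records):
    """Sum per-user octets from RADIUS accounting records given as dicts.

    The octet counters are cumulative within a session, so only the most
    recent Interim-Update or Stop record of each session is counted.
    """
    latest = {}  # Acct-Session-Id -> (user, input octets, output octets)
    for rec in records:
        if rec["Acct-Status-Type"] == "Start":
            continue  # Start records carry no counters
        latest[rec["Acct-Session-Id"]] = (
            rec["User-Name"],
            rec["Acct-Input-Octets"],
            rec["Acct-Output-Octets"],
        )
    totals = {}
    for user, octets_in, octets_out in latest.values():
        entry = totals.setdefault(user, {"in": 0, "out": 0})
        entry["in"] += octets_in
        entry["out"] += octets_out
    return totals

# Made-up accounting records for two sessions.
records = [
    {"Acct-Status-Type": "Start", "Acct-Session-Id": "s1", "User-Name": "alice"},
    {"Acct-Status-Type": "Interim-Update", "Acct-Session-Id": "s1", "User-Name": "alice",
     "Acct-Input-Octets": 1000, "Acct-Output-Octets": 5000},
    {"Acct-Status-Type": "Stop", "Acct-Session-Id": "s1", "User-Name": "alice",
     "Acct-Input-Octets": 3000, "Acct-Output-Octets": 9000},
    {"Acct-Status-Type": "Stop", "Acct-Session-Id": "s2", "User-Name": "bob",
     "Acct-Input-Octets": 200, "Acct-Output-Octets": 700},
]
usage = usage_per_user(records)
print(usage)
```

Adding up every record instead of the latest one per session would count the same bytes several times, which is the most common mistake with cumulative counters.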

Conclusion

Let's summarize all the methods of collecting traffic information described above.

In practice there are many ways of connecting a network you manage (with clients or office subscribers) to an external network infrastructure using a variety of access tools: software and hardware routers, switches and VPN servers. In almost every case, however, you can devise a scheme in which information about the traffic transmitted over the network is directed to a software or hardware tool for analysis and control. Such a tool may also provide feedback to the access device, applying intelligent access restriction algorithms for individual clients, protocols and more.
This concludes the overview of the fundamentals. The following topics remain to be covered:

how and where the collected traffic data goes
traffic accounting software
what is the difference between billing and a simple “counter”
how to limit traffic
recording and limiting visited websites

Any administrator sooner or later receives an instruction from management: "work out who uses the network and how much they download." For providers, this is supplemented by the tasks of "letting users in, taking payment, restricting access." What to count? How? Where? There is a lot of fragmentary, unstructured information. We will spare the novice admin tedious searching by providing general knowledge and useful links to the underlying material.
In this article I will try to describe the principles of organizing the collection, accounting and control of traffic on a network. We will outline the problem and list the possible ways of retrieving information from network devices.

This is the first theoretical article in a series of articles dedicated to the collection, accounting, management and billing of traffic and IT resources.

Internet access structure

In general, the network access structure looks like this:

External resources - the Internet, with all sites, servers, addresses and other things that do not belong to a network that you control.
Access device - a router (hardware or PC-based), switch, VPN server or hub.
Internal resources - a set of computers, subnets, subscribers, whose work in the network must be taken into account or controlled.
Management or accounting server - a device that runs specialized software. It can be functionally combined with a software router.

Network traffic

To transmit an IP packet over wires (or radio), network devices have to “wrap” (encapsulate) it in a Layer 2 (L2) protocol frame. The most common protocol of this type is Ethernet. The actual transmission "on the wire" takes place at Layer 1. Usually, the access device (router) does not analyze packet headers above Layer 4 (intelligent firewalls are an exception).
The address, port, protocol and length fields of the L3 and L4 headers make up the “source material” used in traffic accounting and management. The amount of data carried by a packet is given by the Total Length field of the IP header (which includes the length of the header itself). Note that because of header overhead, and of packet fragmentation when a packet exceeds the MTU, the total amount of data transmitted is always greater than the payload size.
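The fields mentioned above can be pulled out of a raw packet with a few lines of code. The following is a minimal sketch, assuming an IPv4 packet without the Ethernet header in front of it; the function name and the sample packet are made up for illustration.

```python
import socket
import struct

def parse_ipv4(packet: bytes) -> dict:
    """Extract the accounting-relevant L3/L4 fields from a raw IPv4 packet."""
    if len(packet) < 20:
        raise ValueError("truncated IPv4 header")
    ver_ihl, _tos, total_length, _ident, frag, _ttl, proto, _csum, src, dst = \
        struct.unpack("!BBHHHBBH4s4s", packet[:20])
    if ver_ihl >> 4 != 4:
        raise ValueError("not an IPv4 packet")
    header_len = (ver_ihl & 0x0F) * 4      # IHL is given in 32-bit words
    info = {
        "src": socket.inet_ntoa(src),
        "dst": socket.inet_ntoa(dst),
        "proto": proto,
        "length": total_length,            # IP header + payload, in bytes
        "sport": None,
        "dport": None,
    }
    first_fragment = (frag & 0x1FFF) == 0  # only the first fragment carries the L4 header
    # TCP (6) and UDP (17) both start with source and destination ports.
    if proto in (6, 17) and first_fragment and len(packet) >= header_len + 4:
        info["sport"], info["dport"] = struct.unpack(
            "!HH", packet[header_len:header_len + 4])
    return info

# Hypothetical sample: a 28-byte UDP packet from 192.168.0.10 to 8.8.8.8, port 53.
sample = struct.pack("!BBHHHBBH4s4s", 0x45, 0, 28, 0, 0, 64, 17, 0,
                     socket.inet_aton("192.168.0.10"), socket.inet_aton("8.8.8.8"))
sample += struct.pack("!HHHH", 40000, 53, 8, 0)
info = parse_ipv4(sample)
print(info)
```

Ports are read only from the first fragment: later fragments of the same packet carry payload where the L4 header would otherwise be.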

A special case worth noting is when the access device performs address translation (NAT, masquerading) to give computers on the local network Internet access through a single external, public IP address. In this case, a special mechanism substitutes the IP addresses and TCP/UDP ports of packets, replacing internal (not routable on the Internet) addresses according to its dynamic translation table. In this configuration, remember that to record data on internal hosts correctly, statistics must be collected in a way and at a point where translation has not yet “anonymized” the internal addresses.

Methods for collecting information about traffic / statistics

PC router

Consider the simplest case - an access device (router) based on a PC running Linux. Statistics can then be collected in the following ways:

interception (copying) of packets passing through the server's network card using the libpcap library
interception of packets passing through the built-in firewall
use of third-party tools that convert per-packet statistics (obtained by one of the two previous methods) into an aggregated NetFlow stream

libpcap

Capturing traffic with libpcap involves the following steps:

open the required interface
specify the filter that received packets must pass, the size of the captured part (snaplen) and the size of the buffer
set the promisc parameter, which puts the network interface into promiscuous mode, capturing all packets passing by and not just those addressed to the MAC address of this interface
set a function (callback) to be called on each received packet
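The steps above can be sketched in code. To stay within the standard library, this sketch does not call libpcap itself: it uses a Linux AF_PACKET raw socket, which is a swapped-in mechanism, and applies the filter in user space instead of compiling it into the kernel as libpcap does. The function names, default sizes and the interface name are assumptions. Running the capture needs root privileges.

```python
import socket
import struct

ETH_P_ALL = 0x0003             # capture every protocol
SOL_PACKET = 263               # Linux socket level for packet sockets
PACKET_ADD_MEMBERSHIP = 1
PACKET_MR_PROMISC = 1

def is_ipv4(frame: bytes) -> bool:
    """Filter: keep only Ethernet frames that carry a full IPv4 header."""
    return len(frame) >= 34 and frame[12:14] == b"\x08\x00"

def capture(iface, callback, keep=is_ipv4, snaplen=96, bufsize=1 << 20, promisc=True):
    """Open iface and pass the first snaplen bytes of each matching frame to callback."""
    sock = socket.socket(socket.AF_PACKET, socket.SOCK_RAW, socket.htons(ETH_P_ALL))
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, bufsize)
    sock.bind((iface, 0))
    if promisc:
        # struct packet_mreq: interface index, membership type, address length, address
        mreq = struct.pack("iHH8s", socket.if_nametoindex(iface), PACKET_MR_PROMISC, 0, b"")
        sock.setsockopt(SOL_PACKET, PACKET_ADD_MEMBERSHIP, mreq)
    try:
        while True:
            frame = sock.recv(snaplen)   # longer frames are truncated to snaplen
            if keep(frame):
                callback(frame)
    finally:
        sock.close()

bytes_by_source = {}

def count_bytes(frame: bytes) -> None:
    """Callback: add the IP Total Length to the counter of the source address."""
    total_length = struct.unpack("!H", frame[16:18])[0]   # IP header offset 2
    src = socket.inet_ntoa(frame[26:30])                  # IP header offset 12
    bytes_by_source[src] = bytes_by_source.get(src, 0) + total_length

# Example call (requires root; "eth0" is a placeholder interface name):
# capture("eth0", count_bytes)
```

A small snaplen is enough here because accounting only needs the headers, not the payload. The fixed offsets assume untagged Ethernet frames (no VLAN tag).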

Firewall

Capturing data passing through the firewall lets you account for both the traffic of the server itself and the traffic of network users, even when address translation is running. The main thing here is to formulate the capture rule correctly and put it in the right place. This rule hands the packet over to a system library, from which the traffic accounting and management application can receive it. On Linux, iptables is used as the firewall, and the interception mechanisms are ipq, netfilter_queue or ulog. On FreeBSD it is ipfw with rules such as tee or divert. In either case, the firewall mechanism is complemented by the ability to work with a user program in the following way:

A user program (the traffic handler) registers itself in the system using a system call or a library.
The user program or an external script installs a firewall rule that "wraps" the selected traffic (according to the rule) into the handler.
For each passing packet, the handler receives its contents as a memory buffer (with IP headers, etc.). After processing (accounting), the program must also tell the operating system kernel what to do next with the packet: discard it or pass it on. Alternatively, it can pass a modified packet back to the kernel.

Netflow

The volume of NetFlow traffic information is several orders of magnitude smaller than the traffic itself, which is especially important in large and distributed networks. Naturally, collecting statistics via NetFlow does not by itself allow traffic to be blocked (unless additional mechanisms are used).
A further development of the protocol, version 9, which is based on a template structure for flow records, is now gaining popularity, as are comparable technologies for devices from other manufacturers (sFlow). The IPFIX standard has also been adopted recently; it allows statistics to be exported for higher protocol layers as well (for example, by application type).
Implementations of NetFlow sources (agents, probes) are available for PC routers, both as utilities that work according to the mechanisms described above (flowprobe, softflowd) and built directly into the OS kernel (FreeBSD: ng_netflow, Linux: ). On software routers, the NetFlow statistics stream can be received and processed locally on the router itself, or sent over the network (carried over UDP) to a receiving device (collector).

The collector program can gather information from many sources at once and can tell their traffic apart even when their address spaces overlap. With additional tools such as nprobe, it is also possible to perform further data aggregation, stream duplication or protocol conversion, which matters when managing a large, distributed network with dozens of routers.
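To make the collector side more concrete, here is a minimal sketch of decoding one export datagram. It assumes the classic NetFlow version 5 layout (a 24-byte header followed by 48-byte flow records); version 9 and IPFIX are template-based and would need a different parser. The function names and UDP port 2055 are assumptions for illustration.

```python
import socket
import struct

HEADER = struct.Struct("!HHIIIIBBH")                 # 24-byte NetFlow v5 header
RECORD = struct.Struct("!4s4s4sHHIIIIHHBBBBHHBBH")   # 48-byte NetFlow v5 flow record

def parse_netflow_v5(datagram: bytes) -> list:
    """Return a list of flow dicts from one NetFlow v5 export datagram."""
    version, count = struct.unpack("!HH", datagram[:4])
    if version != 5:
        raise ValueError(f"unsupported NetFlow version {version}")
    flows = []
    for i in range(count):
        f = RECORD.unpack_from(datagram, HEADER.size + i * RECORD.size)
        flows.append({
            "src": socket.inet_ntoa(f[0]),
            "dst": socket.inet_ntoa(f[1]),
            "packets": f[5],     # dPkts
            "bytes": f[6],       # dOctets
            "sport": f[9],
            "dport": f[10],
            "proto": f[13],
        })
    return flows

def collect(port=2055):
    """Receive export datagrams over UDP and print flows tagged with the exporter address."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind(("0.0.0.0", port))
    while True:
        datagram, (exporter, _port) = sock.recvfrom(65535)
        for flow in parse_netflow_v5(datagram):
            print(exporter, flow)
```

Tagging every flow with the exporter's address is what lets a collector keep traffic from different routers apart even when their internal address ranges overlap.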

libpcap "outside"

To capture traffic with libpcap on a separate machine outside the access device, a copy of the traffic has to be delivered to that machine by one of the following:

An Ethernet hub: a device that simply forwards packets between all of its ports indiscriminately. Nowadays it can only be found somewhere in a dusty warehouse, and this method is not recommended: it is unreliable and slow (there are no hubs at 1 Gbps).
An Ethernet switch with port mirroring (SPAN ports). Modern intelligent (and expensive) switches can copy all traffic (incoming, outgoing or both) of another physical interface or VLAN to a specified port, including a remote one (RSPAN).
A hardware splitter, which may require two network cards in the capture machine instead of one, in addition to the main system one.

SNMP

What if there is no router under our control, we have no desire to deal with NetFlow, and we are not interested in the details of our users' traffic? They are simply connected to the network through a managed switch, and we just need a rough estimate of the amount of traffic on each of its ports. Remotely managed network devices maintain, and can report, counters of packets and bytes passing through their network interfaces. The right way to poll them is the standardized remote management protocol SNMP. With it, you can quite easily obtain not only the values of these counters but also other parameters, such as the name and description of an interface, the MAC addresses visible through it, and other useful information. This can be done with command line utilities (snmpwalk), graphical SNMP browsers, and more sophisticated network monitoring programs (rrdtool, Cacti, Zabbix, WhatsUp Gold, etc.). However, this method has two significant drawbacks:

traffic blocking can only be done by completely disabling the interface, using the same SNMP
traffic counters taken via SNMP refer to the sum of the lengths of Ethernet packets (with unicast, broadcast and multicast separately), while the rest of the tools described earlier give values relative to IP packets. This creates a noticeable discrepancy (especially on short packets) due to the overhead caused by the length of the Ethernet header (however, this can be approximately dealt with: L3_bytes = L2_bytes - L2_packets*38).
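The correction formula above can be applied to counter readings as follows. This is a minimal sketch: the function names and sample values are made up, and the per-frame overhead of 38 bytes is taken from the formula in the text. What a given device actually includes in its byte counters may differ, so the value is left as a parameter. The sketch also handles the wrap-around of 32-bit SNMP counters between two polls.

```python
def counter_delta(previous: int, current: int, bits: int = 32) -> int:
    """Difference between two counter readings, allowing for one wrap-around."""
    return (current - previous) % (2 ** bits)

def l3_bytes(l2_bytes: int, l2_packets: int, overhead: int = 38) -> int:
    """Approximate IP-level bytes: L3_bytes = L2_bytes - L2_packets * overhead."""
    return l2_bytes - l2_packets * overhead

# Hypothetical readings from two consecutive polls of one switch port.
octets = counter_delta(4294967000, 704)   # byte counter wrapped past 2**32
packets = counter_delta(120, 130)
print(octets, packets, l3_bytes(octets, packets))
```

For short packets the correction is substantial: in the sample above, 10 frames totalling 1000 bytes at L2 come out as 620 bytes at L3.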

VPN

A separate case is user access to the network by explicitly establishing a connection to an access server. A classic example is the good old dial-up, whose modern counterpart is remote access VPN services (PPTP, PPPoE, L2TP, OpenVPN, IPsec).

Conclusion

To sum up: in practice, there are many ways to connect a network you manage (with clients or office subscribers) to an external network infrastructure, using a variety of access devices - software and hardware routers, switches, VPN servers. In almost any case, however, you can devise a scheme in which information about the traffic passing through the network is directed to a software or hardware tool for analysis and control. That tool may also provide feedback to the access device, applying intelligent access restriction algorithms to individual clients, protocols, and so on.
This concludes the overview of the basics. The following topics remain uncovered:

how and where the collected traffic data goes
traffic accounting software
what is the difference between billing and a simple “counter”
how to limit traffic
recording and limiting visited websites
