Create a Baseline for Network Monitoring and Alarming

This is a short article, so some topics are missing or only briefly described.

To establish a baseline, the first step is to ask yourself, "In what ways can I characterize network performance, and how can I produce a baseline for each of them?" Or, in short form, "What should I observe?"

I usually break this process into a few points:

  1. Identify the critical points of the network and the business. You may give a high priority to specific access switches and a much lower priority to others. Start by sitting down with your business units to identify the critical parts of your network.
  2. The speed at which the network converges: this is a fairly long process, and it can take a few weeks to months to work out an ideal value for your network. Measure the delay and jitter of a stream as it crosses an intentionally failed path during a maintenance window (MW). Also measure routing table convergence or the convergence timers, and the convergence timers of STP or other loop prevention protocols such as ELRP.
  3. The rate at which changes occur in the network: how often links change state, and how often an external provider fails, such as a cloud provider, your monitoring system provider, or your ISP.
  4. The utilization of links throughout the network: measure the utilization pattern over common periods of time. Hourly, daily, weekly, and monthly rates can be very useful in determining whether there is really a problem. You also can't monitor every port in the network, so you have to pick out the critical and important links and monitor them with different methods such as NetFlow, SNMP, APIs, RSPAN, etc.
  5. QoS flows of your network and applications: applications rely on consistent jitter and delay to work correctly. You can't know what "wrong" is if you don't know what "correct" looks like.
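
As a rough illustration of the utilization point, here is a minimal Python sketch of how a per-hour baseline could be built from interface octet counters (for example SNMP ifHCInOctets readings) and then used to decide whether a new sample looks "wrong". The function names, the 3-sigma rule, and the `min_band` floor are my own assumptions for the example, not a feature of any particular monitoring tool.

```python
from collections import defaultdict
from statistics import mean, pstdev


def utilization_percent(octets_start, octets_end, interval_s, link_bps, counter_bits=64):
    """Percent utilization between two octet-counter readings.

    The modulo handles a single counter wrap between the two readings.
    """
    delta_octets = (octets_end - octets_start) % (2 ** counter_bits)
    return delta_octets * 8 / (interval_s * link_bps) * 100


def hourly_baseline(samples):
    """Build {hour_of_day: (mean, population stdev)} from (hour, utilization) pairs."""
    by_hour = defaultdict(list)
    for hour, util in samples:
        by_hour[hour].append(util)
    return {hour: (mean(vals), pstdev(vals)) for hour, vals in by_hour.items()}


def is_abnormal(baseline, hour, util, k=3.0, min_band=1.0):
    """True if util is further from the hourly mean than max(k * stdev, min_band).

    min_band (in percentage points) avoids alarms on hours where the
    history is perfectly flat and the stdev is zero. Both k and min_band
    are illustrative defaults, not recommended values.
    """
    avg, sd = baseline[hour]
    return abs(util - avg) > max(k * sd, min_band)
```

The same idea works for daily, weekly, or monthly patterns: change the key from hour of day to whatever period matches how your links are actually used.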

A few other things also need attention:

  1. Collect data samples from different parts of the network under working conditions.
  2. Device resource and life cycle management. A router or switch will not perform the same under low and high CPU or memory usage.
  3. Organize your documentation and keep it up to date.
  4. Establish a CMDB process.
  5. I recommend creating snapshots of your monitoring data in three different phases: normal working conditions, a failed path, and non-working conditions.
  6. The monitoring systems must know the who, where, what, and when of your connected devices and end users.
  7. Which protocol or technology does the application use, such as SNMP, ICMP, or an API? If a few devices do not support a particular feature, such as a specific MIB or API, then upgrade those devices or recalculate.
  8. Your tools must correlate this monitoring data between devices or along a path. You should not end up in a situation where you collected NetFlow data and matched a source and destination address or a particular stream, but the monitoring system can't tell you where the stream starts and where it exits. Otherwise, you end up checking NetFlow on each port manually.
  9. Do you need all alarms? Where are your alarm catalogs?
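
To make the correlation point more concrete, here is a small Python sketch of one possible way to find where a stream enters and leaves the monitored network from flow records exported by several devices. It assumes you already have a list of inter-device links (for example from your CMDB or LLDP data). The `FlowRecord` type, the `internal_links` set, and the `entry_and_exit` function are hypothetical names made up for this example, not part of NetFlow or of any product.

```python
from typing import NamedTuple


class FlowRecord(NamedTuple):
    exporter: str  # device that exported the flow record
    src: str       # source IP address
    dst: str       # destination IP address
    in_if: str     # interface the flow arrived on
    out_if: str    # interface the flow left on


def entry_and_exit(records, internal_links, src, dst):
    """Return ((entry_device, in_if), (exit_device, out_if)) for one stream.

    internal_links is a set of (device, interface) pairs that connect to
    another monitored device. An interface that is not in this set is
    treated as the edge of the monitored network. Either element of the
    result is None if no matching edge interface was seen.
    """
    entry = exit_point = None
    for rec in records:
        if (rec.src, rec.dst) != (src, dst):
            continue  # record belongs to a different stream
        if (rec.exporter, rec.in_if) not in internal_links:
            entry = (rec.exporter, rec.in_if)
        if (rec.exporter, rec.out_if) not in internal_links:
            exit_point = (rec.exporter, rec.out_if)
    return entry, exit_point
```

This only works if the link inventory is accurate, which is one more reason to keep the documentation and the CMDB up to date.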

One more point: can you describe the network's cost and its output to management? If not, think twice: do you really need to change the current network? All of this varies from company to company and network to network.
