Business Process Event Monitor

Measure Overall System Health Based on Multiple Metrics

New Feature: Business Process Event Monitor

FrameFlow v2019.3 introduced a new event monitor called the Business Process Event Monitor. In this blog post we'll go over its features and give you some tips on how to best take advantage of what it has to offer.

Business Processes

Most, if not all, applications and business processes are based on multiple components or units, each of which plays an important role in the overall operation of the process.

For example, an enterprise resource planning system might include web servers for the interface, application servers for data processing, and database servers for storage and archival. These software units sit on top of hardware components such as physical servers, SANs, networking equipment and power systems.

Mission-critical systems incorporate redundancy to make sure that the failure of any individual unit does not impact the overall operation of the system.

Enterprise monitoring solutions like FrameFlow can monitor the individual units and alert about failures, but how do you judge the overall health of a complicated system? FrameFlow's new Business Process event monitor gives you a powerful mechanism to measure and report on a system's overall health.

Business Process Event Monitor

With the Business Process event monitor you can define the units that make up the system or business process and then score their health based on the results of your event monitors. In the following screenshot you can see an example where we have configured three units.

As you define each unit, you can select the data points that represent its overall health level. Each unit starts with an overall score of 100%. Based on the results of monitoring we will apply negative scores which will allow us to produce an overall score measuring the unit's health.

In this example we have a unit called "Front End" that represents the web servers that end users interact with. The web servers are load-balanced, so if one goes down, the other can take over. If both are operational, the workload is balanced between them.

For these systems, bandwidth and CPU usage are the two most important metrics. Based on that we have selected those two data points for each of the web server nodes.

Each data point has a status level which can be success, warning, error or critical. The status of a data point depends on the thresholds that you chose when you configured your event monitors. For example, you may have configured a CPU usage monitor with thresholds for warning, error and critical levels.
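To make the idea of status levels concrete, here is a minimal Python sketch of how a reading could be mapped to a status. It is only an illustration, not how FrameFlow does it internally: the function name and the CPU thresholds of 70, 85 and 95 percent are made up for this example.

```python
from bisect import bisect_right

# Status levels named in this post, ordered from healthy to most severe.
STATUSES = ["success", "warning", "error", "critical"]


def status_for(value, thresholds):
    """Return the status for a reading, given ascending
    warning/error/critical thresholds."""
    # bisect_right counts how many thresholds the value has reached or passed.
    return STATUSES[bisect_right(thresholds, value)]


# Hypothetical CPU usage thresholds in percent (not taken from FrameFlow).
CPU_THRESHOLDS = [70, 85, 95]

print(status_for(42, CPU_THRESHOLDS))  # success
print(status_for(90, CPU_THRESHOLDS))  # error
```

In this sketch a reading that lands exactly on a threshold counts as having reached that level.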

In the settings for the business unit you can set penalty values for each status level. In our example we have used penalties of 10, 30 and 50 for the bandwidth data points. Remember, the unit starts with a score of 100%. With these settings if the bandwidth on webserver-1 is at an error level, we will deduct 30 points from the score.
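The scoring rule can be sketched in a few lines of Python. This is an illustration of the arithmetic only, not FrameFlow's implementation. The function name is hypothetical, and two things are assumptions on our part: that the penalties 10, 30 and 50 map to warning, error and critical in that order, and that the score is floored at 0%.

```python
# Penalty per status level for the bandwidth data points (10, 30 and 50 from
# the example above). Mapping them to warning/error/critical in that order is
# an assumption; "success" costs nothing.
BANDWIDTH_PENALTIES = {"success": 0, "warning": 10, "error": 30, "critical": 50}


def unit_score(data_points):
    """Score a unit from a list of (status, penalties) pairs, one per data point."""
    score = 100  # every unit starts at 100%
    for status, penalties in data_points:
        score -= penalties[status]
    # Assumption: the score never drops below 0%.
    return max(score, 0)


# Bandwidth on webserver-1 at error level, bandwidth on the second node healthy.
front_end = [("error", BANDWIDTH_PENALTIES), ("success", BANDWIDTH_PENALTIES)]
print(unit_score(front_end))  # 70
```

Passing the penalty table alongside each status keeps the sketch flexible, since different data points in the same unit can carry different penalties.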

We chose these penalty values to get our desired alerting. If bandwidth on both web servers is at a warning level, the overall score will be 80%. If at the same time the CPU is critical on one of them, the score will be reduced by 40 more points, giving a score of 40%. Below we will see how this score will be used for alerting.

Defining Multiple Units

The other units in our example are the "Middle Tier" and the "Storage Tier". You'll want to assign different data points to each unit based on how the unit works and which resources it relies on. For example, in a middle tier that handles business logic, bandwidth might not be too important but CPU usage probably is. In a storage tier, disk space is likely to be an important metric in judging the overall health of the business process.
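If it helps to picture the layout, the three units could be written down as plain data like this. This is a rough Python sketch for illustration only. The class names, metric labels and most device names are hypothetical, and in FrameFlow you configure all of this through the event monitor's settings rather than in code.

```python
from dataclasses import dataclass, field


@dataclass
class DataPoint:
    device: str  # device the data point comes from
    metric: str  # what is being measured


@dataclass
class Unit:
    name: str
    data_points: list = field(default_factory=list)


# Device names other than webserver-1 are made up for this sketch.
units = [
    Unit("Front End", [
        DataPoint("webserver-1", "bandwidth"),
        DataPoint("webserver-1", "cpu"),
        DataPoint("webserver-2", "bandwidth"),
        DataPoint("webserver-2", "cpu"),
    ]),
    Unit("Middle Tier", [DataPoint("appserver-1", "cpu")]),
    Unit("Storage Tier", [DataPoint("dbserver-1", "disk_space")]),
]

for unit in units:
    print(unit.name, [dp.metric for dp in unit.data_points])
```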

Alerting

After you have defined your business units and selected their data points, it's now time to associate the results with a virtual device and set alerting thresholds.

We use the term "virtual device" because the business process itself is composed of multiple physical and virtual systems. A virtual device acts as a container to hold the results of the business process monitoring and provides a way to show its results on dashboards and in reports. In FrameFlow, a virtual device is like any other device, but when you add it, give it the name of the business process, like "ERP System" or "Payroll System", instead of a hostname or IP address.

Next, use the option to alert if the total calculated percentage is less than a specified threshold so you get alerts if the health of the business process is degraded.
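The alerting check itself is a simple comparison, sketched below in Python. This post does not spell out how the unit scores are combined into the total percentage, so the averaging here is purely an assumption for illustration, and both function names are hypothetical.

```python
def total_percentage(unit_scores):
    """Combine unit scores into one percentage.

    Assumption: a plain average. The actual calculation may differ.
    """
    return sum(unit_scores.values()) / len(unit_scores)


def should_alert(total, threshold):
    """Alert when the calculated percentage is less than the threshold."""
    return total < threshold


# Example scores for the three units in this post (values are made up).
scores = {"Front End": 70, "Middle Tier": 100, "Storage Tier": 100}
total = total_percentage(scores)

print(total)                    # 90.0
print(should_alert(total, 95))  # True
print(should_alert(total, 80))  # False
```

A higher threshold makes the alert more sensitive: at 95, a single web server at an error level is enough to trigger it in this example, while at 80 it is not.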

Wrap Up

The Business Process event monitor gives you a powerful mechanism to score the overall health of complicated systems. It lets you take raw monitoring data and turn it into a high-level metric that provides business information you can act on. Setting one up can take a bit of time, so if you need help, reach out to our support staff anytime. We can offer tips on how to define your units, select the types of data points to use, and configure your scores and thresholds.

Try FrameFlow Now

Are you new to FrameFlow? Take it for a spin for free for 30 days and start taking advantage of its enterprise IT monitoring features.
