Data Your Way: Integrate z Systems Operational Data into Your Existing Analytics Solution

Posted by - July 11, 2017 - Blog

More and more enterprises are recognizing the value of IT Operations Analytics (ITOA) to:

  • Reduce their root cause analysis elapsed time
  • Improve the efficiency of overall IT
  • Prevent outages

Usually, initial uses of ITOA are limited to analyzing data from distributed platforms, possibly with the addition of network logs and performance data. These limitations stem from a historical separation of mainframe applications and teams inside enterprises. The difference in the skills required to manage the different platforms and data is also a contributing factor. However, the ability to manage all IT operational data, from both distributed platforms and z Systems, has recently become a necessity. The main reason for this shift is that many business-critical applications, although they sit behind a user-friendly web interface and run partly on distributed platforms, rely on mainframe back-end applications and systems (CICS, IMS, MQ or DB2 on z/OS).

A complete ITOA solution cannot omit mainframe operational data. This data must also be stored in the same analytic engine as the data coming from the distributed platforms and the network, which allows an end-to-end view of the IT systems and applications. On distributed platforms and networks, many applications can retrieve log data and store it in the most widely used analytic engines, such as Splunk or Elasticsearch. The know-how needed to retrieve important mainframe operations data in a timely manner is equally important.

Single Point for Data

IBM recently released IBM Common Data Provider for z Systems (CDPz), which is a single collection point to access near real-time operations data (SMF records, RMF data) and a wide variety of log data on z Systems. This product is based on 30+ years of experience in collecting, parsing, and analyzing z Systems operational data. CDPz can collect z Systems IT operations data from more than 140 data sources, including 100+ SMF record types and application logs (WebSphere, CICS). CDPz can also collect data from generic files and allows access to analytics data within minutes. It collects the requested data once and provides multiple end users and analytics products with the data they need.

The data to collect, the consumers of that data, and the desired transformations are all defined graphically in a z/OS plug-in, a web user interface (UI) provided with the product, as shown in Figure 1. This web UI allows the user to graphically configure, change, and check the policies defined for data collection.

Cost control through advanced filtering

The amount of data transferred to and ingested by the analytic engine is always a pain point for enterprises approaching or implementing ITOA, because the cost of an analytic engine is usually linked directly to the quantity of data it ingests. CDPz offers advanced filtering capabilities to target specific use cases and reduce data volumes and network traffic. Batch data collection is available to control CPU consumption spikes. Moreover, CDPz offers the unique capability to filter log data based on the content of the data itself. The filter is a regular expression defined by the user, who decides whether the log records that match it are sent or discarded. With this feature it is possible, for example, to send only the logs related to a specific application to the data lake used by that application's development team. This gives the correct level of data segregation, so that data is accessible only to those who need it. The feature is crucial to enabling a complete DevOps paradigm, where the development team can have all the logs from the mainframe and distributed platforms in the same place, with no IBM z expertise.
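To make the send-or-discard idea concrete, here is a minimal Python sketch of content-based filtering with a regular expression. It is not CDPz code or configuration: the function `filter_records`, the `on_match` parameter, and the sample log lines are all hypothetical, chosen only to illustrate the behavior described above.

```python
import re
from typing import Iterable, Iterator


def filter_records(records: Iterable[str], pattern: str,
                   on_match: str = "send") -> Iterator[str]:
    """Yield the records that should be forwarded to the consumer.

    on_match="send":    forward only records that match the pattern.
    on_match="discard": forward only records that do NOT match.
    (Hypothetical helper, not part of any IBM product.)
    """
    if on_match not in ("send", "discard"):
        raise ValueError("on_match must be 'send' or 'discard'")
    regex = re.compile(pattern)
    for record in records:
        matched = regex.search(record) is not None
        # Forward when the match result agrees with the chosen mode.
        if matched == (on_match == "send"):
            yield record


# Invented sample log lines, for illustration only.
sample_log = [
    "2017-07-11 10:00:01 PAYROLL1 INFO transaction committed",
    "2017-07-11 10:00:02 BILLING7 WARN retry scheduled",
    "2017-07-11 10:00:03 PAYROLL1 ERROR abend detected",
]

# Send only the records for one application to that team's data lake.
payroll_only = list(filter_records(sample_log, r"\bPAYROLL1\b", on_match="send"))

# Or drop that application's records and keep everything else.
without_payroll = list(filter_records(sample_log, r"\bPAYROLL1\b", on_match="discard"))
```

Filtering before the data leaves the source system, rather than inside the analytic engine, is what reduces both network traffic and ingestion volume.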

Figure 1: CDPz Configuration Tool – Web User interface


What if the IT Operations team wants to have user-specific application data in the same analytic engine where the IT Operational Data is stored through CDPz? In this case, the CDPz Open Streaming Application Programming Interface (API) can be leveraged. Through this API, it’s possible to leverage CDPz to send user-application data to a consumer defined through the same web user interface. REXX and Java APIs are provided with the product to enable this feature. The user can define the data source to be used, the structure of the data to be sent (if any), the target consumer of the data, and (if needed) the transformation to be applied on top of the data.
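The four things a user defines (data source, optional structure, target consumer, optional transformation) can be pictured with a small Python model. This is only a conceptual sketch: the real Open Streaming API is provided in REXX and Java, and every name below (`StreamDefinition`, `prepare`, the field names, the sample values) is an assumption made for illustration, not the product's interface.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class StreamDefinition:
    """Hypothetical model of what the user defines for application data."""
    data_source: str                                   # where the data comes from
    consumer: str                                      # target consumer of the data
    fields: Optional[List[str]] = None                 # structure of the data, if any
    transform: Optional[Callable[[str], str]] = None   # optional transformation


def prepare(definition: StreamDefinition, raw_record: str) -> dict:
    """Apply the optional transformation and structure to one record."""
    payload = definition.transform(raw_record) if definition.transform else raw_record
    message = {"source": definition.data_source, "consumer": definition.consumer}
    if definition.fields:
        # Treat the record as comma-separated values named by `fields`.
        message["data"] = dict(zip(definition.fields, payload.split(",")))
    else:
        # No structure defined: pass the payload through as plain text.
        message["data"] = payload
    return message


# Invented example definition and record.
orders = StreamDefinition(
    data_source="ORDERS_APP_LOG",
    consumer="analytics_engine_1",
    fields=["order_id", "status"],
    transform=str.upper,
)
message = prepare(orders, "a1001,shipped")
```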

Choice of Analytic Engine

Another increasingly common need is to send IT Operational Data to different analytic engines, usually splitting the data based on enterprise policies and team decisions. For example, only business-critical application data goes into Splunk, while less critical application data is stored in an Elasticsearch instance. Another example is a Security team that owns a Splunk instance used for security data only, while real-time log monitoring is managed with a different Splunk instance. There are many other reasons why users might need to send the same data, or different parts of the data, to different data lakes (Splunk, Elasticsearch, Hadoop, IDAA). CDPz enables users to split the data and send it to multiple consumers.

IBM decided to be very open and to make CDPz the collection point for z Systems Operational Data for clients both on and off platform. Users can now choose whether to send data to IBM Operations Analytics products or to other analytics engines (Splunk, ELK stack, Kafka/Hadoop, Logstash, etc.). In March, IBM created a specific integration path for CDPz to send z Operations Data directly to Splunk. IDC ranked Splunk number 1 in worldwide ITOA software market share for 2015. Per IDC, more than 85 of the Fortune 100 companies use Splunk.

IBM recognizes the importance of giving clients the choice of which analytics engine to send their z Systems operational data to. Because of its advanced filtering capability, CDPz can limit the amount of data sent to Splunk (or to another engine) based on real needs, which limits the cost of the overall solution.
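The splitting described in this section can be sketched as a small routing table in Python. This is an illustration of the idea only, not CDPz configuration: the consumer names, the patterns, the `route` function, and the sample records are all hypothetical.

```python
import re
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

# Hypothetical routing policy: (consumer name, regular expression).
# A record is delivered to every consumer whose pattern it matches.
ROUTES: List[Tuple[str, str]] = [
    ("splunk_business", r"app=(payments|orders)"),
    ("elasticsearch_general", r"app=(intranet|reporting)"),
    ("splunk_security", r"type=security"),
]


def route(records: Iterable[str],
          routes: List[Tuple[str, str]]) -> Dict[str, List[str]]:
    """Group records by the consumers that should receive them."""
    compiled = [(name, re.compile(pattern)) for name, pattern in routes]
    deliveries: Dict[str, List[str]] = defaultdict(list)
    for record in records:
        for name, regex in compiled:
            if regex.search(record):
                deliveries[name].append(record)
    return dict(deliveries)


# Invented sample records.
records = [
    "app=payments type=perf cpu=12",
    "app=intranet type=perf cpu=3",
    "app=payments type=security event=logon_failed",
]
deliveries = route(records, ROUTES)
```

In this sketch the third record reaches two consumers, which mirrors the case where the same data is needed in more than one data lake, while records that match no route are simply not sent anywhere.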

Flexible Solution

CDPz is a young/old product. It is young because it was only recently made available by IBM. It is old because it is based on 30-plus years of experience. It offers great flexibility to satisfy the most advanced and complex user scenarios and needs with a simple and user-friendly configuration and setup. CDPz's main strengths are:

  • A web UI to configure the product in terms of the data to be collected and the engine to be fed
  • The ability to filter on the content of the data, which allows users to carefully select what data to send and where to send it
  • The ability to send data simultaneously to different data lakes

More info about the product is available at:

Author Bio:

Domenico D’Alterio is Offering Manager for IT Operational Analytics with z Systems software.

Domenico joined IBM in 1999, and after a few years as a developer and designer, he spent more than 11 years as a manager, leading software development and customer support teams in IBM. In February 2016, Domenico joined the z Systems Offering Management team.

During his time at IBM, Domenico has worked in and acquired knowledge of different IT Service Management areas, such as license compliance management, endpoint management, network management, automation and workload automation.

He is currently responsible for a few IBM offerings in z IT Operations Analytics, including IBM Common Data Provider for z Systems.

LinkedIn profile: