The IT infrastructure of a modern enterprise usually includes a number of very powerful automation and control systems. Real-time data collection, aggregation, and storage have not been outstanding or extraordinary activities for the last couple of decades. The main focus is now moving away from accumulating data towards its 'intelligent' processing and analysis. IT (information technology), BI (business intelligence), and other managerial and analytical departments are trying to bring additional insight and value to their enterprises by inventing new data-driven control approaches. This paper describes one approach in which real-time historical data may be utilized during a so-called "what-if" enterprise scenario forecast. In our research we conduct the "what-if" study on a power plant; the problem setting and the scenario modelling stages are as follows:
a. Based on the available historical trends from the SCADA (supervisory control and data acquisition) system, approximate the output power characteristics of the power-generating units with piecewise linear dependencies for different ambient conditions. The approximation is performed with the least squares (LSQ) method, and the resulting model strongly depends on the data captured into the LSQ scope.
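A minimal sketch of this step, assuming (purely for illustration) that the output power of a unit is approximated against a single ambient variable such as temperature, with the piecewise structure reduced to two linear segments split at a given breakpoint; `np.polyfit` with degree 1 performs the ordinary least-squares line fit on each segment:

```python
import numpy as np

def piecewise_linear_fit(ambient_temp, output_power, breakpoint):
    """Fit two linear LSQ segments split at an ambient-temperature breakpoint.

    Returns (slope, intercept) coefficient pairs for the left and right segments.
    """
    left = ambient_temp < breakpoint
    # np.polyfit(deg=1) solves the ordinary least-squares line fit per segment
    coeffs_left = np.polyfit(ambient_temp[left], output_power[left], 1)
    coeffs_right = np.polyfit(ambient_temp[~left], output_power[~left], 1)
    return coeffs_left, coeffs_right

def predict(temp, breakpoint, coeffs_left, coeffs_right):
    """Evaluate the fitted piecewise model at a single temperature."""
    c = coeffs_left if temp < breakpoint else coeffs_right
    return c[0] * temp + c[1]
```

A real model would cover more ambient variables and more segments, and — as the text notes — its quality depends strongly on which historical points fall into the LSQ scope.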
b. Calculate the cross-correlation of all output powers with all available measurements collected by the power station SCADA system. The influencing factors fall into several major groups: changes in ambient conditions, operator accuracy when following the generating plan, and the basic state characteristics of the assets.
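A sketch of how such a screening pass might look, assuming each measurement is an aligned time series; the measurement names here are hypothetical, and plain Pearson correlation (`np.corrcoef`) stands in for whatever correlation measure the production system uses:

```python
import numpy as np

def rank_influencing_factors(output_power, measurements, names):
    """Rank SCADA measurements by absolute Pearson correlation with unit output power."""
    scores = {}
    for name, series in zip(names, measurements):
        r = np.corrcoef(output_power, series)[0, 1]
        scores[name] = r
    # Strongest influencing factors (by |r|) first
    return sorted(scores.items(), key=lambda kv: abs(kv[1]), reverse=True)
```

The resulting ranking indicates which measurements to carry forward as influencing parameters into the later modelling stages.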
c. Create several scenarios for research. For this purpose we take a random historical day with a known power generating plan and a calculated level of power plant margin. The alternative scenarios contain certain plan modifications, and the objective of the modelling session is to determine whether each plan deviation could provide additional margin to the power plant (i.e., to find a better possible generating plan).
d. For each possible approximation generated in step (a) and all combinations of the influencing parameters from step (b), recalculate the day margin with the new plan. Considering all possible variants, we obtain a Monte Carlo-like calculation procedure. The resulting effect of each scenario is estimated by comparing the mean values of the distribution sets obtained after the Monte Carlo modelling session.

The number of data points used to create a power plant model for this experiment is calculated as the number of parameters multiplied by the observation timeframe duration and the frequency of SCADA trending in events per minute. As considered in this paper, 100 parameters over 10 years with a frequency of 1 measurement per minute lead to an analysis of roughly 500 million points at the model generation stage. During the scenario modelling stage, the typical number of Monte Carlo iterations is around 1 million. Both stages involve massive computations and may be considered big data problems when conducted on a standard server platform. On the whole, the paper aims to show a real practical case behind such buzzwords of modern industrial IT as 'digitalization' and 'big data'.
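The comparison logic of step (d) can be sketched as follows, under strong simplifying assumptions: the day is reduced to 24 hourly points, the margin is taken as available capacity minus planned load summed over the day, and the combined influencing parameters are collapsed into a single random ambient effect with a hypothetical normal distribution; the real procedure iterates over the actual approximations and parameter combinations from steps (a) and (b):

```python
import numpy as np

rng = np.random.default_rng(0)

def day_margin(plan_mw, capacity_mw, ambient_effect):
    """Margin = available capacity (shifted by an ambient effect) minus planned load, summed over the day."""
    return np.sum(capacity_mw + ambient_effect - plan_mw)

def monte_carlo_margin(plan_mw, capacity_mw, n_iter=10_000):
    """Draw random ambient effects (hypothetical N(0, 5) MW per hour) and collect margin samples."""
    margins = np.empty(n_iter)
    for i in range(n_iter):
        ambient_effect = rng.normal(0.0, 5.0, size=plan_mw.shape)
        margins[i] = day_margin(plan_mw, capacity_mw, ambient_effect)
    return margins

# A scenario's effect is then judged by comparing distribution means, e.g.:
#   base = monte_carlo_margin(base_plan, capacity)
#   alt  = monte_carlo_margin(modified_plan, capacity)
#   if alt.mean() > base.mean():  the plan deviation adds margin
```

At the scale quoted in the paper (around 1 million iterations per scenario over the full parameter space), this loop becomes a heavy computation in its own right, which is what motivates treating the modelling stage as a big data problem.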