Storage

Archive Compression

Archive Compression Overview

The Data Archiver performs archive compression to conserve disk storage space. Archive compression can be used on tags populated by any method (collector, migration, File collector, SDK programs, Excel, etc.).

Archive compression is a tag property. It can be enabled or disabled per tag, and each tag can have its own deadband.

Archive compression applies to numeric data types (scaled, float, double float, integer and double integer). It does not apply to string or blob data types. Archive compression is useful only for analog values, not digital values.

Archive compression can result in fewer raw samples stored to disk than were sent by the collector.

If all samples are stored, the required storage space cannot be reduced. Storage space can be saved only if some samples can be safely discarded. Briefly, points along a straight or linearly sloping line can be dropped without information loss, because the dropped points can be reconstructed by linear interpolation during data retrieval. The user still retrieves real-world values, even though fewer points were stored.

Archive compression uses a held sample. This is a sample held in memory but not yet written to disk. When an incoming sample is received, the currently-held sample is either written to disk or discarded, and the incoming sample always becomes the new held sample. If the currently-held sample is always written to disk, no compression occurs. If the currently-held sample is discarded, nothing is written to disk and storage space is conserved.

In other words, collector compression occurs when the collected incoming value is discarded. Archive compression occurs when the currently-held value is discarded.

Held samples are written to disk when archive compression is disabled or the archiver is shut down.

Any sample written to disk is a true incoming sample. No timestamp or value or quality is ever changed or interpolated.
Note: internal performance tags use an archive compression deadband of 0% or close to 0%.

Archive Compression Logic

The following describes the logic that is executed on every sample written to the archiver while archive compression is enabled for the tag:
IF the incoming sample's data quality equals the held sample's data quality
    IF the quality is bad
        Discard the held sample so that repeated bad values are not stored
    ELSE
        Decide whether the incoming value exceeds the archive compression deadband
ELSE
    The data quality has changed; the held sample must be flushed to disk
IF the deadband was exceeded or the data quality changed
    Store the held sample in the archive
    Set up a new deadband threshold using the incoming value and the value written to disk
Make the incoming value the held value
Example: Change of data quality

The effect of archive compression is demonstrated in the following examples.

This example demonstrates that:
  • A change in data quality causes held samples to be stored.
  • Held samples are returned only in a current value sampling mode query.
  • Restarting the archiver causes the held sample to be flushed to disk.
Normally, a flat straight line would never cause the held value to be written to disk. An important exception is that changes in data quality force the held value to be written to disk. Assume a large archive compression deadband, such as 75% on a 0 to 100 EGU span.
Time Value Quality
t0 2 Good
t1 2 Bad
t2 2 Good

The following SQL query lets you see which data values were stored:

Select * from ihRawData where samplingmode=rawbytime and tagname = t20.ai-1.f_cv and timestamp > today

Notice that the value at t2 does not show up in a RawByTime query because it is a held sample. The held sample would appear in a current value query, but not in any other sampling mode:

select * from ihRawData where samplingmode=CurrentValue and tagname = t20.ai-1.f_cv

The points should accurately reflect the true time period for which the data quality was bad.

Shutting down and restarting the archiver forces it to write the held sample. Running the same SQL query would show that all 3 samples would be stored due to the change in data quality.

Archive Compression of Straight Line

In this case we have a straight flat line. Assume a small archive compression deadband, say 2% on a 0 to 100 EGU span. Since data occurs on a straight flat line, the deadband will never be exceeded.
Time Value Quality
t0 2 Good
t0+5 2 Good
t0+10 2 Good
t0+15 2 Good
t0+20 2 Good

Shut down and restart the archiver, then perform the following SQL query:

select * from ihRawData where samplingmode=rawbytime and tagname = t20.ai-1.f_cv and timestamp > today

Only t0 and t0+20 were stored: t0 is the first point, and t0+20 is the held sample written to disk on archiver shutdown, even though the deadband was never exceeded.

Bad Data

This example demonstrates that repeated bad values are not stored. Assume a large archive compression deadband, such as 75%.
Time Value Quality
t0 2 Good
t0+5 2 Bad
t0+10 2 Bad
t0+15 2 Bad
t0+20 2 Good
t0+25 3 Good
  • The t0+5 value is stored because of the change in data quality.
  • The t0+10 value is not stored because repeated bad values are not stored.
  • The t0+15 value is stored when the t0+20 comes in because of a change of quality.

Disabling Archive Compression for a Tag

Assume a large archive compression deadband, such as 75%.
Time Value Quality
t0 2 Good
t0+5 10 Good
t0+10 99 Good
t0+15 Archive compression disabled
  • The t0 value is stored because it is the first sample.
  • The t0+5 is stored when the t0+10 comes in.
  • The t0+10 is stored when archive compression is disabled for the tag.

Archive Compression of Good Data

This example demonstrates that the held value is written to disk when the deadband is exceeded.

In this case, we have an upward ramping line. Assume a large archive compression deadband, such as 75% on a 0 to 100 EGU span.

Time Value Quality
t0 2 Good
t0+5 10 Good
t0+10 10 Good
t0+15 10 Good
t0+20 99 Good

Shut down and restart the archiver, then perform the following SQL query:

select * from ihRawData where samplingmode=rawbytime and tagname = t20.ai-1.f_cv and timestamp > today

Because of archive compression, the t0+5 and t0+10 values are not stored. The t0+15 value is stored when the t0+20 arrives. The t0+20 value would not be stored until a future sample arrives, no matter how long that takes.

Determining Whether Held Values are Written During Archive Compression

When archive compression is enabled for a tag, its current value is held in memory and not immediately written to disk. When a new value is received, the actual value of the tag is compared to the expected value to determine whether or not the held value should be written to disk. If the values are sufficiently different, the held value is written. This is sometimes described as "exceeding archive compression".

Archive compression uses a deadband on the slope of the line connecting the data points, not the value or time stamp of the points themselves. The archive compression algorithm calculates the next expected value based on this slope, applies a deadband, and checks whether the new value exceeds that deadband.

The "expected" value is the value the sample would have if the line continued with the same slope. The deadband is an allowable deviation, centered on the expected value. If the new value falls within the expected value plus or minus half of the deadband, it does not exceed archive compression and the current held value is not written to disk.

Exceeding Archive Compression

EGUs are 0 to 200000 for a simulation tag.

Enter 2% archive compression. This displays as 4,000 EGU units in the administration UI.

When a sample arrives, the archiver calculates the next expected value based on the slope and time since the last value was written. Let's say that the expected value is 17,000.

The deadband of 4,000 is centered, so the archiver adds and subtracts 2,000 from the expected value. Thus, the actual value must be from 15,000 to 19,000 inclusive for it to be ignored by the compression algorithm.

In other words, the actual value must be less than 15,000 or greater than 19,000 for it to exceed compression and for the held value to be written.
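The band arithmetic above can be expressed as a short helper. This is an illustrative sketch; the function name is invented, and the values are those of the example:

```python
def compression_band(egu_lo, egu_hi, deadband_pct, expected):
    """Return the (low, high) range an incoming value must fall inside
    (inclusive) to be ignored by archive compression.  The deadband,
    converted to EGU units, is centered on the expected value."""
    deadband_egu = (egu_hi - egu_lo) * deadband_pct / 100.0
    half = deadband_egu / 2.0
    return expected - half, expected + half

# 2% of a 0..200000 EGU span is 4,000 EGU units, centered on 17,000:
low, high = compression_band(0, 200000, 2.0, 17000)  # (15000.0, 19000.0)
```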

Determining Expected Value

The archive compression algorithm calculates the expected value from the slope, the time, and an offset (computed from the previous value and its timestamp):

ExpectedValue = m_CompSlope * Time + m_CompOffset;

Where

m_CompSlope = deltaValue / deltaTime
m_CompOffset = lastValue - (m_CompSlope * LastTimeSecs)

Example: Determining Expected Value

Values arriving into the archiver for tag1 are

Time Value
t0 2
t0+5 10
t0+10 20
The expected value at time stamp t0+15 is calculated based on the samples at t0+5 and t0+10:
m_CompSlope = deltaValue / deltaTime
m_CompSlope = (20-10) / 5
m_CompSlope = 2
m_CompOffset = lastValue - (m_CompSlope * LastTimeSecs)
m_CompOffset = 20 - (2 * 10)
m_CompOffset = 0
ExpectedValue = m_CompSlope * Time + m_CompOffset; 
ExpectedValue = 2 * 15 + 0;
ExpectedValue = 30

The expected value at t0+15 is 30.
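The same calculation can be written as a small function. This is an illustrative sketch (the function name is invented); times are expressed in seconds from t0:

```python
def expected_value(prev_time, prev_value, last_time, last_value, next_time):
    """Extrapolate the expected value at next_time from the last two samples,
    using the slope/offset formulas from this section."""
    comp_slope = (last_value - prev_value) / (last_time - prev_time)
    comp_offset = last_value - comp_slope * last_time
    return comp_slope * next_time + comp_offset

# Samples at t0+5 (value 10) and t0+10 (value 20); expected value at t0+15:
ev = expected_value(5, 10, 10, 20, 15)  # 30.0
```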

Archive Compression of a Ramping Tag

An iFIX tag is associated with an RA register. This value ramps up to 100 then drops immediately to 0.

Assume a 5-second poll time in Historian. How much archive compression can be performed to still "store" the same information?

With an archive compression of 75%, 25% or 5%, only the change in direction is logged:
11-Mar-2003 19:31:40.000 0.17 Good NonSpecific
11-Mar-2003 19:32:35.000 90.17 Good NonSpecific 
11-Mar-2003 19:32:40.000 0.17 Good NonSpecific
11-Mar-2003 19:33:35.000 91.83 Good NonSpecific 
11-Mar-2003 19:33:40.000 0.17 Good NonSpecific 
An archive compression of 1% stores the most samples.

An archive compression of 0% logs every incoming sample. Even on a perfectly ramping signal with no deviations, 0% compression conserves no storage space and essentially disables archive compression.

Archive Compression of a Drifting Tag

A drifting tag is one that ramps up, but whose value barely falls within the deadband each time. Even though a new held sample is created and the previous one discarded, the slope is not updated unless the deadband is exceeded. With a properly chosen deadband, this is irrelevant: by specifying a certain deadband, the user is saying that the process is tolerant of changes within that deadband and that they do not need to be logged.

Archive Compression of a Filling Tank

In the case of a filling tank, the value (representing the fill level) ramps up, then stops. In this case, the system also uses collector compression, so when the value stops ramping, no more data is sent to the archiver. At some point in the future, the value will begin increasing again.

As long as nothing is sent to the archiver, no raw samples are stored. During this flat time (or plateau), the data appears flat in an interpolated retrieval because the last stored point is stretched forward. This illustrates why you should use interpolated retrieval methods on archive-compressed data.

How Archive Compression Timeout Works

The Archive Compression Timeout value defines the maximum period of time for which a value can be held. If a value has been held longer than the timeout period, the next data point is considered to exceed the deadband, regardless of the actual data received or the calculated slope.

After the value is written due to an archive compression timeout period, the timeout timer is reset and compression continues as normal.
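Under the description above, the timeout check might be sketched as follows. This is illustrative only (the function name is invented, and the archiver's actual bookkeeping is not documented here):

```python
def exceeds_with_timeout(incoming_time, held_time, timeout_secs,
                         exceeds_deadband):
    """A held sample older than the timeout forces the next data point
    to be treated as exceeding the deadband, regardless of its value."""
    return exceeds_deadband or (incoming_time - held_time) > timeout_secs
```

For example, with a 60-second timeout, a sample arriving 100 seconds after the held sample forces a write even when the deadband test alone would not.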

Archive De-fragmentation - An Overview

What is de-fragmentation? Archive de-fragmentation is the process of reorganizing an IHA file so that the data values for each tag are stored in contiguous blocks.

A Historian IHA (Historian Data Archive) file can contain data values for multiple tags, and the data written for a particular tag may not be in contiguous blocks. This means the data values in IHA files are fragmented.

Archive de-fragmentation dramatically improves the performance of reading and processing archive data.

De-fragmentation of existing archives:
  • An archive can be de-fragmented when it is not active.
  • A command-line tool can be run on any existing archive as needed.
  • De-fragmentation can be done on all versions of archives, and the resulting archive will be the latest version.
  • The de-fragmentation must be started manually.

De-fragmenting an existing archive

About this task

A command line based tool is available to de-fragment an existing archive. This tool can be run on any existing archive(s). After the de-fragmentation, the new archive will be slightly smaller in size than the original one.

To de-fragment an archive using the tool:

Procedure

  1. Select a read-only archive to back up.
  2. Back up the archive.
  3. Transfer the archive from the production machine to another machine.
  4. Run the ihArchiveDefrag tool on the archive.
    -------------------------------------------------------------------------------
    Usage (Single File Mode): Operates on one archive at a time.
    
    ihArchiveDefrag [-v][-d][-y][-h][-s] SrcArchive [DestArchive] [Config.ihc]
    
    -------------------------------------------------------------------------------
    Usage (Directory Mode): Processes all archives in the source directory.
    
    ihArchiveDefrag [-v][-d][-y][-h][-s] SrcDir [DestDir] [Config.ihc]
    
    -------------------------------------------------------------------------------
    
          [-v]   Verbose logging to the logfile.
          [-c]   Skip the defrag step.  Only compare/verify the archives.
          [-d]   Skip the verify step.  Only defrag the archive.
          [-h]   Displays this help information.
          [-s]   Source is on a SSD.  Skip the pre-read step as it is not needed.
          [-y]   Yes.  Don't ask questions, answer YES.  Just do the work.
    
      * The Config.ihc is only needed for 4.0 or older archives
      * If the DestArchive is not provided, a name will be generated.
      * In directory mode, if the DestDir is not provided a new directory called 'DefragFiles' will be created under the SrcDir.
    
    Examples:
    
       ihArchiveDefrag User_Node_Archive001.iha
       ihArchiveDefrag User_Node_Archive001.iha D:\Output\User_Node_Archive001.iha
       ihArchiveDefrag -y c:\Historian\Archives\Backups
   ihArchiveDefrag -y c:\Historian\Archives\Backups d:\DefragOutput
    
    Note:
    • As part of this process the source archive MAY be modified. It is HIGHLY recommended that this tool be run on a disposable copy of the archive, just in case it is modified.
    • De-fragmenting an archive is a disk intensive operation and can be slow (minutes to hours to run). Running on a machine with memory greater than twice the archive size is helpful.
    • If the performance is not satisfactory it is recommended that the source archive be on an SSD when running ihArchiveDefrag.
    • It is recommended that de-fragmentation should not be run on a production system as it can affect the performance of the production system.
  5. Transfer the resulting new archive back to the production machine.
  6. Unload the existing archive.
  7. Load the new de-fragmented archive.

About Storing Future Data

You can store future data in Historian. This future data is the predicted data, which has a future timestamp. You can use this data to perform a predictive or forecast analysis, and revise the forecasting algorithms as needed.

The data is stored in the Historian Archiver. You can use it to analyze both historical and future data (for example, using trend charts) and take the necessary actions. This allows you to refine the way data to be stored in Historian is received and processed.

By default, Historian stores up to 1200 seconds of future data. However, if you enable storage of future data, you can store data up to the following timestamp: 03:14:07 on Tuesday, 19 January, 2038.
Note: You can store future data beyond the license expiry date of Historian. For example, even if the license expires by May 31, 2022, you can store future data that is predicted till 19 January, 2038. However, after the license expires, you cannot use this feature.
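The 2038 limit is the largest value of a signed 32-bit Unix timestamp, which can be checked directly (Python is used here purely for illustration):

```python
from datetime import datetime, timezone

# The largest signed 32-bit Unix timestamp (2**31 - 1 seconds after
# 1 January 1970 UTC), converted to a UTC datetime:
limit = datetime.fromtimestamp(2**31 - 1, tz=timezone.utc)
print(limit.isoformat())  # 2038-01-19T03:14:07+00:00
```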

You can store future data related to all the data types used in Historian. You can store future data for all the tags associated with a data store.

The following collectors support future data (that is, collect the future timestamp of the data):
  • The OPC DA collector
  • The OPCUA DA collector
  • The OPCHDA collector

To use this feature, you must enable the storage of future data. You can do this using sample programs or Command Line.

You can then retrieve or extract this data using any of the available options such as Historian Administrator, REST APIs, Excel Add-In, and the Historian Interactive SQL application (ihsql.exe).

The following conditions apply when you store future data:
  • You can enable the future data storage only for a data store.
  • You can collect future data only using the OPC Data Access, OPCUA Data Access, and OPC HDA collectors. You cannot collect future data using Calculation collectors such as Server-to-Server, Server-to-Distributor, and Calculation.
  • When you select the Last 10 Values option for a tag, the results include the last ten values up to the current date and time; the results do not include future data. If you want to view future data, you can filter the data based on the start and end dates.
  • Future data is stored until 19 January, 2038.
    Note: You can store future data beyond the license expiry date of Historian. For example, even if the license expires by May 31, 2022, you can store future data that is predicted till 19 January, 2038. However, after the license expires, you cannot use this feature.
Best practices:
  • For a tag for which you want to store future data, do not store past data.
  • As this feature can lead to out-of-order data, use it carefully to avoid load on the server. For example, store future data in chronological order.
Known Issues:
  • Data recalculation does not work for future data.

Suppose you have enabled the storage of future data for a data store named FDataStore.

Suppose the current time is 9.00 am, April 13, 2020, future data is stored from 11.00 am onwards, and a size-based archive is created to store the data.

Data can be stored in the archive file only from 11.00 am onwards.

Scenario 1: The timestamp of the data is in the past. For example: 11.00 am, April 12, 2020. A new archive file will be created to store this data. Therefore, to reduce load on the server, we strongly recommend that you store future data in the chronological order of time.

Scenario 2: The timestamp of the data is beyond the outside-active-hours of the archive file. For example: 11.00 am, December 12, 2020. Suppose the current archive file is active only for today. Data will not be stored in the archive file because the timestamp of the data falls beyond the outside-active-hours value of the archive file. To avoid this issue, a new archive file must be created in such scenarios. To do so, you must enable offline archive creation.

Scenario 3: The timestamp of the data is much further in the future. For example: 11.00 am, April 12, 2025. The current archive file may be used to store the data for this timestamp as well. This results in an optimum usage of the archive file instead of creating multiple archive files.

Enable Storing Future Data Using Command Line

Before you begin

Ensure that you have a mechanism to generate future data that you want to store in Historian.

About this task

By default, Historian stores up to 1200 seconds of future data. However, if you enable storage of future data, you can store data up to the following timestamp: 03:14:07 on Tuesday, 19 January, 2038. This topic describes how to enable storage of future data using Command Line. You can perform this task for any data store other than the default USER data store.

Procedure

  1. Stop the Historian DataArchiver service.
  2. Open Command Prompt with elevated privileges or administrator privileges.
  3. Navigate to the folder in which the ihDataArchiver_x64.exe file is located. By default, it is C:\Program Files\Proficy\Proficy Historian\x64\Server.
  4. Run the following command: ihDataArchiver_x64 OPTION.<data store name> ihArchiverAllowFutureDataWrites 1
  5. Run the following command: ihDataArchiver_x64 OPTION.<data store name> ihArchiveActiveHours <number of active hours>
  6. Start the Historian DataArchiver service.

Enable Storage of Future Data

Before you begin

Ensure that you have a mechanism to generate future data that you want to store in Historian.

About this task

By default, Historian stores up to 1200 seconds of future data. However, if you enable storage of future data, you can store data up to the following timestamp: 03:14:07 on Tuesday, 19 January, 2038. This topic describes how to enable storage of future data.

Procedure

  1. For each data store for which you want to store future data:
    1. Enable the ihArchiverAllowFutureDataWrites option.
      Tip: You can perform this step using the sample programs or Command Line.
    2. Enable the ihArchiveCreateOfflineArchive option.

      This is to avoid receiving an outside-active-hours error. It happens if you attempt to store data when the current archive file is set to read-only.

      Tip: You can perform this step using Command Line.
  2. For each collector that sends data to the data store, set the Time Assigned By property to Source.
    Note: You can perform this step only for the OPC Data Access, OPCUA Data Access, and OPC Classic HDA collectors.

Results

Future data is now stored in Historian. You can retrieve the data using any of the available options such as Historian Administrator, REST APIs, Excel Add-In, and the Historian Interactive SQL application (ihsql.exe).

Sample Program to Enable Future Data

Using C++

The following lines of code provide a sample program to enable the ihArchiverAllowFutureDataWrites option using C++:
void SetFutureDataWriteOptions(void)
{
	// Read "1" (enable) or "0" (disable); the string is passed directly
	// as the option value below.
	ihChar enable[50];
	printf("\n\n Please enter 1/0 for enable/disable future data write per data store: ");
	_getws(enable);

	// Read the name of the data store to configure.
	ihChar dataStoreName[50];
	printf("\n\n Please enter data store name to enable future data writes: ");
	_getws(dataStoreName);

	// Set the ihArchiverAllowFutureDataWrites option for that data store.
	ihAPIStatus Status;
	ihOptionEx option;
	option.Option = ihArchiverAllowFutureDataWrites;
	option.DataStoreName = dataStoreName;
	Status = ihArchiveSetOption(SrvHdl, &option, enable);
}

void GetFutureDataWriteOptions()
{
	ihChar dataStoreName[50];
	printf("\n\n Please enter the name of the data store Future Data write Option:  ");
	_getws(dataStoreName);

	ihAPIStatus	 Status;
	ihChar			*Value;
	ihOptionEx temp;

	temp.Option = ihArchiverAllowFutureDataWrites;
	temp.DataStoreName = dataStoreName;

	Status = ihArchiveGetOption(SrvHdl,&temp,&Value);
	
	printf("Archive Future Data Write Option ihArchiverAllowFutureDataWrites value is  - [%ls]  for the data store [%ls]\n", (Value ? Value : L"NULL"), dataStoreName);
	Pause();
}

Using C#

The following lines of code provide a sample program to enable the ihArchiverAllowFutureDataWrites option using C#:
static void Main(string[] args)
        {
            string swtValue = "0";
            string dsName = "";
            ConsoleKeyInfo kInfo;
            connection = ClientConnect.NewConnection;
            do
            {
                Console.WriteLine("1 Set/Enable Data Store Create Offline Archives Option Value");
                Console.WriteLine("2 Set/Enable Data Store Allow Future Data Writes Option Value");
                Console.WriteLine("3 Get Data Store Create Offline Archives Option Value");
                Console.WriteLine("4 Get Data Store Allow Future Data Writes Option Value");
                Console.WriteLine("Enter Option Value");
                swtValue = Console.ReadLine();
                switch (swtValue)
                {
                    case "1":
                        Console.WriteLine("Enter Data Store Name");
                        dsName = Console.ReadLine();
                        connection.IServer.SetOption(HistorianOption.ServerCreateOfflineArchives, "1", dsName);
                        Console.WriteLine("Option value set to " + connection.IServer.GetOption(HistorianOption.ServerCreateOfflineArchives, dsName));
                        Console.ReadKey();
                        break;
                    case "2":
                        Console.WriteLine("Enter Data Store Name");
                        dsName = Console.ReadLine();
                        connection.IServer.SetOption(HistorianOption.ArchiverAllowFutureDataWrites, "1", dsName);
                        Console.WriteLine("Option value set to " + connection.IServer.GetOption(HistorianOption.ArchiverAllowFutureDataWrites, dsName));
                        Console.ReadKey();
                        break;
                    case "3":
                        Console.WriteLine("Enter Data Store Name");
                        dsName = Console.ReadLine();
                        Console.WriteLine("Option value is: " + connection.IServer.GetOption(HistorianOption.ServerCreateOfflineArchives, dsName));
                        Console.ReadKey();
                        break;
                    case "4":
                        Console.WriteLine("Enter Data Store Name");
                        dsName = Console.ReadLine();
                        Console.WriteLine("Option value is: " + connection.IServer.GetOption(HistorianOption.ArchiverAllowFutureDataWrites, dsName));
                        Console.ReadKey();
                        break;
                    default:
                        Console.WriteLine("Please enter valid number");
                        Console.ReadKey();
                        break;
                }
                Console.WriteLine("Please Enter X to exit");
                kInfo = Console.ReadKey();

            } while (kInfo.Key != ConsoleKey.X);
        }
    }

Enable Offline Archive Creation

About this task

This topic describes how to enable offline archive creation. This is to avoid receiving an outside-active-hours error. It happens if you attempt to store data when the current archive file is set to read-only.

Procedure

  1. Stop the Historian DataArchiver service.
  2. Open Command Prompt with elevated privileges or administrator privileges.
  3. Navigate to the folder in which the ihDataArchiver_x64.exe file is located. By default, it is C:\Program Files\Proficy\Proficy Historian\x64\Server.
  4. To enable the ihArchiveCreateOfflineArchive option for a data store, run the following command: ihDataArchiver_x64 OPTION.<data store name> ihArchiveCreateOfflineArchive 1
  5. To enable the ihArchiverAllowFutureDataWrites option:
    1. Run the following command: ihDataArchiver_x64 OPTION.<data store name> ihArchiverAllowFutureDataWrites 1
    2. Set the Read Only After (Hrs) property by running the following command (specify the value as a number of hours): ihDataArchiver_x64 OPTION.<data store name> ihArchiveActiveHours <number of active hours>
  6. Start the Historian DataArchiver service.