About Sending Data to Amazon S3

You can send tag data stored in Data Archiver to Amazon S3 at a scheduled interval. The data is stored in Amazon S3 as Parquet and/or CSV files. You can specify the following parameters when exporting tag data to Amazon S3:
  • The list of tags whose data you want to export
  • The export schedule, which you can configure at a granular level
  • The format in which you want to store the tag data (CSV and/or Parquet)
Advantages of storing data in the Parquet file format:
  • Parquet files store data efficiently and optimize space.
  • Since Parquet files are compatible with most applications, you can analyze this data in analysis tools such as Amazon Athena, Amazon Redshift, Amazon Kinesis, Amazon QuickSight, Microsoft Power BI, and Tableau.
How it works:
  1. You will create an S3 bucket in which you want to store tag data.
  2. You will deploy the scheduled export. It is automatically deployed in the same VPC in which you have deployed Proficy Historian for Cloud.
  3. You will configure the settings for the scheduled export. To do so, you will edit and upload the ScheduledExportConfiguration.xml file to the S3 bucket that you have created. This file contains settings related to tags, export schedule, and the file format.
  4. Service Manager tasks are triggered based on the schedule you have configured.
  5. The Scheduled Export service reads tag data from Data Archiver based on the settings you have configured.
  6. CSV and/or Parquet files are exported into the S3 bucket that you have created. A folder structure is created based on the file format and the date of export, and files are stored in the respective folders.
  7. You can use the tag data stored in the CSV and/or Parquet files in analysis tools such as Amazon Athena.
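
Step 3 (uploading ScheduledExportConfiguration.xml to the bucket) can be scripted. The following is a minimal sketch, not the product's own tooling. It assumes that s3_client behaves like a boto3 S3 client (which provides upload_file), and that the file belongs at the bucket root; the function name upload_export_config and the default key location are illustrative assumptions. The sketch only checks that the file is well-formed XML before uploading; it does not validate the settings themselves.

```python
import xml.etree.ElementTree as ET
from pathlib import Path

# File name taken from the steps above.
CONFIG_NAME = "ScheduledExportConfiguration.xml"


def upload_export_config(s3_client, bucket, local_path, key=CONFIG_NAME):
    """Check that the file is well-formed XML, then upload it to the bucket.

    s3_client is expected to behave like a boto3 S3 client, that is, to
    expose upload_file(Filename, Bucket, Key). The default key (bucket
    root) is an assumption; use the location your deployment expects.
    """
    path = Path(local_path)
    # Raises xml.etree.ElementTree.ParseError if the XML is malformed,
    # so a broken file is never uploaded.
    ET.parse(str(path))
    s3_client.upload_file(Filename=str(path), Bucket=bucket, Key=key)
    return f"s3://{bucket}/{key}"
```

With boto3 installed and AWS credentials configured, a call might look like upload_export_config(boto3.client("s3"), "my-export-bucket", "ScheduledExportConfiguration.xml"), where the bucket name is a placeholder.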
Workflow:
Step Number | Description | Notes
1 | Deploy Proficy Historian for AWS. | This step is required to install the Historian server in an EC2 instance in a VPC.
2 | Install collectors and create a collector instance. | This step is required if you want to collect data using collectors. For information on choosing a collector, refer to Choosing a Collector.
3 | Deploy Scheduled Export to Parquet/CSV Service. | This step is required to install the components that will export tag data from Data Archiver to Amazon S3 in the Parquet and/or CSV file format.
4 | Configure the export settings. | This step is required. It involves providing an XML file with the list of tags, export schedule, and the file format for the export.
After you perform these steps, tag data will be exported into S3 based on the specified schedule. This data will be available in Parquet and/or CSV files. You can then analyze this data using analysis tools.
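
As an illustration of analyzing an exported CSV file outside a dedicated analysis tool, the following sketch computes the sample count and mean value per tag using only the Python standard library. The column names (TagName, Timestamp, Value, Quality) and the sample rows are hypothetical; they are not the documented export layout, so adjust tag_col and value_col to match the header of your actual files.

```python
import csv
import io
from collections import defaultdict
from statistics import mean

# Hypothetical sample data; the column names are assumptions,
# not the documented layout of the exported files.
SAMPLE_CSV = """TagName,Timestamp,Value,Quality
Pump1.Flow,2024-01-01T00:00:00Z,10.0,Good
Pump1.Flow,2024-01-01T00:01:00Z,14.0,Good
Pump2.Flow,2024-01-01T00:00:00Z,7.5,Good
Pump2.Flow,2024-01-01T00:01:00Z,bad-reading,Bad
"""


def summarize_by_tag(csv_file, tag_col="TagName", value_col="Value"):
    """Return {tag: (sample_count, mean_value)} for rows with numeric values."""
    values = defaultdict(list)
    for row in csv.DictReader(csv_file):
        try:
            tag, value = row[tag_col], float(row[value_col])
        except (KeyError, ValueError, TypeError):
            continue  # skip rows with missing columns or non-numeric values
        values[tag].append(value)
    return {tag: (len(v), mean(v)) for tag, v in values.items()}


summary = summarize_by_tag(io.StringIO(SAMPLE_CSV))
for tag, (count, avg) in sorted(summary.items()):
    print(f"{tag}: {count} samples, mean {avg:.2f}")
```

For a file downloaded from the S3 bucket, pass an open text file handle (for example, open(path, newline="")) instead of the in-memory sample.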