Taxonomy Data Loader

About the Taxonomy Data Loader

The Taxonomy Data Loader loads data from a standard Excel workbook into the Taxonomy data model. The data loader will create or update Taxonomy Categories, Classes, and Types based on the data in the Excel workbook.

About the Taxonomy Data Loader Data Model

The data for a Taxonomy location is loaded from a single Excel workbook containing a single worksheet.

The Taxonomy Data Loader General Loading Strategy

This section describes any prerequisites to loading the data and the order in which the data will be loaded.

Before You Begin

As the taxonomy structure is a foundation for analysis and reporting, users should clearly understand the usage and data model for Taxonomy related records prior to implementing the Taxonomy structure.

About This Task

Note: Before reading this section, refer to the Data Model section.

The Taxonomy data load must be performed in a specific sequence to successfully populate fields, create records, and link them to the predecessor and/or successor records.

Procedure

  1. Load Taxonomy Category.
  2. Load Taxonomy Class.
  3. Load Taxonomy Type.
  4. Load Taxonomy Mapping Values.

About the Taxonomy Data Loader Workbook Layout and Use

This section provides a high-level overview and explanation of how the data loader workbook is constructed.

In order to import data using the Taxonomy Data Loader, GE Digital provides an Excel workbook, Taxonomy.xlsx, which supports baseline data loading of Taxonomy in Predix Essentials. This workbook must be used to perform the data load. On the Taxonomy worksheets, you will enter the information to load a taxonomy structure that will be assigned to assets within Predix Essentials.

The baseline file is organized such that each row is capable of creating one node in the data model when all columns contain the appropriate values.

The following table lists the worksheets that are included in the Taxonomy Data Loader workbook.

Note: Worksheets in the workbook not being used may be left blank, but should not be deleted from the workbook.
WorksheetDescription
ConfigurationThe Configuration worksheet is needed to describe the type of data that you will be loading and how that data should be handled during the data load.
TaxonomyCategoryThis worksheet is used to link Taxonomy Category data and Taxonomy Class data.
TaxonomyClassThis worksheet is used to link Taxonomy Class data and Taxonomy Type data.
TaxonomyMappingThis worksheet is used to link Taxonomy Type data and Taxonomy Mapping data.

Configuration Worksheet

The Configuration worksheet tells the data loader what types of data are being loaded and how the data is to be loaded, and is standard for all data loads regardless of the type of data that you are loading. The following table outlines the options that are valid or the values that are expected in each of the columns on the Configuration worksheet.
Field CaptionField IDData Type (Length)Comments
Load Data From WorksheetLOAD_DATA_WORKSHEETBooleanIdentifies if data from the corresponding worksheet identified in the Data Worksheet ID column will be loaded or not.
  • True: The corresponding worksheet will be processed.
  • False: The corresponding worksheet will not be processed.
Data Worksheet IDDATA_WORKSHEET_IDCharacterThis column contains the name of the <data>worksheet where the actual data is located. It needs to have the same name as the <data> worksheet in the data loader workbook.
Batch SizeBATCH_SIZECharacterModifying this field is required to determine the number of records processed in each batch. Enter the batch size you want, and the Data Loader will process that many records per batch.

For example, if you want to use a batch size of 100, enter 100, and the data loader will process 100 records per batch.

Note: The recommended batch size is 100. If the Batch Size column is removed from the source workbook, the data loader will default to a batch size of 100.

In addition to processing the data in batches, the log file reports progress by batch.

Primary Family IDPRIMARY_FAMILY_IDCharacterDepending on the type of data that you are working with, this will contain the Relationship Family ID or the Entity Family ID. You can also allow the data in source file to determine the Family ID by encapsulating the Field ID that contains the Family ID data in brackets (<>).

For example if in the <data> worksheet there is a column with an ID of PRIMARY_FAMILY_ID, where each row contains the corresponding Family ID, then in this column you should put the value of <PRIMARY_FAMILY_ID>.

If the Family ID in the Meridium, Inc. metadata contains spaces, then you have to use this feature.

Primary Family Key FieldsPRIMARY_FAMILY_KEY_FIELDSCharacterThis column contains the Field IDs associated with the Primary Family that are used to uniquely identify a record. If more than one field is to be used, then each Field ID needs to be separated by a | (Pipe) character. In the case where you are loading data into a relationship, if no keys fields exist or are used, use the <none> constant.

If the Primary Action is ACTION_INSERTONLY, then no key fields need to be specified, so you can use the <none> constant.

Family TypeFAMILY_TYPE The value is this column should be Entity or Relationship depending on the type of data that is being loaded.
Predecessor Family IDPRED_FAMILY_IDCharacterWhen the Family Type is Relationship, this column will contain the value of the Entity Family ID that is the predecessor in the relationship. Otherwise, it should contain the <none> constant. You can also use the data in each of the rows to determine the Predecessor Family ID.
Predecessor Family Key FieldsPRED_FAMILY_KEY_FIELDSCharacterThis column contains the Field ID or IDs associated with the Predecessor Family that are used to uniquely identify the predecessor record. If more than one field is to be used, then each Field ID needs to be separated by a | (Pipe) character.

If the Predecessor Action is ACTION_INSERTONLY, then no key fields need to be specified, so you can use the <none> constant.

Successor Family IDSUCC_FAMILY_IDCharacterWhen the Family Type is Relationship, this column will contain the value of the Entity Family ID that is the successor in the relationship. Otherwise, it should contain the <none> constant. You can also use the data in each of the rows to determine the Successor Family ID.
Successor Family Key FieldsSUCC_FAMILY_KEY_FIELDSCharacterThis column contains the Field ID or IDs associated with the Successor Family that are used to uniquely identify the successor record. If more than one field is to be used, then each Field ID needs to be separated by a | (Pipe) character.

If the Successor Action is ACTION_INSERTONLY, then no key fields need to be specified, so you can use the <none> constant.

Primary ActionPRIMARY_ACTIONCharacterThe value in this column will determine the action that will be applied to the Primary Family records. If the Family Type is Entity, then the possible values are:
  • ACTION_INSERTONLY
  • ACTION_INSERTUPDATE
  • ACTION_UPDATEONLY
  • ACTION_DELETE
  • ACTION_PURGE

Deleting a record and purging a record will both delete the current record, the difference being that the purge action will delete the record and all of the links or relationships tied to that record. The delete action will simple attempt to delete the record, and if it is related to another record, the delete will fail. If The Family Type is Relationship, then the possible values are:

  • ACTION_INSERTONLY
  • ACTION_INSERTUPDATE
  • ACTION_UPDATEONLY
  • ACTION_DELETE
Predecessor ActionPRED_ACTIONCharacterThe value in this column will determine the action that will be applied to the Predecessor Family records. The possible values are:
  • ACTION_INSERTONLY
  • ACTION_INSERTUPDATE
  • ACTION_UPDATEONLY
  • ACTION_DELETE
  • ACTION_PURGE
  • ACTION_LOCATE

If The Family Type is Entity then the value needs to be ACTION_NONE

.
Successor ActionSUCC_ACTIONCharacterThe value in this column will determine the action that will be applied to the Successor Family records. The possible values are:
  • ACTION_INSERTONLY
  • ACTION_INSERTUPDATE
  • ACTION_UPDATEONLY
  • ACTION_DELETE
  • ACTION_PURGE
  • ACTION_LOCATE

If The Family Type is Entitythen the value needs to be ACTION_NONE.

Insert with Null Values?OPTION_INSERT_ON_NULLBooleanWhen setting field values on a new record, if a value coming across is NULL, the field values will be set to NULL if this option is set to True.
Update with Null Values?OPTION_UPDATE_ON_NULLBooleanWhen setting field values on an existing record, if a value coming across is NULL, the field values will be set to NULL if this option is set to True.
Replace an Existing Link?OPTION_REPLACE_EXISTING_LINKBooleanThe Replace Existing Relationship option is used to determine how a relationship is to be maintained by its cardinality definition.

For example, the relationship Location Contains Asset that is defined in the Configuration Manager. It has a cardinality defined as Zero or One to Zero or One, has a Location LP-2300, and contains the Asset P-2300. If, in the data load, you assign the Asset P-5000 to be contained in the Location LP-2300, and you have set the Replace Existing Link property to True, then the data loader will link P-5000 to LP-2300 and unlink P-2300 from LP-2300. This assumes that P-5000 is not currently linked to another location. The same is true for a relationship that is defined as Zero or One to Zero or Many, or Zero or Many to Zero or One.

Allow Change of Family?OPTION_ALLOW_CHANGE_OF_FAMILYBooleanAllows the data loader to move an entity from one family to another.

For example this would allow an entity that is currently assigned to the Centrifugal Pump family to be moved to the Reciprocating Pump family.

All relationships will be maintained as long as the family to which the entity is being moved allows the same relationships.

Note: Because of the extra processing required, by selecting this option, the interface performance will decrease.

TaxonomyCategory

Note: Each row in this worksheet represents a single asset. You should not include the same asset more than once.
Field CaptionField IDData Type (Length)Comments
Taxonomy CategorySC_TAXOCATG_TAX_CATEG_CCharacter (50)This column is used for batching.
Taxonomy Category DescriptionSC_TAXOCATG_TAX_CATEG_DESC_CCharacter (255)None

TaxonomyClass

Note: Each row in this worksheet represents a single asset. You should not include the same asset more than once.
Field CaptionField IDData Type (Length)Comments
Taxonomy CategorySC_TAXOCATG_TAX_CATEG_CCharacter (50)This column is used for batching.
Taxonomy ClassSC_TAXOCLAS_TAX_CLASS_CCharacter (50)This is a key field.
Taxonomy Class Description'SC_TAXOCLAS_TAX_CLASS_DESC_CCharacter (255)None

TaxonomyType

Field CaptionField IDData Type (Length)Comments
Taxonomy CategorySC_TAXOCATG_TAX_CATEG_CCharacter (50)This column is used for batching.
Taxonomy ClassSC_TAXOCLAS_TAX_CLASS_CCharacter (50)This is a key field.
Taxonomy TypeSC_TAXOTYPE_TAX_TYPE_CCharacter (50)This is a key field.
Taxonomy Type DescriptionSC_TAXOTYPE_TAX_TYPE_DESC_CCharacter (255)None

TaxonomyMapping Worksheet

Field CaptionField IDData Type (Length)Comments
Taxonomy Mapping CategorySC_TAXOMAPP_TAX_CATEG_CCharacter (50)This column is used for batching.
Taxonomy Mapping ClassSC_TAXOMAPP_TAX_CLASS_CCharacter (50)This is a key field.
Taxonomy Mapping TypeSC_TAXOMAPP_TAX_TYPE_CCharacter (50)This is a key field.
Taxonomy Mapping ValueSC_TAXOMAPP_TAX_MAPPI_VALUE_CCharacter (255)This is a key field.