Taxonomy Data Loader

About the Taxonomy Data Loader

The Taxonomy Data Loader loads data from a standard Excel workbook into the Taxonomy data model. The data loader will create or update Taxonomy Categories, Classes, and Types based on the data in the Excel workbook.

About the Taxonomy Data Loader Data Model

The data for a Taxonomy location is loaded from a single Excel workbook containing a single worksheet.

The Taxonomy Data Loader General Loading Strategy

This section describes any prerequisites to loading the data and the order in which the data will be loaded.

Before You Begin

As the taxonomy structure is a foundation for analysis and reporting, users should clearly understand the usage and data model for Taxonomy related records prior to implementing the Taxonomy structure.

About This Task

Note: Before reading this section, refer to the Data Model section.

The Taxonomy data load must be performed in a specific sequence to successfully populate fields, create records, and link them to the predecessor and/or successor records.

Procedure

Load Taxonomy Category.
Load Taxonomy Class.
Load Taxonomy Type.
Load Taxonomy Mapping Values.

About the Taxonomy Data Loader Workbook Layout and Use

This section provides a high-level overview and explanation of how the data loader workbook is constructed.

In order to import data using the Taxonomy Data Loader, GE Digital provides an Excel workbook, Taxonomy.xlsx, which supports baseline data loading of Taxonomy in Predix Essentials. This workbook must be used to perform the data load. On the Taxonomy worksheets, you will enter the information to load a taxonomy structure that will be assigned to assets within Predix Essentials.

The baseline file is organized such that each row is capable of creating one node in the data model when all columns contain the appropriate values.

The following table lists the worksheets that are included in the Taxonomy Data Loader workbook.

Note: Worksheets in the workbook not being used may be left blank, but should not be deleted from the workbook.


Worksheet	Description
Configuration	The Configuration worksheet is needed to describe the type of data that you will be loading and how that data should be handled during the data load.
TaxonomyCategory	This worksheet is used to link Taxonomy Category data and Taxonomy Class data.
TaxonomyClass	This worksheet is used to link Taxonomy Class data and Taxonomy Type data.
TaxonomyMapping	This worksheet is used to link Taxonomy Type data and Taxonomy Mapping data.

Configuration Worksheet

The Configuration worksheet tells the data loader what types of data are being loaded and how the data is to be loaded, and is standard for all data loads regardless of the type of data that you are loading. The following table outlines the options that are valid or the values that are expected in each of the columns on the Configuration worksheet.


Field Caption	Field ID	Data Type (Length)	Comments
Load Data From Worksheet	LOAD_DATA_WORKSHEET	Boolean	Identifies if data from the corresponding worksheet identified in the Data Worksheet ID column will be loaded or not. True: The corresponding worksheet will be processed. False: The corresponding worksheet will not be processed.
Data Worksheet ID	DATA_WORKSHEET_ID	Character	This column contains the name of the <data>worksheet where the actual data is located. It needs to have the same name as the <data> worksheet in the data loader workbook.
Batch Size	BATCH_SIZE	Character	Modifying this field is required to determine the number of records processed in each batch. Enter the batch size you want, and the Data Loader will process that many records per batch. For example, if you want to use a batch size of 100, enter 100, and the data loader will process 100 records per batch. Note: The recommended batch size is 100. If the Batch Size column is removed from the source workbook, the data loader will default to a batch size of 100. In addition to processing the data in batches, the log file reports progress by batch.
Primary Family ID	PRIMARY_FAMILY_ID	Character	Depending on the type of data that you are working with, this will contain the Relationship Family ID or the Entity Family ID. You can also allow the data in source file to determine the Family ID by encapsulating the Field ID that contains the Family ID data in brackets (<>). For example if in the <data> worksheet there is a column with an ID of PRIMARY_FAMILY_ID, where each row contains the corresponding Family ID, then in this column you should put the value of `<PRIMARY_FAMILY_ID>`. If the Family ID in the Meridium, Inc. metadata contains spaces, then you have to use this feature.
Primary Family Key Fields	PRIMARY_FAMILY_KEY_FIELDS	Character	This column contains the Field IDs associated with the Primary Family that are used to uniquely identify a record. If more than one field is to be used, then each Field ID needs to be separated by a \| (Pipe) character. In the case where you are loading data into a relationship, if no keys fields exist or are used, use the <none> constant. If the Primary Action is ACTION_INSERTONLY, then no key fields need to be specified, so you can use the <none> constant.
Family Type	FAMILY_TYPE		The value is this column should be `Entity` or `Relationship` depending on the type of data that is being loaded.
Predecessor Family ID	PRED_FAMILY_ID	Character	When the Family Type is Relationship, this column will contain the value of the Entity Family ID that is the predecessor in the relationship. Otherwise, it should contain the <none> constant. You can also use the data in each of the rows to determine the Predecessor Family ID.
Predecessor Family Key Fields	PRED_FAMILY_KEY_FIELDS	Character	This column contains the Field ID or IDs associated with the Predecessor Family that are used to uniquely identify the predecessor record. If more than one field is to be used, then each Field ID needs to be separated by a \| (Pipe) character. If the Predecessor Action is ACTION_INSERTONLY, then no key fields need to be specified, so you can use the <none> constant.
Successor Family ID	SUCC_FAMILY_ID	Character	When the Family Type is Relationship, this column will contain the value of the Entity Family ID that is the successor in the relationship. Otherwise, it should contain the <none> constant. You can also use the data in each of the rows to determine the Successor Family ID.
Successor Family Key Fields	SUCC_FAMILY_KEY_FIELDS	Character	This column contains the Field ID or IDs associated with the Successor Family that are used to uniquely identify the successor record. If more than one field is to be used, then each Field ID needs to be separated by a \| (Pipe) character. If the Successor Action is ACTION_INSERTONLY, then no key fields need to be specified, so you can use the <none> constant.
Primary Action	PRIMARY_ACTION	Character	The value in this column will determine the action that will be applied to the Primary Family records. If the Family Type is Entity, then the possible values are: ACTION_INSERTONLY ACTION_INSERTUPDATE ACTION_UPDATEONLY ACTION_DELETE ACTION_PURGE Deleting a record and purging a record will both delete the current record, the difference being that the purge action will delete the record and all of the links or relationships tied to that record. The delete action will simple attempt to delete the record, and if it is related to another record, the delete will fail. If The Family Type is Relationship, then the possible values are: ACTION_INSERTONLY ACTION_INSERTUPDATE ACTION_UPDATEONLY ACTION_DELETE
Predecessor Action	PRED_ACTION	Character	The value in this column will determine the action that will be applied to the Predecessor Family records. The possible values are: ACTION_INSERTONLY ACTION_INSERTUPDATE ACTION_UPDATEONLY ACTION_DELETE ACTION_PURGE ACTION_LOCATE If The Family Type is Entity then the value needs to be ACTION_NONE .
Successor Action	SUCC_ACTION	Character	The value in this column will determine the action that will be applied to the Successor Family records. The possible values are: ACTION_INSERTONLY ACTION_INSERTUPDATE ACTION_UPDATEONLY ACTION_DELETE ACTION_PURGE ACTION_LOCATE If The Family Type is Entitythen the value needs to be ACTION_NONE.
Insert with Null Values?	OPTION_INSERT_ON_NULL	Boolean	When setting field values on a new record, if a value coming across is NULL, the field values will be set to NULL if this option is set to True.
Update with Null Values?	OPTION_UPDATE_ON_NULL	Boolean	When setting field values on an existing record, if a value coming across is NULL, the field values will be set to NULL if this option is set to True.
Replace an Existing Link?	OPTION_REPLACE_EXISTING_LINK	Boolean	The Replace Existing Relationship option is used to determine how a relationship is to be maintained by its cardinality definition. For example, the relationship Location Contains Asset that is defined in the Configuration Manager. It has a cardinality defined as Zero or One to Zero or One, has a Location LP-2300, and contains the Asset P-2300. If, in the data load, you assign the Asset P-5000 to be contained in the Location LP-2300, and you have set the Replace Existing Link property to True, then the data loader will link P-5000 to LP-2300 and unlink P-2300 from LP-2300. This assumes that P-5000 is not currently linked to another location. The same is true for a relationship that is defined as Zero or One to Zero or Many, or Zero or Many to Zero or One.
Allow Change of Family?	OPTION_ALLOW_CHANGE_OF_FAMILY	Boolean	Allows the data loader to move an entity from one family to another. For example this would allow an entity that is currently assigned to the Centrifugal Pump family to be moved to the Reciprocating Pump family. All relationships will be maintained as long as the family to which the entity is being moved allows the same relationships. Note: Because of the extra processing required, by selecting this option, the interface performance will decrease.

TaxonomyCategory

Note: Each row in this worksheet represents a single asset. You should not include the same asset more than once.


Field Caption	Field ID	Data Type (Length)	Comments
Taxonomy Category	SC_TAXOCATG_TAX_CATEG_C	Character (50)	This column is used for batching.
Taxonomy Category Description	SC_TAXOCATG_TAX_CATEG_DESC_C	Character (255)	None

TaxonomyClass

Note: Each row in this worksheet represents a single asset. You should not include the same asset more than once.


Field Caption	Field ID	Data Type (Length)	Comments
Taxonomy Category	SC_TAXOCATG_TAX_CATEG_C	Character (50)	This column is used for batching.
Taxonomy Class	SC_TAXOCLAS_TAX_CLASS_C	Character (50)	This is a key field.
Taxonomy Class Description	'SC_TAXOCLAS_TAX_CLASS_DESC_C	Character (255)	None

TaxonomyType


Field Caption	Field ID	Data Type (Length)	Comments
Taxonomy Category	SC_TAXOCATG_TAX_CATEG_C	Character (50)	This column is used for batching.
Taxonomy Class	SC_TAXOCLAS_TAX_CLASS_C	Character (50)	This is a key field.
Taxonomy Type	SC_TAXOTYPE_TAX_TYPE_C	Character (50)	This is a key field.
Taxonomy Type Description	SC_TAXOTYPE_TAX_TYPE_DESC_C	Character (255)	None

TaxonomyMapping Worksheet


Field Caption	Field ID	Data Type (Length)	Comments
Taxonomy Mapping Category	SC_TAXOMAPP_TAX_CATEG_C	Character (50)	This column is used for batching.
Taxonomy Mapping Class	SC_TAXOMAPP_TAX_CLASS_C	Character (50)	This is a key field.
Taxonomy Mapping Type	SC_TAXOMAPP_TAX_TYPE_C	Character (50)	This is a key field.
Taxonomy Mapping Value	SC_TAXOMAPP_TAX_MAPPI_VALUE_C	Character (255)	This is a key field.