Data Quality

About Data Quality

The Data Quality module allows you to assess the quality of the equipment data and work orders associated with the Computerized Maintenance Management System (CMMS) of your organization, and its readiness to be used to generate asset performance analytics.

The Industrial Data Diagnostics Data Quality module analyzes your CMMS data to assess data completeness, accuracy, and standardization in the key data fields needed for asset performance analytics. Industrial Data Diagnostics lists all data quality issues to be addressed and prioritizes them based on their impact on multiple asset performance metrics, the effort required, and the complexity of resolving them, so that you can maximize your ROI when deciding what to work on first. Each issue is explained and includes concrete, actionable recommendations on how to improve the data. The impact of these data quality issues is also assessed in terms of the ability to calculate reliability and maintenance metrics and asset performance analytics.

You can use this feature to perform the following actions:
  • Identify the most important issues for equipment and work order data.
  • Analyze different trends in your data to understand if the issue is getting better or worse over time.
  • Compare the data quality across the different sites in your organization.
  • Rectify data input omissions, inaccuracies, and errors through a set of recommendations for common causes of these issues.
  • Understand the impact of data quality issues on the trustworthiness of the reliability metrics and data analytics.

Improving the quality of the CMMS data is a continuous process. The quality of your data can only improve by constant feedback and utilization of data. If the people responsible for entering the data in your system can see how the data is being used, they will make an effort to capture the correct data. By making better asset performance improvement decisions, your organization can focus on the most critical failures and in turn reduce maintenance costs and production losses.

GE recommends the following workflow while assessing and utilizing data for decision-making purposes:

  • The first step is to measure data quality using several dimensions, such as data completeness, standardization, accuracy, and timeliness. Industrial Data Diagnostics provides out-of-the-box metrics that measure data quality across these four dimensions.
  • Next, identify the areas of good versus bad data. Industrial Data Diagnostics quantifies each metric as the percentage of work orders or equipment that have a particular data quality issue and prioritizes the issues based on their impact. You can identify the areas of good and bad data using the Issues table in the Industrial Data Diagnostics Data Quality module.
  • The good data can already be used for asset performance analytics, either in your solution or in Industrial Data Diagnostics.
  • The areas that need improvement can be improved in two ways:
    • Future data can be improved by implementing best practices for the people, processes, and technology of the maintenance organization. Industrial Data Diagnostics provides a set of recommendations that identify common solutions for resolving these issues.
    • The historical data can be mined using natural language processing. For example, the free text fields for work order description can be analyzed to identify how the asset failed.
  • The good data from both these sources can then be used for asset performance analytics.
  • Finally, to sustain the improvements, you need to monitor and track data quality. Industrial Data Diagnostics provides trends on each data quality issue to help users see if the issue is getting better or worse.
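
The historical-data step in the workflow above mentions mining free-text work order descriptions to identify how an asset failed. As a rough illustration only, the following Python sketch uses plain keyword matching as a stand-in for natural language processing. The keyword lists, the label names, and the function name tag_failure_modes are hypothetical and are not part of Industrial Data Diagnostics.

```python
import re

# Hypothetical keyword lists (assumption). A real deployment would rely on a
# maintained failure-mode taxonomy or a trained language model instead.
FAILURE_KEYWORDS = {
    "leak": ["leak", "seep"],
    "overheating": ["overheat", "high temp"],
    "vibration": ["vibrat"],
}


def tag_failure_modes(description):
    """Return the sorted failure-mode labels whose keywords appear in the text."""
    text = description.lower()
    found = set()
    for mode, keywords in FAILURE_KEYWORDS.items():
        for keyword in keywords:
            # Anchor at a word boundary so "leak" also matches "leaking".
            if re.search(r"\b" + re.escape(keyword), text):
                found.add(mode)
                break
    return sorted(found)


print(tag_failure_modes("Pump seal leaking, high vibration noted"))
```

Keyword matching misses misspellings and synonyms, so it should be read as a starting point for exploring the descriptions rather than a replacement for a proper text-mining approach.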
The Data Quality module can be accessed by selecting the Data Quality button on the ribbon. It consists of three tabs:
  • Metric Confidence
  • Issues
  • Insights

Metric Confidence

The Metric Confidence section helps companies understand how far they can trust the reliability metrics calculated in Industrial Data Diagnostics and APM, based on the data quality of the specific fields needed to calculate these metrics.

This screen allows you to compare the metric confidence for the sites in your organization. By default, the top 5 worst quality sites are chosen for display to focus your attention on the most important candidates for improvements.

The left panel shows all the sites in the organization and users can select or deselect specific sites.

The right panel shows the metric confidence score for each selected site. The confidence score for each of the 15 reliability metrics offered in Industrial Data Diagnostics is displayed as High, Medium, Low, Very Low, or Not Available (no data). The score is based on the data quality of the specific fields that are required to calculate the metric.

Users can expand each reliability metric to understand which data fields are relevant to each metric confidence score. Understanding the crucial link between data entry and its impact on performance metrics is critical to motivate the team to take data collection seriously.

For example, the Average Corrective Cost metric score depends on the following four data quality issues:
  • Maintenance cost is not captured on closed work orders.
  • Work Order types are missing.
  • Work Orders written at higher levels of functional location hierarchy.
  • Repair work orders are kept open for a long time.

Each of these issues affects the ability to calculate the Average Corrective Cost metric. For example, if maintenance cost is missing, the metric is underestimated. Similarly, if repair work orders are kept open for a long time, some cost elements might not yet have been captured on the work order. Therefore, each of these issues affects the overall metric confidence score.

A reliability metric such as Average Corrective Cost is given a High, Medium, Low, or Very Low score based on the issue with the largest percentage score among all the issues that affect it.

For example, in the screenshot above, Maintenance cost is not captured on closed work orders was the biggest issue, with 100% of work orders missing maintenance cost. Hence, the Average Corrective Cost metric has a Very Low score.
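
The scoring rule described above (the metric takes its score from the issue with the largest percentage) can be sketched in Python as follows. This is only an illustration: the cut-off values in THRESHOLDS are hypothetical, because the thresholds that separate High, Medium, Low, and Very Low are not stated here, and all sample percentages other than the 100% figure are invented for the example.

```python
# Hypothetical cut-offs (assumption): upper bound of the worst issue
# percentage for each confidence label.
THRESHOLDS = [(10.0, "High"), (30.0, "Medium"), (60.0, "Low")]


def metric_confidence(issue_percentages):
    """Score a metric from the largest issue percentage among its issues."""
    if not issue_percentages:
        return "Not Available"
    worst = max(issue_percentages.values())
    for upper_bound, label in THRESHOLDS:
        if worst <= upper_bound:
            return label
    return "Very Low"


# Issues behind Average Corrective Cost; only the 100% value comes from the
# example in the text, the other percentages are made up.
average_corrective_cost_issues = {
    "Maintenance cost is not captured on closed work orders": 100.0,
    "Work Order types are missing": 4.0,
    "Work Orders written at higher levels of functional location hierarchy": 12.5,
    "Repair work orders are kept open for a long time": 8.0,
}

print(metric_confidence(average_corrective_cost_issues))  # Very Low
```

Taking the maximum rather than an average means that a single badly affected field is enough to lower confidence in the metric, which matches the example of 100% missing maintenance cost.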

Hyperlinks are shown for issues with a percentage score greater than 0%. You can select these links to go to the Site Comparison page, which is discussed in the Site Comparison section.

PRIORITIZED DATA QUALITY ISSUES AND RECOMMENDATIONS

The PRIORITIZED DATA QUALITY ISSUES AND RECOMMENDATIONS tab contains a set of data quality issues that are frequently found when assessing the quality of asset data. This page shows, for each of these issues, the percentage of records with the issue across the entire enterprise.

The following columns are displayed in the Issues and Recommendations table:
  • PRIORITY: Industrial Data Diagnostics prioritizes the issues based on the following:
    • Impact of the data quality issue on asset performance analytics. For example, missing work order costs lead to underestimated cost metrics, such as Average Corrective Work Cost.
    • Time taken to implement recommended actions to improve data quality. For example, would it take a few months to make improvements or a longer time?
    • Complexity of improvement. For example, a system change might be easier to implement than a cultural change.
  • ISSUE: This describes the data quality issue.
  • % RECORDS WITH ISSUE WITHIN DATE RANGE: This field is calculated as follows:

    (Number of work orders or equipment with data quality issue) / (Total work orders or equipment)

  • % WOS WITH ISSUE PREVIOUS MONTH: This field calculates the percentage of work orders written in the last month that had the particular issue.
  • RECOMMENDED ACTION: This column describes some common actions that you can take to resolve the particular issue and improve data quality.
  • IMPACT OF ISSUE: This column describes the impact of the issue on specific metrics and data analytics. For example, if work order maintenance cost is missing, the cost metrics like Average Corrective Work Cost are underestimated.
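
The % RECORDS WITH ISSUE calculation in the column list above can be illustrated with a short Python sketch. The record layout (status and cost keys) and the function names are assumptions made for the example. The ratio is multiplied by 100 here so that the result reads as a percentage.

```python
def percent_with_issue(records, has_issue):
    """Percentage of records flagged by has_issue, or None if there are no records."""
    total = len(records)
    if total == 0:
        return None
    flagged = sum(1 for record in records if has_issue(record))
    return 100.0 * flagged / total


def missing_cost_on_closed(work_order):
    """Example issue check: a closed work order with no maintenance cost."""
    return work_order["status"] == "closed" and work_order["cost"] is None


# Hypothetical sample data.
work_orders = [
    {"id": "WO-1", "status": "closed", "cost": 1200.0},
    {"id": "WO-2", "status": "closed", "cost": None},
    {"id": "WO-3", "status": "open", "cost": None},
    {"id": "WO-4", "status": "closed", "cost": 310.5},
]

print(percent_with_issue(work_orders, missing_cost_on_closed))  # 25.0
```

The same function could be applied to only the work orders written in the previous month to approximate the % WOS WITH ISSUE PREVIOUS MONTH column.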

Hyperlinks are shown for issues with a percentage score greater than 0%. You can select these links to go to the Site Comparison page, discussed in the next section.

The site panel is shown on the left, and all sites in the company are selected by default. You can clear the check boxes for specific sites to exclude them from the analysis, or clear all sites and then select the check boxes for the few sites that you want to compare.

Note: The Equipment Hierarchy filter allows you to view the issues and metric confidence related to a particular Equipment Category, Class, or Type.
Note: The issues in the Data Quality module fall into four primary categories: Completeness, Accuracy, Standardization, and Timeliness. By default, they are shown as a flat list. To view them grouped by category, select the Group by data quality categories link in the upper-right corner.

Site Comparison

When you click the hyperlinks available in the Issues and Recommendations table, a detailed analysis can be performed by comparing the issue for each site in the organization.

The percentage of work orders or equipment with that particular data quality issue is shown on the y-axis, and the month and year are shown on the x-axis. This trend analysis for the selected issue gives an idea of whether the issue has improved or is getting worse over time.

Below the trend chart, a table shows the percentage with issue for each site in the selected list.
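
The trend described above, a percentage with issue per site and per month, can be approximated with the following Python sketch. The field names (site, created, has_issue) and the ISO date format are assumptions made for the example and do not reflect the Industrial Data Diagnostics data model.

```python
from collections import defaultdict


def monthly_issue_trend(work_orders):
    """Map (site, "YYYY-MM") to the percentage of work orders with the issue."""
    counts = defaultdict(lambda: [0, 0])  # (site, month) -> [flagged, total]
    for work_order in work_orders:
        # Assumes "created" is an ISO date string such as "2023-01-15".
        key = (work_order["site"], work_order["created"][:7])
        counts[key][1] += 1
        if work_order["has_issue"]:
            counts[key][0] += 1
    return {
        key: 100.0 * flagged / total
        for key, (flagged, total) in sorted(counts.items())
    }


# Hypothetical sample data.
sample = [
    {"site": "A", "created": "2023-01-15", "has_issue": True},
    {"site": "A", "created": "2023-01-20", "has_issue": False},
    {"site": "A", "created": "2023-02-03", "has_issue": False},
    {"site": "B", "created": "2023-01-09", "has_issue": True},
]

for (site, month), pct in monthly_issue_trend(sample).items():
    print(site, month, pct)
```

Reading the values for one site in month order shows whether the issue is improving or getting worse at that site.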

The site panel is shown on the left, and all sites in the company are selected by default. You can clear the check boxes for specific sites to exclude them from the analysis, or clear all sites and then select the check boxes for the few sites that you want to compare.

Work Orders or Equipment Details

From the Site Comparison screen, you can select a site of interest to open a table that lists the work orders or equipment that have the specific issue. This list includes identifying information, such as the work order number, equipment ID, and request ID (notification ID for SAP), that can help with the investigation process.

You can then go back to your CMMS, analyze the specific work orders that have a data quality issue, and identify the root causes behind the missing or inaccurate data. This helps you identify the issues in your CMMS and implement some of the recommendations suggested in the Issues and Recommendations tab.

You can use the View functional location distribution link to access the Data Quality Insights section that allows you to view the functional location distribution insights to understand how the work orders and equipment are spread throughout different hierarchy levels.
Note: This option appears only when you are viewing one specific issue: work orders written at higher levels of the functional location hierarchy.

Data Quality Insights

The Data Quality Insights section allows you to view the functional location distribution insights to understand how the work orders and equipment are spread throughout different hierarchy levels.

The left panel shows all the sites in the organization. You can select or deselect specific sites to add them to or remove them from the display.

By default, the right panel lists the top five worst quality sites to focus your attention on the most important candidates for improvement. You can expand each site to view the functional location table, which gives insights on the hierarchy level, count, and cost of work orders and equipment.
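
The kind of summary shown in the functional location table (count and cost per hierarchy level) can be sketched in Python as follows. The field names hierarchy_level and cost are hypothetical, and treating a missing cost as zero is an assumption made for this example.

```python
from collections import defaultdict


def functional_location_distribution(work_orders):
    """Count work orders and sum their cost for each hierarchy level."""
    summary = defaultdict(lambda: {"count": 0, "cost": 0.0})
    for work_order in work_orders:
        level = work_order["hierarchy_level"]
        summary[level]["count"] += 1
        # Missing cost is counted as 0.0 here (assumption).
        summary[level]["cost"] += work_order.get("cost") or 0.0
    return dict(sorted(summary.items()))


# Hypothetical sample data: level 2 is higher in the hierarchy than level 5.
sample_orders = [
    {"hierarchy_level": 2, "cost": 100.0},
    {"hierarchy_level": 2, "cost": None},
    {"hierarchy_level": 5, "cost": 40.0},
]

for level, stats in functional_location_distribution(sample_orders).items():
    print(level, stats["count"], stats["cost"])
```

A large share of work orders at the upper hierarchy levels suggests that work is being recorded against areas or systems rather than against individual pieces of equipment.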

Access the Metric Confidence Section

Procedure

  1. Access Industrial Data Diagnostics.
  2. Select the Data Quality tab.
    The Metric Confidence section appears, displaying the reliability metrics confidence score for the top 5 worst data quality sites.

Access the PRIORITIZED DATA QUALITY ISSUES AND RECOMMENDATIONS Section

Procedure

  1. Access Industrial Data Diagnostics.
  2. Select the Data Quality tab.
  3. Select the Issues tab.
    The PRIORITIZED DATA QUALITY ISSUES AND RECOMMENDATIONS section appears, displaying frequently found data issues, the percentage of asset data having the issues, and the actions recommended by Industrial Data Diagnostics for the corresponding assets.

Access the Insights Section

Procedure

  1. Access Industrial Data Diagnostics.
  2. Select the Data Quality tab.
  3. Select the Insights tab.
    The DATA QUALITY INSIGHTS section appears, displaying the top five worst data quality sites. You can expand each site to view the functional location table that shows hierarchy level, count, and cost of work orders and equipment information.

Data Normalization

When the data is imported into the Industrial Data Diagnostics database, it is mapped to the Industrial Data Diagnostics standard fields. Data normalization displays the number of assets and work orders that are imported, the number that are accepted, and the number that are excluded.
Note: Required fields are checked before the mapping process.
Data normalization is divided into two categories:
  • Equipment Acceptance: Displays the percentage of equipment accepted, the percentage of equipment excluded and included for Data Quality Analysis, and the percentage of equipment excluded and included for Asset Performance Analysis.
  • Work History Acceptance: Displays the percentage of work orders accepted, the percentage of work orders excluded and included for Data Quality Analysis, and the percentage of work orders excluded and included for Asset Performance Analysis.
    Note: Select a reason to view the records that are excluded.
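
The acceptance figures described above can be illustrated with a small Python sketch that derives accepted and excluded percentages, along with a count per exclusion reason, from a list of imported records. The exclusion_reason field and the reason text are hypothetical and are not the names used by Industrial Data Diagnostics.

```python
from collections import Counter


def acceptance_summary(records):
    """Summarize imported records: accepted and excluded percentages plus reasons."""
    total = len(records)
    if total == 0:
        return {"imported": 0, "accepted_pct": None, "excluded_pct": None, "reasons": {}}
    # A record counts as excluded when it carries a non-empty exclusion reason.
    reasons = Counter(
        record["exclusion_reason"]
        for record in records
        if record.get("exclusion_reason")
    )
    excluded = sum(reasons.values())
    return {
        "imported": total,
        "accepted_pct": 100.0 * (total - excluded) / total,
        "excluded_pct": 100.0 * excluded / total,
        "reasons": dict(reasons),
    }


# Hypothetical sample data.
imported = [
    {"id": "WO-1", "exclusion_reason": None},
    {"id": "WO-2", "exclusion_reason": "Missing equipment ID"},
    {"id": "WO-3", "exclusion_reason": None},
    {"id": "WO-4", "exclusion_reason": None},
]

print(acceptance_summary(imported))
```

Grouping the excluded records by reason mirrors the option to select a reason and view the records that were excluded for it.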
The following files are generated when you import the data to Industrial Data Diagnostics for the first time:
  • Field Mapping: The Download Field Mappings link opens a spreadsheet that displays the mapping of the data fields to the Industrial Data Diagnostics' standard data fields.
  • Code Mapping: The Download Code Mappings link opens a spreadsheet that displays the mapping of the data to the Industrial Data Diagnostics' standard data.