Analytic Management

Adding and Validating an Analytic Using REST APIs

The Analytics Catalog is a repository for hosting analytics. This example guides you through the process of adding an analytic to the Analytics Catalog.

About This Task

The high-level steps are as follows.

  • (Optional) Update the catalog taxonomy structure for the analytic classification.
  • Create an analytic catalog entry with the attributes for the analytic.
  • Upload the executable to the catalog.
  • Validate that the analytic runs in the Cloud Foundry environment (deploy and run the analytic, then retrieve the results).

Procedure

  1. Optional: Set up the taxonomy by issuing the following REST API request.
    POST <catalog_uri>/api/v1/catalog/taxonomy

    The catalog employs a hierarchical classification structure (called a taxonomy) to organize the analytics. When adding an analytic to the catalog, you will identify the taxonomy location for the analytic. The catalog allows you to tailor the taxonomy to your needs.

    The following is a sample JSON that could be sent using the POST request to add branches to the taxonomy. For more information, see About Analytic Taxonomy.

    {
        "node_name": "Analytics",
        "child_nodes": [
            {
                "node_name": "Diagnostic",
                "child_nodes": []
            },
            {
                "node_name": "Descriptive",
                "child_nodes": []
            },
            {
                "node_name": "Predictive",
                "child_nodes": []
            },
            {
                "node_name": "Prescriptive",
                "child_nodes": []
            }
        ]
    }
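
    The following Python sketch posts this taxonomy using the requests library. It is a minimal illustration, not part of the API reference: the catalog base URI, bearer token, and Predix-Zone-Id header value are placeholder assumptions you must replace with the values for your deployment.

    import requests

    CATALOG_URI = "https://<catalog_uri>"            # placeholder base URI
    HEADERS = {
        "Authorization": "Bearer <your UAA token>",  # assumed auth scheme
        "Predix-Zone-Id": "<your zone id>",          # assumed required header
    }

    taxonomy = {
        "node_name": "Analytics",
        "child_nodes": [
            {"node_name": "Diagnostic", "child_nodes": []},
            {"node_name": "Descriptive", "child_nodes": []},
            {"node_name": "Predictive", "child_nodes": []},
            {"node_name": "Prescriptive", "child_nodes": []},
        ],
    }

    resp = requests.post(CATALOG_URI + "/api/v1/catalog/taxonomy",
                         json=taxonomy, headers=HEADERS)
    resp.raise_for_status()   # raises if the catalog rejects the request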
    
  2. Create an analytic catalog entry, which stores analytic attributes, by issuing the following REST API request.
    POST <catalog_uri>/api/v1/catalog/analytics

    The following sample request creates an analytic catalog entry. Valid values for the supportedLanguage field are: java, matlab, python, and python_3.

    {
      "taxonomyLocation": "/Analytics/Diagnostic",
      "supportedLanguage": "Matlab",
      "author": "Predix Analytics team",
      "version": "v1",
      "name": "Anomaly Detection"
    }

    The following sample shows a response body containing JSON defining the analytic's attributes.

    {
        "author": "Predix Analytics team",
        "customMetadata": null,
        "taxonomyLocation": "/Analytics/Diagnostic",
        "createdTimestamp": "2015-09-18T16:16:50+00:00",
        "updatedTimestamp": "2015-09-18T16:16:50+00:00",
        "supportedLanguage": "Matlab",
        "description": null,
        "version": "v1",
        "name": "Anomaly Detection",
        "id": "09718078-95e7-4b58-b74a-152838f03b41"
    }
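
    A minimal Python sketch of this call follows; the base URI, token, and zone ID are placeholder assumptions, and the id field captured from the response is the catalog entry ID used in later steps.

    import requests

    CATALOG_URI = "https://<catalog_uri>"            # placeholder base URI
    HEADERS = {"Authorization": "Bearer <your UAA token>",
               "Predix-Zone-Id": "<your zone id>"}   # assumed headers

    entry = {
        "taxonomyLocation": "/Analytics/Diagnostic",
        "supportedLanguage": "matlab",
        "author": "Predix Analytics team",
        "version": "v1",
        "name": "Anomaly Detection",
    }

    resp = requests.post(CATALOG_URI + "/api/v1/catalog/analytics",
                         json=entry, headers=HEADERS)
    resp.raise_for_status()
    catalog_entry_id = resp.json()["id"]             # used by later steps
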
  3. Attach an executable analytic file to the analytic catalog entry by issuing the following REST API request.
    Note: The maximum file size for upload is 250 MB. The maximum expanded analytic file size (including directories and files) is 500 MB.
    POST <catalog_uri>/api/v1/catalog/artifacts

    The request is a multipart/form-data type with the following parts.

    Name            Value                                                       Required/Optional
    file            Executable analytic file.                                   Required
    catalogEntryId  Analytic catalog entry ID. For an example, see the          Required
                    sample response in the previous step.
    type            File type (must contain the value 'executable').            Required
    description     Up to 1,000 character description of the executable file.   Optional
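
    A minimal Python sketch of the upload follows. The file name, base URI, token, and zone ID are placeholder assumptions; requests builds the multipart/form-data body from the files and data arguments.

    import requests

    CATALOG_URI = "https://<catalog_uri>"            # placeholder base URI
    HEADERS = {"Authorization": "Bearer <your UAA token>",
               "Predix-Zone-Id": "<your zone id>"}   # assumed headers

    # "anomaly-detection.jar" is a hypothetical executable file name.
    with open("anomaly-detection.jar", "rb") as f:
        resp = requests.post(
            CATALOG_URI + "/api/v1/catalog/artifacts",
            headers=HEADERS,
            files={"file": f},                       # the executable part
            data={
                "catalogEntryId": "09718078-95e7-4b58-b74a-152838f03b41",
                "type": "executable",
                "description": "Anomaly detection executable",
            },
        )
    resp.raise_for_status()

    The same call attaches the analytic template in the next step, with type set to 'template' and the template file in the file part.
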
  4. Attach an analytic template to the analytic catalog entry by issuing the following REST API request.
    POST <catalog_uri>/api/v1/catalog/artifacts

    The request is a multipart/form-data type with the following parts.

    Name            Value                                                       Required/Optional
    file            Analytic template file.                                     Required
    catalogEntryId  Analytic catalog entry ID. For an example, see the          Required
                    sample response in step 2.
    type            File type (must contain the value 'template').              Required
    description     Up to 1,000 character description of the template file.     Optional

    For more information on analytic templates, see Hierarchical Analytic Template: Defining the Analytic's Input/Output Structure.

  5. Deploy and validate the analytic with test input data.

    The test input data is used to invoke the analytic execution after the analytic is deployed successfully. For example, the demo-adder sample analytic can accept the following sample input data:

    {"number1": 123, "number2": 456}

    Choose from the following methods to pass the input data.

    • The input data can be sent directly in the body of the validation request. In this case, issue the following REST API request.
      POST <catalog_uri>/api/v1/catalog/analytics/{id}/validation
    • The input data can be uploaded as an artifact attached to the analytic (see Attaching Additional Artifacts to an Analytic Catalog Entry). This can be useful if you have a large dataset, or if you plan on reusing the dataset. In this case, issue the following REST API request with the artifact ID as a query parameter:
      POST <catalog_uri>/api/v1/catalog/analytics/{analyticId}/validation?input_id={artifactId}

    In either case, the request to deploy and validate an analytic is an asynchronous request. The response contains a validationRequestId that can later be used to retrieve the status and results of the validation. The following is a sample response from a validation request.

    {
        "analyticId": "09718078-95e7-4b58-b74a-152838f03b41",
        "validationRequestId": "dfcaa0d4-40f1-471d-9f66-e72a23a30266",
        "inputData": "{\"number1\": 123, \"number2\": 456}",
        "status": "QUEUED",
        "createdTimestamp": "2015-09-18T16:53:04+00:00",
        "updatedTimestamp": "2015-09-18T16:53:04+00:00",
        "message": "Analytic validation request successfully queued - reference id is dfcaa0d4-40f1-471d-9f66-e72a23a30266",
        "result": null
    }
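
    A minimal Python sketch of the inline-data variant follows; the base URI, token, zone ID, and analytic ID are placeholder assumptions taken from the earlier samples.

    import requests

    CATALOG_URI = "https://<catalog_uri>"            # placeholder base URI
    HEADERS = {"Authorization": "Bearer <your UAA token>",
               "Predix-Zone-Id": "<your zone id>"}   # assumed headers
    ANALYTIC_ID = "09718078-95e7-4b58-b74a-152838f03b41"

    resp = requests.post(
        CATALOG_URI + "/api/v1/catalog/analytics/" + ANALYTIC_ID + "/validation",
        json={"number1": 123, "number2": 456},       # demo-adder test input
        headers=HEADERS,
    )
    resp.raise_for_status()
    validation_request_id = resp.json()["validationRequestId"]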
  6. Retrieve the analytic validation status by issuing the following REST API request.

    Use the validationRequestId from the previous step to check the status of the deployment and validation. The status can be QUEUED, PROCESSING, or COMPLETED. If the status is COMPLETED, the response includes the analytic execution result.

    GET <catalog_uri>/api/v1/catalog/analytics/{id}/validation/{validationRequestId}

    The following sample response shows a successful deployment and validation status check.

    {
        "analyticId": "09718078-95e7-4b58-b74a-152838f03b41",
        "validationRequestId": "dfcaa0d4-40f1-471d-9f66-e72a23a30266",
        "inputData": "{\"number1\": 123, \"number2\": 456}",
        "status": "COMPLETED",
        "createdTimestamp": "2015-09-18T16:53:04+00:00",
        "updatedTimestamp": "2015-09-18T16:55:41+00:00",
        "message": "Analytic validation completed successfully.",
        "result": "{\"result\":579}"
    }

    Once the status returns COMPLETED, the message field in the response indicates whether the validation attempt succeeded or failed. If it succeeded, the result field contains the analytic output data. If it failed, the result field is empty and the message field displays the failure reason.
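
    A minimal Python sketch that polls until the validation completes follows; the IDs, base URI, token, zone ID, and polling interval are placeholder assumptions.

    import time
    import requests

    CATALOG_URI = "https://<catalog_uri>"            # placeholder base URI
    HEADERS = {"Authorization": "Bearer <your UAA token>",
               "Predix-Zone-Id": "<your zone id>"}   # assumed headers
    ANALYTIC_ID = "09718078-95e7-4b58-b74a-152838f03b41"
    REQUEST_ID = "dfcaa0d4-40f1-471d-9f66-e72a23a30266"

    url = (CATALOG_URI + "/api/v1/catalog/analytics/" + ANALYTIC_ID +
           "/validation/" + REQUEST_ID)
    while True:
        body = requests.get(url, headers=HEADERS).json()
        if body["status"] == "COMPLETED":
            print(body["message"], body["result"])
            break
        time.sleep(5)                                # arbitrary interval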

About Running a Single Analytic

If you have a self-contained data set, you can run (execute) a single analytic without providing an analytic template or using the orchestration runtime. There are three methods to run a single analytic.

Note: Asynchronous execution must be used in a production environment. Use synchronous execution only when testing or validating an analytic in a test environment. Always use the asynchronous option if analytic execution will take longer than 30 seconds to complete.

Running an Analytic Synchronously Using the Analytics Catalog

Run a single analytic synchronously as follows.

Before You Begin

The analytic must be already validated and deployed to the Cloud Foundry environment before it can be run. For more information and detailed steps, see Adding and Validating an Analytic Using REST APIs.

Note: Use synchronous execution only when testing or validating an analytic in a test environment. If an analytic takes longer than 30 seconds during synchronous execution, you may receive a 500 HTTP error. Always use the asynchronous option if analytic execution will take longer than 30 seconds to complete. Asynchronous execution must be used in a production environment.

Issue the following REST API request. Provide the analytic input JSON in the body of the request each time the analytic is run.

POST <catalog_uri>/api/v1/catalog/analytics/{id}/execution

If you are running the analytic multiple times with the same input data, or you have a large data set, you can upload the data as an artifact attached to the analytic (see Attaching Additional Artifacts to an Analytic Catalog Entry) to be retrieved when calling the analytic execution API. You can then call the analytic execution API without a request body, but with the artifact ID as a query parameter, as in the following endpoint.

POST <catalog_uri>/api/v1/catalog/analytics/{analyticId}/execution?inputId={artifactId}
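
A minimal Python sketch of the synchronous call with an inline request body follows; the base URI, token, zone ID, analytic ID, and input data are placeholder assumptions.

import requests

CATALOG_URI = "https://<catalog_uri>"            # placeholder base URI
HEADERS = {"Authorization": "Bearer <your UAA token>",
           "Predix-Zone-Id": "<your zone id>"}   # assumed headers
ANALYTIC_ID = "09718078-95e7-4b58-b74a-152838f03b41"

# The call blocks until the analytic returns, so keep inputs small here.
resp = requests.post(
    CATALOG_URI + "/api/v1/catalog/analytics/" + ANALYTIC_ID + "/execution",
    json={"number1": 123, "number2": 456},
    headers=HEADERS,
)
print(resp.json())                               # the analytic output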

Running an Analytic Synchronously Using a Direct Call

Run a single analytic synchronously as follows. You can provide different data sets in the body of the request each time the analytic is run.

Before You Begin

The analytic must be already validated and deployed to the Cloud Foundry environment before it can be run. For more information and detailed steps, see Adding and Validating an Analytic Using REST APIs.

Note: Use synchronous execution only when testing or validating an analytic in a test environment. If an analytic takes longer than 30 seconds during synchronous execution, you may receive a 500 HTTP error. Always use the asynchronous option if analytic execution will take longer than 30 seconds to complete. Asynchronous execution must be used in a production environment.

Issue the following REST API request.

POST <analytic_uri>/api/v1/analytic/execution

Where <analytic_uri> is the analytic catalog entry id followed by your Predix platform domain name. For example, https://09718078-95e7-4b60-b74a-152838f03b41.analytics.run.aws-usw02-pr.ice.predix.io. For more information about how to determine the <analytic_uri>, see Deployed Analytic URI.

The following table describes the request parameters.

Parameter        Parameter Type    Data Type    Description
body             body, required    JSON string  The JSON string that will be passed to the analytic via its configured entry point in the config.json.
Predix-Zone-Id   header, required  String       The zone-id from the Analytics Catalog.
Authorization    header, required  String       The user's UAA token.

If an error occurs, the response will contain an error response object similar to the following.

{
    "code": <see troubleshooting guide>,
    "severity": <not used at this time>,
    "detail": <a detailed description of the error>,
    "message": <a summary description of the error>
}
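
A minimal Python sketch of the direct synchronous call follows; the token and zone ID are placeholder assumptions, and the analytic URI is the example URI from above.

import requests

ANALYTIC_URI = ("https://09718078-95e7-4b60-b74a-152838f03b41"
                ".analytics.run.aws-usw02-pr.ice.predix.io")
HEADERS = {"Authorization": "Bearer <your UAA token>",   # the user's UAA token
           "Predix-Zone-Id": "<your zone id>"}           # zone-id from the catalog

resp = requests.post(ANALYTIC_URI + "/api/v1/analytic/execution",
                     json={"number1": 123, "number2": 456},
                     headers=HEADERS)
body = resp.json()
if resp.ok:
    print(body)                                  # analytic output
else:
    print(body["message"], body["detail"])       # error response object fields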

Running an Analytic Asynchronously Using a Direct Call

Run a single analytic asynchronously as follows. Provide the analytic input JSON in the body of the request each time the analytic is run.

Before You Begin

The analytic must be already validated and deployed to the Cloud Foundry environment before it can be run. For more information and detailed steps, see Adding and Validating an Analytic Using REST APIs.

Procedure

  1. Run a single analytic asynchronously by issuing the following REST API request. You can provide different data sets in the body of the request each time the analytic is run.
    POST <analytic_uri>/api/v1/analytic/execution/async

    Where <analytic_uri> is the analytic catalog entry id followed by your Predix platform domain name. For example, https://09718078-95e7-4b60-b74a-152838f03b41.analytics.run.aws-usw02-pr.ice.predix.io. For more information about how to determine the <analytic_uri>, see Deployed Analytic URI.

    The following table describes the request parameters.

    Parameter        Parameter Type    Data Type    Description
    body             body, required    JSON string  The JSON string that will be passed to the analytic via its configured entry point in the config.json.
    Predix-Zone-Id   header, required  String       The zone-id from the Analytics Catalog.
    Authorization    header, required  String       The user's UAA token.

    The POST request will return the status of the execution request including a requestId that can be used to retrieve the results of the analytic. For example:
    {
        "analyticId": "09718078-95e7-4b60-b74a-152838f03b41",
        "requestId": "0203479-1123-5b89-e89c-18928734962",
        "analyticExecutionState": "QUEUED",
        "createdTimestamp": "2015-09-18T16:16:50+00:00",
        "updatedTimestamp": "2015-09-18T16:16:50+00:00"
    }
  2. Retrieve the status of the asynchronous run by issuing the following REST API request. Use the requestId obtained in step 1.
    GET <analytic_uri>/api/v1/analytic/execution/async/{requestId}/status

    The status can be one of the following: QUEUED, PROCESSING, COMPLETED, or FAILED.

  3. If the status is COMPLETED or FAILED, retrieve the results of the asynchronous run by issuing the following REST API request. Use the requestId obtained in step 1.
    GET <analytic_uri>/api/v1/analytic/execution/async/{requestId}/result
    • If the status is COMPLETED, the results will be the output from the analytic.
    • If the status is FAILED, the results will contain the exception message from the Predix platform's processing and/or the analytic execution.
    • If an error occurs in the processing, the result request will contain an error response object, as described in Running an Analytic Synchronously Using a Direct Call.
  4. Delete the status and result of the asynchronous run by issuing the following REST API request. Use the requestId obtained in step 1.
    DELETE <analytic_uri>/api/v1/analytic/execution/async/{requestId}
    Note: Both the status and the result for the request will be deleted permanently. This step is recommended to release the storage space once the result is no longer needed.
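
The whole asynchronous flow might look like the following Python sketch. The analytic URI, token, zone ID, input data, and polling interval are placeholder assumptions, and the status endpoint is assumed here to return the status value as plain text; adapt the check if your deployment returns JSON.

import time
import requests

ANALYTIC_URI = "https://<analytic_uri>"          # placeholder, see above
HEADERS = {"Authorization": "Bearer <your UAA token>",
           "Predix-Zone-Id": "<your zone id>"}   # assumed headers
BASE = ANALYTIC_URI + "/api/v1/analytic/execution/async"

# Step 1: queue the run and capture the requestId.
submitted = requests.post(BASE, json={"number1": 123, "number2": 456},
                          headers=HEADERS).json()
request_id = submitted["requestId"]

# Step 2: poll the status until the run finishes.
while True:
    state = requests.get(BASE + "/" + request_id + "/status",
                         headers=HEADERS).text.strip()   # assumed plain text
    if state in ("COMPLETED", "FAILED"):
        break
    time.sleep(5)                                # arbitrary interval

# Step 3: fetch the result; step 4: release the stored status and result.
print(requests.get(BASE + "/" + request_id + "/result", headers=HEADERS).text)
requests.delete(BASE + "/" + request_id, headers=HEADERS)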

Retrieving Analytic Catalog Entries

You can retrieve analytic catalog entries using pagination, sorting, and filtering criteria by issuing the following REST API request.

GET <catalog_uri>/api/v1/catalog/analytics 

The following sample request retrieves the third page of entries (page index 2 in a zero-based index), with 10 entries per page, sorted by analytic name in ascending order.

GET <catalog_uri>/api/v1/catalog/analytics?page=2&size=10&sortOrder=asc&sortableFields=name

The following sample request retrieves entries belonging to the /Analytics/Diagnostic taxonomy location.

GET <catalog_uri>/api/v1/catalog/analytics?taxonomyPath=/Analytics/Diagnostic
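
A minimal Python sketch combining both queries follows; the base URI, token, and zone ID are placeholder assumptions, and requests encodes the query string from the params argument.

import requests

CATALOG_URI = "https://<catalog_uri>"            # placeholder base URI
HEADERS = {"Authorization": "Bearer <your UAA token>",
           "Predix-Zone-Id": "<your zone id>"}   # assumed headers

params = {"page": 2, "size": 10,                 # third page, zero-based
          "sortOrder": "asc", "sortableFields": "name",
          "taxonomyPath": "/Analytics/Diagnostic"}
entries = requests.get(CATALOG_URI + "/api/v1/catalog/analytics",
                       params=params, headers=HEADERS).json()
print(entries)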

Deploying a Production Analytic to Cloud Foundry

Before You Begin

The analytic entry must have been added to the catalog and validated. See Adding and Validating an Analytic Using REST APIs.

About This Task

You can use the Deployment API to perform the following tasks:
  • Promote the validated analytic to a production state so it can be consumed by the Runtime service.
  • Optimize the analytic's memory usage, disk usage, and number of instances.
  • Restart a previously deployed analytic.
Note: After an analytic has been promoted to production use, you cannot update or delete its executable artifact, or validate the analytic.

Procedure

  1. Deploy the analytic by issuing the following REST API request.
    The request to deploy an analytic is an asynchronous request. The response contains a requestId that can later be used to retrieve the deployment status.
    POST <catalog_uri>/api/v1/catalog/analytics/{id}/deployment
    The following sample request configures and initiates a deployment.
    {
        "memory":<Memory size in MB>,
        "diskQuota":<Disk space in MB>,
        "instances":<Number of instances>
    }
    For example:
    {
        "memory":1024,
        "diskQuota":512,
        "instances":2
    }
    Be aware of the following:
    • Default values are used if memory, diskQuota, or instances are not given, or if the given value is smaller than the default.
    • The default values are: memory, 512 MB; disk quota, 1024 MB for Java and Python and 2048 MB for Matlab; instances, 1.
    • If the given memory and disk quota values are greater than what the Predix platform supports, the analytic deployment will fail.
    • If the given memory value exceeds 7500 MB, the analytic deployment will fail.
    • If the given number of instances exceeds two, the analytic deployment will fail.
    The following sample response shows a successful deployment.
    {
        "analyticId": "09718078-95e7-4b58-b74a-152838f03b41",
        "requestId": "dfcaa0d4-40f1-471d-9f66-e72a23a30266",
        "inputData": "{\"number1\": 123, \"number2\": 456}",
        "status": "QUEUED",
        "createdTimestamp": "2015-09-18T16:53:04+00:00",
        "updatedTimestamp": "2015-09-18T16:53:04+00:00",
        "message": "Analytic deployment request successfully queued - reference id is dfcaa0d4-40f1-471d-9f66-e72a23a30266",
        "result": null
    }
  2. Retrieve the analytic deployment status by issuing the following REST API request. Use the requestId from the previous step to check the status of the deployment. The status can be QUEUED, PROCESSING, or COMPLETED.
    GET <catalog_uri>/api/v1/catalog/analytics/{id}/deployment/{requestId}
    The following sample response shows a successful deployment status check.
    {
        "analyticId": "09718078-95e7-4b58-b74a-152838f03b41",
        "requestId": "dfcaa0d4-40f1-471d-9f66-e72a23a30266",
        "inputData": "{\"number1\": 123, \"number2\": 456}",
        "status": "COMPLETED",
        "createdTimestamp": "2015-09-18T16:53:04+00:00",
        "updatedTimestamp": "2015-09-18T16:55:41+00:00",
        "message": "Analytic deployment completed successfully.",
        "result": ""
    }

    Once the status returns COMPLETED, the message field in the response indicates whether the deployment succeeded or failed.
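
Taken together, the two steps might look like the following Python sketch; the base URI, token, zone ID, analytic ID, and polling interval are placeholder assumptions.

import time
import requests

CATALOG_URI = "https://<catalog_uri>"            # placeholder base URI
HEADERS = {"Authorization": "Bearer <your UAA token>",
           "Predix-Zone-Id": "<your zone id>"}   # assumed headers
ANALYTIC_ID = "09718078-95e7-4b58-b74a-152838f03b41"

# Step 1: request the deployment with the sample configuration.
deploy = requests.post(
    CATALOG_URI + "/api/v1/catalog/analytics/" + ANALYTIC_ID + "/deployment",
    json={"memory": 1024, "diskQuota": 512, "instances": 2},
    headers=HEADERS,
).json()
request_id = deploy["requestId"]

# Step 2: poll the deployment status until it completes.
url = (CATALOG_URI + "/api/v1/catalog/analytics/" + ANALYTIC_ID +
       "/deployment/" + request_id)
while True:
    body = requests.get(url, headers=HEADERS).json()
    if body["status"] == "COMPLETED":
        print(body["message"])
        break
    time.sleep(5)                                # arbitrary interval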

Retrieving Recent Analytic Logs

You can retrieve the recent logs from a deployed analytic. Reviewing the logs can be useful when debugging issues such as execution failures.

To retrieve the recent analytic logs, issue the following REST API request.
GET <catalog_uri>/api/v1/catalog/analytics/{id}/logs

The response will contain the recent analytic logs in text/plain format. The following is a sample response.

[fd3dc098-113a-4f28-b1c9-1fb0265ce58c [Fri Mar 18 22:43:56 UTC 2016] 2016-03-18 22:43:56,377 10572 px_correlation_id= px_zone_id= px_service= px_user_name=  [main] INFO  c.g.p.a.j.s.JavaBasedAnalyticApplication - Started JavaBasedAnalyticApplication in 9.946 seconds (JVM running for 10.999) (STDOUT, App),
fd3dc098-113a-4f28-b1c9-1fb0265ce58c [Fri Mar 18 22:48:17 UTC 2016] 2016-03-18 22:48:17,584 271779 px_correlation_id=a427cf91-c91f-4d1f-89a9-6775af8acb9a px_zone_id=<zone_id> px_service=analytic px_user_name=  [http-nio-61729-exec-3] INFO  c.g.p.a.j.JavaBasedAnalyticServiceImpl - analyticId=<analytic_id> status=STARTED data={"number1":2,"number2":4} (STDOUT, App),
fd3dc098-113a-4f28-b1c9-1fb0265ce58c [Fri Mar 18 22:48:17 UTC 2016] 2016-03-18 22:48:17,588 271783 px_correlation_id=a427cf91-c91f-4d1f-89a9-6775af8acb9a px_zone_id=<zone_id> px_service=analytic px_user_name=  [http-nio-61729-exec-3] INFO  c.g.p.a.j.JavaBasedAnalyticServiceImpl - analyticId=<analytic_id> status=COMPLETED data={"result":6} (STDOUT, App)]
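
A minimal Python sketch of the log retrieval follows; the base URI, token, zone ID, and analytic ID are placeholder assumptions.

import requests

CATALOG_URI = "https://<catalog_uri>"            # placeholder base URI
HEADERS = {"Authorization": "Bearer <your UAA token>",
           "Predix-Zone-Id": "<your zone id>"}   # assumed headers
ANALYTIC_ID = "09718078-95e7-4b58-b74a-152838f03b41"

logs = requests.get(CATALOG_URI + "/api/v1/catalog/analytics/" +
                    ANALYTIC_ID + "/logs", headers=HEADERS)
print(logs.text)                                 # text/plain log lines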

Deleting an Analytic From the Catalog

Deleting a non-subscription analytic removes it from the Analytics Catalog. Deleting a subscription analytic first unsubscribes it, and then removes it from the Analytics Catalog.

About This Task

Delete the analytic catalog entry by ID by issuing the following REST API request.

DELETE <catalog_uri>/api/v1/catalog/analytics/{id} 
Note: If the analytic has been deployed to Cloud Foundry, deleting it from the catalog also undeploys it from Cloud Foundry.
Note: An analytic that is used in an orchestration or a scheduled job cannot be deleted.

Managing Multiple Versions of an Analytic

Procedure

  1. Retrieve all versions of an analytic by issuing the following REST API request.

    For multiple versions of an analytic, the following API can be called to retrieve all versions of the analytic with the given name.

    GET <catalog_uri>/api/v1/catalog/analytics/versions?name=<analytic name> 
  2. Delete all versions of an analytic.

    If there are multiple versions of an analytic, the following API can be called to delete all versions of the analytic with the given name.

    DELETE <catalog_uri>/api/v1/catalog/analytics/{name}/versions 

About Analytic Taxonomy

Analytics hosted in the analytic catalog are categorized by their taxonomy locations. This provides a structured method for grouping related analytics and aids in analytic retrieval.

A taxonomy is a hierarchy of taxonomy nodes. The taxonomy JSON structure specification is as follows.

  • A taxonomy node contains node_name and child_nodes.
  • node_name specifies the name of the node and must not contain a forward slash ("/").
  • child_nodes contains an array of nested taxonomy nodes.
  • An empty child_nodes array indicates a leaf node.

Taxonomy locations can be added using the API, which expects the desired taxonomy in a JSON structure (as either a single node object or an array of node objects). This API is additive, allowing additional nodes to be inserted after the initial structure is loaded.

The following is an example of the expected JSON structure.

Sample Taxonomy JSON Structure

[
    {
        "node_name": "Diagnostic",
        "child_nodes": [
            {
                "node_name": "Health Assessment",
                "child_nodes": [
                    {
                        "node_name": "Fault Diagnosis",
                        "child_nodes": []
                    },
                    {
                        "node_name": "Confidence Estimation",
                        "child_nodes": []
                    },
                    {
                        "node_name": "Uncertainty Estimation",
                        "child_nodes": []
                    }
                ]
            },
            {
                "node_name": "Condition Monitoring",
                "child_nodes": [
                    {
                        "node_name": "Fault Isolation",
                        "child_nodes": []
                    },
                    {
                        "node_name": "Anomaly Detection",
                        "child_nodes": []
                    }
                ]
            }
        ]
    },
    {
        "node_name": "Descriptive",
        "child_nodes": [
            {
                "node_name": "Standard Stats Techniques",
                "child_nodes": [
                    {
                        "node_name": "Bivariate Analysis",
                        "child_nodes": [
                            {
                                "node_name": "Cross Tabulation",
                                "child_nodes": []
                            },
                            {
                                "node_name": "Scatter Plots",
                                "child_nodes": []
                            }
                        ]
                    },
                    {
                        "node_name": "Univariate-Summary",
                        "child_nodes": []
                    }
                ]
            }
        ]
    }
]

When creating or updating an analytic entry, the taxonomy location can be specified as a slash-separated path. For example:

/Diagnostic/Condition Monitoring/Anomaly Detection
/Descriptive/Standard Stats Techniques/Bivariate Analysis

Taxonomy locations must be loaded before they can be assigned to an analytic.

If the taxonomy location is not specified for an analytic entry, it uses the following default location:

/uncategorized

Adding Taxonomy Locations

A taxonomy is a hierarchical tree of categories for organizing analytics. The following API call adds the taxonomy specified in the request body to the existing taxonomy in the catalog. It does not remove any part of the existing taxonomy.

About This Task

Set up the taxonomy by issuing the following REST API request.

POST <catalog_uri>/api/v1/catalog/taxonomy

The following is a sample request body to load a taxonomy.

{
    "node_name": "Analytics",
    "child_nodes": [
        {
            "node_name": "Diagnostic",
            "child_nodes": []
        },
        {
            "node_name": "Descriptive",
            "child_nodes": []
        },
        {
            "node_name": "Predictive",
            "child_nodes": []
        },
        {
            "node_name": "Prescriptive",
            "child_nodes": []
        }
    ]
}