Portworx Alerts
Portworx lets you monitor your cluster using alerts. It provides a predefined set of alerts, which are listed below. Alerts are classified into the following types, based on the resource on which they are raised:
- Cluster
- Nodes
- Disks
- Volumes
- Pools
Each alert has one of the following severity levels:
- INFO
- WARNING
- ALARM
Note that the Severity column in the table below uses the labels NOTIFY, WARN (or WARNING), and ALARM.
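Because the severity labels are not spelled the same way everywhere in this document, a script that consumes alerts may want to normalize them first. The following is a minimal sketch, not part of Portworx: the `Severity` enum, the `parse_severity` and `at_least` helpers, and in particular the mapping of NOTIFY to INFO are assumptions made for illustration.

```python
from enum import Enum


class Severity(Enum):
    """Severity levels as listed in this document, ordered by urgency."""
    INFO = 1
    WARNING = 2
    ALARM = 3


# Assumption: NOTIFY corresponds to INFO, and WARN is shorthand for WARNING.
# The document does not state this mapping explicitly.
_LABELS = {
    "INFO": Severity.INFO,
    "NOTIFY": Severity.INFO,
    "WARN": Severity.WARNING,
    "WARNING": Severity.WARNING,
    "ALARM": Severity.ALARM,
}


def parse_severity(label: str) -> Severity:
    """Map a severity label (any case, surrounding spaces allowed) to a level."""
    key = label.strip().upper()
    if key not in _LABELS:
        raise ValueError(f"unknown severity label: {label!r}")
    return _LABELS[key]


def at_least(label: str, minimum: Severity) -> bool:
    """Return True if the labeled severity is at or above the given minimum."""
    return parse_severity(label).value >= minimum.value
```

For example, `at_least("ALARM", Severity.WARNING)` is true, while `at_least("NOTIFY", Severity.WARNING)` is false under the assumed mapping.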
List of Alerts
Alert Code | Alert Type | Severity | Resource Type | Description |
---|---|---|---|---|
0 | DriveOperationFailure | ALARM | DRIVE | Triggered when a drive operation such as add or replace fails. |
1 | DriveOperationSuccess | NOTIFY | DRIVE | Triggered when a drive operation such as add or replace succeeds. |
2 | DriveStateChange | WARN | DRIVE | Triggered when there is a change in the drive state, for example when free disk space goes below the recommended level of 10%. |
3 | VolumeOperationFailureAlarm | ALARM | VOLUME | Triggered when a volume operation fails. Volume operations could be resize, cloudsnap, etc. The alert message will give more info about the specific error case. |
4 | VolumeOperationSuccess | NOTIFY | VOLUME | Triggered when a volume operation such as resize succeeds. |
5 | VolumeStateChange | WARN | VOLUME | Triggered when there is a change in the state of the volume. |
6 | VolGroupOperationFailure | ALARM | CLUSTER | Triggered when a volume group operation fails. |
7 | VolGroupOperationSuccess | NOTIFY | CLUSTER | Triggered when a volume group operation succeeds. |
8 | VolGroupStateChange | WARN | CLUSTER | Triggered when a volume group’s state changes. |
9 | NodeStartFailure | ALARM | CLUSTER | Triggered when a node in the Portworx cluster fails to start. |
10 | NodeStartSuccess | NOTIFY | CLUSTER | Triggered when a node in the Portworx cluster successfully initializes. |
11 | >Internal PX Alert< | - | - | Alert code used for internal Portworx bookkeeping. |
12 | NodeJournalHighUsage | ALARM | CLUSTER | Triggered when a node’s timestamp journal usage is not within limits. |
13 | IOOperation | ALARM | VOLUME | Triggered when an IO operation such as Block Read/Block Write fails. |
14-16 | >Internal PX Alerts< | - | - | Alert codes used for internal Portworx bookkeeping. |
17 | PXInitFailure | ALARM | NODE | Triggered when Portworx fails to initialize on a node. |
18 | PXInitSuccess | NOTIFY | NODE | Triggered when Portworx successfully initializes on a node. |
19 | PXStateChange | WARN | NODE | Triggered when the Portworx daemon shuts down in error. |
20 | VolumeOperationFailureWarn | WARN | VOLUME | Triggered when a volume operation fails. Volume operations could be resize, cloudsnap, etc. The alert message will give more info about the specific error case. |
21 | StorageVolumeMountDegraded | ALARM | NODE | Triggered when Portworx storage enters degraded mode on a node. |
22 | ClusterManagerFailure | ALARM | NODE | Triggered when the cluster manager on a Portworx node fails to start. The alert message will give more info about the specific error case. |
23 | KernelDriverFailure | ALARM | NODE | Triggered when an incorrect Portworx kernel module is detected. Indicates that Portworx is started with an incorrect version of the kernel module. |
24 | NodeDecommissionSuccess | NOTIFY | CLUSTER | Triggered when a node is successfully decommissioned from the Portworx cluster. |
25 | NodeDecommissionFailure | ALARM | CLUSTER | Triggered when a node could not be decommissioned from the Portworx cluster. |
26 | NodeDecommissionPending | WARN | CLUSTER | Triggered when a node decommission is kept in a pending state because the node has data that is not replicated on other nodes. |
27 | NodeInitFailure | ALARM | CLUSTER | Triggered when Portworx fails to initialize on a node. |
28 | >Internal PX Alert< | - | - | Alert code used for internal Portworx bookkeeping. |
29 | NodeScanCompletion | NOTIFY | NODE | Triggered when node media scan completes without error. |
30 | VolumeSpaceLow | ALARM | VOLUME | Triggered when the free space available in a volume goes below a threshold. |
31 | ReplAddVersionMismatch | WARN | VOLUME | Triggered when a volume HA update fails with version mismatch. |
32 | CloudsnapScheduleFailure | ALARM | NODE | Triggered if a cloudsnap schedule fails to configure. |
33 | CloudsnapOperationUpdate | NOTIFY | VOLUME | Triggered if a cloudsnap schedule is changed successfully. |
34 | CloudsnapOperationFailure | ALARM | VOLUME | Triggered when a cloudsnap operation fails. |
35 | CloudsnapOperationSuccess | NOTIFY | VOLUME | Triggered when a cloudsnap operation succeeds. |
36 | NodeMarkedDown | WARN | CLUSTER | Triggered when a Portworx node marks another node down as it is unable to connect to it. |
37 | VolumeCreateSuccess | NOTIFY | VOLUME | Triggered when a volume is successfully created. |
38 | VolumeCreateFailure | ALARM | VOLUME | Triggered when a volume creation fails. |
39 | VolumeDeleteSuccess | NOTIFY | VOLUME | Triggered when a volume is successfully deleted. |
40 | VolumeDeleteFailure | ALARM | VOLUME | Triggered when a volume deletion fails. |
41 | VolumeMountSuccess | NOTIFY | VOLUME | Triggered when a volume is successfully mounted at the requested path. |
42 | VolumeMountFailure | ALARM | VOLUME | Triggered when a volume cannot be mounted at the requested path. |
43 | VolumeUnmountSuccess | NOTIFY | VOLUME | Triggered when a volume is successfully unmounted. |
44 | VolumeUnmountFailure | ALARM | VOLUME | Triggered when a volume cannot be unmounted. The alert message provides more info about the specific error case. |
45 | VolumeHAUpdateSuccess | NOTIFY | VOLUME | Triggered when a volume’s replication factor (HA factor) is successfully updated. |
46 | VolumeHAUpdateFailure | ALARM | VOLUME | Triggered when an update to volume’s replication factor (HA factor) fails. |
47 | SnapshotCreateSuccess | NOTIFY | VOLUME | Triggered when a volume snapshot is successfully created. |
48 | SnapshotCreateFailure | ALARM | VOLUME | Triggered when a volume snapshot creation fails. |
49 | SnapshotRestoreSuccess | NOTIFY | VOLUME | Triggered when a snapshot is successfully restored on a volume. |
50 | SnapshotRestoreFailure | ALARM | VOLUME | Triggered when the operation of restoring a snapshot fails. |
51 | SnapshotIntervalUpdateFailure | ALARM | VOLUME | Triggered when an update of the snapshot interval for a volume fails. |
52 | SnapshotIntervalUpdateSuccess | NOTIFY | VOLUME | Triggered when a snapshot interval of a volume is successfully updated. |
53 | PXReady | NOTIFY | NODE | Triggered when Portworx is ready on a node. |
54 | StorageFailure | ALARM | NODE | Triggered when the provided storage drives could not be mounted by Portworx. |
55 | ObjectstoreFailure | ALARM | NODE | Triggered when an object store error is detected. |
56 | ObjectstoreSuccess | NOTIFY | NODE | Triggered upon a successful object store operation. |
57 | ObjectstoreStateChange | NOTIFY | NODE | Triggered in response to an object store state change. |
58 | LicenseExpiring | WARN | CLUSTER | Triggered starting 7 days before the installed or trial license expires (e.g. “PX-Enterprise license will expire in 6 days, 12:00”). It continues to trigger after the license has expired (e.g. “Trial license expired 4 days, 06:22 ago”). |
59 | VolumeExtentDiffSlow | WARN | VOLUME | Triggered when a volume extent diff is taking too long. |
60 | VolumeExtentDiffOk | WARN | VOLUME | Triggered when a volume extent diff is okay. |
61 | SharedV4SetupFailure | WARN | NODE | Triggered when the creation of a sharedv4 volume fails. |
62 | SnapshotDeleteSuccess | NOTIFY | VOLUME | Triggered when a snapshot is successfully deleted. |
63 | SnapshotDeleteFailure | ALARM | VOLUME | Triggered when a snapshot deletion fails. |
64 | DriveStateChangeClear | WARN | DRIVE | Triggered when the drive’s state gets cleared. |
65 | VolumeSpaceLowCleared | NOTIFY | VOLUME | Triggered when the free disk space goes above the recommended level of 10%. |
66 | ClusterPairSuccess | NOTIFY | CLUSTER | Triggered when a cluster pair operation succeeds. |
67 | ClusterPairFailure | ALARM | CLUSTER | Triggered when a cluster pair operation fails. |
68 | CloudMigrationUpdate | NOTIFY | VOLUME | Triggered if a cloud migration is updated. |
69 | CloudMigrationSuccess | NOTIFY | VOLUME | Triggered when a cloud migration operation succeeds. |
70 | CloudMigrationFailure | ALARM | VOLUME | Triggered when a cloud migration operation fails. |
71 | ClusterDomainAdded | NOTIFY | CLUSTER | Triggered when a cluster domain is added. |
72 | ClusterDomainRemoved | NOTIFY | CLUSTER | Triggered when a cluster domain is removed. |
73 | ClusterDomainActivated | NOTIFY | CLUSTER | Triggered when a cluster domain is activated. |
74 | ClusterDomainDeactivated | NOTIFY | CLUSTER | Triggered when a cluster domain is deactivated. |
75 | MeteringAgentWarning | WARN | CLUSTER | Triggered when the metering agent encounters a non-critical problem. |
76 | MeteringAgentCritical | ALARM | CLUSTER | Triggered when the metering agent encounters a critical problem. |
77 | CloudsnapOperationWarning | WARN | VOLUME | Triggered when a cloudsnap operation encounters a problem. |
78 | PoolExpandInProgress | NOTIFY | POOL | Triggered when a pool expand operation starts. |
79 | PoolExpandSuccessful | NOTIFY | POOL | Triggered when a pool expand operation succeeds. |
80 | PoolExpandFailed | ALARM | POOL | Triggered when a pool expand operation fails. |
86 | StoragelessToStorageNodeTransitionFailure | ALARM | NODE | Triggered when a node fails to transition from a storageless type to a storage type. |
87 | StoragelessToStorageNodeTransitionSuccess | NOTIFY | NODE | Triggered when a node transitions from a storageless type to a storage type successfully. |
89 | ClusterLicenseUpdated | NOTIFY | CLUSTER | Triggered when a license is updated for a cluster. |
90 | LicenseExpired | ALARM | CLUSTER | Triggered when the cluster license expires. |
91 | LicenseLeaseExpiring | WARNING | CLUSTER | Triggered when the license lease is about to expire because the last lease refresh failed. |
92 | LicenseLeaseExpired | ALARM | CLUSTER | Triggered when the license lease has expired because the last lease refresh failed. |
93 | LicenseServerDown | WARNING | NODE | Triggered when a node is unable to reach the license server. |
94 | FloatingLicenseSetupError | ALARM | NODE | Triggered when a node fails to set up a floating license. |
95 | NFSServerUnhealthy | WARNING | NODE | Triggered when the NFS server on this node is unhealthy. |
96 | FileSystemDependency | ALARM | NODE | Triggered during Portworx installation if there’s a filesystem dependency failure. |
97 | RebootRequired | ALARM | NODE | Triggered when a node requires a reboot. |
98 | TempFileSystemInitialization | ALARM | NODE | Triggered during Portworx installation if a node fails to initialize a temporary filesystem. |
99 | UnsupportedKernel | ALARM | NODE | Triggered during a Portworx installation if the node contains a kernel that is not supported by Portworx. |
100 | InvalidDevice | ALARM | NODE | Triggered during Portworx installation if an invalid device is provided to Portworx as a storage device. |
101 | NfsDependencyInstallFailure | ALARM | NODE | Triggered during Portworx installation if Portworx cannot install the NFS service. |
102 | NfsDependencyNotEnabled | ALARM | NODE | Triggered during Portworx installation if Portworx cannot enable the NFS service. |
103 | LicenseCheckFailed | ALARM | NODE | Triggered if a node fails a license check. |
104 | PortworxStoppedOnNode | WARNING | NODE | Triggered if Portworx is stopped on a node. |
105 | KvdbConnectionFailed | ALARM | NODE | Triggered if Portworx fails to connect to the KVDB. |
106 | InternalKvdbSetupFailed | ALARM | NODE | Triggered if Portworx fails to set up the internal KVDB on a node. |
107 | PortworxMonitorImagePullFailed | ALARM | NODE | Triggered if Portworx fails to pull Portworx images during installation. |
108 | PortworxMonitorPrePostExecutionFailed | ALARM | NODE | Triggered if Portworx fails to execute pre or post installation tasks. |
109 | PortworxMonitorMountValidationFailed | ALARM | NODE | Triggered if Portworx fails to validate the mounts provided to the Portworx container during installation. |
110 | PortworxMonitorSchedulerInitializationFailed | ALARM | NODE | Triggered if Portworx fails to initialize a connection with the scheduler during installation. |
111 | PortworxMonitorServiceControlsInitializationFailed | ALARM | NODE | Triggered if Portworx fails to initialize the service controls during installation. |
112 | PortworxMonitorInstallFailed | ALARM | NODE | Triggered if Portworx installation fails. |
113 | MissingInputArgument | ALARM | NODE | Triggered if a required installation input argument is missing. |
114 | PortworxMonitorImagePullInProgress | NOTIFY | NODE | Triggered when Portworx is pulling and extracting images during installation or upgrade. |
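
If you want to look up alerts by code, or filter them by severity and resource type, the table above can be loaded into a small lookup structure. The following is a minimal sketch that assumes the rows are available as pipe-delimited text in the same layout as this page. The names `AlertDef`, `parse_rows`, and `select` are hypothetical helpers, not part of Portworx or `pxctl`, and `SAMPLE` contains only a few rows copied from the table.

```python
from typing import Dict, List, NamedTuple


class AlertDef(NamedTuple):
    code: int
    alert_type: str
    severity: str
    resource: str
    description: str


# A few rows in the same pipe-delimited layout as the table in this document.
SAMPLE = """\
Alert Code | Alert Type | Severity | Resource Type | Description |
---|---|---|---|---|
0 | DriveOperationFailure | ALARM | DRIVE | Triggered when a drive operation such as add or replace fails. |
14-16 | >Internal PX Alerts< | - | - | Alert codes used for internal Portworx bookkeeping. |
17 | PXInitFailure | ALARM | NODE | Triggered when Portworx fails to initialize on a node. |
18 | PXInitSuccess | NOTIFY | NODE | Triggered when Portworx successfully initializes on a node. |
"""


def parse_rows(text: str) -> Dict[int, AlertDef]:
    """Build a code -> AlertDef lookup from pipe-delimited table rows."""
    alerts: Dict[int, AlertDef] = {}
    for line in text.splitlines():
        cells = [c.strip() for c in line.strip().strip("|").split("|")]
        # Skip the header, the separator row, and code ranges such as "14-16".
        if len(cells) != 5 or not cells[0].isdigit():
            continue
        code = int(cells[0])
        # Upper-case the resource type so "Cluster" and "CLUSTER" compare equal.
        alerts[code] = AlertDef(code, cells[1], cells[2], cells[3].upper(), cells[4])
    return alerts


def select(alerts: Dict[int, AlertDef], severity: str, resource: str) -> List[AlertDef]:
    """Return alerts matching a severity label and resource type, ordered by code."""
    return [
        a for _, a in sorted(alerts.items())
        if a.severity == severity and a.resource == resource.upper()
    ]


if __name__ == "__main__":
    table = parse_rows(SAMPLE)
    for alert in select(table, severity="ALARM", resource="NODE"):
        print(alert.code, alert.alert_type)
```

Rows whose code is a range are skipped on purpose in this sketch, since the table describes them only as internal bookkeeping codes.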
Last edited: Tuesday, Aug 18, 2020