Alerts

On the Admin > Alerts page, an administrator can set up alert rules to detect and respond to events occurring in the Greenplum Database system and in currently executing database queries. When a rule is matched, Command Center logs a record.

You can set up email alerts by configuring an SMTP server in Greenplum Database or in Command Center. Additionally, you can create a send-alert.sh shell script to forward alerts to other destinations, such as an SMS gateway or a Slack channel. If the script is present, Command Center runs it whenever an alert is raised.

Command Center creates the gpmetrics schema in the gpperfmon database to store both rules and log records. See gpmetrics Schema Reference for information about the gpcc_alert_rule and gpcc_alert_log tables in the gpmetrics schema.
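As a rough illustration of inspecting the log records, the following sketch wraps a psql query against the gpcc_alert_log table. The function name recent_alerts is hypothetical, and the sketch assumes that psql is on the PATH and that the current user can connect to the gpperfmon database. It relies only on the id column of the table; see the schema reference for the other columns.

```shell
#!/usr/bin/env bash
# Hypothetical helper: list the most recent alert log records (default: 10).
# Assumes psql is on PATH and the current user can connect to gpperfmon.
recent_alerts() {
  local limit=${1:-10}
  # Accept only digits so the value is safe to embed in the SQL text.
  case $limit in
    ''|*[!0-9]*) echo "limit must be a non-negative integer" >&2; return 2 ;;
  esac
  psql -d gpperfmon -X -c \
    "SELECT * FROM gpmetrics.gpcc_alert_log ORDER BY id DESC LIMIT ${limit};"
}

# Usage: recent_alerts 5
```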

Configuring Alert Rules

Click EDIT to manage alert event rules. To enable an alert rule, enter any required values in its fields and check its box. Uncheck the box to disable the rule. Click SAVE when you have finished making changes to the alert configuration.

Segment failure

An alert is raised when one or more failed segments are detected. After the alert is first raised, Command Center raises it again every 30 minutes until the segments are recovered.

Average memory (segment hosts) exceeds [%] for [N] min

An alert is raised when the average memory for all segment hosts exceeds the specified percentage for the specified number of minutes. Command Center samples all segment hosts every 15 seconds and calculates the mean of the samples. Only memory in use is considered; memory for buffers and cache is not included.
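With one sample every 15 seconds, a window of N minutes holds 4 x N samples; a 2-minute window, for example, holds 8. The following sketch illustrates the averaging arithmetic only. It assumes that the mean of the samples in the window is compared with the threshold, and the function name window_mean_exceeds is hypothetical, not part of Command Center.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the mean of the samples and succeed (exit 0)
# only when that mean is greater than the threshold.
# Usage: window_mean_exceeds THRESHOLD SAMPLE [SAMPLE ...]
window_mean_exceeds() {
  # Require a threshold and at least one sample.
  [ "$#" -ge 2 ] || return 2
  local threshold=$1
  shift
  printf '%s\n' "$@" | awk -v t="$threshold" '
    { sum += $1; n++ }
    END {
      mean = sum / n
      printf "%.1f\n", mean
      if (mean > t) exit 0; else exit 1
    }'
}

# Usage: eight samples (2 minutes) checked against an 85 percent threshold.
# window_mean_exceeds 85 80 84 88 90 92 91 89 86 && echo "rule matched"
```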

Memory (master) exceeds [%] for [N] min

An alert is raised when the percent of memory used on the master host exceeds the specified percentage for the specified number of minutes. Command Center samples memory usage on the master host every 15 seconds and calculates the mean of the samples. Only memory in use is considered; memory for buffers and cache is not included.

Total disk space exceeds [%] full

An alert is raised when the total of disk space in use for all segment hosts exceeds the specified percentage. Command Center gathers the available disk space and total disk space from each segment host in the Greenplum Database cluster. The percent of total disk space in use is calculated by the following formula:
     100 - sum(<available disk space>) / sum(<total disk space>) * 100
A disk space alert is raised no more than once every 24 hours.
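
The formula divides the summed available space by the summed total space, so hosts with larger disks carry more weight than they would in an average of per-host percentages. The sketch below works through the arithmetic; the function name disk_used_pct and its input format (one "available total" pair per host, in the same unit) are assumptions made for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read "available total" pairs from stdin, one host per
# line, and print the percent of total disk space in use across all hosts.
disk_used_pct() {
  awk '
    NF >= 2 { avail += $1; total += $2 }
    END {
      if (total <= 0) { print "no disk data" > "/dev/stderr"; exit 1 }
      printf "%.1f\n", 100 - avail / total * 100
    }'
}

# Host A: 100 GB available of 1000 GB; host B: 900 GB available of 3000 GB.
# 100 - (100 + 900) / (1000 + 3000) * 100 = 75.0
# printf '100 1000\n900 3000\n' | disk_used_pct
```

In this example the per-host usage is 90 percent and 70 percent, which would average to 80 percent, but the rule's formula yields 75 percent.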

Number of connections exceeds [N]

An alert is raised when the total number of database connections exceeds the number specified. The number of connections is checked every 30 seconds. After an alert is raised, the metrics collector checks the number of connections every 30 minutes until the number of connections drops below the threshold, and then it resumes checking every 30 seconds.

Average CPU (segment hosts) exceeds [%] for [N] min

An alert is raised when the average percent of CPU used for all segment hosts exceeds the specified percentage for the specified number of minutes. Command Center samples all segment hosts every 15 seconds and calculates the mean of the samples.

CPU (master) exceeds [%] for [N] min

An alert is raised when the CPU usage on the master host exceeds the specified percentage for the specified number of minutes. Command Center samples CPU usage on the master host every 15 seconds and calculates the mean of the samples.

Out of memory errors

An alert is raised when an executing query fails with an out of memory (OOM) error. Note that no alert is raised if there is insufficient memory to start the query.

Spill files for a query exceeds [GB]

An alert is raised when the total disk space consumed by a running query’s spill files exceeds the specified number of gigabytes. An alert is raised only once per query.

Query runtime exceeds [N] min

An alert is raised when a query runtime exceeds the number of minutes specified. This alert is raised just once for a query.

Query is blocked for [N] min

An alert is raised if a query remains in a blocked state for longer than the specified number of minutes. If an alert is raised, and then the query unblocks, runs, and blocks again for the specified time, an additional alert is raised. Blocked time excludes the time a query is queued before it runs. It is possible for a “Query runtime exceeds [N] min” rule to also trigger while a query is blocked.

Configuring Alert Email

Command Center requires an SMTP server to send alert emails. If SMTP has been configured for Greenplum Database, Command Center will use the configured SMTP server, SMTP user, and password. You must enter values for the fields in the right column, Send emails to and From, whether you use the Greenplum Database SMTP server or enter another one.

Configuring Email with Command Center

Click EDIT in the Manage email configuration panel.

The alert email configuration is set with the following fields:

SMTP Server address

The name or IP address of the SMTP server and the SMTP port number. The port number is typically 587 for connections that use STARTTLS or 465 for connections that use implicit TLS. If the gp_email_smtp_server configuration parameter is set in Greenplum Database, it is prefilled here. Ask your system administrator for the correct values to enter. Example: smtp.example.com:465

Username

The username of the account to authenticate with the SMTP server. If the gp_email_smtp_userid configuration parameter is set in Greenplum Database, it is prefilled here. Example: gpcc-alerts@example.com

Password

The password for the SMTP username. For security, the password is masked. If the gp_email_smtp_password configuration parameter is set in Greenplum Database, that value is used here.

Send emails to

To add an address to the list, enter the address and press Enter. To remove an email address, click the X on the address.

From

The email address to use for the From: address in the alert email. Example: do-not-reply@example.com. If you leave this field blank, Command Center uses the default value, noreply-gpcc-alerts@pivotal.io.

When you click SAVE, Command Center sends a test email to the addresses in the Send emails to field. The email contains a list of the currently configured alert rules. If there is an error in the SMTP server or username/password configuration and the email cannot be sent, Command Center displays an error message.

Configuring Email for Greenplum Database

The following server configuration parameters are used to configure SMTP email for Greenplum Database.

gp_email_smtp_server

The SMTP server and port. Example: smtp.example.com:465

gp_email_smtp_userid

The name of a user to authenticate with the SMTP service. Example: gpcc-alerts@example.com

gp_email_smtp_password

The password for the SMTP user.

gp_email_from

The email address to set as the email sender. Example: noreply-gpcc-alerts@example.com

gp_email_to

A semicolon-separated list of email addresses to receive alert messages. Example: gpcc-admin@example.com;gpdb-admin@example.com

Command Center uses the gp_email_smtp_server, gp_email_smtp_userid, and gp_email_smtp_password parameters if they are set. It ignores the remaining parameters.

You can check the current value of a configuration parameter by running the gpconfig -s command on the master host, for example:

$ gpconfig -s gp_email_smtp_server

Use the gpconfig -c option to set the values of server configuration parameters, for example:

$ gpconfig -c gp_email_smtp_server -v "smtp.example.com:465" 
$ gpconfig -c gp_email_smtp_userid -v "gpcc-alerts@example.com"
$ gpconfig -c gp_email_smtp_password -v "changeme"
$ gpconfig -c gp_email_from -v "gpcc-alerts@example.com"
$ gpconfig -c gp_email_to -v "gpcc-admin@example.com;gpdb-admin@example.com"

Run gpstop -u to reload the configuration files after changing these configuration parameters.
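
To review all five email parameters in one pass, a small loop over gpconfig -s can help. This is a minimal sketch; the function name show_email_params is hypothetical, and it assumes gpconfig is on the PATH of the user running it on the master host.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the current value of each email-related
# server configuration parameter using gpconfig -s.
# Note that the output includes the SMTP password setting, so avoid
# running this where the terminal output is logged or shared.
show_email_params() {
  local param
  for param in gp_email_smtp_server gp_email_smtp_userid \
               gp_email_smtp_password gp_email_from gp_email_to; do
    gpconfig -s "$param"
  done
}

# Usage: show_email_params
```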

Creating a Send Alert Script

The send alert script is a shell script that you can use to send Command Center alerts to destinations such as SMS gateways, pagers, team collaboration tools like Slack, chat servers, archive files, alternative email servers, and so on. You can use the send alert script in addition to sending email from Command Center, or as an alternative to sending alert emails from Command Center.

Command Center looks for the script $MASTER_DATA_DIRECTORY/gpmetrics/send-alert.sh on the host where Command Center is running, which is either the master host or the standby host. If the file exists and is executable by the gpadmin user, Command Center executes the script. The following variables are set on the command line when the script runs.

Variable          Description
LINK              URL of the Greenplum Command Center web server.
QUERYID           ID of the query, if the alert was triggered by a query.
SERVERNAME        Name of the Greenplum Command Center server.
QUERYTEXT         The text of the query, if the alert was triggered by a query.
ACTIVERULENAME    Current text of the rule, with user-specified values included.
LOGID             Value of this alert’s id column in the gpmetrics.gpcc_alert_log table.
RULEDESCRIPTION   Text of the rule, including user-specified values, at the time the alert was raised.
ALERTDATE         Date the alert was raised.
ALERTTIME         Time the alert was raised.
SUBJECT           Subject line for email.
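
As one way to use these variables, the following sketch of a send-alert.sh body appends each alert to an archive file as a tab-separated record. It assumes the variables are visible to the script as environment variables and that LOGID is set whenever the script runs for an alert. ALERT_ARCHIVE and format_alert are hypothetical names introduced for this example, not Command Center settings.

```shell
#!/usr/bin/env bash
# Sketch of a send-alert.sh body that archives alerts to a file.
# ALERT_ARCHIVE is a hypothetical setting, not a Command Center variable.
ALERT_ARCHIVE=${ALERT_ARCHIVE:-"$HOME/gpcc-alerts.log"}

# Build a one-line, tab-separated record from the alert variables.
# Unset or empty variables (for example QUERYID on an alert that was not
# triggered by a query) are written as "-".
format_alert() {
  printf '%s\t%s\t%s\t%s\t%s\t%s\n' \
    "${ALERTDATE:--}" "${ALERTTIME:--}" "${SERVERNAME:--}" \
    "${LOGID:--}" "${QUERYID:--}" "${RULEDESCRIPTION:--}"
}

# Assumption: LOGID is set for every alert, so write only when it is present.
if [ -n "${LOGID:-}" ]; then
  format_alert >> "$ALERT_ARCHIVE"
fi
```

Writing one record per line keeps the archive easy to search with standard tools such as grep or awk.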

An example script that you can customize is provided at $GPCC_HOME/alert-email/send_alert.sh.sample. The example formats the alert as HTML email text and pipes it through the Linux mail command.

To set up a send alert script:

  1. Copy the $GPCC_HOME/alert-email/send_alert.sh.sample file to $MASTER_DATA_DIRECTORY/gpmetrics/send-alert.sh.

  2. Customize the script with code to format and deliver the alert to your desired destination.

  3. Run gpcc start to restart Command Center and enable the script.
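
For step 2, a customization that forwards alerts to a Slack channel might look like the following sketch. It assumes a Slack incoming webhook, which accepts a JSON body with a text field, and that curl is installed on the Command Center host. SLACK_WEBHOOK_URL, json_escape, and slack_payload are hypothetical names introduced here; they are not defined by Command Center or by the sample script.

```shell
#!/usr/bin/env bash
# Sketch: forward an alert to a Slack incoming webhook.
# SLACK_WEBHOOK_URL is a hypothetical setting that you must supply.
# Remember that the installed script must be executable by gpadmin,
# for example: chmod u+x "$MASTER_DATA_DIRECTORY/gpmetrics/send-alert.sh"

# Escape backslashes, double quotes, and newlines for use in a JSON string.
# Other control characters are not handled in this sketch.
json_escape() {
  local s=$1
  s=${s//\\/\\\\}
  s=${s//\"/\\\"}
  s=${s//$'\n'/\\n}
  printf '%s' "$s"
}

# Build the webhook payload from the alert variables.
slack_payload() {
  local text="${SUBJECT:-Greenplum Command Center alert}: ${RULEDESCRIPTION:-} ${LINK:-}"
  printf '{"text": "%s"}' "$(json_escape "$text")"
}

# Post only when an alert is present and a webhook URL is configured.
if [ -n "${LOGID:-}" ] && [ -n "${SLACK_WEBHOOK_URL:-}" ]; then
  curl --silent --show-error --fail --max-time 10 \
    -H 'Content-Type: application/json' \
    --data "$(slack_payload)" \
    "$SLACK_WEBHOOK_URL"
fi
```

The timeout on the curl call is a design choice for this sketch, intended to keep a slow or unreachable webhook from holding up the script indefinitely.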