gpmetrics Schema Reference

Greenplum Command Center creates the gpmetrics schema in the Greenplum Database gpperfmon to save alert rules and logs, and historical metrics collected by the Greenplum Database metrics collection system. The gpperfmon schema contains the following tables:

If you set the schema search path to include the gpmetrics schema, you do not have to qualify table names with the gpmetrics schema name. To set the default search path for the gpperfmon database enter this SQL command.

=# ALTER DATABASE gpperfmon SET search_path TO public,gpmetrics;

You must exit the current session and start a new session for the new search path to take effect.

Alert Tables

Command Center uses the gpcc_alert_rule and gpcc_alert_log tables to store the alert rules you set up in the user interface and to log messages when the alert rules are triggered.

gpcc_alert_rule Table

The gpcc_alert_rule table records the alert rules configured in the Command Center user interface. It has the columns shown in the following table.

Column Type Description
rule_id integer Unique id for the rule.
rule_type integer Reserved for future use.
rule_description character varying(512) Text of the rule.
rule_config json JSON string containing parameters for user-specified values.
ctime timestamp(0) without time zone Time the rule was created.
etime timestamp(0) without time zone Time the rule became inactive, or null if rule is active.

The gpcc_alert_rule table keeps a history of alert rule configurations. When a rule becomes active, a new row is inserted and the ctime timestamp column is set to the current time; the etime timestamp is null, indicating that the rule is still active. When a rule is either disabled or superceded by a new rule, the etime timestamp column is set to the current time. Thus, the set of currently active rules is all rows where the etime column is null. A row that has timestamps in both ctime and etime columns is an historical record of the period of time during which the rule was active.

The rule_id column, a unique integer, is the distribution key for the table and is used to identify a single alert rule configuration. This column can be joined with the rule_id column in the gpcc_alert_log table to identify the rule that triggered each recorded alert event.

The rule_description column contains a string that describes the event that matches the rule. It is the text displayed in the Command Center UI for the rule, with user-specified values inserted.

The rule_config column contains a JSON string with parameters for the values entered for each of the rule’s fields in the Command Center UI.

gpcc_alert_log Table

The gpcc_alert_log table has the columns shown in the following table.

Column Type Description
id integer Unique ID for the alert.
rule_id integer The ID of the rule that triggered this alert.
transaction_time timestamp(0) without time zone Time the alert was raised.
content json Contains parameters specifying values that triggered the alert.

The gpcc_alert_log table is an append-only, column-oriented table, partitioned by month on the transaction_time column. Command Center creates new partitions as needed an removes partitions over 12 months old.

A row is added to the gpcc_alert_log table whenever an alert rule is matched.

The id column, a unique integer, is the distribution key for the table.

The transaction_time column is set to the current time when a row is created.

The rule_id column can be joined with the rule_id column in the gpcc_alert_rule table to access details of the rule that triggered the alert.

The content column contains a JSON string with parameters specifying details about the event that triggered the alert. The JSON parameters vary with the type of the alert.

Example Query

This query lists the ten most recent alerts, including the configuration of the rule that triggered each event.

gpperfmon=# SELECT l.transaction_time, l.rule_id, r.rule_description, l.content
FROM gpmetrics.gpcc_alert_log AS l, gpmetrics.gpcc_alert_rule AS r
WHERE l.rule_id = r.rule_id AND r.etime is NULL
ORDER BY l.transaction_time
LIMIT 10;

Greenplum Database Metrics History Tables

The gpmetrics query history saves information collected by the Greenplum Database metrics collection system and forwarded to Greenplum Command Center.

The distribution key for each table is a ctime timestamp column, which is the time when the row is added to the datbase. The tables are partitioned by year and month. Greenplum Command Center creates new partitions automatically as needed.

The history tables use append-optimized, column-oriented storage.

Command Center only saves queries with runtimes greater than the value of the min_query_time configuration parameter, found in the $MASTER_DATA_DIRECTORY/gpmetrics/gpcc.conf configuration file on the host executing Command Center. The default, 0, saves all queries in the history table. This parameter can be configured on the Command Center Admin> Settings page.

gpcc_database_history

The gpcc_database_history table saves summary query activity metrics collected by the Greenplum Database metrics collector. This data can be used to review the Greenplum Database query load over time.

Column Type Description
ctime timestamp(0) without time zone Time the record was created.
queries_total integer Total number of queries running and queued to run.
queries_running integer Number of queries currently running.
queries_queued integer Number of queries queued, but not yet running.
queries_blocked integer The number of queries started, but blocked by other transactions.
queries_finished integer The number of queries that completed since the previous sampling interval.

gpcc_disk_history

The gpcc_disk_history table saves historical disk usage statistics for each Greenplum Database segment host file system.

Column Type Description
ctime timestamp(0) without time zone Time the row was created.
hostname character varying(64) Name of the segment host.
filesystem text Path to the segment’s data directory.
total_bytes bigint Total size of the file system storage in bytes.
bytes_used bigint Number of storage bytes in
bytes_available bigint Number of storage bytes available.

gpcc_pg_log_history

The gpcc_pg_log_history table stores pg_log errors and warnings from the Greenplum Server log files.

Column Type Description
logtime timestamp without time zone Timestamp for this log.
loguser text Name of the role executing the query.
logdatabase text The database accessed.
logpid text Process id.
logthread text Thread number.
loghost text Host name or IP address of the host.
logport text Port number.
logsessiontime timestamp without time zone Session timestamp.
logtransaction integer Transaction id.
logsession text Session id.
logcmdcount text Command count.
logsegment text Segment number.
logslice text Slice number.
logdistxact text Distributed transation id.
loglocalxact text Location transacton id.
logsubxact text Subtransaction id.
logseverity text Log severity.
logstate text SQL State code associated with this log message.
logmessage text Log or error message text.
logdetail text Detail message text associated with an error message.
loghint text Hint message text associated with an error message.
logquery text Internally-generated query text.
logquerypos integer Index into the internally-generated query text.
logcontext text Context in which this message gets generated.
logdebug text Query string with full detail for debugging.
logcursorpos integer Cursor index into the query string.
logfunction text Function in which this message is generated.
logfile text Log file in which this message is generated.
logline integer Line in the log file in which this message is generated.
logstack text Full text of the stack trace associated with this message.

gpcc_plannode_history

The gpcc_plannode_history table saves detailed metrics for each operation (node) in a completed query plan. Each row contains metrics for one operation that executed on one Greenplum Database segment. This information allows reconstructing the plan and execution metrics for a completed query.

Plan node history is only saved for queries that execute for 10 seconds or more.

Column Type Description
ctime timestamp(0) without time zone Time the row was created.
tmid integer A time identifier for the query. All records associated with a query will have the same tmid.
ssid integer Session id for the database connection. All records associated with the query will have the same ssid.
ccnt integer Command number within the session. All records associated with the query will have the same ccnt.
segid integer Id (dbid) of the segment for this plan node.
procid integer The postgres process ID for this plan node.
sliceid integer Id of the slice the operation belongs to. Operations that belong to the same slice execute in parallel.
nodeid integer The query plan node ID for this operation.
parent_nodeid integer The parent query plan node ID from the query plan.
node_type character varying(64) Name of the operation type.
tinit timestamp(6) without time zone Time the operation was initialized.
tstart timestamp(6) without time zone Time the operation started.
tfinish timestamp(6) without time zone Time the operation finished.
status character varying(16) Status of the operation: initialize, executing, or finished.
planrows double precision The number of output rows estimated for the operation.
planwidth integer Width of output rows estimated for the operation.
start_cost double precision Number of page reads expected to produce first output row.
total_cost double precision Number of page reads expected to produce all output rows.
tuple_count bigint
ntuples bigint
nloops bigint
first_tuple timestamp(6) without time zone Time the operation produced the first output row.
rel_oid oid Object ID of the output rows produced by the operation.
relation_name character varying(64) Name of the table this operation processed, if any.
index_name character varying(64) Name of the index used by this operation, if any.
alias_name character varying(64) Alias for the relation declared in the SQL command.
node_seq integer Node sequence
condition text Condition expression used for a filter or join operation.

gpcc_queries_history

The gpcc_queries_history table saves metrics for completed queries.

Column Type Description
ctime timestamp(0) without time zone Time the row was created.
tmid integer A time identifier for the query. All records associated with the query will have the same tmid.
ssid integer Session id for the database connection. All records associated with the query will have the same ssid.
ccnt integer Command number within this session. All records associated with the query will have the same ccnt.
username character varying(64) Role name that issued the query.
db character varying(64) Name of the database queried.
cost double precision Estimated cost to execute query, computed by the legacy planner or GPORCA.
tsubmit timestamp(6) without time zone Time the query was submitted.
tstart timestamp(6) without time zone Time the query was started.
tfinish timestamp(6) without time zone Time the query finished.
status character varying(16) Status of the query – abort, error, or done.
rows_out bigint Number of rows returned by the query.
error_msg text Error message, if the query failed.
plan_gen character varying(16) PLANNER if query plan was generated by the legacy planner; OPTIMIZER if the plan was generated by GPORCA, the Pivotal query optimizer.
query_hash character varying(64) Hash code generated from the text of the query.
query_text text Complete text of the query. Some queries may be reformatted before storing in the history database.
application_name character varying(64) Name of the client application that established the database connection.
rsqname character varying(64) If the gp_resource_manager configuration parameter is queue, the name of the resource queue managing the query.
rsgname character varying(64) If the gp_resource_manager configuration parameter is group, the name of the resource group managing the query.
cpu_master bigint Total CPU usage for this query on the Greenplum Database master instance.
cpu_segs bigint Total CPU usage for this query across all segments, measured in milliseconds. This is the sum of the CPU usage values taken from all active primary segments in the database array.
cpu_master_percent double precision Average CPU percent usage on the master host during execution of this query.
cpu_segs_percent double precision Average CPU percent usage on the segment hosts during the execution of this query.
skew_cpu double precision Displays the amount of processing skew in the system for this query. Processing/CPU skew occurs when one segment performs a disproportionate amount of processing for a query. The skew is calculated from total CPU seconds used on all segments during the execution of the query.
skew_rows double precision Displays the amount of row skew in the system. Row skew occurs when one segment produces a disproportionate number of rows for a query.
memory bigint Total size of memory, in bytes, used by all segments to execute this query.
disk_read_bytes bigint Number of bytes read from disk.
disk_write_bytes bigint Number of bytes written to disk.
spill_size bigint Total size of spill files used by all segments to execute this query.
rqpriority character varying(16) Priority setting for the resource queue managing this query. Blank if resource group management is enabled.

gpcc_scan_history

Column Type Description
ctime timestamp(0) without time zone Time the row was created.
scan_id integer Scan id for this scan. See schedule_id column in the gpcc_schedule.
status integer Scan completion status. 0=Running, 1=Success, 2=Failed, 3=Cancelled, 4=Expired.
tables_scanned bigint The number of tables that were scanned.
start_dt timestamp(0) without time zone Time the scan began.
end_dt timestamp(0) without time zone Time the scan ended.

gpcc_schedule

The gpcc_schedule table contains details for scheduled events, such as the table scan for Recommendations. Each row describes a schedule in a JSON value. The ctime and etime columns together specify the period of time during which the schedule is enabled.

Column Type Description
schedule_id integer Id number for this schedule.
description character varying(512) Description of schedule.
schedule_config json JSON string with schedule details.
ctime timestamp(0) without time zone Time the schedule was added and enabled.
etime timestamp(0) without time zone Time the schedule was disabled.

gpcc_system_history

The gpcc_system_history table saves historical system metrics for each Greenplum Database host, including the master, standby master, and segment hosts. The metrics include information about memory, CPU, disk, and network utilitization.

Column Type Description
ctime timestamp(0) without time zone Time the row was created.
hostname character varying(64) Segment or master hostname associated with these system metrics.
mem_total bigint Total system memory in Bytes for this host.
mem_used bigint System memory used, in Bytes, for this host.
mem_actual_used bigint Actual memory used, in Bytes, for this host (not including the memory reserved for cache and buffers).
mem_actual_free bigint Free actual memory, in Bytes, for this host (not including the memory reserved for cache and buffers).
swap_total bigint Total swap space in Bytes for this host.
swap_used bigint Swap space used, in Bytes, for this host.
swap_page_in bigint Number of swap pages in.
swap_page_out bigint Number of swap pages out.
cpu_user double precision Percentage of time CPU processes execute in user mode.
cpu_sys double precision Percentage of time CPU processes execute in system (kernel) mode.
cpu_idle double precision Percentage idle CPU.
load0 double precision CPU one-minute load average.
load1 double precision CPU five-minute load average.
load2 double precision CPU fifteen-minute load average.
quantum integer Interval between metrics collections.
disk_ro_rate bigint Disk read operations per second.
disk_wo_rate bigint Disk write operations per second.
disk_rb_rate bigint Bytes per second for disk read operations.
disk_wb_rate bigint Bytes per second for disk write operations.
net_rp_rate bigint Packets per second on the system network for read operations.
net_wp_rate bigint Packets per second on the system network for write operations.
net_rb_rate bigint Bytes per second on the system network for read operations.
net_wb_rate bigint Bytes per second on the system network for write operations.

gpcc_table_info

The gpcc_table_info table stores a current snapshot of statistics for tables. There is one row for each table and partition.

Column Type Description
ctime timestamp without time zone Time the row was created.
dbid oid Object ID of the database.
relid oid Object ID of the table.
paroid oid Object ID of a partition.
reltablespace oid Object ID of the table’s tablespace.
seq_scan bigint Number of sequential scans initiated on this table.
idx_scan bigint Number of index scans initiated on this table.
n_tup_ins bigint Number of rows inserted.
n_tup_del bigint Number of rows deleted.
n_tup_upd bigint Number of rows updated (includes hot updated rows).
n_tup_hot_upd bigint Number of rows HOT updated (no separate index update required).
last_seq_scan timestamp with time zone Time of the last sequential scan on this table. An ANALYZE on a table can cause a sequential scan and update the count in the seq_scan column. However, the time of that scan is not saved in this column.
last_idx_scan timestamp with time zone Time of the last index scan on this table.
last_ins timestamp with time zone Time of the last insert on this table.
last_del timestamp with time zone Time of the last delete on this table.
last_upd timestamp with time zone Time of the last update on this table.
last_analyze timestamp with time zone Time of the last analyze on this table.
last_vacuum timestamp with time zone Time of the last vacuum of this table (excludes vacuum full).
last_vacuum_full timestamp with time zone Time of the last vacuum full of this table.
size bigint Combined size of all of this table’s files.
row_cnt real Number of rows in the table.
children integer Number of partitions, including middle-level and child partitions.
schema name Name of the schema this table belongs to.
table_name name Name of the table.
owner name Name of the database role that owns this table.
relstorage character(1) Storage mode of this table. a=append-optimized, c=column-oriented, h =heap, v = virtual, x= external table.
relkind character(1) The type of object: r = heap or append-optimized table.
bloat real Calculated bloat for the table when last scanned.
scan_size bigint Size of table when last scanned.
unused real Unused.
skew real Calculated skew for the table when last scanned.
last_scan_rowcnt real Number of rows in the table when last scanned.
last_scan_ts timestamp with time zone Time the table was last scanned for Recommendations.
accuracy real Accuracy ratio calculated from query history when the table was last scanned.
last_accuracy_ts timestamp with time zone Time the accuracy ratio was last set.
distributed_by text Distribution policy for the table.

gpcc_table_info_history

The gpcc_table_info_history table stores a daily snapshot of statistics about tables. Command Center saves statistics from the gpcc_table_info table just before midnight each night. There is one row for each table per day.

Column Type Description
ctime timestamp(0) without time zone Time the record was created.
dbid oid Object ID of the database.
relid oid Object ID of the table.
paroid oid Object ID of a partition.
reltablespace oid Object ID of the table’s tablespace.
seq_scan bigint Number of sequential scans initiated on this table.
idx_scan bigint Number of index scans initiated on this table.
n_tup_ins bigint Number of rows inserted.
n_tup_del bigint Number of rows deleted.
n_tup_upd bigint Number of rows updated (includes hot updated rows).
n_tup_hot_upd bigint Number of rows HOT updated (no separate index update required).
last_seq_scan timestamp with time zone Time of the last sequential scan on this table.
last_idx_scan timestamp with time zone Time of the last index scan on this table.
last_ins timestamp with time zone Time of the last insert on this table.
last_del timestamp with time zone Time of the last delete on this table.
last_upd timestamp with time zone Time of the last update on this table.
last_analyze timestamp with time zone Time of the last analyze on this table.
last_vacuum timestamp with time zone Time of the last vacuum of this table (excludes vacuum full).
last_vacuum_full timestamp with time zone Time of the last vacuum full of this table.
size bigint Combined size of all of this table’s files.
row_cnt real Number of rows in the table.
children integer Number of partitions, including middle-level and child partitions.
schema name Name of the schema this table belongs to.
table_name name Name of the table.
owner name Name of the database role that owns this table.
relstorage character(1) Storage mode of this table. a=append-optimized, c=column-oriented, h = heap, v = virtual, x= external table.
relkind character(1) The type of object: r = heap or append-optimized table.
bloat real Calculated bloat for the table when last scanned.
scan_size bigint Size of the table when last scanned.
unused real
skew real Calculated skew for the table when last scanned.
last_scan_rowcnt real Number of rows in the table when last scanned.
last_scan_ts timestamp with time zone Time the table was last scanned for Recommendations.
accuracy real Accuracy ratio calculated from query history when the table was last scanned.
last_accuracy_ts timestamp with time zone Time the accuracy ratio was last set.
distributed_by text Distribution policy for the table.

Workload Management Tables

gpcc_wlm_rule

The gpcc_wlm_rule table stores the definition of workload management rules (both assignment rules and workload rules).

Column Type Description
serial_number integer Serial number of the rule.
rule_id integer Unique ID of the rule shown in the Command Center interface.
rsgname name Resource group name used to assign rule.
role name Database role name used to assign rule.
query_tag text Query tag string used to assign rule.
dest_rsg name Destination resource group name for moving a query.
cpu_time integer Maximum CPU time consumed by the query.
running_time integer Maximum running time for the query.
disk_io_mb integer Maximum total disk I/O used by the query.
planner_cost float8 Maximum Postgres Planner cost for the query.
orca_cost float8 Maximum GPORCA optimizer ost for the query.
slice_num integer Maximum number of slices assigned for executing the query.
action integer Action to performed when conditions are met.
active boolean Whether the rule is active or inactive.
ctime timestamp without time zone Time the rule was created.
etime timestamp without time zone Time that the rule was deleted.

gpcc_wlm_log_history

The gpcc_wlm_log_history table stores the log history of workload rule actions that are triggered. Note that log history for assignment rules is not collected.

Column Type Description
ctime timestamp without time zone Time the log entry was created.
tstart timestamp without time zone Time the operation started.
tfinish timestamp without time zone Time the operation finished.
rule_serial_number integer Serial number of the rule.
rule_id integer Unique ID of the rule shown in the Command Center interface.
tmid integer A time identifier for the query. All records associated with a query will have the same tmid.
ssid integer Session id for the database connection. All records associated with the query will have the same ssid.
ccnt integer Command number within the session. All records associated with the query will have the same ccnt.
action integer Workload rule action that was initiated.
rsgname name Resource group name used to assign rule.
role name Database role name used to assign rule.
status text Indication of whether the action succeeded or failed.
fail_msg text Message associated with a failed action.