LATEST VERSION: 3.3.1 - CHANGELOG
Pivotal Greenplum Command Center v3.2.2

Workload Manager Metric Reference

This topic lists the metrics Greenplum Workload Manager collects. These metrics can be used in the when clause of a Workload Manager rule to select Greenplum Database queries that trigger an action. Metrics in when clauses are prefixed with their scope, for example:

host:cpu_util > 35

The metric, in this example, is cpu_util and the scope is host. This metric will match any host with greater than 35% CPU utilization. The following expression matches a single postgres process on any host using more than 35% CPU:

host:pid:cpu_util > 35 and host:pid:name = 'postgres'

Metrics may also be listed in the optional including clause of a rule so that their values are saved with the record or event data when a rule is matched. When adding metrics to the including clause, omit the scope; Workload Manager finds the metric in the scope matched by the when clause.

The metrics are arranged in the following categories:

  • Connections – number of backend connections and connections to the master
  • Identification – names of users, hosts, databases, ports, processes, and so on
  • Transactions – information about the current transaction, queries within transactions, and numbers of transactions committed and rolled back in the database
  • Date/Time – date and time metrics for a host
  • CPU – CPU utilization for hosts, processes, and sessions
  • Memory – memory utilization for processes and queries
  • Vmem - vmem utilization for segments and sessions
  • Spill – number of spill files (work files) created and total spill file size for a query
  • I/O – disk read/write statistics for databases, processes, and queries
  • Skew – disk read/write skew and memory skew for queries

Connections

Scope Metric Data type Description
datid numbackends integer Number of connections to this database
gpdb total_master_connections integer Total number of connections to the master segment across all databases

Identification

Scope Metric Data type Description
session_id:host:pid usename string Name of the user logged into this backend
datid datname string Name of this database
host:pid long_name string By default, this is the absolute path to the process executable, but may be overridden by the process itself to status information in utilities like ps(1)
host:pid name string The filename of the executable
host:pid state string Kernel state of this process; see the man page for proc(5) for more information
session_id:host:pid application_name string Name of the application that is connected to this backend
session_id:host:pid client_addr string IP address of the client connected to this backend
session_id:host:pid client_port integer TCP port number that the client is using for communication with this backend
session_id:host:pid datid integer OID of the database this backend is connected to
session_id:host:pid datname string Name of the database this backend is connected to
session_id:host:pid gpdb_segment_role string The current role of this Greenplum Database segment (MASTER, SEGMENT, MIRROR)
session_id:host:pid usesysid integer OID of the user logged into this backend

Transactions

Scope Metric Data type Description
datid xact_commit integer Number of transactions in this database that have been committed
datid xact_rollback integer Number of transactions in this database that have been rolled back
session_id:host:pid backend_start string Time when this process was started, i.e., when the client connected to the server
session_id:host:pid current_query string Text of this backend’s current query.
session_id:host:pid query_start string Time when the currently active query was started
session_id:host:pid runtime integer Time elapsed since the query started, in seconds. This includes query wait time.
session_id:host:pid xact_start string Time when this process’ current transaction was started

Date/Time

Note: Date and time values are stored in UTC standard time and converted to the local time zone for display. Use the SHOW TIME ZONE and SET TIME ZONE commands in psql to view and set the local time zone.

Scope Metric Data type Description
host day integer Day as 0 - 30
host day_of_week integer Day as 0 - 6
host day_of_week_string string Mon, Tue, …
host month integer Month as 0 - 11
host year integer Numeric year
host hour integer Hour as 0 - 23
host minute integer Minute as 0 - 59

CPU

Scope Metric Data type Description
host node_cpu_util float CPU utilization on this host averaged over active CPUs (%). Excludes idle time. Example: Record high CPU queries when overall CPU utilization on a host is high.
host:pid avg_cpu_util float CPU utilization of this process averaged over the last two polling intervals (%).
host:pid cpu_util float CPU utilization of this process (%). Example: Record high CPU queries when overall CPU utilization on a host is high.
session_id cpu_skew float CPU utilization skew across the cluster. Calculated as the cubed standard deviation of session_id:host:total_cpu from all hosts running a certain query. Values closer to zero indicate less skew. This is not a percentage. Example: Complex rule.
session_id:host total_cpu float Total CPU utilization of all processes running a certain query on a host (%). Example: Record high CPU queries on a host when overall CPU utilization on that host is high.

Memory

Scope Metric Data type Description
host mem_avail integer Total available memory on this host (free + buffers + cached) (kB)
host mem_avail_pct float Available memory on this host as percentage of total
host mem_buffers integer Memory in buffers on this host (kB)
host mem_cached integer Cached memory on this host (kB)
host mem_free integer Free memory on this host (kB)
host mem_free_pct float Free memory on this host as percentage of total
host mem_total integer Total memory on this host (kB)
host:pid data_size_bytes integer The size of data+stack memory region in this process (bytes)
host:pid dirty_size_bytes integer The size of dirty pages used in this process (bytes)
host:pid library_size_bytes integer The size of library memory region in this process (bytes)
host:pid program_size_bytes integer The total program size (bytes)
host:pid program_size_pct float The size of this process as a percentage of total system memory
host:pid resident_size_bytes integer The size of resident memory consumed by this process (bytes)
host:pid resident_size_pct float The size of this process’ resident memory as a percentage of total system memory
host:pid shared_size_bytes integer The size of all shared pages used by this process (bytes)
host:pid text_size_bytes integer The size of code memory region in this process (bytes)
session_id:host total_resident_size_pct float Total resident memory percentage of all processes running a certain query on a host

Vmem

To use the vmem metrics in Workload Manager, you must first run the gp_session_state.sql script included with Greenplum Database on the postgres database. This is a one-time task.

The script creates the view session_level_memory_consumption in the database. See Viewing Session Memory Usage Information in the Greenplum Database Administrator Guide for information about this view and the gp_session_state.sql script.

Execute the script with the following command:

psql -d postgres -f $GPHOME/share/postgresql/contrib/gp_session_state.sql

Following are recommended configuration adjustments when using vmem metrics. Enter the commands at the gp-wlm command-line:

config modify gpdb_stats publish_frequency = 0.75
config modify gpdb_stats collect_frequency = 0.5
config modify rulesengine engine:rule_frequency = 0.5
Scope Metric Data type Description
host:segment_id total_vmem_size_mb integer Total vmem usage for this Greenplum segment in megabytes
host:segment_id total_vmem_size_pct float Total vmem usage for this Greenplum segment as a percentage of total
session_id:host:segment_id vmem_size_mb integer Total vmem used by the session on this segment
session_id:host:segment_id vmem_size_pct float The percentage of this segment’s gp_vmem_protect_limit consumed by this session

Spill

Scope Metric Data type Description
session_id:host:pid spillfile_count_across_cluster integer Total number of spill files (work files) created for this query across the cluster
session_id:host:pid spillfile_size_across_cluster integer Total size of spill files (work files) created for this query across the cluster, in bytes

I/O

Scope Metric Data type Description
datid blks_hit integer Number of times disk blocks were found already in the PostgreSQL buffer cache
datid blks_read integer Number of disk blocks read in this database
host:pid disk_read_bytes integer Total number of bytes read from disk by this process
host:pid disk_read_bytes_per_sec float The number of bytes read from disk per second by this process
host:pid disk_write_bytes integer Total number of bytes written to disk by this process
host:pid disk_write_bytes_per_sec float The number of bytes written to disk per second by this process
host:pid read_bytes integer Total number of bytes (disk, network, IPC) read by this process
host:pid read_bytes_per_sec float The number of bytes read per second (disk + net + IPC) by this process
host:pid reads integer Total number of read system calls made by this process
host:pid reads_per_sec float The number of total read(2) calls per second by this process
host:pid write_bytes integer Total number of bytes (disk, network, IPC) written by this process
host:pid write_bytes_per_sec float The number of bytes written per second (disk + net + IPC) by this process
host:pid writes integer Total number of write system calls made by this process
host:pid writes_per_sec float The number of total write(2) calls per second by this process
session_id:host total_disk_read_bytes_per_sec integer Total disk read bytes-per-second of all processes running a certain query on a host
session_id:host total_disk_write_bytes_per_sec integer Total disk write bytes-per-second of all processes running a certain query on a host

Skew

Scope Metric Data type Description
session_id disk_read_bytes_per_sec_skew float Disk read skew across the cluster. Calculated as the cubed standard deviation of session_id:host:total_disk_read_bytes_per_sec from all hosts running a certain query
session_id disk_write_bytes_per_sec_skew float Disk write skew across the cluster. Calculated as the cubed standard deviation of session_id:host:total_disk_write_bytes_per_sec from all hosts running a certain query
session_id resident_size_pct_skew float Resident memory utilization skew across the cluster. Calculated as the cubed standard deviation of session_id:host:total_resident_size_pct from all hosts running a certain query