Workload Manager Metric Reference
This topic lists the metrics Greenplum Workload Manager collects. These metrics can be used in the when
clause of a Workload Manager rule to select Greenplum Database queries that trigger an action. Metrics in when
clauses are prefixed with their scope, for example:
host:cpu_util > 35
The metric, in this example, is cpu_util
and the scope is host
. This metric will match any host with greater than 35% CPU utilization. The following expression matches a single postgres
process on any host using more than 35% CPU:
host:pid:cpu_util > 35 and host:pid:name = 'postgres'
Metrics may also be listed in the optional including
clause of a rule so that their values are saved with the record or event data when a rule is matched. When adding metrics to the including
clause, omit the scope; Workload Manager finds the metric in the scope matched by the when
clause.
The metrics are arranged in the following categories:
- Connections – number of backend connections and connections to the master
- Identification – names of users, hosts, databases, ports, processes, and so on
- Transactions – information about the current transaction, queries within transactions, and numbers of transactions committed and rolled back in the database
- Date/Time – date and time metrics for a host
- CPU – CPU utilization for hosts, processes, and sessions
- Memory – memory utilization for processes and queries
- Vmem - vmem utilization for segments and sessions
- Spill – number of spill files (work files) created and total spill file size for a query
- I/O – disk read/write statistics for databases, processes, and queries
- Skew – disk read/write skew and memory skew for queries
Connections
Scope |
Metric |
Data type |
Description |
datid |
numbackends |
integer |
Number of connections to this database |
gpdb |
total_master_connections |
integer |
Total number of connections to the master segment across all databases |
Identification
Scope |
Metric |
Data type |
Description |
session_id:host:pid |
usename |
string |
Name of the user logged into this backend |
datid |
datname |
string |
Name of this database |
host:pid |
long_name |
string |
By default, this is the absolute path to the process executable, but may be overridden by the process itself to status information in utilities like ps(1) |
host:pid |
name |
string |
The filename of the executable |
host:pid |
state |
string |
Kernel state of this process; see the man page for proc(5) for more information |
session_id:host:pid |
application_name |
string |
Name of the application that is connected to this backend |
session_id:host:pid |
client_addr |
string |
IP address of the client connected to this backend |
session_id:host:pid |
client_port |
integer |
TCP port number that the client is using for communication with this backend |
session_id:host:pid |
datid |
integer |
OID of the database this backend is connected to |
session_id:host:pid |
datname |
string |
Name of the database this backend is connected to |
session_id:host:pid |
gpdb_segment_role |
string |
The current role of this Greenplum Database segment (MASTER, SEGMENT, MIRROR) |
session_id:host:pid |
usesysid |
integer |
OID of the user logged into this backend |
Transactions
Scope |
Metric |
Data type |
Description |
datid |
xact_commit |
integer |
Number of transactions in this database that have been committed |
datid |
xact_rollback |
integer |
Number of transactions in this database that have been rolled back |
session_id:host:pid |
backend_start |
string |
Time when this process was started, i.e., when the client connected to the server |
session_id:host:pid |
current_query |
string |
Text of this backend’s current query. |
session_id:host:pid |
query_start |
string |
Time when the currently active query was started |
session_id:host:pid |
runtime |
integer |
Time elapsed since the query started, in seconds. This includes query wait time. |
session_id:host:pid |
xact_start |
string |
Time when this process’ current transaction was started |
Date/Time
Note: Date and time values are stored in UTC standard time and converted to the local time zone for display. Use the SHOW TIME ZONE
and SET TIME ZONE
commands in psql
to view and set the local time zone.
Scope |
Metric |
Data type |
Description |
host |
day |
integer |
Day as 0 - 30 |
host |
day_of_week |
integer |
Day as 0 - 6 |
host |
day_of_week_string |
string |
Mon, Tue, … |
host |
month |
integer |
Month as 0 - 11 |
host |
year |
integer |
Numeric year |
host |
hour |
integer |
Hour as 0 - 23 |
host |
minute |
integer |
Minute as 0 - 59 |
CPU
Memory
Scope |
Metric |
Data type |
Description |
host |
mem_avail |
integer |
Total available memory on this host (free + buffers + cached) (kB) |
host |
mem_avail_pct |
float |
Available memory on this host as percentage of total |
host |
mem_buffers |
integer |
Memory in buffers on this host (kB) |
host |
mem_cached |
integer |
Cached memory on this host (kB) |
host |
mem_free |
integer |
Free memory on this host (kB) |
host |
mem_free_pct |
float |
Free memory on this host as percentage of total |
host |
mem_total |
integer |
Total memory on this host (kB) |
host:pid |
data_size_bytes |
integer |
The size of data+stack memory region in this process (bytes) |
host:pid |
dirty_size_bytes |
integer |
The size of dirty pages used in this process (bytes) |
host:pid |
library_size_bytes |
integer |
The size of library memory region in this process (bytes) |
host:pid |
program_size_bytes |
integer |
The total program size (bytes) |
host:pid |
program_size_pct |
float |
The size of this process as a percentage of total system memory |
host:pid |
resident_size_bytes |
integer |
The size of resident memory consumed by this process (bytes) |
host:pid |
resident_size_pct |
float |
The size of this process’ resident memory as a percentage of total system memory |
host:pid |
shared_size_bytes |
integer |
The size of all shared pages used by this process (bytes) |
host:pid |
text_size_bytes |
integer |
The size of code memory region in this process (bytes) |
session_id:host |
total_resident_size_pct |
float |
Total resident memory percentage of all processes running a certain query on a host |
Vmem
To use the vmem metrics in Workload Manager, you must first run the gp_session_state.sql
script included with Greenplum Database on the postgres database. This is a one-time task.
The script creates the view session_level_memory_consumption
in the database. See Viewing Session Memory Usage Information in the Greenplum Database Administrator Guide for information about this view and the gp_session_state.sql
script.
Execute the script with the following command:
psql -d postgres -f $GPHOME/share/postgresql/contrib/gp_session_state.sql
Following are recommended configuration adjustments when using vmem metrics. Enter the commands at the gp-wlm
command-line:
config modify gpdb_stats publish_frequency = 0.75
config modify gpdb_stats collect_frequency = 0.5
config modify rulesengine engine:rule_frequency = 0.5
Scope |
Metric |
Data type |
Description |
host:segment_id |
total_vmem_size_mb |
integer |
Total vmem usage for this Greenplum segment in megabytes |
host:segment_id |
total_vmem_size_pct |
float |
Total vmem usage for this Greenplum segment as a percentage of total |
session_id:host:segment_id |
vmem_size_mb |
integer |
Total vmem used by the session on this segment |
session_id:host:segment_id |
vmem_size_pct |
float |
The percentage of this segment’s gp_vmem_protect_limit consumed by this session |
Spill
Scope |
Metric |
Data type |
Description |
session_id:host:pid |
spillfile_count_across_cluster |
integer |
Total number of spill files (work files) created for this query across the cluster |
session_id:host:pid |
spillfile_size_across_cluster |
integer |
Total size of spill files (work files) created for this query across the cluster, in bytes |
I/O
Scope |
Metric |
Data type |
Description |
datid |
blks_hit |
integer |
Number of times disk blocks were found already in the PostgreSQL buffer cache |
datid |
blks_read |
integer |
Number of disk blocks read in this database |
host:pid |
disk_read_bytes |
integer |
Total number of bytes read from disk by this process |
host:pid |
disk_read_bytes_per_sec |
float |
The number of bytes read from disk per second by this process |
host:pid |
disk_write_bytes |
integer |
Total number of bytes written to disk by this process |
host:pid |
disk_write_bytes_per_sec |
float |
The number of bytes written to disk per second by this process |
host:pid |
read_bytes |
integer |
Total number of bytes (disk, network, IPC) read by this process |
host:pid |
read_bytes_per_sec |
float |
The number of bytes read per second (disk + net + IPC) by this process |
host:pid |
reads |
integer |
Total number of read system calls made by this process |
host:pid |
reads_per_sec |
float |
The number of total read(2) calls per second by this process |
host:pid |
write_bytes |
integer |
Total number of bytes (disk, network, IPC) written by this process |
host:pid |
write_bytes_per_sec |
float |
The number of bytes written per second (disk + net + IPC) by this process |
host:pid |
writes |
integer |
Total number of write system calls made by this process |
host:pid |
writes_per_sec |
float |
The number of total write(2) calls per second by this process |
session_id:host |
total_disk_read_bytes_per_sec |
integer |
Total disk read bytes-per-second of all processes running a certain query on a host |
session_id:host |
total_disk_write_bytes_per_sec |
integer |
Total disk write bytes-per-second of all processes running a certain query on a host |
Skew
Scope |
Metric |
Data type |
Description |
session_id |
disk_read_bytes_per_sec_skew |
float |
Disk read skew across the cluster. Calculated as the cubed standard deviation of session_id:host:total_disk_read_bytes_per_sec from all hosts running a certain query |
session_id |
disk_write_bytes_per_sec_skew |
float |
Disk write skew across the cluster. Calculated as the cubed standard deviation of session_id:host:total_disk_write_bytes_per_sec from all hosts running a certain query |
session_id |
resident_size_pct_skew |
float |
Resident memory utilization skew across the cluster. Calculated as the cubed standard deviation of session_id:host:total_resident_size_pct from all hosts running a certain query |