About Pivotal Greenplum Command Center
Pivotal Greenplum Command Center is a management tool for the Pivotal Greenplum Database Big Data Platform. This topic introduces key concepts about Greenplum Command Center and its components.
Greenplum Command Center monitors system performance metrics, analyzes cluster health, and enables database administrators to perform management tasks in a Greenplum Database environment.
Greenplum Command Center provides a browser-native HTML5 graphical console for viewing Greenplum Database system metrics and performing certain database administrative tasks. The Command Center application provides the following functionality:
- Interactive overview of real-time system metrics. Drill down to see details for individual cluster hosts and segments.
- Detailed real-time statistics for the cluster and by server.
- Query Monitor view lists queries executing, waiting to execute, and blocked by locks held by other queries.
- Query Detail view shows query metrics, query text, and the execution plan for the query.
- Workload view allows administrators to create and manage workloads to manage concurrency and allocate CPU and memory resources. Create assignment criteria to assign transactions to workloads.
- Four permission levels allow users to view or cancel their own or others’ queries, and to view or manage administrative information.
- Cluster Metrics view shows synchronized charts of historical system metrics.
- History view lists completed queries and system metrics plotted over a selected time period.
- Permissions view to see or manage Command Center permission levels.
- Authentication view to see or edit the pg_hba.conf host-based authentication configuration file.
- Segment Status view with summaries and details by segment.
- Storage Status view with summaries and details by segment data directory.
The following figure illustrates the Greenplum Command Center architecture.
The Greenplum Command Center web server and backend application can run on the master, standby master, or any segment host in the Greenplum Database cluster. The standby master host is recommended. The web server, gpccws, is a custom HTTP server designed for Command Center. The web application is an HTML5 and Go language application.
The Command Center web server authenticates users with the Greenplum Database authentication system. Administrators can edit the Greenplum Database host-based authentication file, pg_hba.conf, in the Command Center Console. Command Center can also be configured to authenticate users in a Kerberos environment.
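As a rough illustration of what the host-based authentication file contains, the following Python sketch splits a single pg_hba.conf record into named fields. It assumes the common single-address form (type, database, user, address, method) and the local form, which has no address column. The function name, the role name, and the sample record are hypothetical, and records that use a separate netmask column or quoted values are not handled.

```python
def parse_hba_record(line):
    """Split one pg_hba.conf record into a dict of named fields.

    Returns None for blank lines and comments.
    """
    # Drop trailing comments and surrounding whitespace.
    text = line.split("#", 1)[0].strip()
    if not text:
        return None
    fields = text.split()
    if fields[0] == "local":
        # local (Unix socket) records have no address column.
        names = ["type", "database", "user", "method"]
    else:
        names = ["type", "database", "user", "address", "method"]
    if len(fields) < len(names):
        raise ValueError("incomplete pg_hba.conf record: %r" % line)
    record = dict(zip(names, fields))
    # Anything after the method is treated as authentication options.
    record["options"] = fields[len(names):]
    return record


# Hypothetical record: md5-authenticated connections from one subnet.
sample = "host  all  report_user  192.0.2.0/24  md5"
print(parse_hba_record(sample))
```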
Command Center defines four user authorization levels to manage users’ access to the Query Monitor, and to administrative information and operations. User authorization is managed in the Administrative area of the Command Center user interface.
Greenplum Command Center displays information derived from several sources:
- Greenplum Database performance monitoring database (gpperfmon)
- Operating system process accounting
- Greenplum Database system catalog tables
- Real-time query metrics collection extension
- Workload management extension
Greenplum Database is instrumented to enable capturing performance metrics and tracking query execution. The performance monitoring database and the query metrics collection extension deploy agents—processes running on each host to collect metrics. The gpperfmon agents forward collected data to an agent on the Greenplum Database master. The real-time query metrics agents submit collected data directly to the Command Center rpc port. The agents also collect data from the host operating system so that query performance can be correlated with CPU and memory utilization and disk space can be monitored in Command Center.
The gpperfmon performance monitoring database stores current and historical query status and system information collected from agents running on the master and segment hosts. Greenplum Command Center uses gpperfmon for historical data only; it uses the real-time query metrics to monitor active and queued queries. Greenplum Database sends UDP packets at various points during query execution. The gpsmon process on each segment host collects the data. Periodically, every 15 seconds by default, a gpmmon agent on the master host signals the gpsmon process to forward the collected data. The agent on the master host receives the data and adds it to the gpperfmon database.
The gpperfmon database consists of three sets of tables:
- now tables store data on current system metrics such as active queries
- history tables store data on historical metrics
- tail tables are for data in transition. Tail tables are for internal use only and should not be queried by users.
The now and tail data are stored as text files on the master host file system, and the gpperfmon database accesses them through external tables. The history tables are regular database tables stored within the gpperfmon database.
You can run SQL queries on the data stored in the gpperfmon database. Greenplum Command Center runs queries on the database for information presented in the Command Center Console. The Greenplum Database Reference Guide contains references for the tables in the gpperfmon database.
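Because the gpperfmon data is exposed as ordinary tables, it can be queried from any SQL client. The sketch below builds a parameterized query for one of the now or history tables without executing it. It assumes the tables follow a name_now / name_history / name_tail naming pattern and carry a ctime timestamp column. The helper name build_metrics_query and the "system" table family used in the example are assumptions to check against the Greenplum Database Reference Guide.

```python
from datetime import datetime

# Table sets described above; tail tables are internal and deliberately excluded.
QUERYABLE_SETS = ("now", "history")


def build_metrics_query(family, table_set, start=None, end=None):
    """Return (sql, params) for a gpperfmon table such as system_history.

    The %s placeholders follow the DB-API "format" paramstyle used by
    drivers such as psycopg2; nothing is executed here.
    """
    if table_set not in QUERYABLE_SETS:
        raise ValueError("only now and history tables should be queried")
    if not family.isidentifier():
        raise ValueError("unexpected table family: %r" % family)
    sql = "SELECT * FROM %s_%s" % (family, table_set)
    clauses = []
    params = []
    if start is not None:
        clauses.append("ctime >= %s")  # assumed timestamp column
        params.append(start)
    if end is not None:
        clauses.append("ctime < %s")
        params.append(end)
    if clauses:
        sql += " WHERE " + " AND ".join(clauses)
    return sql + " ORDER BY ctime", params


sql, params = build_metrics_query(
    "system", "history", datetime(2024, 1, 1), datetime(2024, 1, 2)
)
print(sql)
print(params)
```

The returned pair can be passed to a DB-API cursor's execute method on a connection to the gpperfmon database.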
Greenplum Database provides a management utility, gpperfmon_install, to create the gpperfmon database and enable the gpperfmon agents on the master and segment hosts. Creating the gpperfmon database is a prerequisite for installing Greenplum Command Center. See the Greenplum Database Utility Guide for details of running the gpperfmon_install management utility.
The data collected by real-time query metrics collection is more detailed and more current than statistics recorded in the gpperfmon database. Command Center users can observe queries as they execute and, with sufficient permissions, cancel problem queries to allow other queries to complete.
The Greenplum Database query metrics extension and the metrics collection agent work together to collect real-time metrics and update the Command Center application.
Greenplum Database calls the query metrics extension when a query is first submitted, when a query’s status changes, and when a node in the query execution plan initializes, starts, or finishes. The query metrics extension sends metrics to the metrics collection agent running on each segment host. The extension also collects information about the locks queries hold so that you can see which queries hold locks that block other queries. The agent posts the metrics to the Greenplum Command Center rpc port.
The metrics_collection extension is included with Pivotal Greenplum Database. The extension is enabled by setting the gp_enable_query_metrics server configuration parameter to on and restarting the Greenplum Database cluster. The metrics collection agent is installed on each host when you install Greenplum Command Center. The Command Center application monitors the agent and restarts it if needed.
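The two steps described above, setting the parameter and restarting the cluster, could be scripted along the following lines. This is a sketch only: it assumes the standard Greenplum gpconfig and gpstop utilities are on the PATH of the administrative user, and the function names are hypothetical. The commands are only printed unless run=True is passed.

```python
import subprocess


def enable_query_metrics_commands():
    """Return the commands that set gp_enable_query_metrics and restart."""
    return [
        # Record the new parameter value for the cluster.
        ["gpconfig", "-c", "gp_enable_query_metrics", "-v", "on"],
        # Restart the cluster without prompting so the setting takes effect.
        ["gpstop", "-a", "-r"],
    ]


def enable_query_metrics(run=False):
    """Print each command; execute it as well when run is True."""
    for cmd in enable_query_metrics_commands():
        print(" ".join(cmd))
        if run:
            subprocess.run(cmd, check=True)


enable_query_metrics()  # dry run: prints the commands only
```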
Workloads set concurrency, memory, and CPU resource limits for the database transactions they manage. A Greenplum Command Center workload corresponds to a Greenplum Database resource group, but adds capabilities that are not available with resource groups.
Command Center allows administrators greater flexibility in assigning transactions to workloads. Every Greenplum Database role is assigned to a single resource group and, by default, transactions are managed by the role’s resource group. With Command Center workload management, administrators can define criteria to assign transactions to workloads based on attributes other than the role submitting the transaction. Currently, assignment criteria can evaluate query tags and roles in combination with query tags.
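To make the idea of assignment criteria concrete, here is a small Python model of how a transaction might be matched to a workload. It illustrates the concept only and is not Command Center's implementation: the rule structure, the first-match ordering, and every role and workload name other than "etl" are assumptions.

```python
def pick_workload(role, tags, rules, role_default):
    """Return the workload for a transaction.

    rules is an ordered list of dicts with a "tags" mapping, an optional
    "role", and the target "workload". The first rule whose tags (and role,
    if given) all match wins; otherwise the role's default group applies.
    """
    for rule in rules:
        if "role" in rule and rule["role"] != role:
            continue
        if all(tags.get(k) == v for k, v in rule["tags"].items()):
            return rule["workload"]
    return role_default[role]


# Hypothetical rules: one tag-only criterion, one role-plus-tag criterion.
rules = [
    {"tags": {"xact-type": "etl"}, "workload": "etl"},
    {"role": "analyst", "tags": {"priority": "high"}, "workload": "fast_lane"},
]
role_default = {"analyst": "default_group", "loader": "default_group"}

print(pick_workload("loader", {"xact-type": "etl"}, rules, role_default))
print(pick_workload("analyst", {}, rules, role_default))
```

In this model a transaction with no matching tags falls back to the resource group of its role, which mirrors the default behavior described above.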
A query tag is a key-value pair defined in the gpcc.query_tags parameter of a database session. The parameter has the format <tag1>=<value1>;<tag2>=<value2>, where the tags and values are user-defined. For example, if you want to run ETL operations in a workload named “etl”, you could define a tag named “xact-type” and set it to “etl”: xact-type=etl
The gpcc.query_tags parameter can be set as a connection parameter on Greenplum Database clients that allow it, or with a SET command inside the session after the connection has been established, for example: SET gpcc.query_tags TO 'xact-type=etl';
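The tag string format described above is simple enough to build and parse with a few lines of Python. The helper names below are hypothetical, and the sketch assumes that tag names and values contain no equals signs, semicolons, or single quotes.

```python
def format_query_tags(tags):
    """Render a dict as <tag1>=<value1>;<tag2>=<value2>."""
    for key, value in tags.items():
        if any(ch in key + value for ch in "=;'"):
            raise ValueError("unsupported character in tag %r" % key)
    return ";".join("%s=%s" % (k, v) for k, v in tags.items())


def parse_query_tags(text):
    """Parse a gpcc.query_tags value back into a dict."""
    tags = {}
    for pair in filter(None, text.split(";")):
        key, _, value = pair.partition("=")
        tags[key.strip()] = value.strip()
    return tags


def set_statement(tags):
    """Build a SET command for use inside an established session."""
    return "SET gpcc.query_tags TO '%s';" % format_query_tags(tags)


print(set_statement({"xact-type": "etl"}))
```

Rejecting the separator and quote characters up front keeps the generated SET command well formed without needing an escaping scheme.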
The gp_wlm extension in Pivotal Greenplum Database provides support for Command Center workloads. The extension is included with Pivotal Greenplum Database, but is not enabled by default. Initially, Greenplum Database uses resource queues to manage resources. Using Command Center workloads requires enabling resource groups in Greenplum Database. Resource groups are based on the Linux control groups (cgroups) service, which must first be enabled in the operating system.
See Enabling Workload Management in Greenplum Command Center for the steps to follow to enable Linux cgroups, Greenplum Database resource groups, and Command Center workloads.