Workload Management

Defining Resource Groups and Resource Attributes

Command Center allows you to view resource groups that have been created in Greenplum Database, to add or delete resource groups, and to edit the resource group attributes Concurrency, CPU %, and Memory %.

To change values of the MEMORY_AUDITOR, CPUSET, MEMORY_SHARED_QUOTA, or MEMORY_SPILL_RATIO resource group attributes, use the ALTER RESOURCE GROUP SQL statement.

Resource Group Editor

  1. Click EDIT to open the Resource Group editor.

  2. To delete a resource group, select the resource group, and click the minus sign that appears at the right.

    You cannot delete the default_group or admin_group resource groups. You cannot delete a resource group that is assigned to any Greenplum Database role.

  3. To add a resource group, click ADD RESOURCE GROUP and enter a name for the resource group in the Name column. Resource group names must be unique and are case-sensitive.

  4. Adjust the values of the Concurrency, CPU %, and Memory % resource group attributes.
     
    Concurrency

    The maximum number of concurrent transactions, including active and idle transactions, that are permitted in the resource group. Concurrency sets the CONCURRENCY attribute of the resource group. The total of the Concurrency columns cannot exceed the value of the Greenplum Database max_connections master server configuration parameter.

    CPU %
    The percentage of CPU resources available to this resource group. The percentage is the portion of the total CPU percentage allocated for all resource groups (reserved CPUs excluded), which is set with the gp_resource_group_cpu_limit server configuration parameter. CPU % sets the CPU_RATE_LIMIT attribute of the resource group.
     
    Memory %
    The percentage of memory resources available to this resource group. The percentage is the portion of the total memory allocated for all resource groups, which is set with the gp_resource_group_memory_limit Greenplum Database configuration parameter. Memory % sets the MEMORY_LIMIT attribute of the resource group.
     
    The totals of the CPU % and Memory % columns must not exceed 100%. You should not allow the total of the CPU % column to exceed 90%, because this could cause resource-intensive queries to consume nearly all CPU, starving other Greenplum Database processes. If the total of the Memory % column is less than 100%, the unreserved memory is part of the resource group shared global memory pool. See “Global Shared Memory” in Using Resource Groups in the Greenplum Database Administrator Guide for information about the global resource group shared memory pool.

  5. Click Apply to save your changes or click Cancel to abandon your changes.

Assigning Resource Groups to Roles

Every Greenplum Database role is assigned to a single resource group in the pg_roles system table. Transactions executed by a role are managed by its assigned resource group, unless you create an assignment rule to override the default.

You can view the current resource group assignments for all roles and change a role’s resource group by adding it to a different resource group.

Role Assignment Editor

To move a role to a different resource group:

  1. Enter all or part of the role name in the add role field beneath the new resource group.

  2. Choose the role from the list that is displayed and press Enter.

The change is immediately applied to the Greenplum Database pg_roles system table.

Editing Rules for Resource Groups

In future Greenplum Command Center releases, you will be able to use the Workload Management user interface to interactively define assignment rules, and idle session kill rules. In the current release, you must edit the JSON document the workload management database extension evaluates.

JSON Editor

Greenplum Command Center saves the JSON document in a Greenplum Database table, gpcc.gp_wlm_config, in the postgres database. When you save your changes, Command Center updates the document in the database. The size of the JSON document must not exceed 1MB.

Warning! Never manually modify the gpcc.gp_wlm_config table.

When you first enable Workload Management in Command Center, a sample JSON document is displayed. The following example shows rules for the two default Greenplum Database resource groups, admin_group and default_group.

{
  "version": 1,
  "assignmentRules": [
    {
      "resourceGroupName": "admin_group",
      "roleName": "optionalRoleToFilterWith",
      "queryTags": "exampleKey1=exampleValue1;exampleKey2=exampleValue2",
      "disabled": true
    },
    {
      "resourceGroupName": "default_group",
      "queryTags": "exampleKey1=exampleValue1;exampleKey2=exampleValue2",
      "disabled": true
    }
  ],
  "idleSessionKillRules": {
    "admin_group": 
      {
        "timeoutSeconds": 7200,
        "exemptedRoles": "gp.*",
        "message": "Session killed by admin_group kill rule."
      },
    "default_group": 
      {
        "timeoutSeconds": 7200,
        "message": "Session killed by default_group kill rule."
      }
    }
}

JSON Format for Rules

This section describes the JSON object that stores the resource group assignment and idle session kill rules. The object has three members:

  • a version key/value pair
  • an assignmentRules array containing one element for each assignment rule
  • an idleSessionKillRules list containing one key/value pair for each resource group

version pair

version
The version value is an integer. It is reserved for future use. It can be set to 1.

assignmentRules array

assignmentRules
The assignmentRules array has one element for each assignment rule. Each element maps the rule to a resource group in Greenplum Database and defines the conditions that assign a transaction to that resource group.

Greenplum Command Center evaluates each rule in the assignmentRules array from top to bottom and stops at the first match, so the order is important.

The elements in the assignmentRules array can have the following key/value pairs.

resourceGroupName
The name of the Greenplum Database resource group. The resource group must already exist or have been created in Greenplum Database with the CREATE RESOURCE GROUP SQL statement.

roleName
The name of a Greenplum Database role to match against the current role identifier in the Greenplum Database session. The current role is initially the database role that authenticated with the Greenplum Database system to begin the session. A user with sufficient privileges can change the current role in the database session using the SET ROLE SQL command.

If no role is specified in the assignment rule and the query tags match, the transaction is assigned to the specified resource group. If the roleName value is present, however, the current database user must match the specified role.

queryTags
A list of query tags to match against the gpcc.query_tags parameter in the Greenplum Database session. A query tag is a user-defined <name>=<value> pair. Separate multiple query tags with semicolons. For example, the following statement, executed in the Greenplum Database session, sets the appName and appUser query tags to “tableau” and “bi_sales”, respectively.

=# SET gpcc.query_tags TO 'appName=tableau;appUser=bi_sales';

To match, all tags in the assignment rule’s queryTags value must be present in the gpcc.query_tags parameter in the database session. The order of the tags is not significant, and the gpcc.query_tags parameter can be a superset of the tags defined in the queryTags value.

If the queryTags value is empty, or the parameter omitted, the rule will match every session for the roleName database role.

If you set the gpcc.query_tags parameter inside of a transaction, you must commit the transaction before the new query tags are used to evaluate assignment rules.

The gpcc.query_tags parameter can be specified using the SET command, as above, or as a connection parameter with database clients that support it, such as psql. Here are two examples that show how to specify query tags on the psql command line:

$ PGOPTIONS="-c gpcc.query_tags='appName=tableau;appUser=bi_sales'" psql

$ psql postgresql://mdw:5432/postgres?options="-c gpcc.query_tags%3D'appName%3Dtableau;appUser%3Dbi_sales'"

In the second example, it is necessary to code the equals signs as %3D to prevent psql from interpreting the query tags as command-line arguments.

disabled
If set to true, the assignment rule is ignored when Command Center evaluates rules to assign transactions to workloads. This parameter is optional and its default value is "false".

idleSessionKillRules list

The value of the idleSessionKillRules object is a list of key/value pairs, where the key is the name of a Greenplum Database resource group. Each resource group should only appear as a key in the list once. The value is an object that contains the following key/value pairs.

timeoutSeconds
The number of seconds after a session becomes idle before the session is killed. The value should generally be large enough to allow interactive users to enter SQL commands. A value of 600, for example, kills the session after 10 minutes in the idle state.

exemptedRoles
A list of Greenplum Database roles to exempt from the idle session kill rule. Sessions belonging to any role in this list will not be killed. Posix regular expressions may be used to match role names.

message
The text to include in the session termination message. The default message, "Session killed due to exceeding idle session time limit", is included in the error message if you omit the message parameter from the rule or if you specify the empty string ("") for the message.

Rule Examples

Simple Assignment Rule Example

This example demonstrates how the Greenplum Database workload management extension assignment works to set the resource group for a transaction.

With the default resource groups, admin_group and default_group, any transaction the gpmon user executes is assigned to the admin_group, since gpmon is a superuser. This example enables the gpmon role to set query tags to temporarily execute transactions in the default_group resource group.

  1. Log in to the gpperfmon database with the gpmon role.

    gpperfmon=# SELECT rsgname FROM pg_stat_activity
    WHERE procpid = pg_backend_pid();
       rsgname
    -------------
     admin_group
    (1 row)
    

    If the result is empty, resource groups are not properly enabled. See Troubleshooting Workload Management for help getting resource groups set up.

  2. On the Command Center Admin> Workload Mgmt view, add the following assignment rule to the Assignment Rules JSON and click the Update button.

    {
      "resourceGroupName": "default_group",
      "roleName": "gpmon",
      "queryTags": "transType=default;app=psql"
    }
    
  3. In the gpperfmon database session, set the gpcc.query_tags parameter to transType=default and check the resource group.

    gpperfmon=# SET gpcc.query_tags TO 'transType=default';
    gpperfmon=# SELECT rsgname FROM pg_stat_activity WHERE procpid = pg_backend_pid();
       rsgname
    -------------
     admin_group
    (1 row)    
    

    The resource group is still admin_group because the query tag does not match all of the tags defined in the assignment rule.

  4. Set the gpcc.query_tags parameter to app=psql;reason=testing;transType=default and check the resource group.

    gpperfmon=# SET gpcc.query_tags TO 'app=psql;reason=testing;transType=default';
    gpperfmon=# SELECT rsgname FROM pg_stat_activity WHERE procpid = pg_backend_pid();
        rsgname
    ---------------
     default_group
    (1 row)
    

    Subsequent transactions in this session are assigned to the default_group resource group. Notice that you can set the query tags in any order and that tags not specified in the assignment rule queryTags value are ignored.

Benchmark Assignment Rule Example

This example tests two scenarios using a TPC-H benchmark test. (See the TPC web site for information about TPC benchmarks and source code.)

In scenario one, two roles (tpch_1 and tpch_4) execute a parallel workload in separate resource groups configured identically. In scenario two, both roles execute the workload in the same resource group with memory, cpu, and concurrency limits doubled.

Four assignment rules defined in the Command Center Admin> Workload Mgmt view assign transactions to resource groups for the two scenarios.

{
  "version": 1,
  "assignmentRules": [
    {
      "resourceGroupName": "tpch_group1",
      "roleName": "tpch_1",
      "queryTags": "scenario=one",
      "disabled": false
    },
    {
      "resourceGroupName": "tpch_group2",
      "roleName": "tpch_4",
      "queryTags": "scenario=one",
      "disabled": false
    },
    {
      "resourceGroupName": "tpch_group1",
      "roleName": "tpch_1",
      "queryTags": "scenario=two",
      "disabled": false
    },
    {
      "resourceGroupName": "tpch_group1",
      "roleName": "tpch_4",
      "queryTags": "scenario=two"
    }
  ]
}

The first two rules, with query tag "scenario=one", assign transactions executed by role tpch_1 to the tpch_group1 resource group and transactions executed by role tpch_4 to the tpch_group2 resource group. The last two rules, with query tag "scenario=two", assign transactions from both roles tpch_1 and tpch_4 to the tpch_group1 resource group.

Here is the resource group configuration for scenario one.

=# CREATE RESOURCE GROUP tpch_group1 with (CPU_RATE_LIMIT=25, MEMORY_LIMIT=20, CONCURRENCY=20);
=# CREATE RESOURCE GROUP tpch_group2 with (CPU_RATE_LIMIT=25, MEMORY_LIMIT=20, CONCURRENCY=20);
=# SELECT groupname, cpu_rate_limit, memory_limit, concurrency FROM 
    gp_toolkit.gp_resgroup_config;
   groupname   | cpu_rate_limit | memory_limit | concurrency
---------------+----------------+--------------+-------------
 default_group | 25             | 30           | 20
 admin_group   | 10             | 10           | 10
 tpch_group1   | 25             | 20           | 20
 tpch_group2   | 25             | 20           | 20
(4 rows)

Setting query tags for the transactions requires setting the gpcc.query_tags Greenplum Database session variable. For scenario one the gpcc.query_tags session parameter should be set as follows:

=# SET gpcc.query_tags to 'scenario=one';

Given the assignment rules we created, if the gpcc.query_tags parameter is not set, or is not set to 'scenario=one' or 'scenario=two' all transactions are assigned to the role’s default resource group, which is default_group if you have not explicitly set it.

If the client application allows defining session parameters on the connection string, you could define the gpcc.query_tags parameter on the connection request.

This example submits queries in SQL script files using the psql client utility, so the gpcc.query_tags parameter must be defined whenever psql starts. The psql command-line does not have an option to set arbitrary parameters, so we arrange for the SQL script to read the query tags from an OS environment variable.

  1. Set an environment variable to the content you want to define for gpcc.query_tags.

    $ export GPCC_QUERY_TAGS='scenario=one'
    
  2. Set the gpcc.query_tags session variable in the SQL script by inserting these lines before the start of the transaction.

    \set qtags '\'' `echo "$GPCC_QUERY_TAGS"` '\''
    SET gpcc.query_tags to :qtags;
    

When you execute the benchmark test, you can watch the Query Monitor to see that transactions are assigned to workload tpch_group1 or tpch_group2 according to the assignment rules.

Scenario One Query Monitor

This is the resource group configuration for scenario two.

=# DROP RESOURCE GROUP tpch_group2;
=# ALTER RESOURCE GROUP tpch_group1 SET CPU_RATE_LIMIT 50;
=# ALTER RESOURCE GROUP tpch_group1 SET MEMORY_LIMIT 40;
=# ALTER RESOURCE GROUP tpch_group1 SET concurrency 40;
=# SELECT groupname, cpu_rate_limit, memory_limit, concurrency
    FROM gp_toolkit.gp_resgroup_config;
   groupname   | cpu_rate_limit | memory_limit | concurrency
---------------+----------------+--------------+-------------
 default_group | 25             | 30           | 20
 admin_group   | 10             | 10           | 10
 tpch_group1   | 50             | 40           | 40
(3 rows)

Set the GPCC_QUERY_TAGS environment variable for scenario two.

export GPCC_QUERY_TAGS='scenario=two'

Run the benchmark test again and compare the workload assignments on the Query Monitor. All transactions should be executed in the tpch_group1 resource group.

Scenario Two Query Monitor

Accessing Workload Assignment Rules with Database Functions

The Greenplum Database workload management extension (gp_wlm) creates two user-defined functions in the postgres database to retrieve or set the workload assignment rules. The Greenplum Command Center Workload Management user interface calls these functions to display the current JSON string and to save your edits.

You can call these functions to programmatically query or configure workload management rules.

Warning! The gpcc.workload_config table should only be accessed by using the gpcc.get_workload_config() and gpcc.set_workload_config() functions or the Command Center user interface. Do not drop the table while the workload management extension is enabled.

The gpcc.set_workload_config() function requires valid JSON syntax, but does not validate the workload management rules. You must ensure that the JSON value contains a version variable, and correctly specified assignment and idle session kill rules.

gpcc.get_workload_config()
Retrieves the current workload assignment rules as a JSON value.

Example

postgres=# SELECT gpcc.get_workload_config();
                  get_workload_config
--------------------------------------------------------
 {
    "version": 1,
    "assignmentRules": [
      {
        "resourceGroupName": "default_group",
        "roleName": "gpadmin",
        "queryTags": "appName=tableau;appUser=bi_sales"
      },
      {
        "resourceGroupName": "admin_group",
        "roleName": "gpadmin",
        "queryTags": "appName=tableau;appUser=bi_acct",
        "disabled": true
      }
    ],
    "idleSessionKillRules": {
      "admin_group": {
        "timeoutSeconds": 600
      },
      "default_group": {
        "timeoutSeconds": 3600
      }
    }
  }

(1 row)

gpcc.set_workload_config()
Sets the workload assignment rules. The argument is a valid JSON value containing the assignment rules an. See JSON Parameters for descriptions of the parameters.

The gpcc.set_workload_config() function accepts any valid JSON value. You must ensure that the value contains a version element and properly specified assignment and idle session kill rules.

If you call gpcc.set_workload_config() within a transaction, you must commit the transaction before the workload managment extension applies the new rules.

Example

postgres=# SELECT gpcc.set_workload_config(
'
  { "version": 1, 
  "assignmentRules": 
     [ 
       {
         "resourceGroupName": "default_group",
         "roleName": "gpadmin",
         "queryTags": "appName=tableau;appUser=bi_sales"
       },
       {
         "resourceGroupName": "admin_group",
         "roleName": "gpadmin",
         "queryTags": "appName=tableau;appUser=bi_acct",
         "disabled": true
      }
    ],
    "idleSessionKillRules": 
    { 
      "admin_group": 
        { "timeoutSeconds": 600 },
      "default_group": 
        { "timeoutSeconds": 3600 } 
    }
  }
' 
);
 set_workload_config
---------------------
 t
(1 row)