Iterator Attributes

The tables below list additional metrics for all of the possible iterators in a query. These metrics display in the circled area of the Query Plan details shown below. The values in the [estimated] field(s) are the anticipated values that the query planner expected the iterator to generate during query execution.

Metric Terminology

The following information explains some of the database terms and concepts that appear in iterator metrics in Command Center:

Node
Refers to a step in a query plan. A query plan has sets of operations that Greenplum Database performs to produce the answer to a given query. A node in the plan represents a specific database operation, such as a table scan, join, aggregation, or sort.

Iterator
Represents the actual execution of the node in a query plan. Node and iterator are sometimes used interchangeably.

Tuple
Refers to a row returned as part of a result set from a query, as well as a record in a table.

Spill
When there is not enough memory to perform a database operation, data must be written (or spilled) to disk.

Passes

Occur when an iterator must scan (or pass) over spilled data to obtain a result. A pass represents one pass through all input tuples, or all data in batch files generated after spill, which happens hierarchically. In the first pass, all input tuples are read, and intermediate results are spilled to a specified number of batch files. In the second pass, the data in all batch files is processed. If the results are still too large to store in memory, the intermediate results are spilled to the second level of spill files, and the process repeats again.

Batches
Refers to the actual files created when data is spilled to disk. This is most often associated to Hash operations.

Join
This clause in a query joins two or more tables. There are three types of Join algorithms in Greenplum Database:

  • Hash Join
  • Merge Join
  • Nested Loop

Each of these operations include their own respective Join semantics. The Performance Monitor Console displays iterator metrics for each of these semantics.

Append

An Append iterator has two or more input sets. Append returns all rows from the first input set, then all rows from the second input set, and so on, until all rows from all input sets are processed. Append is also used when you select from a table involved in an inheritance hierarchy.

Metric Unit Description
Rows in Rows The number of tuples received by the iterator.
Append Current Input Source Inputs The number of the current table being scanned.

Append Current Input Source

(Inputs) The number of the current table being scanned.

Metric Unit Description
Rows in Rows The number of tuples received by the iterator.
Append Current Input Source Inputs The number of the current table being scanned.

Append-only Scan

This iterator scans append-only type-tables.

Metric Unit Description
Rows in Rows in The number of tuples received by the iterator.
Append-only Scan Rescan Rescans The number of append-only rescans by this iterator.

Append-only Columnar Scan

This iterator scans append-only columnar-type tables.

Metric Unit Description
Rows in Rows The number of tuples received by the iterator.
Append-Only Columnar Scan Rescan Rescans The number of append-only columnar rescans by this iterator.

Aggregate

The query planner produces an aggregate iterator whenever the query includes an aggregate function. For example, the following functions are aggregate functions: AVG(), COUNT(), MAX(), MIN(), STDDEV(), SUM(), and VARIANCE().

Aggregate reads all the rows in the input set and computes the aggregate values. If the input set is not grouped, Aggregate produces a single result row.

Metric Unit Description
Rows in Rows The number of tuples received by the iterator.
Aggregate Total Spill Tuples Tuples The number of tuples spilled to disk
Aggregate Total Spill Batches Batches The number of spill batches required.
Aggregate Total Spill Pass Passes The number of passes across all of the batches.
Aggregate Current Spill Pass Read Bytes Bytes The number of tuples read in for this spill batch.
Aggregate Current Spill Pass Tuples Tuples The number of tuples that are in each spill file in the current pass.
Aggregate Current Spill Pass Bytes Bytes The number of bytes that are in each spill file in the current pass.
Aggregate Current Spill Pass Batches Batches The number of batches created in the current pass.

BitmapAnd

This iterator takes the bitmaps generated from multiple BitmapIndexScan iterators, puts them together with an ANDclause, and generates a new bitmap as its output.

Metric Unit Description
Rows in Rows The number of tuples received by the iterator.

BitmapOr

This iterator takes the bitmaps generated from multiple BitmapIndexScan iterators, puts them together with an OR clause, and generates a new bitmap as its output.

Metric Unit Description
Rows in Rows The number of tuples received by the iterator.

Bitmap Append-Only Scan

This iterator retrieves all rows from the bitmap generated by BitmapAnd, BitmapOr, or BitmapIndexScan and accesses the append-only table to retrieve the relevant rows.

Metric Unit Description
Rows in Rows The number of tuples received by the iterator.
Bitmap Append-Only Scan Pages Pages The number of bitmap heap pages scanned.
Bitmap Append-Only Scan Rescan Rescans The number of bitmap heap page rescans by this iterator.

Bitmap Heap Scan

This iterator retrieves all rows from the bitmap generated by BitmapAnd, BitmapOr, or BitmapIndexScan and accesses the heap table to retrieve the relevant rows.

Metric Unit Description
Rows in Rows The number of tuples received by the iterator.
Bitmap Heap Scan Pages Pages The number of bitmap heap pages scanned.
Bitmap Heap Scan Rescan Rescans The number bitmap heap page rescans by this iterator.

Bitmap Index Scan

This iterator produces a bitmap that corresponds to the rules that satisfy the query plan.

Metric Unit Description
Rows in Rows The number of tuples received by the iterator.
Bitmap Index Scan Rescan Rescans The number bitmap index rescans by this iterator.

Broadcast Motion

This iterator sends an outbound tuple to all of its destinations.

Note that the Motion metrics for the iterator are useful when investigating potential networking issues in the Greenplum Database system. Typically, the “Ack Time” values should be very small (microseconds or milliseconds). However if the “Ack Time” values are one or more seconds (particularly the “Motion Min Ack Time” metric), then a network performance issue likely exists.

Also, if there are a large number of packets being dropped because of queue overflow, you can increase the value for the gp_interconnect_queue_depth system configuration parameter to improve performance. See the Greenplum Database Administrator Guide for more in formation about system configuration parameters.

Metric Unit Description
Rows in Rows The number of tuples received by the iterator.
Motion Bytes Sent Bytes The number of bytes sent by the iterator.
Motion Total Ack Time Microseconds The total amount of time that the iterator waited for an acknowledgement after sending a packet of data.
Motion Average Ack Time Microseconds The average amount of time that the iterator waited for an acknowledgement after sending a packet of data.
Motion Max Ack Time Microseconds The maximum amount of time that the iterator waited for an acknowledgement after sending a packet of data.
Motion Min Ack Time Microseconds The minimum amount of time that the iterator waited for an acknowledgement after sending a packet of data.
Motion Count Resent Packets The total number of packets that the iterator did not acknowledge when they first arrived in the queue.
Motion Max Resent Packets The maximum number of packets that the iterator did not acknowledge when they first arrived in the queue. This metric is applied on a per packet basis. For example, a value of “10” indicates that a particular packet did not get acknowledged by this iterator 10 times, and that this was the maximum for this iterator.
Motion Bytes Received Bytes The number of bytes received by the iterator.
Motion Count Dropped Packets The number of packets dropped by the iterator because of buffer overruns.

Explicit Redistribute Motion

The Explicit Redistribute iterator moves tuples to segments explicitly specified in the segment ID column of the tuples. This is differs from a Redistribute Motion iterator, where target segments are indirectly specified through hash expressions.

The Explicit Redistribute iterator is used when the query portion of a DML planned statement requires moving tuples across distributed tables.

Note that the Motion metrics for the iterator are useful when investigating potential networking issues in the Greenplum Database system. Typically, the “Ack Time” values should be very small (microseconds or milliseconds). However if the “Ack Time” values are one or more seconds (particularly the “Motion Min Ack Time” metric), then a network performance issue likely exists.

Also, if there are a large number of packets being dropped because of queue overflow, you can increase the value for the gp_interconnect_queue_depth system configuration parameter to improve performance. See the Greenplum Database Administrator Guide for more information about system configuration parameters.

The maximum amount of time that the iterator waited for an acknowledgement after sending a packet of data.
Metric Unit Description
Rows in Rows The number of tuples received by the iterator.
Motion Bytes Sent Bytes The number of bytes sent by the iterator.
Motion Total Ack Time Microseconds The total amount of time that the iterator waited for an acknowledgement after sending a packet of data.
Motion Average Ack Time Microseconds The average amount of time that the iterator waited for an acknowledgement after sending a packet of data.
Motion Max Ack Time Microseconds
Motion Min Ack Time Microseconds The minimum amount of time that the iterator waited for an acknowledgement after sending a packet of data.
Motion Count Resent Packets The total number of packets that the iterator did not acknowledge when they first arrived in the queue.
Motion Max Resent Packets The maximum number of packets that the iterator did not acknowledge when they first arrived in the queue. This metric is applied on a per packet basis. For example, a value of “10” indicates that a particular packet did not get acknowledged by this iterator 10 times, and that this was the maximum for this iterator.
Motion Bytes Received Bytes The number of bytes received by the iterator.
Motion Count Dropped Packets The number of packets dropped by the iterator because of buffer overruns.

External Scan

This iterator scans an external table.

Metric Unit Description
Rows in Rows The number of tuples received by the iterator.
External Scan Rescan Rescans The number of external table rescans by this iterator.

Function Scan

This iterator returns tuples produces by a function.

Metric Unit Description
Rows in Rows The number of tuples received by the iterator.

Gather Motion

This iterator gathers streams of tuples that are sent by “sending” motions. If a merge key is specified, it merges many streams into a single order-preserved stream.

Note that the Motion metrics for the iterator are useful when investigating potential networking issues in the Greenplum Database system. Typically, the “Ack Time” values should be very small (microseconds or milliseconds). However if the “Ack Time” values are one or more seconds (particularly the “Motion Min Ack Time” metric), then a network performance issue likely exists.

Also, if there are a large number of packets being dropped because of queue overflow, you can increase the value for the gp_interconnect_queue_depth system configuration parameter to improve performance. See the Greenplum Database Administrator Guide for more in formation about system configuration parameters.

Metric Unit Description
Rows in Rows The number of tuples received by the iterator.
Motion Bytes Sent Bytes The number of bytes sent by the iterator.
Motion Total Ack Time Microseconds The total amount of time that the iterator waited for an acknowledgement after sending a packet of data.
Motion Average Ack Time Microseconds The average amount of time that the iterator waited for an acknowledgement after sending a packet of data.
Motion Max Ack Time Microseconds The maximum amount of time that the iterator waited for an acknowledgement after sending a packet of data.
Motion Min Ack Time Microseconds The minimum amount of time that the iterator waited for an acknowledgement after sending a packet of data.
Motion Count Resent Packets The total number of packets that the iterator did not acknowledge when they first arrived in the queue.
Motion Max Resent Packets The maximum number of packets that the iterator did not acknowledge when they first arrived in the queue. This metric is applied on a per packet basis. For example, a value of “10” indicates that a particular packet did not get acknowledged by this iterator 10 times, and that this was the maximum for this iterator.
Motion Bytes Received Bytes The number of bytes received by the iterator.
Motion Count Dropped Packets The number of packets dropped by the iterator because of buffer overruns.

GroupAggregate

The GroupAggregate iterator is a way to compute vector aggregates, and it is used to satisfy a `GROUP BY` clause. A single input set is required by the GroupAggregate iterator, and it must be ordered by the grouping column(s). This iterator returns a single row for a unique value of grouping columns.

Metric Unit Description
Rows in Rows The number of tuples received by the iterator.
Aggregate Total Spill Tuples Tuples The number of tuples spilled to disk
Aggregate Total Spill Bytes Bytes The number of bytes spilled to disk.
Aggregate Total Spill Batches Batches The number of spill batches required.
Aggregate Total Spill Pass Passes The number of passes across all of the batches
Aggregate Current Spill Pass Read Tuples Tuples The number of bytes read in for this spill batch
Aggregate Current Spill Pass Read Bytes Bytes The number of tuples read in for this spill batch
Aggregate Current Spill Pass Tuples Tuples The number of tuples that are in each spill file in the current pass.
Aggregate Current Spill Pass Bytes Bytes The number of bytes that are in each spill file in the current pass.
Aggregate Current Spill Pass Batches Batches The number of batches created in the current pass.

HashAggregate

The HashAggregate iterator is similar to the GroupAggregate iterator. A single input set is required by the HashAggregate iterator and it creates a hash table from the input. However, it does not require its input to ordered.

Metric Unit Description
Rows in Rows The number of tuples received by the iterator.
Aggregate Total Spill Tuples Tuples The number of tuples spilled to disk
Aggregate Total Spill Bytes Bytes The number of bytes spilled to disk.
Aggregate Total Spill Batches Batches The number of spill batches required.
Aggregate Total Spill Pass Passes The number of passes across all of the batches.
Aggregate Current Spill Pass Read Tuples Tuples The number of bytes read in for this spill batch.
Aggregate Current Spill Pass Read Bytes Bytes The number of tuples read in for this spill batch.
Aggregate Current Spill Pass Tuples Tuples The number of tuples that are in each spill file in the current pass.
Aggregate Current Spill Pass Bytes Bytes The number of bytes that are in each spill file in the current pass.
Aggregate Current Spill Pass Batches Batches The number of batches created in the current pass.

Hash Join

The Hash Join iterator requires two input sets - the outer and inner tables.

The Hash Join iterator starts by creating its inner table using the Hash operator. The Hash operator creates a temporary Hash index that covers the join column in the inner table. When the hash table (that is, the inner table) is created, Hash Join reads each row in the outer table, hashes the join column (from the outer table), and searches the temporary Hash index for a matching value.

In Greenplum Database, a Hash Join algorithm can be used with the following join semantics:

  • Left Join
  • Left Anti Semi Join
  • Full Join
  • Right Join
  • EXISTS Join
  • Reverse In Join
  • Unique Inner Join
  • Unique Outer Join
Metric Unit Description
Rows in Rows The number of tuples received by the iterator.
Hash Spill Batches Batches The current batch being spilled.
Hash Spill Tuples Tuples The current number of spilled tuples.
Hash Spill Bytes Bytes The current number of bytes spilled to disk.

Index Scan

An Index Scan operator traverses an index structure. If you specify a starting value for an indexed column, the Index Scan will begin at the appropriate value. If you specify an ending value, the Index Scan will complete as soon as it finds an index entry greater than the ending value. A query planner uses an Index Scan operator when it can reduce the size of the result set by traversing a range of indexed values, or when it can avoid a sort because of the implicit ordering offered by an index.

Metric Unit Description
Rows in Rows The number of tuples received by the iterator.
Index Scan Restore Pos Restores The number of restores.
Index Scan Rescan Rescans The number of rescans.

Limit

The Limit operator is used to limit the size of a result set. Greenplum Database uses the Limit operator for both Limit and Offset processing. The Limit operator works by discarding the first x rows from its input set, returning the next y rows, and discarding the remainder. If the query includes an OFFSETclause, x represents the offset amount; otherwise, x is zero. If the query includes a LIMITclause, y represents the Limit amount; otherwise, y is at least as large as the number of rows in the input set.

Metric Unit Description
Rows in Rows The number of tuples received by the iterator.

Materialize

The materialize iterator is used for some sub-select operations. The query planner can decide that it is less expensive to materialize a sub-select one time than it is to repeat the work for each top-level row. Materialize is also used for some merge/join operations.

Metric Unit Description
Rows in Rows The number of tuples received by the iterator.
Materialize Rescan Rescans The number of times the executor requested to rescan the date for this iterator.

Merge Join

The Merge Join iterator joins two tables. Like the Nested Loop iterator, Merge Join requires two input sets: An outer table and an inner table. Each input set must be ordered by the join columns. In Greenplum Database, the Merge Join algorithm can be used with the following join semantics:

  • Left Join
  • Left Anti Semi Join
  • Full Join
  • Right Join
  • EXISTS Join
  • Reverse In Join
  • Unique Outer joins
  • Unique Inner Join

op

The Nested Loop iterator is used to perform a join between two tables, and as a result requires two input sets. It fetches each table from one of the input sets (called the outer table). For each row in the outer table, the other input (called the inner table) is searched for a row that meets the join qualifier. In Greenplum Database, a Merge Join algorithm can be used with the following join semantics:

  • Left Join
  • Left Anti Semi Join
  • Full Join
  • Right Join
  • EXISTS Join
  • Reverse In Join
  • Unique Outer Join
  • Unique Inner Join
Metric Unit Description
Rows in Rows The number of tuples received by the iterator.
Nested Loop Inner Tuples Tuples The number of rows from the inner part of the query plan.
Nested Loop Outer Tuples Tuples The number of rows from the outer part of the query plan.

Redistribute Motion

This iterator sends an outbound tuple to only one destination based on the value of a hash.

Note that the Motion metrics for the iterator are useful when investigating potential networking issues in the Greenplum Database system. Typically, the “Ack Time” values should be very small (microseconds or milliseconds). However if the “Ack Time” values are one or more seconds (particularly the “Motion Min Ack Time” metric), then a network performance issue likely exists.

Also, if there are a large number of packets being dropped because of queue overflow, you can increase the value for the gp_interconnect_queue_depth system configuration parameter to improve performance. See the Greenplum Database Administrator Guide for more in formation about system configuration parameters.

th Unit Description
Rows in Rows The number of tuples received by the iterator.
Motion Bytes Sent Bytes The number of bytes sent by the iterator.
Motion Total Ack Time Microseconds The total amount of time that the iterator waited for an acknowledgement after sending a packet of data.
Motion Average Ack Time Microseconds The average amount of time that the iterator waited for an acknowledgement after sending a packet of data.
Motion Max Ack Time Microseconds The maximum amount of time that the iterator waited for an acknowledgement after sending a packet of data.
Motion Min Ack Time Microseconds The minimum amount of time that the iterator waited for an acknowledgement after sending a packet of data.
Motion Count Resent Packets The total number of packets that the iterator did not acknowledge when they first arrived in the queue.
Motion Max Resent Packets The maximum number of packets that the iterator did not acknowledge when they first arrived in the queue. This metric is applied on a per packet basis. For example, a value of “10” indicates that a particular packet did not get acknowledged by this iterator 10 times, and that this was the maximum for this iterator.
Motion Bytes Received Bytes The number of bytes received by the iterator.
Motion Count Dropped Packets The number of packets dropped by the iterator because of buffer overruns.

Result

The Result iterator is used to either (1) execute a query that does not retrieve data from a table, or evaluate the parts of a WHEREclause that do not depend on data retrieved from a table. It can also be used if the top node in the query plan is an Append iterator.

Metric Unit Description
Rows in Rows The number of tuples received by the iterator.

Repeat

This iterator repeats every input operator a certain number of times. This is typically used for certain grouping operations.

Metric Unit Description
Rows in Rows The number of tuples received by the iterator.

Seq Scan

The Seq Scan iterator scans heap tables, and is the most basic query iterator. Any single-table query can be done by using the Seq Scan iterator. Seq Scan starts at the beginning of a heap table and scans to the end of the heap table. For each row in the heap table, Seq Scan evaluates the query constraints (the WHERE clause). If the constraints are satisfied, the required columns are added to the result set.

Metric Unit Description
Rows in Rows The number of tuples received by the iterator.
Seq Scan Page Stats Pages The number of pages scanned.
Seq Scan Restore Pos Restores The number of times the executor restored the scan position.
Seq Scan Rescan Rescans The number of times the executor requested to rescan the date for this iterator.

SetOp

There are four Setop iterators:

  • Intersect
  • Intersect All
  • Except
  • Except All

These iterators are produced only when the query planner encounters an INTERSECT, INTERSECT ALL, EXCEPT, or EXCEPT ALL clause, respectively.

All SetOp iterators require two input sets. They combine the input sets into a sorted list, and then groups of identical rows are identified. For each group, the SetOp iterators counts the number of rows contributed by each input set, then uses the counts to determine the number of rows to add to the result set.

Metric Unit Description
Rows in Rows The number of tuples received by the iterator.

Shared Scan

This iterator evaluates the common parts of a query plan. It shares the output of the common sub-plans with all other iterators, so that the sub-plan only needs to execute one time.

Metric Unit Description
Rows in Rows The number of tuples received by the iterator.
Seq Scan Page Stats Seq Scan Page Stats The number of pages scanned.
Seq Scan Restore Pos Seq Scan Restore Pos The number of times the executor restored the scan position.
Seq Scan Rescan Seq Scan Rescan The number of times the executor requested to rescan the date for this iterator.

Sort

The Sort iterator imposes an ordering on the result set. Greenplum Database uses two different sort strategies: An in-memory sort and an on-disk sort. If the size of the result set exceeds the available memory, the Sort iterator distributes the input set to a collection of sorted work files and then merges the work files back together again. If the result set is less than the available memory, the sort is done in memory.

The Sort iterator is used for many purposes. A Sort can be used to satisfy an ORDER BY clause. Also, some query operators require their input sets to be ordered.

Metric Unit Description
Rows in Rows The number of tuples received by the iterator.
Sort Memory Usage Bytes The number of bytes used by the sort.
Sort Spill Tuples Tuples The current number of spilled tuples.
Sort Spill Bytes Bytes The current number of spilled bytes.
Sort Spill Pass Passes The number of merge passes. Each merge pass merges several sorted runs into larger runs.
Sort Current Spill Pass Tuples Tuples The number of tuples spilled in the current spill pass.
Sort Current Spill Pass Bytes Bytes The number of bytes spilled in the current spill pass.

Subquery Scan

A Subquery Scan iterator is a pass-through iterator. It scans through its input set, adding each row to the result set. This iterator is used for internal purposes and has no affect on the overall query plan.

Metric Unit Description
Rows in Rows The number of tuples received by the iterator.
Subquery Scan Rescan Rescans The number of times the executor requested to rescan the date for this iterator.

Tid Scan

The Tid Scan (tuple ID scan) iterator is used whenever the query planner encounters a constraint of the form ctid = expression or expression = ctid. This specifies a tuple ID, an identifier that is unique within a table. The tuple ID works like a bookmark, but is valid only within a single transaction. After the transaction completes, the tuple ID is not used again.

Metric Unit Description
Rows in Rows The number of tuples received by the iterator.

Unique

The Unique iterator eliminates duplicate values from the input set. The input set must be ordered by the columns, and the columns must be unique. The Unique operator removes only rows — it does not remove columns and it does not change the ordering of the result set. Unique can return the first row in the result set before it has finished processing the input set. The query planner uses the Unique operator to satisfy a DISTINCTclause. Unique is also used to eliminate duplicates in a UNION.

Metric Unit Description
Rows in Rows The number of tuples received by the iterator.

Values Scan

The Value Scan iterator is used to iterate over a set of constant tuples.

Metric Unit Description
Rows in Rows The number of tuples received by the iterator.

Window

The Window function performs calculations across sets of rows that are related to the current query row. The Window iterator computes Window functions on the input set of rows.

Metric Unit Description
Rows in Rows The number of tuples received by the iterator.