SUBSET page-2.5/2.6

Omniscope SUBSET Functions (2.5+)

This page refers to the use of SUBSET functions in Omniscope 2.5 through to Omniscope 2.6 b606.  In Omniscope 2.6 b607 or later, please see the updated guide.

Defining formulae involving evaluated data subsets 


SUBSET functions are unique to Omniscope and are broken down into two sub-classes of functions which work together:

  1. SUBSET, SUBSET2, SUBSET3... are 'clauses' used to define specific subsets by "querying" the data by matching field values, defining a subset of records for use in another function.
     For example: "All records where Country is UK" or "All records with the same Country as the record being evaluated".
  2. SUBSET_MEAN, SUBSET_SUM, etc. are used to calculate a statistical function upon a specified field within a given subset defined by a SUBSET 'clause'.
    For example: "Sum of Sales Volume in subset X" where X is a subset as above.

Note:  SUBSET functions replace the previous, now deprecated DATASET_... functions as they are more flexible and efficient.  Other functions other than SUBSET_MEAN/SUM/etc. also use SUBSET 'clauses' such as RANK. 

Example: Combining SUBSET 'clause' with SUBSET_SUM to roll-up data

If you wanted to find:

The sum of Sales Volume for the current country 

which could be rewritten as:

Sum of Sales Volume in subset (all records with the same Country as the record being evaluated)

you would use the formula:

SUBSET_SUM([Sales Volume], SUBSET([Country]))


Examples of SUBSET 'clause' constructions:

SUBSET 'clauses' results in a data subset for use in other functions (such as SUBSET_MEAN or RANK). You would not normally use the result of SUBSET 'clause' as the end result of the entire formula, since you cannot display the definition of the subset defined as a data table inside a cell.

SUBSET 'clauses' are used to identify records by one constraint (e.g. Month is "February"), and SUBSET2 is used to identify records by two constraints (e.g. Month is "February" and Client is "Acme Bank"), etc.

SUBSET()

The entire dataset

SUBSET([Month])

Data subset with the same Month field value as the record being evaluated

SUBSET([Month]"February")

Data subset with the Month field value 'February'

SUBSET([Quantity]5)

Data subset where the field/column Quantity value equals 5
SUBSET([Quantity]5"<")
SUBSET([Quantity]5"<=")
SUBSET([Quantity]5">")
SUBSET([Quantity]5">=")
SUBSET([Quantity]5"=")
SUBSET(
[Quantity]5"<>")
Data subsets where the filed/column Quantity value is less than, less than/equal, greater than, greater than/equal, equal, not equal to the value 5 (respectively).  Use of the "operator" inequality options requires Omniscope 2.6+.
SUBSET2([Month][Client])

Data subset with the same Month and Client field values as the record being evaluated

SUBSET2([Month][Client],
   "February"
"Acme Bank")

Data subset with the Month and Client field values 'February' and 'Acme Bank'

SUBSET3([Month][Client][Country])

Data subset with the same Month, Client and Country field values as the record being evaluated

SUBSET3([Month][Client][Country],
   "February"
"Acme Bank""UK")

Data subset with the Month, Client and Country field values 'February', 'Acme Bank' and 'UK'

SUBSET3([Month][Client][Quantity]
   "February"
"Acme Bank"5,
   "=", "=", ">=")
Data subset where the Month and Client field values are 'February' and 'Acme Bank', and the Quantity field values are greater than or equal to 5.  Use of "operator" inequality options requires Omniscope 2.6+.

Examples of how you would use SUBSET: functions

SUBSET_... functions (such as SUBSET_MEAN) accept a field for the statistical function and an optional data subset 'clause' to restrict the record set.

SUBSET_MEAN([Price])
or
SUBSET_MEAN([Price], SUBSET())

The mean of the Price field, for all records
SUBSET_SUM([Price], SUBSET([Month]))
The sum of the Price field, for the data subset with the same Month field value as the record being evaluated
SUBSET_MODE([Currency],
   SUBSET(
[Month]"February"))
The most common Currency field value, for the data subset with the Month field value 'February'

Worked example - single field subset

This example shows a table of data containing Traders and their Sales Volume by Country.  We want to find out the total Sales Volume for each Trader.

Country

Trader

Sales Volume

Formula field

UKJohn Swires10,00060,000
FranceJohn Swires15,00060,000
USAJohn Swires35,00060,000
UKAnne Campbell7,00021,000
JapanAnne Campbell14,00021,000

 

The field "Formula field" above is defined with the formula:

SUBSET_SUM([Sales Volume], SUBSET([Trader])) 

Broken down, this means:

SUBSET([Trader])   The records with the same Trader value as the record being evaluated

SUBSET_SUM([Sales Volume], subset)   The sum of Sales Volume for those records

When Omniscope evaluates the first cell (60,000) in "Formula field":

SUBSET([Trader]) evaluates to all records with Trader "John Swires" (i.e. the first 3 records).

SUBSET_SUM([Sales Volume], subset) evaluates to 60,000 (10,000 + 15,000 + 35,000).

Worked example - multi-field subsets

This example involves a table of data containing multiple quotes for multiple parts to multiple clients in multiple currencies. We want to find the lowest Quote in "USD" for each combination of Part and Customer.

Customer

Currency

Part

Quote

  Min USD Quote

Alpha

GBP

widget

8

 

 

12

Beta

USD

gromett

10

 

 

10

Gamma

Yen

nubbin

50

 

 

6

Alpha

USD

widget

12

 

 

12

Gamma

USD

widget

9

 

 

9

Beta

Euro

 gromett

11

 

 

10

AlphaYen

  nubbin

52  6
BetaUSD

  nubbin

5  5
GammaUSD

 gromett

7  7
AlphaUSD

  nubbin

6  6
GammaUSD

  nubbin

6  6

The formula field [Min USD Quote] above can be defined with the following formula:

SUBSET_MIN([Quote],SUBSET3([Customer],[Currency],[Part],[Customer],"USD",[Part]))

Broken down, this formula means:

SUBSET_MIN( [Quote], find the lowest quotation in the column for the following subsets:

SUBSET3([Customer],[Currency],[Part],[Customer],"USD",[Part] ) = all combinations of 'Customer' and 'Part' where Currency="USD"