mp_stackdiffs.sas File Reference

Prepares an audit table for stacking (re-applying) the changes. More...

Go to the source code of this file.

Detailed Description

When the underlying data from a Base Table is refreshed, it can be helpful to have any previously-applied changes, re-applied.

Such situation might arise if you are applying those changes using a tool like Data Controller for SASĀ® - which records all such changes in an audit table. It may also apply if you are preparing a series of specific cell-level transactions, that you would like to apply to multiple sets of (similarly structured) Base Tables.

In both cases, it is necessary that the transactions are stored using the mp_storediffs.sas macro, or at least that the underlying table is structured as per the definition in mp_coretable.sas (DIFFTABLE entry)

This macro is used to convert the stored changes (tall format) into staged changes (wide format), with base table values incorporated (in the case of modified rows), ready for the subsequent load process.

Essentially then, what this macro does, is turn a table like this:

KEY_HASH:$32. MOVE_TYPE:$1. TGTVAR_NM:$32. IS_PK:best. IS_DIFF:best. TGTVAR_TYPE:$1. OLDVAL_NUM:best32. NEWVAL_NUM:best32. OLDVAL_CHAR:$32765. NEWVAL_CHAR:$32765.
27AA6F7581052E7FF48E1BCA901313FB A NAME 1 -1 C . . Newbie
27AA6F7581052E7FF48E1BCA901313FB A AGE 0 -1 N . 13
27AA6F7581052E7FF48E1BCA901313FB A HEIGHT 0 -1 N . 65.3
27AA6F7581052E7FF48E1BCA901313FB A SEX 0 -1 C . . F
27AA6F7581052E7FF48E1BCA901313FB A WEIGHT 0 -1 N . 98
86703FDE9E87DD5C0F8E1072545D0128 D NAME 1 -1 C . . Alfred
86703FDE9E87DD5C0F8E1072545D0128 D AGE 0 -1 N 14 .
86703FDE9E87DD5C0F8E1072545D0128 D HEIGHT 0 -1 N 69 .
86703FDE9E87DD5C0F8E1072545D0128 D SEX 0 -1 C . . M
86703FDE9E87DD5C0F8E1072545D0128 D WEIGHT 0 -1 N 112.5 .
64489C85DC2FE0787B85CD87214B3810 M NAME 1 0 C . . Alice Alice
64489C85DC2FE0787B85CD87214B3810 M AGE 0 1 N 13 99
64489C85DC2FE0787B85CD87214B3810 M HEIGHT 0 0 N 56.5 56.5
64489C85DC2FE0787B85CD87214B3810 M SEX 0 0 C . . F F
64489C85DC2FE0787B85CD87214B3810 M WEIGHT 0 0 N 84 84
Into three tables like this:

work.outmod:

NAME:$8. SEX:$1. AGE:best. HEIGHT:best. WEIGHT:best.
Alice F 99 56.5 84

work.outadd:

NAME:$8. SEX:$1. AGE:best. HEIGHT:best. WEIGHT:best.
Newbie F 13 65.3 98

work.outdel:

NAME:$8. SEX:$1. AGE:best. HEIGHT:best. WEIGHT:best.
Alfred M 14 69 112.5

As you might expect, there are a bunch of extra features and checks.

The macro supports both SCD2 (TXTEMPORAL) and UPDATE loadtypes. If the base table contains a PROCESSED_DTTM column (or similar), this can be ignored by declaring it in the processed_dttm_var parameter.

The macro is also flexible where columns have been added or removed from the base table UNLESS there is a change to the primary key.

Changes to the primary key fields are NOT supported, and are likely to cause unexpected results.

The following pre-flight checks are made:

  • All primary key columns exist on the base table
  • There is no change in variable TYPE for any of the columns
  • There is no reduction in variable LENGTH below the max-length of the supplied values

Rules for stacking changes are as follows:

Transaction TypeKey BehaviourColumn Behaviour
Deletes The row is added to &outDEL. UNLESS it no longer exists in the base table, in which case it is added to &errDS. instead. Deletes are unaffected by the addition or removal of non Primary-Key columns.
Inserts Previously newly added rows are added to the outADD table UNLESS they are present in the Base table.
In this case they are added to the &errDS. table instead.
Inserts are unaffected by the addition of columns in the Base Table (they are padded with blanks). Deleted columns are only a problem if they appear on the previous insert - in which case the record is added to &errDS..
Updates Previously modified rows are merged with base table values such that only the individual cells that were previously changed are re-applied. Where the row contains cells that were not marked as having changed in the prior transaction, the 'blanks' are filled with base table values in the outMOD table.
If the row no longer exists on the base table, then the row is added to the errDS table instead.
Updates are unaffected by the addition of columns in the Base Table - the new cells are simply populated with Base Table values. Deleted columns are only a problem if they relate to a modified cell (is_diff=1) - in which case the record is added to &errDS..

To illustrate the above with a diagram:

For examples of usage, check out the mp_stackdiffs.test.sas program.

Parameters
[in]baselibdsBase Table against which the changes will be applied, in libref.dataset format.
[in]auditlibdsDataset with previously applied transactions, to be re-applied. Use libref.dataset format. DDL as follows: mp_coretable(DIFFTABLE)
[in]keySpace seperated list of key variables
[in]mdebug=Set to 1 to enable DEBUG messages and preserve outputs
[in]processed_dttm_var=(0) If a variable is being used to mark the processed datetime, put the name of the variable here. It will NOT be included in the staged dataset (the load process is expected to provide this)
[out]errds=(work.errds) Output table containing problematic records. The columns of this table are:
  • PK_VARS - Space separated list of primary key variable names
  • PK_VALS - Slash separted list of PK variable values
  • ERR_MSG - Explanation of why this record is problematic
[out]outmod=(work.outmod) Output table containing modified records
[out]outadd=(work.outadd) Output table containing additional records
[out]outdel=(work.outdel) Output table containing deleted records

SAS Macros

Related Macros

Todo:
The current approach assumes that a variable called KEY_HASH is not on the base table. This part will need to be refactored (eg using mf_getuniquename.sas) when such a use case arises.
Version
9.2
Author
Allan Bowe

Definition in file mp_stackdiffs.sas.