Phase-III Macro System: Difference between revisions

From phenixxenia.org

(24 intermediate revisions by the same user are not shown)
[[Kategorie:zazy]]
+
[[Kategorie:Twiggi]]
  
= General =
+
[[Datei:FRAKTAL_MACRO_ARCHITECTURE.png|300px|thumb|Synopsis]]
  
The '''''Phase-III Macro System''''' is a flexible, data-independent and parameter-controlled set of SAS macros.
+
== Positioning ==
  
'''The ''Phase-III Macro System'' is not an end-to-end reporting tool.'''
+
;The ''Phase-III Macro System'' is not an end-to-end reporting tool.
 
+
;The '''''Phase-III Macro System''''' is a flexible, data-independent and parameter-controlled set of SAS macros.
* It is a '''highly interacting collection of macro modules''' providing '''transformation methods''' for study-emergent datasets, making use of all the information available in the descriptor portion of the dataset processed. The user receives one or more output datasets containing character columns with standard names and externally controlled attributes.
 
  
 +
* It is a '''well-structured and heavily interacting set of macro modules''' providing '''transformation methods''' for study-emergent data, making use of all the information available in the descriptor portion of the dataset processed. The user receives one or more output datasets containing character columns with standard names and externally controlled attributes.
 
* The ''Phase-III Macro System'' provides '''subroutines that take care of data types, formats, labels, headers, missing values, loops and more'''. Runtime-generated information used to control processing is kept in '''standardized data structures''' using macro-variable lists ''(mlists)'', SAS formats and datasets.
 
 +
* Input data structures may need some form of '''pre-processing''', and output data structures may need some '''post-processing''', to fulfill requirements perfectly. The ''Phase-III Macro System'' already '''supports these steps''' to some extent by providing ''condense'', ''struct'' and ''missline'' functions.
  
* Input data structures may need some form of '''pre-processing''', and output data structures may need some '''post-processing''', to fulfil requirements perfectly. The ''Phase-III Macro System'' already '''supports these steps''' to some extent by providing ''condense'', ''struct'' and ''missline'' functions.
+
== Properties ==
  
== Objective ==
+
The ''Phase-III Macro System'' is aimed at '''serving as a base for an extensible system''' that provides mechanisms for shaping input datasets, performing calculations and generating SAS datasets with ready-made text content.
 
 
The ''Phase-III Macro System'' is aimed at '''serving as a base for an extendable system''' that provides mechanisms for shaping input datasets, performing calculations and generating SAS datasets with ready-made text content.
 
 
 
== Scope ==
 
  
 
The ''Phase-III Macro System'' '''interacts with and makes use of other available programs, modules, systems and datasets'''. Communication and information interchange use SAS macro variables, operating-system environment variables and data structures compatible with the SAS System.
 
  
 
Input data streams will generally require preprocessing in the form of assigned formats and labels. Output datasets will need postprocessing, mainly using merge and set operations.
 
 
== Characteristics ==
 
  
 
Modules are kept small (no more than three screen pages) for maintainability and '''avoid hard-coded references''' to any application-related information such as data types, labels and formats.
 
 
The coding style makes broad use of '''automatic documentation''' and '''generation of metadata and lookup tables at runtime'''.
 
  
== Approach ==
+
== Fraktal Architecture ==
 
 
# Avoid dependency of programs on data scope, study characteristics or personal styles.
 
# Have modules implemented in a way to operate in any emerging environment.
 
# Be prepared to add new output structures without substantial delay.
 
# Produce a wide variety of output with a minimum set of modules.
 
# Minimize maintenance efforts through self-documenting and limited program code.
 
# Maximize validation throughput by adopting a non-mutual-impact architecture.
 
 
 
== Architecture ==
 
 
 
=== Info Modules ===
 
 
 
Provide information about datasets and variables for correct processing.
 
 
 
=== Service Modules ===
 
 
 
Provide frequently requested tasks in a standard format with a limited parameter set.
 
 
 
=== Core Modules ===
 
 
 
Perform input transformation, calculations and output transformation.
 
 
 
=== User Modules ===
 
 
 
Generate datasets carrying subtables controlled by user-supplied parms.
 
 
 
= Module Details =
 
 
 
== Info Modules ==
 
 
 
=== %GET_ATTR() ===
 
 
 
==== Function ====
 
 
 
'''Return single attributes like label, format, etc.'''
 
 
 
==== Description ====
 
 
 
Reads the dataset header and returns the requested attributes as undeclared macro variables named after those attributes. A value becomes available to the caller only when the corresponding variable has been declared in the calling environment with a %global or %local statement.
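The declaration rule can be illustrated with a small calling macro. This is a sketch only: it assumes %GET_ATTR is available in the session (for example as a compiled stored macro), and the dataset WORK.DEMO, the variable VISIT and the macro %SHOW_LABEL are hypothetical names introduced for illustration.

```sas
/* Hypothetical test data: WORK.DEMO with one labelled variable. */
data work.demo;
  visit = 1;
  label visit = "Visit number";
run;

%macro show_label;
  /* Declaring LABEL here lets the value set inside %GET_ATTR */
  /* land in this macro's symbol table.                       */
  %local label;
  %GET_ATTR(dsn=work.demo, source=visit, attrib=label);
  %put NOTE: Label of VISIT is: &label;
%mend show_label;

%show_label;
```

Without the %local statement, the variable would be created in the local symbol table of %GET_ATTR and would no longer exist once that macro ends.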
 
 
 
==== Source ====
 
 
 
%MACRO GET_ATTR(dsn=,source=,attrib=) / store des="Get attribute from SAS Variable";

%LOCAL name;
%LET name=GET_ATTR;

/* Proceed only if all three keyword parameters are supplied. */
%IF &DSN ne and &SOURCE ne and &ATTRIB ne %THEN %DO;

  /* Default to the WORK library for one-level dataset names. */
  %IF %INDEX(&DSN,.) eq 0 %THEN %DO;
    %LET dsn=WORK.&DSN;
  %END;

/* Write the requested attribute of variable &SOURCE to WORK.TMP_DATA. */
proc datasets nolist lib=%SCAN(&DSN,1);
  contents noprint  data=%SCAN(&DSN,2)(keep=&SOURCE)
                      out=work.tmp_data(keep=&ATTRIB);
  run;
quit;

/* Copy the attribute value into a macro variable named after the attribute. */
proc sql noprint;
  select &ATTRIB
    into :&ATTRIB
  from work.tmp_data
  ;

/* Reassigning with %LET trims leading and trailing blanks; skipped for LABEL. */
%IF %UPCASE(&ATTRIB) ne LABEL %THEN %DO;
  %LET &ATTRIB = &&&ATTRIB;
%END;

quit;

%PUT &NAME._MESSAGE: Temporary SAS dataset WORK.TMP_DATA created ;
%PUT &NAME._MESSAGE: Field %UPCASE(&SOURCE) in dataset %UPCASE(&DSN) has &ATTRIB = %BQUOTE(&&&ATTRIB.). ;
%PUT &NAME._MESSAGE: Information stored into Local Macrovariable of calling environment: ;
%PUT &NAME._MESSAGE: &ATTRIB=%BQUOTE(&&&ATTRIB);
%PUT ;

%END;

/* At least one keyword parameter is missing: report and notify. */
%ELSE %DO;
%PUT vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv;
%PUT &NAME._ERROR: Missing Keyword Parameter(s).;
%PUT &NAME._STATUS: Macro processing abended. ;
%PUT &NAME._STATUS: Global Macrovariable(s) not available. ;
%PUT ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^;
%GEN_MAIL(name=&NAME,rc=1);
%END;

%MEND GET_ATTR;
+
The ''Phase-III Macro System'' was coded using '''''[[Fraktal SAS Programming]]'''''. To a certain degree this concept compares to database normalization. ''[[Fraktal SAS Programming]]'' restricts program coding according to the '''''one-function-one-module''''' principle, which is very well supported by the means delivered with the '''''SAS System'''''. The architecture of the ''Phase-III Macro System'' comprises '''four types of modules'''.
 
 
 
=== %GRP_DESC() ===
 
 
 
==== Function ====
 
 
 
'''Provide info about a categorical variable.'''
 
 
 
==== Description ====
 
 
 
Investigates the given categorical variable and returns the results in undeclared macro variables: &n_grp (number of distinct values), &v_grp (structured list of distinct unformatted values) and &l_grp (structured list of distinct formatted values).
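A calling pattern for %GRP_DESC might look like the sketch below. The parameters dsn=, grp= and miss= are taken from the call made in the %TWO_CATV source; the wrapper %LIST_GROUPS is hypothetical, and the loop assumes, as %TWO_CATV does, that the elements of &v_grp can be picked apart with %SCAN and its default delimiters. SASHELP.CLASS is a sample dataset shipped with SAS.

```sas
%macro list_groups(dsn=, grp=);
  /* Declare the result variables so that the values set by */
  /* %GRP_DESC are visible in this macro.                   */
  %local n_grp v_grp l_grp i;

  %GRP_DESC(dsn=&dsn, grp=&grp, miss=n);

  /* Walk the list of distinct unformatted values. */
  %do i=1 %to &n_grp;
    %put NOTE: Group &i of &n_grp has value %scan(&v_grp,&i);
  %end;
%mend list_groups;

%list_groups(dsn=sashelp.class, grp=sex);
```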
 
 
 
=== %CHK_LIST() ===
 
 
 
==== Function ====
 
 
 
'''Provide info about a list-type macro variable.'''
 
 
 
==== Description ====
 
 
 
Reads the supplied list of tokens and returns undeclared macro variables: &n_lst (number of list elements) and &v_lst (structured list of the supplied elements). Input list elements may be separated only by blanks or commas.
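The source of %CHK_LIST is not shown here, so the following is not its implementation. It is only a minimal sketch of how tokens separated by blanks or commas can be counted in macro code; the macro name %COUNT_TOKENS and the sample list are hypothetical.

```sas
%macro count_tokens(list);
  %local n;
  %let n=0;
  /* %SUPERQ masks the commas in the list; the delimiter set */
  /* is restricted to blank and comma.                       */
  %do %while(%length(%scan(%superq(list),%eval(&n+1),%str( ,))) > 0);
    %let n=%eval(&n+1);
  %end;
  &n
%mend count_tokens;

/* Commas in the argument must be masked with %STR in the call. */
%put NOTE: Number of tokens = %count_tokens(%str(age, sex height));
```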
 
 
 
== User Modules ==
 
 
 
=== %TWO_CATV() ===
 
 
 
==== Function ====
 
 
 
'''Deliver a percentage/count table from two nested categorical variables.'''
 
 
 
==== Description ====
 
 
 
Performs nested processing of two categorical variables, looping the context variable from the row_* modules over the categories of the "outer" variable.
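A call to %TWO_CATV could look like the sketch below, using only keyword parameters that appear in the macro's source. The dataset and variable names are hypothetical, and the call assumes that all modules invoked by %TWO_CATV are available in the session. TAB_NAME is read by the macro but is not one of its parameters, so it is set beforehand here as an assumption about how it is meant to be supplied.

```sas
/* Hypothetical table name; the naming step reads characters 1-3 and 5-8. */
%let tab_name = TAB0DEMO;

/* Hypothetical input: WORK.ADVERSE with outer variable BODYSYS, */
/* nested variable PREFTERM and column variable TRTGRP.          */
%TWO_CATV(dsn=work.adverse
         ,row=bodysys
         ,row2=prefterm
         ,col=trtgrp
         ,num=1
         ,indent=0
         ,indinc=2
         ,head2=Y);
```

With TAB_NAME=TAB0DEMO and num=1, the dataset name built in the naming step of the source resolves to TAB1DEMO.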
 
 
 
 
 
==== Parameters ====
 
  
 
{| class="wikitable"
|-
!Name !!Description
|-
| dsn || input dataset name
|-
| row, row2 || categorical variable name, 2=nested variable
|-
| exclude || decode for excluded group from &ROW
|-
| weight || Y/N (multiply percentages for &ROW and &ROW2)
|-
| col || categorical variable name used for columns
|-
| head2 || Y/N (block header for nested variable)
|-
| indent, indinc || n (number of indent columns and increment for nested variable)
|-
| num || n (sequence number of output)
|-
| stat || Y/N (column with statistics names)
|-
| space || 1/2/3 (blank line before or after output and between nesting levels)
|-
| struct, struct2 || name of reference dataset used for full decode structure, 2=nested variable
|-
| condense || var#value (non-distinct variable and true value for &ROW)
|-
| misslin2 || Y/N (force missing line for nested variable)
|}
+
{| class="wikitable"
|-
! Type !!Function performed
|-
| [[Info Modules (from Phase-III Macro System)|Info Modules]] || Provide information about datasets and variables for correct processing
|-
| [[Service Modules (from Phase-III Macro System)|Service Modules]] || Provide frequently requested tasks in a standard format with limited parameter set
|-
| [[Core Modules (from Phase-III Macro System)|Core Modules]] || Perform input transformation, calculations and output transformation
|-
| [[User Modules (from Phase-III Macro System)|User Modules]] || Generate datasets carrying subtables controlled by user-supplied parms
|}
  
==== Source ====
+
== Approach ==
 
 
===== declares and upper level processing =====
 
 
 
%MACRO TWO_CATV(dsn=
                ,exclude=
                ,row=
                ,row2=
                ,col=
                ,indent=0
                ,num=
                ,stat=N
                ,weight=Y
                ,space=2
                ,condense=
                ,struct=
                ,struct2=
                ,head2=N
                ,misslin2=
                ,indinc=2)
/ store des=""
;

%LOCAL n_grp v_grp n name;

%LET name=TWO_CATV;
/* Fall back to the input dataset when no reference structure is given. */
%IF &STRUCT  eq %THEN %LET struct =&DSN;
%IF &STRUCT2 eq %THEN %LET struct2=&DSN;

/* Describe the outer variable; the results arrive in N_GRP and V_GRP. */
%GRP_DESC(dsn=&DSN
          ,grp=&ROW
          ,miss=n)
;

/* Upper level: filter, count and format the outer variable. */
%TOP_FILT(dsn=&DSN
          ,grp=&ROW
          ,by=&COL
          ,grplvl=&NUM
          ,var=
          ,condense=&CONDENSE)
;

%TOP_FREQ(dsn=top_filt
          ,struct=&STRUCT
          ,grp=&ROW
          ,by=&COL)
;

%TOP_OUTC(dsn=top_freq
          ,head=n
          ,total=n
          ,stat=&STAT
          ,indent=&INDENT
          ,grp=&ROW
          ,rev=n
          ,use=
          ,by=&COL
          ,missline=)
;
 
 
 
===== loop for lower level processing =====
 
 
 
/* Process the nested variable once per category of the outer variable, */
/* skipping the excluded category.                                      */
%DO n=1 %TO &N_GRP;
  %IF %SCAN(&V_GRP,&N) ne &EXCLUDE %THEN %DO;
    %ROW_FILT(dsn=&DSN
              ,context=&ROW
              ,subgrp=&N
              ,grp=&ROW2
              ,by=&COL
              ,var=
              ,miss=n)
    ;
    %ROW_FREQ(dsn=row_filt
              ,sum=top_freq
              ,struct=&STRUCT2
              ,context=&ROW
              ,grp=&ROW2
              ,by=&COL
              ,weight=&WEIGHT)
    ;
    %ROW_OUTC(dsn=row_freq
              ,sum=main_3rd
              ,head=&HEAD2
              ,stat=&STAT
              ,indent=%EVAL(&INDENT+&INDINC)
              ,context=&ROW
              ,grp=&ROW2
              ,by=&COL
              ,missline=&MISSLIN2)
    ;
  %END;
%END;
 
 
 
===== care for naming and send completion mail =====
 
 
 
/* Stack the per-category outputs into the final table dataset. */
%IF &TAB_NAME ne %THEN %DO;
  data %SUBSTR(&TAB_NAME,1,3)&NUM%SUBSTR(&TAB_NAME,5,4);
    set
  %DO n=1 %TO &N_GRP;
    %IF &SPACE eq 1 %THEN dummy ;
    %IF %SCAN(&V_GRP,&N) ne &EXCLUDE %THEN row&NUM._&N ;
    %IF &SPACE eq 2 %THEN dummy ;
  %END;
    %IF &SPACE eq 3 %THEN dummy ;
    ;
  run;
%END;

%GEN_MAIL(name=&NAME);

%MEND TWO_CATV;
+
In general, ''[[Fraktal SAS Programming]]'' brought a number of benefits to the coding of the ''Phase-III Macro System'', among them the ability to:
+
* Avoid dependency of programs on data scope, study characteristics or personal styles.
* Have modules implemented in a way to operate in any emerging environment.
* Be prepared to add new output structures without substantial delay.
* Produce a wide variety of output with a minimum set of modules.
* Minimize maintenance efforts through self-documenting and limited program code.
* Maximize validation throughput by adopting a non-mutual-impact architecture.

Latest revision as of 16:43, 6 January 2016


Synopsis

Positioning

The Phase-III Macro System is not an end-to-end reporting tool.
The Phase-III Macro System is a flexible, data-independent and parameter-controlled set of SAS macros.
  • It is a well-structured and heavily interacting set of macro modules providing transformation methods for study-emergent data, making use of all the information available in the descriptor portion of the dataset processed. The user receives one or more output datasets containing character columns with standard names and externally controlled attributes.
  • The Phase-III Macro System provides subroutines that take care of data types, formats, labels, headers, missing values, loops and more. Runtime-generated information used to control processing is kept in standardized data structures using macro-variable lists (mlists), SAS formats and datasets.
  • Input data structures may need some form of pre-processing, and output data structures may need some post-processing, to fulfill requirements perfectly. The Phase-III Macro System already supports these steps to some extent by providing condense, struct and missline functions.

Properties

The Phase-III Macro System is aimed at serving as a base for an extensible system that provides mechanisms for shaping input datasets, performing calculations and generating SAS datasets with ready-made text content.

The Phase-III Macro System interacts with and makes use of other available programs, modules, systems and datasets. Communication and information interchange use SAS macro variables, operating-system environment variables and data structures compatible with the SAS System.

Input data streams will generally require preprocessing in the form of assigned formats and labels. Output datasets will need postprocessing, mainly using merge and set operations.
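The two steps can be sketched in plain SAS as follows. All dataset, variable and format names are hypothetical and serve only to illustrate the idea: a format and a label are attached before the macros run, and output datasets are stacked with a SET statement afterwards.

```sas
/* Hypothetical raw input with a coded variable. */
data work.demo_raw;
  input subjid sex;
  datalines;
1 1
2 2
3 1
;
run;

proc format;
  value sexf 1 = "Male"
             2 = "Female";
run;

/* Preprocessing: store format and label in the dataset header. */
data work.demo_pre;
  set work.demo_raw;
  format sex sexf.;
  label  sex = "Sex";
run;

/* Hypothetical output datasets with ready-made text content. */
data work.out1; length text $20; text = "Block 1"; run;
data work.out2; length text $20; text = "Block 2"; run;

/* Postprocessing: stack the outputs with SET. */
data work.report;
  set work.out1 work.out2;
run;
```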

Modules are kept small (no more than three screen pages) for maintainability and avoid hard-coded references to any application-related information such as data types, labels and formats. The coding style makes broad use of automatic documentation and generation of metadata and lookup tables at runtime.

Fraktal Architecture

The Phase-III Macro System was coded using Fraktal SAS Programming. To a certain degree this concept compares to database normalization. Fraktal SAS Programming restricts program coding according to the one-function-one-module principle, which is very well supported by the means delivered with the SAS System. The architecture of the Phase-III Macro System comprises four types of modules.

Type             Function performed
Info Modules     Provide information about datasets and variables for correct processing
Service Modules  Provide frequently requested tasks in a standard format with limited parameter set
Core Modules     Perform input transformation, calculations and output transformation
User Modules     Generate datasets carrying subtables controlled by user-supplied parms

Approach

In general, Fraktal SAS Programming brought a number of benefits to the coding of the Phase-III Macro System, among them the ability to:

  • Avoid dependency of programs on data scope, study characteristics or personal styles.
  • Have modules implemented in a way to operate in any emerging environment.
  • Be prepared to add new output structures without substantial delay.
  • Produce a wide variety of output with a minimum set of modules.
  • Minimize maintenance efforts through self-documenting and limited program code.
  • Maximize validation throughput by adopting a non-mutual-impact architecture.