Fraktal SAS Programming: Difference between revisions

From phenixxenia.org
(34 intermediate revisions by the same user are not shown)
Line 1: Line 1:
 
[[Kategorie:zazy]]
 
[[Kategorie:zazy]]
  
==Preface==
+
== In General ==
  
The ''SAS System ('''SAS''')'' is an impressively powerful ecosystem of languages, tools and programs that gives users every means to work with data and satisfy their curiosity, whether it is scientific in origin or simply driven by work orders in a top-down organization.
+
'''Welcome to the ''Introduction to "Fraktal SAS Programming"''.'''  
  
Given the above, it is not surprising that
+
'''The pages provided here are intended to serve as guidelines for ''Beginners in SAS Based Reporting from Database Tables.'''
#SAS license fees appear high, and
 
#the individual trying to start a user career feels pretty lonesome.
 
  
 +
'''Why ''"Fraktal"''?''' [https://www.youtube.com/watch?v=qqRiZWGwk-A Start this movie on measurement of coast lines by using Fractals at 16:40 (German audio)]
  
Since no one would buy a modern smartphone simply to make phone calls, it is likewise inappropriate to use SAS solely as a
+
'''We use the term ''"Fraktal"'' with a ''"k"'' here to emphasize that the programming concept introduced is derived from ''Fractal Geometry'' in mathematics but is not identical to it.'''
*SQL database system
 
*basket of tabulation programs
 
*graphics toolbox
 
*web publishing agent
 
*data-warehouse platform
 
*statistics package
 
*metadata manager
 
*source code generator
 
  
 +
'''''Fraktal SAS Programming''''' is considered ''"fractal"'' because the suggested program architecture builds the implementation from minimized, i.e. smallest-scale, modules. Unlike the curve segments of a fractal, e.g. those used in coastline measurement, the module size has a lower limit imposed by [[Coding (from Fraktal SAS Programming)|syntactic properties]]. In practice, module size will range between a few lines and very few screen pages, and a module will very rarely reach 100 lines of code. This includes declarations, communication and documentation as well as logic such as loops and branches.
  
Indeed, SAS can perform any of these functions and more. Even worse, a small team of SAS geeks can deliver any combination of them as a scenario-tailored application in an impressively short time frame.
+
'''To make these guidelines compatible with the subject they present, the structure is likewise modularized to a maximum: it consists of half-page text slides, forming tiny lessons that are easily taken one by one.'''
  
Of course, the result will be a dynamic, self-documenting, metadata driven and generic sort of thing.
+
== In Detail ==
  
That is why SAS starters feel lonesome, and why experienced users have organized themselves in non-commercial networks worldwide, the largest of which is '''PhUSE''', the '''''[http://www.phuse.eu Pharmaceutical User Software Exchange]'''''.
+
#[[Preface (from Fraktal SAS Programming)|'''Preface:''' Learn about appropriate positioning of the '''"SAS System"''' from '''"SAS Institute"'''.]]
 
+
#[[Coding (from Fraktal SAS Programming)|'''Coding:''' Read important considerations on front-end '''program structure''' and back-end '''runtime behaviour'''.]]
'''Are you ready?'''
+
##[[Implicit Coding (from Fraktal SAS Programming)|Implicit Coding]]
 
+
##[[Explicit Coding (from Fraktal SAS Programming)|Explicit Coding]]
'''Welcome to the club!'''
+
#[[Macro (from Fraktal SAS Programming)|'''Macro:''' Find out how every single aspect from your '''workflow definition''' can easily be '''reflected and implemented'''.]]
 
+
##[[Straightforward Coding (from Fraktal SAS Programming)|Straightforward Coding]]
 
+
##[[Generalized Approach (from Fraktal SAS Programming)|Generalized Approach]]
==Coding==
+
##[[Advanced Coding (from Fraktal SAS Programming)|Advanced Coding]]
 
+
##[[Symbol Tables (from Fraktal SAS Programming)|Symbol Tables]]
===Rules?===
+
##[[Parameter Scope (from Fraktal SAS Programming)|Parameter Scope]]
 
+
##[[Extending Control (from Fraktal SAS Programming)|Extending Control]]
While there is no technical reason to introduce and follow coding rules and typographical conventions, doing so has proven helpful, depending on the working context and the purpose pursued.
+
###[[Apply Logic (from Fraktal SAS Programming)|Apply Logic]]
 
+
###[[Process Metadata (from Fraktal SAS Programming)|Process Metadata]]
'''''SAS is freedom''''' is good news for most ad-hoc programmers aiming to have results the same minute.
+
####[[What is Metadata? (from Fraktal SAS Programming)|What is Metadata?]]
 
+
####[[Process Metadata: List (from Fraktal SAS Programming)|Process Metadata: List]]
'''''SAS is freedom''''' is bad news for all team leads and managers bearing responsibility for sustainable usage of resources and maintenance of programs written by individuals that will most likely leave some day.
+
####[[Process Metadata: Numbered (from Fraktal SAS Programming)|Process Metadata: Numbered]]
 
+
####[[Process Metadata: Direct (from Fraktal SAS Programming)|Process Metadata: Direct]]
Throughout this tutorial we will therefore adhere to a set of rules that might seem superfluous at first sight but will help the reader grasp the structure and process implemented in a program without diving deep into the code.
+
###[[Workflow Documentation (from Fraktal SAS Programming)|Workflow Documentation]]
 
+
####[[Stored Workflow Documentation (from Fraktal SAS Programming)|Stored Workflow Documentation]]
 
+
###[[Realtime Information (from Fraktal SAS Programming)|Realtime Information]]
===Standards!===
+
##[[Fully Qualified Coding (from Fraktal SAS Programming)|Fully Qualified Coding]]
 
+
#[[DBMS Interaction (from Fraktal SAS Programming)|'''DBMS:''' Talk to Database Management Systems from within your SAS program to build a seamlessly integrated workflow.]]
SAS supports modular coding very well because code processing follows a block or “group” structure as the architects at SAS Institute Inc. would put it. Let’s directly jump into this topic:
+
##[[Libname Engine (from_Fraktal_SAS_Programming)|Libname Engine]]
 
+
##[[Hybrid Queries (from_Fraktal_SAS_Programming)|Hybrid Queries]]
data basix;
+
##[[Passthru SQL (from_Fraktal_SAS_Programming)|Passthru SQL]]
city='Washington'; lat="038° 054′ N"; long="077° 002′ W"; output;
+
#[[Programming (from Fraktal SAS Programming)|'''Programming:''' Communicate your algorithm to SAS code processors by applying them to data and code as well.]]
city='Berlin'; lat="052° 031′ N"; long="013° 024′ O"; output;
+
##[[Data Step Programming (from Fraktal SAS Programming)|Processing Records]]
city='Tokyo'; lat="035° 041′ N"; long="139° 046′ O"; output;
+
###[[Read Text File with DSL (from Fraktal SAS Programming)|Read Text File]]
proc sort; by lat;
+
###[[Create Dataset with DSL (from Fraktal SAS Programming)|Create Dataset]]
proc print; run;
+
###[[Process Data using DSL (from Fraktal SAS Programming)|Process Data]]
 
+
##[[Macro Programming (from Fraktal SAS Programming)|Generating Code]]
This appears to be an easy-to-read, straightforwardly written program, and that is definitely true. Indeed, this code will complete without error messages and produce a formatted list of three cities along with their explicit latitude and longitude.
+
###[[Macro XSET (from Fraktal SAS Programming)|Copy environment from operating system]]
 
+
###[[Macro XDIR (from Fraktal SAS Programming)|List OS directory in SAS LOG screen]]
'''But this is not the program that is processed by SAS.'''
+
####[[Macro rXDIR (from Fraktal SAS Programming)|Try recursion and window front-end]]
 
+
###[[Macro XEDIT (from Fraktal SAS Programming)|Open selected text file in SAS Program Editor]]
'''What does SAS see?'''
+
###[[Macro XAMINE (from Fraktal SAS Programming)|Build advanced function from basic SAS Macros]]
 
+
#[[Data Structures (from Fraktal SAS Programming)|'''Data Structures:''' Discover the key to flexibility and efficiency in SAS programming.]]
 
+
##[[Tables and Views (from Fraktal SAS Programming)|Tables and Views]]
===Groups===
+
##[[Meta Data Tables (from Fraktal SAS Programming)|Meta Data Tables]]
 
+
##[[SAS Formats (from Fraktal SAS Programming)|SAS Formats]]
The SAS compiler processes the submitted source code in so-called '''''steps''''', which in turn consist of groups of lines, each terminated by a semicolon. If users do not code complete steps, SAS completes the code to a certain extent.
 
 
 
Lines terminated with semicolon are called '''''statements'''''.
 
 
 
Steps composed of statements like those above are called '''''run groups'''''.
 
 
 
Logically, the submitted code from the above example will be transformed into three run groups that are executed in discrete steps. Syntax checking and user feedback are handled separately for each step.
 
 
 
data basix;
 
city='Washington'; lat="038° 054′ N"; long="077° 002′ W"; output;
 
city='Berlin'; lat="052° 031′ N"; long="013° 024′ O"; output;
 
city='Tokyo'; lat="035° 041′ N"; long="139° 046′ O"; output;
 
run;
 
 
 
proc sort data=basix out=basix;
 
by lat;
 
run;
 
 
 
proc print data=basix;
 
run;
 
 
 
 
 
===Segments===
 
 
 
As mentioned earlier, a workflow coded in SAS is processed as a sequence of blocks or groups. Since this processing structure is used everywhere in SAS, we will refer to these blocks and groups as '''''segments''''' throughout the remainder of this text.
 
 
 
Because several languages are available inside SAS, segments can look quite different from one another. The '''''run group'' example''' from above is merely one of them.
 
 
 
'''Segments from different syntaxes may be hierarchically nested.'''
 
 
 
'''Segments may not intersect, with one exception, however.'''
 
 
 
 
 
==Macro==
 
 
 
===Straightforward Coding===
 
 
 
Because it is the most prominent type in a professional senior SAS programmer’s lifetime output (the ''Oeuvre''), we will describe SAS Macro ('''''"MACRO"''''') coding as the '''''first segment type'''''.
 
 
 
As we remember from the '''''run group'' segment type''' example, code segments are enclosed verbatim between an initializing statement and a corresponding termination statement. '''MACRO definitions''' are delimited by the two specific statements shown here; the final line invokes the MACRO by its name:
 
 
 
%MACRO name;
 
program code
 
%MEND name;
 
 
 
%NAME;
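
As a minimal sketch of such a segment, a complete definition and invocation might look as follows. The MACRO name <code>hello</code> and the use of the <code>sashelp.class</code> sample table are assumptions chosen for illustration; they do not appear in the original text.

```sas
/* Definition: everything between %MACRO and %MEND is stored, not executed. */
%macro hello;
  /* The stored text is an ordinary run group. */
  proc print data=sashelp.class(obs=3);
  run;
%mend hello;

/* Invocation: the stored text is now passed on for processing. */
%hello;
```

Submitting only the definition produces no output; the PROC PRINT step runs only once the MACRO is invoked.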
 
 
 
===Generalized Approach===
 
 
 
It is worth stressing here that a MACRO does not itself execute the program code it contains but passes it to the '''''SAS'' compiler''', which performs a '''Compile-and-Go''' step by default. Nevertheless, it would be premature to assume that this mechanism requires the code to be SAS code.
 
 
 
'''Instead, it is possible to GENERATE-and-PASS any code.'''
 
 
 
SAS provides means and concepts to direct generated source code to appropriate agents, be it external programs or the operating system itself. OS functions may be called explicitly or implicitly or code may be written to a text file that is executed later on.
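
The last option mentioned, writing code to a text file that is executed later on, could be sketched as below. The file reference name <code>gencode</code> and the generated PROC PRINT step are illustrative assumptions; <code>sashelp.class</code> is the sample table shipped with SAS.

```sas
/* Temporary text file that will receive the generated source code. */
filename gencode temp;

/* Step 1: generate the code as plain text. */
data _null_;
  file gencode;
  put 'proc print data=sashelp.class(obs=3);';
  put 'run;';
run;

/* Step 2: execute the generated file; SOURCE2 echoes its lines to the LOG. */
%include gencode / source2;

/* Release the file reference when it is no longer needed. */
filename gencode clear;
```

Because the generated file is plain text, the same technique works for code in any language, provided a suitable agent is called to process it instead of <code>%include</code>.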
 
 
 
Out of the numerous options, the following two might appear quite useful.
 
 
 
 
 
====Utilize OS Functions====
 
 
 
1. Access results from OS commands as data source.
 
 
 
filename myfref pipe "dir c:\ /d";
 
 
 
This statement assigns a file reference with target type ''pipe''. The pipe type dynamically accesses the result of an OS function as a data stream that can be used as a text input file inside a data step.
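
A hedged sketch of how such a pipe might be consumed inside a data step is shown below. It assumes a Windows host and a session that permits OS commands (system option XCMD); the dataset name <code>work.dirlist</code> and the variable name <code>line</code> are hypothetical.

```sas
/* File reference of type PIPE, as in the statement above. */
filename myfref pipe "dir c:\ /d";

/* Read every line the OS command returns into one character variable. */
data work.dirlist;
  infile myfref truncover;   /* tolerate lines shorter than 256 characters */
  input line $char256.;      /* keep leading blanks of the listing         */
run;

/* Show what was captured. */
proc print data=work.dirlist;
run;
```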
 
 
 
2. Perform an operation on OS level.
 
 
 
systask command "mkdir c:\&MYDIR.";
 
 
 
The SYSTASK statement is a powerful means to initiate and control background tasks. With options WAIT/NOWAIT it provides direct utilization of OS multitasking by initiating parts of complex SAS code as background tasks.
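
The following sketch shows one way the NOWAIT option might be combined with a task name and a status variable. The directory name <code>demo_out</code>, the task name <code>mkd</code> and the macro variable <code>mkd_rc</code> are assumptions for illustration. The SHELL option is added on the assumption that <code>mkdir</code> is a shell built-in on the host; whether it is needed depends on the operating system.

```sas
/* Hypothetical target directory. */
%let MYDIR = demo_out;

/* Start the OS command in the background and give the task a name. */
systask command "mkdir c:\&MYDIR." shell taskname=mkd status=mkd_rc nowait;

/* ... other SAS code could run here while the task is active ... */

/* Synchronize: continue only after the named task has finished. */
waitfor mkd;

/* The STATUS= macro variable now holds the return code of the command. */
%put NOTE: mkdir finished with return code &mkd_rc.;
```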
 
 
 
 
 
====Write Vector Graphics====
 
 
 
filename _xml_ "&MYPTH.\&MYGPH..svg";
data _null_;
file _xml_;
put '<?xml version="1.0" standalone="no"?>';
put '<!DOCTYPE svg PUBLIC "-//W3C//DTD SVG 1.1//EN" "http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd">';
put '<svg xmlns="http://www.w3.org/2000/svg" version="1.1"';
put '  width="29.7cm" height="21cm"';
put '  viewBox="-200 -100 1200 800">';
put '  <desc>Example anim01 - demonstrate animation elements</desc>';
put '  <title>SpotGrid</title>';
put '  <rect id="OuterBorder" x="-4" y="-4" width="904" height="604" fill="rgb(255,255,255)" stroke="rgb(0,0,0)" stroke-width="8">';
put '  </rect>';
put '</svg>';
run;
 
 
 
 
 
===Advanced Coding===
 
 
 
While the crude approach to MACRO programming is the first choice for any ad-hoc implementation, it will never result in a piece of software that survives in a quality-controlled environment. Moreover, when used as a component in a modularized system, it will not produce predictable results and will very likely mess things up at run-time.
 
 
 
'''Why?'''
 
 
 
The key reason is that SAS languages, like other programming languages, do use variable properties, but without forcing the programmer to supply this information by declaring everything beforehand in a header section or similar place.
 
 
 
SAS code is executed regardless of whether an explicit declaration is found. When none is found, SAS applies built-in rules to perform an automatic declaration, on which it then operates. Properties assigned that way might not conform to the programmer’s expectations or the system’s design requirements.
 
 
 
'''That’s why!'''
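
A small data step sketch may illustrate the effect of automatic declaration. The dataset names <code>work.implicit</code> and <code>work.explicit</code> are hypothetical; the city names are borrowed from the earlier example.

```sas
/* No declaration: the length of CITY is derived from its first use,   */
/* so the variable is created with length 6 (the length of 'Berlin').  */
data work.implicit;
  city = 'Berlin';
  output;
  city = 'Washington';   /* silently truncated to fit length 6 */
  output;
run;

/* Explicit declaration: the programmer decides on the property. */
data work.explicit;
  length city $ 10;
  city = 'Berlin';
  output;
  city = 'Washington';   /* stored in full */
  output;
run;

proc print data=work.implicit; run;
proc print data=work.explicit; run;
```

Both steps complete without error messages, which is exactly why the difference is easy to overlook.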
 
 
 
 
 
===Symbol Tables===
 
 
 
Control information tokens, referred to as ''parameters'', are called '''''macro variables''''' (''"variables"'') when writing MACROs. Macro variables are stored in tables called '''''symbol tables''''' (''"tables"'').
 
 
 
When a SAS process starts, a '''''global symbol table''''' is created and populated with control information used by the session and by non-MACRO programs.
 
 
 
On invocation of a MACRO, a '''''local symbol table''''' is created and kept alive during the run-time of the MACRO. Local symbol tables disappear on termination of '''"their" MACRO'''.
 
 
 
Symbol tables are two-column character type matrices with '''one single property, ''scope'', being either ''global'' or ''local'''''. MACRO variable names are stored in the first column, MACRO variable values in the second column.
 
 
 
'''MACRO symbol tables are stored in memory.'''
 
 
 
'''MACRO functions are processed in memory.'''
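
The contents of the symbol tables can be written to the LOG for inspection. The sketch below assumes a hypothetical MACRO <code>show_tables</code> and two illustrative variables, <code>project</code> and <code>step</code>.

```sas
/* Open code: this variable is stored in the global symbol table. */
%let project = demo;

%macro show_tables;
  /* This variable lives in the local symbol table of SHOW_TABLES. */
  %local step;
  %let step = 1;

  /* List the local table of the MACRO that is currently executing. */
  %put _local_;

  /* List the user-created variables in the global table. */
  %put _global_;
%mend show_tables;

%show_tables;

/* After termination the local table is gone; only PROJECT remains visible. */
%put _user_;
```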
 
 
 
 
 
===Parameter Scope===
 
 
 
Since scope is the only property of MACRO variables, declaration is easy: Simply assign each variable used to one of these two groups.
 
 
 
However, there is a set of rules requiring your attention:
 
*A particular variable name may appear in an unlimited number of tables.
 
*MACROs may be nested to form unlimited invocation hierarchies.
 
*A calling MACRO’s local table appears global to the called MACRO.
 
*Read references are resolved first against the local table, then outward through the tables of the calling MACROs, and finally against the global table.
 
*Write references are processed likewise: local first, global second.
 
*Write references that find no match anywhere in the invocation hierarchy create a new local variable.
 
 
 
 
 
'''Obviously, variable declaration is a critical issue in a validated environment. If it is not done completely, the validation status of the whole system is questionable.'''
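
The rules above can be tried out with a pair of nested MACROs. The names <code>outer</code>, <code>inner_leaky</code> and <code>inner_safe</code> are hypothetical and serve only to show what a missing declaration does.

```sas
/* No declaration: the write reference to I finds the caller's variable. */
%macro inner_leaky;
  %let i = 99;
%mend inner_leaky;

/* Explicit declaration: I belongs to the local table of INNER_SAFE. */
%macro inner_safe;
  %local i;
  %let i = 99;
%mend inner_safe;

%macro outer;
  %local i;

  %let i = 1;
  %inner_leaky;
  %put NOTE: after INNER_LEAKY the calling MACRO sees i=&i.;  /* i=99 */

  %let i = 1;
  %inner_safe;
  %put NOTE: after INNER_SAFE the calling MACRO sees i=&i.;   /* i=1 */
%mend outer;

%outer;
```

In a loop controlled by <code>i</code>, the undeclared variant would silently change the loop counter of the calling MACRO.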
 
 
 
 
 
===Extending Control===
 
 
 
Now, with proper declaration, it is safe to run your MACRO as a component in a validated system. However, it is still difficult to follow its results and to discover failure risks or unwanted behavior.
 
 
 
You might therefore find it useful to add functionality such as:
 
#apply logic to check for parameters’ appropriate values
 
#navigate through the ecosystem by reading and processing metadata
 
#document workflow by writing comments to the LOG
 
#inform the responsible persons about an invocation by sending an email
 
#write a text file that contains the plain code the MACRO generated
 
 
 
 
 
We will implement these requirements now step by step and thereby touch relevant parts of the '''so-called ''SAS Macro Facility'''''.
 
 
 
 
 
===Apply Logic===
 
 
 
Implementing MACRO logic is quite comparable to other languages, except that SAS requires so-called '''MACRO Triggers''' (''"TRIGGERS"'') to direct processing to the appropriate subsystem inside the SAS ecosystem. These are:
 
 
 
'''''& – the ampersand'': indicates parameter reference'''
 
 
 
'''''% – the percent sign'': indicates syntax elements'''
 
 
 
TRIGGERS became necessary early in the history of SAS because the SAS Macro Facility was intended to perform text processing before code is sent to the SAS compiler. To invoke the text pre-processor, every token is checked to see whether its first character is a TRIGGER.
 
 
 
Of course the segment structure of coding also applies here:
 
 
 
'''%IF %LENGTH(&PARM_I.) ne 0 %THEN %DO;'''
 
''program code''
 
'''%END;'''
 
'''%ELSE %DO;'''
 
''alternate program code''
 
'''%END;'''
 
 
 
Depending on whether a value is supplied in parameter PARM_I either '''"program code"''' or '''"alternate program code"''' is passed to the SAS compiler for processing.
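
Embedded in a complete MACRO, the segment might look like the sketch below. The MACRO name <code>report</code> and the meaning given to <code>PARM_I</code> (a row limit for the <code>sashelp.class</code> sample table) are assumptions made for illustration.

```sas
%macro report(parm_i=);
  %if %length(&parm_i.) ne 0 %then %do;
    /* "program code": a value was supplied, use it as row limit */
    proc print data=sashelp.class(obs=&parm_i.);
    run;
  %end;
  %else %do;
    /* "alternate program code": no value supplied, print all rows */
    proc print data=sashelp.class;
    run;
  %end;
%mend report;

/* A value is supplied: the first branch is generated. */
%report(parm_i=5);

/* No value is supplied: the second branch is generated. */
%report();
```

Note that the decision is made while the code is generated; the SAS compiler only ever receives one of the two PROC PRINT steps.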
 
 
 
 
 
===What is Metadata?===
 
 
 
A widely used definition of metadata relies on two characteristics:
 
*primary control function vs. data content
 
*structured repository vs. one-dimensional parameter list
 
 
 
 
 
Resting on these two pillars, the definition does not correspond exactly to the literal meaning of metadata, which is "data about data". Instead, metadata are described as parameters used to
 
#define,
 
#control and  
 
#integrate
 
the dynamic workflow of a software system composed of autonomous modules.
 
 
 
 
 
Requiring storage in a structured repository is a strong condition, and it allows for unlimited complexity of workflow logic and its control.
 
 
 
'''This metadata repository is called ''METABASE''.'''
 
 
 
Finally, data may easily change domains from DATABASE to METABASE and back, being data content in one phase and control information in another phase of system runtime.
 
 
 
 
 
===Process Metadata===
 
 
 
Although the following examples hold for every static METABASE design, we will start by considering dynamic metadata arrays that are read from database tables or datasets.
 
Using dynamic metadata arrays is commonly referred to as the '''''DATA DRIVEN APPROACH''''' because processing depends on the content of data tables.
 
 
 
 
 
Examples are based on the sample CLASS dataset and presentation will have three logical steps:
 
#generate the list itself
 
#obtain the number of elements
 
#utilize list elements 
 
 
 
 
 
To accomplish this task we will use Data Steps along with PROCs SQL and PRINT in order to sketch three approaches:
 
*a list with selected delimiters (''"List"'')
 
*a set of numbered parameters (''"Numbered"'')
 
*a single parameter that is used repeatedly (''"Direct"'')
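
As a rough preview of the ''"List"'' approach, the three logical steps could be sketched as follows. It assumes that the CLASS dataset is the <code>sashelp.class</code> sample table; the macro variable names <code>sexlist</code> and <code>n_sex</code> and the MACRO name <code>per_sex</code> are hypothetical and not taken from the original pages.

```sas
/* Step 1: generate the list itself, using a blank as delimiter. */
proc sql noprint;
  select distinct sex
    into :sexlist separated by ' '
    from sashelp.class;
quit;

/* Step 2: obtain the number of elements in the list. */
%let n_sex = %sysfunc(countw(&sexlist., %str( )));
%put NOTE: list=&sexlist. elements=&n_sex.;

/* Step 3: utilize the list elements, one generated step per element. */
%macro per_sex;
  %local i value;
  %do i = 1 %to &n_sex.;
    %let value = %scan(&sexlist., &i., %str( ));
    title "CLASS records where sex = &value.";
    proc print data=sashelp.class;
      where sex = "&value.";
    run;
  %end;
  title;
%mend per_sex;

%per_sex;
```

The program contains no hard-coded values of <code>sex</code>; if the table content changes, the generated steps change with it.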
 

Current revision as of 27 October 2015, 16:42


In General

Welcome to the Introduction to "Fraktal SAS Programming".

The pages provided here are intended to serve as guidelines for Beginners in SAS Based Reporting from Database Tables.

Why "Fraktal"? Start this movie on measurement of coast lines by using Fractals at 16:40 (German audio)

We use the term "Fraktal" with a "k" here to emphasize that the programming concept introduced is derived from Fractal Geometry in mathematics but is not identical to it.

Fraktal SAS Programming is considered "fractal" because the suggested program architecture builds the implementation from minimized, i.e. smallest-scale, modules. Unlike the curve segments of a fractal, e.g. those used in coastline measurement, the module size has a lower limit imposed by syntactic properties. In practice, module size will range between a few lines and very few screen pages, and a module will very rarely reach 100 lines of code. This includes declarations, communication and documentation as well as logic such as loops and branches.

To make these guidelines compatible with the subject they present, the structure is likewise modularized to a maximum: it consists of half-page text slides, forming tiny lessons that are easily taken one by one.

In Detail

  1. Preface: Learn about appropriate positioning of the "SAS System" from "SAS Institute".
  2. Coding: Read important considerations on front-end program structure and back-end runtime behaviour.
    1. Implicit Coding
    2. Explicit Coding
  3. Macro: Find out how every single aspect from your workflow definition can easily be reflected and implemented.
    1. Straightforward Coding
    2. Generalized Approach
    3. Advanced Coding
    4. Symbol Tables
    5. Parameter Scope
    6. Extending Control
      1. Apply Logic
      2. Process Metadata
        1. What is Metadata?
        2. Process Metadata: List
        3. Process Metadata: Numbered
        4. Process Metadata: Direct
      3. Workflow Documentation
        1. Stored Workflow Documentation
      4. Realtime Information
    7. Fully Qualified Coding
  4. DBMS: Talk to Database Management Systems from within your SAS program to build a seamlessly integrated workflow.
    1. Libname Engine
    2. Hybrid Queries
    3. Passthru SQL
  5. Programming: Communicate your algorithm to SAS code processors by applying them to data and code as well.
    1. Processing Records
      1. Read Text File
      2. Create Dataset
      3. Process Data
    2. Generating Code
      1. Copy environment from operating system
      2. List OS directory in SAS LOG screen
        1. Try recursion and window front-end
      3. Open selected text file in SAS Program Editor
      4. Build advanced function from basic SAS Macros
  6. Data Structures: Discover the key to flexibility and efficiency in SAS programming.
    1. Tables and Views
    2. Meta Data Tables
    3. SAS Formats