Fast and Efficient Updates to Project Deliverables: The Unix/GNU Make Facility as a Cross-Platform Batch Controller for SAS®

Concurrent development of interdependent parts is a common feature of complex programming projects. In clinical reporting, statistical summaries rely on derived datasets (which in turn rely on raw data), and are generated by programs often developed simultaneously with derivation programs. This concurrent development pushes components out of sync: programs must be re-run in sequence before updates to raw data and revisions to analysis datasets are reflected in statistical output. As projects grow in scope and complexity, the need to manage file dependencies increases. Lack of synchronization hampers validation and QC activities, imposes redundancy on the production of output, and threatens project timelines.
The Unix ‘Make’ facility (and its cross-platform analogue, the GNU ‘Make’ facility) is a generalized ‘build system’ designed to manage dependencies in coding projects. In general terms, Make examines the relationships among project files and rebuilds ‘target’ files, in dependency order, whenever the prerequisites of those files have changed. This paper introduces the potential of Make for managing dependencies in SAS® programming projects, and describes how to adapt Make as a SAS batch controller under both Unix and Windows, so that output files (derived datasets and statistical summaries) are refreshed efficiently whenever any of their prerequisites (programs and datasets) have changed. Automation techniques for detecting dependencies, and for generating and testing ‘makefiles’ (specialized script files interpreted by Make), are also discussed.
The techniques and tools described should be adaptable to any SAS version and any platform, and are intended for SAS users of all levels of experience.
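
The rebuild rule summarized above (a target is refreshed when it is missing or older than any of its prerequisites) can be illustrated with a minimal Python sketch. This is not Make itself and not the paper's implementation; it only mimics the timestamp comparison under stated assumptions. The function names `needs_rebuild` and `stale_targets`, the file names, and the assumption that rules are already listed in dependency order are all hypothetical and chosen for illustration.

```python
import os
import tempfile


def needs_rebuild(target, prerequisites):
    """Return True if target is missing or older than any prerequisite."""
    if not os.path.exists(target):
        return True
    target_mtime = os.path.getmtime(target)
    return any(os.path.getmtime(p) > target_mtime for p in prerequisites)


def stale_targets(rules):
    """List the targets that would be refreshed in one pass.

    `rules` is a list of (target, prerequisites) pairs, assumed to be
    given in dependency order. A target is stale if the timestamp test
    fails or if one of its prerequisites was itself marked stale earlier
    in the pass.
    """
    rebuilt = []
    for target, prereqs in rules:
        if needs_rebuild(target, prereqs) or any(p in rebuilt for p in prereqs):
            rebuilt.append(target)
    return rebuilt


if __name__ == "__main__":
    base = 1_000_000_000  # arbitrary epoch offset for the demo timestamps
    with tempfile.TemporaryDirectory() as d:

        def touch(name, age):
            # Create an empty file and set its modification time explicitly.
            path = os.path.join(d, name)
            with open(path, "w"):
                pass
            os.utime(path, (base + age, base + age))
            return path

        # Hypothetical project: raw data -> derived dataset -> summary table.
        raw = touch("raw.sas7bdat", 300)      # raw data refreshed last
        derive = touch("derive.sas", 100)     # derivation program
        adsl = touch("adsl.sas7bdat", 200)    # derived before the raw refresh
        table_pgm = touch("table1.sas", 100)  # reporting program
        table = touch("table1.lst", 250)      # summary output

        rules = [(adsl, [derive, raw]), (table, [table_pgm, adsl])]
        print([os.path.basename(t) for t in stale_targets(rules)])
```

In this hypothetical project the derived dataset is older than the refreshed raw data, so it is flagged, and the summary table is flagged in turn because it depends on that dataset. In a real makefile, each rule would also carry the command that re-runs the corresponding SAS program in batch mode.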

Paper Type: Paper
