yruba - why rules for bash?
... because it's cool!
Introduction
The idea of yruba is similar to make,
ant and other rule evaluators
used to build software. Both
make and ant are a pain to use and
debug. Some reasons are:
- The typical makefile has a horrible syntax. The make syntax is mixed with shell syntax, and quoting and escaping special characters becomes a nightmare.
- The rule system of make is crippled because the reason why a target is out-of-date is not configurable. It is always a time stamp.
- The designers of ant mistook XML for a programming language, again ending up with syntax madness.
- Everything in ant had to be developed from scratch, because direct access to the underlying operating system is sacrificed for portability. But without java/ant there is no portability anyway.
- While ant separates updating of dependencies and testing whether the target needs updating, the tests are implicit in the task implementation and almost never documented.
Whether yruba manages to get out of syntax hell depends on
your taste for shell script syntax. The rule concept is,
however, more explicit and more clearly defined than in either
ant or make (see Concepts).
Yruba is intended to be simple to use for everyone who has done some shell scripting. Most build systems could in principle be written as one long shell script. Then, however, everything would be done every time, which is a waste of time and resources. Consequently, it is necessary to have a control structure that is different from the common loops, conditional statements and function calls: namely rules. Yruba adds rules to shell scripting.
Concepts
The sole purpose of yruba is to get targets up-to-date. A target is often a single file, but can also represent a list of files or some overall goal to reach. A target gets updated by mapping it to a task that implements the necessary actions. A task has a name, and it is defined by a list of dependencies, a test and a command or function to execute:
- dependencies
- The optional list of dependencies for a task contains names of other targets. These must be up-to-date before the task can even be considered.
- test
- The optional test makes sure the task is not run unnecessarily. The task's command is run only if the test returns true (exit code 0), signifying that the target is out-of-date.
- command
- The required command for a task performs all actions necessary to bring the target(s) up-to-date.
Get Going
Yruba comes as the shell script yruba. The
script looks for a task description file with the default name
yrules. This file uses bash syntax to
define tasks as described in section Writing Tasks. A default target can be
specified in yrules, but targets can also
be specified on the command line. To get a feel for how things
go, have a look at this simple example.
defaultTarget="hello" # optional
# map requests for *.msg to makemsg
mapTarget '*.msg' makemsg
# task "makemsg"
test_makemsg() { ! test -f "$1"; }
cmd_makemsg() { echo "Yruba is cool!" >"$1"; }
# task "hello"
dep_hello() { echo hello.msg; }
cmd_hello() { cat hello.msg; }
When yruba evaluates this rules file, it will first consider
target hello. Since hello is not
mapped to a task name with mapTarget,
hello itself is taken as the task name. Task
hello has the dependency hello.msg, as
produced by the function dep_hello. Now
hello.msg is considered as a target. Due to the
target mapping for *.msg, the task to run is
makemsg. This task has no dependencies (no
dep_makemsg), but it has a test
(test_makemsg). If the test finds that the file
hello.msg does not exist, the task's command,
cmd_makemsg, is run. Once this is done,
cmd_hello is run immediately, since
hello has no test.
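The walk-through above can be replayed with a few lines of shell. The following is a minimal sketch, not yruba's actual implementation: consider and taskFor are hypothetical names, taskFor stands in for the map built by mapTarget, and the dependency list is split on whitespace instead of going through eval. Passing the dependencies as extra arguments to the test and command is omitted as well.

```shell
# Rules from the example above; mapTarget is replaced by a stand-in.
taskFor() {                       # hypothetical: target -> task name
  case "$1" in
    *.msg) echo makemsg ;;
    *)     echo "$1" ;;           # default: task is named like the target
  esac
}
test_makemsg() { ! test -f "$1"; }
cmd_makemsg()  { echo "Yruba is cool!" >"$1"; }
dep_hello()    { echo hello.msg; }
cmd_hello()    { cat hello.msg; }

# Hypothetical mini-evaluator: dependencies first, then test, then command.
consider() {
  local target="$1" task d
  task=$(taskFor "$target")
  if command -v "dep_$task" >/dev/null 2>&1; then
    for d in $("dep_$task" "$target"); do
      consider "$d"
    done
  fi
  if command -v "test_$task" >/dev/null 2>&1; then
    "test_$task" "$target" || return 0   # test false: already up-to-date
  fi
  "cmd_$task" "$target"
}

workdir=$(mktemp -d)              # scratch directory for hello.msg
cd "$workdir" || exit 1
consider hello                    # prints: Yruba is cool!
```

A second call to consider hello skips cmd_makemsg, because test_makemsg now finds hello.msg, but still runs cmd_hello.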
Target Mapping
The difference between target and task is
only a subtle one. By default, a target and its task have the
same name. If you have a task called tgz to
pack your source files into a tar archive, you may
use tgz as a target name in a dependency list and on
the command line:
% yruba tgz
But it is also possible to map targets via a pattern to a
task. If there is one task, called compileC, that
compiles C source files into object files, we want to use this
task for all targets *.o. This is done by calling
mapTarget:
mapTarget '*.o' compileC
Whenever a target matching *.o is now
considered, it is mapped to the task compileC.
Writing Tasks
Implementation of a task requires up to three shell functions or scripts:
- one to produce the dependency list,
- one for the test and
- one for the command.
In what follows, the term function is used throughout although a script can usually also be used.
The Dependency List
A task may depend on targets that must be up-to-date before
its command can be run. The dependencies are specified by
implementing a
function that prints the targets on
stdout. It will be called with a single argument
— the target under consideration. The name of the function
must be the name of the task prefixed by
dep_. For a task called
jar, intended to produce a jar file from Java class
files, this would be:
dep_jar() {
# $1, the target's name, not used here
echo compileJava
}
This assumes that compileJava is a target that
can be mapped to a task or is itself the name of a task to create
the class files we want to pack. Users of make may
be tempted to list all the class files as dependencies. But this
is only because make does not separate the
dependencies from the test. With yruba the list of
class files only comes into play in the test function as
described below.
The string returned by dep_* will undergo one
round of eval processing in a statement like
eval set -- "$deps"
in order to have proper list handling. In most cases this does no harm, but if there are potentially dangerous characters in some dependencies, then the use of lappend is recommended to build up the dependency list:
deps=$(lappend "$deps" "$someDep")
...
echo "$deps"
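To see why the eval round matters, the following sketch feeds three hand-written dependency strings through the same statement. The variable names and file names are made up for illustration; only the eval set -- line is taken from the text above.

```shell
# Strings as a dep_* function might print them (hypothetical examples).
deps_simple='lib.o main.o'           # two harmless names
deps_naive='my file.o main.o'        # unquoted space: splits into 3 words
deps_quoted="'my file.o' main.o"     # quoted: stays 2 words

eval set -- "$deps_simple"; simple_count=$#
eval set -- "$deps_naive";  naive_count=$#
eval set -- "$deps_quoted"; quoted_count=$#; quoted_first=$1

echo "$simple_count $naive_count $quoted_count [$quoted_first]"
# prints: 2 3 2 [my file.o]
```

Names without special characters pass through unchanged, which is why a plain echo is sufficient in most dep_* functions.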
A note on source files as dependencies:
make users may also like to put source files,
e.g. xyz.c, in the dependency list. This, however, is
only necessary if xyz.c is itself created by a code
generator or must be checked out of a repository, for
example. Otherwise yruba will merely check if the
file is there and continue. But this is a waste of time, because
if the file was inadvertently deleted, an
error message will result in the test function soon enough.
The dep_* function for a task is optional. If it
does not exist, an empty dependency list is assumed.
Use of Variables
An additional detail to know is that the dep_*
function is called in a subshell. As a result it has access to
all shell variables but changing them will not have any effect
after the return from the function.
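The effect is the same as with any command substitution, which is one common way to run a function in a subshell. That yruba uses exactly this mechanism is an assumption here; dep_demo and counter are hypothetical names.

```shell
counter=0

dep_demo() {
  counter=$((counter + 1))   # visible only inside the subshell
  echo some.target
}

deps=$(dep_demo demo)        # command substitution forks a subshell
echo "deps=$deps counter=$counter"
# prints: deps=some.target counter=0
```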
The Test
After yruba has recursively considered and, where necessary, updated
all dependencies of a task, it calls the task's test
function. The name of this function must be the task's name
prefixed by test_. Now that all the dependencies
are up-to-date, the test can finally determine if the target
under consideration is out-of-date. For the task
jar the test function would be
test_jar() {
local jarfile=$1
# $2, $3, etc., the dependencies, not used here
JARCONTENT= ... # find list of files to pack
old "$jarfile" -d $JARCONTENT
}
The first argument of the test function is the target under
consideration. The following arguments are the dependency list
as produced by the dep_* function. The
function old comes with
yruba.
The task's command, as described below, is called only if the
test returns true (exit code 0). If it returns false
(exit code ≠ 0), the target is assumed to be up-to-date
already. The test_* function is optional. If it
does not exist, the target is assumed to be out-of-date and the
command is called unconditionally.
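Because the test is an ordinary function, the out-of-date criterion need not be a time stamp. As a sketch, the hypothetical task sum below treats a target x.sum as out-of-date whenever the checksum of x differs from the one recorded in x.sum. The task name, the file names and the use of cksum are assumptions made for this illustration, and the functions are called by hand where yruba would normally call them.

```shell
# Hypothetical task "sum": target data.txt.sum records cksum of data.txt.
test_sum() {
  local target="$1"
  local src="${target%.sum}"
  [ -f "$target" ] || return 0                     # no record yet: out-of-date
  [ "$(cksum < "$src")" != "$(cat "$target")" ]    # true if content changed
}
cmd_sum() {
  local target="$1"
  cksum < "${target%.sum}" > "$target"             # record current checksum
}

dir=$(mktemp -d)
printf 'one\n' > "$dir/data.txt"
test_sum "$dir/data.txt.sum" && first=out-of-date    # record missing
cmd_sum  "$dir/data.txt.sum"
test_sum "$dir/data.txt.sum" || second=up-to-date    # record matches
printf 'two\n' > "$dir/data.txt"
test_sum "$dir/data.txt.sum" && third=out-of-date    # content changed
rm -rf "$dir"

echo "$first, $second, $third"
# prints: out-of-date, up-to-date, out-of-date
```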
Use of Variables
In contrast
to the dep_* function, the test_*
function is not called in a subshell. Consequently it is
able to set or change global
variables. In the example
this is used to set JARCONTENT to the list of files
to pack. The variable will be used again in the
cmd_* function to finally pack the jar file (see below). Be
careful, however, not to inadvertently change variables
somewhere up the call stack. If in doubt, use local
variables.
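The difference between a deliberately shared variable and a scratch variable can be sketched as follows. The task pack and the variables PACKCONTENT, file and summary are hypothetical, and the two functions are called by hand in the order yruba would use.

```shell
file=important            # some variable used elsewhere in the rules file

test_pack() {
  local file              # scratch variable: kept out of the caller's scope
  PACKCONTENT=""          # deliberately global: cmd_pack reads it later
  for file in a.class b.class; do
    PACKCONTENT="$PACKCONTENT $file"
  done
  return 0                # pretend the target is out-of-date
}
cmd_pack() { summary="pack $1:$PACKCONTENT"; }

test_pack out.jar && cmd_pack out.jar
echo "$summary"           # prints: pack out.jar: a.class b.class
echo "$file"              # prints: important
```

Without the local declaration, the loop in test_pack would leave file set to b.class for every task evaluated afterwards.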
After calling the test function, yruba resets the current directory to where it was before the call to not disturb other tasks.
The Command
The minimum necessary to implement a task is the function that
updates the target. Its name must be the task name
prefixed with cmd_. For the task jar
this is cmd_jar. When the function is
eventually called by yruba, the first argument will
be the name of the target and the other arguments will be the
elements of the dependency list.
cmd_jar() {
local jarfile=$1
# refer to JARCONTENT as set by test_jar
jar cf "$jarfile" $JARCONTENT
}
Use of Variables
Like test_*, cmd_* is not
called in a subshell either. This makes it possible, for example, to
set a variable as the result of a task, as opposed to the
classical generation of a file. The comments about the dangers of
changing global variables apply as above.
After calling the command function, yruba resets the current directory to where it was before the call to not disturb other tasks.
Special Variables
defaultTarget —
the default target
The variable should normally be set somewhere at the
beginning of the yrules file. It provides the
target to consider if none is given to yruba on the
command line.
yruba_* — internal
variables of yruba
Variables prefixed with either yruba_ or
YRUBA_ should not be changed anywhere, because
they contain yruba internal information.
Library Functions
Yruba comes with a few predefined functions that may help in
writing tasks. Some of the functions produce a result to be
picked up by the caller. This may be a string printed on
stdout. If the function implements a boolean test,
then the result is delivered via the exit code. In the following
documentation, we say the test returns true, if
the exit code is 0, as is customary in shell programming. An exit
code other than 0 denotes false.
old — compare file
modification times
old t1 [...] -d [d1 [...]]
result: exit code
returns true if any of the file targets t1 [...]
is older than (test -ot) any of the dependency
files listed after -d.
Option -d must be present, but may have zero
arguments. If no dependency is given, false is returned, meaning
the target(s) are not old. If any of the dependencies does not
exist as a file (test -f), yruba exits with an error
message. If any of the targets does not exist as a file, true is
returned (this is a feature of test -ot).
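The rules above can be restated as a short function. This is a sketch of the documented behaviour only, under the name old_sketch, and is not yruba's old: in particular it returns exit code 2 for a missing dependency, where yruba exits with an error message, and it checks for a missing target explicitly instead of relying on test -ot.

```shell
# Hypothetical re-statement of the documented rules for "old".
old_sketch() {
  local t d seen_d
  for t in "$@"; do
    [ "$t" = "-d" ] && break            # targets end at -d
    seen_d=no
    for d in "$@"; do
      if [ "$seen_d" = no ]; then       # skip everything up to and incl. -d
        [ "$d" = "-d" ] && seen_d=yes
        continue
      fi
      [ -f "$d" ] || { echo "old_sketch: no such file: $d" >&2; return 2; }
      [ -e "$t" ] || return 0           # missing target counts as old
      [ "$t" -ot "$d" ] && return 0     # target older than a dependency
    done
  done
  return 1                              # nothing old, or no dependencies
}

dir=$(mktemp -d)
touch -t 202001010000 "$dir/main.o"     # older file
touch -t 202101010000 "$dir/main.c"     # newer file
old_sketch "$dir/main.o" -d "$dir/main.c"    && r1=old
old_sketch "$dir/main.c" -d "$dir/main.o"    || r2=fresh
old_sketch "$dir/missing.o" -d "$dir/main.c" && r3=old
old_sketch "$dir/main.o" -d                  || r4=fresh
rm -rf "$dir"
echo "$r1 $r2 $r3 $r4"                  # prints: old fresh old fresh
```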
haveClass —
does the current CLASSPATH provide a given
class?
haveClass class
result: exit code
Tries to compile a small java source file that references the
given class. The class must be in
fully qualified dot-notation. The compiler used is
${JAVAC} with a default of javac. The
function returns true, if the file can be compiled without
error. No CLASSPATH is explicitly set, meaning
that the value from the environment is relevant.
die —
print message and exit
die [text [...]]
result: none
The given text is printed to stderr, prefixed by
yruba:. Then the script exits with code 1.
dlog —
print indented informational message
dlog [-n] [text [...]]
result: none
Prints the given text to stdout, properly
indented according to the current rule evaluation nesting
level. If -n is given as the first argument, it
is passed to echo, preventing a terminating
end-of-line.
lappend —
append an element to a list
lappend list elem
result: stdout
Maintaining a list in the shell is difficult as soon as the
elements contain space, newline or quote characters. The
l* functions provided with yruba try
to help with this. The list as a whole is stored as a string
in any shell variable. The content of the variable is kept
quoted in a way that
eval set -- "$list"
sets the positional parameters exactly to the list's
elements, given that list contains the list
representation.
A typical call to lappend should look like
mylist=$(lappend "$mylist" "$newelem")
Don't forget the quotes around the list and the element variable.
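The quoting invariant can be illustrated with a small stand-in. The function lappend_sketch below is hypothetical and is not yruba's lappend: it wraps each element in single quotes and escapes embedded single quotes with sed, which is one possible encoding among several, and it does not preserve trailing newlines in an element.

```shell
# Hypothetical stand-in for lappend: usage lappend_sketch list elem
lappend_sketch() {
  local q
  # turn every ' into '\'' so the element survives one round of eval
  q=$(printf '%s\n' "$2" | sed "s/'/'\\\\''/g")
  printf "%s '%s'" "$1" "$q"
}

mylist=""
mylist=$(lappend_sketch "$mylist" "two words")
mylist=$(lappend_sketch "$mylist" "it's")
mylist=$(lappend_sketch "$mylist" '$HOME')

eval set -- "$mylist"
count=$#; second=$2; third=$3
echo "$count [$second] [$third]"
# prints: 3 [it's] [$HOME]
```

The third element comes back as the literal text $HOME, showing that the eval round does not expand a properly quoted element.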
lcreate —
create a list from the arguments given
lcreate [arg [...]]
result: on stdout
Creates a list for use with yruba's
l* functions. Each parameter of
lcreate is made into a list element. The string
representing the list is printed on stdout.
mylist=$(lcreate "bla'\"bla" '$b' "\\")
In particular, the intention is to make sure that
list=$(lcreate "$@")
eval set -- "$list"
does not change the values of the positional parameters.
This function was formerly called
lquote.
lget —
get ith element of a list
lget list index
result: stdout
Retrieves the element at index position index
from the given list. Indexing starts at 0. If the index is out
of range, an error is printed to stderr and exit code 1 is
returned. Example:
mylist=$(lcreate -0- -1- -2- -3- -4-)
elem=$(lget "$mylist" 3)
will set variable elem to the string
-3-.
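For simple elements, the lookup can be pictured as an eval followed by a shift. The function lget_sketch below is a hypothetical stand-in written for this illustration, not yruba's lget, and the list is written by hand rather than produced by lcreate.

```shell
# Hypothetical stand-in for lget: usage lget_sketch list index
lget_sketch() {
  local list="$1" index="$2"
  eval set -- "$list"                    # list elements become $1, $2, ...
  if [ "$index" -lt 0 ] || [ "$index" -ge "$#" ]; then
    echo "lget_sketch: index $index out of range" >&2
    return 1
  fi
  shift "$index"                         # index 0 keeps the first element
  printf '%s\n' "$1"
}

mylist="-0- -1- -2- -3- -4-"             # hand-written list of 5 elements
elem=$(lget_sketch "$mylist" 3)
if lget_sketch "$mylist" 7 2>/dev/null; then range=ok; else range=error; fi
echo "$elem $range"
# prints: -3- error
```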
lhead —
get first element of a list
lhead list
result: stdout
Retrieves the first element of the list. If the list is
empty, an error is printed to stderr and exit code 1 is
returned. This function is a shortcut for lget "$list"
0. Example:
mylist=$(lcreate -0- -1- -2- -3- -4-)
elem=$(lhead "$mylist")
will set variable elem to the string
-0-.
See also ltail.
lpush —
prepend an element to a list
lpush element list
result: stdout
The first parameter is prepended to the list represented by
the second parameter. See lappend for an
introduction to yruba's l*
functions. For example
mylist=$(lpush "$elem" "$mylist")
will establish the content of variable elem as
the first element in the list represented by the content of
variable mylist.
ltail —
remove first element from a list
ltail list
result: stdout
Removes the first element of the given list and returns the remaining list. If the given list is empty, an error is printed to stderr and exit code 1 is returned. Example:
mylist=$(lcreate -0- -1- -2- -3- -4-)
mylist=$(ltail "$mylist")
will set the variable mylist to a string
representing the list with the four elements -1-,
-2-, -3- and -4-.
mapFilenames —
map file names to different directory and change extension
mapFilenames olddir newdir ext fname [...]
result: stdout
Removes a prefix of the length of the string
olddir from each filename given, typically a
directory name, replaces it with newdir, and also
sets the extension to ext. The dot
is implicitly assumed as part of the extension and need not be
specified. Special cases of ext are:
- If ext is the empty string, the old extension is simply removed.
- If ext is a single dot, the extension of each file name is left unchanged.
The example
l=$(mapFilenames jsrc classes class jsrc/*/*.java)
assumes that some Java source files are sitting in package
directories below jsrc, say in pack1
and mypack. All these source files are picked up by
the glob pattern jsrc/*/*.java. An example file
name would be
jsrc/pack1/Blorb.java. The function
mapFilenames replaces jsrc with
classes and changes the extension to
class to get
classes/pack1/Blorb.class.
The result is assembled by means of lappend to make sure
that even dangerous
file names containing, say, quotes or
backslashes are properly handled. Consequently the result is a
list that can be handled safely with the other l*
functions, and in particular
eval set -- "$l"
will set the positional parameters such that
for x in "$@"; do ... done
will correctly iterate over even the strangest file names.
Note: Although the first parameter is a string — and typically you would just specify the directory to remove — only the length of this string is relevant, because exactly this number of characters is removed from each filename given.
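The transformation applied to a single file name can be sketched as below. The function mapOne is a hypothetical helper for illustration, not part of yruba; it handles one name at a time, prints a plain string instead of a quoted list, and assumes the file name itself contains a dot.

```shell
# Hypothetical helper: usage mapOne olddir newdir ext fname
mapOne() {
  local olddir="$1" newdir="$2" ext="$3" fname="$4" rest
  # drop as many leading characters as olddir is long
  rest=$(printf '%s' "$fname" | cut -c $(( ${#olddir} + 1 ))-)
  case "$ext" in
    .)  ;;                              # single dot: keep the extension
    "") rest="${rest%.*}" ;;            # empty: remove the extension
    *)  rest="${rest%.*}.$ext" ;;       # otherwise: replace the extension
  esac
  printf '%s\n' "$newdir$rest"
}

mapOne jsrc classes class jsrc/pack1/Blorb.java   # classes/pack1/Blorb.class
mapOne jsrc classes ""    jsrc/pack1/Blorb.java   # classes/pack1/Blorb
mapOne jsrc classes .     jsrc/pack1/Blorb.java   # classes/pack1/Blorb.java
mapOne XXXX classes class jsrc/pack1/Blorb.java   # classes/pack1/Blorb.class
```

The last call gives the same result as the first, because only the length of the first argument is used.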
mapTarget —
map a target pattern to a task name
mapTarget pattern task
result: none
By default, every target is handled by the task with the same
name. With mapTarget it is, however, possible to
declare that a target matching the given pattern
shall be handled by the named task. As an example
consider:
mapTarget '*.o' compileC
This makes sure that every target matching the pattern
*.o will be handled by the task
compileC. The function mapTarget
should usually be called before rule evaluation starts. The
pattern will go once through eval, so any special
character must be properly quoted. If several patterns match a
target, the pattern which was added later has
priority.
Yruba maintains the pattern map in a shell
case construct stored in the variable
yruba_tagmap. In desperate cases it may help
debugging to just print its contents.
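How a case construct held in a variable can give later patterns priority is sketched below. The exact layout of yruba_tagmap is not shown in this documentation, so map_body, addMap and lookup are hypothetical names and this is only one plausible arrangement: new patterns are prepended, and the first matching case branch wins.

```shell
# Hypothetical pattern map: a case body kept as a string.
map_body='*) task=$target ;;'           # default: task named like the target

addMap() {                              # stand-in for mapTarget
  map_body="$1) task=$2 ;; $map_body"   # prepend, so later calls match first
}

lookup() {
  local target="$1" task
  eval "case \"\$target\" in $map_body esac"   # pattern passes through eval
  echo "$task"
}

addMap '*.o' compileC
addMap 'special.o' compileSpecial       # added later, checked first

lookup special.o                        # prints: compileSpecial
lookup main.o                           # prints: compileC
lookup tgz                              # prints: tgz
```

Printing map_body after the two addMap calls shows the accumulated branches, much as the text suggests doing with yruba_tagmap for debugging.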
ydoc —
add a one-line description to a target
ydoc target text ...
result: none
Registers text as a short description of
target. To list the descriptions of targets, use
command line option -i.