Makefile Language Tips and Tricks

John Denker

* Contents

1 Action Items
2 The Gnu Makefile Language is Crock
- 2.1 Example
3 Mapcar
4 Scoping of Variables
5 Circular Updates
6 Testing for the Existence of a File
7 Rules with Multiple Products
8 Arithmetic or Lack Thereof
9 Miscellaneous Ugliness
10 A Collection of Useful Functions
11 References

1 Action Items

Put assignment statements at the top and rule stanzas at the bottom of the makefile. See section 2.
Be particularly wary of mixing assignments and rules in an included sub-makefile. Instead, use (A) one file of assignments (to be included at the top) plus (B) one file of rules (to be included at the bottom).
Don’t assign a new value to an old variable-name. That’s bug-bait. It could screw up recipes that are defined earlier but invoked later. See the example in section 2.1.
Use := instead of = unless you are sure you want to be declaring a function. If the RHS is expensive, e.g. $(shell ...), using = can be phenomenally inefficient, because the RHS gets re-evaluated every time the variable is used.
Never write a makefile rule with multiple targets and a nontrivial recipe. Instead, use an hourglass structure. See section 7.
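
As a minimal sketch of the difference between = and := (the variable names lazy and eager are made up for this illustration), $(info ...) can be used to reveal when the RHS actually gets evaluated:

##############################
# hypothetical names, for illustration only
lazy   = $(info evaluating lazy)value
eager := $(info evaluating eager)value

.PHONY : all
all :
	$(info $(lazy) $(lazy) $(eager) $(eager))

## should print
## evaluating eager
## evaluating lazy
## evaluating lazy
## value value value value
##############################

The eager RHS is evaluated once, as the makefile is read. The lazy RHS is evaluated again for every use, which is what makes = costly when the RHS is something like $(shell ...).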

2 The Gnu Makefile Language is Crock

The format of a gnu makefile is a creepy chimera. It is mostly a declarative language, with bits of imperative language mixed in, just enough to fool you, especially if your background is in imperative programming, e.g. classical languages such as fortran, basic, c++, et cetera.

See reference 1 and reference 2.

: 1.
[I] The := assignments are imperative. They get evaluated once and for all, in order, as the makefile is read.
: 2a.
[D] In general the = statements are function declarations. The RHS gets re-evaluated whenever it is needed, so the order of evaluation often does not resemble the order of declaration.
: 2b.
[I] The = assignments might seem effectively imperative if the RHS is a constant.
: 3.
[D] After reading the whole file, make starts looking at targets and deciding which rules to invoke. The invocations may occur out of order, and may even occur in parallel.
: 4.
[I] Within a rule stanza, the recipe statements get carried out in order, in plain old imperative fashion.
: 5.
[D] Expressions in the recipes get evaluated when the rule is invoked.
: 6.
[I] OTOH expressions in the headline of the rule get evaluated much earlier, when the rule is read.

As a consequence: A variable that has a certain value in a recipe may have a different value (or no value at all) if used in an assignment a few lines later in the makefile, or a few lines earlier, or even in the headline of the rule, which can be super-confusing for non-experts.

2.1 Example

Just for fun, look at this simple makefile and predict what the output will be. Note that $(foo) appears in two places.

##############################
bar = a.src

foo = $(bar)

bar = b.src

all: foobar

foobar: c.src $(foo)
	$(info depends: $^)
	$(info recipe: c.src $(foo))

bar = xx

.PHONY : a.src b.src c.src
##############################
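
For reference: $(foo) in the headline is expanded when the rule is read, at which point bar is b.src, whereas $(foo) in the recipe is expanded when the rule is invoked, by which time bar is xx. So the example should print "depends: c.src b.src" followed by "recipe: c.src xx". The following is a hedged sketch of the remedy suggested in section 1, namely taking a snapshot with := under a name that never gets reassigned. The name foo_snapshot is invented for this illustration.

##############################
bar := a.src

# := expands the RHS right now, so later changes to bar don't matter
foo_snapshot := $(bar)

bar := b.src

.PHONY : all
all :
	$(info snapshot: $(foo_snapshot))
	$(info bar: $(bar))

bar := xx

## should print
## snapshot: a.src
## bar: xx
##############################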

3 Mapcar

We can define a makefile function mapcar. Its first argument is a function f, and its second argument is a list. It applies f to each element of the list, and collects the results as a new list.

As such, it offers no advantages over the built-in $(foreach ...) function. The point of the exercise is that it can be generalized to mapcar2, which does the same thing for a function of two variables.

Beware: You should write $(call mapcar,func,list) with no spaces after the commas. Otherwise you might get a list containing just a space, in situations where the list should have no elements at all.

##############################
## some useful functions:
cdr = $(wordlist 2,$(words $1),$1)
cadr = $(wordlist 2,2,$1)
expand = $($1)
quote  = '$1'
quote2 = '$1+$2'

# apply a function to each element of a list;
# collect results as a new list:
mapcar   = $(if  $2,$(call $1,$(firstword $2))$(call spmapcar,$1,$(call cdr,$2)))
spmapcar = $(if $2, $(call $1,$(firstword $2))$(call spmapcar,$1,$(call cdr,$2)))

# same as above, for a function of two variables (e.g. quote2)
mapcar2   = $(if  $(and $2,$3),$(call $1,$(firstword $2),$(firstword $3))$(call \
   spmapcar2,$1,$(call cdr,$2),$(call cdr,$3)))
spmapcar2 = $(if $(and $2,$3), $(call $1,$(firstword $2),$(firstword $3))$(call \
   spmapcar2,$1,$(call cdr,$2),$(call cdr,$3)))

##########
## Test mapcar:
%.png :
	@###

projects := abc.htm xyz.htm
abc_figs := aa.png bb.png cc.png
xyz_figs := xx.png yy.png zz.png

figvars := $(projects:%.htm=%_figs)
figs := $(call mapcar,expand,$(figvars))

all : $(figs)
	$(info {$(call mapcar,quote,$^)})
	$(info {$(call mapcar2,quote2,$(abc_figs),$(xyz_figs))})

## output should be:
## {'aa.png' 'bb.png' 'cc.png' 'xx.png' 'yy.png' 'zz.png'}
## {'aa.png+xx.png' 'bb.png+yy.png' 'cc.png+zz.png'}
##############################
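
For comparison, here is a small sketch of the single-argument case written with the built-in $(foreach ...) function. The quote function is the same as above; the variable names figs and quoted are chosen just for this illustration.

##############################
quote = '$1'
figs := aa.png bb.png cc.png

# built-in equivalent of $(call mapcar,quote,$(figs))
quoted := $(foreach f,$(figs),$(call quote,$(f)))

.PHONY : all
all :
	$(info {$(quoted)})

## should print
## {'aa.png' 'bb.png' 'cc.png'}
##############################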

4 Scoping of Variables

Gnu makefiles give you very little control over the scope of variables. There are lots of situations where you can’t have a local variable, as far as I can tell, even though it would be very useful.

The problem is that something later in the file could change the value of the variable between the time you define it and the time you need it, e.g. if you need to use it in a recipe that won’t get invoked until later.

You can get some control (with very limited expressive power) using the $(let ...) function.

One workaround is to use sub-makefiles. Variables that get set in the sub-makefile cannot affect the parent makefile.
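
As a rough sketch of what $(let ...) can do, the following binds two local names to the words of a list. This assumes GNU make 4.4 or later, where $(let ...) was introduced. The names words3, first and rest are invented for this illustration, and the sketch assumes first is not defined anywhere else.

##############################
# requires GNU make 4.4 or later
words3 := alpha beta gamma

.PHONY : all
all :
	$(info $(let first rest,$(words3),first=$(first) rest=$(rest)))
	$(info outside: first='$(first)')

## should print
## first=alpha rest=beta gamma
## outside: first=''
##############################

The bindings exist only inside the third argument of $(let ...), which is why first is empty again on the second line.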

5 Circular Updates

Make issues a warning if there is a circular dependency, e.g. if foo depends on bar while bar depends on foo.

This can easily happen if something is getting updated. A familiar example is the .aux file used by latex. Compilation reads the file as an input, and then writes an updated version. The process “should” converge after a couple of iterations, but it’s not the simple acyclic graph that make expects.

You definitely want the .aux file to appear as a target on the LHS of the rule, so you can ask for it to be made. There is no way around this.

You also want the .aux file, or something like it, on the RHS, as a dependency.

The solution is to have two files with the same contents:

The .aux file appears on the LHS. It is a product, so after compilation it will be newer than all sources.
The .saveaux file appears on the RHS. It is a source, so after compilation it is older than all products.

Since they have different names, make doesn’t consider them the same, so make does not realize the process is circular, even though it truly is.

The key quasi-circular step is this:

If the compilation produces a new and different .aux file, copy it to .saveaux. This means the process has not converged.
In contrast, if the compilation produces a .aux file that is the same as before, don’t disturb the .saveaux file, so it retains its old date. This means the process has now converged.
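
The scheme can be sketched roughly as follows. This is only an illustration under stated assumptions: the file names paper.tex, paper.aux and paper.saveaux are hypothetical, and latex is assumed to be installed and to write paper.aux.

##############################
# hypothetical file names; latex is assumed to be installed
paper.aux : paper.tex paper.saveaux
	latex paper.tex
	cmp -s paper.aux paper.saveaux || cp paper.aux paper.saveaux

# bootstrap: make sure the saved copy exists on the first run
paper.saveaux :
	touch $@
##############################

Make does not iterate on its own, so you run make repeatedly until it reports that paper.aux is up to date. If the copy happens, paper.saveaux ends up newer than paper.aux, which forces another compilation on the next run. On filesystems with coarse timestamps the two files could get the same date, in which case the next run might wrongly appear converged.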

6 Testing for the Existence of a File

Within a makefile, the approved way to test whether a file exists is to use the built-in $(wildcard ...) function. Despite the name, the pattern you feed to the function does not need to contain any wildcard metacharacters; it can be a plain old constant filename.

If the pattern doesn’t match any files, that is not an error. The function just returns nothing, which is how makefiles represent falsity.

This can be used in flow-control directives as well as in functions, as you can see in the example.

##############################
# flow control conditional directive:
ifneq (,$(wildcard foo))
  flag := yes
else
  flag := no
endif

# conditional function expression:
check := does it exist? $(if $(wildcard foo),yes,no)

.PHONY : all
all :
	$(info flag: $(flag))
	$(info check: $(check))
##############################

7 Rules with Multiple Products

7.1 The Problem: Quadratic Work

Make assumes that when a rule is invoked, it produces only one target. If you specify N targets for a rule, it’s the same as writing a group of N separate rules, with one target apiece, all with the same dependencies. The rules in the group will get invoked as necessary. For each invocation, $@ will be a single target.

To repeat: $@ will never represent more than one of the targets of a multi-target rule.
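
A minimal sketch that makes this visible is given below. The targets are declared phony so that every one of them gets rebuilt on each run; the variable name products is just a convenience for this illustration.

##############################
products := x.prod y.prod z.prod

.PHONY : all $(products)
all : $(products)

# one headline, three targets: the recipe is invoked once per target
$(products) :
	$(info invoked for: $@)

## should print
## invoked for: x.prod
## invoked for: y.prod
## invoked for: z.prod
##############################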

After invoking the rule for each target, make re-evaluates whether it is still necessary to invoke it for the others. Therefore the policy of implementing the multi-product rule as a group of single-target rules might sometimes seem to do what you want. The problem is that this is not reliable. In particular, if -W is in effect (i.e. make -W file, which tells make to pretend that the file has just been modified), N copies of the rule will get invoked, even though N−1 are pointless and wasteful. Each of the N products will get made N times, so the overall work is N², i.e. quadratic, which is N times larger than it should be.

7.2 Simple Workaround : Trigger File

It’s not really a solution, but as a simple workaround you can just avoid -W. Instead, you can add a trigger file, i.e. an additional trivial dependency, to the RHS of each rule you care about. You can then touch the trigger file. This achieves the desired result without quadratic work.

The trigger file is a real file that hangs around in your workspace. If not used, it might become very old.

The problem with this workaround is that -W is convenient and expressive, and it’s safe to use on “some” rules. If you are using a makefile that you wrote months or years ago, it’s hard to remember when you can use -W and when you can’t. If you’re going to the trouble of installing triggers, you might as well install hourglass structures, so you can use -W without worrying.

##############################
# make -W a.src                 # horribly wasteful
# touch trigger.src ; make      # not horrible

cdr   = $(wordlist 2,$(words $1),$1)
rdc   = $(wordlist 1,$(words $(call cdr,$1)),$1)

%.src :
	echo asdf > $@

products := w.prod x.prod y.prod z.prod

all : $(products)

$(products) : a.src b.src c.src trigger.src
	cat a.src b.src c.src > w.prod  # allegedly $@ from $(call rdc,$^)
	cat a.src b.src > x.prod        # allegedly $@ from $(call rdc,$^)
	cat b.src c.src > y.prod        # allegedly $@ from $(call rdc,$^)
	cat c.src a.src > z.prod        # allegedly $@ from $(call rdc,$^)
##############################

7.3 Fancy Solution : Hourglass Collector/Emitter Structure

As a general rule:

Never write a makefile rule with a nontrivial recipe and multiple targets.

Instead, use an hourglass structure, also known as a wasp-waist structure or a collector-emitter structure. This is how make should have interpreted multi-output rules all along. A basic example is shown in figure 1.

The terminology is:

A collector has only one target, so it may perform nontrivial work.
An emitter has no recipes, or perhaps only trivial recipes, so it may have multiple targets.

With the hourglass structure, i.e. with a collector and an emitter, it is safe to apply -W to any of the input source files.

The waist object is the only target of the collector rule, and the only dependency of the emitter rule. It should be declared as .INTERMEDIATE since it is not a file. It exists only in the make program’s imagination, not in the actual filesystem. The .INTERMEDIATE declaration guarantees that (a) no such file will be created, even though it is mentioned as the target of the collector rule, and (b) if a file of that name exists, it will be ignored.


Figure 1: Basic Hourglass Structure : Collector and Emitter
Figure 2: Fancy Hourglass Structure

A version with fancy optional features is shown in figure 2.

If desired, you can add a trigger file, i.e. a fake source file, and touch it whenever you like, as discussed above. This is rarely advantageous, since it is usually simpler to apply -W to one of the real sources.

You can add a final collector stage, as discussed in section 7.4.

##############################
cdr   = $(wordlist 2,$(words $1),$1)

# typical usages:
#       make -W a.src
#       make -W project.trigger
all : project.final   # other rules can refer to the final target

# create fake source files if needed;
# illustrate how a rule can get invoked N times:
a.src b.src c.src :
	touch $@

# this should be a real file.
# normally it is very old
# you can touch it if desired, but -W is at least as good
project.trigger :
	$(info making $@)
	touch $@

# main collector
# does all the work
project.waist : project.trigger a.src b.src c.src
	$(info collector: $@ from $(call cdr,$^))
	touch x.prod y.prod z.prod ## simulate main work ##

.INTERMEDIATE : project.waist   # imaginary file
# should not exist, ignored if it does exist

# main emitter (typically has no recipes)
x.prod y.prod z.prod : project.waist
	$(info emitter: $@ from $^)

# final collector (should not have any recipes)
# (although a completely empty recipe line is OK)
project.final : x.prod y.prod z.prod
	$(if ,$(info final: $@ from $^))

.PHONY : project.final  # another imaginary file
##############################

7.4 Output Collector

Separately from the quadratic-work issue, it is good practice to define a collector for the products of the multi-product rule.

In the example, project.final is up-to-date if and only if all the product files are up-to-date.

Rationale: It is simpler and safer to check the collector, even though you don’t absolutely need it (since you could just check all the product files directly). In particular, rules that come later in the process can simply depend on the collector, without having to know the details of the earlier rules. This makes things simpler, more reliable, and more maintainable.

In particular, it is convenient to say make project.final on the command line, without having to name all the files that are to be produced.

Further rationale: It would be a mistake to think that because all the products get made at the same time, it would suffice to check any one of them, rather than systematically checking all of them. This fails if one of the products gets deleted after it gets produced. In contrast, using a collector to check all of them is more reliable.

8 Arithmetic or Lack Thereof

The makefile language per se doesn’t know how to do arithmetic. In general, to do arithmetic you have to invoke the shell, which is expensive and inelegant.

The best you can do without shelling out is simple addition or subtraction, using lists as tally marks, i.e. base-1 arithmetic. This involves goofy circumlocutions, namely addition implemented via concatenation of lists. Subtraction requires calling the sub function, as given in section 10.

##############################
three := 3
five := 5

# multiply two decimal integers by shelling out
# (evaluated once, since := is used);
# this works for any POSIX-compliant shell (and some others):
prod := $(shell echo $$(($(three) * $(five))))

all : prod rdc compare base1

prod:
	@## $(info prod: $(prod))

## should print
## prod: 15

# Idiom for decreasing a base-1 integer by 1 (without shell)
#xx Note: the obvious arithmetical expression would not work:
#xx rdc   = $(wordlist 1,$(words $1)-1,$1)

# However, the following idiomatic circumlocution does work,
# without invoking a shell:
cdr   = $(wordlist 2,$(words $1),$1)
rdc   = $(wordlist 1,$(words $(call cdr,$1)),$1)

list := a b c xxx

rdc :
	@## $(info rdc: $(call rdc, $(list)))

## should print
## rdc: a b c

# function for testing whether an unsigned decimal integer is greater than zero
>0 = $(wordlist 1,$1,t)

# logical nor
nor = $(if $(or $1,$2,$3,$4),,t)

# if both strings are nil, return t
# otherwise if they are identical, return the string,
# otherwise return nothing.
eq = $(or $(call nor,$1,$2),$(and $(findstring $1,$2),$(findstring $2,$1)))

compare:
	@## $(info 0 >0: '$(call >0,0)')
	@## $(info 1 >0: '$(call >0,1)')
	@## $(info aa eq aa: '$(call eq,aa,aa)')
	@## $(info aa eq bb: '$(call eq,aa,bb)')
	@## $(info nil eq nil: '$(call eq,,)')

# base1,N,optionalList
# returns a list of length N.
# shortens or extends the optionalList as necessary
base1 = $(if $(call eq,$1,$(words $(wordlist 1,$1,$2))),$(wordlist 1,$1,$2),$(call base1,$1,$2 t))

# subtraction using the base-1 (tally mark) representation.
# a result less than zero gets coerced to zero, i.e. the nil list
sub = $(wordlist $(words t $2),$(words $1),$1)

zero := $(call base1,0)
two := $(call base1,2,a b c d e)
five := $(call base1,5)
three := $(call sub,$(five),$(two))

base1 :
	@## $(info zero: '$(zero)')
	@## $(info two: '$(two)')
	@## $(info five: '$(five)')
	@## $(info three: '$(three)')
##############################
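
The example above covers subtraction but not addition. As a minimal sketch, addition in the tally-mark representation is nothing more than concatenation. The variable names two, three and sum are chosen just for this illustration.

##############################
two   := t t
three := t t t

# addition is just concatenation of the tally lists
sum   := $(two) $(three)

.PHONY : all
all :
	$(info sum: '$(sum)' has $(words $(sum)) marks)

## should print
## sum: 't t t t t' has 5 marks
##############################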

9 Miscellaneous Ugliness

9.1 Call

When I define a function such as cadr, why do I need to invoke it as $(call cadr,arg)? Why does the language not allow me to define something to be a function, so it can be invoked as simply $(cadr arg)?

9.2 Shell

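By default, recipes are run using /bin/sh, not bash. This can lead to all sorts of surprises. For example, in bash a newline can be written as $'\n', but a plain sh may not understand that syntax; there you typically have to rely on something like printf "\n" instead.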
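
If you do want bash semantics in your recipes, GNU make lets a makefile choose the shell via the SHELL variable. The following is a small sketch, assuming bash is installed at /bin/bash, which may not be true on every system.

##############################
# assumption: bash is installed at /bin/bash
SHELL := /bin/bash

.PHONY : all
all :
	@printf 'one%stwo\n' $$'\n'

## should print
## one
## two
##############################

Note the doubled dollar sign, which is needed so that make passes a literal $ through to the shell.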

9.3 .RECIPEPREFIX

Every line of a recipe has to start with a tab, unless you specify a different .RECIPEPREFIX char. This makes the makefile hard to read, because a tab looks a lot like 8 spaces.

Beware that the .RECIPEPREFIX has to be a one-byte character. I would have liked to use the “Recipe” symbol (Rx ligature, ℞, unicode 0x211E), but alas that doesn’t work.
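
A minimal sketch of a makefile that uses a visible prefix character instead of a tab is shown here. It assumes GNU make 3.82 or later, and the choice of > as the prefix is arbitrary.

##############################
# requires GNU make 3.82 or later
.RECIPEPREFIX = >

.PHONY : all
all :
> @echo "this recipe line starts with > rather than a tab"

## should print
## this recipe line starts with > rather than a tab
##############################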

10 A Collection of Useful Functions

##############################
cdr    = $(wordlist 2,$(words $1),$1)
cddr   = $(wordlist 3,$(words $1),$1)
cdddr  = $(wordlist 4,$(words $1),$1)
# car is the same as firstword
# (comment kept on its own line, so that no trailing spaces end up in the value)
car    = $(wordlist 1,1,$1)
cadr   = $(wordlist 2,2,$1)
caddr  = $(wordlist 3,3,$1)
cadddr = $(wordlist 4,4,$1)

# reverse cdr, removes last element from the list
rdc    = $(wordlist 1,$(words $(call cdr,$1)),$1)

# apply a function to each element of a list;
# collect results as a new list:
mapcar   = $(if  $2,$(call $1,$(firstword $2))$(call spmapcar,$1,$(call cdr,$2)))
spmapcar = $(if $2, $(call $1,$(firstword $2))$(call spmapcar,$1,$(call cdr,$2)))

# same as above, for a function of two variables (e.g. quote2)
mapcar2   = $(if  $(and $2,$3),$(call $1,$(firstword $2),$(firstword $3))$(call \
   spmapcar2,$1,$(call cdr,$2),$(call cdr,$3)))
spmapcar2 = $(if $(and $2,$3), $(call $1,$(firstword $2),$(firstword $3))$(call \
   spmapcar2,$1,$(call cdr,$2),$(call cdr,$3)))

# miscellaneous functions, often applied via mapcar:
expand = $($1)
quote  = '$1'
quote2 = '$1+$2'

# function for testing whether an unsigned decimal integer is greater than zero
>0 = $(wordlist 1,$1,t)

# logical nor (and logical not)
# returns t if all strings have zero length
# otherwise returns nothing
# accepts any number of args up to 4
nor = $(if $(or $1,$2,$3,$4),,t)

# if both strings are nil, return t
# otherwise if they are identical, return the string,
# otherwise return nothing.
eq  = $(or $(call nor,$1,$2),$(and $(findstring $1,$2),$(findstring $2,$1)))
# simpler version, if you are sure the strings are not both nil
eq_ = $(and $(findstring $1,$2),$(findstring $2,$1))

# base1,N,optionalList
# returns a list of length N.
# shortens or extends the optionalList as necessary
base1 = $(if $(call eq,$1,$(words $(wordlist 1,$1,$2))),$(wordlist 1,$1,$2),$(call base1,$1,$2 t))

# subtraction using the base-1 (tally mark) representation.
# a result less than zero gets coerced to zero, i.e. the nil list
sub = $(wordlist $(words t $2),$(words $1),$1)
##############################

11 References

: 1.
GNU make (user manual)
https://www.gnu.org/software/make/manual/make.html
: 2.
Declarative programming (wikipedia article)
https://en.wikipedia.org/wiki/Declarative_programming
