Meta-Programming in Perl
In my last article about Special Values in Perl I explained how we can create true distinct values that can be safely distinguished by other values. The final result looked like this:
|
|
While the code above works, i am unhappy with it. Not because it doesn’t work or has some flaws in it. Mainly because it is not really reusable. Let me explain.
Consider you want to create additional special values. How would you solve it? At the moment you would just Copy & Paste code, again and again …
I think Copy & Paste is bad, a lot of people will probably agree. I could fix that problem by just saying: This is the Special Values Design-Pattern. And suddenly it would become Good Practice for a lot of people. Because obviously Design Patterns are awesome and good practice, right?
I somehow find it interesting that giving just another name to Copy & Paste and naming it Design Pattern suddenly makes shit a lot better for many programmers, but that is just the way how things are. I would claim that every Design Pattern (or just Copy & Paste) is a flaw that a language is not powerful enough. This begs the question: Can we make the above code somehow reusable? Can we create a function that creates the Special Values including the functions all by itself?
What I have in mind is the following: I only want to write something like
|
|
and the enum
function should create the functions red
, green
, blue
including
is_red
, is_green
and is_blue
all by itself. Here I will show you, how to do
this in Perl. This kind of programming is named Metaprogramming.
Typeglobs & Symbol Table
To understand Metaprogramming in Perl, we first need to understand the typeglob and the Symbol Table in Perl.
As you probably know, Perl has different types. Here I don’t mean int
, float
and so on, more the ability that Perl distinguish between a Scalar, Array, Hash,
Filehandle (IO) and a Subroutine (Code).
|
|
All of those are named foo
, and still they are completely different from each
other. You can safely use $foo
along @foo
or a function foo()
without a
problem in Perl. Something that will not work in JavaScript for example and
many other languages.
|
|
Next is the Symbol Table. So what is the Symbol Table? As a user of a programming
language you maybe never asked this question, as it maybe seems dumb. But when
you type $foo
or @foo
or execute a function foo()
where are those
variables, functions and so on saved? In the Symbol Table!
The Symbol Table contains all variables, functions and so on. So whenever you type a variable name, it is looked up in the Symbol Table. In Perl the Symbol Table itself contains Typeglobs. So what are Typeglobs?
A Typeglob contains all our special values we have in Perl (Scalar, Array, Hash, …)
and can be used like a Hash in Perl. As an example. All of the above foo
values
we created are found in the typeglob named *foo
. And we can get references
to those values this way.
|
|
This is the reason why Perl can distinguish $foo
from @foo
. $foo
is saved
in *foo{SCALAR}
while @foo
is saved in *foo{ARRAY}
and so on.
Manipulating the Symbol Table
Here comes the interesting part. The Symbol Table itself can be manipulated in Perl at runtime.
|
|
First of, when we assign to *foo
we just assign it a reference. Depending
on the type of the reference, it will save it in the correct slot. But
we also can access values through the typeglob if wanted/needed.
|
|
Consider that accessing the typeglob like *foo{ARRAY}
returns a reference.
So we need to de-reference it to access the values. Hence the @{ ... }
or
${ ... }
around the typeglobs.
Writing *foo
is hard-coded, but Perl also allows to use a string to access
the typeglob.
|
|
Writing *$name
is a shortcut syntax for writing. *{ $name }
. We can consider
the *
as accessing the type glob and pass the name we want to access as
a variable name.
First note the no strict;. In Perl it usually is a good idea to use strict;
.
Also requiring a new Perl version like use v5.36
enables use strict;
or loading
some module like Moose does so.
In 99% this is good because it deactivates certain Perl syntax to disable
manipulating the symbol table. But in this case that is exactly what we want. With
no strict;
we can re-enable those features (or disable the disabling of features).
With all of this knowledge, now let’s write our enum
Implementing enum
First of I would create a single function only creating one special value at a time.
|
|
This would now allow writing create_special("foo")
that creates a foo()
and
ìs_foo()
function returning and checking our special value. Now enum
can
just be implemented by calling this function in a loop.
|
|
finally, we can write:
|
|
The implementation of print_colored
stays the same.
|
|
But there can be one problem. When we now just try to write.
|
|
we get the following compile-time error.
Bareword "green" not allowed while "strict subs" in use at ./metaprogramming.pl line xxx.
Bareword "blue" not allowed while "strict subs" in use at ./metaprogramming.pl line xxx.
Bareword "red" not allowed while "strict subs" in use at ./metaprogramming.pl line xxx.
Execution of ./metaprogramming.pl aborted due to compilation errors.
The reason for this is the following:
- Perl has a compile-time and a runtime-stage.
- The function
red
,green
, … are created at the runtime-stage. - When the compiler hits the
print_colored
line at the compile-time, the compiler don’t know about the functionsred
,green
andblue
. Only functions that the compiler knows so far at the compile-time are allowed to be called without parenthesis. (This usually fetches a lot of errors).
There are two ways to fix this problem.
First
We could add parenthesis. This way the compiler knows that we want to
call a function and those errors about the bareword goes away. That’s also
the reason why we never got an error writing the print_colored
function even
the fact we used the is_red()
, … functions there.
|
|
Second
We can add BEGIN { }
around our enum
call.
|
|
BEGIN is a special block in Perl that tells the compiler to immediately run the code before continuing compiling the rest of the program. This way the function gets created at compile-time. When the compiler sees the line.
|
|
it knows about the green
, red
, … functions and don’t complain about
barewords anymore.