Planet Raku

Raku RSS Feeds

Roman Baumer (Freenode: rba #raku or ##raku-infra) / 2020-10-29T16:19:19

gfldex: Planned obsolescence

Published by gfldex on 2020-10-26T19:17:36

Twelve years¹ ago Larry planned the obsolescence of one of my modules. His cunning plan was executed by lizmat a fortnight ago. If you are building Rakudo from source you take another shortcut now.

use v6.e.PREVIEW;

my %detectives;

my @a = <UK London Bakerstreet>;
%detectives{||@a} = "Holms";

say %detectives{||<UK London Bakerstreet>};
dd %detectives;
# OUTPUT: Holms
          Hash %detectives = {:UK(${:London(${:Bakerstreet("Holms")})})}

Slippy semi lists can be quite useful while working with parsed JSON. There are plenty of hashes of hashes of hashes.

Now I got a peculiar problem. One of my modules becomes redundant if — and only if — a Raku version is requested that is younger then v6.d. The module itself can demand v6.d even when used by a program or module that itself demands v6.e. At least it is easy to test for that condition.

use v6.e.PREVIEW;
say $*RAKU.version ~~ v6.e+;
# OUTPUT: True

We can use that to trigger a warning in a module as soon as v6.e becomes the default.

use v6;

CHECK if $*RAKU.version ~~ v6.e+ { warn "Unattended items will be destroy without warning! (This is a warning.)" };

This warning will be displayed once per precompilation so it is easy to miss. Also it will show even thought the program running the use statement might demand an older version of Raku. If only there where a subroutine that is called every time we import a module in the context of the caller!

use v6.d;

sub EXPORT {
    if $*RAKU.version ~~ v6.e+ { warn "Unattended items will be destroy without warning! (This is a warning.)" };

Now use can detect time travel by means of looking up the dynamic variable $*RAKU.

use v6.e.PREVIEW;
use warner;

# OUTPUT: Unattended items will be destroy without warning! (This is a warning.)
          in sub EXPORT at /home/dex/projects/raku/obsolescence/lib/warner.rakumod (warner) line 6

The ability of Raku to fix past mistakes was planned right from the beginning.² I feel ready for the next 100 years.

¹) That’s a lie. The truth was hidden by the move from SVN to git. Please forgive my urge to tell a good story.
²) That’s not a lie.

Rakudo Weekly News: 2020.43 Release And Star

Published by liztormato on 2020-10-26T13:15:57

Quite a number of releases this week(end): Alexander Kiryuhin released the Rakudo 2020.10 compiler release, Claudio Ramirez immediately provided many Linux package versions of that release, and JJ Merelo updated the standard Raku Alpine Docker image, as well as the special Raku Alpine Docker Image for testing modules by module developers.

And Patrick Spek also immediately sprang into action by creating a Rakudo Star 2020.10 release. Sadly without a confirmed Windows build: volunteers welcome. And kudos to all who made these releases possible!

Weekly Challenge

Weekly Challenge #84 is available for your perusal. Andrew Shitov did a full review of Challenge #81 as well as a full review of Challenge #82, both with video treatments!

Core Developments

This week’s new Pull Requests:

Please check them out and leave any comments that you may have!

Questions about Raku

Meanwhile on Twitter

Comments about Raku

New Raku Modules

Updated Raku Modules

Winding down

A week without any real blog posts, but with some new Raku modules! And new releases and some interesting new Raku Programming Language features. As always: please don’t forget to stay healthy and to stay safe. And make sure to check in next week for more Raku news!

Rakudo Weekly News: 2020.42 Recipes

Published by liztormato on 2020-10-19T13:55:48

Another Raku book just hit the (virtual) bookshelves: Raku Recipes, A Problem-Solution Approach by JJ Merelo, with examples about data science, analytics, microservices, and desktop/console applications usage. Recommended reading and mandatory addition to your programming language bookshelves!

Raku Development Grant Report

Jonathan Worthington reported on the progress of the newdisp branch this last September.

Alexey’s corner

Alexey Melezhik came out with two new Sparrow plugins: jinjalint, for linting Jinja Templates, and tflint for linting Terraform scripts.

Weekly Challenge

Weekly Challenge #83 is available for your perusal. Andrew Shitov did a YouTube review of Challenge #81 task #1 and task #2.

This week, Laurent Rosenfeld almost could not write the blog post because of events in the city where they are a council member. Kudos to Laurent for not being stopped by madmen!

Core Developments

This week’s new Pull Requests:

Please check them out and leave any comments that you may have!

Questions about Raku

Meanwhile on Twitter

Comments about Raku

Updated Raku Modules

Winding down

The first instalment of the Rakudo Weekly News in its second year! With some interesting new Raku Programming Language features! As always, but in this year more specifically: please don’t forget to stay healthy and to stay safe. And don’t forget to check in next week again for more Raku news!

Rakudo Weekly News: 2020.41 A First Year

Published by liztormato on 2020-10-12T19:10:07

About a year ago, the first Rakudo Weekly News hit the Net, shortly after the name change to Raku was officially accepted. It’s quite amazing how much has been achieved in the year since then. The extensive documentation has been updated, many (not yet all though) internals have been updated, several Raku books have been published, over 1500 Raku questions on StackOverflow, a lively /r/rakulang subreddit with more than 600 users, and a lively #rakulang Twitter feed as well!

Yours truly is looking forward for another year of progress, more focused on internals (like the rakuast and newdisp branches), rather than on the externals (how a programming language should be called).

A Third Manifesto

After part 1 of a Raku Manifesto and part 2 of a Raku Manifesto, this week Daniel Sockwell concluded this very interesting and thought provoking cycle of blog posts with part 3 of a Raku Manifesto. In it, Daniel gives examples of how open source projects are outperforming many similar projects backed by large corporations, and how excited Daniel is about the Raku Programming Language (/r/rakulang comments).

RakuAST Grant Report

Jonathan Worthington reported on the progress of the RakuAST Grant work, with an impressive list of features now supported (/r/rakulang comments).

From the Raku Steering Council

Vadim Belman has accepted the seat (/r/rakulang comments) left open by Jonathan Worthington, who has indicated that they need a break from Rakudo core development, and consequently also from the Raku Steering Council for personal health reasons (/r/rakulang comments).

Meanwhile, Daniel Sockwell has been working with lawyers at Yet Another Society (also known as The Perl Foundation) to handle potential issues with regards to having a powerful “Raku” trademark.


This week saw only a few blog posts, so not enough for a bouquet:

Weekly Challenge

Weekly Challenge #82 is available for your perusal. And Andrew Shitov did a full review of the Raku solutions of Challenge #80 (including the usual video run-through).

Core Developments

This week’s new Pull Requests:

Please check them out and leave any comments that you may have!

Questions about Raku

Meanwhile on Twitter

Meanwhile on the mailing-list

Comments about Raku

New Raku Modules

Updated Raku Modules

Winding down

A whole year has passed for the Weekly. How Raku and the world has changed in those twelve months: Raku for the better, the world for worse. Which brings yours truly to tell you: please don’t forget to stay healthy and to stay safe. And to check in next week for more news about the Raku Programming Language!

gfldex: Sneaky Arguments

Published by gfldex on 2020-10-10T10:07:38

While playing with .WHY I stepped onto an ENODOC. Method refers to Signature. Both fail to mention the default signature of a method. The documentation of automatic signatures fails to mention that %_ is always added to methods. Asking greppable6 for mentioning of it revealed plenty of redundant uses of *%_ in methods.

my method m1(:$named) {}
my method m2(:$named, *%_) {}

say &m1.signature;
say &m2.signature;
# OUTPUT: (Mu: :$named, *%_)
          (Mu: :$named, *%_)

The reason why I spotted this is .WHY. In Raku we can add comments that are attached to the language objects that follow them.

class Commented {
    #| this is an attribute
    has $.a-documented-attribute;
    #| this is a method
    method works {
        say ‚working as intended‘;
say $c.^can('works')[0].WHY;
say $c.^attributes[0].WHY;
# OUTPUT: this is a method
          this is an attribute

The docs fail to state how to access WHY-nodes for attributes and methods at runtime. We have to go through the MOP here because methods and attributes can be added to a type object at runtime. At least for methods I found a nicer way to handle this.

method works {
    return &?ROUTINE.WHY if %_;
    say ‚working as intended‘;
put $;

Oddly, this leads to the awkward situation that I add an undocumented argument to a method that returns the documentation for that method. To be explicit we got the trait is implementation-detail. Being forced to use a multi makes things more cumbersome though.

#| No comment! We are sneaky!
multi method sneaky { say ‚working as intended‘ }
multi method sneaky('why') is implementation-detail {
    return self.^lookup('sneaky').candidates[0].WHY;
say $c.sneaky('why');

That leaves the question why I need .WHY on attributes. I do agree with the assessment that JSON is lacking in general. My biggest objection is the lack of comments. For config files being able to tell a co-worker to keep his dirty fingers of a certain line can save lives. Debian is making good use of comments in such files too. It can save a lot of time if you don’t have to read a man page just to change one value. With inline comments on attributes I could express both the structure of data stored in a config file and instructions how to use the config file without another HEREDOC that can get out if sync with the actual code. I really like the structure of NestedText and am playing with adding type information. In NestedText leafs default to Str so that can be omitted. In Raku we get string representations for types for free, we can just add them as text.

I can define the structure of my config file as follows.

use Data::TypedNestedText;

class Phone {
has $.mobile;
has $.home;
has $.office;

class Contact {
has $.name;
#| this can be multiline
has $.address;
has $.phone;
has @.additional-roles;

For the output I wrote a naive deparser that can produce a NestedText with type annotations.

# this can be multiline
> 138 Almond Street
> Topika, Kansas 20697
name: Katheryn McDaniel
home: 1-210-555-8470
mobile: 1-210-555-5297
# this can be multiline
> 3636 Buffalo Ave
> Topika, Kansas 20692
name: Fumiko Purvis
office: 1-268-555-0280
- hug task force
- Christmas party team

This is much easier to read and edit then JSON. Writing the parse might not. I will report back shortly if it turns out to be easier then expected.

6guts: Taking a break from Raku core development

Published by jnthnwrthngtn on 2020-10-05T19:44:26

I’d like to thank everyone who voted for me in the recent Raku Steering Council elections. By this point, I’ve been working on the language for well over a decade, first to help turn a language design I found fascinating into a working implementation, and since the Christmas release to make that implementation more robust and performant. Overall, it’s been as fun as it has been challenging – in a large part because I’ve found myself sharing the journey with a lot of really great people. I’ve also tried to do my bit to keep the community around the language kind and considerate. Receiving a vote from around 90% of those who participated in the Steering Council elections was humbling.

Alas, I’ve today submitted my resignation to the Steering Council, on personal health grounds. For the same reason, I’ll be taking a step back from Raku core development (Raku, MoarVM, language design, etc.) Please don’t worry too much; I’ll almost certainly be fine. It may be I’m ready to continue working on Raku things in a month or two. It may also be longer. Either way, I think Raku will be better off with a fully sized Steering Council in place, and I’ll be better off without the anxiety that I’m holding a role that I’m not in a place to fulfill.

Rakudo Weekly News: 2020.40 Manifestly

Published by liztormato on 2020-10-05T15:22:29

After last weeks part 1 of a Raku Manifesto, Daniel Sockwell continued with part 2 of a Raku Manifesto, handling matters such as valuing individual productivity over large-group productivity, without devaluing large-group productivity. Again, a must read for each Rakoon (/r/rakulang comments). Can’t wait to read part 3!

Typing Junctionally

Wim Vanderbauwhede continues their journey into the Raku Programming Language, with two instalments focusing on Junctions:

A Grant Proposal

Piotr Fusik has submitted a grant proposal for a Ć-to-Raku translator. Ć is a programming language which can already be translated automatically to C, C++, C#, Java, JavaScript, Python, Swift and OpenCL. If you have any thoughts about this proposal, be sure to leave them with the proposal!

A bouquet of blog posts

This week saw a rather large set of blog posts about the Raku Programming Language from various expected and unexpected corners of the Earth:

Weekly Challenge

Weekly Challenge #81 is available for your perusal. And Andrew Shitov did a full review of the Raku solutions of Challenge #79 (including a video run-through), and Simon Proctor looked back on that.

Core Developments

Most of the core developments have been happening in the rakuast branch (converting RakuAST trees back to Raku source code, AKA deparsing). Meanwhile, in the main branch:

This week’s new Pull Requests:

Please check them out and leave any comments that you may have!

Questions about Raku

Meanwhile on Twitter

Comments about Raku

New Raku Modules

Updated Raku Modules

Winding down

A much quieter week, but with still some thought provoking blog posts!

In another language, in another galaxy, Tau Station (a digital expression of love between a beautifully written science fiction novel and a grippingly immersive MMO RPG) has started a Kickstarter to help it improve and expand their horizons. Knowing that many Rakoons are science-fiction fans as well, yours truly thought it would appropriate to draw your attention to this :-).

Please don’t forget to stay healthy and to stay safe. Until next week for more news about the Raku Programming Language. Until then!

p6steve: Machine Math and Raku

Published by p6steve on 2020-09-29T21:57:33

In the machine world, we have Int and Num. A High Level Language such as Raku is a abstraction to make it easier for humans. Humans expect decimal math 0.1 + 0.2 = 0.3 to work.

Neither Int nor Num can do this! Huh? How can that be?

Well Int is base-2 and decimals are base-10. And Num (IEEE 754 Floating Point) is inherently imprecise.

Luckily we have Rat. Raku automatically represent decimals and other rational numbers as a ratio of two integers. Taking into account all the various data shuffling any program creates, the performance is practically as fast as Int. Precision, accuracy and performance are maintained for the entire realm of practical algebraic calculations.

And when we get to the limit of Rat (at around 3.614007241618348e-20), Raku automatically converts to a Num. After this, precision cannot be recovered – but accuracy and dynamic range are maintained. Modern Floating Point Units (FPUs) make Num nearly as fast as Int so performance is kept fast.

Here, we have smoothly transitioned to the world of physical measurements and irrational numbers such as square root and logarithms. It’s a one way trip. (Hint: If you want a Num in the first place, just use ‘e’ in the literal.)

Good call, Raku!

Footnote: FatRat and BigInt are not important for most use cases. They incur a performance penalty for the average program and rightly belong in library extensions, not the core language.

gfldex: Releasing for virtuous programmers

Published by gfldex on 2020-09-28T20:02:30

Today I released META6::bin with the following command.

meta6 --release

Since I forgot to commit the changes to I had to release again with.

meta6 --release --version=+

As a result there are now some friendly tarballs at To get there I had to force travis to install Test::META from HEAD. If you wish to release on github with ease, you may have to run the following commands.

zef install --/test
zef install META6::bin

It should work with already existing META6.json files out of the box but likely need the setup instructions at the bottom of the README.

As soon as RabidGravy found the time to make a proper release of Test::META (shouldn’t take more then 10 seconds) I will set the proper dependencies.

Some oddities in the response of the github API (<tarball_url> doesn’t point at the tarball) gave me a good reason to have another look at Codeberg. A lot was added in the last two years. The project is run by a non-profit from Berlin and as such not based in a country that likes to punish people because they don’t like a government. If we want to provide a programming language for a global community we can’t rely on Trumpland anymore. The whole software stack is FOSS and the API looks really nice. I didn’t miss any features that are on github but I didn’t really look for projects that are handled by an organisation.

I aim to add support for Codeberg to META6::bin and shall report here before Christmas.

Rakudo Weekly News: 2020.39 The Releaser

Published by liztormato on 2020-09-28T16:45:56

Alexander Kiryuhin has been very busy in the past week. Not only did they release a Comma Complete update (the Raku IDE of choice, now with 2020.02 IntelliJ support). They also released the Rakudo 2020.09 Compiler Release implementing the Raku Programming Language. And Claudio Ramirez made sure there are ready to download Linux packages for that release. And Timo Paulssen made sure there’s an AppImage for it as well!

A Raku Manifesto

After asking a pretty pertinent question on Reddit, Daniel Sockwell continued to produce high-quality thought provoking blog posts. This time, it’s part 1 of a Raku Manifesto, describing what they think the Raku Programming Language is about. Thinking about expressive code vs uniform code, rewarding mastery vs ease of learnability, powerful vs unsurprising code and finally, individual vs large-group productivity. A must read for each Rakoon (/r/rakulang comments)!


Timo Paulssen was unsatisfied by the installation process of MoarPerf (the full Rakudo profiler for MoarVM). Until they found out about AppImage, a format for Linux that allows programs to be distributed as a single executable file. Then packaged MoarPerf and Rakudo in it, and also blogged about it. Can you AppImageine that?

A Universal Interpreter

Wim Vanderbauwhede continues their journey into functional programming with Raku, with two instalments (Twitter comments):

Another round of grant requests

Only a few more days to go to submit your grant proposals to improve the Raku Programming Language for the September 2020 round!

A list of matches

Wenzel P.P. Peppmeyer only wrote one blog post this week: List breaks the chain.

Yet another Pearl

Andrew Shitov wrote another episode in the Pearls of Raku series this week. Issue 13: functional elements and recursive sum.

Weekly Challenge

Weekly Challenge #80 is available for your perusal. And Andrew Shitov did a full review of the Raku solutions of Challenge #78 (including a video run-through).

Core Developments

Most of the core developments have been happening in the new-disp (resume on dispatch related work) and rakuast branches (converting RakuAST trees back to Raku source code, AKA deparsing). Meanwhile, in the main branch:

This week’s new Pull Requests:

Please check them out and leave any comments that you may have!

Questions about Raku

Meanwhile on Twitter

Meanwhile on perl6-users

Comments about Raku

New Raku Modules

Updated Raku Modules

Winding down

Quite a busy week again with some thought provoking blog posts! Please don’t forget to stay healthy and to stay safe. Until next week for more news about the Raku Programming Language. Until then!

gfldex: List breaks the chain

Published by gfldex on 2020-09-25T12:14:51

While watching RaycatWhoDat liking Raku, I realised that a List of Matches is not a MatchList.

say 'a1'.match(/ \d /).replace-with('#');
say 'a1b2'.match(/ \d /, :g).replace-with('#');
# OUTPUT: a#
#         No such method 'replace-with' for invocant of type 'List'
            in block <unit> at /home/dex/tmp/tmp-2.raku line 8

The 2nd call of replace-with will fail because .match with :g returns a List of Match. Of course we can play the easy game and just use subst which does the right thing when called with :g. This wouldn’t make a good blog post though.

To make replace-with work with a List we can use a where-clause. Any Match will have a copy of the original Str but not to the original Regex so we actually have to build a list of Str of everything that was not matched. This can be done by using the indices stored in .from and .to.

multi sub replace-with(\l where (.all ~~ Match), \r --> Str) {
    my $orig := l.head.orig;
    my @unmatched;

    @unmatched.push: $orig.substr(0, l.head.from);
    for ^(l.elems - 1) -> $idx {
        @unmatched.push: $orig.substr(l[$idx].to, l[$idx+1].from - l[$idx].to);

    @unmatched.push: $orig.substr(;

    (@unmatched Z (|(r xx l.elems), |'')).flat.join;

say 'a1vvvv2dd3e'.match(/ \d /, :g).&replace-with('#');
# OUTPUT: a#vvvv#dd#e

If the original string does not end with a match, the list of matches will be one short to be just zipped in. That’s why I have to extend the list of replacements by an empty string before feeding it to Z.

So if subst is doing it right why bother with .replace-with? Because sometimes we have to use $/.

if 'a1bb2ccc3e' ~~ m:g/ \d / {
    say $/.&replace-with('#');

Often we could change the code but when a routine from a module returns Match or a list thereof, we are out of luck. For completeness we need a few more multies.

multi sub replace-with(Match \m, \r --> Str) {

multi sub replace-with(Match \m, &r --> Str) {

multi sub replace-with(\l where (.all ~~ Match), &r) {
    my $orig := l.head.orig;
    my @unmatched;

    @unmatched.push: $orig.substr(0, l.head.from);
    for ^(l.elems - 1) -> $idx {
        @unmatched.push: $orig.substr(l[$idx].to, l[$idx+1].from - l[$idx].to);

    @unmatched.push: $orig.substr(;

    (@unmatched Z (|, |'')).flat.join;

Even if the problem is solvable it still bugs me. We have :g in many places in Raku to provide quite a lot of DWIM. In some places that concept breaks and almost all of them have to do with lists. Often ».comes to the rescue. When we actually have to work on the list and not the individual elements, even that doesn’t work. The methods on strings just work because in this case we deliberately avoid breaking the list of characters apart.

If you follow this blog you know that I’m leaning strongly towards operators. Sadly ., .? and ». are not real infixes which we can overload. Nor can we declare infixes that start with a . or we could introduce an operator that turns it’s LHS to a list and then does a dispatch to multis that can handle lists of a certain type.

Without that we need to change .match to return a subclass of List that got the method .replace-with. We could stick it into Cool but that is a crowded place already.

We don’t really have a nice way to augment the return values of builtin methods. So this will have to be fixed in CORE.

my Timotimo \this: Can you AppImageine that?

Published by Timo Paulssen on 2020-09-24T15:38:32

Can you AppImageine that?
Photo by Susan Holt Simpson / Unsplash
Can you AppImageine that?

I have been unsatisfied with the installation process for MoarPerf for a little while now. You have to either compile the javascript (and css) yourself using npm (or equivalent) which takes a little while, or you have to rely on me releasing an "everything built for you" version to my github repo.

The last few days I've repeatedly bonked my metaphorical pickaxe against the stone wall that is unpleasantly long build times and an endless stream of small mistakes in order to bring you MoarPerf in an AppImage.

AppImage is a format for linux programs that allows programs to be distributed as a single executable file without a separate install process.

The AppImage for MoarPerf includes a full Rakudo along with the dependencies of MoarPerf, the built javascript, and the Raku code. This way you don't even have to have a working Rakudo installation on the machine you want to use to analyze the profiler results. Yours truly tends to put changes in MoarVM or nqp or Rakudo that sometimes prevent things from working fine, and resetting the three repos back to a clean state and rebuilding can be a bit of an annoyance.

With the MoarPerf AppImage I don't have to worry about this at all any more! That's pretty nice.

AppImages for Everyone!

With an AppImage created for MoarPerf it was not too much work to make an AppImage for Rakudo without a built-in application.

The next step is, of course, to pack everything up nicely to create a one-or-two-click solution to build AppImages for any script that you may be interested in running.

There has also already been a module that creates a windows installer for a Raku program by installing a custom MoarVM/nqp/Rakudo into a pre-determined path (a limitation from back when Rakudo wasn't relocatable yet), and maybe I should offer an installer like this for windows users, too? The AppImage works much like this, too, except it already makes use of the work that made Rakudo relocatable, so it doesn't need to run in a pre-defined path.

If you want to give building AppImages a try as well, feel free to steal everything from the rakudo-appimage repository, and have a look at the .travis.yml and the appimage folder in the moarperf repo!

In any case, I would love to hear from people, whether the AppImages for Rakudo and MoarPerf work on their machines, and what modules/apps they would like to have in an AppImage. Feel free to message me on twitter, write to the Raku users mailing list, or find me on freenode as timotimo.

Thanks for reading, stay safe, and see y'all later!

vrurg: An article about Proxy is finally complete

Published by Vadim Belman on 2020-09-11T00:00:00

After month and a half full of many events, I finally got time to complete one more article for the Advanced Raku For Beginners series.

Frankly, I’m not happy about it. It feels to me that my not the perfect English is gotten even worse; not all topics and quirks are covered. But at least I’m getting back in these waters. So, just let it be. I’m warming up. And hope to advance into another subject soon.

gfldex: Releasing on github

Published by gfldex on 2020-09-10T19:48:28

In my last post I lamented the lack of testing metadata. Just a few days later it got in my way when I played with creating releases on github. My normal workflow on github is to commit changes and push them to trigger travis. When travis is fine I bump the version field in META6.json so the ecosystem and zef can pick up the changes. And there is a hidden trap. If anybody clones the repo via zef just before I bump the version, there will be a mismatch between code and version. CPAN doesn’t got that problem because there is always a static tarball per version. With releases we can get the same on github.

It’s a fairly straight forward process.

This is so simple that I immediately automated that stuff with META6::bin so I can mess it up. (Not released yet, see below.)

The result is an URL like so: When we feed that to zef it will check if the version is not already installed and then proceed to test and install.

And there is a catch. Even though zef is fine with the URL, Test::META will complain because it doesn’t end in .git and fail the test. This in turn will stop zef from installing the module. We added that check to make sure zef always gets a proper link to a clone-able repo for modules hosted on github. This assumption is clearly wrong and needs fixing. I will send a PR soon.

Having releases on github (other gitish repo-hosting sites will have similar facilities or will get them) can get us one step closer to a proper RPAN. Once I got my first module into the ecosystem this way I will provide an update here.

vrurg: The Election Time

Published by Vadim Belman on 2020-09-10T00:00:00

Just a reminder to anybody passing by this blog that the election to Raku Steering Council is going on now and will be taking place until Sep 20. More details can be found in the voting form and the original announcement.

I have cast my ballot.

Raku Advent Calendar: RFC 265: Interface polymorphism considered lovely

Published by vrurg on 2020-08-21T00:01:00

A little preface with an off-topic first. In the process of writing this post I was struck by the worst sysadmin’s nightmare: loss of servers followed by a bad backup. Until the very last moment I have had well-grounded fears of not finishing the post whatsoever. Luckily, I made a truce with life to get temporary respite. A conclusion? Don’t use bareos with ESXi. Or, probably, just don’t use bareos…

While picking up a RFC for my previous advent post I was totally focused on language-objects section. It took me a few passes to find the right one to cover. But in the meantime I realized that a very important topic is actually missing from the list. “Impossible!” – I said to myself and went onto another hunt later. Yet, neither search for “abstract class”, nor for “role” didn’t come up with any result. I was about to give up and make the conclusion that the idea came to life later, when the synopses were written or around so.

But, wait, what interface is mentioned as a topic of a OO-related RFC? Oh, that interface! As the request body states it:

Add a mechanism for declaring class interfaces with a further method for declaring that a class implements said interface.

At this point I realized once again that it is now a full 20 years behind us. That the text is from the times when many considered Java as the only right OO implementation! And indeed, by reading further we find the following statement, likely to be affected by some popular views of the time:

It’s now a compile time error if an interface file tries to do anything other than pre declare methods.

Reminds of something, isn’t it? And then, at the end of the RFC, we find another one:

Java is one language that springs to mind that uses interface polymorphism. Don’t let this put you off – if we must steal something from Java let’s steal something good.

Good? Good?!! Oh, my… Java’s attempt to solve problems of C++ multiple inheritance approach by simply denying it altogether is what drove me away from the language from the very beginning. I was fed up with Pascal controlling my writing style as far back as in early 90s!

Luckily, those involved in early Perl6 design must have shared my view to the problem (besides, Java itself has changed a lot since). So, we have roles now. What they have in common with abstract classes and the modern interfaces is that a role can define an interface to communicate with a class, and provide implementation of some role-specific behavior too. It can also do a little more than only that!

What makes roles different is the way a role is used in Raku OO model. A class doesn’t implement a role; nor it inherits from it as it would with abstract classes. Instead it does the role; or the other word I love to use for this: it consumes a role. Technically it means that roles are mixed into classes. The process can be figuratively described as if the compiler takes all methods and attributes contained by role’s type object and re-plants then onto the class. Something like:

role Foo {
    has $.foo = 42;
    method bar {
        say "hello!"
class Bar does Foo { }
my $obj =;
say $; # 42
$;     # hello!

How is it different from inheritance? Let’s change the class Bar a little:

class Baz {
    method bar {
        say "hello from Baz!"
class Bar does Foo is Baz {
    method bar {
        say "hello from Bar!";
}; # hello from Bar!
             # hello from Baz!

nextsame in this case re-dispatches a method call to the next method of the same name in the inheritance hierarchy. Simply put, it passes control over to the method Baz::bar, as one can see from the output we’ve received. And Foo::bar? It’s not there. When the compiler mixes the role into Bar it finds that the class does have a method named bar already. Thus the one from Foo is ignored. Since nextsame only considers classes in the inheritance hierarchy, Foo::bar is not invoked.

With another trick the difference from interface consumption can also be made clear:

class Bar {
    method bar {
        say "hello from Bar!"
my $obj =;
$; # hello from Bar!
$obj does Foo;
$; # hello!

In this example the role is mixed into an existing object, thanks to the dynamic nature of Raku which makes this possible. When a role is applied this way its content is enforced over the class content, similarly to a virus injecting its genetic material into a cell effectively overriding internal processes. This is why the second call to bar is dispatched to the Foo::bar method and Bar::bar is nowhere to be found on $obj this time.

To have this subject fully covered, let me show you some funny code example. The operator but used in it behaves like does except it doesn’t modify its LHS object; instead but creates and returns a new one:

‌‌my $s1 = "not empty means true";
my $s2 = $s1 but role { method Bool { False } };
say $s1 ?? "true" !! "false";
say $s2 ?? "true" !! "false";

This snippet I’m leaving for you to try on your own because it’s time for my post to move onto another topic: role parameterization.

Consider the example:

role R[Str:D $desc] {
    has Str:D $.description = $desc;
class Foo does R["some info"] { }
say; # some info

Or more practical one:

role R[::T] {
    has T $.val is rw;
class ContInt does R[Int] { } = "oops!"; # "Type check failed..." exception is thrown

The latter example utilizes so called type capture where T is a generic type, the concept many of you are likely to know from other languages, which turns into a concrete type only when the role gets consumed and supplied with a parameter, as in class ContInt declaration.

The final iteration for parametrics I’m going to present today would be this more extensive example:

role Vect[::TX] {
    has TX $.x;
    method distance(Vect $v) { ($v.x - $.x).abs }
role Vect[::TX, ::TY] {
    has TX $.x;
    has TY $.y;
    method distance(Vect $v) { 
        (($v.x - $.x)² + ($v.y - $.y)²).sqrt 

class Foo1  does Vect[Rat]      { }
class Foo2 does Vect[Int, Int] { }

my $foo1 =;
my $foo2 =, :y(5));
say $foo1;                                   # => 10.0)
say $foo2;                                   # => 10, y => 5)
say $foo2.distance(, :y(4))); # 1.4142135623730951

Hopefully, the code explains itself. Most certainly it nicely visualizes the long way made by the language designers since the initial RFC was made.

At the end I’d like to share a few interesting facts about Raku roles and their implementation by Rakudo.

  1. As of Raku v6.e, a role can define own constructor/destructor submethods. They’re not mixed into a class as methods are. Instead, they’re used to build/destroy an object same way, as constructors/destructors of classes do:
use v6.e.PREVIEW; # 6.e is not released yet
role R { submethod TWEAK { say "R" } }
class Foo { submethod TWEAK { say "Foo" } }
class Bar is Foo does R { submethod TWEAK { say "Bar" } }; # Foo
         # R
         # Bar
  1. Role body is a subroutine. Try this example:
role R { say "Role" }
class Foo { say "Foo" }
# Foo

Then modify class Foo so that it consumes R:

class Foo does R { say "Foo" }
# Role
# Foo

The difference in the output is explained by the fact that role body gets invoked when the role itself is mixed into a class. Try adding one more class consuming R alongside with Foo and see how the output changes. To make the distinction between class and role bodies even more clear, make your new class inherit from Foo. Even though is and does look alike they act very much different. 3. Square brackets in role declaration enclose a signature. As a matter of fact, it is the signature of role body subroutine! This makes a few very useful tricks possible:

# Limit role parameters to concrete numeric objects.
role R[Numeric:D ::T $default] {
    has T $.value = $default;
class Foo[42.13] { };
say; # 42.13

Or even:

# Same as above but only allow specific values.
role R[Numeric:D ::T $default where * > 10] {
    has T $.value = $default;

Moreover, in case when few different parametric candidates are declared for a role, choosing the right one is a task of the same kind as choosing the right routine of a few multi candidates and based on matching signatures to the parameters passed. 4. Rakudo implements a role using four different role types! Let me demonstrate one aspect of this with the following snippet based on the example for the previous fact:

for Foo.^roles -> \consumed {
    say R === consumed

=== is a strict object identity operator. In our case we can consider it as a strict type equivalence operator which tells us if two types are actually exactly the same one.

And as I hope to have this subject covered later in a more extensive article, at this point I would make it a classical abrupt open ending by providing just the output of the above snippet as a hint:


Raku Advent Calendar: RFC 28, by Simon Cozens

Published by coke on 2020-08-20T03:00:00


RFC 28 – Perl Should Stay Perl

Originally Submitted by Simon Cozens, RFC 28 on August 4, 2020, this RFC asked the community to make sure that whatever updates were made, that Perl 6 was still definitely recognizable as Perl. After 20 years of design, proofs-of-concept, implementations, two released language versions, we’ve ended up with something that is definitely Perlish, even if we’re no longer a Perl.

At the time the RFCs were submitted, the thought was that this language would be the next Perl in line, Perl 6. As time went on before an official language release, Perl 5 development picked up again, and that team & community wanted to continue on its own path. A few months ago, Perl 6 officially changed its name to Raku – not to get away from our Perl legacy, but to free the Perl 5 community to continue on their path as well. It was a difficult path to get to Raku, but we are happy with the language we’re shipping, even if we do miss having the Perl name on the tin.

“Attractive Nuisances”

Let’s dig into some of the specifics Simon mentions in his RFC.

We’ve got a golden opportunity here to turn Perl into whatever on earth we like. Let’s not take it.

This was a fine line that we ended up crossing, even before the rename. Specific design decisions were changed, we started with a fresh implementation (more than once if you count Pugs & Parrot & Niecza …). We are Perlish, inspired by Perl, but Raku is definitely different.

Nobody wins if we bend the Perl language out of all recognition, because it won’t be Perl any more.

I argue that eventually, everyone won – we got a new and improved Perl 5 (and soon, a 7), and we got a brand new language in Raku. The path wasn’t clear 20 years ago, but we ended up in a good place.

Some things just don’t need heavy object orientation.

Raku’s OO is everywhere: but it isn’t required. While you can treat everything as an object:


You can still use the familiar Perlish forms for most features. say sqrt 3;

Even native scalars (which don’t have the overhead of objects) let you treat them as OO if you want.

  my uint32 $x = 32;
  say $x;

Even though $x here doesn’t start out as an object, by calling a meta-method on it, the compiler cheats on our behalf and outputs Int here, the closest class to our native int.

But we avoid going the extent of Java; for example, we don’t have to define a class with a main method in order to execute a program.

Strong typing does not equal legitimacy.

Similar to the OO approach, we don’t require typing, but allow you to gradually add it. You can start with an untyped scalar variable, but as you further develop your code, you can add a type to that declared variable, and to parameters to subs & methods. The types can be single classes, subsets, Junctions, where clauses with complicated logic: you can use as much or as little typing as you want. Raku’s multi routines (subs or methods with the same name but different arguments) give you a way to split up your code based on types that is then optimized by the compiler. But you can use as little or as much of it as you want.

Just because Perl has a map operator, this doesn’t make it a functional programming language.

I think Raku stayed true to this point – while there are functional elements, the polyglot approach (supporting multiple different paradigms) means that any one of them, including functional, doesn’t take over the language. But you can declare routines pure, allowing the compiler to constant fold calls to that routine when the args are known at compile time.

Perl is really hard for a machine to parse. … It’s meant to be easy for humans to understand.

Development of Raku definitely embraced this thought – “torture the implementators on behalf of the users”. This is one of the reasons it took us a while to get to here. But on that journey, we designed and developed new language parsing tools that we not only use to build and run Raku, but we expose to our users as well, allowing them to implement their own languages and “Slangs” on top of our compiler.


Finally, now that the Perl team is proposing a version jump to 7, I suspect the Perl community will raise similar concerns to those raised by Simon. Raku and Perl 7 have taken two different paths, but both will be recognizable to the Perl 5 RFC contributors from 20 years ago.

Raku Advent Calendar: RFC 84 by Damian Conway: => => =>

Published by koto on 2020-08-19T01:00:00

RFC 84 by Damian Conway: Replace => (stringifying comma) with => (pair constructor)

Yet another nice goodie from Damian, truly what you might expect from the interlocutor and explicator!

The fat comma operator, =>, was originally used to separate values – with a twist. It behave just like , operator did, but modified parsing to stringify left operand.

It saved you some quoting for strings and so this code for hash initialization:

my %h = (
'a', 1,
'b', 2,

could be written as:

my %h = (
a => 1,
b => 2,

Here, bare a and b are parsed correctly, without a need to quote them into strings. However, the usual hash assignment semantics is still the same: pairs of values are processed one by one, and given that => is just a “left-side stringifying” comma operator, interestingly enough the code above is equivalent to this piece:

my %h = ( a => 1 => b => 2 => );

The proposal suggested changing the meaning of this “special” operator to become a constructor of a new data type, Pair.

A Pair is constructed from a key and a value:

my @pairs = a => 42, 1 => 2;
say @pairs[0]; # a => 42
say @pairs[1]; # 1 => 2;
say @pairs[1].key.^name; # Int, not a Str

The @pairs list contains just 2 values here, not 4, one is conveniently stringified for us and the second just uses bare Int literal as a key.

It turns out, introducing Pair is not only a convenient data type to operate on, but this change offers new opportunities for… subroutines.

Raku has first class support of signatures, both for the sake of the “first travel class” pun here and for the matter of it, yes, actually having Signature, Parameter and Capture as first-class objects, which allows for surprising solutions. It is not a surprise it supports named parameters with plenty of syntax for it. And Pair class has blended in quite naturally.

If a Pair is passed to a subroutine with a named parameter where keys match, it works just so, otherwise you have a “full” Pair, and if you want to insist, a bit of syntax can help you here:

sub foo($pos, :$named) {
say "$pos.gist(), $named.gist()";
foo(42); # 42, (Any)
try foo(named => 42); # Oops, no positionals were passed!
foo((named => 42)); # named => 42, (Any)
foo((named => 42), named => 42); # named => 42, 42

As we can see, designing a language is interesting: a change made in one part can have consequences in some other part, which might seem quite unrelated, and you better hope your choices will work out well when connected together. Thanks to Damian and all the people who worked on Raku design, for putting in an amazing amount of efforts into it!

And last, but not the least: what happened with the => train we saw? Well, now it does what you mean if you mean what it does:

my %a = a => 1 => b => 2;
say %a.raku; # {:a(1 => :b(2))}

And yes, this is a key a pointing to a value of Pair of 1 pointing to a value of Pair of b pointing to value of 2, so at least the direction is nice this time. Good luck and keep your directions!

Raku Advent Calendar: RFC 200, by Nathan Wiger: Revamp tie to support extensibility

Published by liztormato on 2020-08-18T03:00:00

Proposed on 7 September 2000, frozen on 20 September 2000, depends on RFC 159: True Polymorphic Objects proposed on 25 August 2000, frozen on 16 September 2000, also by Nathan Wiger and already blogged about earlier.

What is tie anyway?

RFC 200 was about extending the tie functionality as offered by Perl.

This functionality in Perl allows one to inject program logic into the system’s handling of scalars, arrays and hashes, among other things. This is done by assigning the name of a package to a data-structure such as an array (aka tying). That package is then expected to provide a number of subroutines (e.g. FETCH and STORE) that will be called by the system to achieve certain effects on the given data-structure.

As such, it is used by some of Perl’s core modules, such as threads, and many modules on CPAN, such as Tie::File. The tie functionality of Perl still suffers from the problems mentioned in the RFC.

It’s all tied

In Raku, everything is an object, or can be considered to be an object. Everything the system needs to do with an object, is done through its methods. In that sense, you could say that everything in Raku is a tied object. Fortunately, Rakudo (the most advanced implementation of the Raku Programming Language) can recognize when certain methods on an object are in fact the ones supplied by the system, and actually create short-cuts at compile time (e.g. when assigning to a variable that has a standard container: it won’t actually call a STORE method, but uses an internal subroutine to achieve the desired effect).

But apart from that, Rakudo has the capability of identifying hot code paths during execution of a program, and optimize these in real time.

Jonathan Worthington gave two very nice presentations about this process: How does deoptimization help us go faster from 2017, and a Performance Update from 2019.

Because everything in Raku is an object and access occurs through the methods of the classes of these objects, this allows the compiler and the runtime to have a much better grasp of what is actually going on in a program. Which in turn gives better optimization capabilities, even optimizing down to machine language level at some point.

And because everything is “tied” in Raku (looking at it using Perl-filtered glasses), injecting program logic into the system’s handling of arrays and hashes can be as simple as subclassing the system’s class and providing a special version of one of the standard methods as used by the system. Suppose you want to see in your program when an element is fetched from an array, one need only add a custom AT-POS method:

class VerboseFetcher is Array {    # subclass core's Array class
    method AT-POS($pos) {           # method for fetching an element
        say "fetching #$pos";        # tell the world
        nextsame                     # provide standard functionality

my @a is VerboseFetcher = 1,2,3;   # mark as special and initialize
say @a[1];  # fetching #1␤2

The Raku documentation contains an overview of which methods need to be supplied to emulate an Array and to emulate a Hash. By the way, the whole lemma about accessing data structure elements by index or key is recommended reading for someone wanting to grok those aspects of the internals of Raku.

Nothing is special

In a blog post about RFC 168 about making things less special, it was already mentioned that really nothing is special in Raku. And that (almost) all aspects of the language can by altered inside a lexical scope. So what the above example did to the Array class, can be done to any of Raku’s core classes, or any other classes that have been installed from the ecosystem, or that you have written yourself.

But it can be overwhelming to have to supply all of the logic needed to fully emulate an array or a hash. Especially when you first try to do this. Therefore the ecosystem actually has two modules with roles that help you with that:

Both modules only require you to implement 5 methods in a class that does these roles to get the full functionality of an array or a hash, completely customized to your liking.

In fact, the flexibility of the approach of Raku towards customizability of the language, actually allowed the implementation of Perl’s tie built-in function in Raku. So if you’re porting code from Perl to Raku, and the code in question uses tie, you can use this module as a quick intermediate solution.

Has the problem been fixed?

Let’s look at the problems that were mentioned with tie in RFC 200:

  1. It is non-extensible; you are limited to using functions that have been implemented with tie hooks in them already.

Raku is completely extensible and pluggable in (almost) all aspects of its implementation. There is no limitation to which classes one can and one cannot extend.

  1. Any additional functions require mixed calls to tied and OO interfaces, defeating a chief goal: transparency.

All interfaces use methods in Raku, since everything is an object or can be considered as one. Use of classes and methods should be clear to any programmer using Raku.

  1. It is slow. Very slow, in fact.

In Raku, it is all the same speed during execution. And every customization profits from the same optimization features like every other piece of code in Raku. And will be, in the end, optimized down to machine code when possible.

  1. You can’t easily integrate tie and operator overloading.

In Raku, operators are multi-dispatch subroutines that allow additional candidates for custom classes to be added.

  1. If defining tied and OO interfaces, you must define duplicate functions or use typeglobs.

Typeglobs don’t exist in Raku. All interfacing in Raku is done by supplying additional methods (or subroutines in case of operators). No duplication of effort is needed, so no such problem.

  1. Some parts of the syntax are, well, kludgey

One may argue that the kludgey syntax of Perl has been replaced by another kludgey syntax in Raku. That is probably in the eye of the beholder. Fact is that the syntax in Raku for injecting program logic, is not different from any other subclassing or role mixins one would otherwise do in Raku.


Nothing from RFC 159 actually was implemented in the way it was originally suggested. However, solutions to the problems mentioned have all been implemented in Raku.

Raku Advent Calendar: RFC 159, by Nathan Wiger: True Polymorphic Objects

Published by liztormato on 2020-08-17T00:01:00

Proposed on 25 August 2000, frozen on 16 September 2000

On polymorphism

RFC159 introduces the concept of true polymorphic object.

Objects that can morph into numbers, strings, booleans and much more on-demand. As such, objects can be freely passed around and manipulated without having to care what they contain (or even that they’re objects).

When one looks at how 42, "foo", now work in Raku nowadays, one can only see that that vision has pretty much been implemented. Because most of the time, one doesn’t really care about the fact that 42 is really an Int object, "foo" is really a Str object and that now represents a new Instant object every time it is called. The only thing one cares about, is that they can be used in expressions:

say "foo" ~ "bar";  # foobar
say 42 + 666;       # 708
say now - INIT now; # 0.0005243

RFC159 lists a number of method names to be used to indicate how an object should behave under certain circumstances, with a fallback provided by the system if the class of the object does not provide that method. In most cases these methods did not make it into Raku, but some of them did with a different name:

Name in RFC Name in Raku When
STRING Str Called in a string context
NUMBER Numeric Called in a numeric context
BOOLEAN Bool Called in a boolean context

And some of them even retained their name:

Name in RFC When
BUILD Called in object blessing
STORE Called in an lvalue = context
FETCH Called in an rvalue = context
DESTROY Called in object destruction

but with sometimes subtly different semantics from the RFC.

Only a few made it

In the end, only a limited set of special methods was decided on for Raku. All of the other methods in RFC159 have been implemented by polymorphic operators that coerce when needed. For instance the proposed PLUS method has been implemented as an infix + operator that has a “default” candidate that coerces its operands to a number.

So, effectively, if you have an object of class Foo and you want that to act as a number, one only needs to add a Numeric method to that class. An expression such as:

my $foo =;
say $foo + 42;

is effectively executing:

say infix:<+>( $foo, 42 );

and the infix:<+> candidate that takes Any objects, does:

return infix:<+>( $foo.Numeric, 42.Numeric );

And if such a class Foo does not provide a Numeric method, then it will throw an exception.

The DESTROY method

In Raku, object destruction is non-deterministic. If an object is no longer in use, it will probably get garbage collected. The probable part is because Raku does not know a global destruction phase, unlike Perl. So when a program is done, it just does an exit (although that logic does honour any END blocks).

An object is marked “ready for removal” when it can no longer be “reached”. It then has its DESTROY method called when the garbage collection logic kicks in. Which can be any amount of time after it became unreachable.

If you need deterministic calling of the DESTROY method, you can use a LEAVE phaser. Or if that doesn’t allow you to scratch your itch, you can possibly use the FINALIZER module.

STORE / FETCH on scalar values

Conceptually, you can think of a container in Raku as an object with STORE and FETCH methods. Whenever you set a value in a container, it conceptually calls the STORE method. And whenever the value inside the container is needed, it conceptually calls the FETCH method. In pseudo-code:

my $foo = 42;  #<$foo>).STORE(42)

But what if you want to control access to a scalar value, similar to Perl’s tie? Well, in Raku you can, with a special type of container class called Proxy. An example of its usage:

sub proxier($value? is copy) {
        FETCH => method { $value },
        STORE => method ($new) {
            say "storing";
            $value = $new

my $a := proxier(42);
say $a;    # 42
$a = 666;  # storing
say $a;    # 666

Subroutines return their result values de-containerized by default. There are basically two ways of making sure the actual container is returned: using return-rw (like in this example), or by marking the subroutine with the is rw trait.

STORE on compound values

Since FETCH only makes sense on scalar values, there is no support for FETCH on compound values, such as hashes and arrays, in Raku. I guess one could consider calling FETCH in such a case to be the Zen slice, but it was decided that that would just return the compound value itself.

The STORE method on compound values however, allows for some interesting functionality. The STORE method is called whenever there is an initialization of the entire compound value. For instance:

@a = 1,2,3;

basically executes:

@a := @a.STORE( (1,2,3) );

But what if you don’t have an initialized @a yet? Then the STORE method is supposed to actually create a new object and initialize this with the given values. And the STORE method can tell, because then it also receives a INITIALIZE named argument with a True value. So when you write this:

my @b = 1,2,3;

what basically gets executed is:

@b := (1,2,3), :INITIALIZE );

Now, if you realize that:

my @b;

is actually short for:

my @b is Array;

it’s only a small step to realize that you can create your own class with customized array logic, that can replace the standard Array logic with your own. Observe:

class Foo {
    has @!array;
    method STORE(@!array) {
        say "STORED @!array[]";

my @b is Foo = 1,2,3;  # STORED 1 2 3

However, when you actually start using such an array, you are confronted with some weird results:

say @b[0]; #
say @b[1]; # Index out of range. Is: 1, should be in 0..0

Without getting into the reasons for these results, it should be clear that to completely mimic an Array, a lot more is needed. Fortunately, there are ecosystem modules available to help you with that: Array::Agnostic for arrays, and Hash::Agnostic for hashes.


The BUILD method also subtly changed its semantics. In Raku, method BUILD will be called as an object method and receive all of the parameters given to .new, after which it is fully responsible for initializing object attributes. This becomes more visible when you use the internal helper module BUILDPLAN. This module shows the actions that will be performed on an object of a class when built with the default .new method:

class Bar {
    has $.score = 42;
# class Bar BUILDPLAN:
#  0: nqp::getattr(obj,Foo,'$!score') = :$score if possible
#  1: nqp::getattr(obj,Foo,'$!score') = 42 if not set

This is internals speak for: – assign the value of the optional named argument score to the $!score attribute – assign the value 42 to the $!score attribute if it was not set already

Now, if we add a BUILD method to the class, the buildplan changes:

class Bar {
    has $.score = 42;
    method BUILD() { }
# class Bar BUILDPLAN:
#  0: call obj.BUILD
#  1: nqp::getattr(obj,Foo,'$!score') = 42 if not set

Note that there is no automatic attempt to take the value of the named argument score anymore. Which means that you need to do a lot of work in your custom BUILD method if you have many named arguments, and only one of them needs special handling. That’s why the TWEAK method was added:

class Bar {
    has $.score = 42;
    method TWEAK() { }
# class Bar BUILDPLAN:
#  0: nqp::getattr(obj,Foo,'$!score') = :$score if possible
#  1: nqp::getattr(obj,Foo,'$!score') = 42 if not set
#  2: call obj.TWEAK

Note that the TWEAK method is called after all of the normal checks and initializations. This is in most cases much more useful.


Although the idea of true polymorphic objects has been implemented in Raku, it turned out quite different from originally envisioned. In hindsight, one can see why it was decided to be unpractical to try to support an ever increasing list of special methods for all objects. Instead, a choice was made to only implement a few key methods from the proposal, and for the others the approach of automatic coercions was taken.

vrurg: The Raku Steering Council Election

Published by Vadim Belman on 2020-07-30T00:00:00

I’m not the blogging kind of person and usually don’t post without a good reason. For a long while even a good reason wasn’t enough for me to write something. But things are changing, and this subject I should have mentioned earlier.

We’re currently in the process of forming The Raku Steering Council which is considered as a potentially effective governance model for the language and the community. It’s aimed at taking off load from the shoulders of Jonathan Worthington who currently bears the biggest responsibility for the vast majority of problems the community and the language development encounter.

The biggest advantages of the Council as I see them are:

Disclaimer: everything stated above is my personal view of the situation which is to be sumed up as: the damn good thing is happening!

To the point! The Council is not an elite closed club. Anybody can nominate himself! Just read the election announce.

BTW, the announce currently states the Aug 2 is the last date to nominate. This is about to change to Sep 6. Still, don’t procrastinate too much, let the community know about your nomination and yourself!

vrurg: Metamodel Introduction Article. Operators

Published by Vadim Belman on 2020-07-18T06:51:00

I’m publishing the next article from ARFB series. This time rather short one, much like a warm up prior to the main workout.

But I’d like to devote this post to another subject. It’s too small for an article yet still worth special note. It was again inspired by one more post from Wenzel P.P. Peppmeyer. Actually, I knew there going to be a new post from him when found an error report in Rakudo repository. And this is the subject of the report which made me write the post.

In the report Wenzel claims that the following code results in incorrect Rakudo behaviour:

class C { };
my \px =;
sub postcircumfix:«< >»(C $c is raw) {
    dd $c;
px<ls -l>;

And that either the operator redefinition must work or the error message he gets is less than awesome:

===SORRY!=== Error while compiling /home/dex/projects/raku/lib/raku-shell-piping/px.raku
Missing required term after infix
at /home/dex/projects/raku/lib/raku-shell-piping/px.raku:9
------> px<ls -l>⏏;
    expecting any of:

Before I tell why things are happening as intended here, let me notice two problems with the code itself which I copied over as-is since it doesn’t work anyway. First, the postcircumfix sub must be a multi and in Wenzel’s post it is done correctly. Second, it must receive two arguments: first is the object it is applied to, second is what is enclosed into the angle brakets.

So, why won’t it work as one might expect? In Raku there is a class of syntax constructs which look like operators but in fact they’re syntax sugars. There may be different reasons why is it done so. For example, the assignment operator = is done this way to achieve better performance. < > makes what is inclosed inside of it a string or a list of strings. Because of this it belongs to the same category, as quotes "", for example. Therefore, it can only be implemented properly as a syntax construct. When we try to redefine it we break the compiler’s parsing and instead of a postcircumfix it finds a pair of less than and greater than operators. Because the latter doesn’t have rhs statement hence the error we see.

And you know, it was really useful to make this post as I realized that closing of the tickat was preliminary and that such compiler behavior is still incorrect because the attempt to redefine the op should prbably not result in bad parsing.

vrurg: A New Article

Published by Vadim Belman on 2020-07-13T10:01:00

A new article of Advanced Raku For Beginners series is published now. With a really surprising subject this time! It is about how we define Raku. Or, in other words: how one knows that his code is Raku? Why a compiler has the right to state that it is actually compiling Raku? And a few side concepts to provide grounds for the main topic.

It may seem to be a bit late. But keeping in mind that the article refers back to a few concepts from previous publications, it’s probably right the time for it!

Enjoy and don’t forget to correct me whenever necessary!

my Timotimo \this: How would you like a 1000x speed increase

Published by Timo Paulssen on 2020-07-01T20:09:08

How would you like a 1000x speed increase

Good, that's the click-baity title out of the way. Sorry for taking such a long time to write again! There really has been everything going on.

To get back into blogging, I've decided to quickly write about a change I made some time ago already.

This change was for the "instrumented profiler", i.e. the one that will at run-time change all the code of the user's program, in order to measure execution times and count up calls and allocations.

In order to get everything right, the instrumented profiler keeps an entire call graph in memory. If you haven't seen something like it yet, imagine taking stack traces at every point in your program's life, and all these stack traces put together make all the paths in the tree that point at the root.

This means, among other things, that the same function can come up multiple times. With recursion, the same function can in fact come up a few hundred times "in a row". In general, if your call tree can become both deep and wide, you can end up with a whole truckload of nodes in your tree.

How would you like a 1000x speed increase
Photo by Gabriel Garcia Marengo / Unsplash

Is it a bad thing to have many nodes? Of course, it uses up memory. Only a single path on the tree is ever interesting at any one moment, though. Memory that's not read from or written to is not quite as "expensive". It never has to go into the CPU cache, and is even free to be swapped out to disk and/or compressed. But hold on, is this really actually the case?

It turns out that when you're compiling the Core Setting, which is a code file almost 2½ megabytes big with about 71½ thousand lines, and you're profiling during the parsing process, the tree gets enormous. At the same time, the parsing process slows to a crawl. What on earth is wrong here?

Well, looking at what MoarVM spends most of its time doing while the profiler runs gives you a good hint: It's spending almost all of its time going through the entirety of the tree for garbage collection purposes. Why would it do that, you ask? Well, in order to count allocated objects at every node, you have to match the count with the type you're allocating, and that means you need to hold on to a pointer to the type, and that in turn has to be kept up to date if anything moves (which the GC does to recently-born things) and to make sure types aren't considered unused and thrown out.

That's bad, right? Isn't there anything we can do? Well, we have to know at every node which counter belongs to which type, and we need to give all the types we have to the garbage collector to manage. But nothing forces us to have the types right next to the counter. And that's already the solution to the problem:

Holding on to all types is now the job of a little array kept just once per tree, and next to every counter there's just a little number that tells you where in the array to look.

This increases the cost of recording an allocation, as you'll now have to go to a separate memory location to match types to counters. On the other hand, the "little number" can be much smaller than before, and that saves memory in every node of the tree.

More importantly, the time cost of going through the profiler data is now independent of how big the tree is, since the individual nodes don't have to be looked at at all.

With a task as big as parsing the core setting, which is where almost every type, exception, operator, or sub lives, the difference is a factor of at least a thousand. Well, to be honest I didn't actually calculate the difference, but I'm sure it's somewhere between 100x faster and 10000x faster, and going from "ten microseconds per tree node" to "ten microseconds per tree" isn't a matter of a single factor increase, it's a complexity improvement from O(n) to O(1). As long as you can find a bigger tree, you can come up with a higher improvement factor. Very useful for writing that blog post you've always wanted to put at the center of a heated discussion about dishonest article titles!

Anyway, on testing my patch, esteemed colleague MasterDuke had this to say on IRC:

timotimo: hot damn, what did you do?!?! stage parse only took almost twice as long (i.e., 60s instead of the normal 37s) instead of the 930s last time i did the profile

(psst, don't check what 930 divided by 60 is, or else you'll expose my blog post title for the fraud that it is!)

Well, that's already all I had for this post. Thanks for your attention, stay safe, wear a mask (if by the time you're reading this the covid19 pandemic is still A Thing, or maybe something new has come up), and stay safe!

How would you like a 1000x speed increase
Photo by Macau Photo Agency / Unsplash

p6steve: perl7 vs. raku: Sibling Rivalry?

Published by p6steve on 2020-06-27T11:24:01

It was an emotional moment to see the keynote talk at TPRCiC from Sawyer X announcing that perl 7.00 === 5.32. Elation because of the ability of the hardcore perl community to finally break free of the frustrating perl6 roadblock. Pleasure in seeing how the risky decision to rename perl6 to raku has paid off and hopefully is beginning to defuse the tensions between the two rival communities. And Fear that improvements to perl7 will undermine the reasons for many to try out raku and may cannibalise raku usage. (Kudos to Sawyer to recognising that usage is an important design goal).

Then the left side of my brain kicked in. Raku took 15 years of total commitment of genius linguists to ingest 361 RFCs and then synthesise a new kind of programming language. If perl7 seeks the same level of completeness and perfection as raku, surely that will take the same amount of effort. And I do not see the perl community going for the same level of breaking changes that raku did. (OK maybe they could steal some stuff from raku to speed things up…)

And that brought me to Sadness. To reflect that perl Osborned sometime around 2005. That broke the community in two – let’s say the visionaries and the practical-cats. And it drove a mass emigration to Python. Ancient history.

So now we have two sister languages, and each will find a niche in the programming ecosystem via a process of Darwinism. They both inherit the traits ( that made perl great in the first place….

The design of Perl can be understood as a response to three broad trends in the computer industry: falling hardware costs, rising labor costs, and improvements in compiler technology. Many earlier computer languages, such as Fortran and C, aimed to make efficient use of expensive computer hardware. In contrast, Perl was designed so that computer programmers could write programs more quickly and easily.

Perl has many features that ease the task of the programmer at the expense of greater CPU and memory requirements. These include automatic memory management; dynamic typing; strings, lists, and hashes; regular expressions; introspection; and an eval() function. Perl follows the theory of “no built-in limits,” an idea similar to the Zero One Infinity rule.

Wall was trained as a linguist, and the design of Perl is very much informed by linguistic principles. Examples include Huffman coding(common constructions should be short), good end-weighting (the important information should come first), and a large collection of language primitives. Perl favours language constructs that are concise and natural for humans to write.

Perl’s syntax reflects the idea that “things that are different should look different.” For example, scalars, arrays, and hashes have different leading sigils. Array indices and hash keys use different kinds of braces. Strings and regular expressions have different standard delimiters. This approach can be contrasted with a language such as Lisp, where the same basic syntax, composed of simple and universal symbolic expressions, is used for all purposes.

Perl does not enforce any particular programming paradigm (proceduralobject-orientedfunctional, or others) or even require the programmer to choose among them.

But perl7 and raku serve distinct interests & needs:

compilationstatic parserone pass compiler
compile speedsuper fastrelies on pre-c0mp
executioninterpretedvirtual machine
execution speedsuper fastrelies on jit
module libraryCPAN nativeCPAN import
OO philosophyCor not modulepervasive
OO inheritanceRoles + IsRoles + Is + multiple
method invocation->.
type checkingnogradual
sigilsidiosyncratic consistent
unicodefeature guardcore
signaturesfeature guardcore
lazy executionnopecore
Rat mathnopecore
Sets & Mixesnopecore
Complex mathnopecore
variable scope“notched”cleaner
operatorsC-likecleaner (e.g. for ->)
AST macroshuh?
…and so on

A long list and perhaps a little harsh on perl since many things may be got from CPAN – but when you use raku in anger, you do see the benefit if having a large core language. Only when I made this table, did I truly realise just what a comprehensive language raku is, and that perl will genuinely be the lean and mean option.

Ariel Atom 3.5 review, price, specs and video | Evo
Model X | Tesla

And, lest we forget our strengths:

When I first saw Python code, I thought that using indents to define the scope seemed like a good idea. However, there’s a huge downside. Deep nesting is permitted, but lines can get so wide that they wrap lines in the text editor. Long functions and long conditional actions may make it hard to match the start to the end. And I pity anyone who miscounts spaces and accidentally puts in three spaces instead of four somewhere — this can take hours to debug and track down. [Source: here]

p6steve: Raku Objects: Confusing or What?

Published by p6steve on 2020-05-07T21:51:52

Chapter 1: The Convenience Seeker

Coming from Python, the Raku object model is recognizable, but brings a tad more structure:

Screenshot 2020-05-07 22.36.37

What works for me, as a convenience seeker, is:

These are the things you want if you are writing in a more procedural or functional style and using class as a means to define a record type.

Chapter 2: The Control Freak

Here’s the rub…

When we describe OO, terms like “encapsulation” and “data hiding” often come up. The key idea here is that the state model inside the object – that is, the way it chooses to represent the data it needs in order to implement its behaviours (the methods) – is free to evolve, for example to handle new requirements. The more complex the object, the more liberating this becomes.

However, getters and setters are methods that have an implicit connection with the state. While we might claim we’re achieving data hiding because we’re calling a method, not accessing state directly, my experience is that we quickly end up at a place where outside code is making sequences of setter calls to achieve an operation – which is a form of the feature envy anti-pattern. And if we’re doing that, it’s pretty certain we’ll end up with logic outside of the object that does a mix of getter and setter operations to achieve an operation. Really, these operations should have been exposed as methods with a names that describes what is being achieved. This becomes even more important if we’re in a concurrent setting; a well-designed object is often fairly easy to protect at the method boundary.

(source jnthn

Let’s fix that:

Screenshot 2020-05-07 22.38.41
Now, I had to do a bit more lifting, but here’s what I got:

And, in contrast to Chapter 1:

Chapter 3: Who Got the Colon in the End?

I also discovered Larry’s First Law of Language Redesign: Everyone wants the colon

Apocalypse 1: The Ugly, the Bad, and the Good

I conclude that Larry’s decision was to confer the colon on the method syntax,  subtly tilting the balance towards the strict model: by preferring $p.y: 3 over $p.y = 2.

p6steve: Raku vs. Perl – save 70%

Published by p6steve on 2020-04-17T17:36:39

Having hit rock bottom with an ‘I can’t understand my own code sufficiently enough to extend/maintain it’, I have been on a journey to review the perl5 Physics::Unit design and to use this to cut through my self made mess of raku Physics::Unit version 0.0.2.

Now I bring the perspective of a couple of years of regular raku coding to bear, so I am hoping that the bastard child of mature perl5 and raku version one will surpass both in the spirit of David Bowie’s “Pretty Things”.

One of the reasons I chose Physics::Units as a project was that, on the face of it, it seemed to have an aspect that could be approached by raku Grammars – helping me learn them. Here’s a sample of the perl5 version:

Screenshot 2020-04-17 18.40.05

Yes – a recursive descent parser written from scratch in perl5 – pay dirt! There are 215 source code lines dedicated to the parse function. 5 more screens like this…

So I took out my newly sharpened raku tools and here’s my entire port: 

Screenshot 2020-04-17 18.42.08

Instead of ranging over 215 lines, raku has refined this down to a total of 58 lines (incl. the 11 blank ones I kept in for readability) – that’s a space saving of over 70%. Partly removal of parser boilerplate code, partly the raku Grammar constructs and partly an increased focus on the program logic as opposed to the mechanism.

For my coding style, this represents a greater than a two-thirds improvement – by getting the whole parser onto a single screen, I find that I can get the whole problem into my brain’s working memory and avoid burning cycles scrolling up and down to pin down butterflies bugs.

Attentive students will have noted that the Grammar / code integration provides a very natural paradigm for loading real-world data into an OO system, the UnitAction class starts with a stub object and populates ‘has’ attributes as it goes.

Oh and the raku code does a whole lot more such as support for unicode superscripts (up to +/-4), type assignment and type checking, offsets (such as 0 K = 273.15 °C), wider tolerance of user input and so on. Most importantly Real values are kept as Rats as much as possible which helps greatly for example, when round tripping 38.5 °C to  °F and back it is still equals 38.5 °C!

One final remark – use Grammar::Tracer is a fantastic debugging tool for finding and fixing the subtle bugs that can come in and contributing to quickly getting to the optimum solution. Rakudo Star Release 2020.01

Published on 2020-02-24T00:00:00

p6steve: Raku: the firkin challenge

Published by p6steve on 2020-01-20T22:40:01

For anyone wondering where my occasional blog on raku has been for a couple of months – sorry. I have been busy wrestling with, and losing to, the first released version of my Physics::Measure module.

Of course, this is all a bit overshadowed by the name change from perl6 to raku. I was skeptical on this, but did not have a strong opinion either way. So kudos to the folks who thrashed this out and I am looking forward to a naissance. For now, I have decided to keep my nickname ‘p6steve’ – I enjoy the resonance between P6 and P–sics and that is my niche. No offence intended to either camp.

My stated aim (blogs passim) is to create a set of physical units that makes sense for high school education. To me, inspired by the perl5 Physics::Unit module, that means not just core SI units for science class, but also old style units like sea miles, furlongs/fortnight and so on for geography and even history. As I started to roll out v0.0.3 of raku Physics::Unit, I thought it would be worthwhile to track a real-world high school education resource, namely OpenStax CNX. As I came upon this passage, I had to take the firkin challenge on:

While there are numerous types of units that we are all familiar with, there are others that are much more obscure. For example, a firkin is a unit of volume that was once used to measure beer. One firkin equals about 34 liters. To learn more about nonstandard units, use a dictionary or encyclopedia to research different “weights and measures.” Take note of any unusual units, such as a barleycorn, that are not listed in the text. Think about how the unit is defined and state its relationship to SI units.

Disaster – I went back to the code for Physics::Unit and, blow me, could I figure out how to drop in a new Unit: the firkin??…. nope!! Why not? Well Physics::Unit v:0.0.3 was impenetrable even to me, the author. Statistically it has 638 lines of code alongside 380 lines of heredoc data. Practically, while it passes all the tests 100%, it is not a practical, maintainable code base.

How did we get here? Well I plead guilty to being an average perl5 coder who really loves the expressivity that Larry offers … but a newbie to raku. I wanted to take on Physics::Measure to learn raku. Finally, I have started to get raku – but it has taken me a couple of years to get to this point!

My best step now – bin the code. I have dumped my original effort, gone back to the original perl5 Physics::Unit module source and transposed it to raku. The result: 296 lines of tight code alongside the same 380 lines of heredoc – a reduction of 53%! And a new found respect for the design skill of my perl5 forbears.

I am aiming to release as v0.0.7 in April 2020.

Jo Christian Oterhals: By the way, you could replace … * with Inf or the unicode infinity symbol ∞ to make it more…

Published by Jo Christian Oterhals on 2019-11-24T19:25:11

By the way, you could replace … * with Inf or the unicode infinity symbol ∞ to make it more readable, i.e.

my @a = 1, 1, * + * … ∞;

— — or — —

my @a = 1, 1, * + * … Inf;

Jo Christian Oterhals: As I understand this, * + * … * means the following:

Published by Jo Christian Oterhals on 2019-11-24T10:20:11

As I understand this, * + * … * means the following:

First— * + * sums the two previous elements in the list. … * tells this to do this an infinite number of times; i.e.

1, 1, (1 + 1)

1, 1, 2, (1 + 2)

1, 1, 2, 3, (2 + 3)

1, 1, 2, 3, 5, (3 + 5)

1, 1, 2, 3, 5, 8, (5 + 8), etc.

… three dots means that it does it lazy, i.e. that it does not generate an element before you call it. This can be good for large lists that are computationally heavy.

my Timotimo \this: Introducing: The Heap Snapshot UI

Published by Timo Paulssen on 2019-10-25T23:12:36

Introducing: The Heap Snapshot UI

Hello everyone! In the last report I said that just a little bit of work on the heap snapshot portion of the UI should result in a useful tool.

Introducing: The Heap Snapshot UI
Photo by Sticker Mule / Unsplash

Here's my report for the first useful pieces of the Heap Snapshot UI!

Last time you already saw the graphs showing how the number of instances of a given type or frame grow and shrink over the course of multiple snapshots, and how new snapshots can be requested from the UI.

The latter now looks a little bit different:

Introducing: The Heap Snapshot UI

Each snapshot now has a little button for itself, they are in one line instead of each snapshot having its own line, and the progress bar has been replaced with a percentage and a little "spinner".

There are multiple ways to get started navigating the heap snapshot. Everything is reachable from the "Root" object (this is the norm for reachability-based garbage collection schemes). You can just click through from there and see what you can find.

Another way is to look at the Type & Frame Lists, which show every type or frame along with the number of instances that exist in the heap snapshot, and the total size taken up by those objects.

Type & Frame Lists

Introducing: The Heap Snapshot UI

Clicking on a type, or the name or filename of a frame leads you to a list of all objects of that type, all frames with the given name, or all frames from the given file. They are grouped by size, and each object shows up as a little button with the ID:

Introducing: The Heap Snapshot UI

Clicking any of these buttons leads you to the Explorer.


Here's a screenshot of the explorer to give you an idea of how the parts go together that I explain next:

Introducing: The Heap Snapshot UI

The explorer is split into two identical panels, which allows you to compare two objects, or to explore in multiple directions from one given object.

There's an "Arrow to the Right" button on the left pane and an "Arrow to the Left" button on the right pane. These buttons make the other pane show the same object that the one pane currently shows.

On the left of each pane there's a "Path" display. Clicking the "Path" button in the explorer will calculate the shortest path to reach the object from the root. This is useful when you've got an object that you would expect to have already been deleted by the garbage collector, but for some reason is still around. The path can give the critical hint to figure out why it's still around. Maybe one phase of the program has ended, but something is still holding on to a cache that was put in as an optimization, and that still has your object in it? That cache in question would be on the path for your object.

The other half of each panel shows information about the object: Displayed at the very top is whether it is an object, a type object, an STable, or a frame.

Below that there is an input field where you can enter any ID belonging to a Collectable (the general term encompassing types, type objects, stables, and frames) to have a look.

The "Kind" field needs to have the number values replaced with human-readable text, but it's not the most interesting thing anyway.

The "Size" of the Collectable is split into two parts. One is the fixed size that every instance of the given type has. The other is any extra data an instance of this type may have attached to it, that's not a Collectable itself. This would be the case for arrays and hashes, as well as buffers and many "internal" objects.

Finally, the "References" field shows how many Collectables are referred to by the Collectable in question (outgoing references) and how many Collectables reference this object in question.

Below that there are two buttons, Path and Network. The former was explained further above, and the latter will get its own little section in this blog post.

Finally, the bottom of the panel is dedicated to a list of all references - outgoing or incoming - grouped by what the reference means, and what type it references.

Introducing: The Heap Snapshot UI

In this example you see that the frame of the function display from elementary2d.p6 on line 87 references a couple of variables ($_, $tv, &inv), the frame that called this frame (step), an outer frame (MAIN), and a code object. The right pane shows the incoming references. For incoming references, the name of the reference isn't available (yet), but you can see that 7 different objects are holding a reference to this frame.

Network View

The newest part of the heap snapshot UI is the Network View. It allows the user to get a "bird's eye" view of many objects and their relations to each other.

Here's a screenshot of the network view in action:

Introducing: The Heap Snapshot UI

The network view is split into two panes. The pane on the left lists all types present in the network currently. It allows you to give every type a different symbol, a different color, or optionally make it invisible. In addition, it shows how many of each type are currently in the network display.

The right pane shows the objects, sorted by how far they are from the root (object 0, the one in Layer 0, with the frog icon).

Each object has one three-piece button. On the left of the button is the icon representing the type, in the middle is the object ID for this particular object, and on the right is an icon for the "relation" this object has to the "selected" object:

This view was generated for object 46011 (in layer 4, with a hamburger as the icon). This object gets the little "map marker pin" icon to show that it's the "center" of the network. In layers for distances 3, 2, and 1 there is one object each with a little icon showing two map marker pins connected with a squiggly line. This means that the object is part of the shortest path to the root. The third kind of icon is an arrow pointing from the left into a square that's on the right. Those are objects that refer to the selected object.

There is also an icon that's the same but the arrow goes outwards from the square instead of inwards. Those are objects that are referenced by the selected object. However, there is currently no button to have every object referenced by the selected object put into the network view. This is one of the next steps I'll be working on.

Customizing the colors and visibility of different types can give you a view like this:

Introducing: The Heap Snapshot UI

And here's a view with more objects in it:

Introducing: The Heap Snapshot UI

Interesting observations from this image:

Next Steps

You have no doubt noticed that the buttons for collectables are very different between the network view and the type/frame lists and the explorer. The reason for that is that I only just started with the network view and wanted to display more info for each collectable (namely the icons to the left and right) and wanted them to look nicer. In the explorer there are sometimes thousands of objects in the reference list, and having big buttons like in the network view could be difficult to work with. There'll probably have to be a solution for that, or maybe it'll just work out fine in real-world use cases.

On the other hand, I want the colors and icons for types to be available everywhere, so that it's easier to spot common patterns across different views and to mark things you're interested in so they stand out in lists of many objects. I was also thinking of a "bookmark this object" feature for similar purposes.

Before most of that, the network viewer will have to become "navigable", i.e. clicking on an object should put it in the center, grab the path to the root, grab incoming references, etc.

There also need to be ways to handle references you're not (or no longer) interested in, especially when you come across an object that has thousands of them.

But until then, all of this should already be very useful!

Here's the section about the heap snapshot profiler from the original grant proposal:

Looking at the list, it seems like the majority of intended features are already available or will be very soon!

Easier Installation

Until now the user had to download nodejs and npm along with a whole load of javascript libraries in order to compile and bundle the javascript code that powers the frontend of moarperf.

Fortunately, it was relatively easy to get travis-ci to do the work automatically and upload a package with the finished javascript code and the backend code to github.

You can now visit the releases page on github to grab a tarball with all the files you need! Just install all backend dependencies with zef install --deps-only . and run service.p6!

And with that I'm already done for this report!

It looks like the heap snapshot portion of the grant is quite a bit smaller than the profiler part, although a lot of work happened in moarvm rather than the UI. I'm glad to see rapid progress on this.

I hope you enjoyed this quick look at the newest pieces of moarperf!
  - Timo

my Timotimo \this: Progressing with progress.

Published by Timo Paulssen on 2019-09-12T19:50:18

Progressing with progress.

It has been a while since the last progress report, hasn't it?

Over the last few months I've been focusing on the MoarVM Heap Snapshot Profiler. The new format that I explained in the last post, "Intermediate Progress Report: Heap Snapshots", is available in the master branch of MoarVM, and it has learned a few new tricks, too.

The first thing I usually did when opening a Heap Snapshot in the heapanalyzer (the older command-line based one) was to select a Snapshot, ask for the summary, and then for the top objects by size, top objects by count, top frames by size, and/or top frames by count to see if anything immediately catches my eye. In order to make more sense of the results, I would repeat those commands for one or more other Snapshots.

Snapshot  Heap Size          Objects  Type Objects  STables  Frames  References  
========  =================  =======  ============  =======  ======  ==========  
0         46,229,818 bytes   331,212  686           687      1,285   1,146,426   
25        63,471,658 bytes   475,587  995           996      2,832   1,889,612   
50        82,407,275 bytes   625,958  1,320         1,321    6,176   2,741,066   
75        97,860,712 bytes   754,075  1,415         1,416    6,967   3,436,141   
100       113,398,840 bytes  883,405  1,507         1,508    7,837   4,187,184   

Snapshot  Heap Size          Objects    Type Objects  STables  Frames  References  
========  =================  =========  ============  =======  ======  ==========  
125       130,799,241 bytes  1,028,928  1,631         1,632    9,254   5,036,284   
150       145,781,617 bytes  1,155,887  1,684         1,685    9,774   5,809,084   
175       162,018,588 bytes  1,293,439  1,791         1,792    10,887  6,602,449 

Realizing that the most common use case should be simple to achieve, I first implemented a command summary all and later a command summary every 10 to get the heapanalyzer to give the summaries of multiple Snapshots at once, and to be able to get summaries (relatively) quickly even if there's multiple hundreds of snapshots in one file.

Sadly, this still requires the parser to go through the entire file to do the counting and adding up. That's obviously not optimal, even though this is an Embarrassingly Parallel task, and it can use every CPU core in the machine you have, it's still a whole lot of work just for the summary.

For this reason I decided to shift the responsibility for this task to MoarVM itself, to be done while the snapshot is taken. In order to record everything that goes into the Snapshot, MoarVM already differentiates between Object, Type Object, STable, and Frame, and it stores all references anyway. I figured it shouldn't have a performance impact to just add up the numbers and make them available in the file.

The result is that the summary table as shown further above is available only milliseconds after loading the heap snapshot file, rather than after an explicit request and sometimes a lengthy wait period.

The next step was to see if top objects by size and friends could be made faster in a similar way.

I decided that adding an optional "statistics collection" feature inside of MoarVM's heap snapshot profiler would be worthwhile. If it turns out that the performance impact of summing up sizes and counts on a per-type and per-frame basis makes capturing a snapshot too slow, it could be turned off.

Frontend work

> snapshot 50
Loading that snapshot. Carry on...
> top frames by size
Wait a moment, while I finish loading the snapshot...

Name                                  Total Bytes    
====================================  =============  
finish_code_object (World.nqp:2532)   201,960 bytes  
moarop_mapper (QAST.nqp:1764)         136,512 bytes  
!protoregex (QRegex.nqp:1625)         71,760 bytes   
new_type (Metamodel.nqp:1345)         40,704 bytes   
statement (Perl6-Grammar.nqp:951)     35,640 bytes   
termish (Perl6-Grammar.nqp:3641)      34,720 bytes   
<anon> (Perl6-BOOTSTRAP.c.nqp:1382)   29,960 bytes   
EXPR (Perl6-Grammar.nqp:3677)         27,200 bytes   
<mainline> (Perl6-BOOTSTRAP.c.nqp:1)  26,496 bytes   
<mainline> (NQPCORE.setting:1)        25,896 bytes   
EXPR (NQPHLL.nqp:1186)                25,760 bytes   
<anon> (<null>:1)                     25,272 bytes   
declarator (Perl6-Grammar.nqp:2189)   23,520 bytes   
<anon> (<null>:1)                     22,464 bytes   
<anon> (<null>:1)                     22,464 bytes   

Showing the top objects or frame for a single snapshot is fairly straight-forward in the commandline based UI, but how would you display how a type or frame develops its value across many snapshots?

Instead of figuring out the best way to display this data in the commandline, I switched focus to the Moarperf Web Frontend. The most obvious way to display data like this is a Line Graph, I believe. So that's what we have now!

Progressing with progress.

And of course you also get to see the data from each snapshot's Summary in graph format:

Progressing with progress.

And now for the reason behind this blog post's Title.

Progress Notifications

Using Jonathan's module Concurrent::Progress (with a slight modification) I sprinkled the code to parse a snapshot with matching calls of .increment-target and .increment. The resulting progress reports (throttled to at most one per second) are then forwarded to the browser via the WebSocket connection that already delivers many other bits of data.

The result can be seen in this tiny screencast:

Progressing with progress.

The recording is rather choppy because the heapanalyzer code was using every last drop of performance out of my CPU while it was trying to capture my screen.

There's obviously a lot still missing from the heap snapshot analyzer frontend GUI, but I feel like this is a good start, and even provides useful features already. The graphs for the summary data are nicer to read than the table in the commandline UI, and it's only in this UI that you can get a graphical representation of the "highscore" lists.

I think a lot of the remaining features will already be useful after just the initial pieces are in place, so a little work should go a long way.

Bits and Bobs

I didn't spend the whole time between the last progress report and now to work directly on the features shown here. Apart from Life Intervening™, I worked on fixing many frustrating bugs related to both of the profilers in MoarVM. I added a small subsystem I call VMEvents that allows user code to be notified when GC runs happen and other interesting bits from inside MoarVM itself. And of course I've been assisting other developers by answering questions and looking over their contributions. And of course there's the occasional video-game-development related experiment, for example with the GTK Live Coding Tool.

Finally, here's a nice little screencap of that same tool displaying a hilbert curve:

Progressing with progress.

That's already everything I have for this time. A lot has (had to) happen behind the scenes to get to this point, but now there was finally something to look at (and touch, if you grab the source code and go through the needlessly complicated build process yourself).

Thank you for reading and I hope to see you in the next one!
  - Timo

Jo Christian Oterhals: You’re right.

Published by Jo Christian Oterhals on 2019-08-25T18:51:13

You’re right. I’ll let the article stand as it is and reflect my ignorance as it was when I wrote it :-)

Jo Christian Oterhals: Perl 6 small stuff #21: it’s a date! …or: learn from an overly complex solution to a simple task

Published by Jo Christian Oterhals on 2019-07-31T13:23:17

Perl 6 small stuff #21: it’s a date! …or: learn from an overly complex solution to a simple task

This week’s Perl Weekly Challenge (#19) has two tasks. The first is to find all months with five weekends in the years from 1900 through 2019. The second is to program an implementation of word wrap using the greedy algorithm.

Both are pretty straight-forward tasks, and the solutions to them can (and should) be as well. This time, however, I’m also going to do the opposite and incrementally turn the easy solution into an unnecessarily complex one. Because in this particular case we can learn more by doing things the unnecessarily hard way. So this post will take a look at Dates and date manipulation in Perl 6, using PWC #19 task 1 as an example:

Write a script to display months from the year 1900 to 2019 where you find 5 weekends i.e. 5 Friday, 5 Saturday and 5 Sunday.

Let’s start by finding five-weekend months the easy way:

#!/usr/bin/env perl6
say join "\n", grep *.day-of-week == 5, map { |$_, 1 }, do 1900..2019 X 1,3,5,7,8,10,12;

The algorithm for figuring this out is simple. Given the prerequisite that there must be five occurrences of not only Saturday and Sunday but also Friday, you A) *must* have 31 days to cram five weekends into. And when you know that you’ll also see that B) the last day of the month MUST be a Sunday and C) the first day of the month MUST be a Friday (you don’t have to check for both; if A is true and B is true, C is automatically true too).

The code above implements B and employs a few tricks. You read it from right to left (unless you write it from left to right, like this… say do 1900..2019 X 1,3,5,7,8,10,12 ==> map { |$_, 1 } ==> grep *.day-of-week == 5 ==> join “\n”; )

Using the X operator I create a cross product of all the years in the range 1900–2019 and the months 1, 3, 5, 7, 8, 10, 12 (31-day months). In return I get a sequence containing all year-month pairs of the period.

The map function iterates through the Seq. There it instantiates a Date object. A little song and dance is necessary: As takes three unnamed integer parameters, year, month and day, I have to do something to what I have — a Pair with year and month. I therefore use the | operator to “explode” the pair into two integer parameters for year and month.

You can always use this for calling a sub routine with fixed parameters, using an array with parameter values rather than having separate variables for each parameter. The code below exemplifies usage:

my @list = 1, 2, 3;
sub explode-parameters($one, $two, $three) { 
…do something…
#traditional call 
explode-parameters(@list[0], @list[1], @list[2]);
# …or using | 

Back to the business at hand — the .grep filters out the months where the 1st is a Friday, and those are our 5 weekend months. So the output of the one-liner above looks something like this:


This is a solution as good as any, and if a solution was all we wanted, we could have stopped here. But using this task as an example I want to explore ways to utilise the Date class. Example: The one-liner above does the job, but strictly speaking it doesn’t output the months but the first day of those months. Correcting this is easy, because the Date class supports something called formatters and use the sprintf syntax. To do this you utilise the named parameter “formatter” when instantiating the object.

say join "\n", grep *.day-of-week == 5, map { |$_, 1, formatter => { sprintf "%04d/%02d", .year, .month } }, do 1900..2019 X 1,3,5,7,8,10,12;

Every time a routine pulls a stringified version of the date, the formatter object is invoked. In our case the output has been changed to…


Formatters are powerful. Look into them.

Now to the overly complex solution. This is the unthinking programmer’s solution, as we don’t suppose anything. The program isn’t told that 5 weekend months only can occur on 31 day months. It doesn’t know that the 1st of such months must be a Friday. All it knows is that if the last day of the month is not Sunday, it figures out the date of the last Sunday (this is not very relevant when counting three-day weekends, but could be if you want to find Saturday+Sunday weekends, or only Sundays).

#!/usr/bin/env perl6
my $format-it = sub ($self) {
sprintf "%04d month %02d", .year, .month given $self;
sub MAIN(Int :$from-year = 1900, Int :$to-year where * > $from-year = 2019, Int :$weekend-length where * ~~ 1..3 = 3) {
my $date-loop =$from-year, 1, 1, formatter => $format-it);
while ($date-loop.year <= $to-year) {
my $date = $date-loop.later(day => $date-loop.days-in-month);
$date = $date.truncated-to('week').pred if $ != 7;
my @weekend = do for 0..^$weekend-length -> $w {
$date.earlier(day => $w).weekday-of-month;
say $date-loop if ([+] @weekend) / @weekend == 5;
$date-loop = $date-loop.later(:1month);

This code can solve the task both for three day weekends, but also for weekends consisting of Saturday + Sunday, as well as only Sundays. You control that with the command line parameter weekend-length=[1..3].

This code finds the last Sunday of each month and counts whether it has occured five times that month. It does the same for Saturday (if weekend-length=2) and Friday (if weekend-length=3). Like this:

my @weekend = do for 0..^$weekend-length -> $w { 
$date.earlier(day => $w).weekday-of-month;

The code then calculcates the average weekday-of-month for these three days like this:

say $date-loop if ([+] @weekend) / @weekend == 5;

This line uses the reduction operator [+] on the @weekend list to find the sum of all elements. That sum is divided by the number of elements. If the result is 5, then you have a five day weekend.

As for fun stuff to do with the Date object:

.later(day|month|year => Int) — adds the given number of time units to the current date. There’s also an earlier method for subtracting.

.days-in-months — tells you how many days there are in the current month of the Date object. The value may be 31, 30, 29 (february, leap year) or 28 (february).

.truncated-to(week|month|day|year) — rolls the date back to the first day of the week, month, day or year.

.weekday-of-month — figures out what day of week the current date is and calculates how many of that day there has been so far in that month.

Apart from this you’ll see that I added the formatter in a different way this time. This is probably cleaner looking and easier to maintain.

In the end this post maybe isn’t about dates and date manipulation at all, but rather is a call for all of us to use the documentation even more. It’s often I think that Perl 6 should have a function for x, y or z — .weekday-of-month is one such example — and the documentation tells me that it actually does!

It’s very easy to pick up Perl 6 and program it as you would have programmed Perl 5 or other languages you know well. But the documentation has lots of info of things you didn’t have before and that will make programming easier and more fun when you’ve learnt about them.

I guess you don’t need and excuse to delve into the docs, but if you do the Perl Weekly Challenge is an excellent excuse for spending time in the docs!

Jo Christian Oterhals: You have several options besides do. You could use parenthesises instead, like this:

Published by Jo Christian Oterhals on 2019-08-01T07:34:44

You have several options besides do. You could use parenthesises instead, like this:

say join "\n", grep *.day-of-week == 5, map { |$_, 1 }, (1900..2019 X 1,3,5,7,8,10,12);

In this case I just thought the code looked better without parenthesises, so I chose to use do instead. The docs has a couple of sentences about this option here.

BTW, I chose the form blah — the colon variant — instead of also because I wanted to avoid parenthesises. This freedom to chose is the essence of Perl’s credo “There’s more than one way to do it” I guess.

Rant: This freedom of choice is the best thing with Perl 6, but it has a downside too. Code can be harder to understand if you’re not familiar with the variants another programmer has made. This is a part of the Perls’s reputation of being write-only languages.

It won’t happen, but personally I’d like to see someone analyse real-world Perl 6 code (public repositories on Github for instance) and figure what forms are used most. The analysis could have been used to clean house — figuring out what forms to keep and which should be removed. Perl 6 would become a smaller — and arguably easier — language while staying just as powerful as it is today.

my Timotimo \this: A Close Look At Controlling The MoarVM Profiler

Published by Timo Paulssen on 2019-05-22T14:41:13

A Close Look At Controlling The MoarVM Profiler

This is slightly tangential to the Rakudo Perl 6 Performance Analysis Tooling grant from The Perl Foundation, but it does interact closely with it, so this is more or less a progress report.

The other day, Jonathan Worthington of Edument approached me with a little paid Job: Profiling very big codebases can be tedious, especially if you're interested in only one specific fraction of the program. That's why I got the assignment to make the profiler "configurable". On top of that, the Comma IDE will get native access to this functionality.

The actual request was just to allow specifying whether individual routines should be included in the profile via a configuration file. That would have been possible with just a simple text file that contains one filename/line number/routine name per line. However, I have been wanting something in MoarVM that allows much more involved configuration for many different parts of the VM, not just the profiler.

A Close Look At Controlling The MoarVM Profiler
Obligatory cat photo.

That's why I quickly developed a small and simple "domain-specific language" for this and similar purposes.

The language had a few requirements:

There's also some things that aren't necessary:

While thinking about what exactly I should build – before I eventually settled on building a "programming language" for this task – I bounced back and forth between the simplest thing that could possibly work (for example, a text file with a list of file/line/name entries) and the most powerful thing that I can implement in a sensible timeframe (for example, allowing a full NQP script). A very important realization was that as long as I require the first line to identify what "version" of configuration program it is, I could completely throw away the current design and put something else instead, if the need ever arises. That allowed me to actually commit to a design that looked at least somewhat promising. And so I got started on what I call confprog.

Here's an example program. It doesn't do very much, but shows what it's about in general:

version = 1
entry profiler_static:
log =;
profile = eq "calculate-strawberries"
profile |= eq "CustomCode.pm6"

The entry decides which stage of profiling this program is applied to. In this case, the profiler_static means we're seeing a routine for the first time, before it is actually entered. That's why only the information every individual invocation of the frame in question shares is available via the variable sf, which stands for Static Frame. The Static Frame also allows access to the Compilation Unit (cu) that it was compiled into, which lets us find the filename.

The first line that actually does something assigns a value to the special variable log. This will output the name of the routine the program was invoked for.

The next line will turn on profiling only if the name of the routine is "calculate-strawberries". The line after that will also turn on profiling if the filename the routine is from is "CustomCode.pm6".

Apart from profiler_static, there are a couple more entry points available.

The syntax is still subject to change, especially before the whole thing is actually in a released version of MoarVM.

There is a whole lot of other things I could imagine being of interest in the near or far future. One place I'm taking inspiration from is where "extended Berkeley Packet Filter" (eBPF for short) programs are being used in the linux kernel and related pieces of software:

Oversimplifying a bit, BPF was originally meant for tcpdump so that the kernel doesn't have to copy all data over to the userspace process so that the decision what is interesting or not can be made. Instead, the kernel receives a little piece of code in the special BPF language (or bytecode) and can calculate the decision before having to copy anything.

eBPF programs can now also be used as a complex ruleset for sandboxing processes (with "seccomp"), to decide how network packets are to be routed between sockets (that's probably for Software Defined Networks?), what operations a process may perform on a particular device, whether a trace point in the kernel or a user-space program should fire, and so on.

So what's the status of confprog? I've written a parser and compiler that feeds confprog "bytecode" (which is mostly the same as regular moarvm bytecode) to MoarVM. There's also a preliminary validator that ensures the program won't do anything weird, or crash, when run. It is much too lenient at the moment, though. Then there's an interpreter that actually runs the code. It can already take an initial value for the "decision output value" (great name, isn't it) and it will return whatever value the confprog has set when it runs. The heap snapshot profiler is currently the only part of MoarVM that will actually try to run a confprog, and it uses the value to decide whether to take a snapshot or not.

Next up on the list of things to work on:

Apart from improvements to the confprog programming language, the integration with MoarVM lacks almost everything, most importantly installing a confprog for the profiler to decide whether a frame should be profiled (which was the original purpose of this assignment).

After that, and after building a bit of GUI for Comma, the regular grant work can resume: Flame graphs are still not visible on the call graph explorer page, and heap snapshots can't be loaded into moarperf yet, either.

Thanks for sticking with me through this perhaps a little dry and technical post. I hope the next one will have a little more excitement! And if there's interest (which you can signal by sending me a message on irc, or posting on reddit, or reaching me via twitter @loltimo, on the Perl 6 discord server etc) I can also write a post on how exactly the compiler was made, and how you can build your own compiler with Perl 6 code. Until then, you can find Andrew Shitov's presentations about making tiny languages in Perl 6 on youtube.

I hope you have a wonderful day; see you in the next one!
  - Timo

PS: I would like to give a special shout-out to Nadim Khemir for the wonderful Data::Dump::Tree module which made it much easier to see what my parser was doing. Here's some example output from another simple confprog program:

[6] @0
0 = .Label .Node @1
│ ├ $.name = heapsnapshot.Str
│ ├ $.type = entrypoint.Str
│ ├ $.position is rw = Nil
│ └ @.children = [0] @2
1 = .Op .Node @3
│ ├ $.op = =.Str
│ ├ $.type is rw = Nil
│ └ @.children = [2] @4
│   ├ 0 = .Var .Node @5
│   │ ├ $.name = log.Str
│   │ └ $.type = CPType String :stringy  @6
│   └ 1 = String Value ("we're in") @7
2 = .Op .Node @8
│ ├ $.op = =.Str
│ ├ $.type is rw = Nil
│ └ @.children = [2] @9
│   ├ 0 = .Var .Node @10
│   │ ├ $.name = log.Str
│   │ └ $.type = CPType String :stringy  §6
│   └ 1 = .Op .Node @12
│     ├ $.op = getattr.Str
│     ├ $.type is rw = CPType String :stringy  §6
│     └ @.children = [2] @14
│       ├ 0 = .Var .Node @15
│       │ ├ $.name = sf.Str
│       │ └ $.type = CPType MVMStaticFrame  @16
│       └ 1 = name.Str
3 = .Op .Node @17
│ ├ $.op = =.Str
│ ├ $.type is rw = Nil
│ └ @.children = [2] @18
│   ├ 0 = .Var .Node @19
│   │ ├ $.name = log.Str
│   │ └ $.type = CPType String :stringy  §6
│   └ 1 = String Value ("i am the confprog and i say:") @21
4 = .Op .Node @22
│ ├ $.op = =.Str
│ ├ $.type is rw = Nil
│ └ @.children = [2] @23
│   ├ 0 = .Var .Node @24
│   │ ├ $.name = log.Str
│   │ └ $.type = CPType String :stringy  §6
│   └ 1 = String Value ("  no heap snapshots for you my friend!") @26
5 = .Op .Node @27
$.op = =.Str
$.type is rw = Nil
@.children = [2] @28
0 = .Var .Node @29
    │ ├ $.name = snapshot.Str
    │ └ $.type = CPType Int :numeric :stringy  @30
1 = Int Value (0) @31

Notice how it shows the type of most things, like name.Str, as well as cross-references for things that appear multiple times, like the CPType String. Particularly useful is giving your own classes methods that specify exactly how they should be displayed by DDT. Love It! Rakudo Star Release 2019.03

Published on 2019-03-30T00:00:00

brrt to the future: Reverse Linear Scan Allocation is probably a good idea

Published by Bart Wiegmans on 2019-03-21T15:52:00

Hi hackers! Today First of all, I want to thank everybody who gave such useful feedback on my last post.  For instance, I found out that the similarity between the expression JIT IR and the Testarossa Trees IR is quite remarkable, and that they have a fix for the problem that is quite different from what I had in mind.

Today I want to write something about register allocation, however. Register allocation is probably not my favorite problem, on account of being both messy and thankless. It is a messy problem because - aside from being NP-hard to solve optimally - hardware instruction sets and software ABI's introduce all sorts of annoying constraints. And it is a thankless problem because the case in which a good register allocator is useful - for instance, when there's lots of intermediate values used over a long stretch of code - are fairly rare. Much more common are the cases in which either there are trivially sufficient registers, or ABI constraints force a spill to memory anyway (e.g. when calling a function, almost all registers can be overwritten).

So, on account of this being not my favorite problem, and also because I promised to implement optimizations in the register allocator, I've been researching if there is a way to do better. And what better place to look than one of the fastest dynamic language implementations arround, LuaJIT? So that's what I did, and this post is about what I learned from that.

Truth be told, LuaJIT is not at all a learners' codebase (and I don't think it's author would claim this). It uses a rather terse style of C and lots and lots of preprocessor macros. I had somewhat gotten used to the style from hacking dynasm though, so that wasn't so bad. What was more surprising is that some of the steps in code generation that are distinct and separate in the MoarVM JIT - instruction selection, register allocation and emitting bytecode - were all blended together in LuaJIT. Over multiple backend architectures, too. And what's more - all these steps were done in reverse order - from the end of the program (trace) to the beginning. Now that's interesting...

I have no intention of combining all phases of code generation like LuaJIT has. But processing the IR in reverse seems to have some interesting properties. To understand why that is, I'll first have to explain how linear scan allocation currently works in MoarVM, and is most commonly described:

  1. First, the live ranges of program values are computed. Like the name indicates, these represent the range of the program code in which a value is both defined and may be used. Note that for the purpose of register allocation, the notion of a value shifts somewhat. In the expression DAG IR, a value is the result of a single computation. But for the purposes of register allocation, a value includes all its copies, as well as values computed from different conditional branches. This is necessary because when we actually start allocating registers, we need to know when a value is no longer in use (so we can reuse the register) and how long a value will remain in use -
  2. Because a value may be computed from distinct conditional branches, it is necessary to compute the holes in the live ranges. Holes exists because if a value is defined in both sides of a conditional branch, the range will cover both the earlier (in code order) branch and the later branch - but from the start of the later branch to its definition that value doesn't actually exist. We need this information to prevent the register allocator from trying to spill-and-load a nonexistent value, for instance.
  3. Only then can we allocate and assign the actual registers to instructions. Because we might have to spill values to memory, and because values now can have multiple definitions, this is a somewhat subtle problem. Also, we'll have to resolve all architecture specific register requirements in this step.
In the MoarVM register allocator, there's a fourth step and a fifth step. The fourth step exists to ensure that instructions conform to x86 two-operand form (Rather than return the result of an instruction in a third register, x86 reuses one of the input registers as the output register. E.g. all operators are of the form a = op(a, b)  rather than a = op(b, c). This saves on instruction encoding space). The fifth step inserts instructions that are introduced by the third step; this is done so that each instruction has a fixed address in the stream while the stream is being processed.

Altogether this is quite a bit of complexity and work, even for what is arguably the simplest correct global register allocation algorithm. So when I started thinking of the reverse linear scan algorithm employed by LuaJIT, the advantages became clear:
There are downsides as well, of course. Not knowing exactly how long a value will be live while processing it may cause the algorithm to make worse choices in which values to spill. But I don't think that's really a great concern, since figuring out the best possible value is practically impossible anyway, and the most commonly cited heuristic - evict the value that is live furthest in the future, because this will release a register over a longer range of code, reducing the chance that we'll need to evict again - is still available. (After all, we do always know the last use, even if we don't necessarily know the first definition).

Altogether, I'm quite excited about this algorithm; I think it will be a real simplification over the current implementation. Whether that will work out remains to be seen of course. I'll let you know!

brrt to the future: Something about IR optimization

Published by Bart Wiegmans on 2019-03-17T06:23:00

Hi hackers! Today I want to write about optimizing IR in the MoarVM JIT, and also a little bit about IR design itself.

One of the (major) design goals for the expression JIT was to have the ability to optimize code over the boundaries of individual MoarVM instructions. To enable this, the expression JIT first expands each VM instruction into a graph of lower-level operators. Optimization then means pattern-matching those graphs and replacing them with more efficient expressions.

As a running example, consider the idx operator. This operator takes two inputs (base and element) and a constant parameter scale and computes base+element*scale. This represents one of the operands of an  'indexed load' instruction on x86, typically used to process arrays. Such instructions allow one instruction to be used for what would otherwise be two operations (computing an address and loading a value). However, if the element of the idx operator is a constant, we can replace it instead with the addr instruction, which just adds a constant to a pointer. This is an improvement over idx because we no longer need to load the value of element into a register. This saves both an instruction and valuable register space.

Unfortunately this optimization introduces a bug. (Or, depending on your point of view, brings an existing bug out into the open). The expression JIT code generation process selects instructions for subtrees (tile) of the graph in a bottom-up fashion. These instructions represent the value computed or work performed by that subgraph. (For instance, a tree like (load (addr ? 8) 8) becomes mov ?, qword [?+8]; the question marks are filled in during register allocation). Because an instruction is always represents a tree, and because the graph is an arbitrary directed acyclic graph, the code generator projects that graph as a tree by visiting each operator node only once. So each value is computed once, and that computed value is reused by all later references.

It is worth going into some detail into why the expression graph is not a tree. Aside from transformations that might be introduced by optimizations (e.g. common subexpression elimination), a template may introduce a value that has multiple references via the let: pseudo-operator. See for instance the following (simplified) template:

(let: (($foo (load (local))))
    (add $foo (sub $foo (const 1))))

Both ADD and SUB refer to the same LOAD node

In this case, both references to $foo point directly to the same load operator. Thus, the graph is not a tree. Another case in which this occurs is during linking of templates into the graph. The output of an instruction is used, if possible, directly as the input for another instruction. (This is the primary way that the expression JIT can get rid of unnecessary memory operations). But there can be multiple instructions that use a value, in which case an operator can have multiple references. Finally, instruction operands are inserted by the compiler and these can have multiple references as well.

If each operator is visited only once during code generation, then this may introduce a problem when combined with another feature - conditional expressions. For instance, if two branches of a conditional expression both refer to the same value (represented by name $foo) then the code generator will only emit code to compute its value when it encounters the first reference. When the code generator encounters $foo for the second time in the other branch, no code will be emitted. This means that in the second branch, $foo will effectively have no defined value (because the code in the first branch is never executed), and wrong values or memory corruption is then the predictable result.

This bug has always existed for as long as the expression JIT has been under development, and in the past the solution has been not to write templates which have this problem. This is made a little easier by a feature the let: operator, in that it inserts a do operator which orders the values that are declared to be computed before the code that references them. So that this is in fact non-buggy:

(let: (($foo (load (local))) # code to compute $foo is emitted here
  (if (...)  
    (add $foo (const 1)) # $foo is just a reference
    (sub $foo (const 2)) # and here as well

The DO node is inserted for the LET operator. It ensures that the value of the LOAD node is computed before the reference in either branch

Alternatively, if a value $foo is used in the condition of the if operator, you can also be sure that it is available in both sides of the condition.

All these methods rely on the programmer being able to predict when a value will be first referenced and hence evaluated. An optimizer breaks this by design. This means that if I want the JIT optimizer to be successful, my options are:

  1. Fix the optimizer so as to not remove references that are critical for the correctness of the program
  2. Modify the input tree so that such references are either copied or moved forward
  3. Fix the code generator to emit code for a value, if it determines that an earlier reference is not available from the current block.
In other words, I first need to decide where this bug really belongs - in the optimizer, the code generator, or even the IR structure itself. The weakness of the expression IR is that expressions don't really impose a particular order. (This is unlike the spesh IR, which is instruction-based, and in which every instruction has a 'previous' and 'next' pointer). Thus, there really isn't a 'first' reference to a value, before the code generator introduces the concept. This is property is in fact quite handy for optimization (for instance, we can evaluate operands in whatever order is best, rather than being fixed by the input order) - so I'd really like to preserve it. But it also means that the property we're interested in - a value is computed before it is used in, in all possible code flow paths - isn't really expressible by the IR. And there is no obvious local invariant that can be maintained to ensure that this bug does not happen, so any correctness check may have to check the entire graph, which is quite impractical.

I hope this post explains why this is such a tricky problem! I have some ideas for how to get out of this, but I'll reserve those for a later post, since this one has gotten quite long enough. Until next time!

brrt to the future: A short post about types and polymorphism

Published by Bart Wiegmans on 2019-01-14T13:34:00

Hi all. I usually write somewhat long-winded posts, but today I'm going to try and make an exception. Today I want to talk about the expression template language used to map the high-level MoarVM instructions to low-level constructs that the JIT compiler can easily work with:

This 'language' was designed back in 2015 subject to three constraints:
Recently I've been working on adding support for floating point operations, and  this means working on the type system of the expression language. Because floating point instructions operate on a distinct set of registers from integer instructions, a floating point operator is not interchangeable with an integer (or pointer) operator.

This type system is enforced in two ways. First, by the template compiler, which attempts to check if you've used all operands correctly. This operates during development, which is convenient. Second, by instruction selection, as there will simply not be any instructions available that have the wrong combinations of types. Unfortunately, that happens at runtime, and such errors so annoying to debug that it motivated the development of the first type checker.

However, this presents two problems. One of the advantages of the expression IR is that, by virtue of having a small number of operators, it is fairly easy to analyze. Having a distinct set of operators for each type would undo that. But more importantly, there are several MoarVM instructions that are generic, i.e. that operate on integer, floating point, and pointer values. (For example, the set, getlex and bindlex instructions are generic in this way). This makes it impossible to know whether its values will be integers, pointers, or floats.

This is no problem for the interpreter since it can treat values as bags-of-bits (i.e., it can simply copy the union MVMRegister type that holds all values of all supported types). But the expression JIT works differently - it assumes that it can place any value in a register, and that it can reorder and potentially skip storing them to memory. (This saves work when the value would soon be overwritten anyway). So we need to know what register class that is, and we need to have the correct operators to manipulate a value in the right register class.

To summarize, the problem is:
There are two ways around this, and I chose to use both. First, we know as a fact for each local or lexical value in a MoarVM frame (subroutine) what type it should have. So even a generic operator like set can be resolved to a specific type at runtime, at which point we can select the correct operators. Second, we can introduce generic operators of our own. This is possible so long as we can select the correct instruction for an operator based on the types of the operands.

For instance, the store operator takes two operands, an address and a value. Depending on the type of the value (reg or num), we can always select the correct instruction (mov or movsd). It is however not possible to select different instructions for the load operator based on the type required, because instruction selection works from the bottom up. So we need a special load_num operator, but a store_num operator is not necessary. And this is true for a lot more operators than I had initially thought. For instance, aside from the (naturally generic) do and if operators, all arithmetic operators and comparison operators are generic in this way.

I realize that, despite my best efforts, this has become a rather long-winded post anyway.....

Anyway. For the next week, I'll be taking a slight detour, and I aim to generalize the two-operand form conversion that is necessary on x86. I'll try to write a blog about it as well, and maybe it'll be short and to the point. See you later!

brrt to the future: New years post

Published by Bart Wiegmans on 2019-01-06T13:15:00

Hi everybody! I recently read jnthns Perl 6 new years resolutions post, and I realized that this was an excellent example to emulate. So here I will attempt to share what I've been doing in 2018 and what I'll be doing in 2019.

In 2018, aside from the usual refactoring, bugfixing and the like:
So 2019 starts with me trying to complete the goals specified in that grant request. I've already partially completed one goal (as explained in the interim report) - ensuring that register encoding works correctly for SSE registers in DynASM. Next up is actually ensuring support for SSE (and floating point) registers in the JIT, which is surprisingly tricky, because it means introducing a type system where there wasn't really one previously. I will have more to report on that in the near future.

After that, the first thing on my list is the support for irregular register requirements in the register allocator, which should open up the possibility of supporting various instructions.

I guess that's all for now. Speak you later!

6guts: My Perl 6 wishes for 2019

Published by jnthnwrthngtn on 2019-01-02T01:35:51

This evening, I enjoyed the New Year’s fireworks display over the beautiful Prague skyline. Well, the bit of it that was between me and the fireworks, anyway. Rather than having its fireworks display at midnight, Prague puts it at 6pm on New Year’s Day. That makes it easy for families to go to, which is rather thoughtful. It’s also, for those of us with plans to dig back into work the following day, a nice end to the festive break.

Prague fireworks over Narodni Divadlo

So, tomorrow I’ll be digging back into work, which of late has involved a lot of Perl 6. Having spent so many years working on Perl 6 compiler and runtime design and implementation, it’s been fun to spend a good amount of 2018 using Perl 6 for commercial projects. I’m hopeful that will continue into 2019. Of course, I’ll be continuing to work on plenty of Perl 6 things that are for the good of all Perl 6 users too. In this post, I’d like to share some of the things I’m hoping to work on or achieve during 2019.

Partial Escape Analysis and related optimizations in MoarVM

The MoarVM specializer learned plenty of new tricks this year, delivering some nice speedups for many Perl 6 programs. Many of my performance improvement hopes for 2019 center around escape analysis and optimizations stemming from it.

The idea is to analyze object allocations, and find pieces of the program where we can fully understand all of the references that exist to the object. The points at which we can cease to do that is where an object escapes. In the best cases, an object never escapes; in other cases, there are a number of reads and writes performed to its attributes up until its escape.

Armed with this, we can perform scalar replacement, which involves placing the attributes of the object into local registers up until the escape point, if any. As well as reducing memory operations, this means we can often prove significantly more program properties, allowing further optimization (such as getting rid of dynamic type checks). In some cases, we might never need to allocate the object at all; this should be a big win for Perl 6, which by its design creates lots of short-lived objects.

There will be various code-generation and static optimizer improvements to be done in Rakudo in support of this work also, which should result in its own set of speedups.

Expect to hear plenty about this in my posts here in the year ahead.

Decreasing startup time and base memory use

The current Rakudo startup time is quite high. I’d really like to see it fall to around half of what it currently is during 2019. I’ve got some concrete ideas on how that can be achieved, including changing the way we store and deserialize NFAs used by the parser, and perhaps also dealing with the way we currently handle method caches to have less startup impact.

Both of these should also decrease the base memory use, which is also a good bit higher than I wish.

Improving compilation times

Some folks – myself included – are developing increasingly large applications in Perl 6. For the current major project I’m working on, runtime performance is not an issue by now, but I certainly feel myself waiting a bit on compiles. Part of it is parse performance, and I’d like to look at that; in doing so, I’d also be able to speed up handling of all Perl 6 grammars.

I suspect there are some good wins to be had elsewhere in the compilation pipeline too, and the startup time improvements described above should also help, especially when we pre-compile deep dependency trees. I’d also like to look into if we can do some speculative parallel compilation.

Research into concurrency safety

In Perl 6.d, we got non-blocking await and react support, which greatly improved the scalability of Perl 6 concurrent and parallel programs. Now many thousands of outstanding tasks can be juggled across just a handful of threads (the exact number chosen according to demand and CPU count).

For Perl 6.e, which is still a good way off, I’d like to having something to offer in terms of making Perl 6 concurrent and parallel programming safer. While we have a number of higher-level constructs that eliminate various ways to make mistakes, it’s still possible to get into trouble and have races when using them.

So, I plan to spend some time this year quietly exploring and prototyping in this space. Obviously, I want something that fits in with the Perl 6 language design, and that catches real and interesting bugs – probably by making things that are liable to occasionally explode in weird ways instead reliably do so in helpful ways, such that they show up reliably in tests.

Get Cro to its 1.0 release

In the 16 months since I revealed it, Cro has become a popular choice for implementing HTTP APIs and web applications in Perl 6. It has also attracted code contributions from a couple of dozen contributors. This year, I aim to see Cro through to its 1.0 release. That will include (to save you following the roadmap link):

Comma Community, and lots of improvements and features

I founded Comma IDE in order to bring Perl 6 a powerful Integrated Development Environment. We’ve come a long way since the Minimum Viable Product we shipped back in June to the first subscribers to the Comma Supporter Program. In recent months, I’ve used Comma almost daily on my various Perl 6 projects, and by this point honestly wouldn’t want to be without it. Like Cro, I built Comma because it’s a product I wanted to use, which I think is a good place to be in when building any product.

In a few months time, we expect to start offering Comma Community and Comma Complete. The former will be free of charge, and the latter a commercial offering under a subscribe-for-updates model (just like how the supporter program has worked so far). My own Comma wishlist is lengthy enough to keep us busy for a lot more than the next year, and that’s before considering things Comma users are asking for. Expect plenty of exciting new features, as well as ongoing tweaks to make each release feel that little bit nicer to use.

Speaking, conferences, workshops, etc.

This year will see me giving my first keynote at a European Perl Conference. I’m looking forward to being in Riga again; it’s a lovely city to wander around, and I remember having some pretty nice food there too. The keynote will focus on the concurrent and parallel aspects of Perl 6; thankfully, I’ve still a good six months to figure out exactly what angle I wish to take on that, having spoken on the topic many times before!

I also plan to submit a talk or two for the German Perl Workshop, and will probably find the Swiss Perl Workshop hard to resist attending once more. And, more locally, I’d like to try and track down other Perl folks here in Prague, and see if I can help some kind of to happen again.

I need to keep my travel down to sensible levels, but might be able to fit in the odd other bit of speaking during the year, if it’s not too far away.


While I want to spend most of my time building stuff rather than talking about it, I’m up for the occasional bit of teaching. I’m considering pitching a 1-day Perl 6 concurrency workshop to the Riga organizers. Then we’ll see if there’s enough folks interested in taking it. It’ll cost something, but probably much less than any other way of getting a day of teaching from me. :-)

So, down to work!

Well, a good night’s sleep first. :-) But tomorrow, another year of fun begins. I’m looking forward to it, and to working alongside many wonderful folks in the Perl community. Rakudo Star Release 2018.10

Published on 2018-11-11T00:00:00 Perl 6 Coding Contest 2019: Seeking Task Makers

Published by Moritz Lenz on 2018-11-10T23:00:01

I want to revive Carl Mäsak's Coding Contest as a crowd-sourced contest.

The contest will be in four phases:

For the first phase, development of tasks, I am looking for volunteers who come up with coding tasks collaboratively. Sadly, these volunteers, including myself, will be excluded from participating in the second phase.

I am looking for tasks that ...

This is non-trivial, so I'd like to have others to discuss things with, and to come up with some more tasks.

If you want to help with task creation, please send an email to, stating your intentions to help, and your freenode IRC handle (optional).

There are other ways to help too:

In these cases you can use the same email address to contact me, or use IRC (moritz on freenode) or twitter.

Zoffix Znet: Perl 6 Advent Calendar 2018 Call for Authors

Published on 2018-10-31T00:00:00

Write a blog post about Perl 6

Zoffix Znet: A Request to Larry Wall to Create a Language Name Alias for Perl 6

Published on 2018-10-07T00:00:00

The culmination of the naming discussion

6guts: Speeding up object creation

Published by jnthnwrthngtn on 2018-10-06T22:59:11

Recently, a Perl 6 object creation benchmark result did the rounds on social media. This Perl 6 code:

class Point {
    has $.x;
    has $.y;
my $total = 0;
for ^1_000_000 {
    my $p = => 2, y => 3);
    $total = $total + $p.x + $p.y;
say $total;

Now (on HEAD builds of Rakudo and MoarVM) runs faster than this roughly equivalent Perl 5 code:

use v5.10;

package Point;

sub new {
    my ($class, %args) = @_;
    bless \%args, $class;

sub x {
    my $self = shift;

sub y {
    my $self = shift;

package main;

my $total = 0;
for (1..1_000_000) {
    my $p = Point->new(x => 2, y => 3);
    $total = $total + $p->x + $p->y;
say $total;

(Aside: yes, I know there’s no shortage of libraries in Perl 5 that make OO nicer than this, though those I tried also made it slower.)

This is a fairly encouraging result: object creation, method calls, and attribute access are the operational bread and butter of OO programming, so it’s a pleasing sign that Perl 6 on Rakudo/MoarVM is getting increasingly speedy at them. In fact, it’s probably a bit better at them than this benchmark’s raw numbers show, since:

While dealing with Int values got faster recently, it’s still really making two Int objects every time around that loop and having to GC them later. An upcoming new set of analyses and optimizations will let us get rid of that cost too. And yes, startup will get faster with time also. In summary, while Rakudo/MoarVM is now winning that benchmark against Perl 5, there’s still lots more to be had. Which is a good job, since the equivalent Python and Ruby versions of that benchmark still run faster than the Perl 6 one.

In this post, I’ll look at the changes that allowed this benchmark to end up faster. None of the new work was particularly ground-breaking; in fact, it was mostly a case of doing small things to make better use of optimizations we already had.

What happens during construction?

Theoretically, the default new method in turn calls bless, passing the named arguments along. The bless method then creates an object instance, followed by calling BUILDALL. The BUILDALL method goes through the set of steps needed for constructing the object. In the case of a simple object like ours, that involves checking if an x and y named argument were passed, and if so assigning those values into the Scalar containers of the object attributes.

For those keeping count, that’s at least 3 method calls (newbless, and BUILDALL).

However, there’s a cheat. If bless wasn’t overridden (which would be an odd thing to do anyway), then the default new could just call BUILDALL directly anyway. Therefore, new looks like this:

multi method new(*%attrinit) {
        (my $bless := nqp::findmethod(self,'bless')),
      nqp::create(self).BUILDALL(Empty, %attrinit),

The BUILDALL method was originally a little “interpreter” that went through a per-object build plan stating what needs to be done. However, for quite some time now we’ve instead compiled a per-class BUILDPLAN method.

Slurpy sadness

To figure out how to speed this up, I took a look at how the specializer was handling the code. The answer: not so well. There were certainly some positive moments in there. Of note:

However, the new method was getting only a “certain” specialization, which is usually only done for very heavily polymorphic code. That wasn’t the case here; this program clearly constructs overwhelmingly one type of object. So what was going on?

In order to produce an “observed type” specialization – the more powerful kind – it needs to have data on the types of all of the passed arguments. And for the named arguments, that was missing. But why?

Logging of passed argument types is done on the callee side, since:

The argument logging was done as the interpreter executed each parameter processing instruction. However, when there’s a slurpy, then it would just swallow up all the remaining arguments without logging type information. Thus the information about the argument types was missing and we ended up with a less powerful form of specialization.

Teaching the slurpy handling code about argument logging felt a bit involved, but then I realized there was a far simpler solution: log all of the things in the argument buffer at the point an unspecialized frame is being invoked. We’re already logging the entry to the call frame at that point anyway. This meant that all of the parameter handling instructions got a bit simpler too, since they no longer had logging to do.

Conditional elimination

Having new being specialized in a more significant way was an immediate win. Of note, this part:

        (my $bless := nqp::findmethod(self,'bless')),

Was quite interesting. Since we were now specializing on the type of self, then the findmethod could be resolved into a constant. The resolution of a method on the constant Mu was also a constant. Therefore, the result of the eqaddr (same memory address) comparison of two constants should also have been turned into a constant…except that wasn’t happening! This time, it was simple: we just hadn’t implemented folding of that yet. So, that was an easy win, and once done meant the optimizer could see that the if would always go a certain way and thus optimize the whole thing into the chosen branch. Thus the new method was specialized into something like:

multi method new(*%attrinit) {
    nqp::create(self).BUILDALL(Empty, %attrinit),

Further, the create could be optimized into a “fast create” op, and the relatively small BUILDALL method could be inlined into new. Not bad.

Generating simpler code

At this point, things were much better, but still not quite there. I took a look at the BUILDALL method compilation, and realized that it could emit faster code.

The %attrinit is a Perl 6 Hash object, which is for the most part a wrapper around the lower-level VM hash object, which is the actual hash storage. We were, curiously, already pulling this lower-level hash out of the Hash object and using the nqp::existskey primitive to check if there was a value, but were not doing that for actually accessing the value. Instead, an .AT-KEY('x') method call was being done. While that was being handled fairly well – inlined and so forth – it also does its own bit of existence checking. I realized we could just use the nqp::atkey primitive here instead.

Later on, I also realized that we could do away with nqp::existskey and just use nqp::atkey. Since a VM-level null is something that never exists in Perl 6, we can safely use it as a sentinel to mean that there is no value. That got us down to a single hash lookup.

By this point, we were just about winning the benchmark, but only by a few percent. Were there any more easy pickings?

An off-by-one

My next surprise was that the new method didn’t get inlined. It was within the size limit. There was nothing preventing it. What was going on?

Looking closer, it was even worse than that. Normally, when something is too big to be inlined, but we can work out what specialization will be called on the target, we do “specialization linking”, indicating which specialization of the code to call. That wasn’t happening. But why?

Some debugging later, I sheepishly fixed an off-by-one in the code that looks through a multi-dispatch cache to see if a particular candidate matches the set of argument types we have during optimization of a call instruction. This probably increased the performance of quite a few calls involving passing named arguments to multi-methods – and meant new was also inlined, putting us a good bit ahead on this benchmark.

What next?

The next round of performance improvements – both to this code and plenty more besides – will come from escape analysis, scalar replacement, and related optimizations. I’ve already started on that work, though expect it will keep me busy for quite a while. I will, however, be able to deliver it in stages, and am optimistic I’ll have the first stage of it ready to talk about – and maybe even merge – in a week or so.

brrt to the future: A future for fork(2)

Published by Bart Wiegmans on 2018-10-03T22:18:00

Hi hackers. Today I want to write about a new functionality that I've been developing for MoarVM that has very little to do with the JIT compiler. But it is still about VM internals so I guess it will fit.

Many months ago, jnthn wrote a blog post on the relation between perl 5 and perl 6. And as a longtime and enthusiastic perl 5 user - most of the JIT's compile time support software is written in perl 5 for a reason - I wholeheartedly agree with the 'sister language' narrative. There is plenty of room for all sorts of perls yet, I hope. Yet one thing kept itching me:
Moreover, it’s very much the case that Perl 5 and Perl 6 make different trade-offs. To pick one concrete example, Perl 6 makes it easy to run code across multiple threads, and even uses multiple threads internally (for example, performing optimization and JIT compilation on a background thread). Which is great…except the only winning move in a game involving both threads and fork() is not to play. Sometimes one just can’t have their cake and eat it, and if you’re wanting a language that more directly gives you your POSIX, that’s probably always going to be a strength of Perl 5 over Perl 6.
(Emphasis mine). You see, I had never realized that MoarVM couldn't implement fork(), but it's true. In POSIX systems, a fork()'d child process inherits the full memory space, as-is, from its parent process. But it inherits only the forking thread. This means that any operations performed by any other thread, including operations that might need to be protected by a mutex (e.g. malloc()), will be interrupted and unfinished (in the child process). This can be a problem. Or, in the words of the linux manual page on the subject:

       *  The child process is created with a single thread—the one that
called fork(). The entire virtual address space of the parent is
replicated in the child, including the states of mutexes,
condition variables, and other pthreads objects; the use of
pthread_atfork(3) may be helpful for dealing with problems that
this can cause.

* After a fork() in a multithreaded program, the child can safely
call only async-signal-safe functions (see signal-safety(7)) until
such time as it calls execve(2).

Note that the set of signal-safe functions is not that large, and excludes all memory management functions. As noted by jnthn, MoarVM is inherently multithreaded from startup, and will happily spawn as many threads as needed by the program. It also uses malloc() and friends rather a lot. So it would seem that perl 6 cannot implement POSIX fork() (in which both parent and child program continue from the same position in the program) as well as perl 5 can.

I was disappointed. As a longtime (and enthusiastic) perl 5 user, fork() is one of my favorite concurrency tools. Its best feature is that parent and child processes are isolated by the operating system, so each can modify its own internal state without concern for concurrency,. This can make it practical to introduce concurrency after development, rather than designing it in from the start. Either process can crash while the other continues. It is also possible (and common) to run a child process with reduced privileges relative to the parent process, which isn't possible with threads. And it is possible to prepare arbitrarily complex state for the child process, unlike spawn() and similar calls.

Fortunately all hope isn't necessarily lost. The restrictions listed above only apply if there are multiple threads active at the moment that fork() is executed. That means that if we can stop all threads (except for the one planning to fork) before actually calling fork(), then the child process can continue safely. That is well within the realm of possibility, at least as far as system threads are concerned.

The process itself isn't very exciting to talk about, actually - it involves sending stop signals to system threads, waiting for these threads to join, verifying that the forking thread is the really only active thread, and restarting threads after fork(). Because of locking, it is a bit subtle (tbere may be another user thread that is also trying to fork), but not actually very hard. When I finally merged the code, it turned out that an ancient (and forgotten) thread list modification function was corrupting the list by not being aware of generational garbage collection... oops. But that was simple enough to fix (thanks to timotimo++ for the hint).

And now the following oneliner should work on platforms that support fork():(using development versions of MoarVM and Rakudo):

perl6 -e 'use nqp; my $i = nqp::fork(); say $i;'

The main limitation of this work is that it won't work if there are any user threads active. (If there are any, nqp::fork() will throw an exception). The reason why is simple: while it is possible to adapt the system threads so that I can stop them on demand, user threads may be doing arbitrary work, hold arbitrary locks and may be blocked (possibly indefinitely) on a system call. So jnthn's comment at the start of this post still applies - threads and fork() don't work together.

In fact, many programs may be better off with threads. But I think that languages in the perl family should let the user make that decision, rather than the VM. So I hope that this will find some use somewhere. If not, it was certainly fun to figure out. Happy hacking!

PS: For the curious, I think there may in fact be a way to make fork() work under a multithreaded program, and it relies on the fact that MoarVM has a multithreaded garbage collector. Basically, stopping all threads before calling fork() is not so different from stopping the world during the final phase of garbage collection. And so - in theory - it would be possible to hijack the synchronization mechanism of the garbage collector to pause all threads. During interpretation, and in JIT compiled code, each thread periodically checks if garbage collection has started. If it has, it will synchronize with the thread that started GC in order to share the work. Threads that are currently blocked (on a system call, or on acquiring a lock) mark themselves as such, and when they are resumed always check the GC status. So it is in fact possible to force MoarVM into a single active thread even with multiple user threads active. However, that still leaves cleanup to deal with, after returning from fork() in the child process. Also, this will not work if a thread is blocked on NativeCall. Altogether I think abusing the GC in order to enable a fork() may be a bit over the edge of insanity :-)

6guts: Eliminating unrequired guards

Published by jnthnwrthngtn on 2018-09-29T19:59:28

MoarVM’s optimizer can perform speculative optimization. It does this by gathering statistics as the program is interpreted, and then analyzing them to find out what types and callees typically show up at given points in the program. If it spots there is at least a 99% chance of a particular type showing up at a particular program point, then it will optimize the code ahead of that point as if that type would always show up.

Of course, statistics aren’t proofs. What about the 1% case? To handle this, a guard instruction is inserted. This cheaply checks if the type is the expected one, and if it isn’t, falls back to the interpreter. This process is known as deoptimization.

Just how cheap are guards?

I just stated that a guard cheaply checks if the type is the expected one, but just how cheap is it really? There’s both direct and indirect costs.

The direct cost is that of the check. Here’s a (slightly simplified) version of the JIT compiler code that produces the machine code for a type guard.

/* Load object that we should guard */
| mov TMP1, WORK[obj];
/* Get type table we expect and compare it with the object's one */
MVMint16 spesh_idx = guard->ins->operands[2].lit_i16;
| get_spesh_slot TMP2, spesh_idx;
| cmp TMP2, OBJECT:TMP1->st;
| jne >1;
/* We're good, no need to deopt */
| jmp >2;
/* Call deoptimization handler */
| mov ARG1, TC;
| mov ARG2, guard->deopt_offset;
| mov ARG3, guard->deopt_target;
| callp &MVM_spesh_deopt_one_direct;
/* Jump out to the interpreter */
| jmp ->exit;

Where get_spesh_slot is a macro like this:

|.macro get_spesh_slot, reg, idx;
| mov reg, TC->cur_frame;
| mov reg, FRAME:reg->effective_spesh_slots;
| mov reg, OBJECTPTR:reg[idx];

So, in the case that the guard matches, it’s 7 machine instructions (note: it’s actually a couple more because of something I omitted for simplicity). Thus there’s the cost of the time to execute them, plus the space they take in memory and, especially, the instruction cache. Further, one is a conditional jump. We’d expect it to be false most of the time, and so the CPU’s branch predictor should get a good hit rate – but branch predictor usage isn’t entirely free of charge either. Effectively, it’s not that bad, but it’s nice to save the cost if we can.

The indirect costs are much harder to quantify. In order to deoptimize, we need to have enough state to recreate the world as the interpreter expects it to be. I wrote on this topic not so long ago, for those who want to dive into the detail, but the essence of the problem is that we may have to retain some instructions and/or forgo some optimizations so that we are able to successfully deoptimize if needed. Thus, the presence of a guard constrains what optimizations we can perform in the code around it.

Representation problems

A guard instruction in MoarVM originally looked like:

sp_guard r(obj) sslot uint32

Where r(obj) is an object register to read containing the object to guard, the sslot is a spesh slot (an entry in a per-block constant table) containing the type we expect to see, and the uint32 indicates the target address after we deoptimize. Guards are inserted after instructions for which we had gathered statistics and determined there was a stable type. Things guarded include return values after a call, reads of object attributes, and reads of lexical variables.

This design has carried us a long way, however it has a major shortcoming. The program is represented in SSA form. Thus, an invoke followed by a guard might look something like:

invoke r6(5), r4(2)
sp_guard r6(5), sslot(42), litui32(64)

Where r6(5) has the return value written into it (and thus is a new SSA version of r6). We hold facts about a value (if it has a known type, if it has a known value, etc.) per SSA version. So the facts about r6(5) would be that it has a known type – the one that is asserted by the guard.

The invoke itself, however, might be optimized by performing inlining of the callee. In some cases, we might then know the type of result that the inlinee produces – either because there was a guard inside of the inlined code, or because we can actually prove the return type! However, since the facts about r6(5) were those produced by the guard, there was no way to talk about what we know of r6(5) before the guard and after the guard.

More awkwardly, while in the early days of the specializer we only ever put guards immediately after the instructions that read values, more recent additions might insert them at a distance (for example, in speculative call optimizations and around spesh plugins). In this case, we could not safely set facts on the guarded register, because those might lead to wrong optimizations being done prior to the guard.

Changing of the guards

Now a guard instruction looks like this:

sp_guard w(obj) r(obj) sslot uint32

Or, concretely:

invoke r6(5), r4(2)
sp_guard r6(6), r6(5), sslot(42), litui32(64)

That is to say, it introduces a new SSA version. This means that we get a way to talk about the value both before and after the guard instruction. Thus, if we perform an inlining and we know exactly what type it will return, then that type information will flow into the input – in our example, r6(5) – of the guard instruction. We can then notice that the property the guard wants to assert is already upheld, and replace the guard with a simple set (which may itself be eliminated by later optimizations).

This also solves the problem with guards inserted away from the original write of the value: we get a new SSA version beyond the guard point. This in turn leads to more opportunities to avoid repeated guards beyond that point.

Quite a lot of return value guards on common operations simply go away thanks to these changes. For example, in $a + $b, where $a and $b are Int, we will be able to inline the + operator, and we can statically see from its code that it will produce an Int. Thus, the guard on the return type in the caller of the operator can be eliminated. This saves the instructions associated with the guard, and potentially allows for further optimizations to take place since we know we’ll never deoptimize at that point.

In summary

MoarVM does lots of speculative optimization. This enables us to optimize in cases where we can’t prove a property of the program, but statistics tell us that it mostly behaves in a certain way. We make this safe by adding guards, and falling back to the general version of the code in cases where they fail.

However, guards have a cost. By changing our representation of them, so that we model the data coming into the guard and after the guard as two different SSA versions, we are able to eliminate many guard instructions. This not only reduces duplicate guards, but also allows for elimination of guards when the broader view afforded by inlining lets us prove properties that we weren’t previously able to.

In fact, upcoming work on escape analysis and scalar replacement will allow us to start seeing into currently opaque structures, such as Scalar containers. When we are able to do that, then we’ll be able to prove further program properties, leading to the elimination of yet more guards. Thus, this work is not only immediately useful, but also will help us better exploit upcoming optimizations.