Planet Raku

Raku RSS Feeds

Roman Baumer (Freenode: rba #raku or ##raku-infra) / 2022-12-08T13:21:26


Raku Advent Calendar: Day 8: I’ll Let You Know Later

Published by Jonathan Stowe on 2022-12-08T01:02:00

Back when the web was young the only way that you could know whether a resource had changed its state was to manually re-request the page, this wasn’t really too much of a problem when there was only static pages which didn’t change all that often. Then along came server-side applications, the CGI and the like, these could change their state more frequently in ways that you might be interested in, but in effect you were still stuck with some variation of refreshing the page (albeit possibly initiated by the browser under the instruction of some tag in the page), so if, say, you had an application that kicked off a long running background task it might redirect you to another page that checked the status of the job that would periodically refresh itself then redirect to the results when the task was complete, (in fact I know of at least one reasonably well known reporting application that does just this still in 2022.)

Then sometime around the turn of century things started to get a lot more interactive with the introduction of the XMLHttpRequest API which allowed a script in a web page to make requests to the server and, based on the response, update the view appropriately, thus making it possible for a web page to reflect a change in state in the server without any refreshing ( though still with some polling of the server in the background by the client-side script.) Then along came the WebSocket API which provides for bi-directional communication between the client and server, and Server-Sent Events which provides for server push of events (with associated data.) These technologies provide means to reflect changes in an application state in a web page without needing a page refresh.

Here I’m going to describe a way of implementing client side notifications from a Raku web application using Server-sent Events.

Server-sent Events

The Server-sent Events provide a server to client push mechanism implemented using a persistent but otherwise standard HTTP connection with Chunked transfer encoding and typically a Content-Type of text/event-stream. The client-side API is EventSource and is supported by most modern browsers, there are also client libraries ( including EventSource::Client,) allowing non-web applications to consume an event stream (but that will be for another time.)

On the server side I have implemented the EventSource::Server; while the examples here are using Cro it could be used with any HTTP server framework that will accept a Supply as the response data and emit chunked data to the client until Supply is done.

Conceptually the EventSource::Server is very simple: it takes a Supply of events and transforms them into properly formatted EventSource events which can be transmitted to the client in a stream of chunked data.

The Client side part

This is the index.html that will be served as static content from our server, it’s about the simplest I could come up with (using jQuery and Bootstrap for simplicity.) Essentially it’s a button that will make a request to the server, a space to put our “notifications” and the Javascript to consume the events from the server and display the notifications.

I don’t consider client side stuff as one of my core competencies, so forgive me for this.

<!DOCTYPE html>
<html lang="en">
 <head>
  <meta charset="utf-8">
  <meta name="viewport" content="width=device-width, initial-scale=1">
  <title>Bootstrap 101 Template</title>
  <link rel="stylesheet" href="https://cdn.jsdelivr.net/npm/bootstrap@3.3.7/dist/css/bootstrap.min.css">
  <link rel="stylesheet" href="https://cdn.jsdelivr.net/npm/bootstrap@3.3.7/dist/css/bootstrap-theme.min.css">
 </head>
 <body>
  <main role="main" class="container-fluid">
   <div class="row">
    <div class="col"></div>
    <div class="col-8 text-center">
     <a href="button-pressed" class="btn btn-danger btn-lg active" role="button" aria-pressed="true">Press Me!</a>
    </div>
    <div class="col" id="notification-holder"></div>
   </div>
  </main>
  <script src="https://ajax.googleapis.com/ajax/libs/jquery/3.2.1/jquery.min.js"></script>
  <script src="https://cdn.jsdelivr.net/npm/bootstrap@3.3.7/dist/js/bootstrap.min.js"></script>
  <script>
    var sse;
    function createNotification(message, type) {
      var html = '<div class="shadow bg-body rounded alert alert-' + type + ' alert-dismissable page-alert">';
      html += '<button type="button" data-dismiss="alert" class="close"><span aria-hidden="true">×</span><span class="sr-only">Close</span></button>';
      html += message;
      html += '</div>';
      $(html).hide().prependTo('#notification-holder').slideDown();
    };
    function notificationHandler(e) {
      const message = JSON.parse(event.data);
      createNotification(message.message, message.type);
    };
    function setupNotifications() {
      if ( sse ) {
        sse.removeEventListener("notification", notificationHandler);
        sse.close;
      }
 	 
      sse = new EventSource('/notifications');
      sse.addEventListener("notification", notificationHandler );
      $('.page-alert .close').click(function(e) {
        e.preventDefault();
        $(this).closest('.page-alert').slideUp();
      });
      return sse
    };
    setupNotifications();
  </script>
 </body>
</html>

Essentially the Javascript sets up the EventSource client to consume the events we will publish on /notifications and adds a Listener which parses the JSON data in the event (it doesn’t have to be JSON, but I find this most convenient,) and then insert the “notification” in the DOM. The rest is mostly Bootstrap stuff for dismissing the notification.

You could of course implement this in any other client-side framework (Angular, React or whatever the new New Hotness is,) but we’re here for the Raku not the Javascript.

Anyway this isn’t going to change at all, so if you actually want to run the examples, you can save it and forget about it.

The Server Side

The server part of our application is, largely, a simple Cro::HTTP application with three routes : one to serve up our index.html from above, another to handle the button push request and obviously a route to serve up the event stream on /notifications.

This is all bundled up in a single script for convenience of exposition, in a real world application you’d almost certainly want to split it up into several files.

class NotificationTest {
    use Cro::HTTP::Server;
    has Cro::Service $.http;

    class Notifier {
        use EventSource::Server;
        use JSON::Class;

        has Supplier::Preserving $!supplier = Supplier::Preserving.new;

        enum AlertType is export (
          Info    => "info",
          Success => "success",
          Warning => "warning",
          Danger  => "danger"
        );

        class Message does JSON::Class {
            has AlertType $.type is required is marshalled-by('Str');
            has Str $.message is required;
            has Str $.event-type = 'notification';
        }

        method notify(
          AlertType  $type,
              Str()  $message,
              Str  :$event-type = 'notification'
        --> Nil ) {
            $!supplier.emit:
              Message.new(:$type, :$message :$event-type );
        }

        multi method event-stream( --> Supply) {
            my $supply = $!supplier.Supply.map: -> $m {
                EventSource::Server::Event.new(
                  type => $m.event-type,
                  data => $m.to-json(:!pretty)
                )
            }
            EventSource::Server.new(
              :$supply,
              :keepalive,
              keepalive-interval => 10
            ).out-supply;
        }
    }

    class Routes {
        use Cro::HTTP::Router;

        has Notifier $.notifier
          handles <notify event-stream> = Notifier.new;

        method routes() {
            route {
                get -> {
                    static $*PROGRAM.parent, 'index.html';
                }
                get -> 'notifications' {
                    header 'X-Accel-Buffering', 'no';
                    content 'text/event-stream', $.event-stream();
                }
                get -> 'button-pressed' {
                    $.notify(Notifier::Info, 'Someone pressed the button');
                }
            }
        }
    }

    has $.routes-object;

    method routes-object( --> Routes ) handles <routes> {
        $!routes-object //= Routes.new();
    }

    method http( --> Cro::Service ) handles <start stop> {
        $!http //= Cro::HTTP::Server.new(
          http => <1.1>,
          host => '0.0.0.0',
          port => 9999,
          application => $.routes,
        );
    }
}

multi sub MAIN() {
    my NotificationTest $http = NotificationTest.new;
    $http.start;
    say "Listening at https://127.0.0.1:9999";
    react {
        whenever signal(SIGINT) {
            $http.stop;
            done;
        }
    }
} 

There’s nothing particularly unusual about this, but you’ll probably see that nearly everything is happening in the Notifier class. The routes are defined within a method within a Routes class so that the key methods of Notifier can be delegated from an instance of that class, which makes it nicer than having a global object, but also makes it easier to refactor or even replace the Notifier at run time (perhaps to localise the messages for example.)

The Notifier class itself can be thought of as a wrapper for the EventSource::Server, there is a Supplier (here a Supplier::Preserving which works better for this scenario,) onto which objects of Message or emitted by notify method, the Message class consumes JSON::Class so that it can easily be serialized as JSON when creating the final event that will be output onto the event stream. The EventType enumeration here maps to the CSS classes in the resulting notification HTML that influence the colour of the notification as displayed.

Most of the action here is actually going on in the event-stream method, which constructs the stream that is output to the client:

multi method event-stream( –> Supply) {
    my $supply = $!supplier.Supply.map: -> $m {
        EventSource::Server::Event.new(
          type => $m.event-type,
          data => $m.to-json(:!pretty)
        )
    }
    EventSource::Server.new(
      :$supply,
      :keepalive,
      keepalive-interval => 10
    ).out-supply;
} 

This maps the Supply derived from our Supplier such that the Message objects are serialized and wrapped in an EventSource::Server::Event object, the resulting new Supply is then passed to the EventSource::Server. The out-supply returns a further Supply which emits the encode event stream data suitable for being passed as content in the Cro route. The wrapping of the Message in the Event isn’t strictly necessary here as EventSource::Server will do it internally if necessary, but doing so allows control of the type which is the event type that will be specified when adding the event listener in your Javascript, so, for instance, you could emit events of different types on your stream and have different listeners for each in your Javascript, each having a different effect on your page.

The route for /notifications probably warrants closer inspection:

get -> 'notifications' {
    header 'X-Accel-Buffering', 'no';
    content 'text/event-stream', $.event-stream();
} 

Firstly, unless you have a particular reason, the Content Type should always be text/event-stream otherwise the client won’t recognise the stream, and, in all the implementations I have tried at least, will just sit there annoyingly doing nothing. The header here isn’t strictly necessary for this example, however if your clients will be accessing your application via a reverse proxy such as nginx then you may need to supply this (or one specific to your proxy,) in order to prevent the proxy buffering your stream which may lead to the events never being delivered to the client.

But what if don’t want everyone to get the same notifications?

This is all very well but for the majority of applications you probably want to send notifications to specific users (or sessions,) it’s unlikely that all the users of our application are interested that someone pressed the button, so we’ll introduce the notion of a session using Cro:HTTP::Session::InMemory, this has the advantage of being very simple to implement (and built-in.)

The changes to our original example are really quite small (I’ve omitted any authentication to keep it simple:)

class NotificationTest {
    use Cro::HTTP::Server;
    use Cro::HTTP::Auth;

    has Cro::Service $.http;

    class Session does Cro::HTTP::Auth {
        has Supplier $!supplier handles <emit Supply> = Supplier.new;
    }

    class Notifier {
        use EventSource::Server;
        use JSON::Class;

        enum AlertType is export (
          Info    => "info",
          Success => "success",
          Warning => "warning",
          Danger  => "danger"
        );

        class Message does JSON::Class {
            has AlertType $.type is required is marshalled-by('Str');
            has Str $.message is required;
            has Str $.event-type = ‘notification‘;
        }

        method notify(
          Session   $session,
          AlertType $type,
              Str() $message,
              Str  :$event-type = 'notification'
        –> Nil) {
            $session.emit: Message.new(:$type, :$message :$event-type );
        }

        multi method event-stream(Session $session, –> Supply) {
            my $supply = $session.Supply.map: -> $m {
                EventSource::Server::Event.new(
                  type => $m.event-type,
                  data => $m.to-json(:!pretty)
                )
            }
            EventSource::Server.new(
              :$supply,
              :keepalive,
              keepalive-interval => 10
            ).out-supply;
        }
    }

    class Routes {
        use Cro::HTTP::Router;
        use Cro::HTTP::Session::InMemory;

        has Notifier $.notifier handles <notify event-stream> = Notifier.new;

        method routes() {
            route {
                before Cro::HTTP::Session::InMemory[Session].new;
                get -> Session $session {
                    static $*PROGRAM.parent, ‘index.html‘;
                }
                get -> Session $session, ‘notifications‘ {
                    header ‘X-Accel-Buffering‘, ‘no‘;
                    content ‘text/event-stream‘, $.event-stream($session);
                }
                get -> Session $session, ‘button-pressed‘ {
                    $.notify($session, Notifier::Info, ‘You pressed the button‘);
                }
            }
        }
    }

    has $.routes-object;

    method routes-object( –> Routes ) handles <routes> {
        $!routes-object //= Routes.new();
    }

    method http( –> Cro::Service ) handles <start stop> {
        $!http //= Cro::HTTP::Server.new(
          http => <1.1>,
          host => ‘0.0.0.0‘,
          port => 9999,
          application => $.routes,
        );
    }
}

multi sub MAIN() {
    my NotificationTest $http = NotificationTest.new;
    $http.start;
    say “Listening at https://127.0.0.1:9999“;
    react {
        whenever signal(SIGINT) {
            $http.stop;
            done;
        }
    }
} 

As you can see much of the code remains unchanged, we’ve introduced a new Session class and made some changes to the Notifier methods and the routes.

The Session class is instantiated on the start of a new session and will be kept in memory until the session expires:

class Session does Cro::HTTP::Auth {
    has Supplier $!supplier
      handles <emit Supply> = Supplier.new;
} 

Because the same object stays in memory we can replace the single Supplier of the Notifier object with a per-session one, the same Session object being passed to the routes during the lifetime of the session:

method routes() {
    route {
        before Cro::HTTP::Session::InMemory[Session].new;
        get -> Session $session {
            static $*PROGRAM.parent, 'index.html';
        }
        get -> Session $session, 'notifications' {
            header 'X-Accel-Buffering', 'no';
            content 'text/event-stream', $.event-stream($session);
        }
        get -> Session $session, 'button-pressed' {
            $.notify($session, Notifier::Info, 'You pressed the button');
        }
    }
} 

The Cro::HTTP::Session::InMemory is introduced as a Middleware that handles the creation or retrieval of a session, setting the session cookie and so forth before the request is passed to the appropriate route. Where the first argument to a route block has a type that does Cro::HTTP::Auth then the session object will be passed, you can do interesting things with authentication and authorization by using more specific subsets of your Session class but we won’t need that here and we’ll just pass the session object to the modified Notifier methods:

method notify(
    Session $session,
  AlertType $type,
      Str() $message,
       Str :$event-type = 'notification'
–> Nil) {
        $session.emit: Message.new(:$type, :$message :$event-type );
}
     
multi method event-stream( Session $session, –> Supply) {
    my $supply = $session.Supply.map: -> $m {
        EventSource::Server::Event.new(
          type => $m.event-type,
          data => $m.to-json(:!pretty)
        )
    }
    EventSource::Server.new(
      :$supply,
      :keepalive,
      keepalive-interval => 10
    ).out-supply;
} 

Both the notify and event-stream are simply amended to take the Session object as the first argument and to use the (delegated) methods on the Session’s own Supplier rather than the shared one from Notifier.

And now each ‘user’ can get their own notifications, the button could be starting a long running job and they could be notified when it’s done. You could extend to do “broadcast” notifications by putting back the shared Supplier in Notifier, make a second multi candidate of notify which doesn’t take the Session and which would emit to that Supplier, then merge the shared and instance specific Supplies in the event-stream method.

But what if I have more than instance of my application?

You’ve probably worked out by now that using the “in-memory” session won’t work if you have more than one instance of your application, you might be able to get away with setting up “sticky sessions” on a load balancer at a push, but probably not something you’d want to rely on.

What we need is a shared source of notifications to which all the new notifications can be added and from which each instance will retrieve the notifications to be sent.

For this we can use a PostgreSQL database, which handily has a NOTIFY which allows the server to send a notification to all the connected clients that have requested to receive them.

In the amended application we will use Red to access the database ( plus a feature of DB::Pg to consume notifications from the server.)

For our simple application we only a table to hold the notifications, and a table in which to persist the sessions ( using Cro::HTTP::Session::Red ,) so let’s make them upfront:

CREATE FUNCTION public.new_notification() RETURNS trigger LANGUAGE plpgsql AS $$
    BEGIN
    PERFORM pg_notify(‘notifications‘, ‘‘ || NEW.id || ‘‘);
    RETURN NEW;
    END;
    $$;
     
    CREATE TABLE public.notification (
      id uuid NOT NULL,
      session_id character varying(255),
      type character varying(255) NOT NULL,
      message character varying(255) NOT NULL,
      event_type character varying(255) NOT NULL
    );
     
    CREATE TABLE public.session (
      id character varying(255) NOT NULL
    );
     
    ALTER TABLE ONLY public.notification
    ADD CONSTRAINT notification_pkey PRIMARY KEY (id);
     
    ALTER TABLE ONLY public.session
    ADD CONSTRAINT session_pkey PRIMARY KEY (id);
     
    CREATE TRIGGER notification_trigger AFTER INSERT ON public.notification FOR EACH ROW EXECUTE PROCEDURE public.new_notification();
     
    ALTER TABLE ONLY public.notification
    ADD CONSTRAINT notification_session_id_fkey FOREIGN KEY (session_id) REFERENCES public.session(id); 

I’ve used a database called notification_test in the example. The notification table has similar columns to the attributes of the Message class with the addition of the id and session_id, there is a trigger on insert that sends the Pg notification with the id of the new row, which will be consumed by the application.

The session table only has the required id column that will be populated by the session middleware when the new session is created.

The code has a few more changes than between the first examples, but the majority of the changes are to introduce the Red models for the two DB tables and to rework the way that the Notifier works:

class NotificationTest {
    use Cro::HTTP::Server;
    use Cro::HTTP::Auth;
    use UUID;
    use Red;
    use Red::DB;
    need Red::Driver;
    use JSON::Class;
    use JSON::OptIn;
     
    has Cro::Service $.http;
     
    model Message {
        …
    }
     
    model Session is table('session') does Cro::HTTP::Auth {
        has Str $.id is id;
        has @.messages
          is relationship({ .session-id }, model => Message )
          is json-skip;
    }
     
    enum AlertType is export (
      Info    => "info",
      Success => "success",
      Warning => "warning",
      Danger  => "danger"
    );
     
    model Message is table('notification') does JSON::Class {
        has Str $.id is id is marshalled-by('Str') = UUID.new.Str;
        has Str $.session-id is referencing(model => Session, column => 'id' ) is json-skip;
        has AlertType $.type is column is required is marshalled-by('Str');
        has Str $.message is column is required is json;
        has Str $.event-type is column is json = 'notification';
    }
     
    has Red::Driver $.database = database 'Pg', dbname => 'notification_test';
     
    class Notifier {
        use EventSource::Server;

        has Red::Driver $.database;
     
        method database(–> Red::Driver) handles <dbh> {
            $!database //= get-RED-DB();
        }
     
        has Supply $.message-supply;
     
        method message-supply( –> Supply ) {
            $!message-supply //= supply {
                whenever $.dbh.listen('notifications') -> $id {
                    if Message.^rs.grep(-> $v { $v.id eq $id }).head -> $message {
                        emit $message;
                    }
                }
            }
        }
     
        method notify(
          Session   $session,
          AlertType $type,
              Str() $message,
               Str :$event-type = 'notification'
        –> Nil ) {
            Message.^create(
              session-id => $session.id, :$type, :$message :$event-type
            );
        }
     
        multi method event-stream( Session $session, –> Supply) {
            my $supply = $.message-supply.grep( -> $m {
                $m.session-id eq $session.id
            }).map( -> $m {
                EventSource::Server::Event.new(
                  type => $m.event-type,
                  data => $m.to-json(:!pretty)
                )
            });
            EventSource::Server.new(
              :$supply,
              :keepalive,
              keepalive-interval => 10
            ).out-supply;
        }
    }
         
    class Routes {
        use Cro::HTTP::Router;
        use Cro::HTTP::Session::Red;
     
        has Notifier $.notifier
          handles <notify event-stream> = Notifier.new;
     
        method routes() {
            route {
                before Cro::HTTP::Session::Red[Session].new: cookie-name => 'NTEST_SESSION';
                get -> Session $session {
                    static $*PROGRAM.parent, 'index.html';
                }
                get -> Session $session, 'notifications' {
                    header 'X-Accel-Buffering', 'no';
                    content 'text/event-stream', $.event-stream($session);
                }
                get -> Session $session, 'button-pressed' {
                    $.notify($session, Info, 'You pressed the button');
                }
            }
        }
    }
     
    has $.routes-object;
     
    method routes-object( –> Routes ) handles <routes> {
        $!routes-object //= Routes.new();
    }
     
    method http( –> Cro::Service ) handles <start stop> {
        $!http //= Cro::HTTP::Server.new(
          http => <1.1>,
          host => '0.0.0.0',
          port => 9999,
          application => $.routes,
        );
    }
}
     
multi sub MAIN() {
    my NotificationTest $http = NotificationTest.new;
    $GLOBAL::RED-DB = $http.database;
    $http.start;
    say "Listening at https://127.0.0.1:9999";
    react {
        whenever signal(SIGINT) {
            $http.stop;
            done;
        }
    }
} 

I’ll gloss over the definition of the Red models as that should be mostly obvious, except to note that the Message model also does JSON::Class which allows the instances to be serialized as JSON (just like the original example,) so no extra code is required to create the events that are sent to the client.

The major changes are to the Notifier class which introduces message-supply which creates an on-demand supply) replacing the shared Supplier of the first example, and the per-session Supplier of the second:

has Supply $.message-supply;
 	 
method message-supply( –> Supply ) {
    $!message-supply //= supply {
        whenever $.dbh.listen(‘notifications‘) -> $id {
            if Message.^rs.grep(-> $v { $v.id eq $id }).head -> $message {
                emit $message;
            }
        }
    }
} 

This taps the Supply of Pg notifications provided by the underlying DB::Pg, which ( referring back to the SQL trigger described above,) emits the id of the newly created notification rows in the database, the notification row is then retrieved and then emitted onto the message-supply.

The notify method is altered to insert the Message to the notification table:

method notify(
  Session $session,
  AlertType $type,
  Str() $message,
  Str :$event-type = 'notification'
–> Nil ) {
        Message.^create(session-id => $session.id, :$type, :$message :$event-type );
} 

The signature of the method is unchanged and the session-id from the supplied Session is inserted into the Message.

The event-stream method needs to be altered to process the Message objects from the message-supply and select only those for the requested Session:

multi method event-stream( Session $session, –> Supply) {
    my $supply = $.message-supply.grep( -> $m {
        $m.session-id eq $session.id
    }).map( -> $m {
        EventSource::Server::Event.new(
          type => $m.event-type,
          data => $m.to-json(:!pretty)
        )
    });
    EventSource::Server.new(
      :$supply,
      :keepalive,
      keepalive-interval => 10
    ).out-supply;
} 

And that’s basically it, there’s a little extra scaffolding to deal with the database but not a particulary large change.

What else?

I’ve omitted any authentication from these example for brevity, but if you wanted to have per-user notifications then, if you have authenticated users, you could add the user id to the Message and filter where the user matches that of the Session.

Instead of using the Pg notifications, if you want to still use a database, you could repeatedly query the notifications table for new notifications as a background task. Or you could use some message queue to convey the notifications (ActiveMQ topics or a RabbitMQ fanout exchange for example).

But now you can tell your users what is going on in the application without them having to do anything.

Raku Advent Calendar: Day 7: .hyper and Cro

Published by Massa 👽 Humberto on 2022-12-07T01:03:00

or How (not) to pound your production server

(and to bring on the wrath of the Ops)

So, I’m a programmer and I work for a government TI “e-gov” department. My work here is mostly comprised of one-off data-integration tasks (like the one in this chronicle) and programming satellite utilities for our Citizen Relationship Management system.

the problem

So, suppose you have:

  1. a lot (half a million) records in a .csv file, to be entered in your database;
  2. a database only accessible via a not-controlled-by-you API;
  3. said API takes a little bit more than half a second per record;
  4. some consistency checks must be done before sending the records to the API; but
  5. the API is a “black box” and it may be more strict than your basic consistency checks;
  6. tight schedule (obviously)

the solution

the prototype: Text::CSV and HTTP::UserAgent

So, taking half a second per record just in the HTTP round-trip is bad, very bad (34 hours for the processing of the whole dataset).

sub read-csv(IO() $file) {
gather {
my $f = $file.open: :r :!chomp;
with Text::CSV.new {
.header: $f, munge-column-names => { S:g /\W+//.samemark('x').lc };
while my $row = .getline-hr: $f { take $row }
}
}
}
sub csv-to-yaml(@line –> Str)
# secret sauce
my %obj = do { };
to-yaml %obj
}
sub server-put($_) {
# HTTP::UserAgent
}
sub MAIN(Str $input) {
my @r = lazy read-csv $input;
server-login;
server-put csv-to-json $_ for @r
}

.hyperize it

Let’s try to make things move faster…

sub MAIN(Str $input) {
my @r = lazy read-csv $input;
server-login;
react {
whenever supply {
.emit for @r.hyper(:8degree, :16batch)
.map(&csv-to-yaml)
} {
server-post $_
}
}
}

So, explaining the code above a little bit: @r is a lazy sequence (this means, roughly, that the while my $row bit in read-csv is executed one row at a time, in a coroutine-like fashion.) When I use .hyper(:$degree, :$batch), it transforms the sequence in a “hyper-sequence”, basically opening a thread pool with $degree threads and sending to each thread $batch itens from the original sequence, until its end.

Yeah, but HTTP::UserAgent does not parallelise very nicely (it just does not work)… besides, why the react whenever supply emit? It’s a mystery lost to the time. Was it really needed? Probably not, but the clock is always ticking, so just move along.

Cro::HTTP to the rescue

sub server-login() {
my Lock \l .= new;
our $cro;
l.protect: {
my $c = Cro::HTTP::Client.new:
base-uri => SERVER-URL,
content-type => JSON,
user-agent => 'honking/2022.2.1',
timeout => %(
connection => 240,
headers => 480,
),
cookie-jar => Cro::HTTP::Client::CookieJar.new,
;
await $c.post: "{SERVER-URI}/{SESSION-PATH}", body => CREDENTIALS
$cro = $c
}
$cro
}
sub server-post($data) {
our $cro;
my $r = await $cro.post: "{SERVER-URI}/{DATA-PATH}", body => $data;
await $r.body
}

Nice, but I ran the thing on a testing database and… oh, no… lots of 503s and eventually a 401 and the connection was lost.

constant NUMBER-OF-RETRIES = 3; # YMMV
constant COOLING-OFF-PERIOD = 2; # this is plenty to stall this thread
sub server-post($data) {
our $cro;
do {
my $r = await $cro.post: "{SERVER-URI}/{DATA-PATH}", body => $data;
my $count = NUMBER-OF-RETRIES;
while $count and $r.status == 503|401 {
sleep COOLING-OFF-PERIOD;
server-login if $r.status == 401;
$r = await $cro.post: "{SERVER-URI}/{DATA-PATH}", body => $data;
}
await $r.body
}
}

Oh, it ran almost to the end of the data (and it’s fast), but… we are getting some 409s for some records where our csv-to-json is not smart enough, we can ignore those records. And some timeouts.

sub format-error(X::Cro::HTTP::Error $_) {
my $status-line = .response.Str.lines.first;
my $resp-body = do { await .response.body-blob }.decode;
my $req-method = .request.method;
my $req-target = .request.target;
my $req-body = do { await .request.body-blob }.decode;
"ERROR $status-line WITH $resp-body FOR $req-method $req-target WITH $req-body"
}
sub server-post($data) {
our $cro;
do {
my $r = await $cro.post: "{SERVER-URI}/{DATA-PATH}", body => $data;
my $count = NUMBER-OF-RETRIES;
while $count and $r.status == 503|401 {
sleep COOLING-OFF-PERIOD;
server-login if $r.status == 401;
$r = await $cro.post: "{SERVER-URI}/{DATA-PATH}", body => $data;
}
await $r.body
}
CATCH {
when X::Cro::HTTP::Client::Timeout {
note 'got a timeout, cooling off for a little bit more';
sleep 5 * COOLING-OFF-PERIOD;
server-login
}
when X::Cro::HTTP::Error {
note format-error $_
}
}
}

the result

So, now the whole process goes smoothly and finishes in 20 minutes, circa 100x faster.

Import the data in production… similar results. The process is ongoing, 15 minutes in, the Ops comes (in person):

Why is the server load triple the normal and the number of 5xx is thru the roof?

just five more minutes, check the ticket XXX, closing it now…

(unintelligible noises)

And this is the story of how to import half a million records, that would take two whole days to be imported, in twentysome minutes. The whole ticket took less than a day’s work, start to finish.

related readings

If you want to read more about Raku concurrency, past Advent articles that might interest you are:

Raku Advent Calendar: Day 6: Immutable data structures and reduction in Raku

Published by wimv12e on 2022-12-06T13:01:00

For a little compiler I’ve been writing, I felt increasingly the need for immutable data structures to ensure that nothing was passed by references between passes. I love Perl and Raku but I am a functional programmer at heart, so I prefer map and reduce over loops. It bothered me to run reductions on a mutable data structure. So I made a small library to make it easier to work with immutable maps and lists.

A reduction combines all elements of a list into a result. A typical example is the sum of all elements in a list. According to the Raku docs, reduce() has the following signature

multi sub reduce (&with, +list)

In general, if we have a list of elements of type T1 and a result of type T2, Raku’s reduce() function takes as first argument a function of the form

    -> T2 \acc, T1 \elt --> T2 { ... }

I use the form of reduce that takes three arguments: the reducing function, the accumulator (what the Raku docs call the initial value) and the list. As explained in the docs, Raku’s reduce operates from left to right. (In Haskell speak, it is a foldl :: (b -> a -> b) -> b -> [a].)

The use case is the traversal of a role-based datastructure ParsedProgram which contains a map and an ordered list of keys. The map itself contains elements of type ParsedCodeBlock which is essentially a list of tokens.

    role ParsedProgram {
        has Map $.blocks = Map.new; # {String => ParsedCodeBlock}
        has List $.blocks-sequence = List.new; # [String]
        ...
    }
    
    role ParsedCodeBlock {
        has List $.code = List.new; # [Token]
        ...
    }

List and Map are immutable, so we have immutable datastructures. What I want to do is update these datastructures using a nested reduction where I iterate over all the keys in the blocks-sequence List and then modify the corresponding ParsedCodeBlock. For that purpose, I wrote a small API, and in the code below, append and insert are part of that API. What they do is create a fresh List resp. Map rather than updating in place.

I prefer to use sigil-less variables for immutable data, so that sigils in my code show where I have use mutable variables.

The code below is an example of a typical traversal. We iterate over a list of code blocks in a program, parsed_program.blocks-sequence; on every iteration, we update the program parsed_program (the accumulator). The reduce() call takes a lambda function with the accumulator (ppr_) and a list element (code_block_label).

We get the code blocks from the program’s map of blocks, and use reduce() again to update the tokens in the code block. So we iterate over the original list of tokens (parsed_block.code) and build a new list. The lambda function therefore has as accumulator the updated list (mod_block_code_) and as element a token (token_).

The inner reduce creates a modified token and puts it in the updated list using append. Then the outer reduce updates the block code using clone and updates the map of code blocks in the program using insert, which updates the entry if it was present. Finally, we update the program using clone.

    reduce(
        -> ParsedProgram \ppr_, String \code_block_label {
            my ParsedCodeBlock \parsed_block =
                ppr_.blocks{code_block_label};
    
            my List \mod_block_code = reduce(
                -> \mod_block_code_,\token_ {
                    my Token \mod_token_ = ...;
                    append(mode_block_code_,mod_token_);
                },
                List.new,
                |parsed_block.code
            );
            my ParsedCodeBlock \mod_block_ =
                parsed_block.clone(code=>mode_block_code);
            my Map \blocks_ = insert(
                ppr_glob.blocks,code_block_label,mod_block_);
            ppr_.clone(blocks=>blocks_);
        },
        parsed_program,
        |parsed_program.blocks-sequence
    );
    

The entire library is only a handful of functions. The naming of the functions is based on Haskell’s, except where Raku already claimed a name as a keyword.

Map manipulation

Insert, update and remove entries in a Map. Given an existing key, insert will update the entry.

    sub insert(Map \m_, Str \k_, \v_ --> Map )
    sub update(Map \m_, Str \k_, \v_ --> Map )
    sub remove(Map \m_, Str \k_ --> Map )
    

List manipulation

There are more list manipulation functions because reductions operate on lists.

Add/remove an element at the front:

    # push
    sub append(List \l_, \e_ --> List)
    # unshift
    sub prepend(List \l_, \e_ --> List)
    

Split a list into its first element and the rest:

# return the first element, like shift
sub head(List \l_ --> Any)
# drops the first element
sub tail(List \l_ --> List)

# This is like head:tail in Haskell
sub headTail(List \l_ --> List) # List is a tuple (head, tail)

The typical use of headTail is something like:

    my (Str \leaf, List \leaves_) = headTail(leaves);
    

Similar operations but for the last element:

    # drop the last element
    sub init(List \l_ --> List)
    # return the last element, like pop.
    sub top(List \l_ --> Any) ,
    # Split the list on the last element
    sub initLast(List \l_ --> List) # List is a tuple (init, top)
    

The typical use of initLast is something like:

    my (List \leaves_, Str \leaf) = initLast(leaves);
    

Rakudo Weekly News: 2022.49 ReleaseMas Again

Published by liztormato on 2022-12-05T16:59:09

This week saw a new Rakudo compiler release (2022.12 by Justin DeVuyst) with a nice bunch of new features and bug fixes, a new Cro release (mostly to fix a testing issue that would inhibit installation by default) and a new release of the Podlite editor (0.4.0, with markdown block support, /r/rakulang comments). Nice prezzies!

Raku Advent Calendar 2022

The fourth Raku Advent Calendar has started with an introduction by JJ Merelo and the following contributions so far:

Meanwhile, Daniel Sockwell is still looking for a volunteer to write an advent post about the idea: “I’m disappointed by dynamic typing“.

Chris Jarvis has posted a nice overview of Advent related events, and EditorDavid also has a list.

Raku Conference 2023

Andrew Shitov has published tentative dates for the in-person 2023 Raku Conference: 3-4 August 2023 (/r/rakulang comments).

Advent of Code

Daniel Sockwell is telling us that there is a Github repository for 2022 Advent of Code solutions in the Raku Programming Language.

Anton’s Corner

Anton Antonov wrote about their new Proq::ZMQed module in a blog post.

Rainbow Butterfly Award

You can still nominate people for the Rainbow Butterfly Award until the 8th of December. Please consider who you would like to receive the Rainbow Butterly Award 2022 by sending your nomination by email to: rainbow@raku.org .

Weeklies

Weekly Challenge #194 is available for your perusal.

New Problem Solving Issues

New Pull Requests

Core Developments

Questions about Raku

Meanwhile on Mastodon

Meanwhile, still on Twitter

Meanwhile on the mailing list

Comments about Raku

New Raku Modules

Updated Raku Modules

Winding down

So many new things this week! And yet another new name in core development!

This week’s picture is to remind us of sunnier days in these dark days before the winter Solstice! Also to remind us Ukraine is still fighting the Russian aggression. Слава Україні!  Героям слава!

In the meantime, please keep staying safe, keep staying healthy, and keep up the good work!

If you like what I’m doing, committing to a small sponsorship would mean a great deal!

Raku Advent Calendar: Day 5: Malware and Raku

Published by jjmerelo on 2022-12-05T01:01:00

This article has been written by Paula de la Hoz, cybersecurity specialist and artist.

While Raku regex and tokens are meant to work on data structures (such as parsing and validating file types), they can help us to better understand malware. Malware, as any other legit binary, have some signatures within. Some “file signatures” are widely used to blacklist those specific samples (the hashes), but the problem is that blacklisting hashes is not safe enough. Sometimes, the very same kind of malware could be slightly different in small details, and have many different samples related. In this case, apart from relying on dynamic detection (monitoring devices and alerting the user when something seems to be acting suspiciously), genes are also investigated.

Malware genes are pieces of the reversed code (such as strings) that are commonly seen in most or all the samples of a malware family. This sort of genes help researchers identify the malware family and contextualize the attacks , since this is relevant not only to try to put an end to the threat by executing the proper counterfeits in time, but also helps profiling and framing threat actors in some cases.

Generally, these genes are also useful to look for malware families among a unknown group of samples. A common tool for this is “YARA”. YARA is a tool used by researchers to create some rules and basic logic to try to find genes across samples. The structure of how YARA works can also be approached using Raku grammars, providing an alternative that might be useful when the YARA logic is not enough for the regex rules in specific cases. In order to test this idea, I created “CuBu” (curious butterfly), a tool similar to YARA which takes advantage of Raku elements to look for malware genes. For testing out the tool I designed a script to look for Sparkling Goblin genes. Sparkling Goblin is an APT (advanced persistent threat) that I happened to investigate a few months ago. While working on a YARA rule, I found out the following gene was commonly seen in some of their malware:

InterfaceSpeedTester9Calc

So I created a token in Raku using that gene:

my token gen1 {'InterfaceSpeedTester9Calc'}

Now created a regex with it:

my regex sparkling_goblin {<gen1>}

And parsed a file line by line trying to look for the gene:

my $c = 1;
for "$fo/$fi".IO.lines -> $line {
# If the line contains the gene, print it
if $line ~~ &sparkling_goblin {say "Sparkling Goblin found: "; say $line; say "in line $c"; say "in file $fi"; say " "; }
#if $line ~~ &sparkling2 {say "Sparkling Goblin complex regex found: "; say $line; say "in line $c"; say " "; }
$c++;
}

In the code above, the file is parsed in a given folder ($fo) and file ($fi); when the gene is found it prints the name of the file and the line. In this case there are too many steps for a single gene, but let’s check then using another regex from different tokens. Let’s say we also want to check for gene:

ScheduledCtrl9UpdateJobERK

So in this case we can create another token:

my token gen2 {'ScheduledCtrl9UpdateJobERK'}

And change the regex so it checks for one or the other:

my regex sparkling2 {
    [
       <gen1>|<gen2>
    ]
    }

And we can keep going with yet another gene:

my token gen3 {'ScanHardwareInfoPSt'}

And add it in the regex:

my regex sparkling2 {
    [
       <gen1>|<gen2>|<gen3>
    ]
    }

Now let’s say that the first gene is only suspicious when seen in the end of a line, but the second and third genes are suspicious always. We then should use the regex <gen1>$ included in our logic.

my regex sparkling2 {
    [
       <gen1>$|<gen2>|<gen3>
    ]
    }

This is becoming interesting and more specific. If we wanted to check for a line which ends with the first gene, or starts with the second gene we would do:

my regex sparkling2 {
    [
       <gen1>$|^<gen2>
    ]
    }

And if we want to look for a line which is specifically the third gene without anything else or any of the other genes inside the strings:

my regex sparkling2 {
    [
       <gen1>|<gen2>|^<gen3>$
    ]
}

And so on. Once you know your malware you can create more and more refined regex to work with them. You can create more than one regex to look for different specific things. This is how the whole code for the last option would look like:

sub MAIN (Str :$fi = '', Str :$fo = '') {
# some genes in the binary
my token gen1 {'InterfaceSpeedTester9Calc'}
my token gen2 {'ScheduledCtrl9UpdateJobERK'}
my token gen3 {'ScanHardwareInfoPSt'}
my regex sparkling2 {
[
<gen1>|<gen2>|^<gen3>$
]
}
my $c = 1;
for "$fo/$fi".IO.lines -> $line {
if $line ~~ &sparkling2 {say "Sparkling Goblin complex regex found: "; say $line; say "in line $c"; say "in file $fi"; say " "; }
$c++;
}
}

In my tool, CuBu, I used this raku (compiled with rakudo) inside a bash script using Zenity for a simple user friendly GUI that asks for the folder and the raku script and creates a CSV and a raw file with the results. It iterates every single file of the folder:

#!/bin/sh

zenity --forms --title="New analysis" \
	--text="Enter configuration:" \
	--separator="," \
	--add-entry="Folder" \
	--add-entry="Threat name" >> threat.csv

case $? in
    0)
        echo "Configuration set"
	name=$(csvtool col 2-2 threat.csv)
	mv threat.csv* "$name.csv"

	folder2=$(csvtool col 1-1 $name.csv)
	;;
    1)
        echo "Nothing configured."
	;;
    -1)
        echo "An unexpected error has occurred."
	;;
esac

zenity --question \
--text="You are going to check samples in folder $folder2 in order to look for $name. Is that okay?"
if [ $? ]; then
	echo "Starting analysis: "

touch results_$name

	for i in "$folder2"/*; do
		rakudo $name.raku --fi="$i" --fo=. >> results_$name
	done

	zenity --info \
	--text="Info saved in results_$name"
else
	echo "okay! bye!"
fi

Raku Advent Calendar: Day 4: Give the gift of time

Published by jjmerelo on 2022-12-04T00:31:32

Lately, Santa was getting lots of letters that went a bit like this

Dear Santa:
I've been mostly good, with 98% coverage this year, so what I want for Christmas is... time.
You know, I have great Rakulang GitHub Actions for stuff, but when I need to install some external package and also many distributions, it takes a loooong time to run, 10 minutes or so, and I can't do anything meaningful during that time, so I wander off elsewhere and I barely remember where I was.
So can I please have time?

Santa thought a bit about this. And that. And that other thing, too. Setting a cache for installed modules was no big deal; raku-test-action does precisely that. Setting a cache for that and modules that need to be installed, well, that’s something different altogether.

If there were a couple of things Santa knew, that was Raku and wrapping. So that was that. Just wrap everything nicely in a single image, a Dockerfile to bundle it all.

FROM jjmerelo/raku-test:latest

ENV PKGS="openssl-dev"
USER root
RUN apk update && apk upgrade && apk add --no-cache $PKGS
USER raku

WORKDIR /home/raku

COPY META6.json .

RUN zef install --deps-only . && rm META6.json

ENTRYPOINT ["zef","--debug","test","."]

We use as base image the tried and true raku-test image, which includes the bare basic to run your tests, and has been optimized (through multi-stage builds) to take the minimum amount of space.

Which means time, of course: less weight for an image, less time it will take to download it from the repository. So we’re good here, only 74.92 MB.

Then we bundle the packages we need inside the image. We just need 4 statements here; and one of them is just to make it a bit more generic. You want to bundle some other (Alpine) package within the image, change the PKGS variable and you’re good.

But then, we need to bundle the dependencies that are going to be used for testing, basically all production dependencies plus the ones declared as “test” (Also build dependencies, not so common, though). The only thing we need to install these is to copy the META6.json file in the build, fire the installation command, and then delete it, because we are not going to need it any more.

We add an ENTRYPOINT for good measure, but we don’t really need it here, because we are going to run test inside the container.

You probably know this needs to be saved to a Dockerfile in your root directory, so I need not repeat it here

Next step is to create a workflow that uploads it automatically to an image registry. Easiest to access is the GitHub Container Registry, so we will use this workflow to build and upload to that registry:

name: Test-create and publish Docker images

on:
  push:
    paths:
      - Dockerfile
      - META6.json
      - .github/workflows/test-upload-ghcr.yaml

env:
  REGISTRY: ghcr.io

jobs:
  build-and-push-image:
    runs-on: ubuntu-latest
    permissions:
      contents: read
      packages: write

    steps:
      - name: Check out source
        uses: actions/checkout@v3

      - name: Log in to GHCR
        uses: docker/login-action@v2
        with:
          registry: ${{ env.REGISTRY }}
          username: ${{ github.actor }}
          password: ${{ secrets.GITHUB_TOKEN }}

      - name: Build and push image
        uses: docker/build-push-action@v3.2.0
        with:
          context: .
          push: true
          tags: ${{ env.REGISTRY }}/raku-community-modules/raku-test-www

This is longish, but pretty straightforward. First, run only when pushing to main; then, in an Ubuntu runner, establish permissions (need to be able to read contents, and also push to the registry), check out source, log in to the registry, and then build and push the image. Since this is a very common operation, we simply use existing github actions for that.

Only thing you’ll need to change here is the tags key: use the name of the organization/user you are working with instead of raku-community-modules, and the name you want to give your image instead of raku-test-www.

You need to save that with the same name that’s in the third paths key; that way, it will try to run every time this workflow changes, or when META6.json does.

We’re almost there. When this image is built, you need to integrate it in your testing workflows. Like this:

name: "Test"
on:
  push:
    paths:
      - META6.json
      - lib/*
      - t/*
  pull_request:
    paths:
      - META6.json
      - lib/*
      - t/*
jobs:
  test:
    runs-on: ubuntu-latest
    container:
      image: ghcr.io/raku-community-modules/raku-test-www:latest
      env:
        ONLINE_TESTING: 1
    steps:
      - name: Checkout
        uses: actions/checkout@v3
      - name: Test
        run: zef --debug test .

Of course, you’ll need to change the name of the image to whatever you’ve named it. Other than that, also pretty straightforward: check out the image, run the tests. Only thing here is that all the packages and distros will already be baked in the image, so you will only have to wait to download the image and to run your tests.

The gift of time

Santa was happy about this; by baking a container image every time some dependency changed, it gave all that time to Raku devs, who only needed to wait a paltry few seconds to download the image. Of course, they needed to wait for the tests, but they wrote that, so it’s on them.

With that, Santa wishes every one a merry Christmas and spend your time wisely helping others and making the world a better place to live.

rakudo.org: Rakudo compiler, Release #158 (2022.12)

Published on 2022-12-04T00:00:00

Rakudo Weekly News: 2022.48 Classy Core

Published by liztormato on 2022-11-29T13:59:12

This Saturday (3 December 2022 at 20:00 UTC), Vadim Belman will be giving the first online class about Rakudo core development (/r/rakulang comments). Requirements are a working knowledge of the Raku Programming Language, use of git and Github, and a willingness to self-learn things. It will be a kind of seminar where things may significantly divert from the initial plan.

Anton’s Corner

Anton Antonov did another of their blog post / YouTube video combos with Interactive Mermaid diagrams generation via Markdown evaluation (video, /r/rakulang comments).

San Francisco Raku Study Group

There will be an online SF Raku Study group at 9 GMT on 18 December (password: 4RakuRoll), the meeting on the 4th of December has been cancelled due to scheduling conflicts.

Steering Council

The minutes of the November 26th, 2022 meeting are available.

Also, the deadline for the acceptance of nominations for the Rainbow Butterfly Award has been moved to the 8th of December. Please consider who you would like to receive the Rainbow Butterly Award 2022 by sending your nomination by email to: rainbow@raku.org .

Special offer on Raku (e)Books!

It looks like Springer Verlag still has the special offer on two Raku Books, each at about 10€ / 10$, and the ebook at about 7€ / 7$! It is unclear how long this offer will last! This would make an excellent Christmas present!

Weeklies

Weekly Challenge #193 is available for your perusal.

New Problem Solving Issues

Core Developments

Questions about Raku

Meanwhile on Mastodon

Meanwhile, still on Twitter

Meanwhile on the mailing list

Comments about Raku

New Raku Modules

Updated Raku Modules

Winding down

Amazing to see how much interest #RakuLang is generating on Mastodon! And a new name in core development. And that while many people are preparing their entries for the Raku Advent Calendar!

This week’s picture is of a treetop in the setting sun! To remind us Ukraine is still fighting the Russian aggression. Слава Україні!  Героям слава!

In the meantime, please keep staying safe, keep staying healthy, and keep up the good work!

If you like what I’m doing, committing to a small sponsorship would mean a great deal!

Rakudo Weekly News: 2022.47 Migratory

Published by liztormato on 2022-11-21T19:36:09

This year has seen a lot of migrations. In the real world sadly, but also online: the FreeNode fiasco comes to mind, and now Twitter appears to be going the same direction. So it was a good opportunity to start an official channel for the Raku Programming Language on Mastodon: https://fosstodon.org/@RakuLang, run by members of the Raku Steering Council. This will replace the two Twitter accounts that were run by Moritz Lenz and Roman Baumer.

Yours truly found out today that you can actually “follow” a tag on Mastodon. To follow the #RakuLang tag, the relative URL is /tags/RakuLang on your local Mastodon instance, so e.g. https://fosstodon.org/tags/RakuLang. Why the CamelCase? Aren’t tags case insensitive? They are indeed, but by camelcasing them, you make it easier on the people that need to use screen readers, as screenreaders take the capital letters as hints for pronunciation.

Special offer on Raku (e)Books!

It looks like Springer Verlag has put out a special offer on two Raku Books, each at about 10€ / 10$, and the ebook at about 7€ / 7$! It is unclear how long this offer will last! This would make an excellent Christmas present!

From the Documentation Team

After a lot of discussion, it was decided to start using the Github Discussions feature for the documentation project. And/or participate in discussions on the #raku-doc channel on Libera.chat (or just follow the live logs if you just want to lurk).

San Francisco Raku Study Group

Joseph Brenner has been hosting an on-line Raku Study group for quite some time now. Sadly, announcements so far had been out of step with the publication of the Rakudo Weekly News. That has been remedied, so yours truly is proud to announce there will be an online SF Raku Study group at 9 GMT on 4 and 18 December (password: 4RakuRoll). You will need Zoom to be installed.

Nikos’ Corner

Nikos Vaggalis published a column about the vision shared in The Perl and Raku Foundation (previously known as YAS: Yet Another Society).

Elizabeth’s Corner

Elizabeth Mattijsen continued their blogging streak with:

Weeklies

Weekly Challenge #192 is available for your perusal.

New Pull Requests

Core Developments

Questions about Raku

Meanwhile on Mastodon

Some weeks ago the big migration started. Yours truly went along (https://mastodon.social/@lizmat) and will be reporting Raku related information from the Mastodon area of the Fediverse. Turns out, there have been quite a few mentions of the Raku Programming Language on Mastodon in the past that were not properly reported in the Weekly. Without going to far back in history, here are the threads since October 2022:

Meanwhile on Twitter

Yours truly will continue to monitor Twitter, but decided to not post any new content on Twitter anymore.

Meanwhile on the mailing list

Comments about Raku

New Raku Modules

Updated Raku Modules

Winding down

An interesting week yet again. With a lot of changes. Fortunately, still the steady flow of Raku news!

This week’s picture of Lego-like bricks from the late 1960’s! To remind us Ukraine is still fighting the Russian aggression. Слава Україні!  Героям слава!

In the meantime, please keep staying safe, keep staying healthy, and keep up the good work!

If you like what I’m doing, committing to a small sponsorship would mean a great deal!

Elizabeth Mattijsen: Role playing

Published by Elizabeth Mattijsen on 2022-11-20T15:15:02

This is the third part of the "A gaze of iterators!" series.

By any other name

Let's look at the names of the iterators in part 2. For this, I'm going to use the .^name method. As we've seen before, .^foo means "calling the .foo method on the object's meta-object".

say <a b c>.iterator.^name;
# Rakudo::Iterator::ReifiedListIterator
say (1..*).iterator.^name;
# Rakudo::Iterator::IntRangeUnending
say (1..10).pick(*).iterator.^name;
# List::PickN
say (1..10).map({++$_}).iterator.^name;
# Any::IterateOneWithoutPhasers

As you can see, the names of the classes of these iterator objects are all over the place. But they all provide the same interface: being able to call methods such as .pull-one, .push-all and .sink-all.

In many programming languages, you'd expect all of these classes to be sharing the same parent class. In the Raku Programming Language you can indeed inherit from a parent class. You can check the lineage of a class with the .^mro method (for method resolution order).

say <a b c>.iterator.^mro;
# ((ReifiedListIterator) (Any) (Mu))
say (1..*).iterator.^mro;
# ((IntRangeUnending) (Any) (Mu))
say (1..10).pick(*).iterator.^mro;
# ((PickN) (Any) (Mu))
say (1..10).map({++$_}).iterator.^mro;
# ((IterateOneWithoutPhasers) (Any) (Mu))

That is odd? They all seem to inherit from Any and Mu? Yet, one can not call the .pull-one method on every object that just inherits from Any and Mu:

say 42.^mro;
# ((Int) (Cool) (Any) (Mu))
say 42.pull-one;
# No such method 'pull-one' for invocant of type 'Int'

Role playing

The Raku Programming Language also provides a thing called "roles". In short, you could think of a role as a collection of methods that will be "implanted" into a class if the class itself does not provide a method implementation for it.

All of these iterator classes that we've seen here, actually do the Iterator role. And just as with the ^.mro, you can introspect which roles a class performs by calling the .^roles method. Let's see how that works out here:

say <a b c>.iterator.^roles;
# ((PredictiveIterator) (Iterator))
say (1..*).iterator.^roles;
# ((Iterator))
say (1..10).pick(*).iterator.^roles;
# ((Iterator))
say (1..10).map({++$_}).iterator.^roles;
# ((SlippyIterator) (Iterator))

So it looks like some classes are actually playing more than one role. But they all also do the Iterator role, it looks like.

How to be an iterator

To make a class be an iterator, one must tell the class to do the Iterator role. That's pretty simple, no? Let's start with an empty class that just wants to be an iterator. You do that by using does:

class Foo does Iterator {
}
===SORRY!=== Error while compiling -e
Method 'pull-one' must be implemented by Foo
because it is required by roles: Iterator.

So we need to actually provide some type of implementation for the interface that the Iterator role is providing. Ok, so let's make a very simple method pull-one that will randomly return True or False:

class TrueFalse does Iterator {
    method pull-one() { Bool.roll }
}
say TrueFalse.pull-one;  # True | False

The .roll method randomly picks a single value from a set of values. When called on an enum, it will randomly select one of the enums values. And the Bool enum has True and False as its values.

Of course, this is all very boring, let's make it more interesting:

class YeahButNoBut does Iterator {
    method pull-one() {
        Bool.roll ?? "Yeah but" !! "No but"
    }
}
say YeahButNoBut.pull-one;  # Yeah but | No but

So we now have a class that produces an iterator. But how would you actually use that in any "normal" way in your program? Well, by embedding the iterator into another class, and have a method .iterator in it that returns the iterator class:

class Jabbering {
    method iterator() {
        my class YeahButNoBut does Iterator {
            method pull-one() {
                Bool.roll ?? "Yeah but" !! "No but"
            }
        }
    }
}

Note here that the .iterator method actually returns the class object itself (usually referred to as the "type object"). Why? Because that's all we need from this iterator class: in its current form, this class doesn't need to keep any state.

Also note that classes in the Raku Programming Language can be lexically scoped by prefixing them with my, just as you would lexically scoped variables. This makes sense in this case, as there would be no need for the iterator class outside of the scope of the "Jabbering" class.

So how would this look with by .^name, ^.mro and .^roles, as we've shown with all of the other iterators?

say Jabbering.iterator.^name;
# Jabbering::YeahButNoBut
say Jabbering.iterator.^mro;
# ((YeahButNoBut) (Any) (Mu))
say Jabbering.iterator.^roles;
# ((Iterator))

As you can see, the Jabbering class iterator has the expected name. And the Jabbering class inherits from Any and Mu, and performs the Iterator role.

So with all of this out of the way, now you can start jabbering!

.say for Jabbering;

Hmmm... that doesn't stop now, does it?

Indeed it doesn't. As to why, that's for the next instalment in this series!

Conclusion

This concludes the third part of the series, in which the concept of roles in the Raku Programming Language is introduced, along with does. And that you can alter the scope of a class by prefixing it with my.

Questions and comments are always welcome. You can also drop into the #raku-beginner channel on Libera.chat, or on Discord if you'd like to have more immediate feedback.

I hope you liked it! Thank you for reading all the way to the end.

Elizabeth Mattijsen: Setting up your haystack

Published by Elizabeth Mattijsen on 2022-11-18T12:36:32

This blog post will discuss the ways you can specify where to look for matches with rak as part 4 of the It's time to rak! blog series.

From here on down

As we've seen in the earlier instalments, you can very easily search for a string using a literal string or a Raku regex in all files that look like they contain text.

# look for "foo" in all files
$ rak foo

# Search for "foo" in files of the "lib" directory
$ rak foo lib

And we've seen we can also limit the search to a single file:

# Look for "ve" anywhere on any line in file "twenty"
$ rak --type=contains ve twenty

These are three of the very basic ways to specify where to search: the first by not specifying anything, which implies all files in the current directory and any subdirectories of which the name does not start with a period.

The second basically being the same as the first, but starting from the "lib" directory on down, rather than from the current directory.

The third being the specification of a single file to look in. And that single file does not need to exist on the local filesystem! It can also be a URL. Let's look for the word "reading" in the first blog post of this series:

$ rak §reading https://dev.to/lizmat/its-time-to-rak-part-1-30ji
https://dev.to/lizmat/its-time-to-rak-part-1-30ji
598:aria-label="Add to 𝐫𝐞𝐚𝐝𝐢𝐧𝐠 list"
954:<p>Thank you for 𝐫𝐞𝐚𝐝𝐢𝐧𝐠 all the way to the end!</p>

What this basically does is to download the indicated resource (courtesy of curl) into a temporary file, search in that while keeping the original URL as "the filename", and remove the file automatically when it's done.

Actually only two

If you look at the above, then you realize that there are actually only two types of specification: a directory or a file (which could be local or remote). And that a directory will be recursed into to look for files to include in the search.

The search for files in a directory, and its subdirectories, can be influenced by 40 different arguments. This blog post will not mention all of them. You can do:

# produce extensive help on filesystem filters
$ rak --help=filesystem --pager=less

to get a more in-depth description of the logic for each of the options should you need a feature that is not covered in this blog post. The --pager argument is to let you more easily scroll the extensive text, but is of course not necessary.

What should be noted here is that these filesystem filters are only applied on subdirectories and files in those subdirectories. So not on any directory or file that you specify directly.

But beware! Many shells auto-expand anything you specify on the command line if they can: and these will be considered to be directly specified by rak, as it does not have a way to distinguish between what you typed and what the shell expanded it to. For example:

# Search all files and all subdirectories
$ rak foo *

The * in the shell will effectively do ls -d *. In practice, this is almost the same as not specifying anything at all. But with one subtle difference: none of the filesystem filters will be applied to what the shell expanded to.

Whereas if you would not specifying anything (or . to indicate the current directory), the filesystem filters would be applied, because you (implicitely) specified only the current directory. So only the current directory would be exempt from filesystem filters.

At the base

The two most important filesystem filters are --file and --dir. They expect a piece of code that will be given the basename of a file or a directory, and which should return a trueish value to allow the file / directory to be accepted. And they can also be specified as a flag: --file for unconditional acceptance, and --/dir for unconditional denial (which can be handy if you do not want recursion into subdirectories).

By default, --file and dir='!.starts-with(".")' are assumed. Which effectively means, don't recurse into directories that start with a period, and accept all files in any other directory.

To make it easier for you to specify files given by one or more extensions, you can use the --extensions argument:

# Only accept files with the .bat extension
$ rak foo --extensions=bat

As the name of the argument implies, you can specify multiple extensions, separated by comma's:

# Only accept files with the .bat or the .ps1 extension
$ rak foo --extensions=bat,ps1

It's also possible to only accept files without extension with the --extensions argument by just not specifying any actual extension:

# Only accept files without extension
$ rak foo --extensions=

You can also specify one of the predefined groups of extensions. For instance, if you would like to only include Raku and Markdown files in your search, you can do:

# Only accept Raku and Markdown files
$ rak foo --extensions=#raku,#markdown

Note that the groups of extensions are prefixed with #. To get an up-to-date list of extension groups:

# List all known extensions
# rak --list-known-extensions

If there is no argument specified related to the basename of the file (any of the above here), then the content of each file will be checked to see if it looks like it contains text. Only if it looks like that, will it actually be included.

More peripherally

The rest of the filesystem filter arguments can be roughly divided into the following groups: by content, epoch, owner / group, numeric meta value, external program and by filesystem attribute. Again, you can see all of the needed information about these by doing:

# produce extensive help on filesystem filters
$ rak --help=filesystem

In any case, the end result of all of these filters is an internal list of files that will be checked for the pattern. You could think of this list as the haystack, and the pattern as the proverbial needle, as it were.

More on the haystack

Apart from specifying paths after the pattern, there is also a --paths argument. This is supposed to contain a comma separated list of paths. So these two invocations are equivalent:

# Search in the "lib" and "doc" directories
$ rak foo lib doc
$ rak foo --paths=lib,doc

The --paths argument allows you to save a set of paths with a shortcut (as we've seen in Customizing your options).

You can also store filenames and/or paths in a file, and specify that file to be taken as the haystack specification: the --paths-from=filename and --files-from=filename arguments. Each line of the specified file will be taken as either a file or path specification. The difference in handling is that if a file is specified on a line with --paths-from, it is accepted. If a directory is specified on a line with --files-from, then it will be ignored as not being a file. And either of these take "-" (aka a single hyphen) to mean to read from STDIN.

For open source developers, the --under-version-control argument may be of use. When used in a git repository, it will set up the haystack with all the files that are under version control.

More extensive help on these and other haystack arguments can be obtained by doing:

# produce extensive help on haystack specification
$ rak --help=haystack

Twisting the haystack

There is one argument that converts the haystack into a list with the paths of all the files in the haystack: --find. It changes the list of targets into a target itself, if you will. So instead of looking for the pattern in the contents of the files of the haystack, you'd be looking in the names of the files instead.

# Show all filenames that have "lib" in their name
$ rak --find lib

And if you just want a list of filenames, you can omit the pattern altogether:

# Show all filenames from current directory on down
$ rak --find

And what if you would just like to see the names of directories instead of files? Well, that'd be only legal way to use the --file argument as a negator:

# Show all directory names from current directory down
$ rak --find --/file

In any case, the --find argument is named after the Unix find command. I thought I'd mention that, if that wasn't clear just yet.

Conclusion

This concludes part 4 of a series of blog posts about rak.

It shows how you can instruct rak where to look for matches, to create a haystack if you will. By applying different acceptance rules for files and subdirectories, for instance by looking at extensions. It also shows how you can twist the haystack to just show filenames or the names of directories.

If you have any comments, find bugs, have recommendations / ideas, please submit them as issues at the App::Rak repository. If you would like to have a more direct interaction, you can visit the #raku-rak channel on Libera.chat.

Thank you (again) for reading all the way to the end!

Rakudo Weekly News: 2022.46 Rainbow Butterfly

Published by liztormato on 2022-11-14T14:41:24

Sometimes good ideas almost fall through the cracks! In September the TPRF suggested the idea of setting up a yearly Rainbow Butterfly Award to be awarded to the person who has done outstanding non-core support for the Raku Community / promotion of the Raku Programming Language (discussed in the Raku Steering Council meeting of 17 September).

Please consider who you would like to receive the Rainbow Butterly Award 2022 by sending your nomination by email to: rainbow@raku.org . And if at all possible, add your reasons as to why the nominated person should receive the award! Nominations will be accepted until the 1st of December, after which the Raku Steering Council will deliberately choose and announce the winner.

Thank you, Roman

Roman Baumer, one of the people keeping the Raku infrastructure up and running, has indicated that they do not have the time anymore to properly support the Raku infrastructure. This means that we’re looking for people willing to take care of the day-to-day running, as well be able to support new developments of the Raku infrastructure. Please send an email to infra@raku.org if you’re interested!

FOSDEM 2023

The FOSDEM organization has announced the developer rooms for the 2023 edition of FOSDEM on 4/5 February 2023 in Brussels. Followed by a Call for Participation. Check out the devrooms and figure out where you could do a presentation of some aspect of the Raku Programming Language that would be of interest of the people in that devroom. You can also claim some financial support by TPRF!

Anton’s Corner

Anton Antonov blogged about their new interface to the DeepL natural language translation server called Lingua::Translation::DeepL.

Elizabeth’s Corner

Elizabeth Mattijsen continued their blogging streak with the first 2 parts of a new series called “A gaze of iterators!”:

Steering Council

The minutes of the November 12th, 2022 meeting are available.

Weeklies

Weekly Challenge #191 is available for your perusal.

New Problem Solving Issues

Core Developments

Questions about Raku

Meanwhile on Twitter

For the foreseeable future, yours truly will continue to monitor Twitter.

Meanwhile on the mailing list

Comments about Raku

New Raku Modules

Updated Raku Modules

Winding down

A bit of a quiet week again, while plenty going on in the rest of the world.

This week’s picture shows nature at work again! To remind us Ukraine is still fighting the Russian aggression. Слава Україні!  Героям слава!

In the meantime, please keep staying safe, keep staying healthy, and keep up the good work!

If you like what I’m doing, committing to a small sponsorship would mean a great deal!

Elizabeth Mattijsen: Pushing the limits

Published by Elizabeth Mattijsen on 2022-11-13T09:49:20

This is the second part of the "A gaze of iterators!" series.

Pushing the limits

So let's again look at the list of methods the iterator on 42 has:

.say for 42.iterator.^methods'
# new
# pull-one
# push-exactly
# push-all
# skip-one
# skip-at-least
# count-only
# sink-all
# bool-only
# push-until-lazy
# push-at-least
# skip-at-least-pull-one
# is-lazy
# is-deterministic
# BUILDALL

Hmmm... .push-all looks interesting. Could it really be that simple? Let's see!

my @array;
42.iterator.push-all(@array);
say @array;  # [42]

And how about a list of values?

my @array;
<a b c>.iterator.push-all(@array);
say @array;  # [a b c]

Looking at this, you could realize that the above is just a convoluted way to write:

my @array = <a b c>;
say @array;  # [a b c]

And you'd be right again! And now you have a better idea of what goes on under the hood. Well, at least conceptually, because the actual implementation is of course at liberty to take short-cuts to improve efficiency.

Don't like that one

The next method on that list is .skip-one. Looks like it's mostly like .pull-one, so let's check:

my $iter = <a b c>.iterator;
say $iter.skip-one; # 1
say $iter.pull-one; # b
say $iter.pull-one; # c
say $iter.skip-one; # 0

So it looks like .skip-one really skips a value. But it also returns something? Indeed, it returns either 1 to indicate a successful skip (in this case "a" got skipped", and 0 for an unsuccessful skip (in this case because there is no value after "c").

Why not True and False you might ask? Well, these are all methods that work under the hood as efficiently as possible, and turning a native integer 1 into a boolean True would just be extra and unnecessary work.

What is it

Those two .is-lazy and .is-deterministic methods also look interesting:

say 42.iterator.is-lazy;           # False
say 42.iterator.is-deterministic;  # True

The .is-lazy method indicates whether the iterator is lazy or not.

In hindsight, the term "lazy" was probably a bad choice. The most obvious thing about "lazy" iterators, is that you cannot calculate the number of elements it will produce. So the term "countable" (while reversing the meaning of the returned value) would probably have been better.

say (1..*).iterator.is-lazy;  # True

is an example of a "lazy" iterator, of which you can not count the number of elements in a range of integers from 1 to infinity. If you try to do that with the .elems method, you will get an error:

say (1..*).elems;  # Cannot .elems a lazy list

If it wouldn't produce the error, it would hang because it would be producing values "ad infinitum" literally! Before being able to tell you the number of elements.

The .is-deterministic method indicates whether the iterator, given a certain source, will always produce the same values in the same order. The Raku internals can optimize certain situations if it knows whether the produced values will always be the same.

say (1..10).iterator.is-deterministic;          # True
say (1..10).pick(*).iterator.is-deterministic;  # False

Note the .pick method will produce the given values in a random order, so clearly not deterministic!

say (1..10).pick(*);  # (7 1 9 2 4 5 10 8 3 6)
say (1..10).pick(*);  # (4 6 8 9 2 10 7 5 1 3)

That's weird

What does .BUILDALL do? Actually, nothing that should concern you. The ALLCAPS of the method really indicates that there is something special going on!

The .BUILDALL method is a method that is automatically generated for every class, and it contains the default logic to initialize an object of that class. There is no source code for it: the definition of a class determines how that method will be directly generated into executable bytecode.

What are you sinking about?

Now, the other methods all have names that make sense, probably. But there is one method that appears to be different: .sink-all. Let's see what happens if we call that:

say <a b c>.iterator.sink-all;  # IterationEnd

That's informational? Not! But what did it do? In this particular case, it will mark the iterator as completed. Not very useful.

But there are other (very common) cases where calling the .sink-all method is very useful. Remember that a for loop is really just a .map of which the results are discarded?

my $seen = 0;
(1..10).map({++$seen}).iterator.sink-all;
say $seen;  # 10

The .sink-all method is used internally for those iterators that are just executed for their side-effects. So the above is just a very complicated way to write:

my $seen = 0;
++$seen for 1..10;
say $seen;  # 10

The term "sink" is the Raku equivalent for what other programming languages call "void context". But more about that later in this series.

Conclusion

This concludes the second part of the series, in which most of the other methods that you can call on an iterator have been explained. Specifically the .skip-one, .push-all, .is-lazy, .is-deterministic and .sink-all methods. With a side-order of .pick.

Questions and comments are always welcome. You can also drop into the #raku-beginner channel on Libera.chat, or on Discord if you'd like to have more immediate feedback.

I hope you liked it! Thank you for reading all the way to the end.

Elizabeth Mattijsen: A gaze of iterators!

Published by Elizabeth Mattijsen on 2022-11-11T10:36:12

This blog post provides an introduction to iterators in the Raku Programming Language.

It requires some basic understanding of Raku code. One could consider the Don't fear the grepper! series as a prerequisite for this series of blog posts.

Iterator Central

Iterators are at the basis of every type of iteration in the Raku Programming Language, except for loop (which uses a counter, or iterates indefinitely), while and until (which iterate while a condition is True / False).

Iterators are everywhere in Raku: all values and classes support having the iterator method called on them.

say 42.iterator;  # Rakudo::Iterator::ReifiedListIterator.new

So why would that say what it does? Well, as we've seen before, the say subroutine will call the .gist method on whatever it got. And the default (inherited) .gist method on instances of classes shows the name of the class and how it could possibly be created.

Same for any other class, even one of your own:

class Foo { }
say Foo.new;  # Foo.new

What can you use an iterator for?

Good question! Actually, in general you wouldn't be using an iterator yourself in your code. You would provide Raku with an iterator (usually implicitly), and let that do the work for you. In general.

But to get the feel of what an iterator can do, we're going to tinker with iterators a bit in this series of blog posts. Just to get a feel of what is going on under the hood, as it were.

Looking on the inside

First of all, one would like to know which methods you can call on an iterator object. Fortunately, the Raku Programming Language has many introspection capabilities.

One of them is the .methods method on the meta-object of the iterator. Don't think too much about that at this point: just know that there's a special syntax for calling a method on the meta-object of an object: .^method-name.

So let's see what the iterator on 42 can do:

.say for 42.iterator.^methods;
# new
# pull-one
# push-exactly
# push-all
...

.new? There's nothing new about that? Well, yes and no. The fact that it is listed here, means that the class has its own (not inherited) method new. Whether that is useful information, is up to the reader!

The next one is .pull-one. Let's see what happens if we call that on the iterator object:

say 42.iterator.pull-one; # 42

I guess you could hardly call that surprising. But what happens if you would call the .pull-one method for a second time?

my $iter = 42.iterator;
say $iter.pull-one; # 42
say $iter.pull-one; # IterationEnd

Hmmmm... what's this IterationEnd you say? It's a very special sentinel value that indicates that the iterator is done producing values. And that you should not call any methods on the iterator anymore (as the results will be undefined).

Ok, so let's try this again, this time with a small list of values:

my $iter = <a b c>.iterator;
say $iter.pull-one; # a
say $iter.pull-one; # b
say $iter.pull-one; # c
say $iter.pull-one; # IterationEnd

Or, shorter:

my $iter = <a b c>.iterator;
say $iter.pull-one for ^4;
# a
# b
# c
# IterationEnd

Pulling until it's done

Now that we know that the final value is IterationEnd, we should be able to write a loop checking for that value, right? And end the loop on that? Indeed we can! But it requires some special care:

my $iter = <a b c>.iterator;
until ($_ := $iter.pull-one) =:= IterationEnd {
    .say;
}
# a
# b
# c

That's maybe a lot of colons all of a sudden!

The first colon is in := which is the binding operator. It aliases the left side with the right side: in this case with the topic $_. It's to make sure that we're going to compare the actual value directly, rather than a value in a variable (as values in variables may actually appear differently to the outside world, if they want to).

The second one is the =:=, the identity operator. It checks whether both sides refer to the same item in memory. Whether they are really the same object.

Looking at this code, you might realize that this is actually a convoluted way to write:

for <a b c> {
    .say
}

And you'd be right: what you see above is more or less essentially what is happening under the hood. Of course, reality is a bit more complicated: in this example we for instance didn't account for handling any loop phasers (FIRST, NEXT and LAST). But the basic principle is the same!

And because every value and every class can take a call to the iterator method, you should understand now why this works:

for 42 {
    .say
}
# 42

Indeed, because you can call the .iterator method on any class or object!

Conclusion

This concludes the first part of the introduction to iterators, and possibly to the Raku Programming Language.

It introduced the iterator and ^methods methods, as well as the pull-one method and the special IterationEnd sentinel value for iterators. And it casually introduced the := binding operator and the =:= identity operator.

Questions and comments are always welcome. You can also drop into the #raku-beginner channel on Libera.chat, or on Discord if you'd like to have more immediate feedback.

I hope you liked it! Thank you for reading all the way to the end.

Rakudo Weekly News: 2022.45 Evaluersion

Published by liztormato on 2022-11-07T12:58:36

Anton Antonov has published a blog post (and associated video) about the conversion and evaluation of Raku files (/r/rakulang comments) into Mathematica and Jupyter notebooks.

Elizabeth’s Corner

Elizabeth Mattijsen continued their 4-week blogging streak with:

Weeklies

Weekly Challenge #190 is available for your perusal.

New Problem Solving Issues

New Pull Requests

Core Developments

Questions about Raku

Meanwhile on Twitter

For the foreseeable future, yours truly will continue to monitor Twitter.

Meanwhile on the mailing list

Comments about Raku

New Raku Modules

Updated Raku Modules

Winding down

A bit of a quiet week, but with some amazing RakuAST developments!

This week’s shows nature at work, even when the weather is definitely becoming more gloomy. To remind us Ukraine is still fighting the Russian aggression. Слава Україні!  Героям слава!

In the meantime, please keep staying safe, keep staying healthy, and keep up the good work!

If you like what I’m doing, committing to a small sponsorship would mean a great deal!

Elizabeth Mattijsen: Sequencing for the win!

Published by Elizabeth Mattijsen on 2022-11-04T12:27:38

This is the 6th and final part of the "Don't fear the grepper!" series.

Storing results

In all of the previous instalments of this series of blog posts, the result of a .grep or .map operation was always immediately shown with say.

The say subroutine is supplied by the Raku core: it calls the .gist method on the given object(s), which is expected to give you a... gist (as in "the general meaning of a text").

For example:

say (1..5).map(* + 2);

shows:

(3 4 5 6 7)

Note that the gist of the result of the .map is shown with parentheses to give you an idea of the listiness of the result.

All objects in Raku have a .gist method, inherited from Any. If you don't like the gist it produces for your classes, you will need to provide your own method gist.

You can also store the results of a .grep or .map in an array:

my @result = (1..5).map(* + 2);
say @result;

which shows:

[3 4 5 6 7]

Note that this shows the result using square brackets. That's because the .gist method on Array objects uses square brackets. Which is otherwise all pretty straightforward.

You can even see the values as they're being calculated if you want to:

my @result = (1..5).map({ say "calculating $_"; $_ + 2});
say "stored";
say @result;

which shows:

calculating 1
calculating 2
calculating 3
calculating 4
calculating 5
stored
[3 4 5 6 7]

And if you're not interested in the complete result, you could just ask for it to show the first element. And as indexing in Raku is zero-based, that would be index 0:

my @result = (1..5).map({ say "calculating $_"; $_ + 2});
say "stored";
say @result[0];

which would show:

calculating 1
calculating 2
calculating 3
calculating 4
calculating 5
stored
3

In this case, all possible values were calculated and actually stored in the array @result even though you were only interested in the first value (the first element in the array). Which may be costly if there are a lot of values to calculate.

Being efficient

But couldn't you store the result in a scalar variable, and get the same result? Yes, you can, but the flow of execution and the result would be subtly different:

my $result = (1..5).map({ say "calculating $_"; $_ + 2});
say "stored";
say $result;

which would show:

stored
calculating 1
calculating 2
calculating 3
calculating 4
calculating 5
(3 4 5 6 7)

Note that we're back to showing the final result using parentheses. That's really because we're in fact just "gisting" (as one would say as an experienced Rakoon) the result of the .map like the original example.

What is more important to note, is that "stored" appears before you can see the values being calculated. It almost looks like the .map is not getting executed, until we actually need to make a gist of it in order to be able to say it. And you'd be right!

This is one of the properties of the Raku Programming Language: it tries to do as little as possible, and only do the stuff that's needed when its needed.

But you may ask, why did it fill the array completely in the example with @result? In short, That's because it was decided that when an array is assigned to, it will keep filling until the right hand side of the assignment has been exhausted.

Actually, it's a little more general than that, but this should be enough explanation for now

So you can think of:

my @result = (1..5).map({ say "calculating $_"; $_ + 2});

as:

my @result;
for (1..5).map({ say "calculating $_"; $_ + 2}) {
    @result.push($_);
}

It was decided that any other sort of (default) behaviour would have been too confusing for people used to other programming languages.

It's an object

Remember:

Everything in Raku is an object, or can appear to be one

The result of a .map is also an object. And you can use the .WHAT method to interrogate what kind of object something is. Some examples:

say 42.WHAT;     # (Int)
say "foo".WHAT;  # (Str)
say (1..5).WHAT; # (Range)

Note that .WHAT returns the type object (aka the class) of an object. And the .gist method for type objects puts parentheses around the name as an indicator it is a type object.

So what type of object is returned by .map?

say (1..5).map({ .say; $_ + 2}).WHAT; # (Seq)

A Seq object. Aha!

Note that calling .WHAT on the Seq object was completely silent otherwise. That's because it didn't execute anything. Because it didn't need to. Because it just interrogated meta-information about the Seq object.

Indexing a Seq

So what will happen if you want to see the first element only of a Seq object stored in a scalar variable? You can use the same indexing as we did on the @result array, because Seq objects understand that:

my $seq = (1..5).map({ say "calculating $_"; $_ + 2});
say "stored";
say $seq[0];

which shows:

stored
calculating 1
3

Wow. It only calculated a single value! Yes, here Raku could be, and actually was, as efficient as possible, because you only needed the first element. But what if you also want the third element?

my $seq = (1..5).map({ say "calculating $_"; $_ + 2});
say $seq[0];
say $seq[2];

shows:

calculating 1
3
calculating 2
calculating 3
5

As you can see, it doesn't re-calculate the first element again. So yes, it looks like it is cached somewhere. And you'd be right again. As soon as you use indexing on a Seq, it will create a hidden array that will be used to cache previously calculated values. Technically, that's because the Seq class performs the PositionalBindFailover role.

From here to infinity

It's this efficiency in Raku that allows you to actually specify Inf or Whatever as the end-point of the range in our example:

my $seq = (1..*).map({ say "calculating $_"; $_ + 2});
say $seq[0];

which shows:

calculating 1
3

without being busy calculating all values until the end of time or memory.

Grep the mapper

One nice feature of Seq, is that you can call .grep (or .map for that matter, or vice-versa) on it as well. So let's go all the way back to the initial example of filtering on even numbers. In the following example, we first create a Seq object with .map, then create a new Seq object using .grep on that. And then show the first three elements ([0..2]):

my $seq = (1..*).map({ say "map $_"; $_ + 2});
$seq = $seq.grep({ 
    say $_ %% 2 ?? "accept $_" !! "deny $_"; 
    $_ %% 2
});
say $seq[0..2];

which shows:

map 1
deny 3
map 2
accept 4
map 3
deny 5
map 4
accept 6
map 5
deny 7
map 6
accept 8
(4 6 8)

This shows that all values are produced one-by-one through the chain as efficiently as possible.

Now how that all works under the hood, is going to be the subject of a slightly more advanced series of blog posts, tentatively titled "A gaze of iterators".

Conclusion

This concludes the sixth and final part of the series.

It shows that the Raku Programming Language has a Seq object that is responsible for producing values. And that every object has a .WHAT method that gives you the type object of an instance, and a .gist method. While sneakily introducing the ternary ?? !! operator.

Questions and comments are always welcome. You can also drop into the #raku-beginner channel on Libera.chat, or on Discord if you'd like to have more immediate feedback.

I hope you liked it! Thanks again for reading all the way to the end.

gfldex: Recursive Cinderella

Published by gfldex on 2022-10-03T21:51:28

PWC 184 asked to split an array into numbers and letters.

for ( 'a 1 2 b 0', '3 c 4 d'), ( '1 2', 'p q r', 's 3', '4 5 t') -> @a is copy {
    @a.=map: *.split(' ').cache;
    my @numbers = @a.deepmap({ / <[0..9]> / ?? .Numeric !! Empty }).grep(+*)».Array;
    my @letters = @a.deepmap({ / <[a..z]> / ?? .Str !! Empty }).grep(+*)».Array;
    say @numbers, ‘ and ’, @letters;
}

I use deepmap with Empty to separate to weed from the chaff and remove the extra structure added by deepmap with a grep. That extra structure raised the question, what would be needed to split a tree in twain. This could be useful to gain parts of a XML Document with a single parser pass, while maintaining the structure of the tree. Relation of child nodes and parent nodes often carries meaning.

for ( 'a 1 2 b 0', '3 c 4 d'), ( '1 2', 'p q r', 's 3', '4 5 t'), [ '1', [ [ 'a' ], '2 b'], '3 c' ] -> @a is copy {

    multi sub cinderella(@data, @a, $needle-a, @b, $needle-b) {
        for @data {
            @a[$++] := my $new-a;
            @b[$++] := my $new-b;
            cinderella($_, $new-a, $needle-a, $new-b, $needle-b)
        }
    }

    multi sub cinderella(@data, Any:U $a is raw, $needle-a, Any:U $b is raw, $needle-b) {
        my @new-a;
        my @new-b;
        for @data {
            cinderella($_, $a, $needle-a, $b, $needle-b);
            @new-a.push: $a;
            @new-b.push: $b;
        }
        $a = @new-a;
        $b = @new-b;
    }

    multi sub cinderella(Str:D $s, $a is raw, $needle-a, $b is raw, $needle-b) {
        cinderella($_, my @new-a, $needle-a, my @new-b, $needle-b) for $s.split(' ');
        $a = @new-a ?? @new-a.join(' ') !! Empty;
        $b = @new-b ?? @new-b.join(' ') !! Empty;
    }

    multi sub cinderella(Str:D $s where *.chars == 1, @a, $needle-a, @b, $needle-b) {
        given $s {
            when $needle-a { @a.push: $s }
            when $needle-b { @b.push: $s }
            default { fail('dunno where to put: ' ~ $s) }
        }
    }

    cinderella @a, my @numbers, / <[0..9]> /, my @letters, / <[a..z]> /;

    dd @numbers, @letters;
}


# OUTPUT: Array @numbers = ["1 2 0", "3 4"]
          Array @letters = ["a b", "c d"]
          Array @numbers = ["1 2", Empty, "3", "4 5"]
          Array @letters = [Empty, "p q r", "s", "t"]
          Array @numbers = ["1", [[[],], "2"], "3"]
          Array @letters = [[], [["a"], "b"], "c"]

The leaves in the target tree can be either a Str or (empty) Array. However, the first multi candidate doesn’t make the decision what item is added to the target Arrays. Instead I create a fresh scalar container and bind it to a container within the Arrays. I then pass that container on to the next multi-candidate where it might be filled with Positional– or Str-object. Please note the distinction. MMD doesn’t care about the container type we use, it looks for the properties of the value. This allows me to split the strings on a white space and pass it on into the next round of MMD matches. The 2nd candidate handles the case where we descent into a nested Array. It can manipulate the scalar container created with @a[$++] := my $new-a; and turn it into a Positional value (here an Array), because that initial container is pass into the multi with is raw.

This is a very powerful concept. Writing the same with a single recursive function would contain a lot of special casing and be no join to debug.

Not doing what the instructor asked for seems to produce better results. I shall do so more often.

vrurg: Did you know that …

Published by Vadim Belman on 2022-10-03T00:00:00

Let’s assume we have a type with multi-component name, like:

class Foo::Bar {
}

And there is another class Baz for which we want it to be coercible into Foo::Bar. No problem!

class Baz {
    method Foo::Bar() { Foo::Bar.new }
}

Now we can do:

sub foo(Foo::Bar() $v) { say $v }
foo(Baz.new);

gfldex: Rabbitholeing

Published by gfldex on 2022-09-29T19:37:17

With PWC 182-2 specifically asking for Linux paths to be handled, we need to resolve issues like /../ and symbolic links. Since I didn’t feel like putting a directory called a into my root folder, I wrote a little helper that deals with some of the tripwires that modern filesystems provide.

my @input = qw <
    /a/b/c/1/x.pl
    /a/b/c/d/e/2/x.pl
    /a/b//c/d/3/x.pl
    /a/b/./c/4/x.pl
    /a/../a/b/c/d/5/x.pl
>;

sub resolve(Str:D() $s){
    my @parent-refs = (my @parts = $s.split(rx{ ‘/./’ | ‘/’+ })).pairs.grep(*.value eq '..')».key;
    @parent-refs = flat @parent-refs, @parent-refs »-» 1;
    @parts[@parent-refs]:delete;
    @parts.join(‘/’)
}

# OUTPUT: /a/b/c/1/x.pl /a/b/c/d/e/2/x.pl /a/b/c/d/3/x.pl /a/b/c/4/x.pl ///a/b/c/d/5/x.pl

The last path starts with a triple root, because join assumes holes are actually there. It won’t skip fields that are Empty either, so is default(Empty) doesn’t help. To actually remove elements from an Array we are better of with splice. Since this method doesn’t return self, we can’t just @parts.splice(@parent-refs.any,1).join(‘/’). Both ways to remove elements are mutators and eager. That doesn’t fit well into the rest of the language and spells doom for concurrency. To find a solution I had to go down the rabbit hole that is iteration in Rakudo. The bottom happens to be located in Rakudo/Iterator.pm6.

method pull-one() is raw {
  nqp::ifnull(
    nqp::atpos($!reified,++$!i),
    nqp::if(
      nqp::islt_i($!i,nqp::elems($!reified)), # found a hole
        self!hole($!i),
        IterationEnd
      )
   )
}

So a hole in an Array is just a null-pointer in C-land — given that we didn’t overshoot the end of the Array. With that knowledge, building an Iterator that skips elements becomes rather simple.

multi sub prune(@a, Int:D $i --> Seq:D) {
    prune @a, $i .. $i
}

multi sub prune(@a, +@l is copy --> Seq:D) {
    @l = @l.sort; # this makes checking the index against the needles simpler

    Seq.new: class :: does Iterator {
        has int $!i;
        has $!reified;
        submethod !SET-SELF(\arr) {
            $!reified := nqp::getattr(@a,List,'$!reified');
            $!i = -1;
            self
        }
        method new(\arr) { nqp::create(self)!SET-SELF(arr) }
        method pull-one is raw {
            loop {
                ++$!i;
                if @l {
                    @l.shift while +@l && $!i > @l[0].max;
                    next if +@l && @l[0].min ≤ $!i ≤ @l[0].max;
                }
                return nqp::ifnull(
                    nqp::atpos($reified, $!i),
                    nqp::if(
                        nqp::isge_i($!i, nqp::elems($reified)),
                        IterationEnd,
                        next # we actually got a hole
                    )
                );
            }
        }
    }.new(@a)
}

@a = ('a' .. 'z').List;

dd @a.&prune( 25 );
dd @a.&prune( 10..15 );
dd @a.&prune( (2,3,10..15, 21..22, 25).pick(5) ); # randomising for testing
# OUTPUT: ("a", "b", "c", "d", "e", "f", "g", "h", "i", "j", "k", "l", "m", "n", "o", "p", "q", "r", "s", "t", "u", "v", "w", "x", "y").Seq
#         ("a", "b", "c", "d", "e", "f", "g", "h", "i", "j", "q", "r", "s", "t", "u", "v", "w", "x", "y", "z").Seq
#         ("a", "b", "e", "f", "g", "h", "i", "j", "q", "r", "s", "t", "u", "x", "y").Seq

The idea is to loop by default and bail out of that loop if the current index held in $!i is not found in any needle. Since we don’t got real goto in Raku, loop/next becomes a decent substitute. As I don’t really understand what a unreified Array is, I’m right now not qualified to provide a PR.

Dealing with lazy and infinite lists makes Array a beast. I shall not falter until I have tamed it!

gfldex: Valid temperatures

Published by gfldex on 2022-09-05T20:19:06

PWC 181 Task 2 asks for input validation of dates and numbers. OK, it should ask for input validation so my solution isn’t totally over the top. Better safe then sorry.

my $data = q:to/EOH/;
2022-08-01, 20
2022-08-09, 10
2022-08-03, 19
2022-08-06, 24
2022-08-05, 22
2022-08-10, 28
2022-08-07, 20
2022-08-04, 18
2022-08-08, 21
2022-08-02, 25
EOH

$data.lines()
==> map(* ~~ / $<date> = [ \d+ ‘-’ \d?\d ‘-’ \d?\d ] \s* ‘,’ \s* $<temperature> = [ '-'?\d+ [ '.' \d+ ]? ] /)
==> map(-> (Date(Str:D(Match)) :$date, Numeric(Str:D(Match)) :$temperature, *%) { [ $date, $temperature ] })
==> sort(*.first)
==> map( -> [$date, $temperature ] {
    state $last-temp;
    LEAVE $last-temp = $temperature;
    once next;
    „$date, Δ{abs $last-temp - $temperature}°C“ if $temperature > $last-temp;
})
==> -> *@a { @a.join($?NL) }()
==> put();

First I use a regex with named captures to get hold of the date and temperature Strs. Those are not overly helpful, as I need to sort by date. The 2nd map solves that problem with nested coercions in a destructuring sub-signature. As Match contains a few extra methods that I don’t care for, I need to gobble those up with an anonymous hash slurpy. Now I have a LoL with a date and temperature sub-list, where the date is a Date and the temperature is something Numeric. I sort by the first and then destructure again. I use a state container to keep a link between consecutive iteration steps. (To make this hyper-friendly, we would need to .rotor.) I always want $last-temp to contain the $temperature of the previous iteration but I don’t want to do anything else in the first iteration. Hence, the LEAVE-phaser and the once. Anything else means to create the output text. Since I don’t want a combo breaker, I use a pointy-block like we need to in a feed-operator-chain to add newlines where they belong.

If you managed to stuff more Raku-features into your solution, you out-convoluted me and I would love to see your code.

Joking aside, I believe to have another candidate for macro-ideas.txt. If we look at the first and second map, there is a pattern. We use the same named-arguments in both. With RakuAST it might be possible to provide a Regex, a block and a list of coercions, to produce a Sub that does that in one go. That would be a neat shortcut for simple parsers that deal with lines of text.

gfldex: Assuming optionality

Published by gfldex on 2022-09-04T13:52:56

PWC 180 Task 1 asks us to find the first unique character in a string. I wanted to have a nice interface where I would write:

say $str.comb.first: &unique-char;

The idea was to curry postcircumfix:<{ }> so it will be bound to a BagHash and always ask for :!exists. Alas, .assuming doesn’t do the right thing if the proto contains optional positions. I found a workaround utilising once.

for ‘Perl Weekly Challenge’, ‘Long Live Perl’ -> $str {
    my &unique-char = { (once $str.comb.BagHash»--){$_}:!exists }
    say $str.comb.first: &unique-char;
}

I don’t want to build the BagHash and remove single elements every time unique-char is called. There is a slight overhead when using once but that beats .assuming by a mile.

Given all the special cases Signatures provide, we may want to consider turning .assuming into a RakuAST-macro.

gfldex: @data».sparkle

Published by gfldex on 2022-08-28T21:32:20

While reading solutions to the PWC, I spotted a pattern. There where plenty of .map-calls that contained only simple maths. It must not be so! Task 179-2 leans itself to use vector operations.

use v6.d;

constant @sparks = "▁" .. "█";
constant \steps = +@sparks - 1;

multi MAIN(
    #| space separated list of numbers
    *@data
) {
    my $scale := @data.&{ (.max - .min) / steps };
    put @sparks[((@data »-» @data.min) »/» $scale)».round].join;
}

multi MAIN(
    #| output an example sparkline-graph
    Bool :$test
) {
    .&MAIN for <2 4 6 8 10 12 10 8 6 4 2>, <0 1 19 20>, <0 999 4000 4999 7000 7999>;
}

First, I define a character range (no need for .ord here) and store the number of bar-graph steps. Then I calculate the scale like it is used in a map (those paper-things, that always work and don’t break when dropped). Now I can use that scale to shrink (or grow) values. I believe rounding is better here then to cut of to integers.

When we feed a list to a postcircumfix:<[ ]> we get a list of values in return. However, we break the method-call chain when we do so. Since operators are just Subs with a funny syntax, we can solve that issue easily.

constant &sparks = {
    constant @sparks = "▁" .. "█";
    constant \steps = +@sparks - 1;

    &postcircumfix:<[ ]>.assuming(@sparks) but role :: { has $.steps = steps }
}.();

my $scale := @data.&{ (.max - .min) / &sparks.steps };
((@data »-» @data.min) »/» $scale)».round.&sparks.join.put;

The constants are hidden within a block-scope and only steps is exposed as part of a Subs “namespace”. By declaring &sparks as constant, I move the currying to compile time.

Hypers are not getting much use in PWC-solutions. I believe another entry in raku-did-you-know is in order.

p6steve: raku & rust: Option-Some-None

Published by p6steve on 2022-08-16T18:01:46

Regular visitors to my blog will know that I think raku and rust are both awesome in their chosen niches and are natural companions for the modern programming era just as perl and C were back in the day.

Coming off an excellent 2nd raku conference over the weekend, I got to thinking about how both languages handle the concept of “nothing” (the absence of a value) and wanted to have some -Ofun.

The rust idiom

First I googled some rust code example to remind me of the rust Option / Some / None idiom:

enum Option<T> {
    None,
    Some(T),
}

let mut opt: Option<i32>; 
opt = Some(1); 
//opt = None; 
match opt {
    Some(x) => {
        println!("Got {}", x); 
    }   
    None => {
        println!("Got nothing");
    }   
}

Here’s what the Rust Book has to say

Sometimes it’s desirable to catch the failure of some parts of a program instead of calling panic!; this can be accomplished using the Option enum.

The Option<T> enum has two variants:

https://doc.rust-lang.org/rust-by-example/std/option.html

The raku idiom

Then I compared to the usual raku idiom using Nil – of course since raku is gradually typed the most natural idiom for a quick 4 liner is no types.

my $opt;
$opt = 4;
#$opt = Nil;

with $opt { say "Got $_" } else { say "Got nothing" }

This may seem pretty cool and uncontroversial – but a lengthy debate “much ado about nothing” has been had over on github recently.

Which was enriched by a typically evocative insight from Larry Wall:

Nil is a kind of singularity. Black holes have no hair, and trying to fit them neatly back into the fabric of the universe results in contradictions. Similarly, attempting to fit Nil back into the normal type system is bound to cause problems, which is why we have Raku’s Nil rather than Perl’s undef. Perl programmers often fall into the trap of trying to store undef as a normal value, and get themselves into trouble down the road.

Black holes can also have an accretion disk (and maybe a “firewall”), which records some of the history of what is falling into the black hole. This is what a Failure does; it records some of the history of a singularity. The stuff still falling in is where black holes can keep their “hair”, at least temporarily.

The design of Nil with respect to Failure was primarily a cultural decision, not a technical decision. We do not want people thinking of Nil as a value, or as a type from which other user-defined types can be derived, other than user-defined Failures. We want them to think of Nil as the least-marked form of failure, so that they never treat Nil as some kind of nuanced success marker, or try to create a parallel system of nothingness that pretends to be storable as a value.

We could conceivably redefine Nil and Failure as both forms of Singularity (the property which subverts the return type system of the universe by the permanent absence of a return value), but the universe discourages people from creating their own singularities, and I suspect we should too.

Hope this helps… Larry

https://en.wikipedia.org/wiki/Larry_Wall

All your base are belong to us (the rust idiom in raku)

Not entirely content with this, I wondered if the raku (gradual) type system could do the more structured rust Option-Some-None thing. Here’s where I got to after a couple of hours hacking:

subset Option of Any where Some|None;

my Option $opt;
$opt = Some.new(7);
#$opt = None;
given $opt {
    when Some {
        say "Got {.v}"  
    }   
    when None {
        say "Got nothing"
    }   
}

Admittedly I only did Some[Int] via raku parametric roles and I should have followed suit with Option[Int] … but I am content to leave that as an exercise for the reader.

Here’s all the code to implement this with a few comments to show off the raku cool stuff:

role Things {                   #a utility Role for both flavours of Some

    method gist {
        "Some[{$.v.^name}]"     #
    }                           # .gist and .raku are used by .say and .Str
                                # methods ... so we override them to make nice
    method raku {               # output
        "Some[{$.v.^name}]"
    }

    method Str {             
        ~$.v                    # ~ is the Str concatenate operator, when used
    }                           # as a prefix it coerces its argument to (Str)

    method Numeric {
        +$.v                    # ~ is the addition operator, when used as a
    }                           # prefix it coerces its argument to (Num)
}

role None {}                    # roles are cool labels

role Some[::T] does Things {    # a parameterized role with a type capture
    has T $.v is required;      # using the type capture for a public attr

    multi method new( $v ) {    # ensure that the value is defined
        die "Died with undefined value" without $v;
        self.new: :$v
    }
}

role Some does Things {         # role are multis (Some[Int].new and Some.new)
    has $.s is required;        # require the attr ... if absent, fail
    has $.v;
    
    multi method new( ::T $v ) {   # use the type capture in a signature
        die "Died with undefined value" without $v; 
        self.new: s => Some[(T)].new(:$v)
    }   

    submethod TWEAK {           # late stage constructor alias $.v to $.v
        $!v := $!s.v            # for the Some Things role
    }   
}

#err - that's it!

### some tests...
my $x = Some[Int].new(42);
say "$x is ", $x;               # 42 is Some[Int]
say $x + 2;                     # 44 

my $y = Some[Rat].new(3.2);     # 3.2 is Some[Rat]
say "$y is ", $y; 

my $z = Some.new(2e1);
say "$z is ", $z;               # 20 is Some[Num]
say $z + 2e0;                   #22 

#my $a = Some.new();            #Died with X::Attribute::Required
#my $b = Some.new(Nil);         #Died with undefined value

This example shows how raku can be setup to ape another type system fairly closely in only 42 lines of code. Although you may have to resort to a bit more sleight of hand and effort to get rid of the .new method and to bring in Option[::T] in synch with Some[::T] – but not bad for a blog post.

Some of the things I like about raku that contribute to this are:

As usual, please comment here or over at the raku reddit page or the raku irc (also via discord) – let me know if any rustaceans would like to join in the -Ofun!

~p6steve

p6steve: TRC Slides

Published by p6steve on 2022-08-14T19:16:32

Attending The Raku Conference – https://conf.raku.org/2022/schedule

Here is my talk…

gradual-types-for-pandasDownload

Useful Links

Demo video …. here

rakudo.org: Rakudo compiler, Release #157 (2022.07)

Published on 2022-07-31T00:00:00

p6steve: Is raku Dan RubberSonic?

Published by p6steve on 2022-07-24T16:02:56

Intro

Raku is a great software language that draws on the scripting heritage of perl. The newly minted raku Dan modules provide Data ANalytics (geddit?) capabilities to raku for data science and engineering use cases. (Intro slides & demo video here). This is one in an occasional series of blog posts that seek to explore the whys & wherefores of raku Dan.

I you want to check out more, warts and all, visit /r/rakulang at Reddit and ask away!

Disclosure: I am the author of the raku Dan module family. ~p6steve

Why Dan?

It is tempting to say we need Dan for Raku just like Python needs Pandas – to meet the needs of the large community of Data Scientists. And there has been some healthy debate about this in the raku community around how this is valuable functionality that is otherwise lacking. Raku Dan is a necessary, but not a sufficient reason to switch.

The vast majority of Data Scientists are Python+Pandas users. To them, the question Why raku::Dan? is synonymous with the question Why raku? or even Why not Python?

Why not Python?

The majority of this vast majority is either (i) preoccupied with ascending the Python+Pandas learning curve, (ii) preoccupied with getting the data science bit to work (the problem) and not the toolset, (iii) value their team’s time investment in getting here and do not wish to risk a new and unproven path or (iv) simply cannot imagine a better way than Python+Pandas. There is no compelling reason to change that outweighs the perceived negatives and so they won’t.

A significant minority of this vast majority is looking for a better way. Some value speed and are trying Python+Polars. Changing to a similar, but faster library than Pandas is less disruptive than changing language.

Some are trying a purpose-built Data Science language like Julia to get both improved speed and a domain-oriented coding experience.

It seems that the reasons for making a change are a balance of:

SIDEBAR:- Notably, there is some limit to “raw speed at all costs” for most changers – otherwise folk would always choose R+data.table or Rust+Polars (the Polars library is written in Rust and uses Arrow2 fast columnar structures – bypassing Python would be an obvious speed up that would avoid the GIL and other overhead).

How Sonic is Raku?

Here, I use the term ‘Sonic’ to reflect raw speed.

Raku is not renowned for its build or execution speed – it’s a scripting language with on-the-fly build-and-run capability, after all. However, it does have very powerful & deep concurrency support such as hyper and map/reduce operations which are available in raku Dan from the outset:

say ~my \quants = Series.new([100, 15, 50, 15, 25]);
say ~my \prices = Series.new([1.1, 4.3, 2.2, 7.41, 2.89]); 
say ~my \costs  = Series.new( quants >>*<< prices );

Yes – you can hyper a Dan::Series!

There are also some promising developments in sight for Raku (such as AST) and better VM optimisations.

This table illustrates how raku fits:

Purpose1990s2010snext…?TechnologyResult
Script codePerlPython3RakuInterpreter / VMFlexible
System codeCC++RustCompilerFast

Polars is fast because Rust is fast. I love Rust and I believe that Rust and Raku is a great fit. That’s why raku Dan::Polars (a binding from Dan to Rust) is a vital member of the raku Dan module family and I am looking forward to its intro soon. With raku Dan::Polars we will get the same access to Rust and Arrow2 raw speed that Python/Polars has.

SIDEBAR:- a Rust compile (‘cargo build’ specifically) for dan/src/lib.rs (the custom Rust library that connects raku Dan::Polars to Rust/Polars) takes 2m 51s from scratch and 19.31s if I make a small change to the source file (incremental build). In contrast, the similar sized raku Dan/Polars.rakumod build never has a comparable initial compile overhead and an incremental change takes only 1.714s to recompile and run. Boy do I love scripting languages that avoid that 19s dev cycle time!!

So – raku is about as Sonic as Python.

How Rubber is Raku?

Here, I use the term ‘Rubber’ to reflect flexible coding experience.

I have spent the last 6 months learning to Rust. Rust is cool. But Rust is hard. Despite the alleviations of generic types and traits and so on, Rust is a strongly typed world (as you would expect from something that can make system code safely).

But, a Rust newby like me can spend ages to hammer down some type errors and untangle the <T>, into(), clone(), mut, Some(None), unwrap() knots just to get a few lines of code working. Strong typing means thinking as much about the language imposed patterns as about the problem-solution.

Obviously I am born scripter. I like to write a line or two that works in practice and then grow my code from there, refactoring with more structure as I start to repeat stuff and I need to harden against errors. So the raku model of “get some code working, then gradually layer in type safety as required” is a very natural fit.

This, I submit, is a crucial distinction for Data Science coders.

Strong typing puts substantial cognitive load on the coder and makes dealing with real-world data sources very onerous.

No wonder Python became the ‘darling child’ of the data scientist community. No wonder other strongly typed, though performant, languages of the day like Java or C++ didn’t pass muster. Simply put, Data science needs a flexible coding experience – it’s where the rubber hits the road.

So – raku is about as Rubbery as Python.

Why Raku?

Let’s say I have 5m sales records – maybe from a csv or some historic data files or data warehouse. Can I be sure that there are absolutely no errors such as (e.g.) ‘1’ (Int) instead of ‘1.0’ (Double), or an ‘l’ (Str) or maybe ‘𝟙’ (Unicode)?

Let’s see how that looks in raku:

> my @a = <1 1.0 l 𝟙>
[1 1.0 l 𝟙]                    #look I can suck up anything
> @a.map(*.^name)
(IntStr RatStr Str IntStr)     #I can just store them as allomorphs
> @a.are;
(Str)                          #and check for common parent type
> @a.grep(Real)
(1 1.0 𝟙)                      #I can extract the numbers
> @a.map({$_~~Real ?? .Num !! NaN})
(1 1 NaN 1)                    #or make them all Num (f64)

The .are method picks the narrowest type that is common to all the array items.

And it plays in the super well-structured raku gradual type system:

OK – so now I have a hierarchical type space for progressive munging and a set of type tests and an easy way to do type coercion / error rejection.

Raku also let’s me step up the enforcement by gradually adding types to my data structures.

Let’s gradually introduce typed variables. my Num @r = @a declares a new array @r whose items must be of type Num. It will reject other item types. This is a powerful way to specify and manage a “contract” between data capture / data munging phases and later data analysis code.

> my @a = <1 1.0 l 𝟙>
[1 1.0 l 𝟙]
> my Num @r = @a
Type check failed in assignment to @r; expected Num but got IntStr (IntStr.new(1, "1"))
  in block <unit> at <unknown file> line 1

Now, I can clean up my act like this (i) to replace unreal items with NaN and then (ii) to coerce all the remaining items to Num. I already know that there will be no coercion failures since they are all matched type Real.

> @a.map({$_=NaN if not $_~~Real})
(NaN)
> my Num() @r = @a
[1 1 NaN 1]

Here are some of the bits of raku I am using to do the work:

Many other horrors lurk in data munging and capture. Using raku and its gradual type system is a great way to contract and curate a data pipeline.

Here we have only scratched the surface with the full range of capabilities of raku and gradual types. What about DateTime formats? What about text extraction, regexes and unicode? So a lot more for next time…

Raku is RubberSonic

In summary, this blog has been trying to work out why raku::Dan in particular and therefore raku in general may have resonance with some the community of Data Scientists that typically use Python+Pandas today.

It has shown that when it comes to raw speed, Raku (when equipped with Dan::Polars) is about as Sonic as Python.

It has shown that when it comes to a flexible coding experience, Raku is about as Rubbery as Python.

This matters, I think, because the RubberSonic combination places raku Dan squarely in the tradition of the first Python/Pandas pioneers. It is a very good impedance match for the needs of data scientists – where the language does not get in the way of the solution.

Here I have set out a case for raku Gradual Typing and other unique capabilities which are not provided by the Python language to substantially improve consistency and control during the data munging and data capture phases of data analysis.

So, yes, raku Dan is definitely RubberSonic and is a very good fit for the needs of data scientists who feel constrained by the limits of the Python language around concurrency, gradual typing, formal OO with encapsulation, unicode and so on…

Please do leave a comment here, or come and join the raku debate over on reddit.

~p6steve

(c)2022 Henley Cloud Consulting Ltd.

p6steve: raku ‘KISS’

Published by p6steve on 2022-07-11T21:18:54

Occam’s razor, also known as Keep It Simple Stupid (KISS), is a sound principle. Larry says it this way ~ “make the easy things easy and the hard things possible”.

https://commons.wikimedia.org/wiki/File:Vsevolod_Maximovich_Kiss_1913.jpg

The raku community is a set of (deep) experts (yes, really) – who are intent on the hard things:

Me, I learnt raku after 4 years of perl and 10 years of not coding. So, my reboot was from (nearly) scratch and Think Raku was my guide.

I get the feeling that sometimes we forget all the easy stuff it can do….

SO… let’s just take a breath and remember that, if you want, raku is easy-peasy.

SO… to demonstrate this, I went to raku-Physics-Measure-Jupyter and clicked the Binder link. Binder is a really easy, on demand place where Python and Raku coders can host and share Jupyter notebooks. Jupyter is a really easy, live notebook for Python and Raku coders to dabble and test their ideas.

Well, I could spell out the details here. But I prefer to let the code speak for itself.

And here’s the result:

Simples!

~p6steve

PS. Comments/feedback welcome.

rakudo.org: Rakudo compiler, Release #156 (2022.06)

Published on 2022-06-05T00:00:00

p6steve: raku & rust: a romance?

Published by p6steve on 2022-05-28T19:15:56

Rust is blazingly fast and memory-efficient: with no runtime or garbage collector, it can power performance-critical services, run on embedded devices, and easily integrate with other languages. Rust continues the spirit of C with emphasis on code safety and performance with a compiled approach.

Raku is an open source, gradually typed, Unicode-ready, concurrency friendly programming language made for at least the next hundred years. Raku continues the spirit of Perl with interpreter-like code generation (actually on MoarVM), one-liners, shell-centric, lightweight objects and expressiveness to get code up and running fast.

Both are modern languages and come from the Linux background:

In the same way that Perl and C were highly complementary technologies in their heyday, so Rust and Raku are naturally, romantically destined to be linked.

Introducing raku Inline::Rust

Following the nomenclature of raku modules such as Inline::Perl5, Inline::Python, Inline::Go and so on, Inline::Rust is a newly availably “interface” to connect Raku code to Rust dynamic libraries. Unlike some of its brethren, this is rather a zen module in that it provides worked examples and helpful Dockerfile builds to speed up the process, but no code is needed. The standard Rust FFI (Foreign Function Interface) on one side – the term comes from the specification for Common Lisp, which explicitly refers to the language features for inter-language calls as such; it is also used officially by the Haskell, Rust and Python programming languages. The core Raku NativeCall on the other – for calling into dynamic libraries that follow the C calling convention

Inline::Rust is inspired by the excellent Rust FFI Omnibus by Jake Goulding.

The Rust FFI Omnibus is a collection of 6 examples of using code written in Rust from other languages. Rust has drawn a large number of people who are interested in calling native code from higher-level languages. Many nearly duplicate questions have been asked on Stack Overflow, so the Omnibus was created as a central location for easy reference. This reference already covers C, Ruby, Python, Haskell, node.js, C# and Julia – with examples for each of these languages accessing the same Rust library code.

For the purposes of brevity, since the Rust code is common for all, this article will focus on the Raku consumption:

Preamble

We need to start with some basics – follow the README.md at Inline::Rust to try it yourself:

use NativeCall; 

constant $n-path = './ffi-omnibus/target/debug/foo';

Rust FFI Omnibus: Integers

Use the is native Trait to define the external function characteristics:

sub addition(int32, int32) returns int32 
    is native($n-path) { * }

say addition(1, 2);

Rust FFI Omnibus: String Arguments

Raku NativeCall provides traits to control Str encoding:

sub how_many_characters(Str is encoded('utf8')) returns int32 
    is native($n-path) { * }

say how_many_characters("göes to élevên");

Rust FFI Omnibus: String Return Values

Here we get a string back from Rust – the “free” sub means that Raku is responsible for releasing the memory. Raku Pointers can use the .deref method to get their contents:

sub theme_song_generate(uint8) returns Pointer[Str] is encoded('utf8') 
    is native($n-path) { * }
sub theme_song_free(Pointer[Str]) 
    is native($n-path) { * }

my \song = theme_song_generate(5);
say song.deref;
theme_song_free(song);

Rust FFI Omnibus: Slice Arguments

Here the for statement is used to cover over the lack of a direct assignment to CArray … a standard raku Array has a totally different memory layout so the CArray type is provided.

sub sum_of_even(CArray[uint32], size_t) returns uint32 
    is native($n-path) { * }

my @numbers := CArray[uint32].new;
@numbers[$++] = $_ for 1..6; 

say sum_of_even( @numbers, @numbers.elems );

Rust FFI Omnibus: Tuples

Full disclosure – there is an open issue with this pattern:

class Tuple is repr('CStruct') {
    has uint32 $.x;
    has uint32 $.y;
}
sub flip_things_around(Tuple) returns Tuple 
    is native($n-path) { * }

my \initial = Tuple.new( x => 10, y => 20 );
my \result  = flip_things_around(initial);
say result.x, result.y;

Rust FFI Omnibus: Objects

Here we can use a Raku class to wrap a Rust structure, note the free method is now automatically called by the Raku Garbage Collector:

class ZipCodeDatabase is repr('CPointer') {
    sub zip_code_database_new() returns ZipCodeDatabase 
        is native($n-path) { * }
    sub zip_code_database_free(ZipCodeDatabase)         
        is native($n-path) { * }
    sub zip_code_database_populate(ZipCodeDatabase)     
        is native($n-path) { * }
    sub zip_code_database_population_of(ZipCodeDatabase, Str 
        is encoded('utf8'))returns uint32 is native($n-path) { * }

    method new {
        zip_code_database_new
    }

    # Free data when the object is garbage collected.
    submethod DESTROY {        
        zip_code_database_free(self);
    }

    method populate {
        zip_code_database_populate(self)
    }

    method population_of( Str \zip ) {
        zip_code_database_population_of(self, zip);
    }
}

my \database = ZipCodeDatabase.new;
database.populate;

my \pop1 = database.population_of('90210');
my \pop2 = database.population_of('20500');
say pop1 - pop2;

Summary

The raku Nativecall syntax is very straightforward and the C-heritage on both sides shows in the seamless marriage (geddit?) of types.

This results in probably the most concise and natural code on the raku side – take a look at the other examples and make your own judgement!

More to come on some practical applications of this and support for concurrency and gradual typing…

~p6steve

PS. Please do leave comments/feedback on the blog page… here

PPS. Love that raku does not enforce indentation – I can make it fit the narrow width here!

rakudo.org: Rakudo compiler, Release #155 (2022.04)

Published on 2022-04-24T00:00:00

vrurg: He Tested Many Locks. See What Happened Then!

Published by Vadim Belman on 2022-04-23T00:00:00

These clickbaiting titles are so horrible, I couldn’t stand mocking them! But at least mine speaks truth.

My recent tasks are spinning around concurrency in one way or another. And where the concurrency is there are locks. Basically, introducing a lock is the most popular and the most straightforward solution for most race conditions one could encounter in their code. Like, whenever an investigation results in a resolution that data is being updated in one thread while used in another then just wrap both blocks into a lock and be done with it! Right? Are you sure?

They used to say about Perl that “if a problem is solved with regex then you got two problems”. By changing ‘regex’ to ‘lock’ we shift into another domain. I wouldn’t discuss interlocks here because it’s rather a subject for a big CS article. But I would mention an issue that is possible to stumble upon in a heavily multi-threaded Raku application. Did you know that Lock, Raku’s most used type for locking, actually blocks its thread? Did you also know that threads are a limited resource? That the default ThreadPoolScheduler has a maximum, which depends on the number of CPU cores available to your system? It even used to be a hard-coded value of 64 threads a while ago.

Put together, these two conditions could result in stuck code, like in this example:

BEGIN PROCESS::<$SCHEDULER> = ThreadPoolScheduler.new: max_threads => 32;

my Lock $l .= new;
my Promise $p .= new;
my @p; 

@p.push: start $l.protect: { await $p; };

for ^100 -> $idx {
    @p.push: start { $l.protect: { say $idx } }
}

@p.push: start { $p.keep; }

await @p;

Looks innocent, isn’t it? But it would never end because all available threads would be consumed and blocked by locks. Then the last one, which is supposed to initiate the unlock, would just never start in first place. This is not a bug in the language but a side effect of its architecture. I had to create Async::Workers module a while ago to solve a task which was hit by this issue. In other cases I can replace Lock with Lock::Async and it would just work. Why? The answer is in the following section. Why not always Lock::Async? Because it is rather slow. How much slower? Read on!

Lock vs. Lock::Async

What makes these different? To put it simple, Lock is based on system-level routines. This is why it is blocking: because this is the default system behavior.

Lock::Async is built around Promise and await. The point is that in Raku await tries to release a thread and return it back into the scheduler pool, making it immediately available to other jobs. So does Lock::Async too: instead of blocking, its protect method enters into await.

BTW, it might be surprising to many, but lock method of Lock::Async doesn’t actually lock by itself.

Atomics

There is one more way to protect a block of code from re-entering. If you’re well familiar with atomic operations then you’re likely to know about it. For the rest I would briefly explain it in this section.

Let me skip the part about the atomic operations as such, Wikipedia has it. In particular we need CAS (Wikipedia again and Raku implementation). In a natural language terms the atomic approach can be “programmed” like this:

  1. Take a variable and set it to locked state if not set already; repeat otherwise
  2. Do your work.
  3. Set the variable to unlocked state.

Note that 1 and 3 are both atomic ops. In Raku code this is expressed in the following slightly simplified snippet:

my atomicint $lock = 0; # 0 is unlocked, 1 is locked
while cas($lock, 0, 1) == 1 {}  # lock
...                             # Do your work
$lock ⚛= 0;                     # unlock

Pretty simple, isn’t it? Let’s see what are the specs of this approach:

  1. It is blocking, akin to Lock
  2. It’s fast (will get back to this later)
  3. The lock operation might be a CPU hog

Item 2 is speculative at this moment, but guessable. Contrary to Lock, we don’t use a system call but rather base the lock on a purely computational trick.

Item 3 is apparent because even though Lock doesn’t release it’s thread for Raku scheduler, it does release a CPU core to the system.

Benchmarkers, let’s go benchmarking!

As I found myself in between of two big tasks today, I decided to make a pause and scratch the itch of comparing different approaches to locking. Apparently, we have three different kinds of locks at our hands, each based upon a specific approach. But aside of that, we also have two different modes of using them. One is explicit locking/unlocking withing the protected block. The other one is to use a wrapper method protect, available on Lock and Lock::Async. There is no data type for atomic locking, but this is something we can do ourselves and have the method implemented the same way, as Lock does.

Here is the code I used:

constant MAX_WORKERS = 50;  # how many workers per approach to start
constant TEST_SECS = 5;     # how long each worker must run

class Lock::Atomic {
    has atomicint $!lock = 0;

    method protect(&code) {
        while cas($!lock, 0, 1) == 1 { }
        LEAVE $!lock ⚛= 0;
        &code()
    }
}

my @tbl = <Wrkrs Atomic Lock Async Atomic.P Lock.P Async.P>;
my $max_w = max @tbl.map(*.chars);
printf (('%' ~ $max_w ~ 's') xx +@tbl).join(" ") ~ "\n", |@tbl;
my $dfmt = (('%' ~ $max_w ~ 'd') xx +@tbl).join(" ") ~ "\n";

for 2..MAX_WORKERS -> $wnum {
    $*ERR.print: "$wnum\r";
    my Promise:D $starter .= new;
    my Promise:D @ready;
    my Promise:D @workers;
    my atomicint $stop = 0;

    sub worker(&code) {
        my Promise:D $ready .= new;
        @ready.push: $ready;
        @workers.push: start {
            $ready.keep;
            await $starter;
            &code();
        }
    }

    my atomicint $ia-lock = 0;
    my $ia-counter = 0;

    my $il-lock = Lock.new;
    my $il-counter = 0;

    my $ila-lock = Lock::Async.new;
    my $ila-counter = 0;

    my $iap-lock = Lock::Atomic.new;
    my $iap-counter = 0;

    my $ilp-lock = Lock.new;
    my $ilp-counter = 0;

    my $ilap-lock = Lock::Async.new;
    my $ilap-counter = 0;

    for ^$wnum {
        worker {
            until $stop {
                while cas($ia-lock, 0, 1) == 1 { } # lock
                LEAVE $ia-lock ⚛= 0; # unlock
                ++$ia-counter;
            }
        }

        worker {
            until $stop {
                $il-lock.lock;
                LEAVE $il-lock.unlock;
                ++$il-counter;
            }
        }

        worker {
            until $stop {
                await $ila-lock.lock;
                LEAVE $ila-lock.unlock;
                ++$ila-counter;
            }
        }

        worker {
            until $stop {
                $iap-lock.protect: { ++$iap-counter }
            }
        }

        worker {
            until $stop {
                $ilp-lock.protect: { ++$ilp-counter }
            }
        }

        worker {
            until $stop {
                $ilap-lock.protect: { ++$ilap-counter }
            }
        }

    }

    await @ready;
    $starter.keep;
    sleep TEST_SECS;
    $*ERR.print: "stop\r";
    $stop ⚛= 1;
    await @workers;

    printf $dfmt, $wnum, $ia-counter, $il-counter, $ila-counter, $iap-counter, $ilp-counter, $ilap-counter;
}

The code is designed for a VM with 50 CPU cores available. By setting that many workers per approach, I also cover a complex case of an application over-utilizing the available CPU resources.

Let’s see what it comes up with:

   Wrkrs   Atomic     Lock    Async Atomic.P   Lock.P  Async.P
       2   918075   665498    71982   836455   489657    76854
       3   890188   652154    26960   864995   486114    27864
       4   838870   520518    27524   805314   454831    27535
       5   773773   428055    27481   795273   460203    28324
       6   726485   595197    22926   729501   422224    23352
       7   728120   377035    19213   659614   403106    19285
       8   629074   270232    16472   644671   366823    17020
       9   674701   473986    10063   590326   258306     9775
      10   536481   446204     8513   474136   292242     7984
      11   606643   242842     6362   450031   324993     7098
      12   501309   224378     9150   468906   251205     8377
      13   446031   145927     7370   491844   277977     8089
      14   444665   181033     9241   412468   218475    10332
      15   410456   169641    10967   393594   247976    10008
      16   406301   206980     9504   389292   250340    10301
      17   381023   186901     8748   381707   250569     8113
      18   403485   150345     6011   424671   234118     6879
      19   372433   127482     8251   311399   253627     7280
      20   343862   139383     5196   301621   192184     5412
      21   350132   132489     6751   315653   201810     6165
      22   287302   188378     7317   244079   226062     6159
      23   326460   183097     6924   290294   158450     6270
      24   256724   128700     2623   294105   143476     3101
      25   254587    83739     1808   309929   164739     1878
      26   235215   245942     2228   211904   210358     1618
      27   263130   112510     1701   232590   162413     2628
      28   244143   228978       51   292340   161485       54
      29   235120   104492     2761   245573   148261     3117
      30   222840   116766     4035   241322   140127     3515
      31   261837    91613     7340   221193   145555     6209
      32   206170    85345     5786   278407    99747     5445
      33   240815   109631     2307   242664   128062     2796
      34   196083   144639      868   182816   210769      664
      35   198096   142727     5128   225467   113573     4991
      36   186880   225368     1979   232178   179265     1643
      37   212517   110564       72   249483   157721       53
      38   158757    87834      463   158768   141681      523
      39   134292    61481       79   164560   104768       70
      40   210495   120967       42   193469   141113       55
      41   174969   118752       98   206225   160189     2094
      42   157983   140766      927   127003   126041     1037
      43   174095   129580       61   199023    91215       42
      44   251304   185317       79   187853    90355       86
      45   216065    96315       69   161697   134644      104
      46   135407    67411      422   128414   110701      577
      47   128418    73384       78    94186    95202       53
      48   113268    81380       78   112763   113826      104
      49   118124    73261      279   113389    90339       78
      50   121476    85438      308    82896    54521      510

Without deep analysis, I can make a few conclusions:

And to conclude with, the performance win of atomic approach doesn’t make it a clear winner due to it’s high CPU cost. I would say that it is a good candidate to consider when there is need to protect small, short-acting operations. Especially in performance-sensitive locations. And even then there are restricting conditions to be fulfilled:

In other words, the way we utilize CPU matters. If aggregated CPU time consumed by locking loops is larger than that needed for Lock to release+acquire the involved cores then atomic becomes a waste of resources.

Conclusion

By this moment I look at the above and wonder: are there any use for the atomic approach at all? Hm… 😉

By carefully considering this dilemma I would preliminary put it this way: I would be acceptable for an application as it knows the conditions it would be operated in and this makes it possible to estimate the outcomes.

But it is most certainly no go for a library/module which has no idea where and how would it be used.

It is much easier to formulate the rule of thumb for Lock::Async acceptance:

Sounds like some heavily parallelized I/O to me, for example. In such cases it is less important to be really fast but it does matter not to hit the max_threads limit.

Ukraine

This section would probably stay here for a while, until Ukraine wins the war. Until then, please, check out this page!

I have already received some donations on my PayPal. Not sure if I’m permitted to publish the names here. But I do appreciate your help a lot! In all my sincerity: Thank you!

vrurg: A New `will complain` Trait

Published by Vadim Belman on 2022-04-18T00:00:00

Long time no see, my dear reader! I was planning a lot for this blog, as well as for the Advanced Raku For Beginners series. But you know what they say: wanna make the God laugh – tell him your plans!

Anyway, there is one tradition I should try to maintain however hard the times are: whenever I introduce something new into the Raku language an update has to be published. No exception this time.

So, welcome a new will complain trait!

The idea of it came to be from discussion about a PR by @lizmat. The implementation as such could have taken less time would I be less busy lately. Anyway, at the moment when I’m typing these lines PR#4861 is undergoing CI testing and as soon as it is completed it will be merged into the master. But even after that the trait will not be immediately available as I consider it rather an experimental feature. Thus use experimental :will-complain; will be required to make use of it.

The actual syntax is very simple:

<declaration> will complain <code>;

The <declaration> is anything what could result in a type check exception thrown. I tried to cover all such cases, but not sure if something hasn’t been left behind. See the sections below.

<code> can be any Code object which will receive a single argument: the value which didn’t pass the type check. The code must return a string to be included into exception message. Something stringifiable would also do.

Less words, more examples!

Type Objects

my enum FOO 
    will complain { "need something FOO-ish, got {.raku}" } 
    <foo1 foo2 foo3>;
my subset IntD of Int:D 
    will complain { "only non-zero positive integers, not {.raku}" } 
    where * > 0;
my class Bar 
    will complain -> $val { "need something Bar-like, got {$val.^name}" } {}

Basically, any type object can get the trait except for composables, i.e. – roles. This is because there is no unambiguous way to chose the particular complain block to be used when a type check fails:

role R will complain { "only R" } {}
role R[::T] will complain { "only R[::T]" } {}
my R $v;
$v = 13; # Which role candidate to choose from??

There are some cases when the ambiguity is pre-resolved, like my R[Int] $v;, but I’m not ready to get into these details yet.

Variables

A variable could have specific meaning. Some like to use our to configure modules (my heavily multi-threaded soul is grumbling, but we’re tolerant to people’s mistakes, aren’t we?). Therefore providing them with a way to produce less cryptic error messages is certainly for better than for worse:

our Bool:D $disable-something
    will complain { "set disable-something something boolean!" } = False;

And why not to help yourself with a little luxury of easing debugging when an assignment fails:

my Str $a-lexical 
   will complain { "string must contain 'foo'" } 
   where { !.defined || .contains("foo") }; 

The trait works with hashes and arrays too, except that it is applied not to the actual hash or array object but to its values. Therefore it really only makes sense for their typed variants:

my Str %h will complain { "hash values are to be strings, not {.^name}" };
my Int @a will complain { "this array is all about integers, not {.^name}" };

Also note that this wouldn’t work for hashes with typed keys when a key of wrong type is used. But it doesn’t mean there is no solution:

subset IntKey of Int will complain { "hash key must be an Int" };
my %h{IntKey};
%h<a> = 13;

Attributes

class Foo {
    has Int $.a 
        is rw 
        will complain { "you offer me {.raku}, but with all the respect: an integer, please!" };
}

Parameters

sub foo( Str:D $p will complain { "the first argument must be a string with 'foo'" } 
                  where *.contains('foo') ) {}

Merge

By this time all CI has passed with no errors and I have merged the PR.

Ukraine

You all are likely to know about the Russia’s war in Ukraine. Some of you know that Ukraine is my homeland. What I never told is that since the first days of the invasion we (my family) are trying to help our friends back there who fight against the aggressor. By ‘fight’ I mean it, they’re literally at the front lines. Unfortunately, our resources are not limitless. Therefore I would like to ask for any donations you could make by using the QR code below.

I’m not asking this for myself. I didn’t even think of this when I started this post. I never took a single penny for whatever I was doing for the Raku language. Even more, I was avoiding applying for any grants because it was always like “somebody would have better use for them”.

But this time I’m asking because any help to Ukrainian militaries means saving lives, both theirs and the people they protect.

PayPal:
PayPal Donate

rakudo.org: Rakudo compiler, Release #154 (2022.03)

Published on 2022-03-20T00:00:00

vrurg: Merging Symbols Issue

Published by Vadim Belman on 2021-10-05T00:00:00

First of all, I’d like to apologize for all the errors in this post. I just haven’t got time to properly proof-read it.

A while ago I was trying to fix a problem in Rakudo which, under certain conditions, causes some external symbols to become invisible for importing code, even if explicit use statement is used. And, indeed, it is really confusing when:

use L1::L2::L3::Class;
L1::L2::L3::Class.new;

fails with “Class symbol doesn’t exists in L1::L2::L3” error! It’s ok if use throws when there is no corresponding module. But .new??

Skip This Unless You Know What A Package Is

This section is needed to understand the rest of the post. A package in Raku is a typeobject which has a symbol table attached. The table is called stash (stands for “symbol table hash”) and is represented by an instance of Stash class, which is, basically, is a hash with minor tweaks. Normally each package instance has its own stash. For example, it is possible to manually create two different packages with the same name:

my $p1a := Metamodel::PackageHOW.new_type(:name<P1>); 
my $p1b := Metamodel::PackageHOW.new_type(:name<P1>); 
say $p1a.WHICH, " ", $p1a.WHO.WHICH; # P1|U140722834897656 Stash|140723638807008
say $p1b.WHICH, " ", $p1b.WHO.WHICH; # P1|U140722834897800 Stash|140723638818544

Note that they have different stashes as well.

A package is barely used in Raku as is. Usually we deal with packagy things like modules and classes.

Back On The Track

Back then I managed to trace the problem down to deserialization process within MoarVM backend. At that point I realized that somehow it pulls in packagy objects which are supposed to be the same thing, but they happen to be different and have different stashes. Because MoarVM doesn’t (and must not) have any idea about the structure of high-level Raku objects, there is no way it could properly handle this situation. Instead it considers one of the conflicting stashes as “the winner” and drops the other one. Apparently, symbols unique to the “loser” are lost then.

It took me time to find out what exactly happens. But not until a couple of days ago I realized what is the root cause and how to get around the bug.

Package Tree

What happens when we do something like:

module Foo {
    module Bar {
    }
}

How do we access Bar, speaking of the technical side of things? Foo::Bar syntax basically maps into Foo.WHO<Bar>. In other words, Bar gets installed as a symbol into Foo stash. We can also rewrite it with special syntax sugar: Foo::<Bar> because Foo:: is a representation for Foo stash.

So far, so good; but where do we find Foo itself? In Raku there is a special symbol called GLOBAL which is the root namespace (or a package if you wish) of any code. GLOBAL::, or GLOBAL.WHO is where one finds all the top-level symbols.

Say, we have a few packages like L11::L21, L11::L22, L12::L21, L12::L22. Then the namespace structure would be represented by this tree:

GLOBAL
`- L11
   `- L21
   `- L22
`- L12
   `- L21
   `- L22

Normally there is one per-process GLOBAL symbol and it belongs to the compunit which used to start the program. Normally it’s a .raku file, or a string supplied on command line with -e option, etc. But each compunit also gets its own GLOBALish package which acts as compunit’s GLOBAL until it is fully incorporated into the main code. Say, we declare a module in file Foo.rakumod:

unit module Foo;
sub print-GLOBAL($when) is export {
    say "$when: ", GLOBAL.WHICH, " ", GLOBALish.WHICH;
}
print-GLOBAL 'LOAD';

And use it in a script:

use Foo;
print-GLOBAL 'RUN ';

Then we can get an ouput like this:

LOAD: GLOBAL|U140694020262024 GLOBAL|U140694020262024
RUN : GLOBAL|U140694284972696 GLOBAL|U140694020262024

Notice that GLOBALish symbol remains the same object, whereas GLOBAL gets different. If we add a line to the script which also prints GLOBAL.WHICH then we’re going to get something like:

MAIN: GLOBAL|U140694284972696

Let’s get done with this part of the story for a while a move onto another subject.

Compunit Compilation

This is going to be a shorter story. It is not a secret that however powerful Raku’s grammars are, they need some core developer’s attention to make them really fast. In the meanwhile, compilation speed is somewhat suboptimal. It means that if a project consist of many compunits (think of modules, for example), it would make sense to try to compile them in parallel if possible. Unfortunately, the compiler is not thread-safe either. To resolve this complication Rakudo implementation parallelizes compilation by spawning individual processes per each compunit.

For example, let’s refer back to the module tree example above and imagine that all modules are used by a script. In this case there is a chance that we would end up with six rakudo processes, each compiling its own L* module. Apparently, things get slightly more complicated if there are cross-module uses, like L11::L21 could refer to L21, which, in turn, refers to L11::L22, or whatever. In this case we need to use topological sort to determine in what order the modules are to be compiled; but that’s not the point.

The point is that since each process does independent compilation, each compunit needs independent GLOBAL to manage its symbols. For the time being, what we later know as GLOBALish serves this duty for the compiler.

Later, when all pre-compiled modules are getting incorporated into the code which uses them, symbols installed into each individual GLOBAL are getting merged together to form the final namespace, available for our program. There are even methods in the source, using merge_global in their names.

TA-TA-TAAA!

(Note the clickable section header; I love the guy!)

Now, you can feel the catch. Somebody might have even guessed what it is. It crossed my mind after I was trying to implement legal symbol auto-registration which doesn’t involve using QAST to install a phaser. At some point I got an idea of using GLOBAL to hold a register object which would keep track of specially flagged roles. Apparently it failed due to the parallelized compilation mentioned above. It doesn’t matter, why; but at that point I started building a mental model of what happens when merge is taking place. And one detail drew my special attention: what happens if a package in a long name is not explicitly declared?

Say, there is a class named Foo::Bar::Baz one creates as:

unit class Foo::Bar;
class Baz { }

In this case the compiler creates a stub package for Foo. The stub is used to install class Bar. Then it all gets serialized into bytecode.

At the same time there is another module with another class:

unit class Foo::Bar::Fubar;

It is not aware of Foo::Bar::Baz, and the compiler has to create two stubs: Foo and Foo::Bar. And not only two versions of Foo are different and have different stashes; but so are the two versions of Bar where one is a real class, the other is a stub package.

Most of the time the compiler does damn good job of merging symbols in such cases. It took me stripping down a real-life code to golf it down to some minimal set of modules which reproduces the situation where a require call comes back with a Failure and a symbol becomes missing. The remaining part of this post will be dedicated to this example. In particular, this whole text is dedicated to one line.

Before we proceed further, I’d like to state that I might be speculating about some aspects of the problem cause because some details are gone from my memory and I don’t have time to re-investigate them. Still, so far my theory is backed by working workaround presented at the end.

To make it a bit easier to analyze the case, let’s start with namespace tree:

GLOBAL
`- L1
   `- App
   `- L2
      `- Collection
         `- Driver
         `- FS

Rough purpose is for application to deal with some kind of collection which stores its items with help of a driver which is loaded dynamically, depending, say, on a user configuration. We have the only driver implemented: File System (FS).

If you checkout the repository and try raku -Ilib symbol-merge.raku in the examples/2021-10-05-merge-symbols directory, you will see some output ending up with a line like Failure|140208738884744 (certainly true for up until Rakudo v2021.09 and likely to be so for at least a couple of versions later).

The key conflict in this example are modules Collection and Driver. The full name of Collection is L1::L2::Collection. L1 and L2 are both stubs. Driver is L1::L2::Collection::Driver and because it imports L1::L2, L2 is a class; but L1 remains to be a stub. By commenting out the import we’d get the bug resolved and the script would end up with something like:

L1::L2::Collection::FS|U140455893341088

This means that the driver module was successfully loaded and the driver class symbol is available.

Ok, uncomment the import and start the script again. And then once again to get rid of the output produced by compilation-time processes. We should see something like this:

[7329] L1 in L1::L2         : L1|U140360937889112
[7329] L1 in Driver         : L1|U140361742786216
[7329] L1 in Collection     : L1|U140361742786480
[7329] L1 in App            : L1|U140361742786720
[7329] L1 in MAIN           : L1|U140361742786720
[7329] L1 in FS             : L1|U140361742788136
Failure|140360664014848

We already know that L1 is a stub. Dumping object IDs also reveals that each compunit has its own copy of L1, except for App and the script (marked as MAIN). This is pretty much expected because each L1 symbol is installed at compile-time into per-compunit GLOBALish. This is where each module finds it. App is different because it is directly imported by the script and was compiled by the same compiler process, and shared its GLOBAL with the script.

Now comes the black magic. Open lib/L1/L2/Collection/FS.rakumod and uncomment the last line in the file. Then give it a try. The output would seem impossible at first; hell with it, even at second glance it is still impossible:

[17579] Runtime Collection syms      : (Driver)

Remember, this line belongs to L1::L2::Collection::FS! How come we don’t see FS in Collection stash?? No wonder that when the package cannot see itself others cannot see it too!

Here comes a bit of my speculation based on what I vaguely remember from the times ~2 years ago when I was trying to resolve this bug for the first time.

When Driver imports L1::L2, Collection gets installed into L2 stash, and Driver is recorded in Collection stash. Then it all gets serialized with Driver compunit.

Now, when FS imports Driver to consume the role, it gets the stash of L2 serialized at the previous stage. But its own L2 is a stub under L1 stub. So, it gets replaced with the serialized “real thing” which doesn’t have FS under Collection! Bingo and oops…

A Workaround

Walk through all the example files and uncomment use L1 statement. That’s it. All compunits will now have a common anchor to which their namespaces will be attached.

The common rule would state that if a problem of the kind occurs then make sure there’re no stub packages in the chain from GLOBAL down to the “missing” symbol. In particular, commenting out use L1::L2 in Driver will get our error back because it would create a “hole” between L1 and Collection and get us back into the situation where conflicting Collection namespaces are created because they’re bound to different L2 packages.

It doesn’t really matter how exactly the stubs are avoided. For example, we can easily move use L1::L2 into Collection and make sure that use L1 is still part of L2. So, for simplicity a child package may import its parent; and parent may then import its parent; and so on.

Sure, this adds to the boilerplate. But I hope the situation is temporary and there will be a fix.

Fix?

The one I was playing with required a compunit to serialize its own GLOBALish stash at the end of the compilation in a location where it would not be at risk of overwriting. Basically, it means cloning and storing it locally on the compunit (the package stash is part of the low-level VM structures). Then compunit mainline code would invoke a method on the Stash class which would forcibly merge the recorded symbols back right after deserialization of compunit’s bytecode. It was seemingly working, but looked more of a kind of a hack, than a real fix. This and a few smaller issues (like a segfault which I failed to track down) caused it to be frozen.

As I was thinking of it lately, more proper fix must be based upon a common GLOBAL shared by all compunits of a process. In this case there will be no worry about multiple stub generated for the same package because each stub will be shared by all compunits until, perhaps, the real package is found in one of them.

Unfortunately, the complexity of implementing the ‘single GLOBAL’ approach is such that I’m unsure if anybody with appropriate skill could fit it into their schedule.

6guts: The new MoarVM dispatch mechanism is here!

Published by jnthnwrthngtn on 2021-09-29T16:16:31

Around 18 months ago, I set about working on the largest set of architectural changes that Raku runtime MoarVM has seen since its inception. The work was most directly triggered by the realization that we had no good way to fix a certain semantic bug in dispatch without either causing huge performance impacts across the board or increasingly complexity even further in optimizations that were already riding their luck. However, the need for something like this had been apparent for a while: a persistent struggle to optimize certain Raku language features, the pain of a bunch of performance mechanisms that were all solving the same kind of problem but each for a specific situation, and a sense that, with everything learned since I founded MoarVM, it was possible to do better.

The result is the development of a new generalized dispatch mechanism. An overview can be found in my Raku Conference talk about it (slidesvideo); in short, it gives us a far more uniform architecture for all kinds of dispatch, allowing us to deliver better performance on a range of language features that have thus far been glacial, as well as opening up opportunities for new optimizations.

Today, this work has been merged, along with the matching changes in NQP (the Raku subset we use for bootstrapping and to implement the compiler) and Rakudo (the full Raku compiler and standard library implementation). This means that it will ship in the October 2021 releases.

In this post, I’ll give an overview of what you can expect to observe right away, and what you might expect in the future as we continue to build upon the possibilities that the new dispatch architecture has to offer.

The big wins

The biggest improvements involve language features that we’d really not had the architecture to do better on before. They involved dispatch – that is, getting a call linked to a destination efficiently – but the runtime didn’t provide us with a way to “explain” to it that it was looking at a dispatch, let alone with the information needed to have a shot at optimizing it.

The following graph captures a number of these cases, and shows the level of improvement, ranging from a factor of 3.3 to 13.3 times faster.

Graph showing benchmark results, described textually below

Let’s take a quick look at each of these. The first, new-buf, asks how quickly we can allocate Bufs.

for ^10_000_000 {
    Buf.new
}

Why is this a dispatch benchmark? Because Buf is not a class, but rather a role. When we try to make an instance of a role, it is “punned” into a class. Up until now, it works as follows:

  1. We look up the new method
  2. The find_method method would, if needed, create a pun of the role and cache it
  3. It would return a forwarding closure that takes the arguments and gives them to the same method called on the punned class, or spelt in Raku code, -> $role-discarded, |args { $pun."$name"(|args) }
  4. This closure would be invoked with the arguments

This had a number of undesirable consequences:

  1. While the pun was cached, we still had a bit of overhead to check if we’d made it already
  2. The arguments got slurped and flattened, which costs something, and…
  3. …the loss of callsite shape meant we couldn’t look up a type specialization of the method, and thus lost a chance to inline it too

With the new dispatch mechanism, we have a means to cache constants at a given program location and to replace arguments. So the first time we encounter the call, we:

  1. Get the role pun produced if needed
  2. Resolve the new method on the class punned from the role
  3. Produce a dispatch program that caches this resolved method and also replaces the role argument with the pun

For the next thousands of calls, we interpret this dispatch program. It’s still some cost, but the method we’re calling is already resolved, and the argument list rewriting is fairly cheap. Meanwhile, after we get into some hundreds of iterations, on a background thread, the optimizer gets to work. The argument re-ordering cost goes away completely at this point, and new is so small it gets inlined – at which point the buffer allocation is determined dead and so goes away too. Some remaining missed opportunities mean we still are left with a loop that’s not quite empty: it busies itself making sure it’s really OK to do nothing, rather than just doing nothing.

Next up, multiple dispatch with where clauses.

multi fac($n where $n <= 1) { 1 }
multi fac($n) { $n * fac($n - 1) }
for ^1_000_000 {
    fac(5)
}

These were really slow before, since:

  1. We couldn’t apply the multi-dispatch caching mechanism at all as soon as we had a where clause involved
  2. We would run where clauses twice in the event the candidate was chosen: once to see if we should choose that multi candidate, and once again when we entered it

With the new mechanism, we:

  1. On the first call, calculate a multiple dispatch plan: a linked list of candidates to work through
  2. Invoke the one with the where clause, in a mode whereby if the signature fails to bind, it triggers a dispatch resumption. (If it does bind, it runs to completion)
  3. In the event of a bind failure, the dispatch resumption triggers, and we attempt the next candidate

Once again, after the setup phase, we interpret the dispatch programs. In fact, that’s as far as we get with running this faster for now, because the specializer doesn’t yet know how to translate and further optimize this kind of dispatch program. (That’s how I know it currently stands no chance of turning this whole thing into another empty loop!) So there’s more to be had here also; in the meantime, I’m afraid you’ll just have to settle for a factor of ten speedup.

Here’s the next one:

proto with-proto(Int $n) { 2 * {*} }
multi with-proto(Int $n) { $n + 1 }
sub invoking-nontrivial-proto() {
    for ^10_000_000 {
        with-proto(20)
    }
}

Again, on top form, we’d turn this into an empty loop too, but we don’t quite get there yet. This case wasn’t so terrible before: we did get to use the multiple dispatch cache, however to do that we also ended up having to allocate an argument capture. The need for this also blocked any chance of inlining the proto into the caller. Now that is possible. Since we cannot yet translate dispatch programs that resume an in-progress dispatch, we don’t yet get to further inline the called multi candidate into the proto. However, we now have a design that will let us implement that.

This whole notion of a dispatch resumption – where we start doing a dispatch, and later need to access arguments or other pre-calculated data in order to do a next step of it – has turned out to be a great unification. The initial idea for it came from considering things like callsame:

class Parent {
    method m() { 1 }
}
class Child is Parent {
    method m() { 1 + callsame }
}
for ^10_000_000 {
    Child.m;
}

Once I started looking at this, and then considering that a complex proto also wants to continue with a dispatch at the {*}, and in the case a where clauses fails in a multi it also wants to continue with a dispatch, I realized this was going to be useful for quite a lot of things. It will be a bit of a headache to teach the optimizer and JIT to do nice things with resumes – but a great relief that doing that once will benefit multiple language features!

Anyway, back to the benchmark. This is another “if we were smart, it’d be an empty loop” one. Previously, callsame was very costly, because each time we invoked it, it would have to calculate what kind of dispatch we were resuming and the set of methods to call. We also had to be able to locate the arguments. Dynamic variables were involved, which cost a bit to look up too, and – despite being an implementation details – these also leaked out in introspection, which wasn’t ideal. The new dispatch mechanism makes this all rather more efficient: we can cache the calculated set of methods (or wrappers and multi candidates, depending on the context) and then walk through it, and there’s no dynamic variables involved (and thus no leakage of them). This sees the biggest speedup of the lot – and since we cannot yet inline away the callsame, it’s (for now) measuring the speedup one might expect on using this language feature. In the future, it’s destined to optimize away to an empty loop.

A module that makes use of callsame on a relatively hot path is OO::Monitors,, so I figured it would be interesting to see if there is a speedup there also.

use OO::Monitors;
monitor TestMonitor {
    method m() { 1 }
}
my $mon = TestMonitor.new;
for ^1_000_000 {
    $mon.m();
}

monitor is a class that acquires a lock around each method call. The module provides a custom meta-class that adds a lock attribute to the class and then wraps each method such that it acquires the lock. There are certainly costly things in there besides the involvement of callsame, but the improvement to callsame is already enough to see a 3.3x speedup in this benchmark. Since OO::Monitors is used in quite a few applications and modules (for example, Cro uses it), this is welcome (and yes, a larger improvement will be possible here too).

Caller side decontainerization

I’ve seen some less impressive, but still welcome, improvements across a good number of other microbenchmarks. Even a basic multi dispatch on the + op:

my $i = 0;
for ^10_000_000 {
    $i = $i + $_;
}

Comes out with a factor of 1.6x speedup, thanks primarily to us producing far tighter code with fewer guards. Previously, we ended up with duplicate guards in this seemingly straightforward case. The infix:<+> multi candidate would be specialized for the case of its first argument being an Int in a Scalar container and its second argument being an immutable Int. Since a Scalar is mutable, the specialization would need to read it and then guard the value read before proceeding, otherwise it may change, and we’d risk memory safety. When we wanted to inline this candidate, we’d also want to do a check that the candidate really applies, and so also would deference the Scalar and guard its content to do that. We can and do eliminate duplicate guards – but these guards are on two distinct reads of the value, so that wouldn’t help.

Since in the new dispatch mechanism we can rewrite arguments, we can now quite easily do caller-side removal of Scalar containers around values. So easily, in fact, that the change to do it took me just a couple of hours. This gives a lot of benefits. Since dispatch programs automatically eliminate duplicate reads and guards, the read and guard by the multi-dispatcher and the read in order to pass the decontainerized value are coalesced. This means less repeated work prior to specialization and JIT compilation, and also only a single read and guard in the specialized code after it. With the value to be passed already guarded, we can trivially select a candidate taking two bare Int values, which means there’s no further reads and guards needed in the callee either.

A less obvious benefit, but one that will become important with planned future work, is that this means Scalar containers escape to callees far less often. This creates further opportunities for escape analysis. While the MoarVM escape analyzer and scalar replacer is currently quite limited, I hope to return to working on it in the near future, and expect it will be able to give us even more value now than it would have been able to before.

Further results

The benchmarks shown earlier are mostly of the “how close are we to realizing that we’ve got an empty loop” nature, which is interesting for assessing how well the optimizer can “see through” dispatches. Here are a few further results on more “traditional” microbenchmarks:

Graph showing benchmark results, described textually below

The complex number benchmark is as follows:

my $total-re = 0e0;
for ^2_000_000 {
    my $x = 5 + 2i;
    my $y = 10 + 3i;
    my $z = $x * $x + $y;
    $total-re = $total-re + $z.re
}
say $total-re;

That is, just a bunch of operators (multi dispatch) and method calls, where we really do use the result. For now, we’re tied with Python and a little behind Ruby on this benchmark (and a surprising 48 times faster than the same thing done with Perl’s Math::Complex), but this is also a case that stands to see a huge benefit from escape analysis and scalar replacement in the future.

The hash read benchmark is:

my %h = a => 10, b => 12;
my $total = 0;
for ^10_000_000 {
    $total = $total + %h<a> + %h<b>;
}

And the hash store one is:

my @keys = 'a'..'z';
for ^500_000 {
    my %h;
    for @keys {
        %h{$_} = 42;
    }
}

The improvements are nothing whatsoever to do with hashing itself, but instead look to be mostly thanks to much tighter code all around due to caller-side decontainerization. That can have a secondary effect of bringing things under the size limit for inlining, which is also a big help. Speedup factors of 2x and 1.85x are welcome, although we could really do with the same level of improvement again for me to be reasonably happy with our results.

The line-reading benchmark is:

my $fh = open "longfile";
my $chars = 0;
for $fh.lines { $chars = $chars + .chars };
$fh.close;
say $chars

Again, nothing specific to I/O got faster, but when dispatch – the glue that puts together all the pieces – gets a boost, it helps all over the place. (We are also decently competitive on this benchmark, although tend to be slower the moment the UTF-8 decoder can’t take it’s “NFG can’t possibly apply” fast path.)

And in less micro things…

I’ve also started looking at larger programs, and hearing results from others about theirs. It’s mostly encouraging:

Smaller profiler output

One unpredicted (by me), but also welcome, improvement is that profiler output has become significantly smaller. Likely reasons for this include:

  1. The dispatch mechanism supports producing value results (either from constants, input arguments, or attributes read from input arguments). It entirely replaces an earlier mechanism, “specializer plugins”, which could map guards to a target to invoke, but always required a call to something – even if that something was the identity function. The logic was that this didn’t matter for any really hot code, since the identity function will trivially be inlined away. However, since profile size of the instrumenting profiler is a function of the number of paths through the call tree, trimming loads of calls to the identity function out of the tree makes it much smaller.
  2. We used to make lots of calls to the sink method when a value was in sink context. Now, if we see that the type simply inherits that method from Mu, we elide the call entirely (again, it would inline away, but a smaller call graph is a smaller profile).
  3. Multiple dispatch caching would previously always call the proto when the cache was missed, but would then not call an onlystar proto again when it got cache hits in the future. This meant the call tree under many multiple dispatches was duplicated in the profile. This wasn’t just a size issue; it was a bit annoying to have this effect show up in the profile reports too.

To give an example of the difference, I took profiles from Agrammon to study why it might have become slower. The one from before the dispatcher work weighed in at 87MB; the one with the new dispatch mechanism is under 30MB. That means less memory used while profiling, less time to write the profile out to disk afterwards, and less time for tools to load the profiler output. So now it’s faster to work out how to make things faster.

Is there any bad news?

I’m afraid so. Startup time has suffered. While the new dispatch mechanism is more powerful, pushes more complexity out of the VM into high level code, and is more conducive to reaching higher peak performance, it also has a higher warmup time. At the time of writing, the impact on startup time seems to be around 25%. I expect we can claw some of that back ahead of the October release.

What will be broken?

Changes of this scale always come with an amount of risk. We’re merging this some weeks ahead of the next scheduled monthly release in order to have time for more testing, and to address any regressions that get reported. However, even before reaching the point of merging it, we have:

What happens next?

As I’ve alluded to in a number of places in this post, while there are improvements to be enjoyed right away, there are also new opportunities for further improvement. Some things that are on my mind include:

Thank you

I would like to thank TPF and their donors for providing the funding that has made it possible for me to spend a good amount of my working time on this effort.

While I’m to blame for the overall design and much of the implementation of the new dispatch mechanism, plenty of work has also been put in by other MoarVM and Rakudo contributors – especially over the last few months as the final pieces fell into place, and we turned our attention to getting it production ready. I’m thankful to them not only for the code and debugging contributions, but also much support and encouragement along the way. It feels good to have this merged, and I look forward to building upon it in the months and years to come.

vrurg: Secure JSONification?

Published by Vadim Belman on 2021-09-14T00:00:00

There was an interesting discussion on IRC today. In brief, it was about exposing one’s database structures over API and security implications of this approach. I’d recommend reading the whole thing because Altreus delivers a good (and somewhat emotional 🙂) point on why such practice is most definitely bad design decision. Despite having minor objections, I generally agree to him.

But I’m not wearing out my keyboard on this post just to share that discussion. There was something in it what made me feel as if I miss something. And it came to me a bit later, when I was done with my payjob and got a bit more spare resources for the brain to utilize.

First of all, a bell rang when a hash was mentioned as the mediator between a database and API return value. I’m somewhat wary about using hashes as return values primarily for a reason of performance price and concurrency unsafety.

Anyway, the discussion went on and came to the point where it touched the ground of blacklisting of a DB table fields vs. whitelisting. The latter is really worthy approach of marking those fields we want in a JSON (or a hash) rather than marking those we don’t want because blacklisting requires us to remember to mark any new sensitive field as prohibited explicitly. Apparently, it is easy to forget to stick the mark onto it.

Doesn’t it remind you something? Aren’t we talking about hashes now? Isn’t it what we sometimes blame JavaScript for, that its objects are free-form with barely any reliable control over their structure? Thanks TypeScript for trying to get this fixed in some funky way, which I personally like more than dislike.

That’s when things clicked together. I was giving this answer already on a different occasion: using a class instance is often preferable over a hash. In the light of the JSON/API safety this simple rule gets us to another rather interesting aspect. Here is an example SmokeMachine provided on IRC:

to-json %( name => "{ .first-name } { .last-name }", 
           password => "***" )
    given $model

This was about returning basic user account information to a frontend. This is supposed to replace JSONification of a Red model like the following:

model Account {
    has UInt $.id is serial is json-skip;
    has Str $.username is column{ ... };
    has Str $.password is column{ ... } is json-skip;
    has Str $.first-name is column{ ... };
    has Str $.last-name is column{ ... };
}

The model example is mine.

By the way, in my opinion, neither first name nor last name do not belong to this model and must be part of a separate table where user’s personal data is kept. In more general case, a name must either be a long single field or an array where one can fit something like “Pablo Diego José Francisco de Paula Juan Nepomuceno María de los Remedios Cipriano de la Santísima Trinidad Ruiz y Picasso”.

The model clearly demonstrates the blacklist approach with two fields marked as non-JSONifiable. Now, let’s make it the right way, as I see it:

class API::Data::User {
    has Str:D $.username is required;
    has Str $.first-name;
    has Str $.last-name;

    method !FROM-MODEL($model) {
        self.new: username   => .username,
                  first-name => .first-name,
                  last-name  => .last-name
            given $model
    }

    multi method new(Account:D $model) {
        self!FROM-MODEL($model)
    }

    method COERCE(Account:D $model) {
        self!FROM-MODEL($model)
    }
}

And now, somewhere in our code we can do:

method get-user-info(UInt:D $id) {
    to-json API::Data::User(Account.^load: :$id)
}

With Cro::RPC::JSON module this could be part of a general API class which would provide common interface to both front- and backend:

use Cro::RPC::JSON;
class API::User {
    method get-user-info(UInt:D $id) is json-rpc {
        API::Data::User(Account.^load: :$id)
    }
}

With such an implementation our Raku backend would get an instance of API::Data::User. In a TypeScript frontend code of a private project of mine I have something like the following snippet, where connection is an object derived from jayson module:

connection.call("get-user-info", id).then(
    (user: User | undefined | null) => { ... }
);

What does it all eventually give us? First, API::Data::User provides the mechanism of whilelisting the fields we do want to expose in API. Note that with properly defined attributes we’re as explicit about that as only possible. And we do it declaratively one single place.

Second, the class prevents us from mistyping field names. It wouldn’t be possible to have something like %( usrname => $model.username, ... ) somewhere else in our codebase. Or, perhaps even more likely, to try %user<frst-name> and wonder where did the first name go? We also get the protection against wrong data types or undefined values.

It is also likely that working with a class instance would be faster than with a hash. I have this subject covered in another post of mine.

Heh, at some point I thought this post could fit into IRC format… 🤷

6guts: Raku multiple dispatch with the new MoarVM dispatcher

Published by jnthnwrthngtn on 2021-04-15T09:54:30

I recently wrote about the new MoarVM dispatch mechanism, and in that post noted that I still had a good bit of Raku’s multiple dispatch semantics left to implement in terms of it. Since then, I’ve made a decent amount of progress in that direction. This post contains an overview of the approach taken, and some very rough performance measurements.

My goodness, that’s a lot of semantics

Of all the kinds of dispatch we find in Raku, multiple dispatch is the most complex. Multiple dispatch allows us to write a set of candidates, which are then selected by the number of arguments:

multi ok($condition, $desc) {
    say ($condition ?? 'ok' !! 'not ok') ~ " - $desc";
}
multi ok($condition) {
    ok($condition, '');
}

Or the types of arguments:

multi to-json(Int $i) { ~$i }
multi to-json(Bool $b) { $b ?? 'true' !! 'false' }

And not just one argument, but potentially many:

multi truncate(Str $str, Int $chars) {
    $str.chars < $chars ?? $str !! $str.substr(0, $chars) ~ '...'
}
multi truncate(Str $str, Str $after) {
    with $str.index($after) -> $pos {
        $str.substr(0, $pos) ~ '...'
    }
    else {
        $str
    }
}

We may write where clauses to differentiate candidates on properties that are not captured by nominal types:

multi fac($n where $n <= 1) { 1 }
multi fac($n) { $n * fac($n - 1) }

Every time we write a set of multi candidates like this, the compiler will automatically produce a proto routine. This is what is installed in the symbol table, and holds the candidate list. However, we can also write our own proto, and use the special term {*} to decide at which point we do the dispatch, if at all.

proto mean($collection) {
    $collection.elems == 0 ?? Nil !! {*}
}
multi mean(@arr) {
    @arr.sum / @arr.elems
}
multi mean(%hash) {
    %hash.values.sum / %hash.elems
}

Candidates are ranked by narrowness (using topological sorting). If multiple candidates match, but they are equally narrow, then that’s an ambiguity error. Otherwise, we call narrowest one. The candidate we choose may then use callsame and friends to defer to the next narrowest candidate, which may do the same, until we reach the most general matching one.

Multiple dispatch is everywhere

Raku leans heavily on multiple dispatch. Most operators in Raku are compiled into calls to multiple dispatch subroutines. Even $a + $b will be a multiple dispatch. This means doing multiple dispatch efficiently is really important for performance. Given the riches of its semantics, this is potentially a bit concerning. However, there’s good news too.

Most multiple dispatches are boring

The overwhelmingly common case is that we have:

This isn’t to say the other cases are unimportant; they are really quite useful, and it’s desirable for them to perform well. However, it’s also desirable to make what savings we can in the common case. For example, we don’t want to eagerly calculate the full set of possible candidates for every single multiple dispatch, because the majority of the time only the first one matters. This is not just a time concern: recall that the new dispatch mechanism stores dispatch programs at each callsite, and if we store the list of all matching candidates at each of those, we’ll waste a lot of memory too.

How do we do today?

The situation in Rakudo today is as follows:

Effectively, the situation today is that you simply don’t use where clauses in a multiple dispatch if its anywhere near a hot path (well, and if you know where the hot paths are, and know that this kind of dispatch is slow). Ditto for callsame, although that’s less commonly reached for. The question is, can we do better with the new dispatcher?

Guard the types

Let’s start out with seeing how the simplest cases are dealt with, and build from there. (This is actually what I did in terms of the implementation, but at the same time I had a rough idea where I was hoping to end up.)

Recall this pair of candidates:

multi truncate(Str $str, Int $chars) {
    $str.chars < $chars ?? $str !! $str.substr(0, $chars) ~ '...'
}
multi truncate(Str $str, Str $after) {
    with $str.index($after) -> $pos {
        $str.substr(0, $pos) ~ '...'
    }
    else {
        $str
    }
}

We then have a call truncate($message, "\n"), where $message is a Str. Under the new dispatch mechanism, the call is made using the raku-call dispatcher, which identifies that this is a multiple dispatch, and thus delegates to raku-multi. (Multi-method dispatch ends up there too.)

The record phase of the dispatch – on the first time we reach this callsite – will proceed as follows:

  1. Iterate over the candidates
  2. If a candidate doesn’t match on argument count, just discard it. Since the shape of a callsite is a constant, and we calculate dispatch programs at each callsite, we don’t need to establish any guards for this.
  3. If it matches on types and concreteness, note which parameters are involved and what kinds of guards they need.
  4. If there was no match or an ambiguity, report the error without producing a dispatch program.
  5. Otherwise, having established the type guards, delegate to the raku-invoke dispatcher with the chosen candidate.

When we reach the same callsite again, we can run the dispatch program, which quickly checks if the argument types match those we saw last time, and if they do, we know which candidate to invoke. These checks are very cheap – far cheaper than walking through all of the candidates and examining each of them for a match. The optimizer may later be able to prove that the checks will always come out true and eliminate them.

Thus the whole of the dispatch processes – at least for this simple case where we only have types and arity – can be “explained” to the virtual machine as “if the arguments have these exact types, invoke this routine”. It’s pretty much the same as we were doing for method dispatch, except there we only cared about the type of the first argument – the invocant – and the value of the method name. (Also recall from the previous post that if it’s a multi-method dispatch, then both method dispatch and multiple dispatch will guard the type of the first argument, but the duplication is eliminated, so only one check is done.)

That goes in the resumption hole

Coming up with good abstractions is difficult, and therein lies much of the challenge of the new dispatch mechanism. Raku has quite a number of different dispatch-like things. However, encoding all of them directly in the virtual machine leads to high complexity, which makes building reliable optimizations (or even reliable unoptimized implementations!) challenging. Thus the aim is to work out a comparatively small set of primitives that allow for dispatches to be “explained” to the virtual machine in such a way that it can deliver decent performance.

It’s fairly clear that callsame is a kind of dispatch resumption, but what about the custom proto case and the where clause case? It turns out that these can both be neatly expressed in terms of dispatch resumption too (the where clause case needing one small addition at the virtual machine level, which in time is likely to be useful for other things too). Not only that, but encoding these features in terms of dispatch resumption is also quite direct, and thus should be efficient. Every trick we teach the specializer about doing better with dispatch resumptions can benefit all of the language features that are implemented using them, too.

Custom protos

Recall this example:

proto mean($collection) {
    $collection.elems == 0 ?? Nil !! {*}
}

Here, we want to run the body of the proto, and then proceed to the chosen candidate at the point of the {*}. By contrast, when we don’t have a custom proto, we’d like to simply get on with calling the correct multi.

To achieve this, I first moved the multi candidate selection logic from the raku-multi dispatcher to the raku-multi-core dispatcher. The raku-multi dispatcher then checks if we have an “onlystar” proto (one that does not need us to run it). If so, it delegates immediately to raku-multi-core. If not, it saves the arguments to the dispatch as the resumption initialization state, and then calls the proto. The proto‘s {*} is compiled into a dispatch resumption. The resumption then delegates to raku-multi-core. Or, in code:

nqp::dispatch('boot-syscall', 'dispatcher-register', 'raku-multi',
    # Initial dispatch, only setting up resumption if we need to invoke the
    # proto.
    -> $capture {
        my $callee := nqp::captureposarg($capture, 0);
        my int $onlystar := nqp::getattr_i($callee, Routine, '$!onlystar');
        if $onlystar {
            # Don't need to invoke the proto itself, so just get on with the
            # candidate dispatch.
            nqp::dispatch('boot-syscall', 'dispatcher-delegate', 'raku-multi-core', $capture);
        }
        else {
            # Set resume init args and run the proto.
            nqp::dispatch('boot-syscall', 'dispatcher-set-resume-init-args', $capture);
            nqp::dispatch('boot-syscall', 'dispatcher-delegate', 'raku-invoke', $capture);
        }
    },
    # Resumption means that we have reached the {*} in the proto and so now
    # should go ahead and do the dispatch. Make sure we only do this if we
    # are signalled to that it's a resume for an onlystar (resumption kind 5).
    -> $capture {
        my $track_kind := nqp::dispatch('boot-syscall', 'dispatcher-track-arg', $capture, 0);
        nqp::dispatch('boot-syscall', 'dispatcher-guard-literal', $track_kind);
        my int $kind := nqp::captureposarg_i($capture, 0);
        if $kind == 5 {
            nqp::dispatch('boot-syscall', 'dispatcher-delegate', 'raku-multi-core',
                nqp::dispatch('boot-syscall', 'dispatcher-get-resume-init-args'));
        }
        elsif !nqp::dispatch('boot-syscall', 'dispatcher-next-resumption') {
            nqp::dispatch('boot-syscall', 'dispatcher-delegate', 'boot-constant',
                nqp::dispatch('boot-syscall', 'dispatcher-insert-arg-literal-obj',
                    $capture, 0, Nil));
        }
    });

Two become one

Deferring to the next candidate (for example with callsame) and trying the next candidate because a where clause failed look very similar: both involve walking through a list of possible candidates. There’s some details, but they have a great deal in common, and it’d be nice if that could be reflected in how multiple dispatch is implemented using the new dispatcher.

Before that, a slightly terrible detail about how things work in Rakudo today when we have where clauses. First, the dispatcher does a “trial bind”, where it asks the question: would this signature bind? To do this, it has to evaluate all of the where clauses. Worse, it has to use the slow-path signature binder too, which interprets the signature, even though we can in many cases compile it. If the candidate matches, great, we select it, and then invoke it…which runs the where clauses a second time, as part of the compiled signature binding code. There is nothing efficient about this at all, except for it being by far more efficient on developer time, which is why it happened that way.

Anyway, it goes without saying that I’m rather keen to avoid this duplicate work and the slow-path binder where possible as I re-implement this using the new dispatcher. And, happily, a small addition provides a solution. There is an op assertparamcheck, which any kind of parameter checking compiles into (be it type checking, where clause checking, etc.) This triggers a call to a function that gets the arguments, the thing we were trying to call, and can then pick through them to produce an error message. The trick is to provide a way to invoke a routine such that a bind failure, instead of calling the error reporting function, will leave the routine and then do a dispatch resumption! This means we can turn failure to pass where clause checks into a dispatch resumption, which will then walk to the next candidate and try it instead.

Trivial vs. non-trivial

This gets us most of the way to a solution, but there’s still the question of being memory and time efficient in the common case, where there is no resumption and no where clauses. I coined the term “trivial multiple dispatch” for this situation, which makes the other situation “non-trivial”. In fact, I even made a dispatcher called raku-multi-non-trivial! There are two ways we can end up there.

  1. The initial attempt to find a matching candidate determines that we’ll have to consider where clauses. As soon as we see this is the case, we go ahead and produce a full list of possible candidates that could match. This is a linked list (see my previous post for why).
  2. The initial attempt to find a matching candidate finds one that can be picked based purely on argument count and nominal types. We stop there, instead of trying to build a full candidate list, and run the matching candidate. In the event that a callsame happens, we end up in the trivial dispatch resumption handler, which – since this situation is now non-trivial – builds the full candidate list, snips the first item off it (because we already ran that), and delegates to raku-multi-non-trivial.

Lost in this description is another significant improvement: today, when there are where clauses, we entirely lose the ability to use the MoarVM multiple dispatch cache, but under the new dispatcher, we store a type-filtered list of candidates at the callsite, and then cheap type guards are used to check it is valid to use.

Preliminary results

I did a few benchmarks to see how the new dispatch mechanism did with a couple of situations known to be sub-optimal in Rakudo today. These numbers do not reflect what is possible, because at the moment the specializer does not have much of an understanding of the new dispatcher. Rather, they reflect the minimal improvement we can expect.

Consider this benchmark using a multi with a where clause to recursively implement factorial.

multi fac($n where $n <= 1) { 1 }
multi fac($n) { $n * fac($n - 1) }
for ^100_000 {
    fac(10)
}
say now - INIT now;

This needs some tweaks (and to be run under an environment variable) to use the new dispatcher; these are temporary, until such a time I switch Rakudo over to using the new dispatcher by default:

use nqp;
multi fac($n where $n <= 1) { 1 }
multi fac($n) { $n * nqp::dispatch('raku-call', &fac, $n - 1) }
for ^100_000 {
    nqp::dispatch('raku-call', &fac, 10);
}
say now - INIT now;

On my machine, the first runs in 4.86s, the second in 1.34s. Thus under the new dispatcher this runs in little over a quarter of the time it used to – a quite significant improvement already.

A case involving callsame is also interesting to consider. Here it is without using the new dispatcher:

multi fallback(Any $x) { "a$x" }
multi fallback(Numeric $x) { "n" ~ callsame }
multi fallback(Real $x) { "r" ~ callsame }
multi fallback(Int $x) { "i" ~ callsame }
for ^1_000_000 {
    fallback(4+2i);
    fallback(4.2);
    fallback(42);
}   
say now - INIT now;

And with the temporary tweaks to use the new dispatcher:

use nqp;
multi fallback(Any $x) { "a$x" }
multi fallback(Numeric $x) { "n" ~ new-disp-callsame }
multi fallback(Real $x) { "r" ~ new-disp-callsame }
multi fallback(Int $x) { "i" ~ new-disp-callsame }
for ^1_000_000 {
    nqp::dispatch('raku-call', &fallback, 4+2i);
    nqp::dispatch('raku-call', &fallback, 4.2);
    nqp::dispatch('raku-call', &fallback, 42);
}
say now - INIT now;

On my machine, the first runs in 31.3s, the second in 11.5s, meaning that with the new dispatcher we manage it in a little over a third of the time that current Rakudo does.

These are both quite encouraging, but as previously mentioned, a majority of multiple dispatches are of the trivial kind, not using these features. If I make the most common case worse on the way to making other things better, that would be bad. It’s not yet possible to make a fair comparison of this: trivial multiple dispatches already receive a lot of attention in the specializer, and it doesn’t yet optimize code using the new dispatcher well. Of note, in an example like this:

multi m(Int) { }
multi m(Str) { }
for ^1_000_000 {
    m(1);
    m("x");
}
say now - INIT now;

Inlining and other optimizations will turn this into an empty loop, which is hard to beat. There is one thing we can already do, though: run it with the specializer disabled. The new dispatcher version looks like this:

use nqp;
multi m(Int) { }
multi m(Str) { }
for ^1_000_000 {
    nqp::dispatch('raku-call', &m, 1);
    nqp::dispatch('raku-call', &m, "x");
}
say now - INIT now;

The results are 0.463s and 0.332s respectively. Thus, the baseline execution time – before the specializer does its magic – is less using the new general dispatch mechanism than it is using the special-case multiple dispatch cache that we currently use. I wasn’t sure what to expect here before I did the measurement. Given we’re going from a specialized mechanism that has been profiled and tweaked to a new general mechanism that hasn’t received such attention, I was quite ready to be doing a little bit worse initially, and would have been happy with parity. Running in 70% of the time was a bigger improvement than I expected at this point.

I expect that once the specializer understands the new dispatch mechanism better, it will be able to also turn the above into an empty loop – however, since more iterations can be done per-optimization, this should still show up as a win for the new dispatcher.

Final thoughts

With one relatively small addition, the new dispatch mechanism is already handling most of the Raku multiple dispatch semantics. Furthermore, even without the specializer and JIT really being able to make a good job of it, some microbenchmarks already show a factor of 3x-4x improvement. That’s a pretty good starting point.

There’s still a good bit to do before we ship a Rakudo release using the new dispatcher. However, multiple dispatch was the biggest remaining threat to the design: it’s rather more involved than other kinds of dispatch, and it was quite possible that an unexpected shortcoming could trigger another round of design work, or reveal that the general mechanism was going to struggle to perform compared to the more specialized one in the baseline unoptimized, case. So far, there’s no indication of either of these, and I’m cautiously optimistic that the overall design is about right.

Pawel bbkr Pabian: Asynchronous, parallel and... dead. My Perl 6 daily bread.

Published by Pawel bbkr Pabian on 2015-09-06T14:00:56

I love Perl 6 asynchronous features. They are so easy to use and can give instant boost by changing few lines of code that I got addicted to them. I became asynchronous junkie. And finally overdosed. Here is my story...

I was processing a document that was divided into chapters, sub-chapters, sub-sub-chapters and so on. Parsed to data structure it looked like this:

    my %document = (
        '1' => {
            '1.1' => 'Lorem ipsum',
            '1.2' => {
                '1.2.1' => 'Lorem ipsum',
                '1.2.2' => 'Lorem ipsum'
            }
        },
        '2' => {
            '2.1' => {
                '2.1.1' => 'Lorem ipsum'
            }
        }
    );

Every chapter required processing of its children before it could be processed. Also processing of each chapter was quite time consuming - no matter which level it was and how many children did it have. So I started by writing recursive function to do it:

    sub process (%chapters) {
        for %chapters.kv -> $number, $content {
            note "Chapter $number started";
            &?ROUTINE.($content) if $content ~~ Hash;
            sleep 1; # here the chapter itself is processed
            note "Chapter $number finished";
        }
    }
    
    process(%document);

So nothing fancy here. Maybe except current &?ROUTINE variable which makes recursive code less error prone - there is no need to repeat subroutine name explicitly. After running it I got expected DFS (Depth First Search) flow:

    $ time perl6 run.pl
    Chapter 1 started
    Chapter 1.1 started
    Chapter 1.1 finished
    Chapter 1.2 started
    Chapter 1.2.1 started
    Chapter 1.2.1 finished
    Chapter 1.2.2 started
    Chapter 1.2.2 finished
    Chapter 1.2 finished
    Chapter 1 finished
    Chapter 2 started
    Chapter 2.1 started
    Chapter 2.1.1 started
    Chapter 2.1.1 finished
    Chapter 2.1 finished
    Chapter 2 finished
    
    real    0m8.184s

It worked perfectly, but that was too slow. Because 1 second was required to process each chapter in serial manner it ran for 8 seconds total. So without hesitation I reached for Perl 6 asynchronous goodies to process chapters in parallel.

    sub process (%chapters) {
        await do for %chapters.kv -> $number, $content {
            start {
                note "Chapter $number started";
                &?ROUTINE.outer.($content) if $content ~~ Hash;
                sleep 1; # here the chapter itself is processed
                note "Chapter $number finished";
            }
        }
    }
    
    process(%document);

Now every chapter is processed asynchronously in parallel and first waits for its children to be also processed asynchronously in parallel. Note that after wrapping processing in await/start construct &?ROUTINE must now point to outer scope.

    $ time perl6 run.pl
    Chapter 1 started
    Chapter 2 started
    Chapter 1.1 started
    Chapter 1.2 started
    Chapter 2.1 started
    Chapter 1.2.1 started
    Chapter 2.1.1 started
    Chapter 1.2.2 started
    Chapter 1.1 finished
    Chapter 1.2.1 finished
    Chapter 1.2.2 finished
    Chapter 2.1.1 finished
    Chapter 2.1 finished
    Chapter 1.2 finished
    Chapter 1 finished
    Chapter 2 finished
    
    real    0m3.171s

Perfect. Time dropped to expected 3 seconds - it was not possible to go any faster because document had 3 nesting levels and each required 1 second to process. Still smiling I threw bigger document at my beautiful script - 10 chapters, each with 10 sub-chapters, each with 10 sub-sub-chapters. It started processing, run for a while... and DEADLOCKED.

Friedrich Nietzsche said that "when you gaze long into an abyss the abyss also gazes into you". Same rule applies to code. After few minutes me and my code were staring at each other. And I couldn't find why it worked perfectly for small documents but was deadlocking in random moments for big ones. Half an hour later I blinked and got defeated by my own code in staring contest. So it was time for debugging.

I noticed that when it was deadlocking there was always constant amount of 16 chapters that were still in progress. And that number looked familiar to me - thread pool!

    $ perl6 -e 'say start { }'
    Promise.new(
        scheduler => ThreadPoolScheduler.new(
            initial_threads => 0,
            max_threads => 16,
            uncaught_handler => Callable
        ),
        status => PromiseStatus::Kept
    )

Every asynchronous task that is planned needs free thread so it can be executed. And on my system only 16 concurrent threads are allowed as shown above. To analyze what happened let's use document from first example but also assume thread pool is limited to 4:

    $ perl6 run.pl          # 4 threads available by default
    Chapter 1 started       # 3 threads available
    Chapter 1.1 started     # 2 threads available
    Chapter 2 started       # 1 thread available
    Chapter 1.1 finished    # 2 threads available again
    Chapter 1.2 started     # 1 thread available
    Chapter 1.2.1 started   # 0 threads available
                            # deadlock!

At this moment chapter 1 subtree holds three threads and waits for one more for chapter 1.2.2 to complete everything and start ascending from recursion. And subtree of chapter 2 holds one thread and waits for one more for chapter 2.1 to descend into recursion. In result processing gets to a point where at least one more thread is required to proceed but all threads are taken and none can be returned to thread pool. Script deadlocks and stops here forever.

How to solve this problem and maintain parallel processing? There are many ways to do it :)
The key to the solution is to process asynchronously only those chapters that do not have unprocessed chapters on lower level.

Luckily Perl 6 offers perfect tool - promise junctions. It is possible to create a promise that waits for other promises to be kept and until it happens it is not sent to thread pool for execution. Following code illustrates that:

    my $p = Promise.allof( Promise.in(2), Promise.in(3) );
    sleep 1;
    say "Promise after 1 second: " ~ $p.perl;
    sleep 3;
    say "Promise after 4 seconds: " ~ $p.perl;

Prints:

    Promise after 1 second: Promise.new(
        ..., status => PromiseStatus::Planned
    )
    Promise after 4 seconds: Promise.new(
        ..., status => PromiseStatus::Kept
    )

Let's rewrite processing using this cool property:

    sub process (%chapters) {
        return Promise.allof(
            do for %chapters.kv -> $number, $content {
                my $current = {
                    note "Chapter $number started";
                    sleep 1; # here the chapter itself is processed
                    note "Chapter $number finished";
                };
                
                if $content ~~ Hash {
                    Promise.allof( &?ROUTINE.($content) )
                        .then( $current );
                }
                else {
                    Promise.start( $current );
                }
            }
        );
    }
    
    await process(%document);

It solves the problem when chapter was competing with its sub-chapters for free threads but at the same time it needed those sub-chapters before it can process itself. Now awaiting for sub-chapters to complete does not require free thread. Let's run it:

    $ perl6 run.pl
    Chapter 1.1 started
    Chapter 1.2.1 started
    Chapter 1.2.2 started
    Chapter 2.1.1 started
    -
    Chapter 1.1 finished
    Chapter 1.2.1 finished
    Chapter 1.2.2 finished
    Chapter 1.2 started
    Chapter 2.1.1 finished
    Chapter 2.1 started
    -
    Chapter 1.2 finished
    Chapter 1 started
    Chapter 2.1 finished
    Chapter 2 started
    -
    Chapter 1 finished
    Chapter 2 finished
    
    real    0m3.454s

I've added separator for each second passed so it is easier to understand. When script starts chapters 1.1, 1.2.1, 1.2.2 and 2.1.1 do not have sub-chapters at all. So they can take threads from thread pool immediately. When they are completed after one second then Promises that were awaiting for all of them are kept and chapters 1.2 and 2.1 can be processed safely on thread pool. It keeps going until getting out of recursion.

After trying big document again it was processed flawlessly in 72 seconds instead of linear 1000.

I'm high on asynchronous processing again!

You can download script here and try different data sizes and algorithms for yourself (params are taken from command line).

6guts: Towards a new general dispatch mechanism in MoarVM

Published by jnthnwrthngtn on 2021-03-15T02:08:42

My goodness, it appears I’m writing my first Raku internals blog post in over two years. Of course, two years ago it wasn’t even called Raku. Anyway, without further ado, let’s get on with this shared brainache.

What is dispatch?

I use “dispatch” to mean a process by which we take a set of arguments and end up with some action being taken based upon them. Some familiar examples include:

At first glance, perhaps the first two seem fairly easy and the third a bit more of a handful – which is sort of true. However, Raku has a number of other features that make dispatch rather more, well, interesting. For example:

Thanks to this, dispatch – at least in Raku – is not always something we do and produce an outcome, but rather a process that we may be asked to continue with multiple times!

Finally, while the examples I’ve written above can all quite clearly be seen as examples of dispatch, a number of other common constructs in Raku can be expressed as a kind of dispatch too. Assignment is one example: the semantics of it depend on the target of the assignment and the value being assigned, and thus we need to pick the correct semantics. Coercion is another example, and return value type-checking yet another.

Why does dispatch matter?

Dispatch is everywhere in our programs, quietly tieing together the code that wants stuff done with the code that does stuff. Its ubiquity means it plays a significant role in program performance. In the best case, we can reduce the cost to zero. In the worst case, the cost of the dispatch is high enough to exceed that of the work done as a result of the dispatch.

To a first approximation, when the runtime “understands” the dispatch the performance tends to be at least somewhat decent, but when it doesn’t there’s a high chance of it being awful. Dispatches tend to involve an amount of work that can be cached, often with some cheap guards to verify the validity of the cached outcome. For example, in a method dispatch, naively we need to walk a linearization of the inheritance graph and ask each class we encounter along the way if it has a method of the specified name. Clearly, this is not going to be terribly fast if we do it on every method call. However, a particular method name on a particular type (identified precisely, without regard to subclassing) will resolve to the same method each time. Thus, we can cache the outcome of the lookup, and use it whenever the type of the invocant matches that used to produce the cached result.

Specialized vs. generalized mechanisms in language runtimes

When one starts building a runtime aimed at a particular language, and has to do it on a pretty tight budget, the most obvious way to get somewhat tolerable performance is to bake various hot-path language semantics into the runtime. This is exactly how MoarVM started out. Thus, if we look at MoarVM as it stood several years ago, we find things like:

These are all still there today, however are also all on the way out. What’s most telling about this list is what isn’t included. Things like:

A few years back I started to partially address this, with the introduction of a mechanism I called “specializer plugins”. But first, what is the specializer?

When MoarVM started out, it was a relatively straightforward interpreter of bytecode. It only had to be fast enough to beat the Parrot VM in order to get a decent amount of usage, which I saw as important to have before going on to implement some more interesting optimizations (back then we didn’t have the kind of pre-release automated testing infrastructure we have today, and so depended much more on feedback from early adopters). Anyway, soon after being able to run pretty much as much of the Raku language as any other backend, I started on the dynamic optimizer. It gathered type statistics as the program was interpreted, identified hot code, put it into SSA form, used the type statistics to insert guards, used those together with static properties of the bytecode to analyze and optimize, and produced specialized bytecode for the function in question. This bytecode could elide type checks and various lookups, as well as using a range of internal ops that make all kinds of assumptions, which were safe because of the program properties that were proved by the optimizer. This is called specialized bytecode because it has had a lot of its genericity – which would allow it to work correctly on all types of value that we might encounter – removed, in favor of working in a particular special case that actually occurs at runtime. (Code, especially in more dynamic languages, is generally far more generic in theory than it ever turns out to be in practice.)

This component – the specializer, known internally as “spesh” – delivered a significant further improvement in the performance of Raku programs, and with time its sophistication has grown, taking in optimizations such as inlining and escape analysis with scalar replacement. These aren’t easy things to build – but once a runtime has them, they create design possibilities that didn’t previously exist, and make decisions made in their absence look sub-optimal.

Of note, those special-cased language-specific mechanisms, baked into the runtime to get some speed in the early days, instead become something of a liability and a bottleneck. They have complex semantics, which means they are either opaque to the optimizer (so it can’t reason about them, meaning optimization is inhibited) or they need special casing in the optimizer (a liability).

So, back to specializer plugins. I reached a point where I wanted to take on the performance of things like $obj.?meth (the “call me maybe” dispatch), $obj.SomeType::meth() (dispatch qualified with a class to start looking in), and private method calls in roles (which can’t be resolved statically). At the same time, I was getting ready to implement some amount of escape analysis, but realized that it was going to be of very limited utility because assignment had also been special-cased in the VM, with a chunk of opaque C code doing the hot path stuff.

But why did we have the C code doing that hot-path stuff? Well, because it’d be too espensive to have every assignment call a VM-level function that does a bunch of checks and logic. Why is that costly? Because of function call overhead and the costs of interpretation. This was all true once upon a time. But, some years of development later:

I solved the assignment problem and the dispatch problems mentioned above with the introduction of a single new mechanism: specializer plugins. They work as follows:

The vast majority of cases are monomorphic, meaning that only one set of guards are produced and they always succeed thereafter. The specializer can thus compile those guards into the specialized bytecode and then assume the given target invocant is what will be invoked. (Further, duplicate guards can be eliminated, so the guards a particular plugin introduces may reduce to zero.)

Specializer plugins felt pretty great. One new mechanism solved multiple optimization headaches.

The new MoarVM dispatch mechanism is the answer to a fairly simple question: what if we get rid of all the dispatch-related special-case mechanisms in favor of something a bit like specializer plugins? The resulting mechanism would need to be a more powerful than specializer plugins. Further, I could learn from some of the shortcomings of specializer plugins. Thus, while they will go away after a relatively short lifetime, I think it’s fair to say that I would not have been in a place to design the new MoarVM dispatch mechanism without that experience.

The dispatch op and the bootstrap dispatchers

All the method caching. All the multi dispatch caching. All the specializer plugins. All the invocation protocol stuff for unwrapping the bytecode handle in a code object. It’s all going away, in favor of a single new dispatch instruction. Its name is, boringly enough, dispatch. It looks like this:

dispatch_o result, 'dispatcher-name', callsite, arg0, arg1, ..., argN

Which means:

(Aside: this implies a new calling convention, whereby we no longer copy the arguments into an argument buffer, but instead pass the base of the register set and a pointer into the bytecode where the register argument map is found, and then do a lookup registers[map[argument_index]] to get the value for an argument. That alone is a saving when we interpret, because we no longer need a loop around the interpreter per argument.)

Some of the arguments might be things we’d traditionally call arguments. Some are aimed at the dispatch process itself. It doesn’t really matter – but it is more optimal if we arrange to put arguments that are only for the dispatch first (for example, the method name), and those for the target of the dispatch afterwards (for example, the method parameters).

The new bootstrap mechanism provides a small number of built-in dispatchers, whose names start with “boot-“. They are:

That’s pretty much it. Every dispatcher we build, to teach the runtime about some other kind of dispatch behavior, eventually terminates in one of these.

Building on the bootstrap

Teaching MoarVM about different kinds of dispatch is done using nothing less than the dispatch mechanism itself! For the most part, boot-syscall is used in order to register a dispatcher, set up the guards, and provide the result that goes with them.

Here is a minimal example, taken from the dispatcher test suite, showing how a dispatcher that provides the identity function would look:

nqp::dispatch('boot-syscall', 'dispatcher-register', 'identity', -> $capture {
    nqp::dispatch('boot-syscall', 'dispatcher-delegate', 'boot-value', $capture);
});
sub identity($x) {
    nqp::dispatch('identity', $x)
}
ok(identity(42) == 42, 'Can define identity dispatch (1)');
ok(identity('foo') eq 'foo', 'Can define identity dispatch (2)');

In the first statement, we call the dispatcher-register MoarVM system call, passing a name for the dispatcher along with a closure, which will be called each time we need to handle the dispatch (which I tend to refer to as the “dispatch callback”). It receives a single argument, which is a capture of arguments (not actually a Raku-level Capture, but the idea – an object containing a set of call arguments – is the same).

Every user-defined dispatcher should eventually use dispatcher-delegate in order to identify another dispatcher to pass control along to. In this case, it delegates immediately to boot-value – meaning it really is nothing except a wrapper around the boot-value built-in dispatcher.

The sub identity contains a single static occurrence of the dispatch op. Given we call the sub twice, we will encounter this op twice at runtime, but the two times are very different.

The first time is the “record” phase. The arguments are formed into a capture and the callback runs, which in turn passes it along to the boot-value dispatcher, which produces the result. This results in an extremely simple dispatch program, which says that the result should be the first argument in the capture. Since there’s no guards, this will always be a valid result.

The second time we encounter the dispatch op, it already has a dispatch program recorded there, so we are in run mode. Turning on a debugging mode in the MoarVM source, we can see the dispatch program that results looks like this:

Dispatch program (1 temporaries)
  Ops:
    Load argument 0 into temporary 0
    Set result object value from temporary 0

That is, it reads argument 0 into a temporary location and then sets that as the result of the dispatch. Notice how there is no mention of the fact that we went through an extra layer of dispatch; those have zero cost in the resulting dispatch program.

Capture manipulation

Argument captures are immutable. Various VM syscalls exist to transform them into new argument captures with some tweak, for example dropping or inserting arguments. Here’s a further example from the test suite:

nqp::dispatch('boot-syscall', 'dispatcher-register', 'drop-first', -> $capture {
    my $capture-derived := nqp::dispatch('boot-syscall', 'dispatcher-drop-arg', $capture, 0);
    nqp::dispatch('boot-syscall', 'dispatcher-delegate', 'boot-value', $capture-derived);
});
ok(nqp::dispatch('drop-first', 'first', 'second') eq 'second',
    'dispatcher-drop-arg works');

This drops the first argument before passing the capture on to the boot-value dispatcher – meaning that it will return the second argument. Glance back at the previous dispatch program for the identity function. Can you guess how this one will look?

Well, here it is:

Dispatch program (1 temporaries)
  Ops:
    Load argument 1 into temporary 0
    Set result string value from temporary 0

Again, while in the record phase of such a dispatcher we really do create capture objects and make a dispatcher delegation, the resulting dispatch program is far simpler.

Here’s a slightly more involved example:

my $target := -> $x { $x + 1 }
nqp::dispatch('boot-syscall', 'dispatcher-register', 'call-on-target', -> $capture {
    my $capture-derived := nqp::dispatch('boot-syscall',
            'dispatcher-insert-arg-literal-obj', $capture, 0, $target);
    nqp::dispatch('boot-syscall', 'dispatcher-delegate',
            'boot-code-constant', $capture-derived);
});
sub cot() { nqp::dispatch('call-on-target', 49) }
ok(cot() == 50,
    'dispatcher-insert-arg-literal-obj works at start of capture');
ok(cot() == 50,
    'dispatcher-insert-arg-literal-obj works at start of capture after link too');

Here, we have a closure stored in a variable $target. We insert it as the first argument of the capture, and then delegate to boot-code-constant, which will invoke that code object and pass the other dispatch arguments to it. Once again, at the record phase, we really do something like:

And the resulting dispatch program? It’s this:

Dispatch program (1 temporaries)
  Ops:
    Load collectable constant at index 0 into temporary 0
    Skip first 0 args of incoming capture; callsite from 0
    Invoke MVMCode in temporary 0

That is, load the constant bytecode handle that we’re going to invoke, set up the args (which are in this case equal to those of the incoming capture), and then invoke the bytecode with those arguments. The argument shuffling is, once again, gone. In general, whenever the arguments we do an eventual bytecode invocation with are a tail of the initial dispatch arguments, the arguments transform becomes no more than a pointer addition.

Guards

All of the dispatch programs seen so far have been unconditional: once recorded at a given callsite, they shall always be used. The big missing piece to make such a mechanism have practical utility is guards. Guards assert properties such as the type of an argument or if the argument is definite (Int:D) or not (Int:U).

Here’s a somewhat longer test case, with some explanations placed throughout it.

# A couple of classes for test purposes
my class C1 { }
my class C2 { }

# A counter used to make sure we're only invokving the dispatch callback as
# many times as we expect.
my $count := 0;

# A type-name dispatcher that maps a type into a constant string value that
# is its name. This isn't terribly useful, but it is a decent small example.
nqp::dispatch('boot-syscall', 'dispatcher-register', 'type-name', -> $capture {
    # Bump the counter, just for testing purposes.
    $count++;

    # Obtain the value of the argument from the capture (using an existing
    # MoarVM op, though in the future this may go away in place of a syscall)
    # and then obtain the string typename also.
    my $arg-val := nqp::captureposarg($capture, 0);
    my str $name := $arg-val.HOW.name($arg-val);

    # This outcome is only going to be valid for a particular type. We track
    # the argument (which gives us an object back that we can use to guard
    # it) and then add the type guard.
    my $arg := nqp::dispatch('boot-syscall', 'dispatcher-track-arg', $capture, 0);
    nqp::dispatch('boot-syscall', 'dispatcher-guard-type', $arg);

    # Finally, insert the type name at the start of the capture and then
    # delegate to the boot-constant dispatcher.
    nqp::dispatch('boot-syscall', 'dispatcher-delegate', 'boot-constant',
        nqp::dispatch('boot-syscall', 'dispatcher-insert-arg-literal-str',
            $capture, 0, $name));
});

# A use of the dispatch for the tests. Put into a sub so there's a single
# static dispatch op, which all dispatch programs will hang off.
sub type-name($obj) {
    nqp::dispatch('type-name', $obj)
}

# Check with the first type, making sure the guard matches when it should
# (although this test would pass if the guard were ignored too).
ok(type-name(C1) eq 'C1', 'Dispatcher setting guard works');
ok($count == 1, 'Dispatch callback ran once');
ok(type-name(C1) eq 'C1', 'Can use it another time with the same type');
ok($count == 1, 'Dispatch callback was not run again');

# Test it with a second type, both record and run modes. This ensures the
# guard really is being checked.
ok(type-name(C2) eq 'C2', 'Can handle polymorphic sites when guard fails');
ok($count == 2, 'Dispatch callback ran a second time for new type');
ok(type-name(C2) eq 'C2', 'Second call with new type works');

# Check that we can use it with the original type too, and it has stacked
# the dispatch programs up at the same callsite.
ok(type-name(C1) eq 'C1', 'Call with original type still works');
ok($count == 2, 'Dispatch callback only ran a total of 2 times');

This time two dispatch programs get produced, one for C1:

Dispatch program (1 temporaries)
  Ops:
    Guard arg 0 (type=C1)
    Load collectable constant at index 1 into temporary 0
    Set result string value from temporary 0

And another for C2:

Dispatch program (1 temporaries)
  Ops:
    Guard arg 0 (type=C2)
    Load collectable constant at index 1 into temporary 0
    Set result string value from temporary 0

Once again, no leftovers from capture manipulation, tracking, or dispatcher delegation; the dispatch program does a type guard against an argument, then produces the result string. The whole call to $arg-val.HOW.name($arg-val) is elided, the dispatcher we wrote encoding the knowledge – in a way that the VM can understand – that a type’s name can be considered immutable.

This example is a bit contrived, but now consider that we instead look up a method and guard on the invocant type: that’s a method cache! Guard the types of more of the arguments, and we have a multi cache! Do both, and we have a multi-method cache.

The latter is interesting in so far as both the method dispatch and the multi dispatch want to guard on the invocant. In fact, in MoarVM today there will be two such type tests until we get to the point where the specializer does its work and eliminates these duplicated guards. However, the new dispatcher does not treat the dispatcher-guard-type as a kind of imperative operation that writes a guard into the resultant dispatch program. Instead, it declares that the argument in question must be guarded. If some other dispatcher already did that, it’s idempotent. The guards are emitted once all dispatch programs we delegate through, on the path to a final outcome, have had their say.

Fun aside: those being especially attentive will have noticed that the dispatch mechanism is used as part of implementing new dispatchers too, and indeed, this ultimately will mean that the specializer can specialize the dispatchers and have them JIT-compiled into something more efficient too. After all, from the perspective of MoarVM, it’s all just bytecode to run; it’s just that some of it is bytecode that tells the VM how to execute Raku programs more efficiently!

Dispatch resumption

A resumable dispatcher needs to do two things:

  1. Provide a resume callback as well as a dispatch one when registering the dispatcher
  2. In the dispatch callback, specify a capture, which will form the resume initialization state

When a resumption happens, the resume callback will be called, with any arguments for the resumption. It can also obtain the resume initialization state that was set in the dispatch callback. The resume initialization state contains the things needed in order to continue with the dispatch the first time it is resumed. We’ll take a look at how this works for method dispatch to see a concrete example. I’ll also, at this point, switch to looking at the real Rakudo dispatchers, rather than simplified test cases.

The Rakudo dispatchers take advantage of delegation, duplicate guards, and capture manipulations all having no runtime cost in the resulting dispatch program to, in my mind at least, quite nicely factor what is a somewhat involved dispatch process. There are multiple entry points to method dispatch: the normal boring $obj.meth(), the qualified $obj.Type::meth(), and the call me maybe $obj.?meth(). These have common resumption semantics – or at least, they can be made to provided we always carry a starting type in the resume initialization state, which is the type of the object that we do the method dispatch on.

Here is the entry point to dispatch for a normal method dispatch, with the boring details of reporting missing method errors stripped out.

# A standard method call of the form $obj.meth($arg); also used for the
# indirect form $obj."$name"($arg). It receives the decontainerized invocant,
# the method name, and the the args (starting with the invocant including any
# container).
nqp::dispatch('boot-syscall', 'dispatcher-register', 'raku-meth-call', -> $capture {
    # Try to resolve the method call using the MOP.
    my $obj := nqp::captureposarg($capture, 0);
    my str $name := nqp::captureposarg_s($capture, 1);
    my $meth := $obj.HOW.find_method($obj, $name);

    # Report an error if there is no such method.
    unless nqp::isconcrete($meth) {
        !!! 'Error reporting logic elided for brevity';
    }

    # Establish a guard on the invocant type and method name (however the name
    # may well be a literal, in which case this is free).
    nqp::dispatch('boot-syscall', 'dispatcher-guard-type',
        nqp::dispatch('boot-syscall', 'dispatcher-track-arg', $capture, 0));
    nqp::dispatch('boot-syscall', 'dispatcher-guard-literal',
        nqp::dispatch('boot-syscall', 'dispatcher-track-arg', $capture, 1));

    # Add the resolved method and delegate to the resolved method dispatcher.
    my $capture-delegate := nqp::dispatch('boot-syscall',
        'dispatcher-insert-arg-literal-obj', $capture, 0, $meth);
    nqp::dispatch('boot-syscall', 'dispatcher-delegate',
        'raku-meth-call-resolved', $capture-delegate);
});

Now for the resolved method dispatcher, which is where the resumption is handled. First, let’s look at the normal dispatch callback (the resumption callback is included but empty; I’ll show it a little later).

# Resolved method call dispatcher. This is used to call a method, once we have
# already resolved it to a callee. Its first arg is the callee, the second and
# third are the type and name (used in deferral), and the rest are the args to
# the method.
nqp::dispatch('boot-syscall', 'dispatcher-register', 'raku-meth-call-resolved',
    # Initial dispatch
    -> $capture {
        # Save dispatch state for resumption. We don't need the method that will
        # be called now, so drop it.
        my $resume-capture := nqp::dispatch('boot-syscall', 'dispatcher-drop-arg',
            $capture, 0);
        nqp::dispatch('boot-syscall', 'dispatcher-set-resume-init-args', $resume-capture);

        # Drop the dispatch start type and name, and delegate to multi-dispatch or
        # just invoke if it's single dispatch.
        my $delegate_capture := nqp::dispatch('boot-syscall', 'dispatcher-drop-arg',
            nqp::dispatch('boot-syscall', 'dispatcher-drop-arg', $capture, 1), 1);
        my $method := nqp::captureposarg($delegate_capture, 0);
        if nqp::istype($method, Routine) && $method.is_dispatcher {
            nqp::dispatch('boot-syscall', 'dispatcher-delegate', 'raku-multi', $delegate_capture);
        }
        else {
            nqp::dispatch('boot-syscall', 'dispatcher-delegate', 'raku-invoke', $delegate_capture);
        }
    },
    # Resumption
    -> $capture {
        ... 'Will be shown later';
    });

There’s an arguable cheat in raku-meth-call: it doesn’t actually insert the type object of the invocant in place of the invocant. It turns out that it doesn’t really matter. Otherwise, I think the comments (which are to be found in the real implementation also) tell the story pretty well.

One important point that may not be clear – but follows a repeating theme – is that the setting of the resume initialization state is also more of a declarative rather than an imperative thing: there isn’t a runtime cost at the time of the dispatch, but rather we keep enough information around in order to be able to reconstruct the resume initialization state at the point we need it. (In fact, when we are in the run phase of a resume, we don’t even have to reconstruct it in the sense of creating a capture object.)

Now for the resumption. I’m going to present a heavily stripped down version that only deals with the callsame semantics (the full thing has to deal with such delights as lastcall and nextcallee too). The resume initialization state exists to seed the resumption process. Once we know we actually do have to deal with resumption, we can do things like calculating the full list of methods in the inheritance graph that we want to walk through. Each resumable dispatcher gets a single storage slot on the call stack that it can use for its state. It can initialize this in the first step of resumption, and then update it as we go. Or more precisely, it can set up a dispatch program that will do this when run.

A linked list turns out to be a very convenient data structure for the chain of candidates we will walk through. We can work our way through a linked list by keeping track of the current node, meaning that there need only be a single thing that mutates, which is the current state of the dispatch. The dispatch program mechanism also provides a way to read an attribute from an object, and that is enough to express traversing a linked list into the dispatch program. This also means zero allocations.

So, without further ado, here is the linked list (rather less pretty in NQP, the restricted Raku subset, than it would be in full Raku):

# A linked list is used to model the state of a dispatch that is deferring
# through a set of methods, multi candidates, or wrappers. The Exhausted class
# is used as a sentinel for the end of the chain. The current state of the
# dispatch points into the linked list at the appropriate point; the chain
# itself is immutable, and shared over (runtime) dispatches.
my class DeferralChain {
    has $!code;
    has $!next;
    method new($code, $next) {
        my $obj := nqp::create(self);
        nqp::bindattr($obj, DeferralChain, '$!code', $code);
        nqp::bindattr($obj, DeferralChain, '$!next', $next);
        $obj
    }
    method code() { $!code }
    method next() { $!next }
};
my class Exhausted {};

And finally, the resumption handling.

nqp::dispatch('boot-syscall', 'dispatcher-register', 'raku-meth-call-resolved',
    # Initial dispatch
    -> $capture {
        ... 'Presented earlier;
    },
    # Resumption. The resume init capture's first two arguments are the type
    # that we initially did a method dispatch against and the method name
    # respectively.
    -> $capture {
        # Work out the next method to call, if any. This depends on if we have
        # an existing dispatch state (that is, a method deferral is already in
        # progress).
        my $init := nqp::dispatch('boot-syscall', 'dispatcher-get-resume-init-args');
        my $state := nqp::dispatch('boot-syscall', 'dispatcher-get-resume-state');
        my $next_method;
        if nqp::isnull($state) {
            # No state, so just starting the resumption. Guard on the
            # invocant type and name.
            my $track_start_type := nqp::dispatch('boot-syscall', 'dispatcher-track-arg', $init, 0);
            nqp::dispatch('boot-syscall', 'dispatcher-guard-type', $track_start_type);
            my $track_name := nqp::dispatch('boot-syscall', 'dispatcher-track-arg', $init, 1);
            nqp::dispatch('boot-syscall', 'dispatcher-guard-literal', $track_name);

            # Also guard on there being no dispatch state.
            my $track_state := nqp::dispatch('boot-syscall', 'dispatcher-track-resume-state');
            nqp::dispatch('boot-syscall', 'dispatcher-guard-literal', $track_state);

            # Build up the list of methods to defer through.
            my $start_type := nqp::captureposarg($init, 0);
            my str $name := nqp::captureposarg_s($init, 1);
            my @mro := nqp::can($start_type.HOW, 'mro_unhidden')
                ?? $start_type.HOW.mro_unhidden($start_type)
                !! $start_type.HOW.mro($start_type);
            my @methods;
            for @mro {
                my %mt := nqp::hllize($_.HOW.method_table($_));
                if nqp::existskey(%mt, $name) {
                    @methods.push(%mt{$name});
                }
            }

            # If there's nothing to defer to, we'll evaluate to Nil (just don't set
            # the next method, and it happens below).
            if nqp::elems(@methods) >= 2 {
                # We can defer. Populate next method.
                @methods.shift; # Discard the first one, which we initially called
                $next_method := @methods.shift; # The immediate next one

                # Build chain of further methods and set it as the state.
                my $chain := Exhausted;
                while @methods {
                    $chain := DeferralChain.new(@methods.pop, $chain);
                }
                nqp::dispatch('boot-syscall', 'dispatcher-set-resume-state-literal', $chain);
            }
        }
        elsif !nqp::istype($state, Exhausted) {
            # Already working through a chain of method deferrals. Obtain
            # the tracking object for the dispatch state, and guard against
            # the next code object to run.
            my $track_state := nqp::dispatch('boot-syscall', 'dispatcher-track-resume-state');
            my $track_method := nqp::dispatch('boot-syscall', 'dispatcher-track-attr',
                $track_state, DeferralChain, '$!code');
            nqp::dispatch('boot-syscall', 'dispatcher-guard-literal', $track_method);

            # Update dispatch state to point to next method.
            my $track_next := nqp::dispatch('boot-syscall', 'dispatcher-track-attr',
                $track_state, DeferralChain, '$!next');
            nqp::dispatch('boot-syscall', 'dispatcher-set-resume-state', $track_next);

            # Set next method, which we shall defer to.
            $next_method := $state.code;
        }
        else {
            # Dispatch already exhausted; guard on that and fall through to returning
            # Nil.
            my $track_state := nqp::dispatch('boot-syscall', 'dispatcher-track-resume-state');
            nqp::dispatch('boot-syscall', 'dispatcher-guard-literal', $track_state);
        }

        # If we found a next method...
        if nqp::isconcrete($next_method) {
            # Call with same (that is, original) arguments. Invoke with those.
            # We drop the first two arguments (which are only there for the
            # resumption), add the code object to invoke, and then leave it
            # to the invoke or multi dispatcher.
            my $just_args := nqp::dispatch('boot-syscall', 'dispatcher-drop-arg',
                nqp::dispatch('boot-syscall', 'dispatcher-drop-arg', $init, 0),
                0);
            my $delegate_capture := nqp::dispatch('boot-syscall',
                'dispatcher-insert-arg-literal-obj', $just_args, 0, $next_method);
            if nqp::istype($next_method, Routine) && $next_method.is_dispatcher {
                nqp::dispatch('boot-syscall', 'dispatcher-delegate', 'raku-multi',
                        $delegate_capture);
            }
            else {
                nqp::dispatch('boot-syscall', 'dispatcher-delegate', 'raku-invoke',
                        $delegate_capture);
            }
        }
        else {
            # No method, so evaluate to Nil (boot-constant disregards all but
            # the first argument).
            nqp::dispatch('boot-syscall', 'dispatcher-delegate', 'boot-constant',
                nqp::dispatch('boot-syscall', 'dispatcher-insert-arg-literal-obj',
                    $capture, 0, Nil));
        }
    });

That’s quite a bit to take in, and quite a bit of code. Remember, however, that this is only run for the record phase of a dispatch resumption. It also produces a dispatch program at the callsite of the callsame, with the usual guards and outcome. Implicit guards are created for the dispatcher that we are resuming at that point. In the most common case this will end up monomorphic or bimorphic, although situations involving nestings of multiple dispatch or method dispatch could produce a more morphic callsite.

The design I’ve picked forces resume callbacks to deal with two situations: the first resumption and the latter resumptions. This is not ideal in a couple of ways:

  1. It’s a bit inconvenient for those writing dispatch resume callbacks. However, it’s not like this is a particularly common activity!
  2. The difference results in two dispatch programs being stacked up at a callsite that might otherwise get just one

Only the second of these really matters. The reason for the non-uniformity is to make sure that the overwhelming majority of calls, which never lead to a dispatch resumption, incur no per-dispatch cost for a feature that they never end up using. If the result is a little more cost for those using the feature, so be it. In fact, early benchmarking shows callsame with wrap and method calls seems to be up to 10 times faster using the new dispatcher than in current Rakudo, and that’s before the specializer understands enough about it to improve things further!

What’s done so far

Everything I’ve discussed above is implemented, except that I may have given the impression somewhere that multiple dispatch is fully implemented using the new dispatcher, and that is not the case yet (no handling of where clauses and no dispatch resumption support).

Next steps

Getting the missing bits of multiple dispatch fully implemented is the obvious next step. The other missing semantic piece is support for callwith and nextwith, where we wish to change the arguments that are being used when moving to the next candidate. A few other minor bits aside, that in theory will get all of the Raku dispatch semantics at least supported.

Currently, all standard method calls ($obj.meth()) and other calls (foo() and $foo()) go via the existing dispatch mechanism, not the new dispatcher. Those will need to be migrated to use the new dispatcher also, and any bugs that are uncovered will need fixing. That will get things to the point where the new dispatcher is semantically ready.

After that comes performance work: making sure that the specializer is able to deal with dispatch program guards and outcomes. The goal, initially, is to get steady state performance of common calling forms to perform at least as well as in the current master branch of Rakudo. It’s already clear enough there will be some big wins for some things that to date have been glacial, but it should not come at the cost of regression on the most common kinds of dispatch, which have received plenty of optimization effort before now.

Furthermore, NQP – the restricted form of Raku that the Rakudo compiler and other bits of the runtime guts are written in – also needs to be migrated to use the new dispatcher. Only when that is done will it be possible to rip out the current method cache, multiple dispatch cache, and so forth from MoarVM.

An open question is how to deal with backends other than MoarVM. Ideally, the new dispatch mechanism will be ported to those. A decent amount of it should be possible to express in terms of the JVM’s invokedynamic (and this would all probably play quite well with a Truffle-based Raku implementation, although I’m not sure there is a current active effort in that area).

Future opportunities

While my current focus is to ship a Rakudo and MoarVM release that uses the new dispatcher mechanism, that won’t be the end of the journey. Some immediate ideas:

Some new language features may also be possible to provide in an efficient way with the help of the new dispatch mechanism. For example, there’s currently not a reliable way to try to invoke a piece of code, just run it if the signature binds, or to do something else if it doesn’t. Instead, things like the Cro router have to first do a trial bind of the signature, and then do the invoke, which makes routing rather more costly. There’s also the long suggested idea of providing pattern matching via signatures with the when construct (for example, when * -> ($x) {}; when * -> ($x, *@tail) { }), which is pretty much the same need, just in a less dynamic setting.

In closing…

Working on the new dispatch mechanism has been a longer journey than I first expected. The resumption part of the design was especially challenging, and there’s still a few important details to attend to there. Something like four potential approaches were discarded along the way (although elements of all of them influenced what I’ve described in this post). Abstractions that hold up are really, really, hard.

I also ended up having to take a couple of months away from doing Raku work at all, felt a bit crushed during some others, and have been juggling this with the equally important RakuAST project (which will be simplified by being able to assume the presence of the new dispatcher, and also offers me a range of softer Raku hacking tasks, whereas the dispatcher work offers few easy pickings).

Given all that, I’m glad to finally be seeing the light at the end of the tunnel. The work that remains is enumerable, and the day we ship a Rakudo and MoarVM release using the new dispatcher feels a small number of months away (and I hope writing that is not tempting fate!)

The new dispatcher is probably the most significant change to MoarVM since I founded it, in so far as it sees us removing a bunch of things that have been there pretty much since the start. RakuAST will also deliver the greatest architectural change to the Rakudo compiler in a decade. Both are an opportunity to fold years of learning things the hard way into the runtime and compiler. I hope when I look back at it all in another decade’s time, I’ll at least feel I made more interesting mistakes this time around.

brrt to the future: Why bother with Scripting?

Published by Bart Wiegmans on 2021-03-14T14:33:00

Many years back, Larry Wall shared his thesis on the nature of scripting. Since recently even Java gained 'script' support I thought it would be fitting to revisit the topic, and hopefully relevant to the perl and raku language community.

The weakness of Larry's treatment (which, to be fair to the author, I think is more intended to be enlightening than to be complete) is the contrast of scripting with programming. This contrast does not permit a clear separation because scripts are programs. That is to say, no matter how long or short, scripts are written commands for a machine to execute, and I think that's a pretty decent definition of a program in general.

A more useful contrast - and, I think, the intended one - is between scripts and other sorts of programs, because that allows us to compare scripting (writing scripts) with 'programming' (writing non-script programs). And to do that we need to know what other sorts of programs there are.

The short version of that answer is - systems and applications, and a bunch of other things that aren't really relevant to the working programmer, like (embedded) control algorithms, spreadsheets and database queries. (The definition I provided above is very broad, by design, because I don't want to get stuck on boundary questions). Most programmers write applications, some write systems, virtually all write scripts once in a while, though plenty of people who aren't professional programmers also write scripts.

I think the defining features of applications and systems are, respectively:

Consider for instance a mail client (like thunderbird) in comparison to a mailer daemon (like sendmail) - one provides an interface to read and write e-mails (the model) and the other provides functionality to send that e-mail to other servers.

Note that under this (again, broad) definition, libraries are also system software, which makes sense, considering that their users are developers (just as for, say, PostgreSQL) who care about things like performance, reliability, and correctness. Incidentally, libraries as well as 'typical' system software (such as database engines and operating system kernels) tend to be written in languages like C and C++ for much the same reasons.

What then, are the differences between scripts, applications, and systems? I think the following is a good list:

Obviously these distinctions aren't really binary - 'short' versus 'long', 'ad-hoc' versus 'general purpose'  - and can't be used to conclusively settle the question whether something is a script or an application. (If, indeed, that question ever comes up). More important is that for the 10 or so scripts I've written over the past year - some professionally, some not - all or most of these properties held, and I'd be surprised if the same isn't true for most readers. 

And - finally coming at the point that I'm trying to make today - these features point to a specific niche of programs more than to a specific technology (or set of technologies). To be exact, scripts are (mostly) short, custom programs to automate ad-hoc tasks, tasks that are either to specific or too small to develop and distribute another program for.

This has further implications on the preferred features of a scripting language (taken to mean, a language designed to enable the development of scripts). In particular:

As an example of the last point - Python 3 requires users to be exact about the encoding of their input, causing all sorts of trouble for unsuspecting scripters when they accidentally try to read ISO-8551 data as UTF-8, or vice versa. Python 2 did not, and for most scripts - not applications - I actually think that is the right choice.

This niche doesn't always exist. In computing environments where everything of interest is adequately captured by an application, or which lacks the ability to effectively automate ad-hoc tasks (I'm thinking in particular of Windows before PowerShell), the practice of scripting tends to not develop. Similarily, in a modern 'cloud' environment, where system setup is controlled by a state machine hosted by another organization, scripting doesn't really have much of a future.

To put it another way, scripting only thrives in an environment that has a lot of 'scriptable' tasks; meaning tasks for which there isn't already a pre-made solution available, environments that have powerful facilities available for a script to access, and whose users are empowered to automate those tasks. Such qualities are common on Unix/Linux 'workstations' but rather less so on smartphones and (as noted before) cloud computing environments.

Truth be told I'm a little worried about that development. I could point to, and expound on, the development and popularity of languages like go and rust, which aren't exactly scripting languages, or the replacement of Javascript with TypeScript, to make the point further, but I don't think that's necessary. At the same time I could point to the development of data science as a discipline to demonstrate that scripting is alive and well (and indeed perhaps more economically relevant than before).

What should be the conclusion for perl 5/7 and raku? I'm not quite sure, mostly because I'm not quite sure whether the broader perl/raku community would prefer their sister languages to be scripting or application languages. (As implied above, I think the Python community chose that they wanted Python 3 to be an application language, and this was not without consequences to their users). 

Raku adds a number of features common to application languages (I'm thinking of it's powerful type system in particular), continuing a trend that perl 5 arguably pioneered. This is indeed a very powerful strategy - a language can be introduced for scripts and some of those scripts are then extended into applications (or even systems), thereby ensuring its continued usage. But for it to work, a new perl family language must be introduced on its scripting merits, and there must be a plentiful supply of scriptable tasks to automate, some of which - or a combination of which - grow into an application.

For myself, I would like to see scripting have a bright future. Not just because scripting is the most accessible form of programming, but also because an environment that permits, even requires scripting, is one were not all interesting problems have been solved, one where it's users ask it to do tasks so diverse that there isn't an app for that, yet. One where the true potential of the wonderful devices that surround is can be explored.

In such a world there might well be a bright future for scripting.

Andrew Shitov: Computing factorials using Raku

Published by Andrew Shitov on 2021-01-31T18:19:33

In this post, I’d like to demonstrate a few ways of computing factorials using the Raku programming language.

1 — reduction

Let me start with the basic and the most effective (non necessarily the most efficient) form of computing the factorial of a given integer number:

say [*] 1..10; # 3628800

In the below examples, we mostly will be dealing with the factorial of 10, so remember the result. But to make the programs more versatile, let us read the number from the command line:

unit sub MAIN($n);

say [*] 1..$n;

To run the program, pass the number:

$ raku 00-cmd.raku 10
3628800

The program uses the reduction meta-operator [ ] with the main operator * in it.

You can also start with 2 (you can even compute 0! and 1! this way).

unit sub MAIN($n);

say [*] 2..$n;

2 — for

The second solution is using a postfix for loop to multiply the numbers in the range:

unit sub MAIN($n);

my $f = 1;
$f *= $_ for 2..$n;

say $f;

This solution is not that expressive but still demonstrates quite a clear code.

3 — map

You can also use map that is applied to a range:

unit sub MAIN($n);

my $f = 1;
(2..$n).map: $f *= *;

say $f;

Refer to my article All the stars of Perl 6, or * ** * to learn more about how to read *= *.

4 — recursion

Let’s implement a recursive solution.

unit sub MAIN($n);

sub factorial($n) {
    if $n < 2 {
        return 1;
    }
    else {
        return $n * factorial($n - 1);
    }
}

say factorial(n);

There are two branches, one of which terminates recursion.

5 — better recursion

The previous program can be rewritten to make a code with less punctuation:

unit sub MAIN($n);

sub factorial($n) {
    return 1 if $n < 2;
    return $n * factorial($n - 1);
}

say factorial($n);

Here, the first return is managed by a postfix if, and the second return can only be reached if the condition in if is false. So, neither an additional Boolean test nor else is needed.

6 — big numbers

What if you need to compute a factorial of a relatively big number? No worries, Raku will just do it:

say [*] 1..500;

The speed is more than acceptable for any practical application:

raku 06-long-factorial.raku  0.14s user 0.02s system 124% cpu 0.127 total

7 — small numbers

Let’s try something opposite and compute a factorial, which can fit a native integer:

unit sub MAIN($n);

my int $f = 1;
$f *= $_ for 2..$n;

say $f;

I am using a for loop here, but notice that the type of $f is a native integer (thus, 4 bytes). This program works with the numbers up to 20:

$ raku 07-int-factorial.raku 20
2432902008176640000

8 — sequence

The fun fact is that you can add a dot to the first program 🙂

unit sub MAIN($n);

say [*] 1 ... $n;

Now, 1 ... $n is a sequence. You can start it with 2 if you are not planning to compute a factorials of 0 and 1.

9 — reversed sequence

Unlike the solution with a range, it is possible to swap the ends of the sequence:

unit sub MAIN($n);

say [*] $n ... 1;

10 — sequence with definition

Nothing stops us from defining the elements of the sequence with a code block. The next program shows how you do it:

unit sub MAIN($n);

my @f = 1, * * ++$ ... *;
say @f[$n];

This time, the program generates a sequence of factorials from 1! to $n!, and to print the only one we need, we take the value from the array as @f[$n]. Notice that the sequence itself is lazy and its right end is undefined, so you can’t use @f[*-1], for example.

The rule here is * * ++$ (multiply the last computed value by the incremented index); it is using the built-in state variable $.

11 — multi functions

The idea of the solutions 4 and 5 with two branches can be further transformed to using multi-functions:

unit sub MAIN($n);

multi sub factorial(1)  { 1 }
multi sub factorial($n) { $n * factorial($n - 1) }

say factorial($n);

For the numbers above 1, Raku calls the second variant of the function. When the number comes down to 1, recursion stops, because the first variant is called. Notice how easily you can create a variant of a function that only reacts to the given value.

12 — where

The previous program loops infinitely if you try to set $n to 0. One of the simplest solution is to add a where clause to catch that case too.

unit sub MAIN($n);

multi sub factorial($n where $n < 2)  { 1 }
multi sub factorial($n) { $n * factorial($n - 1) }

say factorial($n);

13 — operator

Here’s another classical Raku solution: modifying its grammar to allow mathematical notation $n!.

unit sub MAIN($n);

sub postfix:<!>($n) {
    [*] 1..$n
}

say $n!;

14 — methodop

A rarely seen Raku’s feature called methodop (method operator) that allows you to call a function as it if was a method:

unit sub MAIN($n);

sub factorial($n) { 
    [*] 1..$n
}

say $n.&factorial;

15 — cached

Recursive solutions are perfect subjects for result caching. The following program demonstrates this approach.

unit sub MAIN($n);

use experimental :cached;

sub f($n) is cached {
    say "Called f($n)";
    return 1 if $n < 2;
    return $n * f($n - 1);
}

say f($n div 2);
say f(10);

This program first computes a factorial of the half of the input number, and then of the number itself. The program logs all the calls of the function. You can clearly see that, say, the factorial of 10 is using the results that were already computed for the factorial of 5:

$ raku 15-cached-factorial.raku 10
Called f(5)
Called f(4)
Called f(3)
Called f(2)
Called f(1)
120
Called f(10)
Called f(9)
Called f(8)
Called f(7)
Called f(6)
3628800

Note that the feature is experimental.

16 — triangular reduction

The reduction operator that we already used has a special variant [\ ] that allows to keep all the intermediate results. This is somewhat similar to using a sequence in the example 10.

unit sub MAIN($n);

my @f = [\*] 1..$n;

say @f[$n - 1];

17 — division of factorials

Now a few programs that go beyond the factorials themselves. The first program computes the value of the expression a! / b!, where both a and b are integer numbers, and a is not less than b.

The idea is to optimise the solution to skip the overlapping parts of the multiplication sequences. For example, 10! / 5! is 6 * 7 * 8 * 9 * 10.

To have more fun, let us modify Raku’s grammar so that it really parses the above expression.

unit sub MAIN($a, $b where $a >= $b);

class F {
    has $.n;    
}

sub postfix:<!>(Int $n) {    
    F.new(n => $n)
}

sub infix:</>(F $a, F $b) { 
    [*] $b.n ^.. $a.n
}

say $a! / $b!;

We already have seen the postfix:<!> operator. To catch division, another operator is defined, but to prevent catching the division of data of other types, a proxy class F is introduced.

To keep proper processing of expression such as 4 / 5, define another / operator that catches things which are not F. Don’t forget to add multi to both options. The callsame built-in routine dispatches control to built-in operator definitions.

. . .

multi sub infix:</>(F $a, F $b) { 
    [*] $b.n ^.. $a.n
}

multi sub infix:</>($a, $b) {
    callsame
}

say $a! / $b!;
say 4 / 5;

18 — optimisation

Let’s try to reduce the number of multiplications. Take a factorial of 10:

10 * 9 * 8 * 7 * 6 * 5 * 4 * 3 * 2 * 1

Now, take one number from each end, multiply them, and repeat the procedure:

10 * 1 = 10
 9 * 2 = 18
 8 * 3 = 24
 7 * 4 = 28
 6 * 5 = 30

You can see that every such result is bigger than the previous one by 8, 6, 4, and 2. In other words, the difference reduces by 2 on each iteration, starting from 10, which is the input number.

The whole program that implements this algorithm is shown below:

unit sub MAIN(
    $n is copy where $n %% 2 #= Even numbers only
);

my $f = $n;

my $d = $n - 2;
my $m = $n + $d;

while $d > 0 {
    $f *= $m;
    $d -= 2;
    $m += $d;
}

say $f;

It only works for even input numbers, so it contains a restriction reflected in the where clause of the MAIN function. As homework, modify the program to accept odd numbers too.

19 — integral

Before wrapping up, let’s look at a couple of exotic methods, which, however, can be used to compute factorials of non-integer numbers (or, to be stricter, to compute what can be called extended definition of it).

The proper way would be to use the Gamma function, but let me illustrate the method with a simpler formula:

An integral is a sum by definition, so let’s make a straightforward loop:

unit sub MAIN($n);

my num $f = 0E0;
my num $dx = 1E-6;
loop (my $x = $dx; $x <= 1; $x += $dx) {
    $f += (-log($x)) ** $n;
}

say $f * $dx;

With the given step of 1E-6, the result is not that exact:

$ raku 19-integral-factorial.raku 10
3086830.6595557937

But you can compute a ‘factorial’ of a floating-point number. For example, 5! is 120 and 6! is 720, but what is 5.5!?

$ raku 19-integral-factorial.raku 5.5
285.948286477563

20 — another formula

And finally, the Stirling’s formula for the rescue. The bigger the n, the more correct is the result.

The implementation can be as simple as this:

unit sub MAIN($n);

# τ = 2 * π
say (τ * $n).sqrt * ($n / e) ** $n;

But you can make it a bit more outstanding if you have a fixed $n:

say sqrt(τ * 10) * (10 / e)¹⁰;

* * *

And that’s it for now. You can find the source code of all the programs shown here in the GitHub repository github.com/ash/factorial.

Andrew Shitov: The course of Raku

Published by Andrew Shitov on 2021-01-13T08:44:00

I am happy to report that the first part of the Raku course is completed and published. The course is available at course.raku.org.

The grant was approved a year and a half ago right before the PerlCon conference in Rīga. I was the organiser of the event, so I had to postpone the course due to high load. During the conference, it was proposed to rename Perl 6, which, together with other stuff, made me think if the course is needed.

After months, the name was settled, the distinction between Perl and Raku became clearer, and, more importantly, external resourses and services, e.g., Rosettacode and glot.io started using the new name. So, now I think it is still a good idea to create the course that I dreamed about a couple of years ago. I started the main work in the middle of November 2020, and by the beginning of January 2021, I had the first part ready.

The current plan includes five parts:

  1. Raku essentials
  2. Advanced Raku subjects
  3. Object-oriented programming in Raku
  4. Regexes and grammars
  5. Functional, concurrent, and reactive programming

It differs a bit from the original plan published in the grant proposal. While the material stays the same, I decided to split it differently. Initially, I was going to go through all the topics one after another. Now, the first sections reveal the basics of some topics, and we will return to the same topics on the next level in the second part.

For example, in the first part, I only talk about the basic data types: IntRatNumStrRangeArrayList, and Hash and basic usage of them. The rest, including other types (e.g., Date or DateTime) and the methods such as @array.rotate or %hash.kv is delayed until the second part.

Contrary, functions were a subject of the second part initially, but they are now discussed in the first part. So, we now have Part 1 “Raku essentials” and Part 2 “Advanced Raku topics”. This shuffling allowed me to create a liner flow in such a way that the reader can start writing real programs already after they finish the first part of the course.

I must say that it is quite a tricky task to organise the material without backward links. In the ideal course, any topic may only be based on the previously explained information. A couple of the most challenging cases were ranges and typed variables. They both causes a few chicken-and-egg loops.

During the work on the first part, I also prepared a ‘framework’ that generates the navigation through the site and helps with quiz automation. It is hosted as GitHub Pages and uses Jekyll and Liquid for generating static pages, and a couple of Raku programs to automate the process of adding new exercises and highlighting code snippets. Syntax highlighting is done with Pygments.

Returning the to course itself, it includes pages of a few different types:

The quizzes were not part of the grant proposal, but I think they help making a better user experience. All the quizzes have answers and comments. All the exercises are solved and published with the comments to explain the solution, or even to highlight some theoretical aspects.

The first part covers 91 topics and includes 73 quizzes and 65 exercises (with 70 solutions :-). There are about 330 pages in total. The sources are kept in a GitHub repository github.com/ash/raku-course, so people can send pull requiest, etc.

At this point, the first part is fully ready. I may slightly update it if the following parts require additional information about the topics covered in Part 1.

This text is a grant report, and it is also (a bit modified) published at https://news.perlfoundation.org/post/rakucourse1 on 13 January 2021.

Andrew Shitov: Raku Challenge, Week 92, Issue 1

Published by Andrew Shitov on 2020-12-22T08:24:00

This week’s task has an interesting solution in Raku. So, here’s the task:

You are given two strings $A and $B. Write a script to check if the given strings are Isomorphic. Print 1 if they are otherwise 0.

OK, so if the two strings are isomorphic, their characters are mapped: for each character from the first string, the character at the same position in the second string is always the same.

In the stings abc and def, a always corresponds to d, b to e, and c to f. That’s a trivial case. But then for the string abca, the corresponding string must be defd.

The letters do not need to go sequentially, so the strings aeiou and bcdfg are isomorphic too, as well as aeiou and gxypq. But also aaeeiioouu and bbccddffgg, or the pair aeaieoiuo and gxgyxpyqp.

The definition also means that the number of different characters is equal in both strings. But it also means that if we make the pairs of corresponding letters, the number of unique pairs is also the same, right? If a matches x, there cannot be any other pair with the first letter a.

Let’s exploit these observation:

sub is-isomorphic($a, $b) {
    +(([==] ($a, $b)>>.chars) && 
      ([==] ($a.comb, $b.comb, ($a.comb Z~ $b.comb))>>.unique));
}

First of all, the strings must have the same length.

Then, the strings are split into characters, and the number of unique characters should also be equal. But the collection of the unique pairs from the corresponding letters from both strings should also be of the same size.

Test it:

use Test;

# . . .

is(is-isomorphic('abc', 'def'), 1);
is(is-isomorphic('abb', 'xyy'), 1);
is(is-isomorphic('sum', 'add'), 0);
is(is-isomorphic('ACAB', 'XCXY'), 1);
is(is-isomorphic('AAB', 'XYZ'), 0);
is(is-isomorphic('AAB', 'XXZ'), 1);
is(is-isomorphic('abc', 'abc'), 1);
is(is-isomorphic('abc', 'ab'), 0);

* * *

→ GitHub repository
→ Navigation to the Raku challenges post series

Andrew Shitov: Advent of Code 2020 Day 18/25 in the Raku programming language

Published by Andrew Shitov on 2020-12-18T22:07:44

Today there’s a chance to demonstrate powerful features of Raku on the solution of Day 18 of this year’s Advent of Code.

The task is to print the sum of a list of expressions with +, *, and parentheses, but the precedence of the operations is equal in the first part of the problem, and is opposite to the standard precedence in the second part.

In other words, 3 + 4 * 5 + 6 is (((3 + 4) * 5) + 6) in the first case and (3 + 4) * (5 + 6) in the second.

Here is the solution. I hope you are impressed too.

use MONKEY-SEE-NO-EVAL;
 
sub infix:<m>($a, $b) { $a * $b }

say [+] ('input.txt'.IO.lines.race.map: *.trans('*' => 'm')).map: {EVAL($_)}

The lines with the expressions come from the file input.txt. For each line, I am replacing * with m, which I earlier made an infix operator that actually does multiplication.

For the second part, we need our m to have lower precedence than +. There’s nothing simpler:

sub infix:<m>($a, $b) is looser<+> { $a * $b }

Parsing and evaluation are done using EVAL.

* * *

→ Browse the code on GitHub
→ See all blog posts on Advent of Code 2020

Andrew Shitov: The second wave of Covid.observer

Published by Andrew Shitov on 2020-12-15T22:15:33

When I started covid.observer about seven months ago, I thought there would be no need to update it after about 3-4 months. In reality, we are approaching to the end of the year, and I will have to fix the graphs which display data per week, as the week numbers will very soon make a loop.

All this time, more data arrived, and I also made it even more by adding a separate statistics for the regions of Russia, with its 85 subdivisions, which brought the total count of countries and regions up to almost 400.

mysql> select count(distinct cc) from totals;
+--------------------+
| count(distinct cc) |
+--------------------+
|                392 |
+--------------------+
1 row in set (0.00 sec)

Due to frequent updates that changes data in the past, it is not that easy to make incremental update of statistics, and again, I did not expect that I’ll run the site for so long.

mysql> select count(distinct date) from daily_totals;
+----------------------+
| count(distinct date) |
+----------------------+
|                  329 |
+----------------------+
1 row in set (0.00 sec)

The bottom line is that daily generation became clumsy and not smooth. Before summer, the whole website could be regenerated in less than 15 minutes, but now it turned to 40-50 minutes. And I tend to do it twice a day, as a fresh portion of today’s Russian data arrives a few hours after we’ve got a daily update by the Johns Hopkins University (for yesterday’s stats).

But the most scary signals began after the program started crashing with quite unpleasant errors.

Latest JHU data on 12/12/20
Latest RU data on 12/13/20
Generating impact timeline...
Generating world data...
MoarVM panic: Unable to initialize event loop
Failed to open file /Users/ash/Projects/covid.observer/COVID-19/csse_covid_19_data/csse_covid_19_daily_reports_us/12-10-2020.csv: Too many open files
Generating impact timeline...
Generating world data...
Not enough positional arguments; needed at least 4
  in sub per-capita-data at /Users/ash/Projects/covid.observer/lib/CovidObserver/Statistics.rakumod (CovidObserver::Statistics) line 1906

The errors were not consistent, and I managed to re-run the program by pieces to get the update. But none of the errors were easily explainable.

MoarVM panic gives no explanation, but it completely disappears if I run the program in two parts:

$ ./covid.raku fetch
$ ./covid.raku generate

instead of a combined run that both fetches the data and generates the statistics:

$ ./covid.raku update

The Too many open files is a strange one as while I process the files in loops, I do not intentionally keep them open. But that error seems to be solved by changing system settings:

$ ulimit -n 10000

The final error, Not enough positional arguments; needed at least 4, is the weirdest. Such thing happens when you call a function that expects a different number of arguments. That never occurred for months after all bugs were found and fixed. It can only be explained by the new piece of data. Indeed, it may happen that some data is missing, but I believe I already found all the cases where I need to provide the function calls with default zero values.

Having all that, and the fact that the program run takes dozens of minutes before you can catch an error, it was quite frustrating.

And here comes Liz!

She proposed to look into the things and actually spent the whole day by first installing the code and all its requirements and then by actually doing that job to run, debug, and re-run. By the end of the day she created a pull request, which made the program twice as fast!

Let’s look at the changes. There are three of them (but no, they do not directly answer the above-mentioned three error messages).

The first two changes introduce parallel processing of countries (remember, there are about 400 of what is considered a unique $cc in the program).

my %country-stats = get-known-countries<>.race(:1batch,:8degree).map: -> $cc {
   $cc => generate-country-stats($cc, %CO, :%mortality, :%crude, :$skip-excel)
}

Calling .race on the result of get-known-countries() function improves the previously sequential processing of countries. Indeed, their stats are computed independently, so there’s no reason for one country to wait for another. The parameters of race, the batch size and the number of workers, can probably be tuned to fit your hardware.

The second change is similar, but for another part of the code where the continents are processed in a loop:

for %continents.keys.race(:1batch,:8degree) -> $cont {
   generate-continent-stats($cont, %CO, :$skip-excel);
}

Finally, the third change is to make some counters native integers instead of Raku Ints:

my int $c = $confirmed[$index] // 0;
my int $f = $failed[$index] // 0;
my int $r = $recovered[$index] // 0;
my int $a = $active[$index] // 0;

I understand that this reduces both the memory and the processing time of these variables, but for some reason it also eliminated the error in counting function parameters.

And finally, I want to mention the <> thing that you may have noticed in the first code change. This is the so-called decontainerization operator. What it does is illustrated by this example from the documentation:

use JSON::Tiny;

my $config = from-json('{ "files": 3, "path": "/home/some-user/raku.pod6" }');

say $config.raku;
# OUTPUT: «${:files(3), :path("/home/some-user/raku.pod6")}» 

my %config-hash = $config<>;
say %config-hash.raku;
# OUTPUT: «{:files(3), :path("/home/some-user/raku.pod6")}»

The $config variable is a scalar variable that keeps a hash. To work with it as with a hash, the variable is decontainerized as $config<>. This gives us a proper hash %config-hash.

I think that’s it for now. The main advantage of the above changes is that the program now needs less than 25 minutes to re-generate the whole site and it does not fail.

Well, but it became a bit louder too as Rakudo uses more cores 🙂

Thanks, Liz!

6guts: Taking a break from Raku core development

Published by jnthnwrthngtn on 2020-10-05T19:44:26

I’d like to thank everyone who voted for me in the recent Raku Steering Council elections. By this point, I’ve been working on the language for well over a decade, first to help turn a language design I found fascinating into a working implementation, and since the Christmas release to make that implementation more robust and performant. Overall, it’s been as fun as it has been challenging – in a large part because I’ve found myself sharing the journey with a lot of really great people. I’ve also tried to do my bit to keep the community around the language kind and considerate. Receiving a vote from around 90% of those who participated in the Steering Council elections was humbling.

Alas, I’ve today submitted my resignation to the Steering Council, on personal health grounds. For the same reason, I’ll be taking a step back from Raku core development (Raku, MoarVM, language design, etc.) Please don’t worry too much; I’ll almost certainly be fine. It may be I’m ready to continue working on Raku things in a month or two. It may also be longer. Either way, I think Raku will be better off with a fully sized Steering Council in place, and I’ll be better off without the anxiety that I’m holding a role that I’m not in a place to fulfill.

brrt to the future: Reverse Linear Scan Allocation is probably a good idea

Published by Bart Wiegmans on 2019-03-21T15:52:00

Hi hackers! Today First of all, I want to thank everybody who gave such useful feedback on my last post.  For instance, I found out that the similarity between the expression JIT IR and the Testarossa Trees IR is quite remarkable, and that they have a fix for the problem that is quite different from what I had in mind.

Today I want to write something about register allocation, however. Register allocation is probably not my favorite problem, on account of being both messy and thankless. It is a messy problem because - aside from being NP-hard to solve optimally - hardware instruction sets and software ABI's introduce all sorts of annoying constraints. And it is a thankless problem because the case in which a good register allocator is useful - for instance, when there's lots of intermediate values used over a long stretch of code - are fairly rare. Much more common are the cases in which either there are trivially sufficient registers, or ABI constraints force a spill to memory anyway (e.g. when calling a function, almost all registers can be overwritten).

So, on account of this being not my favorite problem, and also because I promised to implement optimizations in the register allocator, I've been researching if there is a way to do better. And what better place to look than one of the fastest dynamic language implementations arround, LuaJIT? So that's what I did, and this post is about what I learned from that.

Truth be told, LuaJIT is not at all a learners' codebase (and I don't think it's author would claim this). It uses a rather terse style of C and lots and lots of preprocessor macros. I had somewhat gotten used to the style from hacking dynasm though, so that wasn't so bad. What was more surprising is that some of the steps in code generation that are distinct and separate in the MoarVM JIT - instruction selection, register allocation and emitting bytecode - were all blended together in LuaJIT. Over multiple backend architectures, too. And what's more - all these steps were done in reverse order - from the end of the program (trace) to the beginning. Now that's interesting...

I have no intention of combining all phases of code generation like LuaJIT has. But processing the IR in reverse seems to have some interesting properties. To understand why that is, I'll first have to explain how linear scan allocation currently works in MoarVM, and is most commonly described:

  1. First, the live ranges of program values are computed. Like the name indicates, these represent the range of the program code in which a value is both defined and may be used. Note that for the purpose of register allocation, the notion of a value shifts somewhat. In the expression DAG IR, a value is the result of a single computation. But for the purposes of register allocation, a value includes all its copies, as well as values computed from different conditional branches. This is necessary because when we actually start allocating registers, we need to know when a value is no longer in use (so we can reuse the register) and how long a value will remain in use -
  2. Because a value may be computed from distinct conditional branches, it is necessary to compute the holes in the live ranges. Holes exists because if a value is defined in both sides of a conditional branch, the range will cover both the earlier (in code order) branch and the later branch - but from the start of the later branch to its definition that value doesn't actually exist. We need this information to prevent the register allocator from trying to spill-and-load a nonexistent value, for instance.
  3. Only then can we allocate and assign the actual registers to instructions. Because we might have to spill values to memory, and because values now can have multiple definitions, this is a somewhat subtle problem. Also, we'll have to resolve all architecture specific register requirements in this step.
In the MoarVM register allocator, there's a fourth step and a fifth step. The fourth step exists to ensure that instructions conform to x86 two-operand form (Rather than return the result of an instruction in a third register, x86 reuses one of the input registers as the output register. E.g. all operators are of the form a = op(a, b)  rather than a = op(b, c). This saves on instruction encoding space). The fifth step inserts instructions that are introduced by the third step; this is done so that each instruction has a fixed address in the stream while the stream is being processed.

Altogether this is quite a bit of complexity and work, even for what is arguably the simplest correct global register allocation algorithm. So when I started thinking of the reverse linear scan algorithm employed by LuaJIT, the advantages became clear:
There are downsides as well, of course. Not knowing exactly how long a value will be live while processing it may cause the algorithm to make worse choices in which values to spill. But I don't think that's really a great concern, since figuring out the best possible value is practically impossible anyway, and the most commonly cited heuristic - evict the value that is live furthest in the future, because this will release a register over a longer range of code, reducing the chance that we'll need to evict again - is still available. (After all, we do always know the last use, even if we don't necessarily know the first definition).

Altogether, I'm quite excited about this algorithm; I think it will be a real simplification over the current implementation. Whether that will work out remains to be seen of course. I'll let you know!

brrt to the future: Something about IR optimization

Published by Bart Wiegmans on 2019-03-17T06:23:00

Hi hackers! Today I want to write about optimizing IR in the MoarVM JIT, and also a little bit about IR design itself.

One of the (major) design goals for the expression JIT was to have the ability to optimize code over the boundaries of individual MoarVM instructions. To enable this, the expression JIT first expands each VM instruction into a graph of lower-level operators. Optimization then means pattern-matching those graphs and replacing them with more efficient expressions.

As a running example, consider the idx operator. This operator takes two inputs (base and element) and a constant parameter scale and computes base+element*scale. This represents one of the operands of an  'indexed load' instruction on x86, typically used to process arrays. Such instructions allow one instruction to be used for what would otherwise be two operations (computing an address and loading a value). However, if the element of the idx operator is a constant, we can replace it instead with the addr instruction, which just adds a constant to a pointer. This is an improvement over idx because we no longer need to load the value of element into a register. This saves both an instruction and valuable register space.

Unfortunately this optimization introduces a bug. (Or, depending on your point of view, brings an existing bug out into the open). The expression JIT code generation process selects instructions for subtrees (tile) of the graph in a bottom-up fashion. These instructions represent the value computed or work performed by that subgraph. (For instance, a tree like (load (addr ? 8) 8) becomes mov ?, qword [?+8]; the question marks are filled in during register allocation). Because an instruction is always represents a tree, and because the graph is an arbitrary directed acyclic graph, the code generator projects that graph as a tree by visiting each operator node only once. So each value is computed once, and that computed value is reused by all later references.

It is worth going into some detail into why the expression graph is not a tree. Aside from transformations that might be introduced by optimizations (e.g. common subexpression elimination), a template may introduce a value that has multiple references via the let: pseudo-operator. See for instance the following (simplified) template:

(let: (($foo (load (local))))
    (add $foo (sub $foo (const 1))))

Both ADD and SUB refer to the same LOAD node


In this case, both references to $foo point directly to the same load operator. Thus, the graph is not a tree. Another case in which this occurs is during linking of templates into the graph. The output of an instruction is used, if possible, directly as the input for another instruction. (This is the primary way that the expression JIT can get rid of unnecessary memory operations). But there can be multiple instructions that use a value, in which case an operator can have multiple references. Finally, instruction operands are inserted by the compiler and these can have multiple references as well.

If each operator is visited only once during code generation, then this may introduce a problem when combined with another feature - conditional expressions. For instance, if two branches of a conditional expression both refer to the same value (represented by name $foo) then the code generator will only emit code to compute its value when it encounters the first reference. When the code generator encounters $foo for the second time in the other branch, no code will be emitted. This means that in the second branch, $foo will effectively have no defined value (because the code in the first branch is never executed), and wrong values or memory corruption is then the predictable result.

This bug has always existed for as long as the expression JIT has been under development, and in the past the solution has been not to write templates which have this problem. This is made a little easier by a feature the let: operator, in that it inserts a do operator which orders the values that are declared to be computed before the code that references them. So that this is in fact non-buggy:

(let: (($foo (load (local))) # code to compute $foo is emitted here
  (if (...)  
    (add $foo (const 1)) # $foo is just a reference
    (sub $foo (const 2)) # and here as well

The DO node is inserted for the LET operator. It ensures that the value of the LOAD node is computed before the reference in either branch


Alternatively, if a value $foo is used in the condition of the if operator, you can also be sure that it is available in both sides of the condition.

All these methods rely on the programmer being able to predict when a value will be first referenced and hence evaluated. An optimizer breaks this by design. This means that if I want the JIT optimizer to be successful, my options are:

  1. Fix the optimizer so as to not remove references that are critical for the correctness of the program
  2. Modify the input tree so that such references are either copied or moved forward
  3. Fix the code generator to emit code for a value, if it determines that an earlier reference is not available from the current block.
In other words, I first need to decide where this bug really belongs - in the optimizer, the code generator, or even the IR structure itself. The weakness of the expression IR is that expressions don't really impose a particular order. (This is unlike the spesh IR, which is instruction-based, and in which every instruction has a 'previous' and 'next' pointer). Thus, there really isn't a 'first' reference to a value, before the code generator introduces the concept. This is property is in fact quite handy for optimization (for instance, we can evaluate operands in whatever order is best, rather than being fixed by the input order) - so I'd really like to preserve it. But it also means that the property we're interested in - a value is computed before it is used in, in all possible code flow paths - isn't really expressible by the IR. And there is no obvious local invariant that can be maintained to ensure that this bug does not happen, so any correctness check may have to check the entire graph, which is quite impractical.

I hope this post explains why this is such a tricky problem! I have some ideas for how to get out of this, but I'll reserve those for a later post, since this one has gotten quite long enough. Until next time!

brrt to the future: A short post about types and polymorphism

Published by Bart Wiegmans on 2019-01-14T13:34:00

Hi all. I usually write somewhat long-winded posts, but today I'm going to try and make an exception. Today I want to talk about the expression template language used to map the high-level MoarVM instructions to low-level constructs that the JIT compiler can easily work with:

This 'language' was designed back in 2015 subject to three constraints:
Recently I've been working on adding support for floating point operations, and  this means working on the type system of the expression language. Because floating point instructions operate on a distinct set of registers from integer instructions, a floating point operator is not interchangeable with an integer (or pointer) operator.

This type system is enforced in two ways. First, by the template compiler, which attempts to check if you've used all operands correctly. This operates during development, which is convenient. Second, by instruction selection, as there will simply not be any instructions available that have the wrong combinations of types. Unfortunately, that happens at runtime, and such errors so annoying to debug that it motivated the development of the first type checker.

However, this presents two problems. One of the advantages of the expression IR is that, by virtue of having a small number of operators, it is fairly easy to analyze. Having a distinct set of operators for each type would undo that. But more importantly, there are several MoarVM instructions that are generic, i.e. that operate on integer, floating point, and pointer values. (For example, the set, getlex and bindlex instructions are generic in this way). This makes it impossible to know whether its values will be integers, pointers, or floats.

This is no problem for the interpreter since it can treat values as bags-of-bits (i.e., it can simply copy the union MVMRegister type that holds all values of all supported types). But the expression JIT works differently - it assumes that it can place any value in a register, and that it can reorder and potentially skip storing them to memory. (This saves work when the value would soon be overwritten anyway). So we need to know what register class that is, and we need to have the correct operators to manipulate a value in the right register class.

To summarize, the problem is:
There are two ways around this, and I chose to use both. First, we know as a fact for each local or lexical value in a MoarVM frame (subroutine) what type it should have. So even a generic operator like set can be resolved to a specific type at runtime, at which point we can select the correct operators. Second, we can introduce generic operators of our own. This is possible so long as we can select the correct instruction for an operator based on the types of the operands.

For instance, the store operator takes two operands, an address and a value. Depending on the type of the value (reg or num), we can always select the correct instruction (mov or movsd). It is however not possible to select different instructions for the load operator based on the type required, because instruction selection works from the bottom up. So we need a special load_num operator, but a store_num operator is not necessary. And this is true for a lot more operators than I had initially thought. For instance, aside from the (naturally generic) do and if operators, all arithmetic operators and comparison operators are generic in this way.

I realize that, despite my best efforts, this has become a rather long-winded post anyway.....

Anyway. For the next week, I'll be taking a slight detour, and I aim to generalize the two-operand form conversion that is necessary on x86. I'll try to write a blog about it as well, and maybe it'll be short and to the point. See you later!

brrt to the future: New years post

Published by Bart Wiegmans on 2019-01-06T13:15:00

Hi everybody! I recently read jnthns Perl 6 new years resolutions post, and I realized that this was an excellent example to emulate. So here I will attempt to share what I've been doing in 2018 and what I'll be doing in 2019.

In 2018, aside from the usual refactoring, bugfixing and the like:
So 2019 starts with me trying to complete the goals specified in that grant request. I've already partially completed one goal (as explained in the interim report) - ensuring that register encoding works correctly for SSE registers in DynASM. Next up is actually ensuring support for SSE (and floating point) registers in the JIT, which is surprisingly tricky, because it means introducing a type system where there wasn't really one previously. I will have more to report on that in the near future.

After that, the first thing on my list is the support for irregular register requirements in the register allocator, which should open up the possibility of supporting various instructions.

I guess that's all for now. Speak you later!