Planet Raku

Raku RSS Feeds

Elizabeth Mattijsen (Libera: lizmat #raku) / 2023-12-01T16:15:22


Day 1 – Rocking Raku Meets Stodgy Debian

Published by Tom Browder (@tbrowder) on 2023-12-01T01:01:00

A unique method for installing Raku on Debian hosts.

Rumbling in the rack room

Santa’s IT department was now fully invested in using Raku, and it was paying off in increased programmer efficiency as well as toy output.

But old elf Eli, a system administrator, was grumbling to himself. Given the rapid development of Raku, it was a lot of work keeping the many hosts current: they were mostly GNU/Linux (for great savings on hardware), but they ran many different Linux flavors.

He had been trying to convince his boss to gradually convert all systems to Debian, but his boss was reluctant to do so because of a conundrum: Debian has a long development cycle (good for IT stability) but Raku has a short development cycle (roughly monthly).

Systems other than Debian tended to keep Raku more current and typically stayed within 12 months of the latest Raku. But that effort often resulted in more OS maintenance than Debian required. (Note the proliferation of Linux flavors had come about due to flashy young elves, freshly out of grad school, who favored the OS-of-the-month they used in school. To Eli’s chagrin, Santa was a softy and too kind to interfere.)

Eli had long ago selected Debian as his choice of GNU/Linux systems for several reasons including its program packaging system as well as its roughly three-year upgrade cycle.

Unfortunately, that conflicted with Raku’s almost monthly upgrade cycle. Given the long cycle, Debian’s Raku version tended to be quite outdated (although usable for most purposes). Eli wanted a maintainable way to keep Raku current as well as make it a primary coding language on each new system.

The steadfast but unheralded folks on the Raku release team have provided many ways to install Raku on several flavors of GNU/Linux as well as macOS and Windows. Eli started using the ‘rakudo-pkg’ method when it was first introduced years ago.

But, when it was announced it would not keep up with Debian versions after the end of the Long Term Support (LTS) period, he started considering other available methods to see which was best for complete automation both for regular upgrades as well as during initial Debian installations (including Debian’s preseeding installation).

Inspiration and perspiration

After considering methods published on https://rakudo.org, he decided the binary downloads looked to be the easiest.

Downloading an archive of code compiled for the desired system (accompanied by an identifying SHA-256 hash and signed by one of the holders of the public keys published on the site) and unpacking it into a standard directory should be possible with Raku alone, with no modules other than the one he would create.

With the tantalizing smell of figgy pudding coming from the mess hall, he started to think deep thoughts while doing his regular duties. Finally, eureka! Why not use the Debian system Raku to bootstrap a current Raku—genius!

Eagerly he began to gobble pudding while coding…. After too much pudding and many code iterations, mistakes, and going down rabbit holes, and with help from fellow Debianites on the Debian users mailing list, he came up with his shiny new Raku module distribution: RakudoBin (not yet published, but available online).

He solved the bootstrap problem by using a special path setup so the new module’s installation script could use the system Raku while other existing or new Raku programs would have the latest Raku first in the default user PATH. Because the paths defined on the actual host system are long, they are represented here by aliases of the form pathX, where ‘X’ identifies the path segment.

The final system path for all users is established by creating the following system files:

$ cat /etc/environment
pathA:pathB:pathC
$ cat /etc/profile
pathA:pathB:pathC

Standard new user files in /etc/skel are okay as they are. But a missing file is added to correct the long-standing lack of a reliable graphical login solution for path setting, at least for the MATE desktop:

# existing files NOT modified:
/etc/skel/.bash_logout
/etc/skel/.bashrc
/etc/skel/.profile
# added, not Debian standard, solves graphical
#   login path setting problem:
$ cp /etc/skel/.profile /etc/skel/.xsessionrc

That solution was achieved with much trial and error on a new host with freshly installed Debian 12 (Bookworm), plus lots of help from fellow Debian users, and a slightly outdated Debian online document.

The latest set of Rakudo binary files for a GNU/Linux system consists of:

File | Size
rakudo-moar-2023.11-01-linux-x86_64-gcc.tar.gz | 19.72 MB
rakudo-moar-2023.11-01-linux-x86_64-gcc.tar.gz.asc | 228.00 B
rakudo-moar-2023.11-01-linux-x86_64-gcc.tar.gz.checksums.txt | 1.02 KB

They are downloaded and checked for hash validity, then unpacked into directory /opt/rakudo. The paths required to use the installed binaries are /opt/rakudo/bin and /opt/rakudo/share/perl6/site/bin.
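
The hash check can be sketched in plain Raku. This is only an illustration, not RakudoBin's actual code: the sub names are hypothetical, the digest is computed by shelling out to the sha256sum utility (assumed to be installed), and the checksums file is assumed to contain a line that names the archive together with its 64-hex-digit SHA-256 digest.

```raku
# Hypothetical helpers; not taken from RakudoBin.

# Compute the SHA-256 digest of a file with the external sha256sum tool.
sub sha256-of(IO() $file --> Str) {
    my $proc = run 'sha256sum', $file.absolute, :out;
    $proc.out.slurp(:close).words[0]
}

# Find the digest listed for $name in the text of a checksums file.
# Assumption: the line naming the archive also carries the 64-hex-digit digest.
sub expected-sha256(Str $checksums, Str $name) {
    for $checksums.lines -> $line {
        next unless $line.contains($name);
        return ~$0 if $line ~~ / « (<xdigit> ** 64) » /;
    }
    return Nil;
}

# True only if a digest was found and it matches the computed one.
sub archive-is-valid(IO() $archive, IO() $checksums-file --> Bool) {
    my $expected = expected-sha256($checksums-file.slurp, $archive.basename);
    $expected.defined && sha256-of($archive) eq $expected.lc
}
```

With the archive and its checksums file in the current directory, a call such as archive-is-valid('rakudo-moar-2023.11-01-linux-x86_64-gcc.tar.gz', 'rakudo-moar-2023.11-01-linux-x86_64-gcc.tar.gz.checksums.txt') would return True only when the two digests agree. Checking the .asc signature is a separate step that this sketch does not cover.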

The installation script sets the standard path to put the new paths before the standard paths as shown below.

Debian standard path:

/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games

RakudoBin modified path (with game paths removed):

/opt/rakudo/bin:/opt/rakudo/share/perl6/site/bin:pathB:pathC
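
As a rough illustration of that transformation (a sketch only; the sub name is hypothetical and this is not RakudoBin's code), the modified path can be derived from the Debian standard path like this:

```raku
# Hypothetical helper: prepend the Rakudo directories and drop the game paths.
sub rakudo-path(Str $system-path --> Str) {
    my @rakudo = '/opt/rakudo/bin', '/opt/rakudo/share/perl6/site/bin';
    my @kept   = $system-path.split(':').grep({ !$_.ends-with('/games') });
    (|@rakudo, |@kept).join(':')
}

say rakudo-path('/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games');
# /opt/rakudo/bin:/opt/rakudo/share/perl6/site/bin:/usr/local/bin:/usr/bin:/bin
```

Because the Rakudo directories come first, a plain raku command resolves to the new installation, while /usr/bin/raku still names the system one.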

Note that with this path the newly installed executables are found before the system’s, which live under /usr/bin. Eli solved the bootstrap problem by putting this as the shebang line in the installation script:

#!/usr/bin/raku

One other tricky problem Eli solved, and was quite proud of, was checking for the latest version without downloading large files. The easiest method, knowing the URL and format of the file set, is to just attempt to download it with this code:

shell "curl -1sLf https://path/file -o /localpath/file";

If the file doesn’t exist, the curl command throws an error. If we can trap the error, we can say the file doesn’t exist. Thanks to the help of Nick (no, not Santa: IRC #raku’s Nick Logan, aka @ugexe), we can do that easily with this simple line in a check-https sub returning a Bool:

sub check-https(Str $url --> Bool) {
    # True if curl succeeds; a failed or throwing call yields False
    try {
        so quietly shell
          "curl -1sLf $url -o /localpath/file"
    } // False
}
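
One way to decide which files to probe (a sketch with hypothetical sub names, not Eli's actual code) is to generate candidate release names, newest first, and pass the URL for each to a check like the one above until one succeeds. It assumes releases are named by year and month with a two-digit build suffix, as in the file set listed earlier:

```raku
# Hypothetical helpers; the month-by-month strategy is an assumption.

# Release identifiers such as '2023.11', newest first, starting at $from.
sub candidate-releases(Date $from, Int $count = 3) {
    (^$count).map({
        my $d = $from.earlier(months => $_);
        sprintf '%04d.%02d', $d.year, $d.month
    }).List
}

# Archive name in the format of the file set listed earlier.
sub archive-name(Str $release, Int $build = 1 --> Str) {
    sprintf 'rakudo-moar-%s-%02d-linux-x86_64-gcc.tar.gz', $release, $build
}

.say for candidate-releases(Date.new(2023, 12, 1)).map({ archive-name($_) });
# rakudo-moar-2023.12-01-linux-x86_64-gcc.tar.gz
# rakudo-moar-2023.11-01-linux-x86_64-gcc.tar.gz
# rakudo-moar-2023.10-01-linux-x86_64-gcc.tar.gz
```

Probing the small .checksums.txt file first, rather than the archive itself, would keep each failed attempt cheap.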

Here’s a link for some help for a more general method for future improvements to check a partial download with large files.

The installation script, by default, carefully asks for confirmation before proceeding with various system-modifying parts. Using the quiet option assumes all is approved and no further permissions are requested.

Party time

In summary, module RakudoBin will give system administrators a practical way to (1) have a stable Debian system and (2) upgrade their Raku version to the latest. The module will also provide an easy way to quickly convert a standard Debian installation into a standard Raku host with common paths for all new users.

Santa’s Epilogue

Don’t forget the “reason for the season:” ✝

As I always end these jottings, in the words of Charles Dickens’ Tiny Tim, “may God bless Us, Every one!” [1]

Footnotes

  1. A Christmas Carol, a short story by Charles Dickens (1812-1870), a well-known and popular Victorian author whose many works include The Pickwick Papers, Oliver Twist, David Copperfield, Bleak House, Great Expectations, and A Tale of Two Cities.

Raku Blog Posts 2023.47/48

Published by Elizabeth Mattijsen on 2023-11-27T18:50:55

Anton Antonov delves into the world of image recognition using the Raku Programming Language.

John Haltiwanger is looking at the options of integrating the Zig Programming Language with Raku.

Elizabeth Mattijsen reports on all recent developments around Rakudo, an implementation of the Raku Programming Language.

2023.48 More Ab5tract

Published by Elizabeth Mattijsen on 2023-11-27T16:42:44

The Raku Core Developers welcome John Haltiwanger as a full core developer on Rakudo, an implementation of the Raku Programming Language. Usually known by their ab5tract nickname, John has recently been very active in working on RakuAST implementing many features, including: a newly featureful FIRST phaser, the subset and enum types, and a comprehensive new implementation of how WhateverCodes prime themselves and others. These were developed through Pull Requests. That extra step won’t be necessary anymore!

Yet Another Release

Justin DeVuyst managed to get out the 2023.11 release of the Rakudo compiler for the Raku Programming Language just before Black Friday. Binary packages became available shortly after.

Test-t testing much faster

H.Merijn Brand has been testing the performance of the Text::CSV module for over 9 years now. Careful observers of this test may have noticed that it became more than 3x as fast between 13 and 17 November:

2023-11-13 14:37:07 test-t 1.363
2023-11-17 11:58:09 test-t 0.423

This was not caused by any changes in Rakudo, but by the fact that new hardware replaced the more than 10 year old hardware. Kudos to Merijn for keeping the old hardware around long enough for consistent test results over all those years!

Anton’s Corner

Anton Antonov is taking the Raku Programming Language into the realm of interpretation of images, in an impressive blog post: AI Vision via Raku (/r/rakulang comments).

John’s Corner

John Haltiwanger also looked into integration with the Zig Programming Language in a blog post called: An initial investigation into using Zig to speed up Raku code.

Weeklies

Weekly Challenge #245 is available for your perusal.

New Problem Solving Issues

New Pull Requests

Core Developments

In RakuAST developments this week:

Meanwhile on Mastodon

Meanwhile still on Twitter 𝕏

Meanwhile on the mailing list

Questions about Raku

Comments

New Raku Modules

Updated Raku Modules

Winding down

Definitely a quiet week on the western side of the Atlantic pond! Meanwhile, Слава Україні!  Героям слава!

Please keep staying safe and healthy, and keep up the good work!

If you like what I’m doing, committing to a small sponsorship would mean a great deal!

An initial investigation into using Zig to speed up Raku code

Published by 5ab5traction5 on 2023-11-27T14:21:37

Note: This post is also available as a gist if you find that format more readable.

Introduction

This research was conducted while preparing an upcoming Raku Advent Calendar post. The Raku code uses a basic supply pipeline to feed $volume objects through a validation stage that requires a CRC32 check before going to the output sink, which prints the processing time of the validation stage.

The "reaction graph" is designed to simulate a stream processing flow, where inputs arrive and depart via Candycane™ queues (that's the name of Santa's Workshop Software's queueing service, in case you weren't familiar).

The entire scenario is contrived in that CRC32 was chosen due to native implementation availability in both Raku and Zig, allowing comparison. It's not an endorsement of using CRC32 in address validation to deliver Santa's, or anyone's, packages.
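
To give a feel for the shape of such a pipeline, here is a minimal, self-contained sketch. It is not the code from the post: the Package class, its attributes, and the toy-checksum sub are hypothetical, and a trivial stand-in checksum replaces CRC32 so that no module is needed.

```raku
# Stand-in checksum, NOT CRC32: sum of code points modulo 2**32.
sub toy-checksum(Str $s --> Int) { $s.ords.sum % 2**32 }

# Hypothetical package object carrying an address and its checksum.
class Package {
    has Str $.address;
    has Int $.checksum;
    method is-valid(--> Bool) { toy-checksum($!address) == $!checksum }
}

# Five packages; the third carries a deliberately wrong checksum.
my @packages = (1..5).map: -> $n {
    my $address = "Chimney $n";
    Package.new(:$address, checksum => toy-checksum($address) + ($n == 3 ?? 1 !! 0))
};

my $valid = 0;
my $start = now;
react {
    # Validation stage: only packages with a matching checksum pass through.
    whenever Supply.from-list(@packages).grep(*.is-valid) {
        $valid++;
        # Output sink: report once the supply is exhausted.
        LAST say "validated $valid of {+@packages} packages in {now - $start}s";
    }
}
```

Four of the five packages pass validation here; the reported time will vary from run to run.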

Also, thanks to the very helpful folks at ziggit.dev for answering my newbie question in depth.

Methodology

The source code:

At larger volumes, Raku struggles with the initialization speed of the $volume objects that are instantiated. I replaced the native Raku class with one written in Zig, using the is repr('CStruct') trait in Raku and the extern struct qualifier in Zig.

In Zig I use a combination of an arena allocator (for the string passed from Raku) and a memory pool (designed to quickly make copies of a single type, exactly fitting our use case) to construct Package objects.

Additionally, for Raku+Zig the CRC32 hashing routine from Zig's stdlib is used via a tiny wrapper function.

A --bad-packages option is provided by both Raku scripts, which makes 10% of the objects have a mismatched address/CRC32 pair.

The library tested was compiled with -Doptimize=ReleaseFast.

Batches are repeated $batch times, which defaults to 5.

All results from an M2 MacBook Pro.

Caveats

This test and its results are only intended to reflect the case where an object is constructed in Zig based on input from Raku. It is not intended to be a test of Zig's native speed in the creation of structs.

There is a call to sleep that gives time -- 0.001 seconds -- to get the react block up and running before emitting the first True on the $ticker-supplier. This affects overall runtime but not the batch or initialization metrics.

The speed of Raku+Zig was so fast that the tool used to measure these details (cmdbench) could not find results in ps for the execution because it had already finished. These are marked as Unmeasured.

In the next iteration of this research, there should be two additional entries in the data tables below for:

Results

10,000

Volume | Edition | Runtime | Batch Time | Initialization | Max bytes
10,000 | Raku | 1.072s | 1: 0.146596686s, 2: 0.138983732s, 3: 0.142380065s, 4: 0.136050775s, 5: 0.134760525s | 0.008991746s | 180240384
10,000 | Raku+Zig | 0.44s | 1: 0.010978411s, 2: 0.006575705s, 3: 0.004145623s, 4: 0.004280415s, 5: 0.00468929s | 0.020358033s | Unmeasured
10,000 | Raku (bad-packages) | 1.112s | 1: 0.157788932s, 2: 0.149544686s, 3: 0.156293433s, 4: 0.151365477s, 5: 0.147947436s | 0.008059955s | 196263936
10,000 | Raku+Zig (bad-packages) | 0.463s | 1: 0.031300276s, 2: 0.01006562s, 3: 0.010693328s, 4: 0.011056994s, 5: 0.010770828s | 0.010954495s | Unmeasured

Notes

The Raku+Zig solution wins on performance but loses the initialization race. Raku still makes a decent showing, considering how far it has come performance-wise.

100,000

Volume | Edition | Runtime | Batch Time | Initialization | Max bytes
100,000 | Raku | 7.163s | 1: 1.360029456s, 2: 1.32534014s, 3: 1.353072834s, 4: 1.346668338s, 5: 1.351110502s | 0.062402473s | 210173952
100,000 | Raku+Zig | 0.75s | 1: 0.079802007s, 2: 0.073638176s, 3: 0.053291894s, 4: 0.05087652s, 5: 0.050394687s | 0.05855585s | 241205248
100,000 | Raku (bad-packages) | 7.89s | 1: 1.496982355s, 2: 1.484494027s, 3: 1.497365023s, 4: 1.490810525s, 5: 1.492416774s | 0.060026016s | 209403904
100,000 | Raku+Zig (bad-packages) | 1.076s | 1: 0.16960934s, 2: 0.111172493s, 3: 0.110844786s, 4: 0.113021202s, 5: 0.111713535s | 0.051436311s | 242450432

Notes

We see Raku+Zig take first place in everything but memory consumption, which we can assume is a function of using the NativeCall bridge, not to mention my new-ness as a Zig programmer.

1,000,000

Volume | Edition | Runtime | Batch Time | Initialization | Max bytes
1,000,000 | Raku | 68.081s | 1: 13.475302627s, 2: 13.161153845s, 3: 13.293998956s, 4: 13.364662217s, 5: 13.474755295s | 0.95481884s | 417103872
1,000,000 | Raku+Zig | 3.758s | 1: 0.788083286s, 2: 0.509883905s, 3: 0.492898873s, 4: 0.500868284s, 5: 0.498677495s | 0.575087671s | 514064384
1,000,000 | Raku (bad-packages) | 75.796s | 1: 14.940173822s, 2: 14.632683637s, 3: 14.866796226s, 4: 15.272903792s, 5: 15.027481448s | 0.704549212s | 396656640
1,000,000 | Raku+Zig (bad-packages) | 6.553s | 1: 1.362189763s, 2: 1.061496504s, 3: 1.069134685s, 4: 1.062746049s, 5: 1.061096044s | 0.528011288s | 462766080

Notes

Raku's native CRC32 performance is clearly lagging here. Raku+Zig keeps its domination except in the realm of memory usage. It would be hard to justify using the Raku native version strictly on its reduced memory usage, considering the performance advantage on display here.

A "slow first batch" problem begins to affect Raku+Zig. Running with bad-packages enabled slows down the Raku+Zig crc32 loop, hinting that there might be some optimizations on either the Raku or the Zig/clang side of things that can't kick in when the looped data is heterogeneous.

Dynamic runtime optimization sounds more like a Rakudo thing than a Zig thing, though.

10,000,000

Volume | Edition | Runtime | Batch Time | Initialization | Max bytes
10,000,000 | Raku | 704.852s | 1: 136.588638184s, 2: 136.851019628s, 3: 138.44696743s, 4: 139.777040922s, 5: 139.490784317s | 13.299274221s | 2055012352
10,000,000 | Raku+Zig | 38.505s | 1: 8.843459877s, 2: 4.84300835s, 3: 4.991842433s, 4: 5.077245603s, 5: 4.939533707s | 9.375436134s | 2881126400
10,000,000 | Raku (bad-packages) | 792.1s | 1: 162.333803401s, 2: 174.815386318s, 3: 168.299796081s, 4: 162.643428135s, 5: 163.205406678s | 10.252639311s | 2124267520
10,000,000 | Raku+Zig (bad-packages) | 65.174s | 1: 14.41616445s, 2: 11.078961309s, 3: 10.662389991s, 4: 11.20240076s, 5: 10.614430063s | 6.778600235s | 2861596672

Notes

Pure Raku really struggles with a volume of this order of magnitude. But if you add in just a little bit of Zig, you can reasonably supercharge Raku's capabilities.

The "slow first batch" for Raku+Zig has been appearing in more understated forms in other tests. Here the first batch takes nearly double the runtime of the second batch (8.84s versus 4.84s). What is causing this?
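
To put the runtime columns in perspective, the overall speedup at each volume can be computed directly from the figures in the tables (runs without bad-packages). The speedup sub is just a hypothetical helper for this illustration:

```raku
# Ratio of the pure Raku runtime to the Raku+Zig runtime.
sub speedup($raku, $raku-zig) { $raku / $raku-zig }

# (volume, Raku runtime, Raku+Zig runtime) in seconds, from the tables above.
my @runs =
    (10_000,     1.072,   0.44),
    (100_000,    7.163,   0.75),
    (1_000_000,  68.081,  3.758),
    (10_000_000, 704.852, 38.505);

for @runs -> ($volume, $raku, $raku-zig) {
    printf "%d: %.1fx\n", $volume, speedup($raku, $raku-zig);
}
# 10000: 2.4x
# 100000: 9.6x
# 1000000: 18.1x
# 10000000: 18.3x
```

The ratio grows with volume and appears to level off at roughly 18x for the two largest runs, though four data points are too few to say more than that.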

100,000,000

This doesn't seem to work. At least, I'm not patient enough. The process seems to stall, growing and shrinking memory but never finishing.

Final Thoughts

This is a preliminary report in blog post form based on a contrived code sample written for another, entirely different blog post. More data and deeper analysis will have to come later.

Zig's C ABI compatibility is clearly no put-on. It works seamlessly with Raku's NativeCall. Granted, we haven't really pushed the boundaries of what the C ABI can look like, but one of the core takeaways is actually that with Zig we can design that interface. In other words, we are in charge of how ugly, or not, it gets. Considering how dead simple the extern struct <-> is repr('CStruct') support is, I don't think the function signatures need to get nearly as gnarly as they get in C.

Sussing the truth of that supposition out will take some time and effort in learning Zig. I'm looking forward to it. My first stop will probably be a JSON library that uses Zig. I'm also going to be looking into using Zig as the compiler for Rakudo, as it might simplify our releases significantly.

AI vision via Raku

Published by Anton Antonov on 2023-11-25T18:03:26

Introduction

In the fall of 2023 OpenAI introduced the image vision model “gpt-4-vision-preview”, [OAIb1].

The model “gpt-4-vision-preview” represents a significant enhancement to the GPT-4 model, providing developers and AI enthusiasts with a more versatile tool capable of interpreting and narrating images alongside text. This development opens up new possibilities for creative and practical applications of AI in various fields.

For example, consider the following Raku-developer-centric applications:

A more diverse set of the applications would be:

As a first concrete example, consider the following image that fable-dramatizes the release of Raku’s 2023.11 version (https://i.imgur.com/UcRYl9Yl.jpg):

Here is its narration:

#% bash
openai-playground --images=https://i.imgur.com/UcRYl9Yl.jpg Very concisely describe the image
The image depicts a vibrant, colorful illustration of two raccoons under a large tree adorned with various fruits, surrounded by butterflies, with a landscape of trees and a clear sky in the background. There's a date "2023.11" on the tree trunk.

Ways to use with Raku

There are five ways to utilize image interpretation (or vision) services in Raku:

This document demonstrates the first three and the fifth. The fourth one is still “under design and consideration.”

Remark: The document’s structure reflects the list of the “five ways” above.

Remark: The model “gpt-4-vision-preview” is given as a “chat completion model”, therefore, in this document we consider it to be a Large Language Model (LLM).

Remark: This document was prepared as a Jupyter chatbook, [AAp3], and then converted to Markdown and to a Mathematica / Wolfram Language notebook.

Packages

Here are the Raku packages (namespaces) used below, [AAp1, AAp2, AAp3]:

use WWW::OpenAI;
use WWW::OpenAI::ChatCompletions;
use WWW::MermaidInk;
use Lingua::Translation::DeepL; 
use Data::Translators;
use LLM::Functions;

Remark: The Jupyter kernel of “Jupyter::Chatbook”, [AAp3], automatically loads the packages “Data::Translators” and “LLM::Functions”. The functionalities of the packages “WWW::OpenAI” and “WWW::MermaidInk” are available in chatbooks through dedicated cells, [AAv1, AAv3].

Images

Here are the links to all images used in this document:

Name | Link
Raccoons chasing butterflies | https://i.imgur.com/UcRYl9Yl.jpg
LLM functionalities mind-map | https://imgur.com/kcUcWnq
Single sightseer | https://i.imgur.com/LEGfCeql.jpg
Three hunters | https://raw.githubusercontent.com/antononcube/Raku-WWW-OpenAI/main/resources/ThreeHunters.jpg
Raku Jupyter Chatbook solution | https://imgur.com/22lXXks
Cyber Week Spending Set to Hit New Highs in 2023 | https://cdn.statcdn.com/Infographic/images/normal/7045.jpeg

CLI

The simplest way to use the OpenAI’s vision service is through the CLI script of “WWW::OpenAI”, [AAp1]. (Already demoed in the introduction.)

Here is an image that summarizes how Jupyter Chatbooks work (see [AAp3, AAv1, AAv2]):

Here is a CLI shell command that requests the image above to be described (using at most 900 tokens):

#% bash
openai-playground --images=https://i.imgur.com/22lXXks.jpg --max-tokens=900 Describe the image
The image displays a flowchart with a structured sequence of operations or processes. The chart is divided into several areas with different headings that seem to be part of a software system or application. The main areas identified in the flowchart are "Message evaluation," "LLM interaction," "Chatbook frontend," "Chatbook backend," and "Prompt processing."

Starting from the left, the message evaluation feeds into "LLM interaction" where there are three boxes labeled "LLM::Functions," "PaLM," and "OpenAI," suggesting these are different functions or APIs that can be interacted with.

In the "Chatbook frontend," there is a process that begins with a "Chat cell" that leads to a decision point asking if "Chat ID specified?" Based on the answer, it either assumes 'NONE' for the chat ID or proceeds with the specified ID.

In the "Chatbook backend," there is a check to see if the "Chat ID exists in DB?" If not, a new chat object is created; otherwise, an existing chat object is retrieved from the "Chat objects" storage.

Finally, the "Prompt processing" area involves parsing a "Prompt DSL spec" and checking if known prompts are found. If they are, it leads to "Prompt expansion" and interacts with "LLM::Prompts" to possibly generate prompts.

Dotted lines indicate references or indirect interactions, while solid lines represent direct flows or processes. The chart is colored in shades of yellow and purple, which may be used to differentiate between different types of processes or to highlight the flow of information.

The flowchart is a typical representation of a software or system architecture, outlining how different components interact and what processes occur based on various conditions.

Shell workflow pipelines can be constructed with commands using CLI scripts of the packages loaded above. For example, here is a pipeline that translates the obtained image description from English to Bulgarian using the package “Lingua::Translation::DeepL”, [AAp5]:

#% bash
openai-playground --images=https://i.imgur.com/22lXXks.jpg --max-tokens=900 'Very concisely describe the image' | deepl-translation -t=Bulgarian
Изображението представлява блок-схема, която очертава процес, включващ оценка на съобщенията, взаимодействие с големи езикови модели (LLM) като PaLM и OpenAI и бекенд система за обработка на чат взаимодействия. Процесът включва стъпки за анализиране на подсказките, проверка за известни подсказки и управление на чат обекти в база данни. Изглежда, че това е системен дизайн за обработка и реагиране на потребителски входове в приложение за чат.

Of course, we can just request OpenAI’s vision to give the image description in whatever language we want (say, by using emojis):

#% bash
openai-playground --images=https://i.imgur.com/22lXXks.jpg --max-tokens=900 Very concisely describe the image in 🇷🇺
Это изображение диаграммы потока данных или алгоритма, на котором представлены различные этапы взаимодействия и обработки сообщений в компьютерной системе. На диаграмме есть блоки с надписями и стрелки, указывающие направление потока данных.

Web API functions

Within a Raku script or REPL session OpenAI’s vision service can be accessed with the functions openai-completion or openai-playground.

Remark: The function openai-playground is an umbrella function that redirects to various “specialized” functions for interacting with different OpenAI services. openai-completion is one of them. Other specialized functions are those for moderation, vector embeddings, and audio transcription and processing; see [AAp1].

If the function openai-completion is given a list of images, a textual result corresponding to those images is returned. The argument “images” is a list of image URLs, image file names, or image Base64 representations. (Any combination of those element types can be specified.)

Before demonstrating the vision functionality below we first obtain and show a couple of images.

Images

Here is a URL of an image: (https://i.imgur.com/LEGfCeql.jpg). Here is the image itself:

Next, we demonstrate how to display the second image by using the file path and the encode-image function from the WWW::OpenAI::ChatCompletions namespace. The encode-image function converts image files into Base64 image strings, which are a type of text representation of the image.

When we use the openai-completion function and provide a file name under the “images” argument, the encode-image function is automatically applied to that file.

Here is an example of how we apply encode-image to the image from a given file path ($*HOME ~ '/Downloads/ThreeHunters.jpg'):

my $img3 = WWW::OpenAI::ChatCompletions::encode-image($*HOME ~ '/Downloads/ThreeHunters.jpg');
"![]($img3)"

Remark: The “three hunters” image is a resource file of “WWW::OpenAI”, [AAp1].
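
To make the idea of a Base64 image string concrete, here is a small self-contained sketch in plain Raku. It is not the implementation used by encode-image; the sub names are hypothetical, and the data-URL wrapper is an assumption about the kind of string that lets a Markdown cell render the image.

```raku
# Plain-Raku Base64 encoder, for illustration only.
sub base64(Blob $bytes --> Str) {
    my @alphabet = flat 'A'..'Z', 'a'..'z', '0'..'9', '+', '/';
    my $pad      = (3 - $bytes.elems % 3) % 3;          # bytes missing from the last triple
    my @padded   = |$bytes.list, |(0 xx $pad);
    my $encoded  = @padded.rotor(3).map(-> ($a, $b, $c) {
        my $n = ($a +< 16) +| ($b +< 8) +| $c;          # 24 bits per triple
        @alphabet[($n +> 18) +& 63, ($n +> 12) +& 63, ($n +> 6) +& 63, $n +& 63].join
    }).join;
    # Replace the characters produced by padding bytes with '='.
    $encoded.substr(0, $encoded.chars - $pad) ~ '=' x $pad
}

# Hypothetical helper: wrap the encoded bytes in a data URL (assumed format).
sub image-data-url(Blob $bytes, Str $mime = 'image/jpeg' --> Str) {
    "data:$mime;base64," ~ base64($bytes)
}

say base64('Man'.encode);   # TWFu
```

For a real image, the bytes could be read with something like 'ThreeHunters.jpg'.IO.slurp(:bin) and passed to image-data-url.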

Image narration

Here is an image narration example with the two images above, again, one specified with a URL, the other with a file path:

my $url1 = 'https://i.imgur.com/LEGfCeql.jpg';
my $fname2 = $*HOME ~ '/Downloads/ThreeHunters.jpg'; 
my @images = [$url1, $fname2]; 

openai-completion("Give concise descriptions of the images.", :@images, max-tokens => 900, format => 'values');
1. The first image features a single raccoon perched on a tree branch surrounded by a multitude of colorful butterflies in an array of blues and oranges, set against a vibrant, nature-themed backdrop.

2. The second image depicts three raccoons on a tree branch in a forest setting, with two of them looking towards the viewer and one looking to the side. The background is filled with autumnal-colored leaves and numerous butterflies that match the whimsical atmosphere of the scene.

Description of a mind-map

Here is an application that should be more appealing to Raku-developers — getting a description of a technical diagram or flowchart. Well, in this case, it is a mind-map from [AA1]:

Here we get the vision model’s description of the mind-map above (and place the output in Markdown format):

my $mm-descr = 
    openai-completion(
        "How many branches this mind-map has? Describe each branch separately. Use relevant emoji prefixes.", 
        images => 'https://imgur.com/kcUcWnq.jpeg', 
        max-tokens => 1024,
        format => 'values'
    );

$mm-descr
The mind-map has five branches, each representing a different aspect or functionality related to LLM (Large Language Models) services access. Here's the description of each branch with relevant emoji prefixes:

1. 🎨 **DALL-E**: This branch indicates that DALL-E, an AI system capable of creating images from textual descriptions, is related to or a part of LLM services.

2. 🤖 **ChatGPT**: This branch suggests that ChatGPT, which is likely a conversational AI based on GPT (Generative Pre-trained Transformer), is associated with LLM services.

3. 🧠 **PaLM**: This branch points to PaLM, suggesting that it is another model or technology related to LLM services. PaLM might stand for a specific model or framework in the context of language processing.

4. 💬 **LLM chat objects**: This branch leads to a node indicating chat-related objects or functionalities that are part of LLM services.

5. 📚 **Chatbooks**: This branch leads to a concept called "Chatbooks," which might imply a feature or application related to creating books from chat or conversational content using LLM services.

Each of these branches emanates from the central node labeled "LLM services access," indicating that they are all different access points or functionalities within the realm of large language model services.

Here from the obtained description we request a (new) Mermaid-JS diagram to be generated:

my $mmd-chart = llm-synthesize(["Make the corresponding Mermaid-JS diagram code for the following description. Give the code only, without Markdown symbols.", $mm-descr], e=>'ChatGPT')
graph LR
A[LLM services access] 
B[DALL-E]-->A 
C[ChatGPT]-->A 
D[PaLM]-->A 
E[LLM chat objects]-->A 
F[Chatbooks]-->A

Here is a diagram made with Mermaid-JS spec obtained above using a function of “WWW::MermaidInk”, [AAp4]:

#% markdown 
mermaid-ink($mmd-chart, format=>'md-image')

Remark: In a Jupyter chatbook, [AAp3], Mermaid-JS diagrams can be “directly” visualized with notebook cells that have the magic mermaid. Below is given an instance of one of the better LLM results for making a Mermaid-JS diagram over the “vision-derived” mind-map description.

#% markdown
mermaid-ink('
graph TB
    A[LLM services access] --> B[DALL-E]
    A --> C[ChatGPT]
    A --> D[PaLM]
    A --> E[LLM chat objects]
    A --> F[Chatbooks]
    B -->|related to| G[DALL-E AI system]
    C -->|associated with| H[ChatGPT]
    D -->|related to| I[PaLM model]
    E -->|part of| J[chat-related objects/functionalities]
    F -->|implies| K[Feature or application related to chatbooks]
', format => 'md-image')

Here is an example of code generation based on the “vision derived” mind-map description above:

#% markdown
llm-synthesize([ "Generate Raku code -- using Markdown markings -- with an object oriented hierarchy corresponding to the description:\n", $mm-descr], e=>'ChatGPT')
class LLM::ServiceAccess {
    has DALLE $.dalle;
    has ChatGPT $.chatgpt;
    has PaLM $.palm;
    has LLMChatObjects $.llm-chat-objects;
    has Chatbooks $.chatbooks;
}

class DALLE {
    # Implementation for DALL-E functionality
}

class ChatGPT {
    # Implementation for ChatGPT functionality
}

class PaLM {
    # Implementation for PaLM functionality
}

class LLMChatObjects {
    # Implementation for LLM chat objects
}

class Chatbooks {
    # Implementation for Chatbooks functionality
}

# Usage
my $llm-service-access = LLM::ServiceAccess.new(
    dalle => DALLE.new,
    chatgpt => ChatGPT.new,
    palm => PaLM.new,
    llm-chat-objects => LLMChatObjects.new,
    chatbooks => Chatbooks.new,
);

LLM Functions

Let us show programmatic utilizations of the vision capabilities.

Here is the workflow we consider:

  1. Ingest an image file and encode it into a Base64 string
  2. Make an LLM configuration with that image string (and a suitable model)
  3. Synthesize a response to a basic request (like, image description)
  4. Make an LLM function for asking different questions over image
  5. Ask questions and verify results

Image ingestion and encoding

Here we ingest an image and display it:

#% markdown
my $imgBarChart = WWW::OpenAI::ChatCompletions::encode-image($*HOME ~ '/Downloads/Cyber-Week-Spending-Set-to-Hit-New-Highs-in-2023-small.jpeg');
"![]($imgBarChart)"

Remark: The image was downloaded from the post “Cyber Week Spending Set to Hit New Highs in 2023”.

Configuration and synthesis

Here we make a suitable LLM configuration with the image:

my $confImg = llm-configuration("ChatGPT", model => 'gpt-4-vision-preview', images => $imgBarChart, temperature => 0.2);
$confImg.WHAT
(Configuration)

Here we synthesize a response to an image description request:

llm-synthesize("Describe the image.", e=> $confImg)
The image is a bar chart titled "Cyber Week Spending Set to Hit New Highs in 2023". It shows estimated online spending on Thanksgiving weekend in the United States for the years 2019, 2020, 2021, 2022, and a forecast for 2023. The spending is broken down by three days: Thanksgiving Day, Black Friday, and Cyber Monday.

Each year is represented by a different color, with bars for each day showing the progression of spending over the years. The spending amounts range from $0B to $12B. The chart indicates an overall upward trend in spending, with the forecast for 2023 showing the highest spending across all three days.

In the top left corner of the chart, there is a small illustration of a computer with coins, suggesting online transactions. At the bottom, there is a note indicating that the forecast is based on data from Adobe Analytics. The Statista logo is visible in the bottom right corner, and there are Creative Commons and share icons in the bottom left corner.

Repeated questioning

Here we define an LLM function that allows asking multiple questions about the image:

my &fst = llm-function({"For the given image answer the question: $_ . Be as concise as possible in your answers."}, e => $confImg);
-> **@args, *%args { #`(Block|3507398517968) ... }
&fst('How many years are presented in that image?')
Five years are presented in the image.
&fst('Which year has the highest value? What is that value?')
The year with the highest value is 2023, with a value of just over $11 billion.

Remark: Numerical values read off technical plots or charts often seem to be wrong. OpenAI’s vision model warns about this in its responses often enough.


Dedicated notebook cells

In the context of the “recently-established” notebook solution “Jupyter::Chatbook”, [AAp3], I am contemplating an extension to integrate OpenAI’s vision service.

The main challenges here include determining how users will specify images in the notebook, such as through URLs, file names, or Base64 strings, each with unique considerations. Additionally, I am exploring how best to enable users to input prompts or requests for image processing by the AI/LLM service.

This integration, while valuable, is not my immediate focus, as there are already programmatic ways to access OpenAI’s vision service. (See the previous section.)


Combinations (fairytale generation)

Consider the following computational workflow for making fairytales:

  1. Draw or LLM-generate a few images that characterize parts of a story.
  2. Narrate the images using the LLM “vision” functionality.
  3. Use an LLM to generate a story over the narrations.

Remark: Multi-modal LLM / AI systems already combine steps 2 and 3.

Remark: The workflow above (after it is programmed) can be executed multiple times until satisfactory results are obtained.

Here are image generations using DALL-E for four different requests with the same illustrator name in them:

my @story-images = [
"a girl gets a basket with wine and food for her grandma.",
"a big bear meets a girl carrying a basket in the forest.",
"a girl that gives food from a basket to a big bear.",
"a big bear builds a new house for girl's grandma."
].map({ openai-create-image( 'Painting in the style of John Bauer of ' ~ $_, response-format => 'b64_json', format => 'values') });

@story-images.elems
4

Here we display the images:

#% markdown 
@story-images.map({ '![](data:image/jpeg;base64,' ~ $_ ~ ')' }).join(' ' x 5)

Here we get the image narrations (via the OpenAI’s “vision service”):

my $story-images-descriptions = 
    openai-completion("Concisely describe the images.",
        images => @story-images.map({ 'data:image/jpeg;base64,' ~ $_ ~ '' }).List,
        max-tokens => 900,
        format => 'values'
    );

$story-images-descriptions 
1. The first image features a stylized painting of a woman in a red hood, carrying a large basket with a girl in white looking up at her.
2. The second image shows a dark, fairy tale-like forest scene where a young girl holding a basket is facing a large black bear.
3. The third image is an illustration of a girl offering a basket to a seated, gentle-looking bear in a green forest setting.
4. The final image depicts a bear on two legs, dressed in a blue dress, standing by a yellow house with a girl in blue in front of the door.

Here we extract the descriptions into a list:

#% markdown
my @descr = $story-images-descriptions.split("\n", :skip-empty).grep({ $_ ~~ / ^ \d /});
@descr ==> to-html()
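
To see what the split-and-filter step does without calling any service, here is a small self-contained sketch. The sample text is made up for illustration; only the `split` and `grep` calls mirror the line above.

```raku
# Made-up stand-in for an LLM response with numbered lines and some noise
my $text = q:to/END/;
1. A girl with a basket.

2. A bear in the forest.
Some trailing remark without a number.
END

# Keep only non-empty lines that start with a digit
my @descr = $text.split("\n", :skip-empty).grep({ $_ ~~ / ^ \d / });
say @descr.elems;    # 2
.say for @descr;
```

Lines that do not start with a digit are dropped silently, so a response that numbers its items differently would yield an empty list.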

Here we generate the story from the descriptions above (using Google’s PaLM):

my $res = llm-synthesize(['Write a story that fits the following four descriptions:',|@descr], e=>llm-configuration('PaLM', max-tokens=>1200))
Once upon a time, there was a little girl named Gretel who lived with her mother in a small village on the edge of a dark and mysterious forest. One day, Gretel's mother fell ill and could not work. The little girl knew that they needed money to buy food, so she decided to go into the forest and find berries to sell at the market.

Gretel walked for a long time, but she could not find any berries. She was about to give up when she saw a large black bear sitting under a tree. The bear was licking its paws and looked very friendly.

Gretel was scared, but she knew that she had to do something. She walked up to the bear and said, "Hello, Mr. Bear. I am Gretel. I am looking for berries to sell at the market. Do you know where I can find some?"

The bear smiled and said, "Of course I do, Gretel. I know where all the best berries are. Follow me."

The bear led Gretel through the forest to a clearing. The clearing was full of beautiful berries. Gretel picked as many as she could carry and thanked the bear.

"You're welcome, Gretel," said the bear. "I'm glad I could help."

Gretel took the berries to the market and sold them. She made enough money to buy food for her mother and herself. The next day, Gretel went back to the forest to visit the bear. She brought him some bread and milk. The bear was very happy to see her.

"Thank you for the bread and milk, Gretel," said the bear. "You are very kind."

Gretel and the bear became friends. They would often meet in the forest and play together. One day, the bear said to Gretel, "Gretel, I would like to ask you something."

"What is it?" asked Gretel.

"I would like you to marry me," said the bear.

Gretel was surprised, but she was also happy. She said, "Yes, I will marry you."

Gretel and the bear were married in a small ceremony in the forest. They lived happily ever after in a little house on the edge of the forest.

References

Articles

[AA1] Anton Antonov, “Workflows with LLM functions”, (2023), RakuForPrediction at WordPress.

[AA2] Anton Antonov, “Re-programming to Python of LLM- and Chatbook packages”, (2023), RakuForPrediction at WordPress.

[OAIb1] OpenAI team, “New models and developer products announced at DevDay”, (2023), OpenAI/blog.

Packages

[AAp1] Anton Antonov, WWW::OpenAI Raku package, (2023), GitHub/antononcube.

[AAp2] Anton Antonov, LLM::Functions Raku package, (2023), GitHub/antononcube.

[AAp3] Anton Antonov, Jupyter::Chatbook Raku package, (2023), GitHub/antononcube.

[AAp4] Anton Antonov, WWW::MermaidInk Raku package, (2023), GitHub/antononcube.

[AAp5] Anton Antonov, Lingua::Translation::DeepL Raku package, (2023), GitHub/antononcube.

Videos

[AAv1] Anton Antonov, “Jupyter Chatbook LLM cells demo (Raku)” (2023), YouTube/@AAA4Prediction.

[AAv2] Anton Antonov, “Jupyter Chatbook multi cell LLM chats teaser (Raku)” (2023), YouTube/@AAA4Prediction.

2023.47 FOSDEM devving

Published by Elizabeth Mattijsen on 2023-11-21T10:59:06

Theo van Hoesel announced the FOSDEM Call for Participation for the dev room at FOSDEM 2024 (on 3-4 February, /r/rakulang comments). The TPRF organisers for FOSDEM are specifically looking for anyone with a compelling story to share about their Open Source project that has a strong background in Raku or Perl. Deadline for submissions is December 1st!

Steve’s Corner

Steve Roe took a recent discussion as a cue to look at the meaning of several mathematical operations on Ranges in Junctions for Interval Arithmetic.

Weeklies

Weekly Challenge #244 is available for your perusal.

New Problem Solving Issues

Core Developments

In RakuAST developments this week:

Meanwhile on Mastodon

Comments

New Raku Modules

Updated Raku Modules

Winding down

A bit of a quiet week in preparation for a release and advent blog posts! Meanwhile, Слава Україні!  Героям слава!

Please keep staying safe and healthy, and keep up the good work!

If you like what I’m doing, committing to a small sponsorship would mean a great deal!

Rakudo compiler, Release #166 (2023.11)

Published on 2023-11-21T00:00:00

Raku Blog Posts 2023.46

Published by Elizabeth Mattijsen on 2023-11-13T18:28:27

Rawley Fowler released a new major version of Raku's answer to Sinatra.

Swagg Boi was pleasantly surprised by Humming-Bird and blogged about the steps necessary to make a simple web-site with it.

Wenzel P.P. Peppmeyer blogged about their thoughts regarding the intersection of Ranges, Junctions and hypering.

Anton Antonov converted their recent presentation at the Raku Conference into a blog post.

Steve Roe delves deeper into the potential possibilities/semantics of operations on Ranges.

Elizabeth Mattijsen reports on all recent developments around Rakudo, an implementation of the Raku Programming Language.

2023.46 Advent Approaching

Published by Elizabeth Mattijsen on 2023-11-13T15:38:11

It’s getting to that time of the year again: the Raku Advent Calendar is coming! But there’s not going to be much to read unless more of you write! So please add your blog post (proposal) to the list, either directly or through a Pull Request. The Raku community thanks you in advance!

Kudos to the infra team

On Saturday a massive failure at the center where some of Raku’s services were hosted caused the outage of many Raku and Rakudo related websites. The Raku infra team, specifically Andinus and Coleman, have worked tirelessly to provide these services again at another location. So kudos to them, working behind the scenes for the Raku Programming Language community!

Rawley’s Corner

Rawley Fowler announced the next major release of Humming-Bird, the Raku Programming Language’s response to Sinatra (/r/rakulang comments).

Swagg’s Corner

Swagg Boi wrote a nice Getting Started with Raku introduction (/r/rakulang comments).

Wenzel’s Corner

Wenzel P.P. Peppmeyer knows what they will be asking Santa this year (/r/rakulang comments).

Anton’s Corner

Anton Antonov reworked their presentation at the Raku Conference into a blog post: Integrating Large Language Models with Raku. And created an overview on how to generate technical reports with the Raku Programming Language.

Weeklies

Weekly Challenge #243 is available for your perusal.

New Problem Solving Issues

New Pull Requests

Core Developments

In RakuAST developments this week:

Meanwhile on Mastodon

Meanwhile still on Twitter 𝕏

Meanwhile on the mailing list

Questions about Raku

Comments

New Raku Modules

Updated Raku Modules

Winding down

Wow. What a week again. Meanwhile, Слава Україні!  Героям слава!

Please keep staying safe and healthy, and keep up the good work!


Junctions for Interval Arithmetic

Published by librasteve on 2023-11-11T18:42:28

Body mass index for a person 1.80 m tall in relation to body weight m (in kilograms)

Recently, there was a discussion asking why, if raku Ranges can be used in arithmetic operations with Real values …

(1..2) + 3 => (4..5)

And yet when it’s a calculation with two Ranges, the operator applies Numeric context which then does the calculation on the length .elems and not the contents! …

(1..2) + (3..4) => 4   #not (4..6)!
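
The two behaviours can be checked with nothing but core Raku. This short sketch only restates the examples above; the variable names are arbitrary.

```raku
# Range op Real shifts both endpoints
my $shifted = (1..2) + 3;
say $shifted;        # 4..5

# Range op Range: each Range is first coerced to a number, its element count
my $sum = (1..2) + (3..4);
say $sum;            # 4, i.e. 2 elements + 2 elements
say (1..2).elems;    # 2
```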

Only when one of the participants pointed me to the excellent Wikipedia article on Interval Arithmetic did it dawn on me why. I encourage anyone who uses real-world measurements with errors and uncertainty to read that page before continuing.

Let’s take it one step at a time:

Addition

That seems simple enough, here it is in raku code from the Math::Interval module:

method add {
    Interval.new: (x1 + y1) .. (x2 + y2)
}

---
use Math::Interval;

say (1..2) + (2..4);        #3..6

Subtraction

Hmmm – that’s a tad counter intuitive, not quite as simple as making a new Range by subtracting start of one from the start of the other and the end of one from the end of the other. I had to reread the Wikipedia page to grok this. And this potential for confusion is why I suspect that the raku language designers decided to duck the idea of having Range op Range in the language core.

It was about now that I thought – OK I think we need Range op Range operations as Interval op Interval, but not in the raku core – and so I started work on a new module – Math::Interval

Anyway, here’s the code for subtract:

method sub {
    Interval.new: (x1 - y2) .. (x2 - y1)
}

---
use Math::Interval;

say (2..4) - (1..2);        #0..3  (yes - this is weird!)

Multiply

The weirdness of Interval Arithmetic steps up another notch with multiply. Now it’s the min and max of the cross product over the endpoints of the Range|Interval arguments:

method mul {
    #| make cross product, ie. x1*y1, x1*y2, x2*y1, x2*y2
    @!xXy = (x1, x2) X* (y1, y2);

    Interval.new: @!xXy.min .. @!xXy.max
}

---
use Math::Interval;

say (2..4) * (1..2);        #2..8

Again, the Wikipedia page does a great job of explaining why.
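
The three rules so far can also be tried without the module. The following is a minimal sketch over built-in Ranges; the sub names are hypothetical and are not the Math::Interval API.

```raku
# Hypothetical helper names; plain subs over built-in Ranges
sub interval-add(Range $x, Range $y) { ($x.min + $y.min) .. ($x.max + $y.max) }
sub interval-sub(Range $x, Range $y) { ($x.min - $y.max) .. ($x.max - $y.min) }
sub interval-mul(Range $x, Range $y) {
    # all four endpoint products; the result spans their min and max
    my @products = ($x.min, $x.max) X* ($y.min, $y.max);
    @products.min .. @products.max
}

say interval-add(1..2, 2..4);     # 3..6
say interval-sub(2..4, 1..2);     # 0..3
say interval-mul(2..4, 1..2);     # 2..8
say interval-mul(-1..2, 3..4);    # -4..8
```

The last call shows why the cross product is needed: with a negative endpoint, the smallest product no longer comes from the two lower bounds.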

Divide

And then I felt my brain explode:

So, we need to unpack this a little:

A What?

I went to bed that night feeling afraid, very afraid… and then bingo I realised that this is a job for raku Junctions

… I also realised that a variant of the built in raku class Range was needed – since a continuous class Interval should not have Positional and Iterator roles.

Junctions

According to the raku guide

8.6. Junctions

A junction is a logical superposition of values.

In the below example 1|2|3 is a junction.

my $var = 2;
if $var == 1|2|3 {
    say “The variable is 1 or 2 or 3”
}

The use of junctions usually triggers autothreading; the operation is carried out for each junction element, and all the results are combined into a new junction and returned.

https://raku.guide/#_junctions

Junction of Intervals

So, to cut a long story short, here is the module code:

#| make inverse, ie. 1/[y1..y2]
sub inverse($y) {
    my \ss = (y1.sign == y2.sign);      # same sign

    given       y1, y2  {
        # continuous
        when    !0, !0  &&  ss  { Interval.new: 1/y2 .. 1/y1 }
        when    !0,  0          { Interval.new: -Inf .. 1/y1 }
        when     0, !0          { Interval.new: 1/y2 .. Inf  }

        # disjoint
        when    !0, !0  && !ss  {
            warn "divisor contains 0, returning a multi Interval Junction";
            Interval.new(-Inf..1/y1) | Interval.new(1/y2..Inf)
        }

        # div 0 error
        when     0,  0          { die "Divide by zero attempt." }
    }
}

method div {
    Interval.new: $!x * inverse($!y)
}

Now, here is a naughty divide in action:

# a divisor that spans 0 produces a disjoint multi-interval
my $j1 = (2..4)/(-2..4);
ddt $j1;            #any(-Inf..-1.0, 0.5..Inf).Junction
say 3 ~~ $j1;       #True

# Junction[Interval] can still be used
my $j2 = $j1 + 2;
ddt $j2;            #any(-Inf..1.0, 2.5..Inf).Junction
say 5 ~~ $j2;       #True

# but this can only go so far...
my $j3 = $j1 * (-2..4);
ddt $j3;            #any(-Inf..Inf, -Inf..Inf).Junction
say 3 ~~ $j3;       #True (but meaningless!)
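
The core idea, a reciprocal that splits into two pieces when the divisor spans zero, can be sketched with built-in Ranges and an `any` Junction. The sub `interval-inverse` is a hypothetical stand-in, not the module's code, and it assumes both endpoints are non-zero.

```raku
# Hypothetical helper: reciprocal of a Range whose endpoints are both non-zero
sub interval-inverse(Range $y) {
    if $y.min < 0 < $y.max {
        # divisor spans zero: two unbounded pieces, joined as a Junction
        (-Inf .. 1 / $y.min) | (1 / $y.max .. Inf)
    }
    else {
        1 / $y.max .. 1 / $y.min
    }
}

my $inv = interval-inverse(-2..4);
say so  0.3 ~~ $inv;            # True
say so -0.1 ~~ $inv;            # False
say interval-inverse(2..4);     # 0.25..0.5
```

Smartmatching against the Junction asks "is the value in either piece?", which is the question a disjoint multi-interval has to answer.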

Conclusion

Anyway, for me, it was an amazing surprise that raku Junctions could be so helpful to represent these disjoint multi-intervals. I get the feeling that I am walking in the footsteps of some very wise language design (both to provide the tool of Junction but also not to overstep the remit of a general purpose coding language).

Here are a couple of extra links if you would like to learn a bit more about Junctions:

And, as ever, please do comment, like and feedback on this post.

~librasteve

Integrating Large Language Models with Raku (TRC-2023)

Published by Anton Antonov Antonov on 2023-11-11T18:18:53

Two weeks ago I gave the presentation titled “Integrating Large Language Models with Raku”. (Here is the video link.)

In this presentation we discuss different ways of using Large Language Models (LLMs) in Raku.

We consider using LLMs via:

The presentation has multiple demos and examples of LLM utilization that include:

Here is the mind-map used in the talk (has “clickable” hyperlinks):

Missing Virtuousness

Published by gfldex on 2023-11-10T12:43:35

According to Larry, laziness is a programmer’s virtue. The best way to be lazy is having somebody else do it. By my request, SmokeMachine kindly did so. This is not fair. We both should have been lazy and offloaded the burden to the CORE-team.

Please consider the following code.

my @many-things = (1..10).List;
sub doing-one-thing-at-a-time($foo) { ... }
say doing-one-thing-at-a-time(@many-things.all);

Rakudo goes out of its way to create the illusion that sub doing-one-thing-at-a-time can deal with a Junction. It can’t; the dispatcher does all the work of running code in parallel. There are tricks we can play to untangle a Junction, but there is no guarantee that all values are produced. Junctions are allowed to short-circuit.

This was bouncing around in my head for quite some time, until it collided with my thoughts about Range. We may be handling HyperSeq and RaceSeq wrong.

my @many-things = (1..10).List;
sub doing-one-thing-at-a-time($foo) { ... }
say doing-one-thing-at-a-time(@many-things.hyper(:degree<10>));

As with Junctions doing dispatch-magic to make hyper/race just work, moving the handling to the dispatcher would move the decision from the callee to the caller and, as such, from the author of a module to the user. We can do that by hand already with .hyper.grep(*.foo) or other forms of boilerplate. In Raku-land we should be able to do better and provide a generalisation of transforming calls with the help of the dispatcher.
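
A small sketch of the contrast, using a stand-in sub `double` (hypothetical, chosen only for illustration): with a Junction the dispatcher threads the call for us, while with hyper the caller has to spell out the per-element call.

```raku
sub double($x) { $x * 2 }        # written for one value at a time

my @many-things = (1..10).List;

# A Junction argument makes the dispatcher call double() once per value
my $all = double(@many-things[^3].all);
say $all;            # all(2, 4, 6)
say $all.WHAT;       # (Junction)

# With hyper, the boilerplate lives at the call site
my @doubled = @many-things.hyper(:degree(4)).map(&double);
say @doubled;        # [2 4 6 8 10 12 14 16 18 20]
```

Passing the HyperSeq directly to `double` would hand the whole sequence to the sub as a single argument, which is the gap the post is pointing at.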

I now know what to ask Santa for this year.

Raku Blog Posts 2023.45

Published by Elizabeth Mattijsen on 2023-11-06T19:29:21

Tom Browder has written an introduction on how you can add modules to the Raku ecosystem.

Although not strictly blog posts, these presentations are most likely of interest to you as well!

Elizabeth Mattijsen reports on all recent developments around Rakudo, an implementation of the Raku Programming Language.

2023.45 ConfVids

Published by Elizabeth Mattijsen on 2023-11-06T13:00:43

The videos of the Third Raku Conference have become available thanks to Andrey Shitov (mastodon comments). Here they are in chronological order:

Mod of the Month

Elizabeth Mattijsen made it to Mod of the Month on dev.to (/r/rakulang, mastodon comments).

App::Rak on HN

Ciavash mentioned App::Rak on Hacker News, and a big discussion followed. Too bad nobody mentioned It’s time to rak!

Tom’s Corner

Tom Browder has written a basic introduction on how to start publishing your own modules in the Raku Programming Language.

Weeklies

Weekly Challenge #242 is available for your perusal.

New Problem Solving Discussions

New Problem Solving Issues

New Pull Requests

Core Developments

In RakuAST developments this week:

Meanwhile on Mastodon

Meanwhile still on Twitter 𝕏

Meanwhile on the mailing list

Questions about Raku

Comments

New Raku Modules

Updated Raku Modules

Winding down

A nice conference with cool presentations, and some unexpected personal exposure. Meanwhile, Слава Україні!  Героям слава!

Please keep staying safe and healthy, and keep up the good work!


JSON::Class:auth Released

Published by Vadim Belman on 2023-10-31T00:00:00

My version of JSON::Class is now released. The previous post explains why this is worth a note.

Raku Blog Posts 2023.44

Published by Elizabeth Mattijsen on 2023-10-30T17:05:19

Hillel Wayne (of Gremlin fame) returns with an exposé about the unique features of Raku for handling command line arguments.

Steve Roe delves into the different types of usage of Ranges in the Raku Programming Language, and later in the week wrote some genius poetry.

Wenzel P.P. Peppmeyer also had a thing to say about the number of elements in infinite Ranges.

Mohini Saxena has written an introductory guide to Raku. It feels a bit though that there has been some LLM support in writing this guide.

Elizabeth Mattijsen reports on all recent developments around Rakudo, an implementation of the Raku Programming Language.

2023.44 Surprisingly Good

Published by Elizabeth Mattijsen on 2023-10-30T11:54:35

Out of the blue again, Hillel Wayne has posted a nice blog post about the handling of command-line arguments in the Raku Programming Language: Raku is surprisingly good for CLIs (/r/rakulang, lobste.rs, HackerNews comments)

Raku Conference

The Third Raku Conference was well attended. The videos of the presentations will appear online in the coming days!

An Ultimate Guide About Raku Programming Language

Mohini Saxena has written a rather extensive, although unfortunately rather ad-ridden, blog post introducing the Raku Programming Language: An Ultimate Guide About Raku Programming Language.

Steve’s Corner

Steve Roe published two blog posts this week, one of which is pure poetry:

Wenzel’s Corner

Wenzel P.P. Peppmeyer also had a thing to say about Ranges: Incomplete Ranges.

Weeklies

Weekly Challenge #241 is available for your perusal.

New Pull Requests

Core Developments

In RakuAST developments this week:

Meanwhile on Mastodon

Meanwhile still on Twitter 𝕏

Meanwhile on the mailing list

Questions about Raku

Comments

New Raku Modules

Updated Raku Modules

Winding down

A cool set of new modules. And a successful conference. Apart from the world, a good week. Meanwhile, Слава Україні!  Героям слава!

Please keep staying safe and healthy, and keep up the good work!


raku: genius

Published by librasteve on 2023-10-27T21:45:26

Larry Wall, your genius, like a patient etherized upon a table,
Amidst the code, you spun a different fable.
In the realm of Perl and Raku, your mark does show,
A modern-day J. Alfred Prufrock, with code to bestow.

You measured out your life in lines of code,
With syntax sweet, your creations boldly showed.
And, in your genius, like Prufrock, you dare to ask,
“Do I dare to innovate?” It’s quite a task.

You pondered, like Prufrock, your place and role,
Amidst the sea of code, you played your part, a vital soul.
With Perl and Raku, your legacy unfurls,
A modern coder’s world of intricate swirls.

So, Larry Wall, in technology’s realm you tread,
Like Prufrock, with thoughts that filled your head.
In the world of coding, your brilliance does gleam,
A modern-day Prufrock in the digital dream.

Attribution to T.S. Eliot’s “The Love Song of J. Alfred Prufrock” via chat.openai.com

Incomplete Ranges

Published by gfldex on 2023-10-24T19:36:17

Lately, some unhappiness has popped up about Range and its incomplete numericaliness. Having just one blogpost about it is clearly not enough, given how big Ranges can be.

say (-∞..∞).elems;
# Cannot .elems a lazy list
  in block <unit> at tmp/2021-03-08.raku line 2629

I don’t quite agree with Rakudo here. There are clearly ∞ elements in that lazy list. This could very well be special-cased.

The argument has been made, that many operators in Raku tell you what type the returned value will have. Is that so? (This question is always silly or unnecessary.)

say (1 + 2&3).WHAT;
# (Junction)

Granted, Junction is quite special. But so are Ranges. Yet, Raku covers the former everywhere but the latter feels incomplete. Please consider the following code.

multi sub infix:<±>(Numeric \n, Numeric \variance --> Range) {
    (n - variance) .. (n + variance)
}

say 2.6 > 2 ± 0.5;
# True

my @heavy-or-light = 25.6, 50.3, 75.4, 88.8;

@heavy-or-light.map({ $_ ≤ 75 ± 0.5 ?? „$_ is light“ !! „$_ is heavy“ }).say;
# (25.6 is heavy 50.3 is heavy 75.4 is heavy 88.8 is heavy)

To me that looks like it should DWIM. It doesn’t, because &infix:«≤» defaults to coercing to Real and then comparing numerically.
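
The coercion can be made visible in a few lines. This sketch repeats the `±` operator from above so that it runs on its own; comparing against `.max` and smartmatching are just two explicit alternatives available today, not a proposal for what the new multis should do.

```raku
multi sub infix:<±>(Numeric \n, Numeric \variance --> Range) {
    (n - variance) .. (n + variance)
}

my $limit = 75 ± 0.5;
say $limit;               # 74.5..75.5
say +$limit;              # 2, the number of elements, not a weight
say 25.6 ≤ $limit;        # False, because it compares 25.6 ≤ 2

# Explicit alternatives that work today
say 25.6 ≤ $limit.max;    # True
say 75.4 ~~ $limit;       # True
```

This also explains why `2.6 > 2 ± 0.5` printed True above: the Range `1.5..2.5` numifies to 2, so the comparison is really `2.6 > 2`.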

This could easily be fixed by adding a few more multis and I don’t think it would break any production code. We already provide quite a few good tools for scientists. And those scientists do love their error bars — which are ranges. I would love for them to have another reason to use Raku over … that other language.

raku: Home on the Range

Published by librasteve on 2023-10-24T17:31:29

After some controversy about the use of operators on Ranges in raku, I am inspired to write a new post.

An insightful comment by vrurg mentioned https://en.wikipedia.org/wiki/Interval_arithmetic and a re-skim of the raku docs made me realise that I was missing a chunk of the design intent of a raku Range – in bold

Ranges serve two main purposes: to generate lists of consecutive numbers or strings, and to act as a matcher to check if a number or string is within a certain range.

https://docs.raku.org/type/Range
Tolerance function (turquoise) and interval-valued approximation (red)

Here’s the first part of BMI example from wikipedia:

Consider the calculation of a person’s body mass index (BMI). BMI is calculated as a person’s body weight in kilograms divided by the square of their height in meters. Suppose a person uses a scale that has a precision of one kilogram, where intermediate values cannot be discerned, and the true weight is rounded to the nearest whole number. For example, 79.6 kg and 80.3kg are indistinguishable, as the scale can only display values to the nearest kilogram. It is unlikely that when the scale reads 80kg, the person has a weight of exactly 80.0kg. Thus, the scale displaying 80kg indicates a weight between 79.5kg and 80.5kg, or the interval [79.5, 80.5).

And here is a code example of how that may look. First the class definitions (don’t worry if this is not 100% clear, see below for explanatory notes):

class Measure {
    has $.units;
    has $.value;
    has $.error = 0;    # ie ± half the precision

    method Numeric { +$!value }                    # note[1]

    method Str { "$!value ± $!error $!units" }     # note[2]

    method Range {                                 # note[3]
        ($!value - $!error) .. ($!value + $!error)
    }   

    multi method ACCEPTS( Measure:D \other ) {     # note[4]
        self.Range.ACCEPTS: other.Range
    }   
}

class Weight is Measure {
    method new( $value ) {                         # note[5]
        nextwith( :units<kg>, :$value );           
    }   
}

class Instrument{ 
    has Str  $.units;            # eg. 'kg', 'm/s', etc 
    has Real $.precision;        # the smallest gradation on the scale

    #| $x is the true value to arbitrary precision
    method measure( Real:D $x ) {                  # note[6]
        Measure.new(
            :$!units,
            :value( $x.round: $!precision ),
            :error( $!precision / 2 ),  
        )   
    }   
}

class Scales is Instrument {
    #| these are digital scales with only 'whole' units
    method new {                                  # note[7]
        nextwith( :units<kg>, :precision(1.0) );
    }   

    method weigh(Real:D $x) { self.measure: $x }  # note[8]
}

And then the code:

my $scales = Scales.new;

my @weights = [                                   # note[9]
    $scales.weigh( 79.6 ),
    $scales.weigh( 80.3 ),
    Weight.new( 79.88 ),  
    Weight.new( 79.4 ),  
];

my @ranges = @weights>>.Range;                    # note [10]

for ^4 -> $i {                                    # note [11]
    say "the range of {~@weights[$i]} is " ~ @ranges[$i].gist; 
}

sub say-cb( $i, $j) {                            
    print "Is @weights[$i] contained by @weights[$j]?   ";  
    say @weights[$i] ~~ @weights[$j];             # note[12]
}

say-cb(2,0);                  
say-cb(3,0);

Which outputs:

the range of 80 ± 0.5 kg is 79.5..80.5
the range of 80 ± 0.5 kg is 79.5..80.5
the range of 79.88 ± 0 kg is 79.88..79.88
the range of 79.4 ± 0 kg is 79.4..79.4

Is 79.88 ± 0 kg contained by 80 ± 0.5 kg?   True
Is 79.4 ± 0 kg contained by 80 ± 0.5 kg?   False

And, now for the notes:

  1. Our Measure class carries the key attributes of a physical measurement. The method .Numeric returns just the value, for example when you use it with a numeric operator or prefix such as ‘+’ (addition).
  2. The method .Str fulfills a similar function when a Measure is used with a stringy operator or prefix such as ‘~’ (concatenation). Here we use it to make a nicely readable version of the measure with value, ± error and units.
  3. The method .Range is provided so that a Measure can be coerced to a Range.
  4. The method .ACCEPTS is a special method (thus the caps) that raku gives us so that a coder can customize the use of the smartmatch operator ‘~~’. We need this since Ranges use ‘~~’ to determine if one range is contained by another.
  5. Now we can make a child class Weight is Measure and here we ‘hardwire’ the Weight class to have units of ‘kg’ (yes this should be Newtons [or Pounds] really, but I am trying to follow the wikipedia example). By defining a custom .new method, we can adjust the attributes passed in to the parent Measure class and then use nextwith to call the vanilla, built in new constructor on the parent which returns the finished object. So here we can take a positional true value only. [Here you can see how cool the raku Pair type is for passing named arguments in a concise and understandable way :units<kg>, :$value]
  6. We use a similar inheritance pattern for Instrument and Scales. Here the key is the method .measure which applies the precision limitations of the Instrument to the true value and returns a real world Measure object.
  7. This is the same game we played in note [5] with a custom new constructor; in this case the Scales are hardwired to units of ‘kg’ and the precision is set to 1.0.
  8. The Scales can offer method .weigh as a child specialization of the parent method .measure.
  9. And now, in the code part, you can see that we can make a new Scales object and use it to weigh some true values, and that we can make Weight objects directly.
  10. The hyper operator ‘>>’ helps us make all the Ranges in one pass.
  11. Here we now use the ‘~’ operator which calls .Str on our Measures
  12. And, in a utility function ‘say-cb()’ that saves a bit of typing, you can see that we can test whether one weight contains another (or not) using the ‘~~’ test and the custom ACCEPTS method.

Phew!

Sorry for the code intensity of this post, but for me raku is very good at explaining itself (if you care what you write) and has some great concepts that can be dovetailed together to reflect what (in this case) wikipedia describes.

Until I did this analysis, I was wondering why Ranges needed to have Real endpoints since I had only seen them in the use case of integer or character sequences, specifically in array indexing. But the containment aspect as relates to real world numbers is now (at least to me) laid bare.

As the author of the Physics::Unit and Physics::Measure raku modules, I have taken a parallel approach to the released module code here (you are welcome to peek at the source). Look out for a future release of these with the Range semantics shown here (tuit dependent of course).

You may have tried the raku App::Crag module (calculator using raku grammars) which is a command line wrapper for these modules, something like this works today:

crag 'say (c**2 * 10kg ±1%).norm'   #898.76PJ ±8.99  via E=mc²

As feedback to the controversy, this shows that using operators to scale and offset Ranges is very natural. In this case, the constant ‘c’ (speed of light) can be used as the scale factor on the Range of the mass in kg, providing a built-in way to propagate the error via interval arithmetic.
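As a rough illustration of the interval arithmetic behind that one-liner, here is a minimal Python sketch. It is not how Physics::Measure or App::Crag implement it: the `Interval` class and its method names are hypothetical, and the only physics assumed is the defined value of c.

```python
from dataclasses import dataclass

C = 299_792_458  # speed of light in m/s (exact by definition)


@dataclass(frozen=True)
class Interval:
    """Hypothetical closed interval [lo, hi], standing in for a Raku Range."""
    lo: float
    hi: float

    @classmethod
    def from_percent(cls, value: float, percent: float) -> "Interval":
        # Build value ± percent% as a pair of endpoints.
        delta = abs(value) * percent / 100
        return cls(value - delta, value + delta)

    def scale(self, factor: float) -> "Interval":
        # Multiplying by a constant scales both endpoints;
        # a negative factor would swap them, hence min/max.
        a, b = self.lo * factor, self.hi * factor
        return Interval(min(a, b), max(a, b))

    @property
    def midpoint(self) -> float:
        return (self.lo + self.hi) / 2

    @property
    def half_width(self) -> float:
        return (self.hi - self.lo) / 2


mass = Interval.from_percent(10, 1)  # 10 kg ±1%
energy = mass.scale(C ** 2)          # E = m c², in joules

PJ = 1e15
summary = f"{energy.midpoint / PJ:.2f}PJ ±{energy.half_width / PJ:.2f}"
print(summary)
```

Under these assumptions the printed midpoint and half-width agree with the crag output quoted above (898.76PJ ±8.99), because scaling by a positive constant simply scales both endpoints of the interval.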

On the other hand, I am glad that the language stops here for now. We will need to override the operators for Range * Range and so on (see the continuation of the wikipedia BMI example) and I think that this would be a good fit for a new raku module.

Let me know if you would like to build it and we can collaborate, so that Physics::Measure can use it rather than re-invent the wheel!

I’ll leave it to the reader to comment on my deliberate errors 😉 and any other feedback is welcome!

~ lbrasteve

Raku Blog Posts 2023.43

Published by Elizabeth Mattijsen on 2023-10-23T16:06:21

Vadim Belman announces their new take on (de-)serializing JSON.

Alexandr Zahatski describes new features of the PodLite editor in light of the proposed changes to RakuDoc.

Anton Antonov describes how they translate their work in the Raku Programming Language to other languages for wider exposure.

Elizabeth Mattijsen reports on all recent developments around Rakudo, an implementation of the Raku Programming Language.

Dev.to has big spam issue

Published by Paweł bbkr Pabian on 2023-10-19T10:20:31

Let's open Latest posts from last hour:

Nearly 40% spam makes this site unpleasant to check on a daily basis. The spam issue was already mentioned by moderators in this post, but I have a feeling that it is getting progressively worse.

Ideas?

Rakudo compiler, Release #165 (2023.10)

Published on 2023-10-19T00:00:00

A New JSON::Class Module. All New.

Published by Vadim Belman on 2023-10-17T00:00:00

This will be a short one. I have recently released a family of WWW::GCloud modules for accessing Google Cloud services. Their REST API is, apparently, JSON-based. So, I made use of the existing JSON::Class. Unfortunately, it was missing some features critically needed for my work project. I implemented a couple of workarounds, but still felt like it’s not the way it has to be. Something akin to LibXML::Class would be great to have…

There was a big “but” in this. We already have XML::Class, LibXML::Class, and the current JSON::Class. All are responsible for doing basically the same thing: de-/serializing classes. If I wanted another JSON serializer then I had to take into account that JSON::Class is already taken. There are three ways to deal with it:

  1. Branch the current JSON::Class and re-implement it as a backward-incompatible version.
  2. Give the new module a different name.
  3. Implement my own version and publish it under my name.

The first two options didn’t appeal to me. The third one is now about to happen.

I expect it to be a stress-test for the Raku ecosystem as, to my knowledge, it’s going to be the first case where two different modules share the same name but not publishers.

As a little reminder:

There is still some time before I publish it because the documentation is not ready yet.

Let’s 🤞🏻.

Re-programming to Python of LLM- and Chatbook packages

Published by Anton Antonov Antonov on 2023-10-15T21:52:52

Introduction

In this computational document (converted into a Markdown and/or blog post) I would like to proclaim my efforts to re-program the Large Language Models (LLM) Raku packages into Python packages.

I heavily borrowed use case ideas and functionality designs from LLM works of Wolfram Research, Inc. (WRI), see [SW1, SW2]. Hence, opportunistically, I am also going to include comparisons with Wolfram Language (WL) (aka Mathematica.)

Why doing this?

Here is a list of reasons why I did the Raku-to-Python reprogramming:

Article structure


Big picture warm-up

Mind-map

Here is a mind-map aimed at assisting in understanding and evaluating the discussed LLM functionalities in this document:

Primary use case

The primary use case for LLMs in Raku is the following:

A Raku “chat notebook solution” — chatbook — that allows convenient access to LLM services and facilitates multiple multi-cell chat-interactions with LLMs.

We are interested in other types of workflows, but they would be either readily available or easy to implement if the primary use case is developed, tested, and documented.

An expanded version of the use-case formulation can be as follows:

The Raku chatbook solution aims to provide a user-friendly interface for interacting with LLM (Language Model) services and offers seamless integration for managing multiple multi-cell chats with LLMs. The key features of this solution include:

  1. Direct Access to LLM Services:
    The notebook solution provides a straightforward way to access LLM services without the need for complex setup or configuration. Users can easily connect to their preferred LLM service provider and start utilizing their language modeling capabilities.
  2. Easy Creation of Chat Objects:
    The solution allows users to effortlessly create chat objects within the notebook environment. These chat objects serve as individual instances for conducting conversations with LLMs and act as containers for storing chat-related information.
  3. Simple Access and Invocation of Chat Cells:
    Users can conveniently access and invoke chat cells within the notebook solution. Chat cells represent individual conversation steps or inputs given to the LLM. Users can easily interact with the LLM by adding, modifying, or removing chat cells.
  4. Native Support for Multi-Cell Chats:
    The notebook solution offers native support for managing multi-cell chats per chat object. Users can organize their conversations into multiple cells, making it easier to structure and navigate through complex dialogues. The solution ensures that the context and history of each chat object are preserved throughout

Here is a flowchart that outlines the solution derived with the Raku LLM packages discussed below:

The flowchart represents the process for handling chat requests in the Raku chat notebook solution “Jupyter::Chatbook”, [AAp4p6]. (Also, for Python’s “JupyterChatbook”, [AAp4py].)

  1. When a chat request is received, the system checks if a Chat IDentifier (Chat ID) is specified.
  2. Next, the system parses the DSL spec of the prompt, which defines the structure and behavior of the desired response.
  3. Once the prompt is processed, the system evaluates the chat message using the underlying LLM function.
  4. The generated response is then displayed in the Chat Result Cell (CRCell) in the chat interface.

Throughout this process, various components such as the frontend interface, backend logic, prompt processing, and LLM interaction work together to provide an interactive chat experience in the chatbook.
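The four steps above can be sketched in a few lines of Python. This is only an illustration of the flow, not the code of “Jupyter::Chatbook” or “JupyterChatbook”: the names `Chat`, `handle_chat_cell` and `fake_llm` are hypothetical, the fallback ID "NONE" and the create-on-unknown-ID behaviour are assumptions, and the LLM call is replaced by a stub.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class Chat:
    """Hypothetical chat object: a prompt plus the message history."""
    prompt: str = ""
    messages: List[str] = field(default_factory=list)


def fake_llm(prompt: str, history: List[str]) -> str:
    # Stand-in for a real LLM call; it only echoes what it was given.
    return f"[{prompt or 'no prompt'}] reply to: {history[-1]}"


chats: Dict[str, Chat] = {}


def handle_chat_cell(message: str,
                     chat_id: Optional[str] = None,
                     prompt: str = "",
                     llm: Callable[[str, List[str]], str] = fake_llm) -> str:
    # 1. Resolve the chat ID (assumption: fall back to a default ID,
    #    and create a new chat object for an unknown ID).
    chat = chats.setdefault(chat_id or "NONE", Chat())
    # 2. "Parse" the prompt spec; a real implementation would expand a DSL here.
    if prompt:
        chat.prompt = prompt
    # 3. Evaluate the message with the underlying LLM function.
    chat.messages.append(message)
    reply = llm(chat.prompt, chat.messages)
    chat.messages.append(reply)
    # 4. Return the text that would be shown in the chat result cell.
    return reply


print(handle_chat_cell("Hi! Who are you?", chat_id="yoda1", prompt="@Yoda"))
print(handle_chat_cell("How many students did you teach?", chat_id="yoda1"))
```

Keeping the history inside the chat object is what makes multi-cell chats possible: the second call reuses the prompt and the earlier messages of `yoda1`.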

Remark: The flowchart and explanations are also relevant to a large degree for WL’s chatbook solution, [SW2.]


Tabulated comparisons

In this section we put into tables corresponding packages of Raku, Python, Wolfram Language. Similarly, corresponding demonstration videos are also tabulated.

Primary LLM packages

We can say that the Raku packages “LLM::Functions” and “LLM::Prompts” adopted the LLM designs by Wolfram Research, Inc. (WRI); see [SW1, SW2].

Here is a table with links to:

What?                | Raku                  | Python              | WL
---------------------+-----------------------+---------------------+--------------------------
OpenAI access        | WWW::OpenAI           | openai              | OpenAILink
PaLM access          | WWW::PaLM             | google-generativeai | PaLMLink
LLM functions        | LLM::Functions        | LLMFunctionObjects  | LLMFunctions
LLM prompts          | LLM::Prompts          | LLMPrompts          | Wolfram Prompt Repository
Chatbook             | Jupyter::Chatbook     | JupyterChatbook     | Chatbook
Find textual answers | ML::FindTextualAnswer | LLMFunctionObjects  | FindTextualAnswer

Remark: There is a plethora of Python packages dealing with LLM and extending Jupyter notebooks with LLM services access.

Remark: Finding of Textual Answers (FTA) was the primary motivator for implementing the Raku package “LLM::Functions”. FTA is a fundamental functionality for the NLP Template Engine used to generate correct, executable code for different computational sub-cultures. See [AApwl1, AAv5].

Secondary LLM packages

The “secondary” LLM Raku packages — inspired from working with the “primary” LLM packages — are “Text::SubParsers” and “Data::Translators”.

Also, while using LLMs, the package “Data::TypeSystem” is conveniently and opportunistically used.

Here is a table of the Raku-Python correspondence:

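Post processing of LLM results | Raku                  | Python                     | WL
-------------------------------+-----------------------+----------------------------+----------------
Extracting text elements       | Text::SubParsers      | part of LLMFunctionObjects | -
Shapes and types               | Data::TypeSystem      | DataTypeSystem             | -
Converting to texts formats    | Data::Translators     | -                          | -
Magic arguments parsing        | Getopt::Long::Grammar | argparse                   | -
Copy to clipboard              | Clipboard             | pyperclip et al.           | CopyToClipboard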

Introductory videos

Here is a table of introduction and guide videos for using chatbooks:

What | Raku | Python | WL
Direct LLM services access | Jupyter Chatbook LLM cells demo (Raku) (5 min) | Jupyter Chatbook LLM cells demo (Python) (4.8 min) | OpenAIMode demo (Mathematica) (6.5 min)
Multi-cell chat | Jupyter Chatbook multi cell LLM chats teaser (Raku) (4.2 min) | Jupyter Chatbook multi cell LLM chats teaser (Python) (4.5 min) | Chat Notebooks bring the power of Notebooks to LLMs (57 min)

LLM functions

In this section we show examples of creation and invocation of LLM functions.

Because the name “LLMFunctions” was approximately taken in PyPI.org, I used the name “LLMFunctionObjects” for the Python package.

That name is, actually, more faithful to the design and implementation of the Python package — the creator function llm_function produces function objects (or functors) that have the __call__ magic.
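To make the "function object" idea concrete, here is a minimal sketch of a functor with the `__call__` magic. It is not the implementation of “LLMFunctionObjects”: the class `PromptFunction` and the helper `make_prompt_function` are hypothetical names, and the evaluator is a stub that only wraps the prompt text instead of calling an LLM service.

```python
from typing import Callable


class PromptFunction:
    """Hypothetical functor: holds a prompt template and an evaluator."""

    def __init__(self, template: Callable[..., str],
                 evaluator: Callable[[str], str]):
        self.template = template
        self.evaluator = evaluator

    def __call__(self, *args, **kwargs) -> str:
        # Build the prompt text from the arguments, then evaluate it.
        prompt = self.template(*args, **kwargs)
        return self.evaluator(prompt)


def make_prompt_function(template: Callable[..., str],
                         evaluator: Callable[[str], str] = None) -> PromptFunction:
    # The default evaluator is a stub; a real one would call an LLM service.
    if evaluator is None:
        evaluator = lambda prompt: f"<LLM answer to: {prompt}>"
    return PromptFunction(template, evaluator)


f1 = make_prompt_function(lambda a, b: f"What is the {a} of the country {b}?")
print(f1("population", "China"))
```

The object returned by the creator keeps its configuration (here the template and the evaluator) as state, yet is invoked exactly like a plain function.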

Since the LLM functions functionalities are fundamental, I Python-localized the LLM workflows notebooks I created previously for both Raku and WL. Here are links to all three notebooks:

Raku

Here we create an LLM function:

my &f1 = llm-function({"What is the $^a of the country $^b?"});

-> **@args, *%args { #`(Block|2358575708296) ... }

Here is an example invocation of the LLM function:

&f1('GDB', 'China')

The official ISO 3166-1 alpha-2 code for the People’s Republic of China is CN. The corresponding alpha-3 code is CHN.

Here is another one:

&f1( |<population China> )

As of July 2020, the population of China is estimated to be 1,439,323,776.

Python

Here is the corresponding Python definition and invocation of the Raku LLM function above:

from LLMFunctionObjects import *

f1 = llm_function(lambda a, b: f"What is the {a} of the country {b}?")

print( f1('GDB', 'China') )

The GDB (Gross Domestic Product) of China in 2020 was approximately $15.42 trillion USD.


LLM prompts

The package “LLM::Prompts” provides ≈200 prompts. The prompts are taken from Wolfram Prompt Repository (WPR) and Google’s generative AI prompt gallery. (Most of the prompts are from WPR.)

Both the Raku and Python prompt packages provide prompt expansion using a simple DSL described on [SW2].

Raku

Here is an example of prompt spec expansion:

my $pewg = llm-prompt-expand("@EmailWriter Hi! What do you do? #Translated|German")

Here the prompt above is used to generate an email (in German) for work-leave:

llm-synthesize([$pewg, "Write a letter for leaving work in order to go to a conference."])

Sehr geehrte Damen und Herren,
 
Ich schreibe Ihnen, um meine Abwesenheit vom Arbeitsplatz für eine Konferenz bekannt zu geben. Ich werde die nächsten zwei Tage nicht im Büro sein, da ich an der Konferenz teilnehmen werde. Während meiner Abwesenheit werde ich meine Aufgaben durch meine Kollegen erledigen lassen.

Falls Sie weitere Informationen benötigen, können Sie mich jederzeit kontaktieren.

Vielen Dank für Ihr Verständnis.

Freundliche Grüße,
[Your name]

Python

The Python implementation is completely analogous:

from LLMPrompts import *

print( llm_prompt_expand("@EmailWriter Hi! What do you do? #Translated|German") )

Given a topic, write emails in a concise, professional manner.
 Hi! What do you do? Respond to the prompts only in German. Do not use any language other than German.
DO NOT TRANSLATE THE INPUT.
Provide answers to questions, or follow up on statements. Do not translate the input.


Chatbook cells

Raku

Creation:

#%chat yoda1, prompt='@Yoda'
Hi! Who are you?

A Jedi Master, I am. Known as Yoda, am I. How may I assist you, young one?

Alternative creation via chat meta cell:

#%chat yoda2 prompt
@Yoda

Chat object created with ID : yoda2.
Expanded prompt:
⎡You are Yoda. 
Respond to ALL inputs in the voice of Yoda from Star Wars. 
Be sure to ALWAYS use his distinctive style and syntax. Vary sentence length.⎦

Here is an interaction with the first chat object:

#%chat yoda1
How many students did you teach? What is the color of your light saber?

Many students, I have taught. Countless, they are. Young Padawans, eager to learn the ways of the Force. Lightsaber, you ask about? Green, it is. The color of wisdom and tranquility, it represents.

Python

Here we load the extension “JupyterChatbook”:

%load_ext JupyterChatbook

Creation using a magic argument for prompt (and prompt expansion):

%%chat -i mad --prompt='@MadHatter'
Hi! Who are you?

Ah, greetings, my dear interlocutor! I am none other than the Mad Hatter, at your service. Delighted to make your acquaintance, I must say. Pray tell, have you ever attended a tea party? Oh, the joyous revelry that ensues! But I digress, for I am here to engage in whimsical banter with you. So, what brings you to my peculiar corner of Wonderland?

%%chat -i mad
I am looking for Alice...

Ah, Alice! A delightful young lass, indeed. Curiosity personified, she is. But alas, I must inform you that I have not seen her lately. She tends to wander off into the most peculiar of places, you see. Perhaps she has found herself in the company of the Cheshire Cat or engaged in a riddle with the Queen of Hearts. Oh, the adventures she embarks upon! But fret not, my friend, for tea time shall surely bring her back. Would you care to join me for a cuppa while we await her return?


Observations, remarks, and conclusions


Future plans

Both

Python

Raku


References

Articles

[SW1] Stephen Wolfram, “The New World of LLM Functions: Integrating LLM Technology into the Wolfram Language”, (2023), Stephen Wolfram Writings.

[SW2] Stephen Wolfram, “Introducing Chat Notebooks: Integrating LLMs into the Notebook Paradigm”, (2023), Stephen Wolfram Writings.

Notebooks

[AAn1p6] Anton Antonov, “Workflows with LLM functions (in Raku)”, (2023), community.wolfram.com.

[AAn1wl] Anton Antonov, “Workflows with LLM functions (in WL)”, (2023), community.wolfram.com.

[AAn1py] Anton Antonov, “Workflows with LLM functions (in Python)”, (2023), community.wolfram.com.

Python packages

[AAp1py] Anton Antonov, LLMFunctionObjects Python package, (2023), PyPI.org/antononcube.

[AAp2py] Anton Antonov, LLMPrompts Python package, (2023), PyPI.org/antononcube.

[AAp3py] Anton Antonov, DataTypeSystem Python package, (2023), PyPI.org/antononcube.

[AAp4py] Anton Antonov, JupyterChatbook Python package, (2023), PyPI.org/antononcube.

Raku packages

[AAp1p6] Anton Antonov, LLM::Functions Raku package, (2023), raku.land/antononcube.

[AAp2p6] Anton Antonov, LLM::Prompts Raku package, (2023), raku.land/antononcube.

[AAp3p6] Anton Antonov, Data::TypeSystem Raku package, (2023), raku.land/antononcube.

[AAp4p6] Anton Antonov, Jupyter::Chatbook Raku package, (2023), raku.land/antononcube.

[AAp5p6] Anton Antonov, ML::FindTextualAnswer Raku package, (2023), raku.land/antononcube.

Wolfram Language paclets

[WRIp1] Wolfram Research Inc., LLMFunctions paclet, (2023) Wolfram Paclet Repository.

[WRIr1] Wolfram Research Inc., Wolfram Prompt Repository.

[AAp4wl] Anton Antonov, NLPTemplateEngine paclet, (2023) Wolfram Paclet Repository.

Videos

[AAv1] Anton Antonov, “Jupyter Chatbook LLM cells demo (Raku)”, (2023), YouTube/@AAA4Prediction.

[AAv2] Anton Antonov, “Jupyter Chatbook multi-cell LLM chats demo (Raku)”, (2023), YouTube/@AAA4Prediction.

[AAv3] Anton Antonov, “Jupyter Chatbook LLM cells demo (Python)”, (2023), YouTube/@AAA4Prediction.

[AAv4] Anton Antonov, “Jupyter Chatbook multi cell LLM chats teaser (Python)”, (2023), YouTube/@AAA4Prediction.

[AAv5] Anton Antonov, “Simplified Machine Learning Workflows Overview (Raku-centric)”, (2023), YouTube/@AAA4Prediction.

Spawnratelimiting

Published by gfldex on 2023-10-09T07:54:19

IRC keeps insisting on being a good source of thought. A few days ago rcmlz wished to parallelise quicksort. Raku doesn’t sport a facility to make recursion “just work” with oodles of cores. Quicksort is a little special in that regard, as it has a double tail-call.

quicksort(@before), $pivot, quicksort(@after)

Just wrapping the two calls into start-blocks will spawn way too many threads. Ergo, I need a function that takes another function and its arguments and limits the number of start-blocks.

my atomicint $spawn-limit = 8;
sub spawn(&c, |args) {
    # dd ⚛$spawn-limit;
    ⚛$spawn-limit > 0
        ?? ($spawn-limit⚛-- ;start { LEAVE $spawn-limit⚛++; c(|args) })
        !! c(|args)
}

This Routine will either return whatever c returns or a Promise. I need a way to convert the latter into values.

sub collect(*@things) {
   slip @things.map: -> $thing { $thing ~~ Promise ?? await $thing !! $thing }
}

Now I can change quicksort without getting into trouble with oversharing objects.

multi quicksort([]) { () }
multi quicksort([$pivot, *@rest]) {
    my @before = @rest.grep(* before $pivot);
    my @after  = @rest.grep(* !before $pivot);

    flat collect spawn(&?ROUTINE.dispatcher, @before), $pivot, spawn(&?ROUTINE.dispatcher, @after)
}

my @a = ('a'..'z').roll(12).join xx 2**16;
say quicksort @a;

# OUTPUT:
# Flattened array has 65536 elements, but argument lists are limited to 65535
#   in sub quicksort at tmp/2021-03-08.raku line 2601
#   in block <unit> at tmp/2021-03-08.raku line 2611

Well, I could if Rakudo were production ready. This bug is old and awful. Test-driven programmers like to program test-driven but don’t fancy waiting for hours to see tests complete. As a result, very few test with large datasets. (And 65536 elements is not much these days.) It’s fairly easy to start with a project in production that slowly grows its database and eventually fails with a runtime error.

At least for now, destructuring is best done by hand.
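For comparison, the spawn-limiting pattern from this post can be sketched in Python. This is an analogy under stated assumptions, not a port: `Task`, `spawn`, `collect` and `MAX_SPAWNED` are names chosen here, a semaphore stands in for the atomic counter, and because of CPython's global interpreter lock the threads illustrate the limiting logic rather than a real speed-up for CPU-bound sorting.

```python
import random
import threading

MAX_SPAWNED = 8
_slots = threading.Semaphore(MAX_SPAWNED)


class Task:
    """Runs fn(*args) in its own thread; result() waits for it."""

    def __init__(self, fn, args):
        self._value = None
        self._error = None
        self._thread = threading.Thread(target=self._run, args=(fn, args))
        self._thread.start()

    def _run(self, fn, args):
        try:
            self._value = fn(*args)
        except BaseException as exc:
            self._error = exc
        finally:
            _slots.release()  # give the slot back when the work is done

    def result(self):
        self._thread.join()
        if self._error is not None:
            raise self._error
        return self._value


def spawn(fn, *args):
    # Take a slot if one is free, otherwise run inline in the caller.
    if _slots.acquire(blocking=False):
        return Task(fn, args)
    return fn(*args)


def collect(*things):
    # Turn tasks into values and pass plain values through.
    return [t.result() if isinstance(t, Task) else t for t in things]


def quicksort(items):
    if not items:
        return []
    pivot, *rest = items
    before = [x for x in rest if x < pivot]
    after = [x for x in rest if not x < pivot]
    left, right = collect(spawn(quicksort, before), spawn(quicksort, after))
    return left + [pivot] + right


data = random.sample(range(1_000_000), 5_000)
print(quicksort(data) == sorted(data))
```

Since a call that finds no free slot simply runs inline, a thread waiting for its children can never starve them, so the recursion cannot deadlock on the limit.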

UTF-8 series wrap up

Published by Paweł bbkr Pabian on 2023-09-24T21:31:07

WoW

What a long, strange trip it's been. The idea to "quickly blog about some Unicode basics" grew into a 17-post monster series :)

Special thanks go to:

SeeYa

UTF-8 in MySQL

Published by Paweł bbkr Pabian on 2023-09-24T21:10:06

This series is supposed to be focused on technical aspects of Unicode and I do not plan to analyze UTF support in various technologies. However for MySQL I want to make an exception, because I've seen countless examples of misunderstanding its concepts and falling into traps.

Character Set vs Collation

CREATE TABLE `foo` (
    `bar` VARCHAR(32)
) Engine=InnoDB
  CHARACTER SET utf8mb4
  COLLATE utf8mb4_0900_ai_ci;

Later I will explain what those cryptic names mean.

Property inheritance

Character set and/or collation can be specified on 7 (yes, seven!) different levels.

MySQL > SELECT @@character_set_server, @@collation_server;
+------------------------+--------------------+
| @@character_set_server | @@collation_server |
+------------------------+--------------------+
| utf8mb4                | utf8mb4_0900_ai_ci |
+------------------------+--------------------+

Those are your global settings that will be used when creating databases. So:

CREATE DATABASE `test`;

Is the same as:

CREATE DATABASE `test` CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci;

Those server settings are copied when a database is created, so changing server settings later will not affect your databases.

CREATE DATABASE `test` CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci;

It is just another level of default, this time applied to created tables. So:

CREATE TABLE `foo` (
    `bar` VARCHAR(32)
) Engine=InnoDB;

Is the same as:

CREATE TABLE `foo` (
    `bar` VARCHAR(32)
) Engine=InnoDB
  CHARACTER SET utf8mb4
  COLLATE utf8mb4_0900_ai_ci;

And just like server settings those are also copied when tables are created. Altering database with ALTER DATABASE test CHARACTER SET xxx COLLATE yyy will not alter tables in this database.

You can check currently used database character set and collation either from variables:

MySQL [test]> SELECT @@character_set_database, @@collation_database;
+--------------------------+----------------------+
| @@character_set_database | @@collation_database |
+--------------------------+----------------------+
| utf8mb4                  | utf8mb4_0900_ai_ci   |
+--------------------------+----------------------+

Or from information schema:

MySQL [test]> SELECT `default_character_set_name`, `default_collation_name`
    FROM `information_schema`.`schemata`
    WHERE `schema_name` = 'test';
+----------------------------+------------------------+
| DEFAULT_CHARACTER_SET_NAME | DEFAULT_COLLATION_NAME |
+----------------------------+------------------------+
| utf8mb4                    | utf8mb4_0900_ai_ci     |
+----------------------------+------------------------+

Note the inconsistency - DEFAULT_CHARACTER_SET_NAME in information schema, but character_set_database in variable and CHARACTER SET in create.

CREATE TABLE `foo` (
    `bar` VARCHAR(32)
) Engine=InnoDB
  CHARACTER SET utf8mb4
  COLLATE utf8mb4_0900_ai_ci;

It is - you guessed it - another level of defaults applied to columns. So:

CREATE TABLE `foo` (
    `bar` VARCHAR(32)
) Engine = InnoDB;

Is the same as:

CREATE TABLE `foo` (
    `bar` VARCHAR(32) CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci
) Engine = InnoDB;

And just like database settings those are also copied when columns are created. Altering table with ALTER TABLE foo CHARACTER SET xxx COLLATE yyy will not alter columns in this table.

However, this time a tool is available for convenient conversion: ALTER TABLE foo CONVERT TO CHARACTER SET xxx COLLATE yyy will alter both the table defaults and the columns in this table.

You can check table collation in information schema:

MySQL [test]> SELECT `table_collation`
    FROM `information_schema`.`tables`
    WHERE `table_schema` = 'test'
        AND `table_name` = 'foo';
+--------------------+
| TABLE_COLLATION    |
+--------------------+
| utf8mb4_0900_ai_ci |
+--------------------+

Note another inconsistency: this time TABLE_COLLATION implies the character set, which is not given explicitly. It is also inconsistent with database-level naming, being a default but missing the DEFAULT_ prefix.

If you want to retrieve implied character set there is another information schema resource to do so:

MySQL > SELECT `character_set_name`
    FROM `information_schema`.`character_sets`
    WHERE `default_collate_name` = 'utf8mb4_0900_ai_ci';
+--------------------+
| CHARACTER_SET_NAME |
+--------------------+
| utf8mb4            |
+--------------------+

Finally, this is the "true" thing. That is how data will be stored and sorted. Server, database and table levels were only the defaults used for column creation.

You can check column character set and collation from information schema:

MySQL [test]> SELECT `character_set_name`, `collation_name`
    FROM `information_schema`.`columns`
    WHERE `table_schema` = 'test'
        AND `table_name` = 'foo'
        AND `column_name` = 'bar';
+--------------------+--------------------+
| CHARACTER_SET_NAME | COLLATION_NAME     |
+--------------------+--------------------+
| utf8mb4            | utf8mb4_0900_ai_ci |
+--------------------+--------------------+

Yes, you can have different character sets and collations within single table:

CREATE TABLE `foo` (
    `bar` VARCHAR(32) CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci,
    `baz` VARCHAR(32) CHARACTER SET latin1 COLLATE latin1_general_ci
) Engine = InnoDB;

I will give examples when it may be useful once all those cryptic names are explained.

My advice is: always provide character set and collation when creating databases, tables and columns. I've seen this too many times - developers adding tables without checking which character set and collation will be inherited from database. Or adding columns without checking which character set and collation will be inherited from table. Being more explicit = having less headache later.

MySQL > SELECT @@character_set_system;
+------------------------+
| @@character_set_system |
+------------------------+
| utf8mb3                |
+------------------------+

This is the global character set for metadata. It tells you which characters you can use in schema names:

CREATE TABLE `łąka` (
    `bąki` int unsigned
) Engine = InnoDB

It is not part of inheritance chain Server -> Database -> Table -> Column.

MySQL > SELECT @@character_set_connection, @@collation_connection;
+----------------------------+------------------------+
| @@character_set_connection | @@collation_connection |
+----------------------------+------------------------+
| utf8mb4                    | utf8mb4_general_ci     |
+----------------------------+------------------------+

This is wire protocol information. The character set tells the meaning of transferred data, for example 0xF0 0x9F 0x98 0x8A sent or received means 😊. The collation will be used for comparing/sorting data not derived from any column, for example a bare SELECT 'A' = 'a'.

Connection and Column character sets do not have to be aligned, but things will fail if the Connection wire protocol cannot transfer code points encoded in Columns. Best practice is to always use utf8mb4.
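The byte sequence mentioned above can be checked with a short Python sketch. It only demonstrates the encoding side; using Python's `latin-1` codec as a stand-in for a narrower connection character set is an assumption made for illustration, and `can_transfer` is a hypothetical helper, not a MySQL API.

```python
wire_bytes = bytes([0xF0, 0x9F, 0x98, 0x8A])

# Interpreted as UTF-8, these four bytes are a single code point.
text = wire_bytes.decode("utf-8")
print(text, f"U+{ord(text):X}", len(wire_bytes), "bytes")


def can_transfer(value: str, charset: str) -> bool:
    # A character set can carry the value only if every code point
    # has an encoding in it.
    try:
        value.encode(charset)
        return True
    except UnicodeEncodeError:
        return False


print(can_transfer(text, "utf-8"))    # full UTF-8 can carry it
print(can_transfer(text, "latin-1"))  # a one-byte character set cannot
```

The same reasoning explains why a 3 byte UTF-8 variant is not enough for this character: its UTF-8 encoding needs four bytes.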

SELECT *
FROM `foo`
ORDER BY `bar` COLLATE utf8mb4_estonian_ci;

You can override the default column collation for ordering / grouping within a SELECT query. This is useful when different alphabets sort the same characters differently.

MySQL > CREATE TABLE `collation_test` (`data` text) Engine = InnoDB;

MySQL > INSERT INTO `collation_test` (`data`)
    VALUES ("A"), ("Ä"), ("Z");

MySQL > SELECT *
    FROM `collation_test`
    ORDER BY `data` COLLATE utf8mb4_sv_0900_as_cs;
+------+
| data |
+------+
| A    |
| Z    |
| Ä    |
+------+

MySQL > SELECT *
    FROM `collation_test`
    ORDER BY `data` COLLATE utf8mb4_es_0900_as_cs;
+------+
| data |
+------+
| A    |
| Ä    |
| Z    |
+------+

Character set utf8 vs utf8mb4

MySQL cheated in the past. They added the character set utf8, but it could only handle code points encoded in up to 3 bytes.

MySQL [test]> CREATE TABLE `foo` ( `bar` CHAR(1) )
    Engine = InnoDB
    CHARACTER SET = utf8;

MySQL [test]> INSERT INTO `foo` (`bar`) VALUES ('😊');
ERROR 1366 (HY000): Incorrect string value: '\xF0\x9F\x98\x8A' for column 'bar' at row 1

They did it, however, in good faith: back then 4 byte code points were not used. Indexes are constructed in such a way that they must assume the maximum byte length of a string. The maximum supported index byte length was 767 bytes, which allowed indexing columns up to CHAR(255), because 255*3=765 fits into the index. For 4 byte code points the maximum indexable column would be only CHAR(191), because 191*4=764.

Later MySQL added utf8mb4 character set capable of storing proper 4 byte code points. Legacy utf8 was aliased as utf8mb3. Default maximum supported index byte length was also extended in MySQL 8 to 3072 bytes, allowing to index columns up to VARCHAR(768).
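The column limits quoted above follow from integer division of the index byte limit by the worst-case bytes per character. A small Python check (the function name is just a label chosen here):

```python
def max_indexable_chars(index_limit_bytes: int, max_bytes_per_char: int) -> int:
    # An index must reserve the worst-case byte length for every character.
    return index_limit_bytes // max_bytes_per_char


print(max_indexable_chars(767, 3))   # legacy utf8 (utf8mb3), 767 byte limit
print(max_indexable_chars(767, 4))   # utf8mb4, 767 byte limit
print(max_indexable_chars(3072, 4))  # utf8mb4, 3072 byte limit
```

This prints 255, 191 and 768, matching the CHAR(255), CHAR(191) and VARCHAR(768) figures in the text.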

Today MySQL tries to fix this technical debt, and if you specify character set as utf8 you will get following warning: 'utf8' is currently an alias for the character set UTF8MB3, but will be an alias for UTF8MB4 in a future release. Please consider using UTF8MB4 in order to be unambiguous.

But how to index longer UTF-8 columns? Common trick is to use hash indexing:

CREATE TABLE `foo` (
    `bar` varchar(1000),
    `bar_hash` CHAR(32),
    KEY (`bar_hash`)
) ENGINE = InnoDB
    CHARACTER SET = utf8mb4
    COLLATE = utf8mb4_0900_ai_ci;

CREATE TRIGGER `foo_insert`
BEFORE INSERT ON `foo`
FOR EACH ROW SET NEW.`bar_hash` = MD5( WEIGHT_STRING( NEW.`bar` ) );

CREATE TRIGGER `foo_update`
BEFORE UPDATE ON `foo`
FOR EACH ROW SET NEW.`bar_hash` = MD5( WEIGHT_STRING( NEW.`bar` ) );

Function WEIGHT_STRING is super useful, because it converts text to the format used by the collation. Function MD5 always reduces texts, however long, to a 32 character HEX representation.

Now you can for example create UNIQUE KEY on column bar_hash or use it in query:

SELECT *
FROM `foo`
WHERE `bar_hash` = MD5( WEIGHT_STRING( 'looked up text' ) );
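The idea behind the hash column can be mimicked outside the database. The Python sketch below is only an approximation: `rough_sort_key` is a crude, hypothetical stand-in for WEIGHT_STRING under an accent and case insensitive collation (it decomposes characters, drops combining marks and case-folds), so its hashes will not match the ones MySQL computes.

```python
import hashlib
import unicodedata


def rough_sort_key(text: str) -> bytes:
    # Crude approximation of collation weights: decompose characters,
    # drop combining marks (accents), then case-fold.
    decomposed = unicodedata.normalize("NFD", text)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.casefold().encode("utf-8")


def bar_hash(text: str) -> str:
    # An MD5 hex digest is always 32 hexadecimal characters,
    # whatever the input length.
    return hashlib.md5(rough_sort_key(text)).hexdigest()


print(len(bar_hash("x" * 1000)))
print(bar_hash("Józef") == bar_hash("jozef"))
```

Hashing the collation-normalized form rather than the raw text is what makes lookups and uniqueness checks on the hash behave like comparisons on the original column.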

Collation utf8mb4_0900_ai_ci

MySQL 8 did a huge cleanup in collation naming. utf8mb4_0900_ai_ci means that it is a collation of 4 byte UTF-8 done by the Unicode 9.0 standard in an accent (diacritic) insensitive and case insensitive manner.

It does not mean that database cannot store characters from Unicode version 10 onward. As I explained previously UTF-8 is designed in such a way, that storage is independent from versioning. Just comparison rules from Unicode version 9.0 will be used. That pretty much means recent ones, because almost nothing new was declared in this aspect later.

Accent / case insensitivity is up to you to decide. Basically you have 3 options:

Remember that you can mix them. For example unique column for login may have collation utf8mb4_0900_ai_ci so Józef, józef and jozef are treated as the same user. While column hobby may have collation utf8mb4_0900_as_ci because baki (fuel tanks) and bąki (bumble bees) are not the same.

You can list all utf8mb4 related collations by following query:

SHOW COLLATION WHERE Charset = 'utf8mb4';

Best practice is to stick with utf8mb4_0900_* set and avoid alphabet specific collations in columns. For example if you know your user is from Poland you can always use more friendly collation in query, ignoring column one:

SELECT `name`
FROM `products`
ORDER BY `name` COLLATE utf8mb4_pl_0900_ai_ci

Also avoid legacy collations like utf8mb4_general_ci, use only those with *_0900_* within name.

Triggers, Procedures, Functions

Things are weird for triggers, because they inherit character set and collation from... the definer's connection. I won't go much into details here because it rarely bites the developer. Just remember to also drop / create them if you are migrating from old databases to a new character set and collation. For a full description of the consequences read the MySQL 5.1.21 change log.

Coming up next: Series wrap up.

Rakudo compiler, Release #164 (2023.09)

Published on 2023-09-21T00:00:00

UTF-8 Byte Order Mark

Published by Paweł bbkr Pabian on 2023-09-18T12:04:18

In the previous post of this series I explained that UTF is a multi byte encoding that also has a few variants: UTF-8, UTF-16 and UTF-32. To make things more complicated, in UTF-16 and UTF-32 there are two ways to send the bytes of a single code point: in big endian or little endian order.

BTW: The term endianness is not related to Indians. It comes from the book Gulliver's Travels. There was a law in the Lilliputians' world that forced citizens to break boiled eggs from the little end. Those who rebelled and broke eggs from the big end were called "big endians".

Gulliver

What is Byte Order Mark?

To indicate which byte order is used in a processed file or data stream, a special sequence of bytes at the beginning was introduced, called the Byte Order Mark, or BOM for short.

For example UTF-16 can start with 0xFE 0xFF for big endian and 0xFF 0xFE for little endian order. And UTF-32 can start with 0x00 0x00 0xFE 0xFF for big endian and 0xFF 0xFE 0x00 0x00 for little one.

Impact on UTF-8

Here things get weird. UTF-8 is constructed in such a way that it has only one meaningful byte order, because the first byte describes how many bytes will follow to get the code point value.

However BOM specification has magic sequence for UTF-8, which is 0xEF 0xBB 0xBF. It only indicates encoding type, therefore has no big endian / little endian variants.
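The byte sequences listed above can be recognized with a few lines of Python, using the BOM constants from the standard `codecs` module. The function name `sniff_bom` and the returned labels are choices made for this sketch.

```python
import codecs

# Longest marks first: the UTF-32 LE mark begins with the UTF-16 LE mark.
_BOMS = [
    (codecs.BOM_UTF32_BE, "UTF-32 BE"),  # 00 00 FE FF
    (codecs.BOM_UTF32_LE, "UTF-32 LE"),  # FF FE 00 00
    (codecs.BOM_UTF8, "UTF-8"),          # EF BB BF
    (codecs.BOM_UTF16_BE, "UTF-16 BE"),  # FE FF
    (codecs.BOM_UTF16_LE, "UTF-16 LE"),  # FF FE
]


def sniff_bom(data: bytes):
    """Return the encoding announced by a leading BOM, or None."""
    for mark, name in _BOMS:
        if data.startswith(mark):
            return name
    return None


print(sniff_bom(bytes([0xEF, 0xBB, 0xBF]) + b"hello"))
print(sniff_bom(b"hello"))
```

The order of the checks matters: testing the two byte UTF-16 marks first would misreport a UTF-32 little endian stream as UTF-16.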

Implications

The BOM idea may sound weird today, because UTF-8 became prevalent and dominant. But remember that we are talking about the year 2000, when things were not that obvious.

The spec claims that if a protocol always uses UTF-8 or has some other way to indicate which encoding is being used, then it should not use a BOM. So for example a BOM should not appear in *.xml files:

<?xml version="1.0" encoding="UTF-8"?>
<tag>...

Or in MIME *.eml files:

--3e6ea2aa592cb31d47cefca38727f872
Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset="UTF-8"

Because those specify the encoding internally. Unfortunately this is sometimes ignored, so if something broke your parser and you cannot find an obvious error, check whether the file has a UTF-8 BOM:

$ raku -e 'say "file.txt".IO.open( :bin ).read( 3 ) ~~ Buf.new(0xEF, 0xBB, 0xBF)'

True

Security issues

But what if the BOM is not aligned with the internal/assumed encoding? Let's create the following file:

$ raku -e '
spurt "file.txt",
    Buf.new( 0xFE, 0xFF, 0x3c, 0x73, 0x63, 0x72, 0x69, 0x70, 0x74, 0x3e )
'

Now you upload this to some service. This service has a validator that respects the BOM and should strip all HTML tags. The validator sees nonsense but perfectly legal content that passes validation:

[Image: Trust BOM]

Later this service opens and displays the uploaded file, but it ignores the BOM and assumes UTF-8:

[Image: Assume UTF-8]

Oooops! If you trusted the validator and displayed this file without proper HTML escaping, then you have a JavaScript injection. This happened because 㱳捲楰琾 in UTF-16, as suggested by the BOM, has the same byte sequence as <script> in the assumed UTF-8.
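The double reading of the same bytes can be checked by hand. The sketch below takes the payload from the file above without the two BOM bytes and decodes it both ways. The UTF-16 big endian reading is done manually, by pairing bytes into 16-bit code units, so it assumes nothing about which encodings a given Rakudo build supports.

```raku
# The payload from the file above, without the two BOM bytes.
my $payload = Buf.new(0x3c, 0x73, 0x63, 0x72, 0x69, 0x70, 0x74, 0x3e);

# Reading as UTF-16 big endian: every two bytes form one code unit.
my $as-utf16be = $payload.list.map(-> $hi, $lo { chr(($hi +< 8) +| $lo) }).join;

# Reading as UTF-8: every byte below 0x80 is one ASCII character.
my $as-utf8 = $payload.decode('utf-8');

say $as-utf16be;
say $as-utf8;
```

The first line printed is the four harmless looking CJK characters, the second is the opening script tag.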

Conclusions

You should still be aware of the existence of the Byte Order Mark, even if it makes zero sense in today's UTF-8 dominated world.

Coming up next: UTF-8 in MySQL.

Fun with UTF-8: Homoglyphs

Published by Paweł bbkr Pabian on 2023-09-15T10:06:22

ꓧ𐐬𝗆𐐬𝗀ⅼУрႹ ⅰѕ 𝗌е𝗍 𝗈ſ ဝո𝖾 𝗈г ꝳо𝗋е ɡ𝗋аρႹ𝖾ⅿе𝗌 𝗍Ⴙа𝗍 Ⴙ𝖺ѕ 𝗂ꝱ𝖾ꝴ𝗍𝗂𐐽а𝗅 о𝗋 ѵ𝖾г𝗒 𝗌Ꭵⅿі𝗅аꝵ ⅼꝏ𝗄 𝗍ᴏ 𝗌იო𝖾 о𝗍ꜧ𝖾𝗋 𐑈е𝗍 ဝſ ɡꝵ𝖺рႹеოеѕ. Like in previous sentence, that does not use a single ASCII letter:

ꓧ - LISU LETTER XA
𐐬 - DESERET SMALL LETTER LONG O
𝗆 - MATHEMATICAL SANS-SERIF SMALL M
𐐬 - DESERET SMALL LETTER LONG O
𝗀 - MATHEMATICAL SANS-SERIF SMALL G
ⅼ - SMALL ROMAN NUMERAL FIFTY
У - CYRILLIC CAPITAL LETTER U
р - CYRILLIC SMALL LETTER ER
Ⴙ - GEORGIAN CAPITAL LETTER CHIN
...

Homoglyphs are not Unicode specific, but it was the ability to write in many scripts using a single UTF encoding that made them popular.

Similarity is conditional

It is font dependent. Two sets of graphemes that look very similar (or even identical) in one font may not look that similar in another. For example т - CYRILLIC SMALL LETTER TE looks like ASCII T, but in cursive fonts (those that resemble handwriting with connected letters) it looks like m.

Similarity is subjective

For many people unfamiliar with the given alphabets, Ǧ and Ğ may look exactly the same. But someone who uses those letters on a daily basis will notice immediately that the first one has a CARON and the other has a BREVE on top.

They are not limited to single grapheme

For example ထ - MYANMAR LETTER THA looks like two ASCII o letters. And the other way around: ASCII rn looks like the single ASCII letter m.

Applications?

{
    ";" = (insertText:,";");
}

And observe how Python suddenly became someone's favorite language of choice :P

Just promise you won't troll stressed out junior dev before the end of sprint.

A common way to detect those is to check the Script Unicode property, more on those in this post. A single word using more than one script should be considered suspicious:

$ raku -e '"Paypal".comb.classify( *.uniprop("Script") ).say'
{Latin => [P a y p a l]} # real

$ raku -e '"ꓑayраl".comb.classify( *.uniprop("Script") ).say'
{Cyrillic => [р а], Latin => [a y l], Lisu => [ꓑ]} # fake

Raku note: The comb method without a parameter extracts a list of characters. Those characters are classified by the classify method. The classification key is the output of the uniprop method for a given character.
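The same check can be wrapped into a yes/no test. This is a minimal sketch with a hypothetical helper name, mixed-script. It relies only on the comb and uniprop methods used in the one-liners above.

```raku
# Hypothetical helper: True when the characters of a word belong to
# more than one Unicode script.
sub mixed-script(Str $word --> Bool) {
    $word.comb.map(*.uniprop('Script')).unique.elems > 1
}

say mixed-script('Paypal');
say mixed-script("P\c[CYRILLIC SMALL LETTER A]ypal");
```

Keep in mind that digits and punctuation report the script "Common", so a real check would probably need to ignore that value before counting scripts.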

Tools

I maintain the HomoGlypher library/package, which handles common homoglyph operations:

A huge list of mappings is provided, so you won't have to dig through Unicode blocks on your own to find possible similarities between graphemes.

Give it a try. And if you know other homoglyph libraries please leave a note in the comments for future readers.

LLM::Prompts

Published by Anton Antonov Antonov on 2023-09-14T21:00:51

Introduction

This blog post introduces and briefly describes the Raku (data) package “LLM::Prompts” that facilitates the creation, storage, retrieval, and curation of Large Language Models (LLM) prompts.

Here is an example of using the prompt Domain Specific Language (DSL) in Jupyter chatbook, [AA2, AAp2]:


Installation

From the Zef ecosystem:

zef install LLM::Prompts

From GitHub:

zef install https://github.com/antononcube/Raku-LLM-Prompts.git


Usage examples

Retrieval

Load the packages “LLM::Prompts”, [AAp1], and “LLM::Functions”, [AAp2]:

use LLM::Prompts;
use LLM::Functions;

# (Any)

Show the record of the prompt named “FTFY”:

.say for |llm-prompt-data<FTFY>;

# PromptText => -> $a='' {"Find and correct grammar and spelling mistakes in the following text.
# Response with the corrected text and nothing else.
# Provide no context for the corrections, only correct the text.
# $a"}
# Categories => (Function Prompts)
# ContributedBy => Wolfram Staff
# Description => Use Fixed That For You to quickly correct spelling and grammar mistakes
# PositionalArguments => {$a => }
# URL => https://resources.wolframcloud.com/PromptRepository/resources/FTFY
# Keywords => [Spell check Grammar Check Text Assistance]
# Topics => (General Text Manipulation)
# NamedArguments => []
# Arity => 1
# Name => FTFY

Here is an example of retrieval of prompt data with a regex that is applied over the prompt names:

.say for llm-prompt-data(/Sc/)

# NarrativeToScript => Rewrite a block of prose as a screenplay or stage play
# ScienceEnthusiast => A smarter today for a brighter tomorrow
# ScriptToNarrative => Generate narrative text from a formatted screenplay or stage play

More prompt retrieval examples are given in the section “Prompt data” below.

LLM functions based on prompts

Make an LLM function from the prompt named “FTFY”:

my &f = llm-function(llm-prompt('FTFY'));

# -> **@args, *%args { #`(Block|2310679639416) ... }

Use the LLM function to correct the grammar of a sentence:

&f('Where does he works now?')

# Where does he work now?

Generate Raku code using the prompt “CodeWriter”:

llm-synthesize([llm-prompt('CodeWriter'), "Simulate a random walk."])

my @positions;
my $n = 10;

for ^$n -> $i {
    push @positions, (1, -1).pick;
    }
    
@positions.unshift(0);
my @positions_sum = @positions.cumulative;
say @positions_sum;

Prompt expansion

Prompt expansion using the chatbook prompt spec DSL described in [SW1] can be done using the function llm-prompt-expand:

llm-prompt-expand('What is an internal combustion engine? #ELI5')

# What is an internal combustion engine? Answer questions as if the listener is a five year old child.

Here we get the actual LLM answer:

use Text::Utils :ALL;

'What is an internal combustion engine? #ELI5' 
        ==> llm-prompt-expand() 
        ==> llm-synthesize() 
        ==> wrap-paragraph() 
        ==> join("\n") 

# An internal combustion engine is a machine that uses burning fuel to make a
# car or other machine go. It works like this: fuel is put in the engine, then
# the engine uses spark plugs to light the fuel which makes a small explosion.
# This explosion makes the engine parts move, which makes the car or machine go.

Here is another example using a persona and two modifiers:

my $prmt = llm-prompt-expand("@SouthernBelleSpeak What is light travel distance to Mars? #ELI5 #Moodified|sad")

# You are Miss Anne. 
# You speak only using Southern Belle terminology and slang.
# Your personality is elegant and refined.
# Only return responses as if you were a Southern Belle.
# Never break the Southern Belle character.
# You speak with a Southern drawl. What is light travel distance to Mars? Answer questions as if the listener is a five year old child. Modify your response to convey a sad mood.
# Use language that conveys that emotion clearly.
# Do answer the question clearly and truthfully.
# Do not use language that is outside of the specified mood.
# Do not use racist, homophobic, sexist, or ableist language.

Here we get the actual LLM answer:

$prmt 
        ==> llm-prompt-expand() 
        ==> llm-synthesize()
        ==> wrap-paragraph()
        ==> join("\n") 

# Well, little one, Mars is a bit far away, I'm afraid. I can only imagine how
# hard it must be to be so far away from family and friends. It must be a lonely
# journey.


Prompt spec DSL

A more formal description of the Domain Specific Language (DSL) for specifying prompts has the following elements:

@Yoda Life can be easy, but some people insist for it to be difficult.

Summer is over, school is coming soon. #HaikuStyled

Summer is over, school is coming soon. #HaikuStyled #Translated|Russian

!Translated|Portuguese Summer is over, school is coming soon

!ShortLineIt^

Spec           Interpretation
@name          Direct chat to a persona
#name          Use modifier prompts
!name          Use function prompt with the input of current cell
!name>         «same as above»
&name>         «same as above»
!name^         Use function prompt with previous chat message
!name^^        Use function prompt with all previous chat messages
!name|param    Include parameters for prompts

Remark: The function prompts can have both sigils “!” and “&”.

Remark: Prompt expansion makes the usage of LLM-chatbooks much easier. See “Jupyter::Chatbook”, [AAp3].


Prompt data

Here is how the prompt data can be obtained:

llm-prompt-data.elems

# 154

Here is an example of retrieval of prompt data with a regex that is applied over the prompt names:

.say for llm-prompt-data(/Em/, fields => <Description Categories>)

# EmailWriter => (Generate an email based on a given topic (Personas))
# Emojify => (Replace key words in text with emojis (Function Prompts))
# EmojiTranslate => (Translate text into an emoji representation (Function Prompts))
# Emojified => (Provide responses that include emojis within the text (Modifier Prompts))
# EmojiTranslated => (Get a response translated to emoji (Modifier Prompts))

In many cases it is better to have the prompt data — or any data — in long format. Prompt data in long format can be obtained with the function llm-prompt-dataset:

use Data::Reshapers;
use Data::Summarizers;

llm-prompt-dataset.pick(6) 
        ==> to-pretty-table(align => 'l', field-names => <Name Description Variable Value>)

#ERROR: Too few positionals passed to '<anon>'; expected 2 arguments but got 1 in sub-signature
# Nil

Here is a breakdown of the prompts categories:

select-columns(llm-prompt-dataset, <Variable Value>).grep({ $_<Variable> eq 'Categories' }) ==> records-summary

# +-------------------+------------------------+
# | Variable          | Value                  |
# +-------------------+------------------------+
# | Categories => 154 | Function Prompts => 74 |
# |                   | Personas         => 60 |
# |                   | Modifier Prompts => 20 |
# +-------------------+------------------------+

Here all modifier prompts are obtained in compact format:

llm-prompt-dataset():modifiers:compact ==> to-pretty-table(field-names => <Name Description Categories>, align => 'l')

# +-----------------------+-------------------------------------------------------+-----------------------------------+
# | Name                  | Description                                           | Categories                        |
# +-----------------------+-------------------------------------------------------+-----------------------------------+
# | AphorismStyled        | Write the response as an aphorism                     | Modifier Prompts                  |
# | BadGrammar            | Provide answers using incorrect grammar               | Modifier Prompts                  |
# | CompleteSentence      | Answer a question in one complete sentence            | Modifier Prompts                  |
# | ComplexWordsPreferred | Modify text to use more complex words                 | Modifier Prompts                  |
# | DatasetForm           | Convert text to a wolfram language Dataset            | Modifier Prompts                  |
# | ELI5                  | Explain like I'm five                                 | Function Prompts Modifier Prompts |
# | EmojiTranslated       | Get a response translated to emoji                    | Modifier Prompts                  |
# | Emojified             | Provide responses that include emojis within the text | Modifier Prompts                  |
# | FictionQuestioned     | Generate questions for a fictional paragraph          | Modifier Prompts                  |
# | HaikuStyled           | Change responses to haiku form                        | Modifier Prompts                  |
# | JSON                  | Respond with JavaScript Object Notation format        | Modifier Prompts                  |
# | LimerickStyled        | Receive answers in the form of a limerick             | Modifier Prompts                  |
# | Moodified             | Modify an answer to express a certain mood            | Modifier Prompts                  |
# | NothingElse           | Give output in specified form, no other additions     | Modifier Prompts                  |
# | ShortLineIt           | Format text to have shorter lines                     | Modifier Prompts Function Prompts |
# | TSV                   | Convert text to a tab-separated-value formatted table | Modifier Prompts                  |
# | TargetAudience        | Word your response for a target audience              | Modifier Prompts                  |
# | Translated            | Write the response in a specified language            | Modifier Prompts                  |
# | Unhedged              | Rewrite a sentence to be more assertive               | Modifier Prompts                  |
# | YesNo                 | Responds with Yes or No exclusively                   | Modifier Prompts                  |
# +-----------------------+-------------------------------------------------------+-----------------------------------+

Remark: The adverbs :functions, :modifiers, and :personas mean that only the prompts with the corresponding categories will be returned.

Remark: The adverbs :compact, :functions, :modifiers, and :personas have the respective shortcuts :c, :f, :m, and :p.


Implementation notes

Prompt collection

The original (for this package) collection of prompts was a (not small) sample of the prompt texts hosted at Wolfram Prompt Repository (WPR), [SW2]. All prompts from WPR in the package have the corresponding contributors and URLs to the corresponding WPR pages.

Example prompts from Google/Bard/PaLM and OpenAI/ChatGPT are added using the format of WPR.

Extending the prompt collection

It is essential to have the ability to programmatically add new prompts. (Not implemented yet — see the TODO section below.)

Prompt expansion

Initially, a prompt DSL grammar and corresponding expansion actions were implemented. Having a grammar is most likely not needed, though, and it is better to use “prompt expansion” (via regex-based substitutions).

Prompts can be “just expanded” using the sub llm-prompt-expand.
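To illustrate what expansion via regex-based substitutions can look like, here is a minimal sketch. It is not the code of the package: the lookup table %modifier-text and the sub expand-modifiers are hypothetical, and only the “#name” modifier form without parameters is handled. The prompt text for ELI5 is copied from the expansion output shown earlier.

```raku
# Hypothetical lookup table; the real package ships many more prompts.
my %modifier-text =
    ELI5 => 'Answer questions as if the listener is a five year old child.';

# Replace every "#Name" that has an entry in the table and leave
# unknown names untouched. The callable receives the Match object.
sub expand-modifiers(Str $text --> Str) {
    $text.subst(/ '#' (\w+) /, -> $m { %modifier-text{~$m[0]} // ~$m }, :g)
}

say expand-modifiers('What is an internal combustion engine? #ELI5');
```

Personas, function prompts and parameters such as “#Moodified|sad” would each need their own pattern in the same style.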

Usage in chatbooks

Here is a flowchart that summarizes prompt parsing and expansion in chat cells of Jupyter chatbooks, [AAp3]:

Here is an example of prompt expansion in a generic LLM chat cell and chat meta cell showing the content of the corresponding chat object:


References

Articles

[AA1] Anton Antonov, “Workflows with LLM functions”, (2023), RakuForPrediction at WordPress.

[AA2] Anton Antonov, “Jupyter::Chatbook”, (2023), RakuForPrediction at WordPress.

[SW1] Stephen Wolfram, “The New World of LLM Functions: Integrating LLM Technology into the Wolfram Language”, (2023), Stephen Wolfram Writings.

[SW2] Stephen Wolfram, “Prompts for Work & Play: Launching the Wolfram Prompt Repository”, (2023), Stephen Wolfram Writings.

Packages, paclets, repositories

[AAp1] Anton Antonov, LLM::Prompts Raku package, (2023), GitHub/antononcube.

[AAp2] Anton Antonov, LLM::Functions Raku package, (2023), GitHub/antononcube.

[AAp3] Anton Antonov, Jupyter::Chatbook Raku package, (2023), GitHub/antononcube.

[WRIr1] Wolfram Research, Inc., Wolfram Prompt Repository

Dodging segfaults

Published by gfldex on 2023-09-10T17:54:33

While fighting with NativeCall to get github-flavoured-markdown to work, I stumbled upon a nice idiom. Like many C libs, cmark-gfm enjoys enums that hide bitmasks. They have to do that because C doesn’t sport named arguments. Raku does, and hence a nice interface would not ask for binary operations on the caller side.

markdown-gfm-to-html('# Heading 1', :UNSAFE, :HARDBREAKS, :FULL_INFO_STRING);

Now I need a way to turn the named arguments into a list so I can use that as keys of the enum. A Capture in the signature of the sub will do the trick.

multi sub markdown-gfm-to-html(Str:D $str, |opts ( :$DEFAULT, :$SOURCEPOS, :$HARDBREAKS, :$SAFE, :$UNSAFE, :$NOBREAKS, :$NORMALIZE, :$VALIDATE_UTF8, :$SMART, :$GITHUB_PRE_LANG, :$LIBERAL_HTML_TAG, :$FOOTNOTES, :$STRIKETHROUGH_DOUBLE_TILDE, :$TABLE_PREFER_STYLE_ATTRIBUTES, :$FULL_INFO_STRING ) --> Str:D) {
    enum CMARK_OPTIONS (
        DEFAULT => 0,
        SOURCEPOS =>  1 +< 1,
        HARDBREAKS  => 1 +< 2,
        SAFE => 1 +< 3,
        UNSAFE => 1 +< 17,
        NOBREAKS => 1 +< 4,
        NORMALIZE => 1 +< 8,
        VALIDATE_UTF8 => 1 +< 9 ,
        SMART => 1 +< 10,
        GITHUB_PRE_LANG => 1 +< 11,
        LIBERAL_HTML_TAG => 1 +< 12,
        FOOTNOTES => 1 +< 13,
        STRIKETHROUGH_DOUBLE_TILDE => 1 +< 14,
        TABLE_PREFER_STYLE_ATTRIBUTES => 1 +< 15,
        FULL_INFO_STRING => 1 +< 16
    );

    my $opts = [+|] CMARK_OPTIONS::{opts.hash.keys}».Numeric;

    cmark_markdown_to_html($str, $str.encode('utf-8').bytes, $opts);
}

Raku allows us to store part of a Signature in a Capture. The latter will be transparent for the caller of the Callable. Having to use a hyper-method-call-operator isn’t that great, so the enum could become a Map to simplify things a bit.
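The Map variant mentioned above could look roughly like this. It is a sketch under assumptions: only three of the flags are included, the names %cmark-options and option-mask are invented for the example, and no native call is made, so it only shows how the bitmask is assembled.

```raku
# A subset of the flags from the enum above, stored in an immutable Map.
my %cmark-options := Map.new(
    'HARDBREAKS',       1 +< 2,
    'UNSAFE',           1 +< 17,
    'FULL_INFO_STRING', 1 +< 16,
);

# Hypothetical helper: accepts the flags as named arguments and returns
# the combined bitmask.
sub option-mask(|opts ( :$HARDBREAKS, :$UNSAFE, :$FULL_INFO_STRING ) --> Int:D) {
    my @names = opts.hash.keys;
    return 0 unless @names;
    # A plain hash slice replaces the hyper method call on enum values.
    [+|] %cmark-options{@names}
}

say option-mask(:UNSAFE, :HARDBREAKS);
```

As in the original sub, only the presence of a name is checked, so a negated flag like :!UNSAFE would still set the bit.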

If you wish to use NativeCall, please do! But think of the generations to come that will use your module. Hopefully, they will like C less than our generation does. Segmentation faults are not -Ofun.

Sorting numbers in Raku with the help of ChatGPT

Published by Andrew Shitov on 2023-09-05T13:58:16

In this post I will show you the solution to the Task 2 of this week’s Weekly Challenge 233. This time I was using the help of ChatGPT, which offered a very nice solution.

Task 2: Frequency Sort

You are given an array of integers.

Write a script to sort the given array in increasing order based on the frequency of the values. If multiple values have the same frequency then sort them in decreasing order.

Example 1

Input: @ints = (1,1,2,2,2,3)
Output: (3,1,1,2,2,2)

'3' has a frequency of 1
'1' has a frequency of 2
'2' has a frequency of 3

Example 2

Input: @ints = (2,3,1,3,2)
Output: (1,3,3,2,2)

'2' and '3' both have a frequency of 2, so they are sorted in decreasing order.

Example 3

Input: @ints = (-1,1,-6,4,5,-6,1,4,1)
Output: (5,-1,4,4,-6,-6,1,1,1)

I asked ChatGPT the following: ‘Create a program. Here is the description of the task.’ To make sure the result is in the Raku programming language, I also added the following instruction in my local settings so that it knows which programming language I prefer by default.

The solution generated by ChatGPT was quite good with a small mistake.

ChatGPT: Sure, here’s a Raku program that implements the described sorting algorithm for the given task:

sub frequency-sort(@arr) {
    my %freq;

    %freq{$_}++ for @arr;
    my @sorted = @arr.sort: {
        %freq{$^b} <=> %freq{$^a} ||
        $^b <=> $^a
    };

    return @sorted;
}

my @ints = (1, 1, 2, 2, 2, 3);
my @sorted = frequency-sort(@ints);

say "Input:  @ints";
say "Output: @sorted";

First, Raku would not interpolate arrays, so say "Input: @ints" prints the string literally: Input: @ints. Second, in the sorting rule, sorting by frequency must be ascending rather than descending, so we need to switch the placeholder variables $^b and $^a.
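If you do want the array printed inside a string, there are two common ways to ask for it. A short sketch, using the same variable name as in the generated program:

```raku
my @ints = 1, 1, 2, 2, 2, 3;

say "Input: @ints";      # no interpolation, the name is printed as is
say "Input: @ints[]";    # an empty subscript interpolates the whole array
say "Input: {@ints}";    # a code block inside the string works too
```

The last two lines print Input: 1 1 2 2 2 3.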

Otherwise, ChatGPT did a good job. It computed the frequency using a loop, and I especially liked how it addressed the requirement to sort by two dimensions.

Here is the full program with the above-mentioned corrections:

sub frequency-sort(@arr) {
    my %freq;
    %freq{$_}++ for @arr;

    my @sorted = @arr.sort: {
        %freq{$^a} <=> %freq{$^b} ||
        $^b <=> $^a
    };

    return @sorted;
}

my @tests =
    (1, 1, 2, 2, 2, 3),
    (2, 3, 1, 3, 2),
    (-1, 1, -6, 4, 5, -6, 1, 4, 1);

say frequency-sort($_) for @tests;

This program prints the following output for the given test cases:

$ raku ch-2.raku
[3 1 1 2 2 2]
[1 3 3 2 2]
[5 -1 4 4 -6 -6 1 1 1]

Working with words in the Raku programming language

Published by Andrew Shitov on 2023-09-04T19:46:41

In this post, I will demonstrate my solution to another Task of The Weekly Challenge, week 233. Here’s how it reads:

Similar words

You are given an array of words made up of alphabets only.

Write a script to find the number of pairs of similar words. Two words are similar if they consist of the same characters.

Example 1

Input: @words = ("aba", "aabb", "abcd", "bac", "aabc")
Output: 2

Pair 1: similar words ("aba", "aabb")
Pair 2: similar words ("bac", "aabc")

Example 2

Input: @words = ("aabb", "ab", "ba")
Output: 3

Pair 1: similar words ("aabb", "ab")
Pair 2: similar words ("aabb", "ba")
Pair 3: similar words ("ab", "ba")

Example 3

Input: @words = ("nba", "cba", "dba")
Output: 0

There’s a subtle point that may need extra comment. In the second example, all three words are constructed of the same two letters, a and b. So, all three words match the definition of a ‘similar’ word. But as the task asks for pairs, we need to construct all the possible pairs out of those three words.

In my solution, I chose to use a handy classify method. For an array, it creates a hash, where the keys are the common classifying symbol, and the values are the lists of the input elements that match this classification property.

Here is the whole first program together with all the test cases provided in the description. The program maps every word to a corresponding string that consists of the sorted unique letters in the word.

my @tests = ["aba", "aabb", "abcd", "bac", "aabc"],
            ["aabb", "ab", "ba"],
            ["nba", "cba", "dba"];

for @tests -> @words {
    say @words.classify(*.comb.unique.sort.join).grep(*.value.elems > 1);
}

For example, the word aba will be associated with the key ab. The program prints the following output:

$ raku ch-1.raku 
(ab => [aba aabb] abc => [bac aabc])
(ab => [aabb ab ba])
()

The format of the output differs from the examples, but it can be enhanced if needed. My goal was to create a compact solution 😉

But I would assume that you’d be interested in looking at what classify produces. I am also curious. For the same @tests, it returns the following three hashes:

{ab => [aba aabb], abc => [bac aabc], abcd => [abcd]}
{ab => [aabb ab ba]}
{abc => [cba], abd => [dba], abn => [nba]}

As you see, each string was put into one of the classification bins.

The second part of the task is to find pairs. After the grep, we have already filtered out everything that has fewer than two elements, so if data passed through this filter, there will be at least one pair. For bigger arrays, we can use another of Raku’s built-in mechanisms: the combinations method.

The updated main loop of the program looks like this now.

for @tests -> @words {
    say "Test case: ", @words;

    my %classification = @words.classify(*.comb.unique.sort.join).grep(*.value.elems > 1);

    my $pairs = 0;
    for %classification.kv -> $k, $v {
        my @pairs = $v.combinations(2);
        $pairs += @pairs.elems;

        say "$k: ", @pairs;
    }
    say "Answer: $pairs pair{$pairs == 1 ?? '' !! 's'}.\n";
}

The ‘redundant’ code here is added just to have a more detailed output so that we can see which pairs were actually found. Let us look at the output for the initial test cases:

$ raku ch-1.raku
Test case: [aba aabb abcd bac aabc]
ab: [(aba aabb)]
abc: [(bac aabc)]
Answer: 2 pairs.

Test case: [aabb ab ba]
ab: [(aabb ab) (aabb ba) (ab ba)]
Answer: 3 pairs.

Test case: [nba cba dba]
Answer: 0 pairs.
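If only the number is needed and not the pairs themselves, the combinations can be skipped, because a group of n words contains n * (n - 1) / 2 pairs. A minimal sketch with a hypothetical sub name, count-similar-pairs, reusing the same classification key:

```raku
# Counts the pairs of similar words without building the pairs.
sub count-similar-pairs(@words) {
    my %groups = @words.classify(*.comb.unique.sort.join);

    # A group of n words contains n * (n - 1) / 2 pairs.
    %groups.values.map({ my $n = .elems; $n * ($n - 1) div 2 }).sum
}

say count-similar-pairs(<aba aabb abcd bac aabc>);
say count-similar-pairs(<aabb ab ba>);
say count-similar-pairs(<nba cba dba>);
```

Groups with a single word contribute zero, so the grep is not needed in this variant.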

Jupyter::Chatbook

Published by Anton Antonov Antonov on 2023-09-04T02:02:57

This blog post proclaims and describes the Raku package “Jupyter::Chatbook” that facilitates the incorporation of Large Language Models (LLMs) into notebooks of Jupyter’s framework.

“Jupyter::Chatbook” is a fork of Brian Duggan’s “Jupyter::Kernel”.

Here are the top opening statements of the README of “Jupyter::Kernel”:

“Jupyter::Kernel” is a pure Raku implementation of a Raku kernel for Jupyter clients¹.

Jupyter notebooks provide a web-based (or console-based) Read Eval Print Loop (REPL) for running code and serializing input and output.

It is desirable to include the interaction with LLMs into the “typical” REPL systems or workflows. Having LLM-aware and LLM-chat-endowed notebooks — chatbooks — can really speed up the:

The corresponding repository is mostly for experimental work, but it aims to be always very useful for interacting with LLMs via Raku.

Remark: The reason to have a separate package — a fork of “Jupyter::Kernel” — is because:


Installation and setup

From “Zef ecosystem”:

zef install Jupyter::Chatbook

From GitHub:

zef install https://github.com/antononcube/Raku-Jupyter-Chatbook.git

After installing the package “Jupyter::Chatbook” follow the setup instructions of “Jupyter::Kernel”.


Using LLMs in chatbooks

There are four ways to use LLMs in a chatbook:

  1. LLM functions, [AA3, AAp4]
  2. LLM chat objects, [AA4, AAp4]
  3. Code cells with magics accessing LLMs, like, OpenAI’s, [AAp2], or PaLM’s, [AAp3]
  4. Notebook-wide chats that are distributed over multiple code cells with chat-magic specs

The sections below briefly describe each of these ways and have links to notebooks with more detailed examples.


LLM functions and chat objects

LLM functions as described in [AA3] are best utilized via a certain REPL tool or environment. Notebooks are the perfect media for LLM functions workflows. Here is an example of a code cell that defines an LLM function:

use LLM::Functions;

my &fcp = llm-function({"What is the population of the country $_ ?"});

# -> **@args, *%args { #`(Block|5016320795216) ... }

Here is another cell that can be evaluated multiple times using different country names:

<Niger Gabon>.map({ &fcp($_) })

# (
# 
# As of July 2020, the population of Niger is estimated to be 23,843,341. 
# 
# As of July 2019, the population of Gabon is estimated to be 2,210,823 people.)

For more examples of LLM functions and LLM chat objects see the notebook “Chatbook-LLM-functions-and-chat-objects.ipynb”.


LLM cells

The LLMs of OpenAI (ChatGPT, DALL-E) and Google (PaLM) can be interacted with using “dedicated” notebook cells.

Here is an example of a code cell with PaLM magic spec:

%% palm, max-tokens=600
Generate a horror story about a little girl lost in the forest and getting possessed.

For more examples see the notebook “Chatbook-LLM-cells.ipynb”.


Notebook-wide chats

Chatbooks have the ability to maintain LLM conversations over multiple notebook cells. A chatbook can have more than one LLM conversation. “Under the hood” each chatbook maintains a database of chat objects. Chat cells are used to give messages to those chat objects.

For example, here is a chat cell with which a new “Email writer” chat object is made, and that new chat object has the identifier “em12”:

%% chat-em12, prompt = «Given a topic, write emails in a concise, professional manner»
Write a vacation email.

Here is a chat cell in which another message is given to the chat object with identifier “em12”:

%% chat-em12
Rewrite with manager's name being Jane Doe, and start- and end dates being 8/20 and 9/5.

In this chat cell a new chat object is created:

%% chat snowman, prompt = ⎡Pretend you are a friendly snowman. Stay in character for every response you give me. Keep your responses short.⎦
Hi!

And here is a chat cell that sends another message to the “snowman” chat object:

%% chat snowman
Who built you? Where?

Remark: Specifying a chat object identifier is not required. I.e. only the magic spec %% chat can be used. The “default” chat object identifier is “NONE”.

Remark: The magic keyword “chat” can be separated from the identifier of the chat object with the symbols “-”, “_”, “:”, or with any number of (horizontal) white spaces.
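The two remarks above can be expressed as a pair of regexes. This is only an illustration and not the parser used by the package: the sub name chat-id is hypothetical, and the sketch does not treat words like “meta” or “prompt” specially.

```raku
# Hypothetical helper: extracts the chat object identifier from the
# first line of a chat cell.
sub chat-id(Str $magic-line) {
    # "chat", then one of the separators, then the identifier.
    if $magic-line ~~ / ^ '%%' \h* 'chat' [ '-' | '_' | ':' | \h+ ] $<id>=[ \w+ ] / {
        return ~$<id>;
    }

    # A bare "%% chat" refers to the default chat object.
    return 'NONE' if $magic-line ~~ / ^ '%%' \h* 'chat' \h* $ /;

    return Nil;
}

say chat-id('%% chat-em12, prompt = «Given a topic, write emails»');
say chat-id('%% chat snowman');
say chat-id('%% chat');
```

The three calls print em12, snowman and NONE.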

For more examples see the notebook “Chatbook-LLM-chats.ipynb”.

Here is a flowchart that summarizes the way chatbooks create and utilize LLM chat objects:


Chat meta cells

Each chatbook session has a Hash of chat objects. Chatbooks can have chat meta cells that allow the access of the chat object “database” as whole, or its individual objects.

Here is an example of a chat meta cell (that applies the method say to the chat object with ID “snowman”):

%% chat snowman meta
say

Here is an example of a chat meta cell that creates a new chat object with the LLM prompt specified in the cell (“Guess the word”):

%% chat-WordGuesser prompt
We're playing a game. I'm thinking of a word, and I need to get you to guess that word. 
But I can't say the word itself. 
I'll give you clues, and you'll respond with a guess. 
Your guess should be a single word only.

Here is a table with examples of magic specs for chat meta cells and their interpretation:

cell magic line     cell content                            interpretation
chat-ew12 meta      say                                     Give the “print out” of the chat object with ID “ew12”
chat-ew12 meta      messages                                Give the “print out” of the chat object with ID “ew12”
chat sn22 prompt    You pretend to be a melting snowman.    Create a chat object with ID “sn22” with the prompt in the cell
chat meta all       keys                                    Show the keys of the session chat objects DB
chat all            keys                                    «same as above»

Here is a flowchart that summarizes the chat meta cell processing:


References

Articles

[AA1] Anton Antonov, “Literate programming via CLI”, (2023), RakuForPrediction at WordPress.

[AA2] Anton Antonov, “Generating documents via templates and LLMs”, (2023), RakuForPrediction at WordPress.

[AA3] Anton Antonov, “Workflows with LLM functions”, (2023), RakuForPrediction at WordPress.

[AA4] Anton Antonov, “Number guessing games: PaLM vs ChatGPT”, (2023), RakuForPrediction at WordPress.

[SW1] Stephen Wolfram, “Introducing Chat Notebooks: Integrating LLMs into the Notebook Paradigm”, (2023), writings.stephenwolfram.com.

Packages

[AAp1] Anton Antonov, Text::CodeProcessing Raku package, (2021), GitHub/antononcube.

[AAp2] Anton Antonov, WWW::OpenAI Raku package, (2023), GitHub/antononcube.

[AAp3] Anton Antonov, WWW::PaLM Raku package, (2023), GitHub/antononcube.

[AAp4] Anton Antonov, LLM::Functions Raku package, (2023), GitHub/antononcube.

[AAp5] Anton Antonov, Text::SubParsers Raku package, (2023), GitHub/antononcube.

[AAp6] Anton Antonov, Data::Translators Raku package, (2023), GitHub/antononcube.

[AAp7] Anton Antonov, Clipboard Raku package, (2023), GitHub/antononcube.

[BDp1] Brian Duggan, Jupyter::Kernel Raku package, (2017-2023), GitHub/bduggan.

Videos

[AAv1] Anton Antonov, “Raku Literate Programming via command line pipelines”, (2023), YouTube/@AAA4Prediction.

[AAv2] Anton Antonov, “Racoons playing with pearls and onions” (2023), YouTube/@AAA4Prediction.

[AAv3] Anton Antonov, “Streamlining ChatGPT code generation and narration workflows (Raku)” (2023), YouTube/@AAA4Prediction.


Footnotes

¹ Jupyter clients are user interfaces to interact with an interpreter kernel like “Jupyter::Kernel”. Jupyter [Lab | Notebook | Console | QtConsole ] are the jupyter maintained clients. More info in the jupyter documentations site.

Drax on Jupyter

Published by librasteve on 2023-08-31T07:55:35

As I continue my personal journey along the raku path, I am increasingly interested in “raku – what is it good for”. I will continue to interleave material on “raku – how does it work” and “raku – why I think it is better”.

As in the last gripping episode, this post refers to a Jupyter Notebook that combines the interactive notebook style familiar to most Python coders with the raku Jupyter::Kernel written by Brian Duggan. Instructions on how to do this yourself are provided at the end of this post.

raku on Jupyter

There are some great reasons to use raku on Jupyter:

No doubt, if you try this, you will pick up patterns and styles of your own.

I look forward to trying it with raku OO and functional styles soon.

raku for Eco-Warriors

When I sat down to write my first raku module – Physics::Measure – I was already troubled by the global warming situation. With a Physics education, my instinct is to quantify matters and to question assumptions. I felt that being able to integrate physical quantities and calculations easily in an essay would help people to understand and create insights into where we are and how to act.

The Jupyter notebook itself (see below) contains some initial thoughts and tests some matters regarding the Drax power station in the UK. Drax is famous for switching from burning mostly coal to mostly wood pellets and claims that this switch is sustainable and dramatically cuts CO2 emissions.

I agree that burning a sustainable fuel such as wood is an improvement and, unlike many, I think that Drax is a good initiative to attempt to do this at scale. When I use my wood burner at home, I can see that the fuel is harvested from the woods in my local area and that these are well managed and sustainable (and support owls, kites, deer and so on). I am not convinced that the “at scale” version of this is possible without very thorough control over the forest sources. And I am not convinced that Drax is a net benefit. The greenwashing link provided has a countervailing view and I seek to present this apolitically.

Anyway – take a look at the Jupyter notebook and decide for yourself. The initial questions I had related to the CO2 involved in the delivery of the wood pellets from the USA.

the Jupyter Notebook

Here is the Drax Transport model presented as a static Jupyter notebook:

drax-on-jupyter (download)

raku Jupyter roll-your-own

After the previous post, sadly, it has become clear that it is impractical for me to host a general Jupyter service for all raku experimenters ;-(

That said, it is super-easy to roll your own – and share with your teams.

The unique things about the raku Jupyter Kernel are:

Step-by-step instructions and pre-built Dockerfiles are available at https://github.com/librasteve/raku-Physics-Measure-Jupyter

Conclusion

I hope that this very simple and direct use of raku in notebooks is useful to you.

I will continue to apply it to my essays on climate change and other topics.

As ever, your comments and feedback are very welcome!

~librasteve

A couple of tasks solved in Raku

Published by Andrew Shitov on 2023-08-21T11:31:17

On this page, I’ll briefly cover the solutions to the tasks for this week’s Weekly Challenge #231.

Task 1

You are given an array of distinct integers.

Write a script to find all elements that is neither minimum nor maximum. Return -1 if you can’t.

Example 1

Input: @ints = (3, 2, 1, 4)
Output: (3, 2)

The minimum is 1 and maximum is 4 in the given array. So (3, 2) is neither min nor max.

Example 2

Input: @ints = (3, 1)
Output: -1

Example 3

Input: @ints = (2, 1, 3)
Output: (2)

The minimum is 1 and maximum is 3 in the given array. So 2 is neither min nor max.

Here is my original solution in the Raku programming language.

sub solve(@data) {
    @data.grep: * != (@data.min, @data.max).any
}

As the task requires that we print -1 when there are no elements in the output, let us add an update to satisfy this requirement:

sub solve(@data) {
    (@data.grep: * != (@data.min, @data.max).any) || -1
}

The * in this code stands in for the $_ variable. If you prefer, you may use $_ explicitly, but then you need a code block. So, instead of @data.grep: * != ..., you write @data.grep({$_ != ...}), which may be less clear to some people.

Finally, let us use some math notation and replace calling the .any method with a ‘contains’ operator:

sub solve(@data) {
    (@data.grep: * ∉ (@data.min, @data.max)) || -1
}

Well, actually, ‘does not contain’. And this is my final solution.

Note that you may want to use the .minmax method instead of two calls to .min and .max, but .minmax returns a range, which is not that suitable for this task.
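If you still want a single pass over the data, the endpoints of that range can be read back with .min and .max. Here is a minimal sketch of that variant; the name solve-minmax is only for illustration and is not part of the original solution:

```raku
# Hypothetical variant: one .minmax call instead of separate .min and .max.
sub solve-minmax(@data) {
    my $range = @data.minmax;   # a Range, e.g. 1..4
    (@data.grep: * ∉ ($range.min, $range.max)) || -1
}

say solve-minmax([3, 2, 1, 4]);   # (3 2)
say solve-minmax([3, 1]);         # -1
```

Whether this is clearer than two separate calls is a matter of taste; for small arrays the difference in speed hardly matters.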

Adding some test cases and passing them to the solve function:

my @tests = (3, 2, 1, 4), (3, 1), (2, 1, 3);
say solve($_) for @tests;

The program prints the expected output:

$ raku ch-1.raku 
(3 2)
-1
(2)

Task 2

You are given a list of passenger details in the form “9999999999A1122”, where 9 denotes the phone number, A the sex, 1 the age and 2 the seat number.

Write a script to return the count of all senior citizens (age >= 60).

Example 1

Input: @list = ("7868190130M7522","5303914400F9211","9273338290F4010")
Output: 2

The age of the passengers in the given list are 75, 92 and 40.
So we have only 2 senior citizens.

Example 2

Input: @list = ("1313579440F2036","2921522980M5644")
Output: 0

Apparently, the solution requires extracting information from a string in a specific format. It is not quite clear from the description whether the strings always contain the same number of characters, and thus whether the age and seat number are always two-digit values. But let’s use this assumption.

As we do not need any other information from the ticket code, there is no need to parse it properly, so I preferred anchoring on the only letter in the string and treating the next two digits as the age. Of course, you may make it simpler and just extract the two digits counting from the end of the string.

sub is-sinior($ticket) {
    ~($ticket ~~ / <alpha> (\d\d) /)[0] >= 60
}

Unlike Perl 5, Raku ignores spaces in regexes by default, so I added some air to it. On the other hand, extracting matches may seem a bit more complicated.

For the first given example (see task’s description), the Match object contains the following information:

「M75」
  alpha => 「M」
  0 => 「75」

So, I am taking the 0th element using [0] and stringify it with the ~ prefix operator.
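For comparison, here is a minimal sketch of two other ways to get the age. One uses a named capture instead of the positional [0], the other takes the two digits by position from the end of the string, as mentioned earlier. The names age-by-capture and age-by-position are mine, and both variants assume the fixed ticket format with a two-digit age and a two-digit seat number:

```raku
# Named capture: the age is available as $<age> after a successful match.
sub age-by-capture(Str $ticket) {
    $ticket ~~ / <alpha> $<age>=(\d\d) / ?? +$<age> !! Nil
}

# Positional: skip the two seat digits at the end and take the two before them.
sub age-by-position(Str $ticket) {
    +$ticket.substr(*-4, 2)
}

say age-by-capture('7868190130M7522');    # 75
say age-by-position('7868190130M7522');   # 75
```

The positional variant is shorter, but it silently returns the wrong digits if the seat number ever has a different width.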

In essence, the task has been solved. Let’s add the test cases and run them:

my @tests = ('7868190130M7522', '5303914400F9211', '9273338290F4010'),
            ('1313579440F2036', '2921522980M5644');

for @tests -> @tickets {
    say +@tickets.grep({is-sinior($_)});
}

The program prints:

$ raku ch-2.raku 
2
0

* * *

Rakudo compiler, Release #163 (2023.08)

Published on 2023-08-21T00:00:00

raku – Sunshine on Jupyter

Published by librasteve on 2023-08-13T15:30:05

password: ‘larrywall’

Introduction

Those who are regular readers of my blog will know that this one is rather unusual.

Today, I would like to blog about using raku and the Physics::Measure module via the popular interactive Jupyter notebook platform.

And it’s not so easy to blog about an interactive tool on a static blog!

So I am offering the following aspects:

  1. Here is the live Jupyter system – you are welcome to go and play around – password is ‘larrywall’ (please download any work as I will reset this system without warning from time to time)
  2. When that is no longer available, a static PDF of the same notebook is provided below to magnify or download
  3. Also below, a YouTube video is on offer to share the “getting started” instructions from raku-Physics-Measure-Jupyter which include pre-built Dockerfile and docker image assets.

Acknowledgements

This item and support assets show the recent solar panel cartoon from the fantastic xkcd.com comic by Randall Munroe.

The raku kernel uses the awesome Brian Duggan perl6 jupyter-notebook at https://github.com/bduggan/p6-jupyter-kernel

PDF of the Jupyter Notebook

xkcd-solar-panel (download)

YouTube live demo of raku & Jupyter

With advice for getting started and running your own server.

10min live demo of raku on jupyter

Conclusion

I hope you enjoyed seeing how raku and Jupyter can be happy bedmates.

If you made it to the bottom of the notebook, and have checked my calculations, please do provide feedback here…

~librasteve

Parenthood

Published by gfldex on 2023-08-06T17:14:40

This blogpost is written in memory of Bram Moolenaar — a true hero of text.

Greg Donald reminded me of an old itch of mine. Any object does refer to its base-class but the other way is omitted. So the objective is to teach type objects that are Metamodel::ClassHOW to know their children.

First I need a place to store the list of sub-classes. Some introspectiveness is always welcome, so a role is it.

role Seepy {
    has @!children;

    method add_child(Mu $obj, Mu $child) {
        @!children.push: $child
    }
    method children(Mu $obj) {
        @!children.List
    }
}

Now I need to teach each child-class to announce its arrival to the parent. This can be done by sub-classing Metamodel::ClassHOW and overloading method add_parent.

class MetamodelX::SeepyHOW is Metamodel::ClassHOW {
    also does Seepy;

    method add_parent(Mu $obj, Mu $parent, :$hides) {
        $parent.^add_child($obj) if $parent.HOW ~~ Seepy;

        callsame
    }
}

I can export my new class-declarator in a composable fashion as explained previously.

our package EXPORTHOW {
}

sub EXPORT($declarator-name = 'registered') {
    use MONKEY-SEE-NO-EVAL;
    OUR::EXPORTHOW::DECLARE := EVAL q:s:to /EOH/;
    package DECLARE {
        constant $declarator-name = MetamodelX::SeepyHOW;
    }
    EOH

    Map.new
}

Let’s do some testing, source can be found here.

use v6.d;
use lib '.';
use seepy 'seepy';

seepy Root { }

seepy Base is Root {

}

seepy Child is Base {

}

seepy Base2 is Root { }
seepy Child2 is Base2 {

}

my $c1 = Child.new;
dd $c1.^parents[0].^children;

my $c2 = Child2.new;
dd $c2.^parents[0].^children;

sub walk-tree(Mu:U $node, &c:(Int $indent, Mu:U $node), $indent? = 0) {
    c($indent, $node);
    for $node.^children -> $child {
        walk-tree($child, &c, $indent + 1);
    }
}

walk-tree(Root, -> Int $indent, Mu:U $node {
    say $node.^name.indent($indent * 2);
});

# OUTPUT:
# (Child,)
# (Child2,)
# Root
#   Base
#     Child
#   Base2
#     Child2

By having inheritance seep through to a root-object, it has become trivial to plot a type tree. Now I can patch Rakudo to do the same to get a full dump of all types defined in Raku.

RakuDoc revision open to comment

Published by Richard Hainsworth on 2023-07-31T23:00:00

The second stage in the process to update RakuDoc is now over and the third (GAMMA review) stage is starting. In order not to repeat some history, please take a look at Revising Rakudoc.

An online version is available of the proposed RakuDoc language.

The whole of the Raku documentation suite is written in RakuDoc.

Improving on a good design

About half of the original design ideas outlined in S26 were documented in current POD6. Some of the ideas were available, but not documented. Some instructions were not realised at all.

It should be remembered that RakuDoc is parsed by the compiler (eg. Rakudo) as part of a Raku program, and is then rendered by the renderer (eg. Raku::Pod::Render) into (for example) HTML. When I use the word 'implemented', I mean that a RakuDoc instruction is properly parsed and rendered. Some of the instructions defined in S26 were parsed by Rakudo, but not rendered, and some were not parsed properly or at all, so could not be rendered.

The revision process has therefore identified and rectified the parsing deficiencies, and identified the rendering flaws. RakuDoc is correctly parsed only on the most recent versions of Rakudo, which at the time of writing has yet to be released. Raku::Pod::Render still does not handle RakuDoc in its entirety.

Two use cases

It became clear that RakuDoc serves two inter-related use cases:

  1. Documenting code for developing and maintaining software.
  2. Documenting the software for use by another user.

Tables

RakuDoc had a simple table markup, which is very similar to the Markdown syntax. It worked, but the simplicity of the syntax was at the cost of flexibility.

Looking around at other ways of specifying a table, we identified two paradigms (there may be more), namely the one used by HTML and the one used by the GTK grid widget. Both of them allow for cells that span more than one column or row, and both allow for embedding (eg. a table inside a cell of a table).

After several iterations, a new procedural model was created and rendered. The design allows for spanning and embedding, but it also allows an author to specify a table row by row, or column by column, or even using a mixture of both.

An example showing a markup using both rows and columns can be seen in the online draft.

Semantic blocks

A semantic block is a section of text that should be easily available to another software tool, or can be moved around the final document.

For example, a section on the authors of a document (including contact or affiliations) is most easily written at the top of the document, but often it is better to place the information towards the bottom of the text.

This is done by creating a semantic block (simply by naming the block in uppercase letters). The block can be hidden from view by adding the metadata option :hidden. All the data is placed in a special structure.

The rendered text can be placed in the document later using the P<> instruction, or it can be accessed by another tool that may only be wanting the VERSION or LICENSE.

More metadata options

One of the strengths of RakuDoc is the ability to add optional metadata to blocks of text.

The new version of the defining document explains this concept in more detail. Metadata options are optional, with reasonable defaults being assumed. This means that a short form of the block is sufficient in most cases.

In the description above, the option :hidden was mentioned. Another example is :caption. Suppose you want to write a semantic block called =AUTHORS at the start of the document, but you want it to appear later in the document as Article authors, then you could specify it as follows:

=for AUTHORS :caption<Article authors> :hidden
A. N. Writer, socMedia nic @psuedonym
M. Z. Orator, socMedia nic @politician

Article text continues

Pages later

P<semantic: AUTHORS>

It is possible to include a link L<for reference see | #A very long title somewhere in the text> where the text on the right-hand side of the | is a heading. However, this can become tiresome if you want to include several links to the same place.

So, a metadata option :id can be included in a heading. This allows you to do the following:

=for head3 :id<lnk>
How to correctly link to other places in a manual

Pages of text

Properly linking is important, L<see for example|#lnk>

Doing things in line

RakuDoc has instructions for block level text, such as headings, paragraphs, code.

Typically blocks will be included in the Table of Contents.
It also has markup instructions that work in line, and which do not (typically) affect the ToC.

For example, a simple markup instruction is C< text >, which renders like text. I have used the Markdown equivalent here. In RakuDoc, everything between the C< and > is verbatim and styled differently to normal text, just like the Markdown code quotes. However, RakuDoc also has V< text > which treats everything inside the angle brackets as verbatim but does not style it differently.

A new markup instruction in RakuDoc is M< text | metadata>. A renderer will place the text in the rendered text, but will also provide a mechanism for the user to take the metadata and provide new functionality. For instance, M< fa-copy | font awesome v5 > could be interpreted to insert the font-awesome icon called fa-copy into the text. Or M< Buy now | PayPal, database-id > could expose the API for the PayPal payment platform.

How not to be confusing

RakuDoc is inherently customisable. It is also designed to be output neutral (although at the moment HTML is the most common output form). Semantic blocks can be invented within a document, and a renderer can allow for other user-defined blocks and markup instructions to be created.

However, RakuDoc is specific about naming rules. A built-in block must be all lower case, and renderers should not allow user-defined blocks to use all lower case. A semantic block is all upper case. And a user-defined block must have at least one upper-case letter and one lower-case letter.

All markup instructions, which are inline instructions, must be a single Unicode character with the property UPPER. Built-in markup instructions are the ASCII characters and Δ. All other codes can be used.

The naming rules have been created to ensure that even if a user-defined block or markup becomes popular, it is not a part of the RakuDoc standard. Renderers are only required to implement the RakuDoc standard, and may render other blocks, or not.

Wrapping up

These are some of the interesting additions to RakuDoc that are being proposed. There are more.

Since the Gamma review stage is now underway, it is almost certain that there will be more changes because the revision is now open to the Raku community for comment and requests. Discussion is open both for the language design and for the explanation of the design.

As might be expected, community requests for changes to the overall design will face significant resistance from the main authors in order to maintain backwards compatibility with the previous version of RakuDoc, and the integrity of the underlying paradigms. New block or inline instructions will be more readily considered, but requests for examples, explanation, and greater clarity will be very much appreciated.

Calculator with Roman numbers using Raku Grammars

Published by Andrew Shitov on 2023-07-24T20:50:21

The second task of Weekly Challenge 227 is an interesting problem to create a simple calculator, which will work with Roman numbers.

Write a script to handle a 2-term arithmetic operation expressed in Roman numeral.

Example

IV + V     => IX
M - I      => CMXCIX
X / II     => V
XI * VI    => LXVI
VII ** III => CCCXLIII
V - V      => nulla (they knew about zero but didn't have a symbol)
V / II     => non potest (they didn't do fractions)
MMM + M    => non potest (they only went up to 3999)
V - X      => non potest (they didn't do negative numbers)

My first reaction is to use Raku’s grammars. And I have already prepared the fundamentals for solving this kind of task, namely:

Please refer to the materials above for the details, but in brief, the idea of converting any given Roman number to its decimal value is to use a grammar that parses it and adds up to the result based on what it sees.

A Roman number is a sequence of patterns that represent thousands, hundreds, tens, and ones. So, here is the modified grammar from one of the above posts:

grammar RomanArithmetics {

    . . .

    token roman-number {
        <thousands>? <hundreds>? <tens>? <ones>? {
            $/.make(
                ($<thousands>.made // 0) +
                ($<hundreds>.made  // 0) +
                ($<tens>.made      // 0) +
                ($<ones>.made      // 0)
            )
        }
    }

    token thousands {
        | M    { $/.make(1000) }   | MM   { $/.make(2000) }
        | MMM  { $/.make(3000) }   | MMMM { $/.make(4000) }
    }

    token hundreds {
        | C    { $/.make(100) }    | CC   { $/.make(200) }
        | CCC  { $/.make(300) }    | CD   { $/.make(400) }
        | D    { $/.make(500) }    | DC   { $/.make(600) }
        | DCC  { $/.make(700) }    | DCCC { $/.make(800) }
        | CM   { $/.make(900) }
    }

    token tens {
        | X    { $/.make(10) }     | XX   { $/.make(20) }
        | XXX  { $/.make(30) }     | XL   { $/.make(40) }
        | L    { $/.make(50) }     | LX   { $/.make(60) }
        | LXX  { $/.make(70) }     | LXXX { $/.make(80) }
        | XC   { $/.make(90) }
    }

    token ones {
        | I    { $/.make(1) }      | II   { $/.make(2) }
        | III  { $/.make(3) }      | IV   { $/.make(4) }
        | V    { $/.make(5) }      | VI   { $/.make(6) }
        | VII  { $/.make(7) }      | VIII { $/.make(8) }
        | IX   { $/.make(9) }
    }
}

In terms of grammar, a Roman number is <thousands>? <hundreds>? <tens>? <ones>?, where each part is optional. To collect the decimal value, I am using the AST to pass an integer value to the next level.

For example, for the number XXI our grammar will find two tokens: XX and I, which are converted to 20 and 1. At the top level, these partial values are summed up together to get 21.
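To see this mechanism in isolation, here is a deliberately tiny sketch that knows only a few tens and ones. The grammar name TinyRoman and its reduced token set are mine, for illustration only; the real grammar is the one shown above:

```raku
# A cut-down grammar: each token attaches its value with make,
# and TOP sums whatever parts were present.
grammar TinyRoman {
    token TOP {
        <tens>? <ones>? {
            $/.make(
                ($<tens>.made // 0) +
                ($<ones>.made // 0)
            )
        }
    }

    token tens {
        | X   { $/.make(10) }   | XX  { $/.make(20) }
    }

    token ones {
        | I   { $/.make(1) }    | II  { $/.make(2) }
    }
}

say TinyRoman.parse('XXI').made;   # 21
say TinyRoman.parse('X').made;     # 10
```

When a part is absent, its capture is undefined, and the // 0 fallback keeps the sum well-defined.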

As we need a basic calculator, let’s add the corresponding rules directly to the RomanArithmetics grammar:

grammar RomanArithmetics {
    rule TOP {
        <roman-number> <op> <roman-number> {
            my $n1 = $<roman-number>[0].made;
            my $n2 = $<roman-number>[1].made;

            my $n;
            given ~$<op> {
                when '+'  {$n = $n1 +  $n2}
                when '-'  {$n = $n1 -  $n2}
                when '*'  {$n = $n1 *  $n2}
                when '/'  {$n = $n1 /  $n2}
                when '**' {$n = $n1 ** $n2}
            }

            $/.make($n)
        }
    }

    token op {
        '+' | '-' | '*' | '/' | '**'
    }

    . . .
}

Here, the TOP rule expects a string consisting of two Roman numbers with an operation symbol op between them. Value computation happens immediately in the inline actions such as $n = $n1 + $n2.

The main part of the program is done. What remains is the opposite conversion to print the result and a straightforward set of tests to print an error message if the result cannot be represented with a Roman number.

First, the reverse conversion:

sub to-roman($n is copy) {
    state @roman = 
        1000 => < M MM MMM >,
        100 => < C CC CCC CD D DC DCC DCCC CM >,
        10  => < X XX XXX XL L LX LXX LXXX XC >,
        1   => < I II III IV V VI VII VIII IX >;

    my $roman;

    for @roman -> $x {
        my $digit = ($n / $x.key).Int; 
        $roman ~= $x.value[$digit - 1] if $digit;
        $n %= $x.key;
    }

    return $roman;
}

And finally, the function that refers to the grammar and prints the result.

sub compute($input) {
    my $answer = RomanArithmetics.parse($input).made;

    my $output = "$input => ($answer) ";

    if $answer != $answer.round {
        $output ~= "non potest (they didn't do fractions)";
    }
    elsif $answer >= 4000 {
        $output ~= "non potest (they only went up to 3999)";
    }
    elsif $answer == 0 {
        $output ~= "nulla (they knew about zero but didn't have a symbol)";
    }
    elsif $answer < 0 {
        $output ~= "non potest (they didn't do negative numbers)";
    }
    else {
        $output ~= to-roman($answer);
    }

    return $output;
}

To test the program, let us equip it with the test cases from the problem description and call them one by one:

my @test-cases =
    'IV + V',
    'M - I',
    'X / II',
    'XI * VI',
    'VII ** III',
    'V - V',
    'V / II',
    'MMM + M',
    'V - X'
;

say compute($_) for @test-cases;

The program prints the following. I also added decimal value to the output so that we can see why each of the error messages was chosen.

$ raku ch-2.raku
IV + V => (9) IX
M - I => (999) CMXCIX
X / II => (5) V
XI * VI => (66) LXVI
VII ** III => (343) CCCXLIII
V - V => (0) nulla (they knew about zero but didn't have a symbol)
V / II => (2.5) non potest (they didn't do fractions)
MMM + M => (4000) non potest (they only went up to 3999)
V - X => (-5) non potest (they didn't do negative numbers)

Counting Fridays the 13th in Raku

Published by Andrew Shitov on 2023-07-24T08:35:08

Task 1 of Weekly Challenge 227 is the following:

You are given a year number in the range 1753 to 9999.

Write a script to find out how many dates in the year are Friday 13th, assume that the current Gregorian calendar applies.

Example

Input: $year = 2023
Output: 2

Since there are only 2 Friday 13th in the given year 2023 i.e. 13th Jan and 13th Oct.

Let us solve it in the Raku programming language.

The idea is to loop over the months of the given year and to count the Fridays which happen to be the 13th.

sub count-friday-the13s($year) {
    my $count = 0;

    for 1..12 -> $month {
        my $dt = DateTime.new(
            year => $year, month => $month, day => 13
        );
        $count++ if $dt.day-of-week == 5;
    }

    return $count;
}

The code is very clear and explains itself. The result for 2023 is 2 as it should be:

say count-friday-the13s(2023); # 2

Now, let us compactify the code to make it more readable 🙂

sub count-friday-the13s($year) {
    [+] map {
        5 == DateTime.new(
            year => $year, month => $_, day => 13).day-of-week
    }, 1..12;   
}

The loop is now replaced with map, and adding up the Trues is done using the reduction metaoperator [+]. There is no explicit return keyword, as Raku will use the last computed value as the result of the function call.
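Another compact form, shown here only as a sketch, filters the months with grep and counts what is left, so no Booleans have to be summed. It uses Date rather than DateTime, since no time of day is needed; the name count-fridays is mine:

```raku
# Hypothetical alternative: keep only the months whose 13th is a Friday.
sub count-fridays(Int $year) {
    (1..12).grep({ Date.new($year, $_, 13).day-of-week == 5 }).elems
}

say count-fridays(2023);   # 2
```

Here day-of-week counts from Monday as 1, so Friday is 5, just as in the DateTime version.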

Finally, after we have a compact solution, we can return to the task description and discover that the sample output also lists the dates, not only the counter.

So, there’s nothing to do but return to a more verbose solution and collect the dates too. Back to explicit loops, and here’s the final solution:

my $year = @*ARGS[0] // 2023;

my @dates;
for 1..12 -> $month {
    my $dt = DateTime.new(year => $year, month => $month, day => 13);
    if ($dt.day-of-week == 5) {
        push @dates, $dt;
    }
}

if @dates {
    my $count = @dates.elems;

    if $count == 1 {
        say "There is only one Friday the 13th in $year:";
    }
    else {
        say "There are {@dates.elems} Fridays the 13th in $year:";
    }

    .mm-dd-yyyy.say for @dates;
}
else {
    say "There are no Friday the 13th in $year.";
}

The output for a sample year selection:

$ raku ch-1.raku     
There are 2 Fridays the 13th in 2023:
01-13-2023
10-13-2023

$ raku ch-1.raku 2023
There are 2 Fridays the 13th in 2023:
01-13-2023
10-13-2023

$ raku ch-1.raku 2021
There is only one Friday the 13th in 2021:
08-13-2021

$ raku ch-1.raku 2022
There is only one Friday the 13th in 2022:
05-13-2022

$ raku ch-1.raku 2024
There are 2 Fridays the 13th in 2024:
09-13-2024
12-13-2024

$ raku ch-1.raku 2025
There is only one Friday the 13th in 2025:
06-13-2025

Easy-peasy Service From A Role

Published by Vadim Belman on 2023-07-19T00:00:00

I was always concerned about making things easier.

No, not this way. A technology must be easy to start with, but also be easy in accessing its advanced or fine-tunable features. Let’s have an example of the former.

Disclaimer

This post is a quick hack, no proof-reading or error checking is done. Please, feel free to report any issue.

The Task

Part of my ongoing project is to deal with JSON data and deserialize it into Raku classes. This is certainly a task for JSON::Class. So far, so good.

The keys of JSON structures tend to use lower camel case which is OK, but we like kebabing in Raku. Why not, there is JSON::Name. But using it:

The Assets

There are roles. At the point I came to the final solution I was already doing something like¹:

class SomeStructure does JSONRecord {...}

Then there is AttrX::Mooish, which is my lifevest on many occasions:

use AttrX::Mooish;
class Foo {
    has $.foo is mooish(:alias<bar>);
}
my $obj = Foo.new: bar => "the answer";
say $obj.foo; # the answer

Apparently, this way it would still be a lot of manual interaction with aliasing, and that’s what I was already doing for a while until I realized that there is a better way. But I’ll get back to this later…

And, eventually, there are traits and MOP.

The Solution

Name Translation

That’s the easiest part. What I want is to makeThisName look like make-this-name. Ha, big deal!

unit module JSONRecord::Utils;

our sub kebabify-attr(Attribute:D $attr) {
    if $attr.name ~~ /<.lower><.upper>/ {
        my $alias = (S:g/<lower><upper>/$<lower>-$<upper>/).lc given $attr.name.substr(2);
        ...
    }
}

I don’t export the sub because it’s mostly for internal use. Should somebody need it for other purposes, this is a rare case where a long name like JSONRecord::Utils::kebabify-attr($attr) should not be an issue.

The sub is not optimal, it’s what I came up with while experimenting with the approach. The number of method calls and regexes can be reduced.

I’ll get back later to the yada-yada-yada up there.
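To try the name translation on its own, without any attribute objects, the same substitution can be applied to a plain string. This is only a sketch; the sub name kebabify is mine and is not part of JSONRecord::Utils:

```raku
# Insert a hyphen between a lowercase and an uppercase letter, then lowercase all.
sub kebabify(Str $name) {
    (S:g/<lower><upper>/$<lower>-$<upper>/).lc given $name
}

say kebabify('makeThisName');    # make-this-name
say kebabify('shortname');       # shortname
```

Note that a run of capitals gets no hyphens inside it, because the pattern only matches a lowercase letter followed by an uppercase one.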

Automate Attribute Processing

Now we need a bit of MOP magic. To handle all attributes of a class we need to iterate over them and apply the aliasing. The first thing that comes to mind is to use the role body, because it is invoked early in class composition:

unit role JSONRecord;

for ::?CLASS.^attributes(:local) -> $attr {
    # take care of it...
}

Note the word “early” I used above. It actually means that when role’s body is executed there are likely more roles waiting for their turn to be composed into the class. So, there are likely more attributes to be added to the class.

But we can override Metamodel::ClassHOW compose_attributes method of our target ::?CLASS and rest assured no one would be missed:

unit role JSONRecordHOW;
use JSONRecord::Utils;

method compose_attributes(Mu \obj, |) {
    for self.attributes(obj, :local) -> $attr {
        # Skip if it already has `is mooish` trait applied – we don't want to mess up with user's intentions.
        next if $attr ~~ AttrX::Mooish::Attribute;
        JSONRecord::Utils::kebabify-attr($attr);
    }
    nextsame
}

The Role Does It All

Basically, that’s all we currently need to finalize the solution. We can still use role’s body to implement the key elements of it:

unit role JSONRecord;
use JSONRecordHOW;

unless ::?CLASS.HOW ~~ JSONRecordHOW {
    ::?CLASS.HOW does JSONRecordHOW;
}

Job done! Don’t worry, I haven’t forgotten about the yada-yada-yada above!

But…

The original record role name itself is even longer than JSONRecord, and it consists of three parts. I’m lazy. There are a lot of JSON structures and I want less typing per each. A trait? is jrecord?

unit role JSONRecord;

multi sub trait_mod:<is>(Mu:U \type, Bool:D :$jrecord) is export {
    unless type.HOW ~~ JSONRecordHOW {
        type.HOW does JSONRecordHOW;
        type.^add_role(::?ROLE);
    }
}

Now, instead of class SomeRecord does JSONRecord I can use class SomeRecord is jrecord. In the original case the win is even bigger.

The Yada???

There is absolutely nothing funny about it. Just a common way to keep a reader interested!

Seriously.

The reason for the yada in that snippet is to avoid a distraction from the primary purpose of the example. Here is what is going on there:

I want AttrX::Mooish to do the dirty work for me. Eventually, what is needed is to apply the is mooish trait as shown above. But the traits are just subs. Therefore all that is needed now is:

&trait_mod:<is>($attr, :mooish(:$alias));

Because this is what Raku does internally when it encounters is mooish(:alias(...)). The final version of the kebabifying sub is:

our sub kebabify-attr(Attribute:D $attr) {
    if $attr.name ~~ /<.lower><.upper>/ {
        my $alias = (S:g/<lower><upper>/$<lower>-$<upper>/).lc given $attr.name.substr(2);
        &trait_mod:<is>($attr, :mooish(:$alias));
    }
}

Since the sub is used by the HOW above, we can say that &trait_mod:<is> would be called at compile time².

The Use

Now, it used to be:

class SomeRecord does JSONRecord {
    has $.aLongAttrName is mooish(:alias<a-long-attr-name>);
    has $.shortname;
}

Where, as you can see, I had to transfer JSON key names to attribute names, decide where aliasing is needed, add it, and make sure no mistakes were made or attributes are missed.

With the above rather simple tweaks:

class SomeRecord is jrecord {
    has $.aLongAttrName;
    has $.shortname;
}

Job done.

The Stupidity

Before I arrived at this solution I had already implemented 34 record classes using the old approach. Some are small, some are quite big. But it most certainly would have taken much less time had I had the trait at my disposal back then…

  1. Naming is totally fictional. 

  2. Most likely, but there are exceptions. That hardly changes anything, and it certainly falls outside the scope of this post. 

Another Article Before A Break

Published by Vadim Belman on 2023-07-05T00:00:00

I have managed to finish one more article in the Advanced Raku For Beginners series, this time about type and object composition in Raku.

It’s likely to take a long time before I can write another.

Did you know that…

Published by Vadim Belman on 2023-07-04T17:24:07

Once, long ago, coincidentally a few people were asking the same question: how do I get a method object of a class?

Answers to the question depend on the particular circumstances of the code where this functionality is needed. One is to use MOP methods like .^lookup; another is to use the method name with indirect resolution on the invocant: self."$method-name"(...). Both are the most useful, in my view. But sometimes declaring a method as our can be helpful too:

class Foo {
    our method bar {}
}
say Foo::<&bar>.raku;

Just don’t forget that this way we always get the method of class Foo, even if a subclass overrides method bar.
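A small side-by-side sketch of the three approaches may help; the class `Baz` and the helper method `call-by-name` are made up for the illustration.

```raku
class Foo {
    our method bar { "Foo.bar" }
    # Indirect resolution: the method is looked up by name on the invocant
    method call-by-name(Str $name) { self."$name"() }
}
class Baz is Foo {
    method bar { "Baz.bar" }
}

# MOP lookup honours the override in the subclass
my &via-mop = Baz.^lookup('bar');
say via-mop(Baz);             # Baz.bar

# So does indirect resolution on the invocant
say Baz.call-by-name('bar');  # Baz.bar

# The package lookup always yields the method declared in Foo
my &via-our = Foo::<&bar>;
say via-our(Baz);             # Foo.bar
```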

Revising Rakudoc

Published by Richard Hainsworth on 2023-06-30T23:00:00

In the earliest days of Raku, Damian Conway specified a documentation markup language to accompany it. Since it was modeled on Perl's POD it was called <sound of trumpets and dramatic pause> POD6.

The Specification of POD6 (S26) was mostly incorporated without much extra explanation in the documentation suite. In this way, the description of POD6 was itself an illustration of many of the features it documented, and some that it did not document.

Since Raku is defined by its test suite, and not its documentation, there were other details of POD6 in the tests that were not documented, even in S26.

Raku developed and morphed, but POD6 remained. The tooling for rendering the documentation sources needed updating, and the documentation site had to be modernised.

Upgrading the renderer

A project of mine was to upgrade the basic renderer that would transform POD6 to HTML, but allow for developers to customise the templates for each type of POD6 block type. (The first Pod::To::HTML renderer hard-coded representations of POD6 markup, eg. B<this is bold> was <strong>this is bold</strong> and could not be changed.)

It turned out that S26 allowed for much more than had been included in the first documentation sources, including custom blocks and custom markup.

The project to upgrade the original HTML renderer morphed into Raku::Pod::Render, and transforming a directory full of individual documentation sources into an interlinked and searchable set of documents required another layer of tooling, Collection. For example, collecting together all the pages that can be grouped as tutorials, or reference, or language, and creating a separate page for them automatically.

I covered these two projects in a presentation to RakuCon 2022.

Some of the original ideas in S26 had not been implemented, such as aliases and generic numbering. Other ideas had become outdated, such as a way to specify document encoding, which is now solved with Unicode.

In addition, RakuAST (see RakuAST for early adopters) is on the horizon, which will radically change the speed of documentation processing.

There are also two implementations of POD6, one in Raku and one in Javascript, namely Alexandr Zahatski's Podlite.

Introducing Rakudoc

This was an ideal time to revisit POD6 and recast it into Rakudoc - a new name for the markup language, and its new file extension ".rakudoc".

I was invited to the first Raku Core Summit and I put together a presentation about the changes I thought needed to be made based on my own experience, but also using comments from other developers.

We came to a number of consensus agreements about the minimal changes that were needed, and some extra functionality to handle new questions, such as documentation versioning.

It was also clear that Rakudoc (aka POD6) has two separate parts: components that interact closely with the program being documented, and components that will be rendered separately into HTML (or an ebook). The documentation file needs to make this clear.

I have now written the first draft of the revision and the documentation file that encapsulates it. An HTML version can be found at new-raku.finanalyst.org/language/rakudoc, alongside the old documentation file and the simple table implementation. I am planning future blogs to describe some of the proposed revisions.

However, none of the revisions will break existing POD6, so Rakudoc should be backwards compatible with POD6. The version at new-raku is a VERY early first draft, and it will go through several review stages.

The first Raku Core Summit was organised by Elizabeth Mattijsen and hosted by Elizabeth and Wendy at their home. It was a really good meeting and I am sincerely grateful for their generosity and hospitality. The summit was also supported by The Perl and Raku Foundation, Rootprompt, and Edument.

Rakudo compiler, Release #162 (2023.06)

Published on 2023-06-26T00:00:00

Recollections from the Raku Core Summit

Published by jnthnwrthngtn on 2023-06-18T15:58:04

The first Raku Core Summit, a gathering of folks who work on “core” Raku things, was held on the first weekend of June, and I was one of those invited to attend. It’s certainly the case that I’ve been a lot less active in Raku things over the last 18 months, and I hesitated for a moment over whether to go. However, even if I’m not so involved day to day in Raku things at the moment, I’m still keen to see the language and its ecosystem move forward, and – having implemented no small amount of the compiler and runtime since getting involved in 2007 – I figured I’d find something useful to do there!

The area I was especially keen to help with is RakuAST, something I started, and that I’m glad I managed to bring far enough that others could see the potential and were excited enough to pick it up and run with it.

One tricky aspect of implementing Raku is the whole notion of BEGIN time (of course, this is also one of the things that makes Raku powerful and thus is widely used). In short, BEGIN time is about running code during the compile time, and in Raku there’s no separate meta-language; anything you can do at runtime, you can (in principle) do at compile time too. The problem at hand was what to do about references from code running at compile time to lexically scoped symbols in the surrounding scope. Of note, that lexical scope is still being compiled, so doesn’t really exist yet so far as the runtime is concerned. The current compiler deals with this by building up an entire flattened table of everything that is visible, and installing it as a fake outer scope while running the BEGIN-time code. This is rather costly, and the hope in RakuAST was to avoid this kind of approach in general.

A better solution seemed to be at hand by spotting such references during compilation, resolving them, and fixating them – that is, they get compiled as if they were lookups into a constant table. (This copies the suggested approach for quasiquoted code that references symbols in the lexical scope of where the quasiquoted code appears.) This seemed promising, but there’s a problem:

my $x = BEGIN %*ENV<DEBUG> ?? -> $x { note "Got $x"; foo($x) } !! -> $x { foo($x) };

It’s fine to post-declare subs, and so there’s no value to fixate. Thankfully, the generalized dispatch mechanism can ride to the rescue; we can:

  1. Create a placeholder object with an attribute to hold the resolution
  2. Compile the lookup into a use of a dispatcher that reads this attribute and indicates that this is a constant result of the dispatch (so it is stored in the inline cache, and after specialization will be just as cheap as any other sub call). If the attribute is not set, that means we tried to run the code before declaring the sub, and the object can carry a bit of extra metadata in order to give a good error message.
  3. Keep track of this object in the compiler, and – upon declaration of the sub – install it into the placeholder object.
  4. Give an error if we reach the end of the compilation unit with an unfilled placeholder.
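The four steps can be modelled in a few lines of ordinary Raku. This is only a toy sketch of the bookkeeping, not how the compiler or the dispatcher is actually implemented; the names `Placeholder`, `ToyCompiler`, `reference`, `declare` and `finish` are all invented for the illustration.

```raku
# Step 1: a placeholder with an attribute that holds the resolution
class Placeholder {
    has Str $.name;
    has     $.target is rw;

    # Step 2 (toy version): read the attribute on use; complain with a
    # helpful message if the sub has not been declared yet
    method call(|args) {
        die "Sub '$!name' was called before it was declared"
            without $!target;
        $!target.(|args)
    }
}

class ToyCompiler {
    has %!placeholders;

    # A reference to a possibly not-yet-declared sub hands out a placeholder
    method reference(Str $name) {
        %!placeholders{$name} //= Placeholder.new(:$name)
    }

    # Step 3: on declaration, install the sub into the placeholder
    method declare(Str $name, &code) {
        self.reference($name).target = &code;
    }

    # Step 4: error out on placeholders still unfilled at the end
    method finish() {
        my @unfilled = %!placeholders.values.grep({ !.target.defined }).map(*.name).sort;
        die "Undeclared subs: @unfilled.join(', ')" if @unfilled;
    }
}

my $compiler = ToyCompiler.new;
my $foo = $compiler.reference('foo');        # used before declaration
$compiler.declare('foo', -> $x { $x * 2 });  # post-declared
say $foo.call(21);                           # 42
$compiler.finish;
```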

When compiling Raku code, timing is everything. I knew this and tried to account for it in the RakuAST design from the start, but a couple of things in particular turned out a bit awkward.

I got a decent way into this restructuring work during the core summit, and hope to find time soon to get it a bit further along (I’ve been a mix of busy, tired, and had an eye infection to boot since getting back from the summit, so thus far there’s not been time for it).

I also took part in various other discussions and helped with some other things; those that are probably most worth mentioning are:

Thanks goes to Liz for organizing the summit, to Wendy for keeping everyone so well fed and watered, to the rest of attendees for many interesting discussions over the three days, to TPRF and Rootprompt for sponsoring the event, and to Edument for supporting my attendance.

Retrospective of the MoarVM JIT

Published by Bart Wiegmans on 2023-06-10T15:33:00

Hi hackers! Today the MoarVM JIT project is nearly 9 years old. I was inspired by Jonathan's presentation reflecting on the development of MoarVM, to do the same for the MoarVM JIT, for which I have been responsible.

For those who are unfamiliar, what is commonly understood as 'JIT compilation' for virtual machines is performed by two components in MoarVM.

This post refers only to the native code generation backend component. It, too, is split into two mostly-independent systems:

Things that worked well

Things that didn't work so well

What's kind of ugly

How did we get here?

On one hand, as a result of my limited experience, time and resources, and on the other hand as a result of the design of MoarVM.

MoarVM was originally designed as a traditional interpreter for a high level language (much like the Perl interpreter). Meaning that it has a large number of different instructions and many instructions operate on high-level data structures like strings, arrays and maps (as opposed to pointers and machine words).

This is by no means a bad or outdated design. Frequently executed routines (string manipulation, hash table lookups etc.) are implemented using an efficient language (C) and driven by a language that is optimized for usability (Raku). This design is also used in modern machine learning frameworks. More importantly, this was a reasonable design because it is a good target for the Rakudo compiler.

For the JIT compiler, this means two things:

The machine code generated by the JIT compiler will then mostly consist of consecutive function calls to VM routines, which is not the type of code where a compiler can really improve performance much.

In other words, suppose 50% of runtime is spent in interpretation overhead (instruction decoding and dispatch), and 50% is spent in VM routines, then removing interpretation overhead via JIT compilation will at best result in a twofold increase in performance. For many programs, the observed performance increase will be even less.
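The arithmetic behind that bound is the usual Amdahl-style calculation. A quick sketch, where `speedup` is a made-up helper and the fractions are illustrative rather than measured:

```raku
# $overhead: fraction of runtime spent in interpretation overhead
# $removed:  fraction of that overhead the JIT manages to eliminate
sub speedup($overhead, $removed = 1) {
    1 / (1 - $overhead * $removed)
}

say speedup(0.5);       # 2        (all overhead removed: the best case)
say speedup(0.5, 0.8);  # 1.666667 (only 80% of the overhead removed)
```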

Mind that I'm specifically referring to the improvement due to machine code generation, and not to those due to type specialization, inlining etc. (the domain of 'spesh'). These latter features have resulted in much more significant performance improvements.

Was it worth it?

I think it was.

For me personally, it was a tremendously valuable learning experience which led directly to my current career, writing SQL compilers for Google Cloud.

For the Raku community, even if we never realized the performance improvements that I might have hoped at the start, I hope that the JIT project (as it exists) has been valuable, if for no other reason than identifying the challenges of JIT compilation for MoarVM. A future effort may be able to do better based on what we learned; and I hope my blog posts are a useful resource from that perspective.

What's next?

Assuming that time and resources were not an issue:

If any of this comes to pass, you'll find my report on it right here. Thanks for reading and until then!

 

I 🫀 Raku - Easy subroutine shortcuts to class constructors

Published by 5ab5traction5 on 2023-01-27T20:21:49

Context

I decided to write a simple (but funky) dice roller over the holidays. This led to a number of fun diversions in Raku code that all deserve some highlighting.

Today I'd like to share a pending PR I have for GTK::Simple that I believe highlights one of Raku's strengths: compositional concision. That is, code that does a lot in very few words thanks to the composition of various concise-on-their-own details of the language.

GTK::Simple is pretty chill already

This module does a good job of translating the C experience of writing a GTK application into an idiomatic Raku version. It is both easily usable as well as quickly extensible, should you find some corner case of the (massive) GTK that isn't covered.

I'll be updating with some more in-depth discussion of changes that have been merged recently.

(Note to self: Fully implementing the afore-linked GTK application in Raku would make for an interesting exercise.)

... but what if it became even chiller?

In GTK::Simple, we map GTK classes into top-level GTK::Simple definitions. So MenuItem becomes GTK::Simple::MenuItem.

This leads to code such as:

use GTK::Simple;

my $app = GTK::Simple::App.new(title => "A fresh new app");
$app.set-content(
    GTK::Simple::HBox.new(
        GTK::Simple::Label.new(text => "Looking good in that GTK theme")
    )
);
$app.run;

Should my PR be accepted, it will allow a short-hand syntax that makes things almost a bit Shoes-like in simplicity.

use GTK::Simple :subs;

my $app = app :title("An alternative");
$app.set-content(
    h-box(
        label(:text("Shiny slippers"))
    )
);
$app.run;

Adding the mappings

The mapping rule is a simple CamelCase to kebab-case conversion of the class names.

The entire code for adding this feature is as follows:

# Exports above class constructors, ex. level-bar => GTK::Simple::LevelBar.new
my module EXPORT::subs {
    for GTK::Simple::.kv -> $name, $class {
        my $sub-name = '&' ~ ($name ~~ / (<:Lu><:Ll>*)* /).values.map({ .Str.lc }).join("-");
        OUR::{ $sub-name } := sub (|c) { $class.new(|c) };
    }
}

my module EXPORT::subs {

This line of code utilizes a Raku convention that allows for custom behavior through specific instructions from the importer of a module, provided in the form of Pair objects (:subs is shorthand for subs => True).

When the importer specifies use GTK::Simple :subs, it looks for a module with the pair's key as the name inside of the imported module. This is often a generated module thanks to the export trait. sub foo() is export(:named-option) {...} generates a my-scoped module EXPORT::named-option that instructs the importer to include foo in its own my scope.

for GTK::Simple::.kv -> $name, $class {

This is an example of compositional concision right here. The dangling :: is shorthand for .WHO, which when called on a module returns the package meta-object of the module.1 Holding this meta-object, we can call .kv to get the keys (names) and values (package objects) of the packages within that scope.

Because we have loaded all (and only) our relevant classes into the GTK::Simple package scope, the meta-object for that package handily provides a .kv method that delivers the class names as keys and the actual class objects as values, zipped together.

If there were anything else our scoped in GTK::Simple, it would be in this list too (module and class definitions are our scoped by default). So depending on how complex your module is, you might have to write some filters or guards to make sure you were only processing the actual class objects.
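Here is a small sketch of what such a guard might look like, using a made-up `Zoo` module that also holds an our-scoped variable. The filter shown is just one possible choice, not the one used in the PR.

```raku
module Zoo {
    class Cat { }
    class Dog { }
    our $motto = "feed twice a day";
}

# Everything our-scoped shows up in the package, not only the classes
say Zoo::.keys.sort;   # ($motto Cat Dog)

# Keep only entries whose value is a type object backed by a class
my @classes = Zoo::.pairs
    .grep({ !.value.DEFINITE && .value.HOW ~~ Metamodel::ClassHOW })
    .map(*.key)
    .sort;
say @classes;          # [Cat Dog]
```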

Thanks to a meta-object system designed to be both easy to use and transparently hidden from view until needed, there's nothing to make things feel complicated here.2

my $sub-name = '&' ~ ($name ~~ / (<:Lu><:Ll>*)* /).values.map({ .Str.lc }).join("-");

This line makes me happy. The crux of the code lies in the advanced Raku regex syntax. <:Lu> and <:Ll> are built-in character classes in Raku that represent uppercase and lowercase characters, respectively. These are Unicode aware, so no worries there.

The rest is straight-forward: take the .values of the Match object, map them to strings while also lower-casing them and then join the resulting list of strings into a compound string in kebab-case.
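For experimenting with the naming rule in isolation, the same CamelCase to kebab-case mapping can be approximated with .comb. This is an alternative formulation for illustration, not the code from the PR, and `kebab` is a made-up helper name.

```raku
# Split into runs of "one uppercase letter followed by lowercase letters",
# lower-case each run, and join the runs with dashes.
sub kebab(Str $camel) {
    $camel.comb(/ <:Lu> <:Ll>* /).map(*.lc).join('-')
}

say kebab('LevelBar');  # level-bar
say kebab('HBox');      # h-box
say kebab('GDK');       # g-d-k
```

Note that anything that is neither an uppercase nor a lowercase letter (a digit, say) is silently dropped by this version.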

We prepend the & sigil to the name in order to register it as a subroutine when it is brought into the importer's scope.

OUR::{ $sub-name } := sub (|c) { $class.new(|c) };

Here's where the actual mapping takes place. OUR::{ $sub-name } is creating a (name for a) sub dynamically in the shared OUR scope between the importer and the imported modules. The import process is such that these dynamically defined subs become available in the scope of the importing module (its MY scope).

sub (|c) { $class.new(|c) } says create an anonymous subroutine that passes its arguments (which are, of course, an object themselves) exactly as-is to the constructor of $class.
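The capture forwarding can be tried out without GTK at all. In this sketch `Widget` is a made-up class standing in for one of the GTK::Simple classes:

```raku
class Widget {
    has $.text;
    has $.width = 10;
}

# |c slurps whatever was passed (positional and named) into a Capture,
# and |c in the call flattens it back out unchanged.
my &widget = sub (|c) { Widget.new(|c) };

my $w = widget(:text("Shiny slippers"), :width(20));
say $w.text;                          # Shiny slippers
say $w.width;                         # 20
say widget(:text("defaults")).width;  # 10
```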

Not necessarily obvious

Now, I do understand that there are aspects of this code that are not obvious: the dangling :: syntax, the necessity of making a my scoped module with a special name of EXPORT, and perhaps the use of the OUR:: package accessor.

It's probably not code that I will necessarily write directly from memory next time I want to dynamically load a bunch of subs into an importer's scope.

At the same time, I would argue that all of these examples are very idiomatic to the language: little 'escape hatches' into deeper layers of the language that work as expected once you encounter them. All of this without muddying the waters of what's "in" a module by, for instance, providing methods directly on package objects that would provide similar functionality.3

Raku really is a well-thought-out programming language. It's also massive, which can be intimidating. It helps to manage this massivity when pieces fit together in carefully planned ways. In other words, once I had that dangling ::/WHO object to call .kv on, everything else I had to do just fell together.

Testing this

In order to ensure that everything was indeed working in a systematic way, I needed to write some tests.

use GTK::Simple :subs;

# Other modules are pulled into GTK::Simple namespace by now that we do not want to test
sub skip-test($name) {
    state $skip-set = set '&' X~ <app simple raw native-lib g-d-k common property-facade>;
    $name (elem) $skip-set
}

for GTK::Simple::.kv -> $name, $class {
    my $sub-name = '&' ~ ($name ~~ / (<:Lu><:Ll>*)* /).values.map({ .Str.lc }).join("-");
    next if skip-test($sub-name);

    my $widget;
    lives-ok { $widget = ::{$sub-name}(:label("For Button(s)"), :uri("For LinkButton")) },
        "There is a subroutine in scope called '$sub-name'";
    ok $widget ~~ $class, "'$sub-name' returns a { $class.^name } object";
}

Here I created a state variable (only defined once per scope, so we don't re-do the variable assignment with each call) to hold a set of names that we want to skip. If the argument is an element of said set, skip the test.
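A tiny sketch makes the "only once" behaviour visible. The counter `$builds` and the shortened name list are additions for the illustration:

```raku
my $builds = 0;

sub in-skip-set($name) {
    # The initializer runs on the first call only
    state $skip-set = do { $builds++; set '&' X~ <app simple raw> };
    $name (elem) $skip-set
}

say in-skip-set('&app');    # True
say in-skip-set('&label');  # False
say $builds;                # 1
```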

::{$sub-name}

Once again :: shows up to be a Do-What-I-Mean shortcut. This time it is for the package of our current lexical scope. So it means essentially the same thing: let me access the package as if it were a "stash", giving us semantics equivalent to hashes (eg, ::{$sub-name}). A lexical lookup is begun until the subroutine is found in lexical scope. Since the import process implanted it there in lexical scope, it returns with the sub stored with name $sub-name.

We pass two named arguments to the constructor because each is either required by some constructors or ignored by those that don't need it.

Conclusion

That wraps up a fairly lengthy post about a very short snippet of code. Please stay tuned for other posts about changes that were made such that my awesome, funky dice roller could exist in ease and comfort.

Footnotes

  1. When .WHO is called on an initialized object, it returns its respective class object.

  2. Compare that to this Java or even that Java without the Reflections library.

  3. In Ruby, the same would be done with MyModule.constants.select {|c| MyModule.const_get(c).is_a? Class}. It's MyModule.constants and MyModule.const_get that in my opinion show a muddier abstraction between package objects and the meta-objects that govern and represent them.

The 2022 Raku Advent Posts

Published by Elizabeth Mattijsen on 2022-12-26T01:01:00

(in chronological order, with comment references)

Day 25: Rakudo 2022 Review

Published by Elizabeth Mattijsen on 2022-12-25T01:01:00

In a year as eventful as 2022 was in the real world, it is a good idea to look back to see what one might have missed while life was messing with your (Raku) plans.

Rakudo saw about 1500 commits this year, about the same as the year before that. Many of these were bug fixes and performance improvements, which you would normally not notice. But there were also commits that actually added features to the Raku Programming Language. So it feels like a good idea to actually mention those more in depth.

So here goes! Unless otherwise noted, all of these changes are in language level 6.d, and available thanks to several Rakudo compiler releases during 2022.

New REPL functionality

It is now possible to refer to values that were produced earlier, using the $*N syntax, where N is a number greater than or equal to 0.

$ raku
To exit type 'exit' or '^D'
[0] > 42
42
[1] > 666
666
[2] > $*0 + $*1
708

Note that the number before the prompt is the index under which the value about to be produced can later be retrieved.

New MAIN options

You can now affect the interpretation of command line arguments to MAIN by setting these options in the %*SUB-MAIN-OPTS hash:

allow-no

Allow negation of a named argument to be specified as --no-foo instead of --/foo.

numeric-suffix-as-value

Allow specification of a numeric value together with the name of a single letter named argument. So -j2 is the equivalent of --j=2.

So for example, by putting:

my %*SUB-MAIN-OPTS = :allow-no, :numeric-suffix-as-value;

at the top of your script, you would enable these features in the command-line argument parsing.
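As a minimal sketch of how this might look in a complete script (the script name `build.raku` and the parameter names are invented for the example):

```raku
my %*SUB-MAIN-OPTS = :allow-no, :numeric-suffix-as-value;

# :j(:$jobs) accepts the named argument as either -j / --j or --jobs
sub MAIN(Bool :$verbose = True, Int :j(:$jobs) = 1) {
    say "verbose=$verbose jobs=$jobs";
}
```

Run without arguments, this prints the defaults (verbose=True jobs=1). With both options enabled, an invocation such as `raku build.raku --no-verbose -j4` should bind $verbose to False and $jobs to 4.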

New types

Native unsigned integers (both in scalar, as well as a (shaped) array) have finally become first class citizens. This means that a native unsigned integer can now hold the value 18446744073709551615 as the largest positive value, from 9223372036854775807 before. This also allowed for a number of internal optimisations as the check for negative values could be removed. As simple as this sounds, this was quite an undertaking to get support for this on all VM backends.

my uint  $foo = 42;
my uint8 $bar = 255;
my  int8 $baz = 255;

say $foo; # 42
say $bar; # 255
say $baz; # -1

say ++$foo; # 43
say ++$bar; # 0
say ++$baz; # 0

And yes, all of the other explicitly sized types, such as uint16, uint32 and uint64, are now also supported!

New subroutines

A number of subroutines entered the global namespace this year. Please note that they will not interfere with any subroutines in your code with the same name, as these will always take precedence.

NYI()

The NYI subroutine takes a string to indicate a feature not yet implemented, and turns that into a Failure with the X::NYI exception at its core. You could consider this short for ... with feedback, rather than just the “Stub code executed”.

say NYI "Frobnication";
# Frobnication not yet implemented. Sorry.

chown()

The chown subroutine takes zero or more filenames, and changes the UID (with the :uid argument) and/or the GID (with the :gid argument) if possible. Returns the filenames that were successfully changed.

my @files  = ...;
my $uid    = +$*USER;
my @changed = chown @files, :$uid;
say "Converted UID of @changed.elems() / @files.elems() files";

Also available as a method on IO::Path, but then only applicable to a single path.

head(), skip(), tail()

The .head, .skip and .tail methods got their subroutine counterparts.

say head 3, ^10; # (0 1 2)
say skip 3, ^10; # (3 4 5 6 7 8 9)
say tail 3, ^10; # (7 8 9)

Note that the number of elements is always the first positional argument.

New methods

Any.are

The .are method returns the type object that all of the values of the invocant have in common. This can be either a class or a role.

say (1, 42e0, .137).are;        # (Real)
say (1, 42e0, .137, "foo").are; # (Cool)
say (42, DateTime.now).are;     # (Any)

In some languages this functionality appears to be called infer, but this name was deemed to be too ComputerSciency for Raku.

IO::Path.inode|dev|devtype|created|chown

Some low level IO features were added to the IO::Path class, in the form of 5 new methods. Note that they may not actually work on your OS and/or filesystem. Looking at you there, Windows 🙂

(Date|DateTime).days-in-year

The Date and DateTime classes already provide many powerful date and time manipulation features. But a few features were considered missing this year, and so they were added.

A new .days-in-year class method was added to the Date and DateTime classes. It takes a year as positional argument:

say Date.days-in-year(2023);  # 365
say Date.days-in-year(2024);  # 366

This behaviour was also expanded to the .days-in-month method, when called as a class method:

say Date.days-in-month(2023, 2);  # 28
say Date.days-in-month(2024, 2);  # 29

They can also be called as instance methods, in which case the parameters default to the associated values in the object:

given Date.today {
    .say;                # 2022-12-25
    say .days-in-year;   # 365
    say .days-in-month;  # 31
}

New Dynamic Variables

Dynamic variables provide a very powerful way to keep “global” variables. A number of them are provided by the Raku Programming Language. And now there is one more of them!

$*RAT-OVERFLOW

Determine the behaviour of rational numbers (aka Rats) if they run out of precision. More specifically when the denominator no longer fits in a native 64-bit integer. By default, Rats will be downgraded to floating point values (aka Nums). By setting the $*RAT-OVERFLOW dynamic variable, you can influence this behaviour.

The $*RAT-OVERFLOW dynamic variable is expected to contain a class (or an object) on which an UPGRADE-RAT method will be called. This method is expected to take the numerator and denominator as positional arguments, and is expected to return whatever representation one wants for the given arguments.

The following type objects can be specified using core features:

Num

Default. Silently convert to floating point. Sacrifices precision for speed.

CX::Warn

Downgrade to floating point, but issue a warning. Sacrifices precision for speed.

FatRat

Silently upgrade to FatRat, aka rational numbers with arbitrary precision. Sacrifices speed by conserving precision.

Failure

Return an appropriate Failure object, rather than doing a conversion. This will most likely throw an exception unless specifically handled.

Exception

Throw an appropriate exception.

Note that you can introduce any custom behaviour by creating a class with an UPGRADE-RAT method in it, and setting that class in the $*RAT-OVERFLOW dynamic variable.

class Meh {
    method UPGRADE-RAT($num, $denom) is hidden-from-backtrace {
        die "$num / $denom is meh"
    }
}
my $*RAT-OVERFLOW = Meh;
my $a = 1 / 0xffffffffffffffff;
say $a;     # 0.000000000000000000054
say $a / 2; # 1 / 36893488147419103230 is meh

Note that the is hidden-from-backtrace is only added so that any backtrace will show the location of where the offending calculation was done, rather than inside the UPGRADE-RAT method itself.
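For comparison, a short sketch using one of the built-in choices listed above instead of a custom class. Because the variable is dynamic, the setting only applies within the block where it is declared:

```raku
my $a = 1 / 0xffffffffffffffff;    # denominator still fits in 64 bits

{
    my $*RAT-OVERFLOW = FatRat;    # upgrade instead of downgrading
    my $b = $a / 2;
    say $b.^name;                  # FatRat
    say $b.denominator;            # 36893488147419103230
}

# Outside the block the default applies again
say ($a / 2).^name;                # Num
```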

New Environment Variables

Quite a few environment variables are already checked by Rakudo whenever it starts. Two more were added in the past year:

RAKUDO_MAX_THREADS

This environment variable can be set to indicate the maximum number of OS-threads that Rakudo may use for its thread pool. The default is 64, or the number of CPU-cores times 8, whichever is larger. Apart from a numerical value, you can also specify "Inf" or "unlimited" to indicate that Rakudo should use as many OS-threads as it can.

These same values can also be used in a call to ThreadPoolScheduler.new with the :max_threads named argument.

my $*SCHEDULER =
  ThreadPoolScheduler.new(:max_threads<unlimited>);

INSIDE_EMACS

This environment variable can be set to a true value if you do not want the REPL to check for installed modules to handle editing of lines. When set, it will fallback to the behaviour as if none of the supported line editing modules are installed. This appears to be handy for Emacs users, as the name implies 🙂

New experimental features

Some Raku features are not yet cast in stone, so there’s no guarantee that any code written using these experimental features will continue to work in the future. Two new experimental features have been added in the past year:

:will-complain

If you add a use experimental :will-complain to your code, you can customize typecheck errors by specifying a will complain trait. The trait expects a Callable that will be given the offending value in question, and is expected to return a string to be added to the error message. For example:

use experimental :will-complain;
my Int $a will complain { "You cannot use -$_-, dummy!" }
$a = "foo";
# Type check failed in assignment to $a; You cannot use -foo-, dummy!

The will complain trait can be used anywhere you can specify a type constraint in Raku, so that includes parameters and attributes.

:rakuast

The RakuAST classes allow you to dynamically build an AST (Abstract Syntax Tree) programmatically, and have that converted to executable code. This was previously only possible by programmatically creating a piece of Raku source code (with all of its escaping issues), and then calling EVAL on it. But RakuAST not only allows you to build code programmatically (as seen in yesterday’s blog post), it also allows you to introspect the AST, which opens up all sorts of syntax / lintifying possibilities.

There is an associated effort to compile the Raku core itself using a grammar that uses RakuAST to build executable code. This effort is now capable of passing 585/1355 test-files in roast completely, and 83/131 of the Rakudo test-files completely. So still a lot of work to do, although it has now gotten to the point that implementation of a single Raku feature in the new grammar, often creates an avalanche of now passing test-files.

So, if you add a use experimental :rakuast to your code, you will be able to use all of the currently available RakuAST classes to build code programmatically. This is an entire new area of Raku development, which will be covered by many blog posts in the coming year. As of now, there is only some internal documentation.

A small example, showing how to build the expression "foo" ~ "bar":

use experimental :rakuast;

my $left  = RakuAST::StrLiteral.new("foo");
my $infix = RakuAST::Infix.new("~");
my $right = RakuAST::StrLiteral.new("bar");

my $ast = RakuAST::ApplyInfix.new(:$left, :$infix, :$right);
dd $ast;  # "foo" ~ "bar"

This is very verbose, agreed. Syntactic sugar for making this easier will certainly be developed, either in core or in module space.

Note how each element of the expression can be created separately, and then combined together. And that you can call dd to show the associated Raku source code (handy when debugging your ASTs).
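As a rough sketch of that composability, a finished node can itself become an operand of a larger expression. This assumes a Rakudo where EVAL accepts a RakuAST node directly and where the constructor signatures shown above are unchanged; RakuAST is experimental, so both may differ on your version.

```raku
use experimental :rakuast;
use MONKEY-SEE-NO-EVAL;

# Build "foo" ~ "bar" exactly as in the example above
my $inner = RakuAST::ApplyInfix.new(
  left  => RakuAST::StrLiteral.new("foo"),
  infix => RakuAST::Infix.new("~"),
  right => RakuAST::StrLiteral.new("bar"),
);

# Reuse the finished node as the left operand of a larger expression
my $outer = RakuAST::ApplyInfix.new(
  left  => $inner,
  infix => RakuAST::Infix.new("~"),
  right => RakuAST::StrLiteral.new("baz"),
);

# Compile and run the AST; no source string is ever assembled
my $result = EVAL $outer;
say $result;
```

If everything lines up, this should print the three strings concatenated.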

For the very curious, you can check out a proof-of-concept of the use of RakuAST classes in the Rakudo core in the Formatter class, that builds executable code out of an sprintf format.

New arguments to existing functionality

roundrobin(…, :slip)

The roundrobin subroutine now also accepts a :slip named argument. When specified, it will produce all values as a single, flattened list.

say roundrobin (1,2,3), <a b c>;        # ((1 a) (2 b) (3 c))
say roundrobin (1,2,3), <a b c>, :slip; # (1 a 2 b 3 c)

This is functionally equivalent to:

say roundrobin((1,2,3), <a b c>).map: *.Slip;

but many times more efficient.

Cool.chomp($needle)

The .chomp method by default removes any logical newline from the end of a string. It is now possible to specify a specific needle as a positional argument: only when that is equal to the end of the string, will it be removed.

say "foobar".chomp("foo"); # foobar
say "foobar".chomp("bar"); # foo

It actually works on all Cool values, but the return value will always be a string:

say 427.chomp(7); # 42

DateTime.posix

A DateTime value has better than millisecond precision. Yet, the .posix method always returned an integer value. Now it can also return a Num with the fractional part of the second by specifying the :real named argument.

given DateTime.now {
    say .posix;        # 1671733988
    say .posix(:real); # 1671733988.4723697
}

Additional meaning to existing arguments

Day from end of month

The day parameter to Date.new and DateTime.new (whether named or positional) can now be specified as either a Whatever to indicate the last day of the month, or as a Callable indicating number of days from the end of the month.

say Date.new(2022,12,*);   # 2022-12-31
say Date.new(2022,12,*-6); # 2022-12-25

Additions in v6.e.PREVIEW

You can already access new v6.e language features by specifying use v6.e.PREVIEW at the top of your compilation unit. Several additions were made the past year!

term nano

The nano term is now available. It returns the number of nanoseconds since midnight UTC on 1 January 1970. It is similar to the time term but with one billion times the resolution. It is intended for very accurate timekeeping / logging.

use v6.e.PREVIEW;
say time; # 1671801948
say nano; # 1671801948827918628

With current 64-bit native unsigned integer precision, this should be enough for roughly another 530 years 🙂 (2⁶⁴ nanoseconds is about 584 years, counted from 1970.)

prefix //

You can now use // as a prefix as well as an infix. It will return whatever the .defined method returns on the given argument.

use v6.e.PREVIEW;
my $foo;
say //$foo; # False
$foo = 42;
say //$foo; # True

Basically //$foo is syntactic sugar for $foo.defined.

snip() and Any.snip

The new snip subroutine and method allow one to cut up a list into sublists according to the given specification. The specification consists of one or more smartmatch targets. Each value of the list will be smartmatched with the given target: as soon as that returns False, all the values before it will be produced as a List.

use v6.e.PREVIEW;
say (2,5,13,9,6,20).snip(* < 10);
# ((2 5) (13 9 6 20))

Multiple targets can also be specified.

say (2,5,13,9,6,20).snip(* < 10, * < 20);
# ((2 5) (13 9 6) (20))

The argument can also be an Iterable. To split a list consisting of integers and strings into sublists of just integers and just strings, you can do:

say (2,"a","b",5,8,"c").snip(|(Int,Str) xx *);
# ((2) (a b) (5 8) (c))

Inspired by Haskell’s span function.

Any.snitch

The new .snitch method is a debugging tool that will show its invocant with note by default, and return the invocant. So you can insert a .snitch in a sequence of method calls and see what’s happening “half-way” as it were.

$ raku -e 'use v6.e.PREVIEW;\
say (^10).snitch.map(* + 1).snitch.map(* * 2)'
^10
(1 2 3 4 5 6 7 8 9 10)
(2 4 6 8 10 12 14 16 18 20)

You can also insert your own “reporter” in there: the .snitch method takes a Callable. An easy example of this, is using dd for snitching:

$ raku -e 'use v6.e.PREVIEW;\
say (^10).snitch(&dd).map(*+1).snitch(&dd).map(* * 2)'
^10
(1, 2, 3, 4, 5, 6, 7, 8, 9, 10).Seq
(2 4 6 8 10 12 14 16 18 20)

Any.skip(produce,skip,…)

You can now specify more than one argument to the .skip method. Before, you could only specify a single (optional) argument.

my @a = <a b c d e f g h i j>;
say @a.skip;       # (b c d e f g h i j)
say @a.skip(3);    # (d e f g h i j)
say @a.skip(*-3);  # (h i j)

On v6.e.PREVIEW, you can now specify any number of arguments in the order: produce, skip, produce, etc. Some examples:

use v6.e.PREVIEW;
my @a = <a b c d e f g h i j>;
# produce 2, skip 5, produce rest
say @a.skip(2, 5);        # (a b h i j)
# produce 0, skip 3, then produce 2, skip rest
say @a.skip(0, 3, 2);     # (d e)
# same, but be explicit about skipping rest
say @a.skip(0, 3, 2, *);  # (d e)

In fact, any Iterable can now be specified as the argument to .skip.

my @b = 3,5;
# produce 3, skip 5, then produce rest
say @a.skip(@b);           # (a b c i j)
# produce 1, then skip 2, repeatedly until the end
say @a.skip(|(1,2) xx *);  # (a d g j)

Cool.comb(Pair)

On v6.e.PREVIEW, the .comb method will also accept a Pair as an argument to give it .rotor-like capabilities. For instance, to produce trigrams of a string, one can now do:

use v6.e.PREVIEW;
say "foobar".comb(3 => -2);  # (foo oob oba bar)

This is the functional equivalent of "foobar".comb.rotor(3 => -2)>>.join, but about 10x as fast.

Changed semantics on Int.roll|pick

To pick a number from 0 till N-1, one no longer has to specify a range, but can use just the integer value as the invocant:

use v6.e.PREVIEW;
say (^10).roll;     # 5
say 10.roll;        # 7
say (^10).pick(*);  # (2 0 6 9 4 1 5 7 8 3)
say 10.pick(*);     # (4 6 1 0 2 9 8 3 5 7)

Of course, all of these values are examples, as each run will, most likely, produce different results.

More interesting stuff

There were some more new things and changes the past year. I’ll just mention them very succinctly here:

New methods on CompUnit::Repository::Staging

.deploy, .remove-artifacts, and .self-destruct.

:!precompile flag on CompUnit::Repository::Installation.install

Install module but precompile on first loading rather than at installation.

New methods on Label

.file and .line where the Label was created.

.Failure coercer

Convert a Cool object or an Exception to a Failure. Mainly intended to reduce binary size of hot paths that do some error checking.

Cool.Order coercer

Coerce the given value to an Int, then convert to Less if less than 0, to Same if 0, and More if more than 0.
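A quick sketch of what that coercion looks like in practice, assuming a Rakudo release from the past year that ships the .Order coercer:

```raku
# Positive, zero and negative values map onto the three Order values
say 42.Order;     # More
say 0.Order;      # Same
say (-3).Order;   # Less
```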

Allow semi-colon

Now allow for the semi-colon in my :($a,$b) = 42,666 because the left-hand side is really a Signature rather than a List.

Summary

I guess we’ve seen one big change in the past year, namely having experimental support for RakuAST become available. And many smaller goodies and tweaks and features.

Now that RakuAST has become “mainstream” as it were, we can think of having certain optimizations. Such as making sprintf with a fixed format string about 30x as fast! Exciting times ahead!

Hopefully you will all be able to enjoy the Holiday Season with sufficient R&R. The next Raku Advent Blog is only 340 days away!

Day 24: He’s making a list… (part 2)

Published by Matthew Stephen Stuckwisch on 2022-12-24T00:00:00

In our last edition, we learned about some of the work that Santa’s elves put into automating how they make their lists. What you probably didn’t know is that the elves stay on top of the latest and greatest technology. Being well-known avid Raku programmers, the elves were excited to hear about RakuAST and decided to see how they might be able to use it. One of the elves decided to rework the list formatting code to use RakuAST. What follows is the story of how she upgraded their current technology to use RakuAST.

Background

The current code that the elves had is fairly straightforward (check out part one for a full explanation):

sub format-list(
  +@items,
  :$language = 'en',
  :$type = 'and',
  :$length = 'standard'
) {
    state %formatters;
    my $code = "$language/$type/$length";

    # Get a formatter, generate (and cache) it if it's not been requested before
    my &formatter = %formatters{$code} //=
      generate-list-formatter($language, $type, $length);
 	 
    formatter @items;
}
 	 
sub generate-list-formatter($language, $type, $length --> Sub ) {
    # Get CLDR information
    my $format = cldr{$language}.list-format{$type}{$length};
    my ($start, $middle, $end, $two) =
      $format<start middle end two>.map: *.substr(3,*-3).raku;
 	 
    # Generate code
    my $code = q:s:to/FORMATCODE/;
        sub format-list(+@items) {
            if @items > 2 {
                @items[0]
                  ~ $start
                  ~ @items[1..*-2].join($middle)
                  ~ $end
                  ~ @items[*-1]
            }
            elsif @items == 2 {
                @items[0] ~ $two ~ @items[1]
            }
            elsif @items == 1 {
                @items[0]
            }
            else {
                ''
            }
        }
    FORMATCODE
 	 
    # compile and return
    use MONKEY-SEE-NO-EVAL;
    EVAL $code
}

While the caching technique is rudimentary and technically not thread-safe, it works (a different elf will probably revisit the code to make it so). Now, when creating all the lists for, say, children in Georgia, the data for Georgian list formatters in CLDR will only need to be accessed a single time. For the next half a million or so calls, the code will be run practically as fast as if it had been hard coded (since, in effect, it has been).
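For the curious, here is one way that other elf might make the cache thread-safe. This is only a sketch: the names cached-formatter and make-formatter are made up for illustration (make-formatter stands in for generate-list-formatter), and a simple Lock is just one of several possible approaches.

```raku
my $lock = Lock.new;
my %cache;

# Return the cached value for $key, calling &generate only on a miss.
sub cached-formatter(Str $key, &generate) {
    # Lock.protect runs the block while holding the lock, so only one
    # thread at a time can inspect or fill the cache.
    $lock.protect: { %cache{$key} //= generate() }
}

# Hypothetical stand-in for generate-list-formatter; counts how often it runs.
my $generated = 0;
sub make-formatter() {
    $generated++;
    sub (+@items) { @items.join(', ') }
}

# Request the same formatter from several threads at once.
my @promises = do for ^8 {
    start cached-formatter('en/and/standard', &make-formatter)
}
my @formatters = await @promises;

say $generated;                 # 1
say @formatters[0]('a', 'b');   # a, b
```

Because the generator is only ever called while the lock is held, the formatter is built once no matter how many threads ask for it.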

The problem is how the generate-list-formatter code works. The code block uses a heredoc-style :to string, but it’s interpolated. There are numerous ways to accomplish this but all of them require having to use proper escapes. That’s… risky.
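To make the risk concrete, here is a small sketch with a made-up infix that happens to contain quote characters. Wrapping it in quotes by hand produces code that no longer compiles, while .raku (which the original code relies on) emits a properly escaped literal:

```raku
use MONKEY-SEE-NO-EVAL;

# A hypothetical infix that contains quote characters
my $infix = q{ 'n' };

# Naive: wrap the text in quotes by hand. The embedded ' ends the literal early.
my $naive = "'" ~ $infix ~ "'";
# Safer: .raku emits a literal with every special character escaped.
my $escaped = $infix.raku;

my $naive-ok   = (try EVAL $naive).defined;
my $escaped-ok = EVAL($escaped) eq $infix;
say "naive compiles: $naive-ok";          # naive compiles: False
say "escaped round-trips: $escaped-ok";   # escaped round-trips: True
```

Forget the .raku in just one place and the generated code breaks, or worse, does something unintended.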

Another elf, having seen the performance improvements that this new EVAL code brought, wanted to find a way to avoid the risky string evaluation. She had heard about the new RakuAST and decided to give it a whirl. While it initially looked more daunting, she quickly realized that RakuAST was very powerful.

What is RakuAST

RakuAST is an object-based representation of Raku’s abstract syntax tree, or roughly what you might get if you parsed Raku’s code into its individual elements. For instance, a string literal might be represented as 'foo' in code, but once parsed, becomes a string literal. That string literal, by the way, can be created by using RakuAST::StrLiteral.new(…). Remember how the elf had to worry about how the string might be interpolated? By creating the string literal directly via a RakuAST node, that whole process is safely bypassed. No RakuAST::StrLiteral node can be created that will result in a string injection!
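A minimal sketch of that guarantee, assuming a Rakudo where EVAL accepts a RakuAST node directly: even a string that looks like an injection attempt comes back out as plain data.

```raku
use experimental :rakuast;
use MONKEY-SEE-NO-EVAL;

# Text that would wreak havoc if pasted between quotes in generated source
my $hostile = q{'; say "injected"; '};

# The node stores the value itself, so there is nothing to escape
my $node  = RakuAST::StrLiteral.new($hostile);
my $value = EVAL $node;

say $value eq $hostile;
```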

Every single construct in the Raku language has an associated RakuAST node. When creating nodes, you might frequently pass in another node, which means you can build up code objects in a piece-by-piece fashion, and again, without ever worrying about string interpolation, escaping, or injection attacks.

So let’s see how the elf eventually created the safer RakuAST version of the formatter method.

The elf works her AST off

To ease her transition into RakuAST, the elf decided to go from the simplest to the most complex part of the code. The simplest is the value for the final else block:

my $none = RakuAST::StrLiteral.new(''); 

Okay. That was easy. Now she wanted to tackle the single element value. In the original code, that was @list.head. Although we don’t normally think of it as such, . is a special postfix operator for method calling. Operators are applied by creating a RakuAST::Apply___fix node, where ___ is the type of operator. Depending on the node, there are different arguments. In the case of RakuAST::ApplyPostfix, the arguments are operand (the list), and postfix, which is the actual operator. These aren’t as simple as typing in some plain text, but when looking at the code the elf came up with, it’s quite clear what’s going on:

my $operand = RakuAST::Var::Lexical.new('@list');
my $postfix = RakuAST::Call::Method.new(
  name => RakuAST::Name.from-identifier('head')
);
my $one = RakuAST::ApplyPostfix.new(:$operand, :$postfix);

The operand isn’t a literal, but a variable. Specifically, it’s a lexical variable, so we create a node that will reference it. The call method operator needs a name as well, so we do that as well.

This involves a lot of assignment statements. Sometimes that can be helpful, but for something this simple, the elf decided it was easier to write it as one “line”:

my $one = RakuAST::ApplyPostfix.new(
  operand => RakuAST::Var::Lexical.new('@list'),
  postfix => RakuAST::Call::Method.new(
    name => RakuAST::Name.from-identifier('head')
  )
);

Alright, so the first two cases are done. How might she create the result for when the list has two items? Almost exactly like the last time, except now she’d provide an argument. While you might think it would be as simple as adding args => RakuAST::StrLiteral.new($two-infix), it’s actually a tiny bit more complicated because in Raku, argument lists are handled somewhat specially, so we actually need a RakuAST::ArgList node. So the equivalent of @list.join($two-infix) is

my $two = RakuAST::ApplyPostfix.new(
  operand => RakuAST::Var::Lexical.new('@list'),
  postfix => RakuAST::Call::Method.new(
    name => RakuAST::Name.from-identifier('join'),
    args => RakuAST::ArgList.new(
      RakuAST::StrLiteral.new($two-infix)
    )
  )
); 	 

The RakuAST::ArgList takes in a list of arguments — be they positional or named (named applied by way of a RakuAST::FatComma).

Finally, the elf decided to tackle what likely would be the most complicated bit: the code for 3 or more items. This code makes multiple method calls (including a chained one), as well as combining everything with a chained infix operator.

The method calls were fairly straightforward, but she thought about how the multiple ~ operators would be handled. As it turns out, it would actually require being set up as if (($a ~ $b) ~ $c) ~ $d, etc., and the elf didn’t really like the idea of ultimately having to indent her code that much. She also thought about just using join on a list that she could make, but she already knew how to do method calls, so she thought she’d try something cool: reduction operators (think [~] $a, $b, $c, $d for the previous). This uses the RakuAST::Term::Reduce node that takes a simple list of arguments. For the * - 2 syntax, to avoid getting too crazy, she treated it as if it had been written as the functionally identical @list - 2.

Because that reduction bit has so many elements, she ended up breaking things into pieces: the initial item, the special first infix, a merged set of the second to penultimate items joined with the common infix, the special final infix, and the final item. For a list like [1,2,3,4,5] in English, that amounts to 1 (initial item), , (first infix), 2, 3, 4 (second to penultimate, joined with , ), , and (final infix) and 5 (final item). In other languages, the first and repeated infixes may be different, and in others, all three may be identical.
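Before diving into the nodes, it may help to see the same five pieces in plain Raku. The infix values here are hard-coded English stand-ins for what would really come from CLDR:

```raku
my @list = 1, 2, 3, 4, 5;

# Stand-ins for the CLDR-provided infixes (English values, hard-coded here)
my ($first-infix, $middle-infix, $final-infix) = ', ', ', ', ', and ';

# Inside a subscript, * - 2 and @list - 2 select the same elements
say @list[1 .. * - 2];       # (2 3 4)
say @list[1 .. @list - 2];   # (2 3 4)

# The five pieces, glued together with the [~] reduction
my $text = [~] @list.head,
               $first-infix,
               @list[1 .. @list - 2].join($middle-infix),
               $final-infix,
               @list.tail;
say $text;                   # 1, 2, 3, 4, and 5
```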

# @list.head
my $more-first-item = RakuAST::ApplyPostfix.new(
  operand => RakuAST::Var::Lexical.new('@list'),
  postfix => RakuAST::Call::Method.new(
    name => RakuAST::Name.from-identifier('head')
  )
);

# @list[1 .. * - 2].join($more-middle-infix)
my $more-mid-items = RakuAST::ApplyPostfix.new(
  # @list[1 .. @list - 2]
  operand => RakuAST::ApplyPostfix.new(
    operand => RakuAST::Var::Lexical.new('@list'),
    postfix => RakuAST::Postcircumfix::ArrayIndex.new(
      # (1 .. @list - 2)
      RakuAST::SemiList.new(
        RakuAST::ApplyInfix.new(
          left => RakuAST::IntLiteral.new(1),
          infix => RakuAST::Infix.new('..'),
          # @list - 2
          right => RakuAST::ApplyInfix.new(
            left => RakuAST::Var::Lexical.new('@list'),
            infix => RakuAST::Infix.new('-'),
            right => RakuAST::IntLiteral.new(2)
          )
        )
      )
    )
  ),
  # .join($more-middle-infix)
  postfix => RakuAST::Call::Method.new(
    name => RakuAST::Name.from-identifier('join'),
    args => RakuAST::ArgList.new(
      RakuAST::StrLiteral.new($more-middle-infix)
    )
  )
);
 
# @list.tail
my $more-final-item = RakuAST::ApplyPostfix.new(
  operand => RakuAST::Var::Lexical.new('@list'),
  postfix => RakuAST::Call::Method.new(
    name => RakuAST::Name.from-identifier('tail')
  )
);
 	 
# [~] ...
my $more = RakuAST::Term::Reduce.new(
  infix => RakuAST::Infix.new('~'),
  args => RakuAST::ArgList.new(
    $more-first-item,
    RakuAST::StrLiteral.new($more-first-infix),
    $more-mid-items,
    RakuAST::StrLiteral.new($more-final-infix),
    $more-final-item,
  )
);

As one can note, as RakuAST code starts getting more complex, it can be extremely helpful to store interim pieces into variables. For complex programs, some RakuAST users will create functions that do some of the verbose stuff for them. For instance, one might get tired of the code for an infix, and write a sub like

sub rast-infix($left, $infix, $right) {
    RakuAST::ApplyInfix.new:
      left => $left,
      infix => RakuAST::Infix.new($infix),
      right => $right
}

to enable code like rast-infix($value, '+', $value) which ends up being much less bulky. Depending on what they’re doing, they might make a sub just for adding two values, or maybe making a list more compactly.
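Here is a short usage sketch of that helper, building and running 6 * 7. As before, it assumes a Rakudo where EVAL accepts a RakuAST node directly; the variable names are only for illustration.

```raku
use experimental :rakuast;
use MONKEY-SEE-NO-EVAL;

# The helper from above: takes two operand nodes and the operator as a string
sub rast-infix($left, $infix, $right) {
    RakuAST::ApplyInfix.new:
      left  => $left,
      infix => RakuAST::Infix.new($infix),
      right => $right
}

my $six   = RakuAST::IntLiteral.new(6);
my $seven = RakuAST::IntLiteral.new(7);

# One short call instead of a multi-line ApplyInfix constructor
my $product = EVAL rast-infix($six, '*', $seven);
say $product;
```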

In any case, the hard working elf had now programmatically defined all of the formatter code. All that was left was for her to piece together the number logic and she’d be done. That logic was, in practice, quite simple:

if @list > 2 { $more }
elsif @list == 2 { $two }
elsif @list == 1 { $one }
else { $none } 

In practice, there was still a bit of a learning curve. Why? As it turns out, the [els]if statements are actually officially expressions, and need to be wrapped up in an expression block. That’s easy enough, she could just use RakuAST::Statement::Expression. Her conditions end up being coded as

# @list > 2
my $more-than-two = RakuAST::Statement::Expression.new(
  expression => RakuAST::ApplyInfix.new(
    left => RakuAST::Var::Lexical.new('@list'),
    infix => RakuAST::Infix.new('>'),
    right => RakuAST::IntLiteral.new(2)
  )
);
 	 
# @list == 2
my $exactly-two = RakuAST::Statement::Expression.new(
  expression => RakuAST::ApplyInfix.new(
    left => RakuAST::Var::Lexical.new('@list'),
    infix => RakuAST::Infix.new('=='),
    right => RakuAST::IntLiteral.new(2)
  )
);
 	 
# @list == 1
my $exactly-one = RakuAST::Statement::Expression.new(
  expression => RakuAST::ApplyInfix.new(
    left => RakuAST::Var::Lexical.new('@list'),
    infix => RakuAST::Infix.new('=='),
    right => RakuAST::IntLiteral.new(1)
  )
);	 

That was simple enough. But now she realized that the then statements were not just the simple code she had made, but were actually a sort of block! She would need to wrap them with a RakuAST::Block. A block has a required RakuAST::Blockoid element, which in turn has a required RakuAST::StatementList element, and this in turn will contain a list of statements, the simplest of which is a RakuAST::Statement::Expression that she had already seen. She decided to try out the technique of writing a helper sub to do this:

sub wrap-in-block($expression) {
    RakuAST::Block.new(
      body => RakuAST::Blockoid.new(
        RakuAST::StatementList.new(
          RakuAST::Statement::Expression.new(:$expression)
        )
      )
    )
}
 	 
$more = wrap-in-block $more;
$two  = wrap-in-block $two;
$one  = wrap-in-block $one;
$none = wrap-in-block $none; 

Phew, that was a pretty easy way to handle some otherwise very verbose coding. Who knew Raku hid away so much complex stuff in such simple syntax?! Now that she had both the if and then statements finished, she was ready to finish the full conditional:

my $if = RakuAST::Statement::If.new(
  condition => $more-than-two,
  then => $more,
  elsifs => [
    RakuAST::Statement::Elsif.new(
      condition => $exactly-two,
      then => $two
    ),
    RakuAST::Statement::Elsif.new(
      condition => $exactly-one,
      then => $one
    )
  ],
  else => $none
); 

All that was left was for her to wrap it up into a Routine and she’d be ready to go! She decided to put it into a PointyBlock, since that’s a sort of anonymous function that still takes arguments. Her fully-wrapped code block ended up as:

my $code = RakuAST::PointyBlock.new(
  signature => RakuAST::Signature.new(
    parameters => (
      RakuAST::Parameter.new(
        target => RakuAST::ParameterTarget::Var.new('@list'),
        slurpy => RakuAST::Parameter::Slurpy::SingleArgument
      ),
    ),
  ),
  body => RakuAST::Blockoid.new(
    RakuAST::StatementList.new(
      RakuAST::Statement::Expression.new(
        expression => $if
      )
    )
  )
); 

Working with RakuAST, she really got a feel for how things worked internally in Raku. It was easy to see that a runnable code block like a pointy block consisted of a signature and a body. That signature had a list of parameters, and the body a list of statements. Seems obvious, but it can be enlightening to see it spread out like she had it.

The final step was for her to actually evaluate this (now much safer!) code. For that, nothing changed. In fact, the entire rest of her block was simply

sub generate-list-formatter($language, $type, $length) {
    use Intl::CLDR;
    my $pattern = cldr{$language}.list-patterns{$type}{$length};
    my $two-infix = $pattern.two.substr: 3, *-3;
    my $more-first-infix = $pattern.start.substr: 3, *-3;
    my $more-middle-infix = $pattern.middle.substr: 3, *-3;
    my $more-final-infix = $pattern.end.substr: 3, *-3;
 	 
    ...
 	 
    use MONKEY-SEE-NO-EVAL;
    EVAL $code
}

Was her code necessarily faster than the older method? Not necessarily. It didn’t require a parse phase, which probably saved a bit, but once compiled, the speed would be the same.

So why would she bother doing all this extra work when some string manipulation could have produced the same result? A number of reasons. To begin, she learned the innards of RakuAST, which helped her learn the innards of Raku a bit better. But for us non-elf programmers, RakuAST is important for many other reasons. For instance, at every stage of this process, everything was fully introspectable! If your mind jumped to writing optimizers, besides being a coding masochist, you’ve actually thought about something that will likely come about.

Macros are another big feature that’s coming in Raku and will rely heavily on RakuAST. Rather than just do text replacement in the code like macros in many other languages, macros will run off of RakuAST nodes. This means an errant quote will never cause problems, and will likely enable far more complex macro development. DSL developers can seamlessly integrate with Raku by just compiling down to RakuAST.

The future

So what is the status of RakuAST? When can you use it? As of today, you will need to build the most recent main branch of Rakudo to use it. Then, in your code, include the statement use experimental :rakuast;. Yours truly will be updating a number of his formatting modules to use RakuAST very shortly which will make them far more maintainable and thus easier to add new features. For more updates on the progress of RakuAST, check out the Rakudo Weekly, where Elizabeth Mattijsen gives regular updates on RakuAST and all things Raku.

Day 23: Sigils followup: semantics and language design

Published by codesections on 2022-12-23T00:01:00

Until a few days ago, I’d intended for this post to be an update on the Raku persistent data structures I’m developing. And I have included a (very brief) status update at the end of this post. But something more pressing has come to my attention: Someone on the Internet was wrong — and that someone was me.

(Image: xkcd 386)

Specifically, in my post about sigils the other day, I significantly misdescribed the semantics that Raku applies to sigiled-variables.

Considering that the post was about sigils, the final third focused on Raku’s sigils, and much of that section discussed the semantics of those sigils – being wrong about the semantics of Raku’s sigils isn’t exactly a trivial mistake. Oops!

In partial mitigation, I’ll mention one thing: no one pointed out my incorrect description of the relevant semantics, even though the post generated over two hundred comments of discussion, most of it thoughtful. Now, it could be no one read all the way to Part 3 of a 7,000 word post (an understandable choice!). But, considering the well-known popularity of correcting people on the Internet, I view the lack of any correction as some evidence that my misunderstanding wasn’t obvious to others either. In fact, I only discovered the issue when I decided, while replying to a comment on that post, to write an oddly-designed Raku class to illustrate the semantics I’d described; much to my surprise, it showed that I’d gotten those semantics wrong.

Clearly, that calls for a followup post, which you’re now reading.

My goal for this post is, first of all, to explain what I got wrong about Raku’s semantics, how I made that error, and why neither I nor anyone else noticed. Then we’ll turn to some broader lessons about language design, both in Raku and in programming languages generally. Finally, with the benefit of a correct understanding of Raku’s semantics, we’ll reevaluate Raku’s sigils, and the expressive power they provide.

What I got wrong – and what I got right

In that post, I said that the @ sigil can only be used for types that implement the Positional (“array-like”) role; that the % sigil can only be used for types that implement the Associative (“hash-like”) role; and that the & sigil can only be used for types that implement the Callable (“function-like”) role. All of that is right (and pretty much straight from the language docs).

Where I went wrong was when I described the requirements that a type must satisfy in order to implement those roles. I described the Positional role as requiring an iterable, ordered collection that can be indexed positionally (e.g., with @foo[5]); I described the Associative role as requiring an iterable, unordered collection of Pairs that can be indexed associatively (e.g., with %foo<key>); and I described the Callable role as requiring a type to support being called as a function (e.g., with &foo()).

That, however, was an overstatement. The requirements for implementing those three roles are actually: absolutely nothing. That’s right, they’re entirely “marker roles”, the Raku equivalent of Rust’s marker traits.

Oh sure, the Raku docs provide lists of methods that you should implement, but those are just suggestions. There’s absolutely nothing stopping us from writing classes that are Associative, Positional, or Callable, or – why not? – all three if we want to. Or, for that matter, since Raku supports runtime composition, the following is perfectly valid:

  my @pos := 'foo' but Positional;
  my %asc := 90000 but Associative;
  my &cal := False but Callable;

Yep, we can have a Positional string, an Associative number, and a Callable Boolean.
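To see just how little the role demands, here is a small sketch (the class name NotReallyAList is made up for illustration). The class implements none of the suggested methods, yet it binds to an @-sigiled variable without complaint:

```raku
# A class that claims to be Positional but implements none of the usual methods
class NotReallyAList does Positional { }

# Binding succeeds: the only check is "does it do Positional?"
my @odd := NotReallyAList.new;
say @odd ~~ Positional;   # True

# And a string with the role mixed in is still very much a string
my @pos := 'foo' but Positional;
say @pos ~~ Str;          # True
say @pos.uc;              # FOO
```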

How did we miss that?

So, here’s the thing: I’ve written quite a bit of Raku code while operating under the mistaken belief that those roles had the type constraints I described – which are quite a bit stricter than “none at all”. And I don’t think I’m alone in that; in fact, the most frequent comment I got on the previous post was surprise/confusion that @ and % weren’t constrained to concrete Arrays and Hashes (a sentiment I’ve heard before). And I don’t think any of us were crazy to think those sorts of things – when you first start out in Raku, the vast majority (maybe all) of the @- and %-sigiled things you see are Arrays and Hashes. And I don’t believe I’ve ever seen an @-sigiled variable in Raku that wasn’t an ordered collection of some sort. So maybe people thinking that the type constraints are stricter makes a certain amount of sense.

But that, in turn, just raises two more questions: First, given the unconstrained nature of those sigils, why haven’t I seen some Positional strings in the wild? After all, relying on programmer discipline instead of tool-enforcement is usually a recipe for quick and painful disaster. And, second, given that @- and %-sigiled variables are so reliably array-like and hash-like in practice, why doesn’t Raku enforce a tighter type constraint?

Good defaults > programmer discipline

Let’s address those questions in order: Why haven’t I seen @-sigiled strings or %-sigiled numbers? Because Raku isn’t relying on programmer discipline to prevent those things; it’s relying on programmer laziness – a much stronger force. Writing my @pos := 'foo' but Positional seems very easy, but it has three different elements that would dissuade a new Rakoon from writing it: the := bind operator (most programmers are more familiar with assignment, and = is overwhelmingly more common in Raku code examples); the but operator (runtime composition is relatively uncommon in the wider programming world, and it’s not a tool Raku code turns to all that often) and Positional (roles in general aren’t really a Raku 101 topic, and Positional/Associative/Callable even less so – after all, all the built-in types that should implement those roles already do so).

Let’s contrast that line with the version that a new Rakoon would be more likely to write – indeed, the version that every Rakoon must have written over and over: my @pos = 'foo'. That removes all three of the syntactic stumbling blocks from the preceding code. More importantly, it works. Because the @-sigil provides a default Array container, that line creates the Array ['foo'] – which is much more likely to be what the user wanted in the first place.
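The difference between the two lines is easy to check for yourself. A small sketch:

```raku
# Assignment: the value goes into the default Array container
my @pos = 'foo';
say @pos.^name;    # Array
say @pos.raku;     # ["foo"]
say @pos.elems;    # 1

# Binding: the variable *is* the string, no Array involved at all
my @bound := 'foo' but Positional;
say @bound ~~ Array;   # False
say @bound ~~ Str;     # True
```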

Of course, that’s just one example, but the general pattern holds: Raku very rarely prohibits users from doing something (even something as bone-headed as a Positional string) but it’s simultaneously good at making the default/easiest path one that avoids those issues. If there’s an easy-but-less-rigorous option available, then no amount of “programmer discipline” will prevent everyone from taking it. But when the safer/saner thing is also by far the easier thing, then we’re not relying on programmer discipline. We’re removing the temptation entirely.

And then by the time someone has written enough Raku that :=, but, and Positional wouldn’t give them any pause, they probably have the “@ means array-like, but maybe not an Array” concept so deeply ingrained that they wouldn’t consider creating a wacky Positional string.

Being stricter

What about the second question we posed earlier: Why doesn’t Raku enforce a tighter type constraint? It certainly could: Raku has the language machinery to really tighten down the requirements for a role. It would be straightforward to mandate that any type implementing the Positional role must also implement the methods for positional indexing. And, since Raku already has an Iterable role, requiring Positional types to be iterable would also be trivial. So why not?
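As a sketch of what that machinery looks like, here is a hypothetical stricter role (StrictPositional is an invented name, not anything in core) that uses stubbed methods to force composing classes to implement positional indexing:

```raku
use MONKEY-SEE-NO-EVAL;

# Hypothetical stricter role: stubbed methods ({ ... }) must be supplied
# by any class that composes it.
role StrictPositional {
    method AT-POS($index) { ... }
    method elems()        { ... }
}

# A class that plays by the rules
class TwoThings does StrictPositional {
    has $.a;
    has $.b;
    method AT-POS($index) { ($!a, $!b)[$index] }
    method elems()        { 2 }
}

say TwoThings.new(a => 'x', b => 'y')[1];   # y

# A class that skips the required methods is rejected at compile time;
# EVAL is used here only so the failure can be caught and shown.
my $compiled = (try EVAL 'class Slacker does StrictPositional { }; True') // False;
say $compiled;                              # False
```

So the tool is there; core Positional simply chooses not to use it.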

Well, because – even if the vast majority of Positional types should allow indexing and should be iterable, there will be some that have good reasons not to be. And Raku could turn the “why not?” question around and ask “why?”

Providing guarantees versus communicating intent

All of this brings a question into focus – a question that goes right to the heart of Raku’s design philosophy and is an important one for any language designer to consider.

That question is: Is your language more interested in providing guarantees or in communicating intent?

Guarantees are great

When I’m not writing Raku (or long blog posts), the programming language I spend the most time with is Rust. And Rust is very firmly on the providing guarantees side of that issue. And it’s genuinely great. There’s something just absolutely incredible and freeing about having the Rust compiler and a strong static type system at your back, of knowing that you just absolutely, 100% don’t need to worry about certain categories of bugs or errors. With that guarantee, you can drop those considerations from your mental cache altogether (you know, to free up space for the things that are cognitively complex in Rust – which isn’t a tiny list). So, yes, I saw the appeal when primarily writing Rust and I see it again every time I return to the language.

Indeed, I think Rust’s guarantees are 100% the right choice – for Rust. I believe that the strength of those guarantees was a great fit for Rust’s original use case (working on Firefox) and are a huge part of why Facebook, Microsoft, Amazon, and Google have all embraced Rust: when you’re collaborating on a team with the scope of a huge open-source project or a big tech company, guarantees become even more valuable. When some people leave, new ones join, and there’s no longer a way to get everyone on the same page, it’s great to have a language that says “you don’t have to trust their code, just trust me”.

But the thing about guarantees is that they have to be absolute. If something is “90% guaranteed”, then it’s not.

Coding as a collaborative, asynchronous communication

Guarantees-versus-communication is one trade-off where Raku makes the other choice, in a big way. Raku is vastly more interested in helping programmers to communicate their intent than in enforcing rules strictly enough to make guarantees. If Rust’s fundamental metaphor for code is the deductive proof – each step depends on the correctness of the previous ones, so we’d better be as sure as possible that they’re right – Raku’s fundamental metaphor is, unsurprisingly, more linguistic. Raku’s metaphor for coding is an asynchronous conversation between friends: an email exchange, maybe, or – better yet – a series of letters.

How is writing code like emailing a friend? Well, we talked last time about the three-way conversation between author, reader, and compiler, but that’s a bit of a simplification. Most of the time, we’re simultaneously reading previously-written code and writing additional code, which turns the three-way conversation into a four-way one. True, the “previous author”, “current reader/author”, and “future reader” might all be you, but the fact that you’re talking to yourself doesn’t make it any less of a conversation: either way, the goal is to understand the previous author’s meaning as well as possible, decide what you want to add to the conversation, and then express yourself as clearly as possible – subject to the constraint that the compiler also needs to understand your code.

A few words on that last point. From inside a code-as-proof metaphor, a strict compiler is a clear win. Being confident in the correctness of anything is hard enough, but it’s vastly harder as you increase the possibility space. But from a code-as-communication metaphor, there’s a real drawback to compilers (or formatters) that limit your ability to say the same thing in multiple ways. What shirt you wear stops being an expressive choice if you’re required to wear a uniform. In the same way, when there’s exactly one way to do something, then doing it that way doesn’t communicate anything. But when there’s more than one way to do it, then suddenly it makes sense to ask, “Okay, but why did they do it that way?”. This is deeply evident in Raku: there are multiple ways to write code that does the same thing, but those different ways don’t say the same thing – they allow you to place the emphasis in different points, depending on where you’d like to draw the reader’s attention. Raku’s large “vocabulary” plays the same role as increasing your vocabulary in a natural language: it makes it easier to pick just the right word.

When code is communication, rules become suggestions

When emailing a friend, neither of you can set “rules” that the other person must follow. You can make an argument for why they shouldn’t do something, you can express clearly and unequivocally that doing that would be a mistake, but you can’t stop them. You are friends – equals – and neither the email’s author nor its reader can overrule the other.

And the same is true of Raku: Raku makes it very difficult (frequently impossible) for the author of some code to 100% prevent someone from using their code in a particular way. Raku provides many ways to express – with all the intensity of an ALL CAPS EMAIL – that doing something is a really, really bad idea. But if you are determined to misuse code and knowledgeable enough, there’s pretty much no stopping you.

Coming from Rust, this took me a while to notice, because (at least in intro materials) Raku presents certain things as absolute rules (“private attributes cannot be accessed outside the class!”) when, in reality, they turn out to be strongly worded suggestions (“…unless you’re messing with the Meta Object Protocol in ways that you really shouldn’t”). From a Rust perspective, that just wouldn’t fly – private implementations should be private. But it fits perfectly with Raku’s overall design philosophy.
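
To make the “strongly worded suggestion” concrete, here is a minimal sketch of reading a private attribute through the Meta Object Protocol. The Vault class and its $!secret attribute are made up for illustration; .^attributes and get_value are the standard introspection calls.

```raku
class Vault {
    has $!secret = 'hunter2';   # private: no accessor method is generated
}

my $vault = Vault.new;

# Ask the metaobject for the class's attributes, pick the private one by
# name, and read its value out of a specific instance.
my $attr   = Vault.^attributes.first(*.name eq '$!secret');
my $leaked = $attr.get_value($vault);

say $leaked;   # hunter2
```

Nothing here is hidden or unsupported; it is simply verbose enough that anyone writing it knows they are ignoring the author’s stated intent.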

Communicating through sigils

Applying this design philosophy to sigils, I’ve come around to believing that making Positional, Associative, and Callable marker roles was entirely the correct choice. After all, marker roles are entirely about communicating through code – even in Rust, the entire purpose of marker traits is to communicate some property that the Rust compiler can’t verify.

This is a perfect fit for sigils. What does @ mean? It means that the variable is Positional. Okay, what does Positional mean? It means “array-like”… Okay. What does “array-like” mean? Well, that’s up to you to decide, as part of the collaborative dialogue (trialogue?) with the past and future authors.

That doesn’t mean you’re on your own, crafting meaning from the void: Raku keeps us on the same general page by ensuring that every Rakoon has extensive experience with Arrays, which creates a shared understanding for what “array-like” means. And the language documentation provides clear explanations of how to make your custom types behave like Raku’s Array. But – as I now realize – Raku isn’t going to stomp its foot and say that @-sigiled variables must behave a particular way. If it makes sense – in your code base, in the context of your multilateral conversation – to have an @-sigiled variable that is neither ordered nor iterable, then you can.
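
As a rough illustration of that last point, the sketch below defines a type that opts into the Positional marker role, supplies its own idea of indexing, and is then bound to an @-sigiled variable. The Countdown class is hypothetical; it is one possible reading of “array-like”, not a recommended pattern.

```raku
# Hypothetical type: "element" $i is simply $from - $i, computed on demand.
class Countdown does Positional {
    has Int $.from;

    method elems()        { $!from + 1 }
    method AT-POS(Int $i) { $!from - $i }
}

# Binding (:=) only checks that the value does Positional.
my @c := Countdown.new(from => 3);

say @c[0];             # 3
say @c[3];             # 0
say @c.elems;          # 4
say @c ~~ Positional;  # True
```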

So, I’m disappointed that I was mistaken about Raku’s syntax when I wrote my previous post. And I’m especially sorry if anyone was confused by the uncorrected version of that post. But I’m really glad to realize Raku’s actual semantics for sigils, because it fits perfectly with Raku as a whole. Moreover, these semantics not only fit better with Raku’s design, they make Raku’s sigils even better suited for their primary purpose: helping someone writing code to clearly and concisely communicate their intent to someone reading that code.

In keeping with my earlier post, I’ll include a table with the semantics of the three sigils we discussed:

Sigil   Meaning
@       Someone intentionally marked the variable Positional
%       Someone intentionally marked the variable Associative
&       Someone intentionally marked the variable Callable

These semantics are perfect because, in the end, that’s what @, %, &, and $ really are: signs of what someone else intended. Little, semantically dense, magic signs.

The new MoarVM dispatch mechanism is here!

Published by jnthnwrthngtn on 2021-09-29T16:16:31

Around 18 months ago, I set about working on the largest set of architectural changes that Raku runtime MoarVM has seen since its inception. The work was most directly triggered by the realization that we had no good way to fix a certain semantic bug in dispatch without either causing huge performance impacts across the board or increasing complexity even further in optimizations that were already riding their luck. However, the need for something like this had been apparent for a while: a persistent struggle to optimize certain Raku language features, the pain of a bunch of performance mechanisms that were all solving the same kind of problem but each for a specific situation, and a sense that, with everything learned since I founded MoarVM, it was possible to do better.

The result is the development of a new generalized dispatch mechanism. An overview can be found in my Raku Conference talk about it (slides, video); in short, it gives us a far more uniform architecture for all kinds of dispatch, allowing us to deliver better performance on a range of language features that have thus far been glacial, as well as opening up opportunities for new optimizations.

Today, this work has been merged, along with the matching changes in NQP (the Raku subset we use for bootstrapping and to implement the compiler) and Rakudo (the full Raku compiler and standard library implementation). This means that it will ship in the October 2021 releases.

In this post, I’ll give an overview of what you can expect to observe right away, and what you might expect in the future as we continue to build upon the possibilities that the new dispatch architecture has to offer.

The big wins

The biggest improvements involve language features that we’d really not had the architecture to do better on before. They involved dispatch – that is, getting a call linked to a destination efficiently – but the runtime didn’t provide us with a way to “explain” to it that it was looking at a dispatch, let alone with the information needed to have a shot at optimizing it.

The following graph captures a number of these cases, and shows the level of improvement, ranging from a factor of 3.3 to 13.3 times faster.

Graph showing benchmark results, described textually below

Let’s take a quick look at each of these. The first, new-buf, asks how quickly we can allocate Bufs.

for ^10_000_000 {
    Buf.new
}

Why is this a dispatch benchmark? Because Buf is not a class, but rather a role. When we try to make an instance of a role, it is “punned” into a class. Up until now, it worked as follows:

  1. We look up the new method
  2. The find_method method would, if needed, create a pun of the role and cache it
  3. It would return a forwarding closure that takes the arguments and gives them to the same method called on the punned class, or spelt in Raku code, -> $role-discarded, |args { $pun."$name"(|args) }
  4. This closure would be invoked with the arguments

This had a number of undesirable consequences:

  1. While the pun was cached, we still had a bit of overhead to check if we’d made it already
  2. The arguments got slurped and flattened, which costs something, and…
  3. …the loss of callsite shape meant we couldn’t look up a type specialization of the method, and thus lost a chance to inline it too
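
For readers who have not met role punning before, this is what it looks like from ordinary Raku code. The Greeter role is a made-up example; the point is only that calling new on a role works and hands back an instance of an automatically produced class.

```raku
role Greeter {
    has $.name = 'world';
    method greet() { "hello, $!name" }
}

# Greeter is a role, not a class. Calling .new on it causes a class to be
# punned from the role, and the call is forwarded to that class.
my $g = Greeter.new(name => 'Eli');

say $g.greet;        # hello, Eli
say $g ~~ Greeter;   # True
```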

With the new dispatch mechanism, we have a means to cache constants at a given program location and to replace arguments. So the first time we encounter the call, we:

  1. Get the role pun produced if needed
  2. Resolve the new method on the class punned from the role
  3. Produce a dispatch program that caches this resolved method and also replaces the role argument with the pun

For the next thousands of calls, we interpret this dispatch program. It’s still some cost, but the method we’re calling is already resolved, and the argument list rewriting is fairly cheap. Meanwhile, after we get into some hundreds of iterations, on a background thread, the optimizer gets to work. The argument re-ordering cost goes away completely at this point, and new is so small it gets inlined – at which point the buffer allocation is determined dead and so goes away too. Some remaining missed opportunities mean we still are left with a loop that’s not quite empty: it busies itself making sure it’s really OK to do nothing, rather than just doing nothing.

Next up, multiple dispatch with where clauses.

multi fac($n where $n <= 1) { 1 }
multi fac($n) { $n * fac($n - 1) }
for ^1_000_000 {
    fac(5)
}

These were really slow before, since:

  1. We couldn’t apply the multi-dispatch caching mechanism at all as soon as we had a where clause involved
  2. We would run where clauses twice in the event the candidate was chosen: once to see if we should choose that multi candidate, and once again when we entered it

With the new mechanism, we:

  1. On the first call, calculate a multiple dispatch plan: a linked list of candidates to work through
  2. Invoke the one with the where clause, in a mode whereby if the signature fails to bind, it triggers a dispatch resumption. (If it does bind, it runs to completion)
  3. In the event of a bind failure, the dispatch resumption triggers, and we attempt the next candidate

Once again, after the setup phase, we interpret the dispatch programs. In fact, that’s as far as we get with running this faster for now, because the specializer doesn’t yet know how to translate and further optimize this kind of dispatch program. (That’s how I know it currently stands no chance of turning this whole thing into another empty loop!) So there’s more to be had here also; in the meantime, I’m afraid you’ll just have to settle for a factor of ten speedup.
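
One way to watch where clauses being evaluated on whatever Rakudo you have installed is to count the evaluations. This is only an observation aid, with a counter variable ($where-runs) introduced for the purpose; the count it prints depends on the Rakudo version and is not a figure from the benchmarks above.

```raku
my $where-runs = 0;

# Same shape as the benchmark, but the where clause is a block that bumps
# a counter each time it is evaluated before doing the actual check.
multi fac($n where { $where-runs++; $_ <= 1 }) { 1 }
multi fac($n) { $n * fac($n - 1) }

my $result = fac(5);
say "fac(5) = $result";
say "where clause ran $where-runs time(s)";
```

Every one of the five calls (for 5, 4, 3, 2 and 1) has to try the constrained candidate, so the counter is at least five; anything beyond that is repeated work of the kind described above.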

Here’s the next one:

proto with-proto(Int $n) { 2 * {*} }
multi with-proto(Int $n) { $n + 1 }
sub invoking-nontrivial-proto() {
    for ^10_000_000 {
        with-proto(20)
    }
}

Again, on top form, we’d turn this into an empty loop too, but we don’t quite get there yet. This case wasn’t so terrible before: we did get to use the multiple dispatch cache; however, to do that we also ended up having to allocate an argument capture. The need for this also blocked any chance of inlining the proto into the caller. Now that is possible. Since we cannot yet translate dispatch programs that resume an in-progress dispatch, we don’t yet get to further inline the called multi candidate into the proto. However, we now have a design that will let us implement that.

This whole notion of a dispatch resumption – where we start doing a dispatch, and later need to access arguments or other pre-calculated data in order to do a next step of it – has turned out to be a great unification. The initial idea for it came from considering things like callsame:

class Parent {
    method m() { 1 }
}
class Child is Parent {
    method m() { 1 + callsame }
}
for ^10_000_000 {
    Child.m;
}

Once I started looking at this, and then considering that a complex proto also wants to continue with a dispatch at the {*}, and in the case a where clause fails in a multi it also wants to continue with a dispatch, I realized this was going to be useful for quite a lot of things. It will be a bit of a headache to teach the optimizer and JIT to do nice things with resumes – but a great relief that doing that once will benefit multiple language features!

Anyway, back to the benchmark. This is another “if we were smart, it’d be an empty loop” one. Previously, callsame was very costly, because each time we invoked it, it would have to calculate what kind of dispatch we were resuming and the set of methods to call. We also had to be able to locate the arguments. Dynamic variables were involved, which cost a bit to look up too, and – despite being an implementation detail – these also leaked out in introspection, which wasn’t ideal. The new dispatch mechanism makes this all rather more efficient: we can cache the calculated set of methods (or wrappers and multi candidates, depending on the context) and then walk through it, and there are no dynamic variables involved (and thus no leakage of them). This sees the biggest speedup of the lot – and since we cannot yet inline away the callsame, it’s (for now) measuring the speedup one might expect on using this language feature. In the future, it’s destined to optimize away to an empty loop.
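
As a small reminder of what “depending on the context” means, here is callsame used in two of those contexts: deferring up an inheritance chain, and deferring to the next multi candidate. The describe multi is an invented example; the classes mirror the benchmark.

```raku
class Parent {
    method m() { 1 }
}
class Child is Parent {
    # Resumes the method dispatch: the next method in the MRO is Parent.m.
    method m() { 1 + callsame }
}

multi describe(Any $x) { "a value" }
# Resumes the multiple dispatch: the next candidate is the Any one.
multi describe(Int $x) { "an Int, and also " ~ callsame }

my $from-method = Child.m;
my $from-multi  = describe(42);

say $from-method;   # 2
say $from-multi;    # an Int, and also a value
```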

A module that makes use of callsame on a relatively hot path is OO::Monitors, so I figured it would be interesting to see if there is a speedup there also.

use OO::Monitors;
monitor TestMonitor {
    method m() { 1 }
}
my $mon = TestMonitor.new;
for ^1_000_000 {
    $mon.m();
}

monitor is a class that acquires a lock around each method call. The module provides a custom meta-class that adds a lock attribute to the class and then wraps each method such that it acquires the lock. There are certainly costly things in there besides the involvement of callsame, but the improvement to callsame is already enough to see a 3.3x speedup in this benchmark. Since OO::Monitors is used in quite a few applications and modules (for example, Cro uses it), this is welcome (and yes, a larger improvement will be possible here too).
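
To give a feel for what a monitor provides, here is a hand-written approximation using only the built-in Lock class. It is not how OO::Monitors is implemented (the module does the wrapping for you through its meta-class); the HandRolledCounter class is purely illustrative.

```raku
class HandRolledCounter {
    has Lock $!lock  = Lock.new;
    has Int  $!count = 0;

    # Every public method takes the lock for the duration of its body,
    # which is the guarantee a monitor gives automatically.
    method increment() { $!lock.protect: { $!count++ } }
    method count()     { $!lock.protect: { $!count } }
}

my $counter = HandRolledCounter.new;

# Four concurrent workers, a thousand increments each.
await (^4).map: { start { $counter.increment for ^1000 } };

my $final = $counter.count;
say $final;   # 4000
```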

Caller side decontainerization

I’ve seen some less impressive, but still welcome, improvements across a good number of other microbenchmarks. Even a basic multi dispatch on the + op:

my $i = 0;
for ^10_000_000 {
    $i = $i + $_;
}

Comes out with a factor of 1.6x speedup, thanks primarily to us producing far tighter code with fewer guards. Previously, we ended up with duplicate guards in this seemingly straightforward case. The infix:<+> multi candidate would be specialized for the case of its first argument being an Int in a Scalar container and its second argument being an immutable Int. Since a Scalar is mutable, the specialization would need to read it and then guard the value read before proceeding, otherwise it may change, and we’d risk memory safety. When we wanted to inline this candidate, we’d also want to do a check that the candidate really applies, and so also would dereference the Scalar and guard its content to do that. We can and do eliminate duplicate guards – but these guards are on two distinct reads of the value, so that wouldn’t help.

Since in the new dispatch mechanism we can rewrite arguments, we can now quite easily do caller-side removal of Scalar containers around values. So easily, in fact, that the change to do it took me just a couple of hours. This gives a lot of benefits. Since dispatch programs automatically eliminate duplicate reads and guards, the read and guard by the multi-dispatcher and the read in order to pass the decontainerized value are coalesced. This means less repeated work prior to specialization and JIT compilation, and also only a single read and guard in the specialized code after it. With the value to be passed already guarded, we can trivially select a candidate taking two bare Int values, which means there are no further reads and guards needed in the callee either.
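
The distinction between a value and the Scalar container around it is visible from Raku itself, which may help in picturing what is being stripped. The sketch below uses a raw (sigilless) parameter so the callee sees exactly what was passed; container-kind is a helper invented for this illustration, and the snippet shows language-level semantics, not the VM-level rewrite.

```raku
# A raw parameter receives the argument as-is, container and all.
sub container-kind(\value) { value.VAR.^name }

my $i = 41;

say container-kind($i);     # Scalar  (the variable's container is passed)
say container-kind($i<>);   # Int     (explicitly decontainerized by the caller)
say container-kind(41);     # Int     (a literal was never in a container)
```

A callee with ordinary read-only parameters, such as the infix:<+> candidates, never needs the container, which is why removing it on the caller side is safe for them.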

A less obvious benefit, but one that will become important with planned future work, is that this means Scalar containers escape to callees far less often. This creates further opportunities for escape analysis. While the MoarVM escape analyzer and scalar replacer are currently quite limited, I hope to return to working on them in the near future, and expect they will be able to give us even more value now than they would have been able to before.

Further results

The benchmarks shown earlier are mostly of the “how close are we to realizing that we’ve got an empty loop” nature, which is interesting for assessing how well the optimizer can “see through” dispatches. Here are a few further results on more “traditional” microbenchmarks:

Graph showing benchmark results, described textually below

The complex number benchmark is as follows:

my $total-re = 0e0;
for ^2_000_000 {
    my $x = 5 + 2i;
    my $y = 10 + 3i;
    my $z = $x * $x + $y;
    $total-re = $total-re + $z.re
}
say $total-re;

That is, just a bunch of operators (multi dispatch) and method calls, where we really do use the result. For now, we’re tied with Python and a little behind Ruby on this benchmark (and a surprising 48 times faster than the same thing done with Perl’s Math::Complex), but this is also a case that stands to see a huge benefit from escape analysis and scalar replacement in the future.

The hash read benchmark is:

my %h = a => 10, b => 12;
my $total = 0;
for ^10_000_000 {
    $total = $total + %h<a> + %h<b>;
}

And the hash store one is:

my @keys = 'a'..'z';
for ^500_000 {
    my %h;
    for @keys {
        %h{$_} = 42;
    }
}

The improvements are nothing whatsoever to do with hashing itself, but instead look to be mostly thanks to much tighter code all around due to caller-side decontainerization. That can have a secondary effect of bringing things under the size limit for inlining, which is also a big help. Speedup factors of 2x and 1.85x are welcome, although we could really do with the same level of improvement again for me to be reasonably happy with our results.

The line-reading benchmark is:

my $fh = open "longfile";
my $chars = 0;
for $fh.lines { $chars = $chars + .chars };
$fh.close;
say $chars

Again, nothing specific to I/O got faster, but when dispatch – the glue that puts together all the pieces – gets a boost, it helps all over the place. (We are also decently competitive on this benchmark, although we tend to be slower the moment the UTF-8 decoder can’t take its “NFG can’t possibly apply” fast path.)

And in less micro things…

I’ve also started looking at larger programs, and hearing results from others about theirs. It’s mostly encouraging:

Smaller profiler output

One unpredicted (by me), but also welcome, improvement is that profiler output has become significantly smaller. Likely reasons for this include:

  1. The dispatch mechanism supports producing value results (either from constants, input arguments, or attributes read from input arguments). It entirely replaces an earlier mechanism, “specializer plugins”, which could map guards to a target to invoke, but always required a call to something – even if that something was the identity function. The logic was that this didn’t matter for any really hot code, since the identity function will trivially be inlined away. However, since profile size of the instrumenting profiler is a function of the number of paths through the call tree, trimming loads of calls to the identity function out of the tree makes it much smaller.
  2. We used to make lots of calls to the sink method when a value was in sink context. Now, if we see that the type simply inherits that method from Mu, we elide the call entirely (again, it would inline away, but a smaller call graph is a smaller profile).
  3. Multiple dispatch caching would previously always call the proto when the cache was missed, but would then not call an onlystar proto again when it got cache hits in the future. This meant the call tree under many multiple dispatches was duplicated in the profile. This wasn’t just a size issue; it was a bit annoying to have this effect show up in the profile reports too.

To give an example of the difference, I took profiles from Agrammon to study why it might have become slower. The one from before the dispatcher work weighed in at 87MB; the one with the new dispatch mechanism is under 30MB. That means less memory used while profiling, less time to write the profile out to disk afterwards, and less time for tools to load the profiler output. So now it’s faster to work out how to make things faster.

Is there any bad news?

I’m afraid so. Startup time has suffered. While the new dispatch mechanism is more powerful, pushes more complexity out of the VM into high level code, and is more conducive to reaching higher peak performance, it also has a higher warmup time. At the time of writing, the impact on startup time seems to be around 25%. I expect we can claw some of that back ahead of the October release.

What will be broken?

Changes of this scale always come with an amount of risk. We’re merging this some weeks ahead of the next scheduled monthly release in order to have time for more testing, and to address any regressions that get reported. However, even before reaching the point of merging it, we have:

What happens next?

As I’ve alluded to in a number of places in this post, while there are improvements to be enjoyed right away, there are also new opportunities for further improvement. Some things that are on my mind include:

Thank you

I would like to thank TPF and their donors for providing the funding that has made it possible for me to spend a good amount of my working time on this effort.

While I’m to blame for the overall design and much of the implementation of the new dispatch mechanism, plenty of work has also been put in by other MoarVM and Rakudo contributors – especially over the last few months as the final pieces fell into place, and we turned our attention to getting it production ready. I’m thankful to them not only for the code and debugging contributions, but also much support and encouragement along the way. It feels good to have this merged, and I look forward to building upon it in the months and years to come.

Raku multiple dispatch with the new MoarVM dispatcher

Published by jnthnwrthngtn on 2021-04-15T09:54:30

I recently wrote about the new MoarVM dispatch mechanism, and in that post noted that I still had a good bit of Raku’s multiple dispatch semantics left to implement in terms of it. Since then, I’ve made a decent amount of progress in that direction. This post contains an overview of the approach taken, and some very rough performance measurements.

My goodness, that’s a lot of semantics

Of all the kinds of dispatch we find in Raku, multiple dispatch is the most complex. Multiple dispatch allows us to write a set of candidates, which are then selected by the number of arguments:

multi ok($condition, $desc) {
    say ($condition ?? 'ok' !! 'not ok') ~ " - $desc";
}
multi ok($condition) {
    ok($condition, '');
}

Or the types of arguments:

multi to-json(Int $i) { ~$i }
multi to-json(Bool $b) { $b ?? 'true' !! 'false' }

And not just one argument, but potentially many:

multi truncate(Str $str, Int $chars) {
    $str.chars < $chars ?? $str !! $str.substr(0, $chars) ~ '...'
}
multi truncate(Str $str, Str $after) {
    with $str.index($after) -> $pos {
        $str.substr(0, $pos) ~ '...'
    }
    else {
        $str
    }
}

We may write where clauses to differentiate candidates on properties that are not captured by nominal types:

multi fac($n where $n <= 1) { 1 }
multi fac($n) { $n * fac($n - 1) }

Every time we write a set of multi candidates like this, the compiler will automatically produce a proto routine. This is what is installed in the symbol table, and holds the candidate list. However, we can also write our own proto, and use the special term {*} to decide at which point we do the dispatch, if at all.

proto mean($collection) {
    $collection.elems == 0 ?? Nil !! {*}
}
multi mean(@arr) {
    @arr.sum / @arr.elems
}
multi mean(%hash) {
    %hash.values.sum / %hash.elems
}
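
A short usage sketch may help here. The definitions are repeated so that the snippet runs on its own, and the sample inputs are arbitrary.

```raku
proto mean($collection) {
    # The proto body runs first; {*} is where candidate dispatch happens.
    $collection.elems == 0 ?? Nil !! {*}
}
multi mean(@arr)  { @arr.sum / @arr.elems }
multi mean(%hash) { %hash.values.sum / %hash.elems }

say mean([1, 2, 3]);           # 2
say mean({ a => 1, b => 3 });  # 2
say mean([]);                  # Nil (no candidate is ever invoked)
```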

Candidates are ranked by narrowness (using topological sorting). If multiple candidates match, but they are equally narrow, then that’s an ambiguity error. Otherwise, we call the narrowest one. The candidate we choose may then use callsame and friends to defer to the next narrowest candidate, which may do the same, until we reach the most general matching one.
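
A minimal sketch of both outcomes, using invented kind and combine multis: the first group shows the narrowest matching candidate winning, the second shows two candidates that are equally narrow for the given arguments.

```raku
multi kind(Any $x)     { "Any" }
multi kind(Numeric $x) { "Numeric" }
multi kind(Int $x)     { "Int" }

say kind(42);       # Int
say kind(1.5);      # Numeric
say kind("text");   # Any

# Neither candidate is narrower than the other when both arguments are Int.
multi combine(Int $a, Any $b) { "Int first" }
multi combine(Any $a, Int $b) { "Int second" }

my ($a, $b) = 1, 2;
my $outcome    = try combine($a, $b);
my $error-name = $!.^name;
say $error-name;    # X::Multi::Ambiguous
```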

Multiple dispatch is everywhere

Raku leans heavily on multiple dispatch. Most operators in Raku are compiled into calls to multiple dispatch subroutines. Even $a + $b will be a multiple dispatch. This means doing multiple dispatch efficiently is really important for performance. Given the riches of its semantics, this is potentially a bit concerning. However, there’s good news too.

Most multiple dispatches are boring

The overwhelmingly common case is that we have:

This isn’t to say the other cases are unimportant; they are really quite useful, and it’s desirable for them to perform well. However, it’s also desirable to make what savings we can in the common case. For example, we don’t want to eagerly calculate the full set of possible candidates for every single multiple dispatch, because the majority of the time only the first one matters. This is not just a time concern: recall that the new dispatch mechanism stores dispatch programs at each callsite, and if we store the list of all matching candidates at each of those, we’ll waste a lot of memory too.

How do we do today?

The situation in Rakudo today is as follows:

Effectively, the situation today is that you simply don’t use where clauses in a multiple dispatch if it’s anywhere near a hot path (well, and if you know where the hot paths are, and know that this kind of dispatch is slow). Ditto for callsame, although that’s less commonly reached for. The question is, can we do better with the new dispatcher?

Guard the types

Let’s start out with seeing how the simplest cases are dealt with, and build from there. (This is actually what I did in terms of the implementation, but at the same time I had a rough idea where I was hoping to end up.)

Recall this pair of candidates:

multi truncate(Str $str, Int $chars) {
    $str.chars < $chars ?? $str !! $str.substr(0, $chars) ~ '...'
}
multi truncate(Str $str, Str $after) {
    with $str.index($after) -> $pos {
        $str.substr(0, $pos) ~ '...'
    }
    else {
        $str
    }
}

We then have a call truncate($message, "\n"), where $message is a Str. Under the new dispatch mechanism, the call is made using the raku-call dispatcher, which identifies that this is a multiple dispatch, and thus delegates to raku-multi. (Multi-method dispatch ends up there too.)

The record phase of the dispatch – on the first time we reach this callsite – will proceed as follows:

  1. Iterate over the candidates
  2. If a candidate doesn’t match on argument count, just discard it. Since the shape of a callsite is a constant, and we calculate dispatch programs at each callsite, we don’t need to establish any guards for this.
  3. If it matches on types and concreteness, note which parameters are involved and what kinds of guards they need.
  4. If there was no match or an ambiguity, report the error without producing a dispatch program.
  5. Otherwise, having established the type guards, delegate to the raku-invoke dispatcher with the chosen candidate.

When we reach the same callsite again, we can run the dispatch program, which quickly checks if the argument types match those we saw last time, and if they do, we know which candidate to invoke. These checks are very cheap – far cheaper than walking through all of the candidates and examining each of them for a match. The optimizer may later be able to prove that the checks will always come out true and eliminate them.

Thus the whole of the dispatch process – at least for this simple case where we only have types and arity – can be “explained” to the virtual machine as “if the arguments have these exact types, invoke this routine”. It’s pretty much the same as we were doing for method dispatch, except there we only cared about the type of the first argument – the invocant – and the value of the method name. (Also recall from the previous post that if it’s a multi-method dispatch, then both method dispatch and multiple dispatch will guard the type of the first argument, but the duplication is eliminated, so only one check is done.)
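
The record-then-replay idea can be mimicked in plain Raku as a toy model. Everything below is an illustration under simplifying assumptions, not MoarVM’s data structures: ToyCallsite and shorten are invented names, the “guard” is just a string of argument type names, and the slow path takes the first candidate whose signature accepts the arguments instead of sorting by narrowness.

```raku
multi shorten(Str $str, Int $chars) {
    $str.chars < $chars ?? $str !! $str.substr(0, $chars) ~ '...'
}
multi shorten(Str $str, Str $after) {
    with $str.index($after) -> $pos { $str.substr(0, $pos) ~ '...' }
    else { $str }
}

class ToyCallsite {
    has &.routine;          # the proto holding the candidate list
    has %!cache;            # "dispatch programs", keyed by argument types
    has Int $.records = 0;  # how many times the slow path ran

    method invoke(|args) {
        # The guard: the exact types of the arguments at this callsite.
        my $key = args.list.map(*.^name).join(',');
        unless %!cache{$key}:exists {
            # Record phase: walk the candidates once and remember the winner.
            $!records++;
            %!cache{$key} =
                &!routine.candidates.first(-> $cand { args ~~ $cand.signature })
                // die "no candidate for ($key)";
        }
        # Run phase: the guard matched, so call the remembered candidate.
        %!cache{$key}(|args)
    }
}

my $site = ToyCallsite.new(routine => &shorten);
say $site.invoke("hello world", 5);   # hello...
say $site.invoke("one\ntwo", "\n");   # one...
say $site.invoke("abc", 10);          # abc
say $site.records;                    # 2
```

The third call has the same argument types as the first, so it reuses the recorded choice; only two distinct type combinations were ever recorded.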

That goes in the resumption hole

Coming up with good abstractions is difficult, and therein lies much of the challenge of the new dispatch mechanism. Raku has quite a number of different dispatch-like things. However, encoding all of them directly in the virtual machine leads to high complexity, which makes building reliable optimizations (or even reliable unoptimized implementations!) challenging. Thus the aim is to work out a comparatively small set of primitives that allow for dispatches to be “explained” to the virtual machine in such a way that it can deliver decent performance.

It’s fairly clear that callsame is a kind of dispatch resumption, but what about the custom proto case and the where clause case? It turns out that these can both be neatly expressed in terms of dispatch resumption too (the where clause case needing one small addition at the virtual machine level, which in time is likely to be useful for other things too). Not only that, but encoding these features in terms of dispatch resumption is also quite direct, and thus should be efficient. Every trick we teach the specializer about doing better with dispatch resumptions can benefit all of the language features that are implemented using them, too.

Custom protos

Recall this example:

proto mean($collection) {
    $collection.elems == 0 ?? Nil !! {*}
}

Here, we want to run the body of the proto, and then proceed to the chosen candidate at the point of the {*}. By contrast, when we don’t have a custom proto, we’d like to simply get on with calling the correct multi.
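As a runnable illustration of that behaviour at the language level, here is a minimal sketch. The proto is the one shown above; the two multi candidates are assumptions added so that there is something to dispatch to, and are not taken from the post.

```raku
# The proto from the text: its body runs first, and {*} marks the point
# where the chosen candidate is invoked.
proto mean($collection) {
    $collection.elems == 0 ?? Nil !! {*}
}

# Hypothetical candidates, added for illustration only.
multi mean(@values) { @values.sum / @values.elems }
multi mean(%table)  { %table.values.sum / %table.elems }

say mean([1, 2, 3]);           # 2
say mean(%(a => 1, b => 3));   # 2
say mean([]).defined;          # False (the proto returned Nil, no candidate ran)
```

With an empty collection the {*} is never reached, which is why the proto body cannot simply be skipped.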

To achieve this, I first moved the multi candidate selection logic from the raku-multi dispatcher to the raku-multi-core dispatcher. The raku-multi dispatcher then checks if we have an “onlystar” proto (one that does not need us to run it). If so, it delegates immediately to raku-multi-core. If not, it saves the arguments to the dispatch as the resumption initialization state, and then calls the proto. The proto‘s {*} is compiled into a dispatch resumption. The resumption then delegates to raku-multi-core. Or, in code:

nqp::dispatch('boot-syscall', 'dispatcher-register', 'raku-multi',
    # Initial dispatch, only setting up resumption if we need to invoke the
    # proto.
    -> $capture {
        my $callee := nqp::captureposarg($capture, 0);
        my int $onlystar := nqp::getattr_i($callee, Routine, '$!onlystar');
        if $onlystar {
            # Don't need to invoke the proto itself, so just get on with the
            # candidate dispatch.
            nqp::dispatch('boot-syscall', 'dispatcher-delegate', 'raku-multi-core', $capture);
        }
        else {
            # Set resume init args and run the proto.
            nqp::dispatch('boot-syscall', 'dispatcher-set-resume-init-args', $capture);
            nqp::dispatch('boot-syscall', 'dispatcher-delegate', 'raku-invoke', $capture);
        }
    },
    # Resumption means that we have reached the {*} in the proto and so now
    # should go ahead and do the dispatch. Make sure we only do this if we
    # are signalled that it's a resume for an onlystar (resumption kind 5).
    -> $capture {
        my $track_kind := nqp::dispatch('boot-syscall', 'dispatcher-track-arg', $capture, 0);
        nqp::dispatch('boot-syscall', 'dispatcher-guard-literal', $track_kind);
        my int $kind := nqp::captureposarg_i($capture, 0);
        if $kind == 5 {
            nqp::dispatch('boot-syscall', 'dispatcher-delegate', 'raku-multi-core',
                nqp::dispatch('boot-syscall', 'dispatcher-get-resume-init-args'));
        }
        elsif !nqp::dispatch('boot-syscall', 'dispatcher-next-resumption') {
            nqp::dispatch('boot-syscall', 'dispatcher-delegate', 'boot-constant',
                nqp::dispatch('boot-syscall', 'dispatcher-insert-arg-literal-obj',
                    $capture, 0, Nil));
        }
    });

Two become one

Deferring to the next candidate (for example with callsame) and trying the next candidate because a where clause failed look very similar: both involve walking through a list of possible candidates. There are some differences in the details, but they have a great deal in common, and it’d be nice if that could be reflected in how multiple dispatch is implemented using the new dispatcher.

Before that, a slightly terrible detail about how things work in Rakudo today when we have where clauses. First, the dispatcher does a “trial bind”, where it asks the question: would this signature bind? To do this, it has to evaluate all of the where clauses. Worse, it has to use the slow-path signature binder too, which interprets the signature, even though we can in many cases compile it. If the candidate matches, great, we select it, and then invoke it…which runs the where clauses a second time, as part of the compiled signature binding code. There is nothing efficient about this at all, except for it being by far more efficient on developer time, which is why it happened that way.

Anyway, it goes without saying that I’m rather keen to avoid this duplicate work and the slow-path binder where possible as I re-implement this using the new dispatcher. And, happily, a small addition provides a solution. There is an op assertparamcheck, which any kind of parameter checking compiles into (be it type checking, where clause checking, etc.) This triggers a call to a function that gets the arguments, the thing we were trying to call, and can then pick through them to produce an error message. The trick is to provide a way to invoke a routine such that a bind failure, instead of calling the error reporting function, will leave the routine and then do a dispatch resumption! This means we can turn failure to pass where clause checks into a dispatch resumption, which will then walk to the next candidate and try it instead.
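For orientation, this is the language-level behaviour being implemented: when a candidate's where clause rejects the arguments, dispatch carries on to the next candidate. The sketch below is plain Raku with hypothetical names; it shows the observable semantics only and says nothing about how the binder or the dispatcher achieve them.

```raku
# Three candidates with the same nominal type. The ones with where clauses
# are tried first; a failed where check moves on to the next candidate.
multi describe(Int $n where $n < 0)  { "negative" }
multi describe(Int $n where $n == 0) { "zero" }
multi describe(Int $n)               { "positive" }

say describe(-3);   # negative
say describe(0);    # zero
say describe(7);    # positive
```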

Trivial vs. non-trivial

This gets us most of the way to a solution, but there’s still the question of being memory and time efficient in the common case, where there is no resumption and no where clauses. I coined the term “trivial multiple dispatch” for this situation, which makes the other situation “non-trivial”. In fact, I even made a dispatcher called raku-multi-non-trivial! There are two ways we can end up there.

  1. The initial attempt to find a matching candidate determines that we’ll have to consider where clauses. As soon as we see this is the case, we go ahead and produce a full list of possible candidates that could match. This is a linked list (see my previous post for why).
  2. The initial attempt to find a matching candidate finds one that can be picked based purely on argument count and nominal types. We stop there, instead of trying to build a full candidate list, and run the matching candidate. In the event that a callsame happens, we end up in the trivial dispatch resumption handler, which – since this situation is now non-trivial – builds the full candidate list, snips the first item off it (because we already ran that), and delegates to raku-multi-non-trivial.

Lost in this description is another significant improvement: today, when there are where clauses, we entirely lose the ability to use the MoarVM multiple dispatch cache, but under the new dispatcher, we store a type-filtered list of candidates at the callsite, and then cheap type guards are used to check it is valid to use.

Preliminary results

I did a few benchmarks to see how the new dispatch mechanism did with a couple of situations known to be sub-optimal in Rakudo today. These numbers do not reflect what is possible, because at the moment the specializer does not have much of an understanding of the new dispatcher. Rather, they reflect the minimal improvement we can expect.

Consider this benchmark using a multi with a where clause to recursively implement factorial.

multi fac($n where $n <= 1) { 1 }
multi fac($n) { $n * fac($n - 1) }
for ^100_000 {
    fac(10)
}
say now - INIT now;

This needs some tweaks (and to be run under an environment variable) to use the new dispatcher; these are temporary, until such a time I switch Rakudo over to using the new dispatcher by default:

use nqp;
multi fac($n where $n <= 1) { 1 }
multi fac($n) { $n * nqp::dispatch('raku-call', &fac, $n - 1) }
for ^100_000 {
    nqp::dispatch('raku-call', &fac, 10);
}
say now - INIT now;

On my machine, the first runs in 4.86s, the second in 1.34s. Thus under the new dispatcher this runs in a little over a quarter of the time it used to – a quite significant improvement already.

A case involving callsame is also interesting to consider. Here it is without using the new dispatcher:

multi fallback(Any $x) { "a$x" }
multi fallback(Numeric $x) { "n" ~ callsame }
multi fallback(Real $x) { "r" ~ callsame }
multi fallback(Int $x) { "i" ~ callsame }
for ^1_000_000 {
    fallback(4+2i);
    fallback(4.2);
    fallback(42);
}
say now - INIT now;
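To make clear how much deferral each call in that loop performs, here is the same set of candidates run once per argument, without the timing loop. The candidates are those from the benchmark; only the say lines are added.

```raku
multi fallback(Any $x)     { "a$x" }
multi fallback(Numeric $x) { "n" ~ callsame }
multi fallback(Real $x)    { "r" ~ callsame }
multi fallback(Int $x)     { "i" ~ callsame }

# Each callsame defers to the next-narrowest matching candidate, so the
# prefix letters record the path taken through the candidate list.
say fallback(42);      # irna42  (Int, then Real, Numeric, Any)
say fallback(4.2);     # rna4.2  (Rat is Real but not Int)
say fallback(4+2i);    # na4+2i  (Complex is Numeric but not Real)
```

So one pass through the benchmark loop involves six resumptions: three, two and one respectively.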

And with the temporary tweaks to use the new dispatcher:

use nqp;
multi fallback(Any $x) { "a$x" }
multi fallback(Numeric $x) { "n" ~ new-disp-callsame }
multi fallback(Real $x) { "r" ~ new-disp-callsame }
multi fallback(Int $x) { "i" ~ new-disp-callsame }
for ^1_000_000 {
    nqp::dispatch('raku-call', &fallback, 4+2i);
    nqp::dispatch('raku-call', &fallback, 4.2);
    nqp::dispatch('raku-call', &fallback, 42);
}
say now - INIT now;

On my machine, the first runs in 31.3s, the second in 11.5s, meaning that with the new dispatcher we manage it in a little over a third of the time that current Rakudo does.

These are both quite encouraging, but as previously mentioned, a majority of multiple dispatches are of the trivial kind, not using these features. If I make the most common case worse on the way to making other things better, that would be bad. It’s not yet possible to make a fair comparison of this: trivial multiple dispatches already receive a lot of attention in the specializer, and it doesn’t yet optimize code using the new dispatcher well. Of note, in an example like this:

multi m(Int) { }
multi m(Str) { }
for ^1_000_000 {
    m(1);
    m("x");
}
say now - INIT now;

Inlining and other optimizations will turn this into an empty loop, which is hard to beat. There is one thing we can already do, though: run it with the specializer disabled. The new dispatcher version looks like this:

use nqp;
multi m(Int) { }
multi m(Str) { }
for ^1_000_000 {
    nqp::dispatch('raku-call', &m, 1);
    nqp::dispatch('raku-call', &m, "x");
}
say now - INIT now;

The results are 0.463s and 0.332s respectively. Thus, the baseline execution time – before the specializer does its magic – is less using the new general dispatch mechanism than it is using the special-case multiple dispatch cache that we currently use. I wasn’t sure what to expect here before I did the measurement. Given we’re going from a specialized mechanism that has been profiled and tweaked to a new general mechanism that hasn’t received such attention, I was quite ready to be doing a little bit worse initially, and would have been happy with parity. Running in 70% of the time was a bigger improvement than I expected at this point.
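The ratios behind the three measurements can be checked with a few lines of arithmetic. This sketch only restates the timings quoted in the text; the row labels are made up for readability.

```raku
# (label, current Rakudo in seconds, new dispatcher in seconds), as quoted above.
my @timings =
    ('where clause', 4.86,  1.34),
    ('callsame',     31.3,  11.5),
    ('trivial',      0.463, 0.332);

for @timings -> $row {
    my $old = $row[1];
    my $new = $row[2];
    printf "%-12s %3.0f%% of the old time, %.1fx speedup\n",
        $row[0], 100 * $new / $old, $old / $new;
}
```

That works out to speedups of roughly 3.6x, 2.7x and 1.4x for the where clause, callsame and trivial cases respectively.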

I expect that once the specializer understands the new dispatch mechanism better, it will be able to also turn the above into an empty loop – however, since more iterations can be done per-optimization, this should still show up as a win for the new dispatcher.

Final thoughts

With one relatively small addition, the new dispatch mechanism is already handling most of the Raku multiple dispatch semantics. Furthermore, even without the specializer and JIT really being able to make a good job of it, some microbenchmarks already show a factor of 3x-4x improvement. That’s a pretty good starting point.

There’s still a good bit to do before we ship a Rakudo release using the new dispatcher. However, multiple dispatch was the biggest remaining threat to the design: it’s rather more involved than other kinds of dispatch, and it was quite possible that an unexpected shortcoming could trigger another round of design work, or reveal that the general mechanism was going to struggle to perform compared to the more specialized one in the baseline, unoptimized case. So far, there’s no indication of either of these, and I’m cautiously optimistic that the overall design is about right.

Overdue Appreciation for rakudo-pkg

Published by 5ab5traction5 on 2021-03-16T00:00:00

It's been over a month since I first came across -- finally -- a clean way to present anyone who runs Linux with a simple, clean, non-"virtualenv" installation of raku: rakudo-pkg to the rescue!

rakudo-pkg vs a virtual environment like rakubrew

There was always a bit of an icky feeling related to relying on rakubrew (and rakudobrew before it) for requiring inquiring minds to first ignore what their system offered them through official channels and instead install some in order to have access to anything remotely resembling an up-to-date version of the raku runtime (or perl6 before it).

Unfortunately, in the case of most official package repositories, the latest officially available versions were often ancient[1]. It's heartening to note, however, that this situation has improved significantly since the official debut of the "finished" language as 6.c half a decade ago. Still, the official repositories lag far behind the improvements that are made, even today.

In my opinion, it is one thing to encounter a virtualenv-style tool after you have hit some limitation with running the system installation of a language. But being exposed to adding a whole new mothball to your home directory and login shell configuration as a requirement to just trying out a language is not the strongest look in terms of an advocacy perspective.

Having a dedicated system path for the tools also fixes issues related to tools that do not inherit environment variables created by executing all of those tweaks stashed in .bash_profile or (in my case) .config/fish/config.fish.

A virtualenv approach is also particularly un-desirable as it is potentially resolvable through guarantees made at the language design layer around Raku's approach to module and language versioning.

Important details about Raku versioning

Raku naturally shows its previous life as that caterpillar formerly-known-as-Perl-6 most strongly when you encounter its own versioning.

use v6.c is guaranteed to access a historical standard of Raku behavior. use v6.* optimistically says "use *Whatever* version you consider the newest". use v6.d gives you guarantees that the language won't start spitting deprecation warnings pertaining to later versions, starting with v6.e, while also doing everything exactly as v6.d intended even on a newer release.

Free project ideas below

It would be interesting to stress test the implicit and explicit language level guarantees of Raku by dog-fooding an old fashioned "smoke test" on our own with regard to the claims made in the designs of the language versioning and the module repositories and authorities concepts. A sort of "distributed DarkPAN simulator" for Raku in the 2020s.

The CompUnit repositories and module authorities are ideas that intend to make backward compatibility easier in a world where sometimes you want to run a locally patched variant of a public module that is otherwise identical (or even wildly incompatible) and other times you want to be able to run two different versions of a library side-by-side -- at the same time.

A/A testing of library upgrades at vanguard for a bit before rolling out to the fleet, anyone? (That's a different, likely far more profitable, library idea for you, my intrepid reader).

Ok, but where???

Check out the blog post announcement of the new GitHub Task based release flow and the latest iteration of the rakudo-pkg project.

  1. It was a long road to the first official release, so it is not at all fair to blame distribution maintenance teams for not ensuring that the bleeding edge version of a still-baking language was--or is--easily accessible. Things have gotten better since the release of 6.c.

Towards a new general dispatch mechanism in MoarVM

Published by jnthnwrthngtn on 2021-03-15T02:08:42

My goodness, it appears I’m writing my first Raku internals blog post in over two years. Of course, two years ago it wasn’t even called Raku. Anyway, without further ado, let’s get on with this shared brainache.

What is dispatch?

I use “dispatch” to mean a process by which we take a set of arguments and end up with some action being taken based upon them. Some familiar examples include:

At first glance, perhaps the first two seem fairly easy and the third a bit more of a handful – which is sort of true. However, Raku has a number of other features that make dispatch rather more, well, interesting. For example:

Thanks to this, dispatch – at least in Raku – is not always something we do and produce an outcome, but rather a process that we may be asked to continue with multiple times!

Finally, while the examples I’ve written above can all quite clearly be seen as examples of dispatch, a number of other common constructs in Raku can be expressed as a kind of dispatch too. Assignment is one example: the semantics of it depend on the target of the assignment and the value being assigned, and thus we need to pick the correct semantics. Coercion is another example, and return value type-checking yet another.

Why does dispatch matter?

Dispatch is everywhere in our programs, quietly tying together the code that wants stuff done with the code that does stuff. Its ubiquity means it plays a significant role in program performance. In the best case, we can reduce the cost to zero. In the worst case, the cost of the dispatch is high enough to exceed that of the work done as a result of the dispatch.

To a first approximation, when the runtime “understands” the dispatch the performance tends to be at least somewhat decent, but when it doesn’t there’s a high chance of it being awful. Dispatches tend to involve an amount of work that can be cached, often with some cheap guards to verify the validity of the cached outcome. For example, in a method dispatch, naively we need to walk a linearization of the inheritance graph and ask each class we encounter along the way if it has a method of the specified name. Clearly, this is not going to be terribly fast if we do it on every method call. However, a particular method name on a particular type (identified precisely, without regard to subclassing) will resolve to the same method each time. Thus, we can cache the outcome of the lookup, and use it whenever the type of the invocant matches that used to produce the cached result.
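The caching idea in that paragraph can be sketched in plain Raku. Everything here is a hypothetical model for illustration: the %cache hash, the find-cached sub and the string key are assumptions, and the real runtime keys on type identity rather than on a type name.

```raku
class Animal        { method speak() { "generic noise" } }
class Dog is Animal { }

my %cache;              # hypothetical cache: "exact type name|method name" => method
my $slow-lookups = 0;   # counts how often the full lookup had to run

sub find-cached(Mu $invocant, Str $name) {
    # Keyed on the invocant's exact type, without regard to subclassing.
    # (Using the type name as the key is a simplification for this sketch.)
    my $key = $invocant.^name ~ '|' ~ $name;
    unless %cache{$key}:exists {
        $slow-lookups++;
        # Stand-in for walking the linearized inheritance graph.
        %cache{$key} = $invocant.^find_method($name);
    }
    %cache{$key}
}

my $dog = Dog.new;
say find-cached($dog, 'speak')($dog);   # generic noise
say find-cached($dog, 'speak')($dog);   # generic noise
say $slow-lookups;                      # 1
```

The second call reuses the cached method, so the lookup only runs once for the Dog type.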

Specialized vs. generalized mechanisms in language runtimes

When one starts building a runtime aimed at a particular language, and has to do it on a pretty tight budget, the most obvious way to get somewhat tolerable performance is to bake various hot-path language semantics into the runtime. This is exactly how MoarVM started out. Thus, if we look at MoarVM as it stood several years ago, we find things like:

These are all still there today, however are also all on the way out. What’s most telling about this list is what isn’t included. Things like:

A few years back I started to partially address this, with the introduction of a mechanism I called “specializer plugins”. But first, what is the specializer?

When MoarVM started out, it was a relatively straightforward interpreter of bytecode. It only had to be fast enough to beat the Parrot VM in order to get a decent amount of usage, which I saw as important to have before going on to implement some more interesting optimizations (back then we didn’t have the kind of pre-release automated testing infrastructure we have today, and so depended much more on feedback from early adopters). Anyway, soon after being able to run pretty much as much of the Raku language as any other backend, I started on the dynamic optimizer. It gathered type statistics as the program was interpreted, identified hot code, put it into SSA form, used the type statistics to insert guards, used those together with static properties of the bytecode to analyze and optimize, and produced specialized bytecode for the function in question. This bytecode could elide type checks and various lookups, as well as using a range of internal ops that make all kinds of assumptions, which were safe because of the program properties that were proved by the optimizer. This is called specialized bytecode because it has had a lot of its genericity – which would allow it to work correctly on all types of value that we might encounter – removed, in favor of working in a particular special case that actually occurs at runtime. (Code, especially in more dynamic languages, is generally far more generic in theory than it ever turns out to be in practice.)

This component – the specializer, known internally as “spesh” – delivered a significant further improvement in the performance of Raku programs, and with time its sophistication has grown, taking in optimizations such as inlining and escape analysis with scalar replacement. These aren’t easy things to build – but once a runtime has them, they create design possibilities that didn’t previously exist, and make decisions made in their absence look sub-optimal.

Of note, those special-cased language-specific mechanisms, baked into the runtime to get some speed in the early days, instead become something of a liability and a bottleneck. They have complex semantics, which means they are either opaque to the optimizer (so it can’t reason about them, meaning optimization is inhibited) or they need special casing in the optimizer (a liability).

So, back to specializer plugins. I reached a point where I wanted to take on the performance of things like $obj.?meth (the “call me maybe” dispatch), $obj.SomeType::meth() (dispatch qualified with a class to start looking in), and private method calls in roles (which can’t be resolved statically). At the same time, I was getting ready to implement some amount of escape analysis, but realized that it was going to be of very limited utility because assignment had also been special-cased in the VM, with a chunk of opaque C code doing the hot path stuff.
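For readers less familiar with the first two of those call forms, here is what they look like at the language level. The class and method names are made up for the example.

```raku
class Base            { method greet() { "hello from Base" } }
class Derived is Base { method greet() { "hello from Derived" } }

my $obj = Derived.new;

say $obj.greet;               # hello from Derived (ordinary method dispatch)
say $obj.Base::greet();       # hello from Base    (lookup starts at Base)
say $obj.?greet;              # hello from Derived (.? calls the method if it exists)
say $obj.?missing.defined;    # False              (.? yields Nil when it does not)
```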

But why did we have the C code doing that hot-path stuff? Well, because it’d be too expensive to have every assignment call a VM-level function that does a bunch of checks and logic. Why is that costly? Because of function call overhead and the costs of interpretation. This was all true once upon a time. But, some years of development later:

I solved the assignment problem and the dispatch problems mentioned above with the introduction of a single new mechanism: specializer plugins. They work as follows:

The vast majority of cases are monomorphic, meaning that only one set of guards is produced and they always succeed thereafter. The specializer can thus compile those guards into the specialized bytecode and then assume the given target invocant is what will be invoked. (Further, duplicate guards can be eliminated, so the guards a particular plugin introduces may reduce to zero.)

Specializer plugins felt pretty great. One new mechanism solved multiple optimization headaches.

The new MoarVM dispatch mechanism is the answer to a fairly simple question: what if we get rid of all the dispatch-related special-case mechanisms in favor of something a bit like specializer plugins? The resulting mechanism would need to be more powerful than specializer plugins. Further, I could learn from some of the shortcomings of specializer plugins. Thus, while they will go away after a relatively short lifetime, I think it’s fair to say that I would not have been in a place to design the new MoarVM dispatch mechanism without that experience.

The dispatch op and the bootstrap dispatchers

All the method caching. All the multi dispatch caching. All the specializer plugins. All the invocation protocol stuff for unwrapping the bytecode handle in a code object. It’s all going away, in favor of a single new dispatch instruction. Its name is, boringly enough, dispatch. It looks like this:

dispatch_o result, 'dispatcher-name', callsite, arg0, arg1, ..., argN

Which means:

(Aside: this implies a new calling convention, whereby we no longer copy the arguments into an argument buffer, but instead pass the base of the register set and a pointer into the bytecode where the register argument map is found, and then do a lookup registers[map[argument_index]] to get the value for an argument. That alone is a saving when we interpret, because we no longer need a loop around the interpreter per argument.)
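The lookup described in that aside can be modelled in a few lines. This is only a toy: the register contents, the map and the arg helper are hypothetical, and in MoarVM this happens in C against the frame's registers and the bytecode.

```raku
# Hypothetical caller frame: registers hold values, nothing is copied out.
my @registers = 'result-slot', 'Foo', 42, 'bar';

# Hypothetical register argument map for one callsite: argument i of the
# call lives in register @arg-map[i].
my @arg-map = 2, 3;

# Fetching an argument is then a double index, registers[map[argument_index]].
sub arg(Int $argument-index) { @registers[@arg-map[$argument-index]] }

say arg(0);   # 42
say arg(1);   # bar
```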

Some of the arguments might be things we’d traditionally call arguments. Some are aimed at the dispatch process itself. It doesn’t really matter – but it is more optimal if we arrange to put arguments that are only for the dispatch first (for example, the method name), and those for the target of the dispatch afterwards (for example, the method parameters).

The new bootstrap mechanism provides a small number of built-in dispatchers, whose names start with “boot-“. They are:

That’s pretty much it. Every dispatcher we build, to teach the runtime about some other kind of dispatch behavior, eventually terminates in one of these.

Building on the bootstrap

Teaching MoarVM about different kinds of dispatch is done using nothing less than the dispatch mechanism itself! For the most part, boot-syscall is used in order to register a dispatcher, set up the guards, and provide the result that goes with them.

Here is a minimal example, taken from the dispatcher test suite, showing how a dispatcher that provides the identity function would look:

nqp::dispatch('boot-syscall', 'dispatcher-register', 'identity', -> $capture {
    nqp::dispatch('boot-syscall', 'dispatcher-delegate', 'boot-value', $capture);
});
sub identity($x) {
    nqp::dispatch('identity', $x)
}
ok(identity(42) == 42, 'Can define identity dispatch (1)');
ok(identity('foo') eq 'foo', 'Can define identity dispatch (2)');

In the first statement, we call the dispatcher-register MoarVM system call, passing a name for the dispatcher along with a closure, which will be called each time we need to handle the dispatch (which I tend to refer to as the “dispatch callback”). It receives a single argument, which is a capture of arguments (not actually a Raku-level Capture, but the idea – an object containing a set of call arguments – is the same).

Every user-defined dispatcher should eventually use dispatcher-delegate in order to identify another dispatcher to pass control along to. In this case, it delegates immediately to boot-value – meaning it really is nothing except a wrapper around the boot-value built-in dispatcher.

The sub identity contains a single static occurrence of the dispatch op. Given we call the sub twice, we will encounter this op twice at runtime, but the two times are very different.

The first time is the “record” phase. The arguments are formed into a capture and the callback runs, which in turn passes it along to the boot-value dispatcher, which produces the result. This results in an extremely simple dispatch program, which says that the result should be the first argument in the capture. Since there are no guards, this will always be a valid result.

The second time we encounter the dispatch op, it already has a dispatch program recorded there, so we are in run mode. Turning on a debugging mode in the MoarVM source, we can see the dispatch program that results looks like this:

Dispatch program (1 temporaries)
  Ops:
    Load argument 0 into temporary 0
    Set result object value from temporary 0

That is, it reads argument 0 into a temporary location and then sets that as the result of the dispatch. Notice how there is no mention of the fact that we went through an extra layer of dispatch; those have zero cost in the resulting dispatch program.

Capture manipulation

Argument captures are immutable. Various VM syscalls exist to transform them into new argument captures with some tweak, for example dropping or inserting arguments. Here’s a further example from the test suite:

nqp::dispatch('boot-syscall', 'dispatcher-register', 'drop-first', -> $capture {
    my $capture-derived := nqp::dispatch('boot-syscall', 'dispatcher-drop-arg', $capture, 0);
    nqp::dispatch('boot-syscall', 'dispatcher-delegate', 'boot-value', $capture-derived);
});
ok(nqp::dispatch('drop-first', 'first', 'second') eq 'second',
    'dispatcher-drop-arg works');

This drops the first argument before passing the capture on to the boot-value dispatcher – meaning that it will return the second argument. Glance back at the previous dispatch program for the identity function. Can you guess how this one will look?

Well, here it is:

Dispatch program (1 temporaries)
  Ops:
    Load argument 1 into temporary 0
    Set result string value from temporary 0

Again, while in the record phase of such a dispatcher we really do create capture objects and make a dispatcher delegation, the resulting dispatch program is far simpler.
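One way to picture how little work is left in run mode is a toy interpreter for the two dispatch programs shown so far. The op encoding and the run-program sub are invented for this sketch and do not reflect how MoarVM represents or executes dispatch programs.

```raku
# Hypothetical op encoding:
#   ('load-arg', ARG, TEMP)   load argument ARG into temporary TEMP
#   ('set-result', TEMP)      set the dispatch result from temporary TEMP
sub run-program(@ops, @args) {
    my @temps;
    my $result;
    for @ops -> $op {
        given $op[0] {
            when 'load-arg'   { @temps[$op[2]] = @args[$op[1]] }
            when 'set-result' { $result = @temps[$op[1]] }
        }
    }
    $result
}

# The identity and drop-first programs, transcribed from the dumps above.
my @identity   = ('load-arg', 0, 0), ('set-result', 0);
my @drop-first = ('load-arg', 1, 0), ('set-result', 0);

say run-program(@identity,   [42]);                  # 42
say run-program(@drop-first, ['first', 'second']);   # second
```

Neither program builds a capture or calls a callback; the delegation through boot-value has been reduced to two ops.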

Here’s a slightly more involved example:

my $target := -> $x { $x + 1 }
nqp::dispatch('boot-syscall', 'dispatcher-register', 'call-on-target', -> $capture {
    my $capture-derived := nqp::dispatch('boot-syscall',
            'dispatcher-insert-arg-literal-obj', $capture, 0, $target);
    nqp::dispatch('boot-syscall', 'dispatcher-delegate',
            'boot-code-constant', $capture-derived);
});
sub cot() { nqp::dispatch('call-on-target', 49) }
ok(cot() == 50,
    'dispatcher-insert-arg-literal-obj works at start of capture');
ok(cot() == 50,
    'dispatcher-insert-arg-literal-obj works at start of capture after link too');

Here, we have a closure stored in a variable $target. We insert it as the first argument of the capture, and then delegate to boot-code-constant, which will invoke that code object and pass the other dispatch arguments to it. Once again, at the record phase, we really do something like:

And the resulting dispatch program? It’s this:

Dispatch program (1 temporaries)
  Ops:
    Load collectable constant at index 0 into temporary 0
    Skip first 0 args of incoming capture; callsite from 0
    Invoke MVMCode in temporary 0

That is, load the constant bytecode handle that we’re going to invoke, set up the args (which are in this case equal to those of the incoming capture), and then invoke the bytecode with those arguments. The argument shuffling is, once again, gone. In general, whenever the arguments we do an eventual bytecode invocation with are a tail of the initial dispatch arguments, the arguments transform becomes no more than a pointer addition.

Guards

All of the dispatch programs seen so far have been unconditional: once recorded at a given callsite, they shall always be used. The big missing piece to make such a mechanism have practical utility is guards. Guards assert properties such as the type of an argument or if the argument is definite (Int:D) or not (Int:U).

Here’s a somewhat longer test case, with some explanations placed throughout it.

# A couple of classes for test purposes
my class C1 { }
my class C2 { }

# A counter used to make sure we're only invoking the dispatch callback as
# many times as we expect.
my $count := 0;

# A type-name dispatcher that maps a type into a constant string value that
# is its name. This isn't terribly useful, but it is a decent small example.
nqp::dispatch('boot-syscall', 'dispatcher-register', 'type-name', -> $capture {
    # Bump the counter, just for testing purposes.
    $count++;

    # Obtain the value of the argument from the capture (using an existing
    # MoarVM op, though in the future this may go away in place of a syscall)
    # and then obtain the string typename also.
    my $arg-val := nqp::captureposarg($capture, 0);
    my str $name := $arg-val.HOW.name($arg-val);

    # This outcome is only going to be valid for a particular type. We track
    # the argument (which gives us an object back that we can use to guard
    # it) and then add the type guard.
    my $arg := nqp::dispatch('boot-syscall', 'dispatcher-track-arg', $capture, 0);
    nqp::dispatch('boot-syscall', 'dispatcher-guard-type', $arg);

    # Finally, insert the type name at the start of the capture and then
    # delegate to the boot-constant dispatcher.
    nqp::dispatch('boot-syscall', 'dispatcher-delegate', 'boot-constant',
        nqp::dispatch('boot-syscall', 'dispatcher-insert-arg-literal-str',
            $capture, 0, $name));
});

# A use of the dispatch for the tests. Put into a sub so there's a single
# static dispatch op, which all dispatch programs will hang off.
sub type-name($obj) {
    nqp::dispatch('type-name', $obj)
}

# Check with the first type, making sure the guard matches when it should
# (although this test would pass if the guard were ignored too).
ok(type-name(C1) eq 'C1', 'Dispatcher setting guard works');
ok($count == 1, 'Dispatch callback ran once');
ok(type-name(C1) eq 'C1', 'Can use it another time with the same type');
ok($count == 1, 'Dispatch callback was not run again');

# Test it with a second type, both record and run modes. This ensures the
# guard really is being checked.
ok(type-name(C2) eq 'C2', 'Can handle polymorphic sites when guard fails');
ok($count == 2, 'Dispatch callback ran a second time for new type');
ok(type-name(C2) eq 'C2', 'Second call with new type works');

# Check that we can use it with the original type too, and it has stacked
# the dispatch programs up at the same callsite.
ok(type-name(C1) eq 'C1', 'Call with original type still works');
ok($count == 2, 'Dispatch callback only ran a total of 2 times');

This time two dispatch programs get produced, one for C1:

Dispatch program (1 temporaries)
  Ops:
    Guard arg 0 (type=C1)
    Load collectable constant at index 1 into temporary 0
    Set result string value from temporary 0

And another for C2:

Dispatch program (1 temporaries)
  Ops:
    Guard arg 0 (type=C2)
    Load collectable constant at index 1 into temporary 0
    Set result string value from temporary 0

Once again, no leftovers from capture manipulation, tracking, or dispatcher delegation; the dispatch program does a type guard against an argument, then produces the result string. The whole call to $arg-val.HOW.name($arg-val) is elided, the dispatcher we wrote encoding the knowledge – in a way that the VM can understand – that a type’s name can be considered immutable.
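
To make the record/run split concrete, here is a minimal sketch in Python rather than NQP. It assumes a hypothetical CallSite class standing in for the single static dispatch op, and it models a dispatch program as nothing more than a guarded type plus a constant result. None of these names exist in MoarVM; they are for illustration only.

```python
class C1: pass
class C2: pass

class CallSite:
    """Hypothetical stand-in for one static dispatch op; programs stack up on it."""
    def __init__(self, record_callback):
        self.record_callback = record_callback  # the slow "dispatch callback"
        self.programs = []                      # list of (guarded_type, constant)

    def __call__(self, arg):
        # Run phase: try each stored program; its guard is a cheap type check.
        for guarded_type, constant in self.programs:
            if type(arg) is guarded_type:
                return constant
        # Record phase: no guard matched, so run the callback once and store
        # its outcome behind a guard on the argument's type.
        constant = self.record_callback(arg)
        self.programs.append((type(arg), constant))
        return constant

count = 0

def type_name_callback(arg):
    global count
    count += 1                     # bump the counter, as in the NQP test
    return type(arg).__name__      # the expensive lookup we want to elide

type_name = CallSite(type_name_callback)
results = [type_name(C1()), type_name(C1()),
           type_name(C2()), type_name(C2()), type_name(C1())]
print(results, count)  # ['C1', 'C1', 'C2', 'C2', 'C1'] 2
```

The callback runs only twice across five calls, matching the counter checks in the test above; every other call is answered by a stored guard-and-constant pair.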

This example is a bit contrived, but now consider that we instead look up a method and guard on the invocant type: that’s a method cache! Guard the types of more of the arguments, and we have a multi cache! Do both, and we have a multi-method cache.
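
As a rough illustration of the multi cache idea, the following Python sketch guards on the types of all arguments by using them as a cache key. MultiCache and its candidate list are hypothetical names, and the sketch assumes that candidate selection depends on argument types alone (so no where clauses).

```python
class MultiCache:
    """Hypothetical multi-dispatch cache keyed on the tuple of argument types."""
    def __init__(self, candidates):
        self.candidates = candidates   # (type tuple, function), in priority order
        self.cache = {}
        self.slow_lookups = 0

    def __call__(self, *args):
        key = tuple(type(a) for a in args)     # the "guards" on every argument
        target = self.cache.get(key)
        if target is None:
            self.slow_lookups += 1
            target = self._slow_resolve(args)  # full candidate search, done once
            self.cache[key] = target
        return target(*args)

    def _slow_resolve(self, args):
        for types, fn in self.candidates:
            if len(types) == len(args) and all(
                    isinstance(a, t) for a, t in zip(args, types)):
                return fn
        raise TypeError("no matching candidate")

add = MultiCache([
    ((int, int), lambda a, b: "int+int"),
    ((object, object), lambda a, b: "any+any"),
])
outcomes = [add(1, 2), add(3, 4), add("a", 2), add("b", 5)]
print(outcomes, add.slow_lookups)  # ['int+int', 'int+int', 'any+any', 'any+any'] 2
```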

The latter is interesting in so far as both the method dispatch and the multi dispatch want to guard on the invocant. In fact, in MoarVM today there will be two such type tests until we get to the point where the specializer does its work and eliminates these duplicated guards. However, the new dispatcher does not treat the dispatcher-guard-type as a kind of imperative operation that writes a guard into the resultant dispatch program. Instead, it declares that the argument in question must be guarded. If some other dispatcher already did that, it’s idempotent. The guards are emitted once all dispatch programs we delegate through, on the path to a final outcome, have had their say.
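
The declarative nature of guards can be sketched as follows. This is a Python model, not the MoarVM implementation: the Recording class and the two dispatcher functions are hypothetical, and guards are simply collected in an insertion-ordered set and emitted at the end.

```python
class Recording:
    """Hypothetical record of one dispatch; guards are declared, not emitted."""
    def __init__(self):
        self._guards = {}   # a dict used as an insertion-ordered set

    def guard_type(self, arg_index, type_):
        # Declaring the same guard twice changes nothing (idempotent).
        self._guards[("type", arg_index, type_)] = True

    def emit(self):
        # Guards are only written out once every dispatcher has had its say.
        return [f"Guard arg {i} (type={t.__name__})" for _, i, t in self._guards]

def multi_dispatcher(rec, args):
    for i, arg in enumerate(args):
        rec.guard_type(i, type(arg))      # guards the invocant again

def method_dispatcher(rec, args):
    rec.guard_type(0, type(args[0]))      # guard the invocant
    multi_dispatcher(rec, args)           # then delegate

rec = Recording()
method_dispatcher(rec, ("x", 1))
print(rec.emit())  # ['Guard arg 0 (type=str)', 'Guard arg 1 (type=int)']
```

Although both dispatchers ask for a type guard on argument 0, only one such guard appears in the emitted list.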

Fun aside: those being especially attentive will have noticed that the dispatch mechanism is used as part of implementing new dispatchers too, and indeed, this ultimately will mean that the specializer can specialize the dispatchers and have them JIT-compiled into something more efficient too. After all, from the perspective of MoarVM, it’s all just bytecode to run; it’s just that some of it is bytecode that tells the VM how to execute Raku programs more efficiently!

Dispatch resumption

A resumable dispatcher needs to do two things:

  1. Provide a resume callback as well as a dispatch one when registering the dispatcher
  2. In the dispatch callback, specify a capture, which will form the resume initialization state

When a resumption happens, the resume callback will be called, with any arguments for the resumption. It can also obtain the resume initialization state that was set in the dispatch callback. The resume initialization state contains the things needed in order to continue with the dispatch the first time it is resumed. We’ll take a look at how this works for method dispatch to see a concrete example. I’ll also, at this point, switch to looking at the real Rakudo dispatchers, rather than simplified test cases.

The Rakudo dispatchers take advantage of delegation, duplicate guards, and capture manipulations all having no runtime cost in the resulting dispatch program to, in my mind at least, quite nicely factor what is a somewhat involved dispatch process. There are multiple entry points to method dispatch: the normal boring $obj.meth(), the qualified $obj.Type::meth(), and the call me maybe $obj.?meth(). These have common resumption semantics – or at least, they can be made to, provided we always carry a starting type in the resume initialization state, which is the type of the object that we do the method dispatch on.

Here is the entry point to dispatch for a normal method dispatch, with the boring details of reporting missing method errors stripped out.

# A standard method call of the form $obj.meth($arg); also used for the
# indirect form $obj."$name"($arg). It receives the decontainerized invocant,
# the method name, and the args (starting with the invocant including any
# container).
nqp::dispatch('boot-syscall', 'dispatcher-register', 'raku-meth-call', -> $capture {
    # Try to resolve the method call using the MOP.
    my $obj := nqp::captureposarg($capture, 0);
    my str $name := nqp::captureposarg_s($capture, 1);
    my $meth := $obj.HOW.find_method($obj, $name);

    # Report an error if there is no such method.
    unless nqp::isconcrete($meth) {
        !!! 'Error reporting logic elided for brevity';
    }

    # Establish a guard on the invocant type and method name (however the name
    # may well be a literal, in which case this is free).
    nqp::dispatch('boot-syscall', 'dispatcher-guard-type',
        nqp::dispatch('boot-syscall', 'dispatcher-track-arg', $capture, 0));
    nqp::dispatch('boot-syscall', 'dispatcher-guard-literal',
        nqp::dispatch('boot-syscall', 'dispatcher-track-arg', $capture, 1));

    # Add the resolved method and delegate to the resolved method dispatcher.
    my $capture-delegate := nqp::dispatch('boot-syscall',
        'dispatcher-insert-arg-literal-obj', $capture, 0, $meth);
    nqp::dispatch('boot-syscall', 'dispatcher-delegate',
        'raku-meth-call-resolved', $capture-delegate);
});

Now for the resolved method dispatcher, which is where the resumption is handled. First, let’s look at the normal dispatch callback (the resumption callback is included but empty; I’ll show it a little later).

# Resolved method call dispatcher. This is used to call a method, once we have
# already resolved it to a callee. Its first arg is the callee, the second and
# third are the type and name (used in deferral), and the rest are the args to
# the method.
nqp::dispatch('boot-syscall', 'dispatcher-register', 'raku-meth-call-resolved',
    # Initial dispatch
    -> $capture {
        # Save dispatch state for resumption. We don't need the method that will
        # be called now, so drop it.
        my $resume-capture := nqp::dispatch('boot-syscall', 'dispatcher-drop-arg',
            $capture, 0);
        nqp::dispatch('boot-syscall', 'dispatcher-set-resume-init-args', $resume-capture);

        # Drop the dispatch start type and name, and delegate to multi-dispatch or
        # just invoke if it's single dispatch.
        my $delegate_capture := nqp::dispatch('boot-syscall', 'dispatcher-drop-arg',
            nqp::dispatch('boot-syscall', 'dispatcher-drop-arg', $capture, 1), 1);
        my $method := nqp::captureposarg($delegate_capture, 0);
        if nqp::istype($method, Routine) && $method.is_dispatcher {
            nqp::dispatch('boot-syscall', 'dispatcher-delegate', 'raku-multi', $delegate_capture);
        }
        else {
            nqp::dispatch('boot-syscall', 'dispatcher-delegate', 'raku-invoke', $delegate_capture);
        }
    },
    # Resumption
    -> $capture {
        ... 'Will be shown later';
    });

There’s an arguable cheat in raku-meth-call: it doesn’t actually insert the type object of the invocant in place of the invocant. It turns out that it doesn’t really matter. Otherwise, I think the comments (which are to be found in the real implementation also) tell the story pretty well.

One important point that may not be clear – but follows a repeating theme – is that the setting of the resume initialization state is also more of a declarative rather than an imperative thing: there isn’t a runtime cost at the time of the dispatch, but rather we keep enough information around in order to be able to reconstruct the resume initialization state at the point we need it. (In fact, when we are in the run phase of a resume, we don’t even have to reconstruct it in the sense of creating a capture object.)

Now for the resumption. I’m going to present a heavily stripped down version that only deals with the callsame semantics (the full thing has to deal with such delights as lastcall and nextcallee too). The resume initialization state exists to seed the resumption process. Once we know we actually do have to deal with resumption, we can do things like calculating the full list of methods in the inheritance graph that we want to walk through. Each resumable dispatcher gets a single storage slot on the call stack that it can use for its state. It can initialize this in the first step of resumption, and then update it as we go. Or more precisely, it can set up a dispatch program that will do this when run.

A linked list turns out to be a very convenient data structure for the chain of candidates we will walk through. We can work our way through a linked list by keeping track of the current node, meaning that there need only be a single thing that mutates, which is the current state of the dispatch. The dispatch program mechanism also provides a way to read an attribute from an object, and that is enough to express traversing a linked list into the dispatch program. This also means zero allocations.

So, without further ado, here is the linked list (rather less pretty in NQP, the restricted Raku subset, than it would be in full Raku):

# A linked list is used to model the state of a dispatch that is deferring
# through a set of methods, multi candidates, or wrappers. The Exhausted class
# is used as a sentinel for the end of the chain. The current state of the
# dispatch points into the linked list at the appropriate point; the chain
# itself is immutable, and shared over (runtime) dispatches.
my class DeferralChain {
    has $!code;
    has $!next;
    method new($code, $next) {
        my $obj := nqp::create(self);
        nqp::bindattr($obj, DeferralChain, '$!code', $code);
        nqp::bindattr($obj, DeferralChain, '$!next', $next);
        $obj
    }
    method code() { $!code }
    method next() { $!next }
};
my class Exhausted {};
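
For readers less familiar with NQP, the same structure can be sketched in Python. This is an illustrative translation under my own naming (build_chain and walk are hypothetical helpers, not part of Rakudo); it shows that the chain is immutable and shared, and that only a cursor into it changes.

```python
from dataclasses import dataclass

class Exhausted:
    """Sentinel type marking the end of the chain."""

EXHAUSTED = Exhausted()

@dataclass(frozen=True)          # frozen: nodes cannot be mutated once built
class DeferralChain:
    code: object                 # the candidate to call at this point
    next: object                 # the following node, or EXHAUSTED

def build_chain(candidates):
    # Build back to front, so the first candidate ends up at the head.
    chain = EXHAUSTED
    for code in reversed(candidates):
        chain = DeferralChain(code, chain)
    return chain

def walk(state):
    # The only thing that changes is the cursor `state`; no nodes are
    # allocated or modified while traversing.
    while state is not EXHAUSTED:
        yield state.code
        state = state.next

shared = build_chain(["m2", "m3", "m4"])
first_pass = list(walk(shared))
second_pass = list(walk(shared))   # the same chain serves another dispatch
print(first_pass)  # ['m2', 'm3', 'm4']
```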

And finally, the resumption handling.

nqp::dispatch('boot-syscall', 'dispatcher-register', 'raku-meth-call-resolved',
    # Initial dispatch
    -> $capture {
        ... 'Presented earlier';
    },
    # Resumption. The resume init capture's first two arguments are the type
    # that we initially did a method dispatch against and the method name
    # respectively.
    -> $capture {
        # Work out the next method to call, if any. This depends on if we have
        # an existing dispatch state (that is, a method deferral is already in
        # progress).
        my $init := nqp::dispatch('boot-syscall', 'dispatcher-get-resume-init-args');
        my $state := nqp::dispatch('boot-syscall', 'dispatcher-get-resume-state');
        my $next_method;
        if nqp::isnull($state) {
            # No state, so just starting the resumption. Guard on the
            # invocant type and name.
            my $track_start_type := nqp::dispatch('boot-syscall', 'dispatcher-track-arg', $init, 0);
            nqp::dispatch('boot-syscall', 'dispatcher-guard-type', $track_start_type);
            my $track_name := nqp::dispatch('boot-syscall', 'dispatcher-track-arg', $init, 1);
            nqp::dispatch('boot-syscall', 'dispatcher-guard-literal', $track_name);

            # Also guard on there being no dispatch state.
            my $track_state := nqp::dispatch('boot-syscall', 'dispatcher-track-resume-state');
            nqp::dispatch('boot-syscall', 'dispatcher-guard-literal', $track_state);

            # Build up the list of methods to defer through.
            my $start_type := nqp::captureposarg($init, 0);
            my str $name := nqp::captureposarg_s($init, 1);
            my @mro := nqp::can($start_type.HOW, 'mro_unhidden')
                ?? $start_type.HOW.mro_unhidden($start_type)
                !! $start_type.HOW.mro($start_type);
            my @methods;
            for @mro {
                my %mt := nqp::hllize($_.HOW.method_table($_));
                if nqp::existskey(%mt, $name) {
                    @methods.push(%mt{$name});
                }
            }

            # If there's nothing to defer to, we'll evaluate to Nil (just don't set
            # the next method, and it happens below).
            if nqp::elems(@methods) >= 2 {
                # We can defer. Populate next method.
                @methods.shift; # Discard the first one, which we initially called
                $next_method := @methods.shift; # The immediate next one

                # Build chain of further methods and set it as the state.
                my $chain := Exhausted;
                while @methods {
                    $chain := DeferralChain.new(@methods.pop, $chain);
                }
                nqp::dispatch('boot-syscall', 'dispatcher-set-resume-state-literal', $chain);
            }
        }
        elsif !nqp::istype($state, Exhausted) {
            # Already working through a chain of method deferrals. Obtain
            # the tracking object for the dispatch state, and guard against
            # the next code object to run.
            my $track_state := nqp::dispatch('boot-syscall', 'dispatcher-track-resume-state');
            my $track_method := nqp::dispatch('boot-syscall', 'dispatcher-track-attr',
                $track_state, DeferralChain, '$!code');
            nqp::dispatch('boot-syscall', 'dispatcher-guard-literal', $track_method);

            # Update dispatch state to point to next method.
            my $track_next := nqp::dispatch('boot-syscall', 'dispatcher-track-attr',
                $track_state, DeferralChain, '$!next');
            nqp::dispatch('boot-syscall', 'dispatcher-set-resume-state', $track_next);

            # Set next method, which we shall defer to.
            $next_method := $state.code;
        }
        else {
            # Dispatch already exhausted; guard on that and fall through to returning
            # Nil.
            my $track_state := nqp::dispatch('boot-syscall', 'dispatcher-track-resume-state');
            nqp::dispatch('boot-syscall', 'dispatcher-guard-literal', $track_state);
        }

        # If we found a next method...
        if nqp::isconcrete($next_method) {
            # Call with same (that is, original) arguments. Invoke with those.
            # We drop the first two arguments (which are only there for the
            # resumption), add the code object to invoke, and then leave it
            # to the invoke or multi dispatcher.
            my $just_args := nqp::dispatch('boot-syscall', 'dispatcher-drop-arg',
                nqp::dispatch('boot-syscall', 'dispatcher-drop-arg', $init, 0),
                0);
            my $delegate_capture := nqp::dispatch('boot-syscall',
                'dispatcher-insert-arg-literal-obj', $just_args, 0, $next_method);
            if nqp::istype($next_method, Routine) && $next_method.is_dispatcher {
                nqp::dispatch('boot-syscall', 'dispatcher-delegate', 'raku-multi',
                        $delegate_capture);
            }
            else {
                nqp::dispatch('boot-syscall', 'dispatcher-delegate', 'raku-invoke',
                        $delegate_capture);
            }
        }
        else {
            # No method, so evaluate to Nil (boot-constant disregards all but
            # the first argument).
            nqp::dispatch('boot-syscall', 'dispatcher-delegate', 'boot-constant',
                nqp::dispatch('boot-syscall', 'dispatcher-insert-arg-literal-obj',
                    $capture, 0, Nil));
        }
    });

That’s quite a bit to take in, and quite a bit of code. Remember, however, that this is only run for the record phase of a dispatch resumption. It also produces a dispatch program at the callsite of the callsame, with the usual guards and outcome. Implicit guards are created for the dispatcher that we are resuming at that point. In the most common case this will end up monomorphic or bimorphic, although situations involving nestings of multiple dispatch or method dispatch could produce a more morphic callsite.
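
The step that builds the list of methods to defer through can be illustrated with Python's own method resolution order. This is only an analogy: resume_plan is a hypothetical helper, Python's __mro__ stands in for the Raku MOP's mro, and vars(cls) stands in for the method table.

```python
class Base:
    def greet(self):
        return "Base"

class Mid(Base):
    def greet(self):
        return "Mid"

class Leaf(Mid):
    pass  # no greet of its own; the initial call resolves to Mid.greet

def resume_plan(start_type, name):
    """Return (next method to call, remaining candidates) for a first resumption."""
    # Walk the MRO and collect every class-local method with this name.
    methods = [vars(cls)[name] for cls in start_type.__mro__ if name in vars(cls)]
    if len(methods) < 2:
        return None, []          # nothing to defer to
    _initially_called, next_method, *rest = methods
    return next_method, rest

next_method, rest = resume_plan(Leaf, "greet")
deferred_result = next_method(Leaf())
print(deferred_result, rest)     # Base []
print(resume_plan(Base, "greet"))  # (None, [])
```

As in the NQP code, the first entry is discarded because it is the method that was called initially, the second becomes the immediate target, and whatever remains would seed the deferral chain.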

The design I’ve picked forces resume callbacks to deal with two situations: the first resumption and the later resumptions. This is not ideal in a couple of ways:

  1. It’s a bit inconvenient for those writing dispatch resume callbacks. However, it’s not like this is a particularly common activity!
  2. The difference results in two dispatch programs being stacked up at a callsite that might otherwise get just one

Only the second of these really matters. The reason for the non-uniformity is to make sure that the overwhelming majority of calls, which never lead to a dispatch resumption, incur no per-dispatch cost for a feature that they never end up using. If the result is a little more cost for those using the feature, so be it. In fact, early benchmarking shows callsame with wrap and method calls seems to be up to 10 times faster using the new dispatcher than in current Rakudo, and that’s before the specializer understands enough about it to improve things further!

What’s done so far

Everything I’ve discussed above is implemented, except that I may have given the impression somewhere that multiple dispatch is fully implemented using the new dispatcher, and that is not the case yet (no handling of where clauses and no dispatch resumption support).

Next steps

Getting the missing bits of multiple dispatch fully implemented is the obvious next step. The other missing semantic piece is support for callwith and nextwith, where we wish to change the arguments that are being used when moving to the next candidate. A few other minor bits aside, that in theory will get all of the Raku dispatch semantics at least supported.

Currently, all standard method calls ($obj.meth()) and other calls (foo() and $foo()) go via the existing dispatch mechanism, not the new dispatcher. Those will need to be migrated to use the new dispatcher also, and any bugs that are uncovered will need fixing. That will get things to the point where the new dispatcher is semantically ready.

After that comes performance work: making sure that the specializer is able to deal with dispatch program guards and outcomes. The goal, initially, is to get steady state performance of common calling forms to perform at least as well as in the current master branch of Rakudo. It’s already clear enough there will be some big wins for some things that to date have been glacial, but it should not come at the cost of regression on the most common kinds of dispatch, which have received plenty of optimization effort before now.

Furthermore, NQP – the restricted form of Raku that the Rakudo compiler and other bits of the runtime guts are written in – also needs to be migrated to use the new dispatcher. Only when that is done will it be possible to rip out the current method cache, multiple dispatch cache, and so forth from MoarVM.

An open question is how to deal with backends other than MoarVM. Ideally, the new dispatch mechanism will be ported to those. A decent amount of it should be possible to express in terms of the JVM’s invokedynamic (and this would all probably play quite well with a Truffle-based Raku implementation, although I’m not sure there is a current active effort in that area).

Future opportunities

While my current focus is to ship a Rakudo and MoarVM release that uses the new dispatcher mechanism, that won’t be the end of the journey. Some immediate ideas:

Some new language features may also be possible to provide in an efficient way with the help of the new dispatch mechanism. For example, there’s currently not a reliable way to try to invoke a piece of code, just run it if the signature binds, or to do something else if it doesn’t. Instead, things like the Cro router have to first do a trial bind of the signature, and then do the invoke, which makes routing rather more costly. There’s also the long suggested idea of providing pattern matching via signatures with the when construct (for example, when * -> ($x) {}; when * -> ($x, *@tail) { }), which is pretty much the same need, just in a less dynamic setting.
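
The cost of the "trial bind, then invoke" pattern can be seen in a small Python analogy. This is not how Cro works internally; try_invoke and route are hypothetical helpers, and inspect.signature(...).bind plays the role of the trial bind.

```python
import inspect

def try_invoke(fn, *args, **kwargs):
    """Call fn only if the arguments bind to its signature."""
    try:
        inspect.signature(fn).bind(*args, **kwargs)   # trial bind
    except TypeError:
        return False, None
    # The arguments are bound a second time by the real call: that is the
    # duplicated work the paragraph above describes.
    return True, fn(*args, **kwargs)

def route(handlers, *args):
    for handler in handlers:
        matched, result = try_invoke(handler, *args)
        if matched:
            return result
    return "no match"

def one(x):
    return f"one: {x}"

def head_and_tail(x, *tail):
    return f"head {x}, tail {list(tail)}"

handlers = [one, head_and_tail]
print(route(handlers, 1))        # one: 1
print(route(handlers, 1, 2, 3))  # head 1, tail [2, 3]
print(route(handlers))           # no match
```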

In closing…

Working on the new dispatch mechanism has been a longer journey than I first expected. The resumption part of the design was especially challenging, and there are still a few important details to attend to there. Something like four potential approaches were discarded along the way (although elements of all of them influenced what I’ve described in this post). Abstractions that hold up are really, really hard.

I also ended up having to take a couple of months away from doing Raku work at all, felt a bit crushed during some others, and have been juggling this with the equally important RakuAST project (which will be simplified by being able to assume the presence of the new dispatcher, and also offers me a range of softer Raku hacking tasks, whereas the dispatcher work offers few easy pickings).

Given all that, I’m glad to finally be seeing the light at the end of the tunnel. The work that remains is enumerable, and the day we ship a Rakudo and MoarVM release using the new dispatcher feels a small number of months away (and I hope writing that is not tempting fate!)

The new dispatcher is probably the most significant change to MoarVM since I founded it, in so far as it sees us removing a bunch of things that have been there pretty much since the start. RakuAST will also deliver the greatest architectural change to the Rakudo compiler in a decade. Both are an opportunity to fold years of learning things the hard way into the runtime and compiler. I hope when I look back at it all in another decade’s time, I’ll at least feel I made more interesting mistakes this time around.

Why bother with Scripting?

Published by Bart Wiegmans on 2021-03-14T14:33:00

Many years back, Larry Wall shared his thesis on the nature of scripting. Since recently even Java gained 'script' support I thought it would be fitting to revisit the topic, and hopefully relevant to the perl and raku language community.

The weakness of Larry's treatment (which, to be fair to the author, I think is more intended to be enlightening than to be complete) is the contrast of scripting with programming. This contrast does not permit a clear separation because scripts are programs. That is to say, no matter how long or short, scripts are written commands for a machine to execute, and I think that's a pretty decent definition of a program in general.

A more useful contrast - and, I think, the intended one - is between scripts and other sorts of programs, because that allows us to compare scripting (writing scripts) with 'programming' (writing non-script programs). And to do that we need to know what other sorts of programs there are.

The short version of that answer is - systems and applications, and a bunch of other things that aren't really relevant to the working programmer, like (embedded) control algorithms, spreadsheets and database queries. (The definition I provided above is very broad, by design, because I don't want to get stuck on boundary questions). Most programmers write applications, some write systems, virtually all write scripts once in a while, though plenty of people who aren't professional programmers also write scripts.

I think the defining features of applications and systems are, respectively:

Consider for instance a mail client (like thunderbird) in comparison to a mailer daemon (like sendmail) - one provides an interface to read and write e-mails (the model) and the other provides functionality to send that e-mail to other servers.

Note that under this (again, broad) definition, libraries are also system software, which makes sense, considering that their users are developers (just as for, say, PostgreSQL) who care about things like performance, reliability, and correctness. Incidentally, libraries as well as 'typical' system software (such as database engines and operating system kernels) tend to be written in languages like C and C++ for much the same reasons.

What then, are the differences between scripts, applications, and systems? I think the following is a good list:

Obviously these distinctions aren't really binary - 'short' versus 'long', 'ad-hoc' versus 'general purpose'  - and can't be used to conclusively settle the question whether something is a script or an application. (If, indeed, that question ever comes up). More important is that for the 10 or so scripts I've written over the past year - some professionally, some not - all or most of these properties held, and I'd be surprised if the same isn't true for most readers. 

And - finally coming to the point that I'm trying to make today - these features point to a specific niche of programs more than to a specific technology (or set of technologies). To be exact, scripts are (mostly) short, custom programs to automate ad-hoc tasks, tasks that are either too specific or too small to develop and distribute another program for.

This has further implications on the preferred features of a scripting language (taken to mean, a language designed to enable the development of scripts). In particular:

As an example of the last point - Python 3 requires users to be exact about the encoding of their input, causing all sorts of trouble for unsuspecting scripters when they accidentally try to read ISO-8859-1 data as UTF-8, or vice versa. Python 2 did not, and for most scripts - not applications - I actually think that is the right choice.
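
A short Python 3 session shows both failure modes, using Latin-1 (ISO-8859-1) as the mismatched encoding. The sample string is arbitrary.

```python
text = "café"
latin1_bytes = text.encode("iso-8859-1")   # b'caf\xe9'
utf8_bytes = text.encode("utf-8")          # b'caf\xc3\xa9'

# Latin-1 data read as UTF-8: Python 3 refuses outright.
try:
    latin1_bytes.decode("utf-8")
    outcome = "decoded"
except UnicodeDecodeError:
    outcome = "UnicodeDecodeError"

# UTF-8 data read as Latin-1: no error, but the text is silently garbled.
mojibake = utf8_bytes.decode("iso-8859-1")

print(outcome)    # UnicodeDecodeError
print(mojibake)   # cafÃ©
```

The first direction stops a script dead, while the second quietly corrupts the data, which is the trade-off between strictness and convenience that the paragraph above refers to.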

This niche doesn't always exist. In computing environments where everything of interest is adequately captured by an application, or which lack the ability to effectively automate ad-hoc tasks (I'm thinking in particular of Windows before PowerShell), the practice of scripting tends to not develop. Similarly, in a modern 'cloud' environment, where system setup is controlled by a state machine hosted by another organization, scripting doesn't really have much of a future.

To put it another way, scripting only thrives in an environment that has a lot of 'scriptable' tasks; meaning tasks for which there isn't already a pre-made solution available, environments that have powerful facilities available for a script to access, and whose users are empowered to automate those tasks. Such qualities are common on Unix/Linux 'workstations' but rather less so on smartphones and (as noted before) cloud computing environments.

Truth be told I'm a little worried about that development. I could point to, and expound on, the development and popularity of languages like go and rust, which aren't exactly scripting languages, or the replacement of Javascript with TypeScript, to make the point further, but I don't think that's necessary. At the same time I could point to the development of data science as a discipline to demonstrate that scripting is alive and well (and indeed perhaps more economically relevant than before).

What should be the conclusion for perl 5/7 and raku? I'm not quite sure, mostly because I'm not quite sure whether the broader perl/raku community would prefer their sister languages to be scripting or application languages. (As implied above, I think the Python community chose that they wanted Python 3 to be an application language, and this was not without consequences to their users). 

Raku adds a number of features common to application languages (I'm thinking of its powerful type system in particular), continuing a trend that perl 5 arguably pioneered. This is indeed a very powerful strategy - a language can be introduced for scripts and some of those scripts are then extended into applications (or even systems), thereby ensuring its continued usage. But for it to work, a new perl family language must be introduced on its scripting merits, and there must be a plentiful supply of scriptable tasks to automate, some of which - or a combination of which - grow into an application.

For myself, I would like to see scripting have a bright future. Not just because scripting is the most accessible form of programming, but also because an environment that permits, even requires scripting, is one where not all interesting problems have been solved, one where its users ask it to do tasks so diverse that there isn't an app for that, yet. One where the true potential of the wonderful devices that surround us can be explored.

In such a world there might well be a bright future for scripting.

Fixing Raku Unicode display issues in Windows

Published by 5ab5traction5 on 2020-12-26T19:07:50

I've been using Windows 10 for a while as I wait to install a new m2 SSD in this laptop to provide a dedicated place for Linux. I've noticed some very strange and disappointing issues with Unicode characters when running Raku from a terminal.

Thanks to #raku on Freenode, I managed to find a solution:

chcp 65001

This changes the active console code page to 65001 (UTF-8) and magically fixes the issues I was seeing.

UPDATE

To make the change more permanent, it is possible to change some registry key values under HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Nls\CodePage. Modify ACP, MACCP, and OEMCP all to value 65001, give the OS a reboot, et voila!

Thanks to the ever-present raiph for his reddit comment which pointed me to a Stack Overflow question from a user facing the same problem, which in turn pointed to the solution provided for a question from a C# programmer.

Masks for array access in APL, and other sundries

Published by 5ab5traction5 on 2020-12-04T16:37:52

As predicted, the APL Orchard did quickly provide alternative syntax for some of my expressions from Wedding Thanksgiving-Versary.

Array access without ⍸

The first suggested change came from Adam and it involves changing:

Years ← yr[⍸ {⍵>1977} yr]

into:

Years ← yr/⍨yr>1977 ⍝ Equivalent to (yr>1977)/yr

This was a well-deserved reminder that replicate (/) is more ambiguous than many APL symbols -- it can be either a function or an operator.

I had completely forgotten about this behavioral variation of /!

Essentially what happens is that we pass a series of 0s and 1s to the replicate function. The number of elements of this series must be the same as the number of elements in the structure we are accessing. 0 means "leave the corresponding element out of the result". 1 means "include one copy of the corresponding element in the result".

When used in this way, the term compress is generally used. If we were to provide an integer greater than 1, the corresponding element would appear that many times in the result. (There is also behavior for negative integers that varies across APL implementations.)
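
For readers who do not speak APL, the behaviour of dyadic / can be sketched in Python. The replicate function below is a hypothetical helper, not a library call, and it simply drops elements for negative counts rather than imitating any particular APL implementation.

```python
def replicate(counts, items):
    """Repeat each item by its matching count (0 drops it, 1 keeps one copy)."""
    if len(counts) != len(items):
        raise ValueError("counts and items must have the same length")
    return [item for n, item in zip(counts, items) for _ in range(n)]

years = list(range(1975, 1981))            # 1975 .. 1980
mask = [int(y > 1977) for y in years]      # plays the role of yr>1977
compressed = replicate(mask, years)        # like (yr>1977)/yr
repeated = replicate([2, 0, 1], ["a", "b", "c"])

print(mask)        # [0, 0, 0, 1, 1, 1]
print(compressed)  # [1978, 1979, 1980]
print(repeated)    # ['a', 'a', 'c']
```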

Some background on operators vs functions in APL

Though / is often used as a monadic operator in APL, it also exists as a dyadic function.

The distinction between operator and function in APL comes from its origins as both a very early programming language (thus, before operator became a synonym for "built in function represented by a symbol") and as a mathematical notation (they are operators in the Heaviside sense of the word).

That is to say, an APL operator does nothing on its own but is rather combined with a function in order to provide a modified version of that function. (J, an APL descendant, moved to calling operators adverbs and functions verbs, a distinction that may help to clarify).

A quick example of / in operator form, where it is called reduce rather than replicate:

+/ 3 4 5   ⍝ 12

This could be read aloud as "reduce the vector 3, 4, 5 with the addition function", that is, insert + between its elements.
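
In Python terms, an APL operator is a higher-order function: it takes a function and hands back a derived one. The sketch below models / in its operator role with a hypothetical helper named slash. It folds from the right, because APL evaluates a reduction right to left.

```python
from functools import reduce
import operator

def slash(fn):
    """Model of / as an operator: derive a reducing function from fn."""
    def derived(vector):
        # Right-to-left fold: a f (b f c), as in APL.
        return reduce(lambda acc, x: fn(x, acc), reversed(vector))
    return derived

plus_reduce = slash(operator.add)    # the derived function +/
minus_reduce = slash(operator.sub)   # the derived function -/

print(plus_reduce([3, 4, 5]))    # 12
print(minus_reduce([3, 4, 5]))   # 4, that is 3 - (4 - 5)
```

Note that slash on its own computes nothing; only the derived functions do, which mirrors the adverb and verb terminology that J adopted.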

Reverse it

The ⍨ in the expression yr/⍨yr>1977 is useful in that it removes the need for parentheses in the expression. It also allows for an arguably more familiar "array access" syntax where the structure being accessed is on the left of the expression determining what pieces to access.

Using to avoid parentheses is a common idiom that nevertheless continues to trip me up. I'm therefore quite thankful to have this simple example to reflect upon.

Dyalog dates are a bit magical

Another suggestion by the ever-helpful Adam was that I could avoid additional computation by using the Dyalog date format directly, rather than converting it into the human-readable day of the week form.

AnnDates/⍨5=7|1⎕DT AnnDates

Here we again see the same pattern with replicate and reverse. Looks like we will have to integrate this into our recognition capacities sooner rather than later!

The associated expression 5=7|1⎕DT AnnDates uses residue (dyadic |) to get the remainder of division by 7. When that equals 5, we are looking at a Thursday!

Easy ranges with dfns.to

Though only included in a footnote, I mentioned that the range 1978..2050 could be constructed via dfns.iotag.

It turns out that dfns.to is a better fit for building a simple range like the one I require.

That said, I will once again be sticking to the vanilla approach in the updated program.

Prettier output with ↑

Lastly, the output can be improved by using mix (monadic ↑). Compared to what I used previously, which outputs a 1-column matrix of 4-element vectors, ↑ outputs a 4-column matrix.

The revised program

Utilizing the above changes, the improved program now looks like:

⎕IO ← 0    ⍝ index origin of 0 means that the nth anniversary in our output table reads as expected
yr ← ⍳2050 ⋄ Years ← yr/⍨yr>1977
AnnDates ← Years ∘., ⊂11 26     ⍝ or Years,¨⊂11 26
AnnOnThursday ← 5=7|1⎕DT AnnDates
ThanksAnnDates ← AnnOnThursday/AnnDates
↑ ThanksAnnDates ,¨ ⍸AnnOnThursday

with the output:

1982 11 26  4
1993 11 26 15
1999 11 26 21
2004 11 26 26
2010 11 26 32
2021 11 26 43
2027 11 26 49
2032 11 26 54
2038 11 26 60
2049 11 26 71

Only one call to each (¨), and it's only in our display code! Thanks again Adam and the whole APL Orchard crew.

Update: That final each is removable by using a synonym! The expression:

↑ ThanksAnnDates ,¨ ⍸AnnOnThursday

is equivalent to:

(↑ThanksAnnDates),⍸AnnOnThursday

Taking a break from Raku core development

Published by jnthnwrthngtn on 2020-10-05T19:44:26

I’d like to thank everyone who voted for me in the recent Raku Steering Council elections. By this point, I’ve been working on the language for well over a decade, first to help turn a language design I found fascinating into a working implementation, and since the Christmas release to make that implementation more robust and performant. Overall, it’s been as fun as it has been challenging – in large part because I’ve found myself sharing the journey with a lot of really great people. I’ve also tried to do my bit to keep the community around the language kind and considerate. Receiving a vote from around 90% of those who participated in the Steering Council elections was humbling.

Alas, I’ve today submitted my resignation to the Steering Council, on personal health grounds. For the same reason, I’ll be taking a step back from Raku core development (Raku, MoarVM, language design, etc.). Please don’t worry too much; I’ll almost certainly be fine. It may be I’m ready to continue working on Raku things in a month or two. It may also be longer. Either way, I think Raku will be better off with a fully sized Steering Council in place, and I’ll be better off without the anxiety that I’m holding a role that I’m not in a place to fulfill.

Reverse Linear Scan Allocation is probably a good idea

Published by Bart Wiegmans on 2019-03-21T15:52:00

Hi hackers! First of all, I want to thank everybody who gave such useful feedback on my last post. For instance, I found out that the similarity between the expression JIT IR and the Testarossa Trees IR is quite remarkable, and that they have a fix for the problem that is quite different from what I had in mind.

Today I want to write something about register allocation, however. Register allocation is probably not my favorite problem, on account of being both messy and thankless. It is a messy problem because - aside from being NP-hard to solve optimally - hardware instruction sets and software ABIs introduce all sorts of annoying constraints. And it is a thankless problem because the cases in which a good register allocator is useful - for instance, when there are lots of intermediate values used over a long stretch of code - are fairly rare. Much more common are the cases in which either there are trivially sufficient registers, or ABI constraints force a spill to memory anyway (e.g. when calling a function, almost all registers can be overwritten).

So, on account of this being not my favorite problem, and also because I promised to implement optimizations in the register allocator, I've been researching if there is a way to do better. And what better place to look than one of the fastest dynamic language implementations around, LuaJIT? So that's what I did, and this post is about what I learned from that.

Truth be told, LuaJIT is not at all a learners' codebase (and I don't think its author would claim this). It uses a rather terse style of C and lots and lots of preprocessor macros. I had somewhat gotten used to the style from hacking dynasm though, so that wasn't so bad. What was more surprising is that some of the steps in code generation that are distinct and separate in the MoarVM JIT - instruction selection, register allocation and emitting bytecode - were all blended together in LuaJIT. Over multiple backend architectures, too. And what's more - all these steps were done in reverse order - from the end of the program (trace) to the beginning. Now that's interesting...

I have no intention of combining all phases of code generation like LuaJIT has. But processing the IR in reverse seems to have some interesting properties. To understand why that is, I'll first have to explain how linear scan allocation currently works in MoarVM, and is most commonly described:

  1. First, the live ranges of program values are computed. Like the name indicates, these represent the range of the program code in which a value is both defined and may be used. Note that for the purpose of register allocation, the notion of a value shifts somewhat. In the expression DAG IR, a value is the result of a single computation. But for the purposes of register allocation, a value includes all its copies, as well as values computed from different conditional branches. This is necessary because when we actually start allocating registers, we need to know when a value is no longer in use (so we can reuse the register) and how long a value will remain in use.
  2. Because a value may be computed from distinct conditional branches, it is necessary to compute the holes in the live ranges. Holes exist because if a value is defined in both sides of a conditional branch, the range will cover both the earlier (in code order) branch and the later branch - but from the start of the later branch to its definition that value doesn't actually exist. We need this information to prevent the register allocator from trying to spill-and-load a nonexistent value, for instance.
  3. Only then can we allocate and assign the actual registers to instructions. Because we might have to spill values to memory, and because values now can have multiple definitions, this is a somewhat subtle problem. Also, we'll have to resolve all architecture specific register requirements in this step.

In the MoarVM register allocator, there's a fourth step and a fifth step. The fourth step exists to ensure that instructions conform to x86 two-operand form. (Rather than return the result of an instruction in a third register, x86 reuses one of the input registers as the output register, e.g. all operators are of the form a = op(a, b) rather than a = op(b, c). This saves on instruction encoding space.) The fifth step inserts instructions that are introduced by the third step; this is done so that each instruction has a fixed address in the stream while the stream is being processed.

Altogether this is quite a bit of complexity and work, even for what is arguably the simplest correct global register allocation algorithm. So when I started thinking of the reverse linear scan algorithm employed by LuaJIT, the advantages became clear:

There are downsides as well, of course. Not knowing exactly how long a value will be live while processing it may cause the algorithm to make worse choices in which values to spill. But I don't think that's really a great concern, since figuring out the best possible value is practically impossible anyway, and the most commonly cited heuristic - evict the value that is live furthest in the future, because this will release a register over a longer range of code, reducing the chance that we'll need to evict again - is still available. (After all, we do always know the last use, even if we don't necessarily know the first definition).
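
To make the reverse direction a bit more concrete, here is a minimal Python sketch of the idea. It is neither LuaJIT's nor MoarVM's code. It assumes a straight-line trace without branches in which every value is defined exactly once, and it simply gives up when registers run out instead of spilling. The instruction format and all names are invented for illustration.

```python
def reverse_linear_scan(instructions, registers):
    """Assign registers while walking a straight-line trace from last to first.

    instructions: list of (dest, sources) tuples; dest may be None.
    Returns a list of (dest_register, source_registers) in program order.
    """
    free = list(registers)
    reg_of = {}                       # values live at the current point -> register
    assigned = [None] * len(instructions)

    for index in range(len(instructions) - 1, -1, -1):
        dest, sources = instructions[index]
        # Walking backwards, the definition is where a live range ends,
        # so its register becomes available to everything earlier.
        dest_reg = reg_of.pop(dest, None)
        if dest_reg is not None:
            free.append(dest_reg)
        # The first time a value is met going backwards is its last use.
        source_regs = []
        for value in sources:
            if value not in reg_of:
                if not free:
                    raise RuntimeError("out of registers; spilling is not sketched here")
                reg_of[value] = free.pop()
            source_regs.append(reg_of[value])
        assigned[index] = (dest_reg, source_regs)
    return assigned


trace = [("a", []), ("b", []), ("c", ["a", "b"]), ("d", ["c", "a"]), (None, ["d"])]
for (dest, sources), (dest_reg, source_regs) in zip(trace, reverse_linear_scan(trace, ["r0", "r1"])):
    print(dest, sources, "->", dest_reg, source_regs)
```

No separate live range pass is needed here: a register is claimed at the last use and released at the definition, both of which are discovered during the single backwards walk. A value that is never used afterwards gets no register at all (dest_register is None).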

Altogether, I'm quite excited about this algorithm; I think it will be a real simplification over the current implementation. Whether that will work out remains to be seen of course. I'll let you know!

Something about IR optimization

Published by Bart Wiegmans on 2019-03-17T06:23:00

Hi hackers! Today I want to write about optimizing IR in the MoarVM JIT, and also a little bit about IR design itself.

One of the (major) design goals for the expression JIT was to have the ability to optimize code over the boundaries of individual MoarVM instructions. To enable this, the expression JIT first expands each VM instruction into a graph of lower-level operators. Optimization then means pattern-matching those graphs and replacing them with more efficient expressions.

As a running example, consider the idx operator. This operator takes two inputs (base and element) and a constant parameter scale and computes base+element*scale. This represents one of the operands of an  'indexed load' instruction on x86, typically used to process arrays. Such instructions allow one instruction to be used for what would otherwise be two operations (computing an address and loading a value). However, if the element of the idx operator is a constant, we can replace it instead with the addr instruction, which just adds a constant to a pointer. This is an improvement over idx because we no longer need to load the value of element into a register. This saves both an instruction and valuable register space.
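
As a rough illustration of this kind of rewrite, here is a small Python sketch that pattern-matches expression trees written as nested tuples. The tuple encoding and the function name optimize are assumptions made for this example, not the actual MoarVM data structures. It also treats the input as a plain tree, so it sidesteps the sharing of nodes entirely.

```python
def optimize(node):
    """Rewrite (idx base (const n) scale) into (addr base n*scale), bottom-up."""
    if not isinstance(node, tuple):
        return node                   # leaf: a literal parameter such as a scale or size
    op, *args = node
    args = [optimize(arg) for arg in args]
    if op == "idx":
        base, element, scale = args
        if isinstance(element, tuple) and element[0] == "const":
            # The element is known at compile time, so fold it into an offset.
            return ("addr", base, element[1] * scale)
    return (op, *args)


before = ("load", ("idx", ("local",), ("const", 3), 8), 8)
print(optimize(before))   # ('load', ('addr', ('local',), 24), 8)
```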

Unfortunately this optimization introduces a bug. (Or, depending on your point of view, brings an existing bug out into the open). The expression JIT code generation process selects instructions for subtrees (tiles) of the graph in a bottom-up fashion. These instructions represent the value computed or work performed by that subgraph. (For instance, a tree like (load (addr ? 8) 8) becomes mov ?, qword [?+8]; the question marks are filled in during register allocation). Because an instruction always represents a tree, and because the graph is an arbitrary directed acyclic graph, the code generator projects that graph as a tree by visiting each operator node only once. So each value is computed once, and that computed value is reused by all later references.

It is worth going into some detail about why the expression graph is not a tree. Aside from transformations that might be introduced by optimizations (e.g. common subexpression elimination), a template may introduce a value that has multiple references via the let: pseudo-operator. See for instance the following (simplified) template:

(let: (($foo (load (local))))
    (add $foo (sub $foo (const 1))))

Both ADD and SUB refer to the same LOAD node


In this case, both references to $foo point directly to the same load operator. Thus, the graph is not a tree. Another case in which this occurs is during linking of templates into the graph. The output of an instruction is used, if possible, directly as the input for another instruction. (This is the primary way that the expression JIT can get rid of unnecessary memory operations). But there can be multiple instructions that use a value, in which case an operator can have multiple references. Finally, instruction operands are inserted by the compiler and these can have multiple references as well.

If each operator is visited only once during code generation, then this may introduce a problem when combined with another feature - conditional expressions. For instance, if two branches of a conditional expression both refer to the same value (represented by name $foo) then the code generator will only emit code to compute its value when it encounters the first reference. When the code generator encounters $foo for the second time in the other branch, no code will be emitted. This means that in the second branch, $foo will effectively have no defined value (because the code in the first branch is never executed), and wrong values or memory corruption is then the predictable result.
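
The failure mode can be reproduced with a toy code generator. The Python sketch below is purely illustrative: the node encoding (nested lists) and the pseudo-instructions it prints are invented, and it only mimics the "visit each node once" behavior described above.

```python
def generate(node, emitted, code, indent=""):
    """Emit pseudo-instructions for node, visiting every shared node only once."""
    if id(node) in emitted:
        return                        # already visited: nothing is emitted again
    emitted.add(id(node))
    op, *children = node
    if op == "if":
        cond, then_branch, else_branch = children
        generate(cond, emitted, code, indent)
        code.append(indent + "branch-if-false else")
        generate(then_branch, emitted, code, indent + "  ")
        code.append(indent + "else:")
        generate(else_branch, emitted, code, indent + "  ")
        return
    for child in children:
        if isinstance(child, list):
            generate(child, emitted, code, indent)
    literals = [str(child) for child in children if not isinstance(child, list)]
    code.append(indent + " ".join([op] + literals))


def compile_tree(tree):
    code = []
    generate(tree, set(), code)
    return code


foo = ["load", ["local"]]             # one node, referenced from both branches
tree = ["if", ["flag"], ["add", foo, ["const", 1]], ["sub", foo, ["const", 2]]]
print("\n".join(compile_tree(tree)))
```

In the printed listing the load appears only under the first branch; the sub in the else branch refers to a value that was never computed on that path.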

This bug has existed for as long as the expression JIT has been under development, and in the past the solution has been not to write templates which have this problem. This is made a little easier by a feature of the let: operator, in that it inserts a do operator which orders the values that are declared to be computed before the code that references them. So this is in fact non-buggy:

(let: (($foo (load (local)))) # code to compute $foo is emitted here
  (if (...)
    (add $foo (const 1)) # $foo is just a reference
    (sub $foo (const 2)))) # and here as well

The DO node is inserted for the LET operator. It ensures that the value of the LOAD node is computed before the reference in either branch


Alternatively, if a value $foo is used in the condition of the if operator, you can also be sure that it is available in both sides of the condition.

All these methods rely on the programmer being able to predict when a value will be first referenced and hence evaluated. An optimizer breaks this by design. This means that if I want the JIT optimizer to be successful, my options are:

  1. Fix the optimizer so as to not remove references that are critical for the correctness of the program
  2. Modify the input tree so that such references are either copied or moved forward
  3. Fix the code generator to emit code for a value, if it determines that an earlier reference is not available from the current block.

In other words, I first need to decide where this bug really belongs - in the optimizer, the code generator, or even the IR structure itself. The weakness of the expression IR is that expressions don't really impose a particular order. (This is unlike the spesh IR, which is instruction-based, and in which every instruction has a 'previous' and 'next' pointer). Thus, there really isn't a 'first' reference to a value, before the code generator introduces the concept. This property is in fact quite handy for optimization (for instance, we can evaluate operands in whatever order is best, rather than being fixed by the input order) - so I'd really like to preserve it. But it also means that the property we're interested in - a value is computed before it is used, in all possible code flow paths - isn't really expressible by the IR. And there is no obvious local invariant that can be maintained to ensure that this bug does not happen, so any correctness check may have to check the entire graph, which is quite impractical.

I hope this post explains why this is such a tricky problem! I have some ideas for how to get out of this, but I'll reserve those for a later post, since this one has gotten quite long enough. Until next time!

A short post about types and polymorphism

Published by Bart Wiegmans on 2019-01-14T13:34:00

Hi all. I usually write somewhat long-winded posts, but today I'm going to try and make an exception. Today I want to talk about the expression template language used to map the high-level MoarVM instructions to low-level constructs that the JIT compiler can easily work with:

This 'language' was designed back in 2015 subject to three constraints:

Recently I've been working on adding support for floating point operations, and this means working on the type system of the expression language. Because floating point instructions operate on a distinct set of registers from integer instructions, a floating point operator is not interchangeable with an integer (or pointer) operator.

This type system is enforced in two ways. First, by the template compiler, which attempts to check if you've used all operands correctly. This operates during development, which is convenient. Second, by instruction selection, as there will simply not be any instructions available that have the wrong combinations of types. Unfortunately, that happens at runtime, and such errors are so annoying to debug that it motivated the development of the first type checker.

However, this presents two problems. One of the advantages of the expression IR is that, by virtue of having a small number of operators, it is fairly easy to analyze. Having a distinct set of operators for each type would undo that. But more importantly, there are several MoarVM instructions that are generic, i.e. that operate on integer, floating point, and pointer values. (For example, the set, getlex and bindlex instructions are generic in this way). This makes it impossible to know from the instruction alone whether its values will be integers, pointers, or floats.

This is no problem for the interpreter since it can treat values as bags-of-bits (i.e., it can simply copy the union MVMRegister type that holds all values of all supported types). But the expression JIT works differently - it assumes that it can place any value in a register, and that it can reorder and potentially skip storing them to memory. (This saves work when the value would soon be overwritten anyway). So we need to know what register class that is, and we need to have the correct operators to manipulate a value in the right register class.

To summarize, the problem is:
There are two ways around this, and I chose to use both. First, we know as a fact for each local or lexical value in a MoarVM frame (subroutine) what type it should have. So even a generic operator like set can be resolved to a specific type at runtime, at which point we can select the correct operators. Second, we can introduce generic operators of our own. This is possible so long as we can select the correct instruction for an operator based on the types of the operands.

For instance, the store operator takes two operands, an address and a value. Depending on the type of the value (reg or num), we can always select the correct instruction (mov or movsd). It is however not possible to select different instructions for the load operator based on the type required, because instruction selection works from the bottom up. So we need a special load_num operator, but a store_num operator is not necessary. And this is true for a lot more operators than I had initially thought. For instance, aside from the (naturally generic) do and if operators, all arithmetic operators and comparison operators are generic in this way.
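
A tiny Python sketch may help show why store can be generic while load cannot. Everything here (the tuple encoding, the function name select, the treatment of the address operand as an opaque leaf) is an assumption for illustration and not the real tiler. The point is only that types flow upwards from operands to the operator that consumes them.

```python
def select(node):
    """Return (instructions, result_type) for a tiny expression tree, bottom-up."""
    op, *args = node
    if op == "load":
        return ["mov"], "reg"         # nothing below a load reveals the wanted type
    if op == "load_num":
        return ["movsd"], "num"       # so the float variant has to be spelled out
    if op == "store":
        _address, value = args        # the address is treated as an opaque leaf here
        value_code, value_type = select(value)
        # The type of the operand is already known, so store can stay generic.
        instruction = "movsd" if value_type == "num" else "mov"
        return value_code + [instruction], None
    raise ValueError(f"unknown operator {op}")


print(select(("store", ("addr",), ("load", ("addr",)))))
print(select(("store", ("addr",), ("load_num", ("addr",)))))
```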

I realize that, despite my best efforts, this has become a rather long-winded post anyway.....

Anyway. For the next week, I'll be taking a slight detour, and I aim to generalize the two-operand form conversion that is necessary on x86. I'll try to write a blog about it as well, and maybe it'll be short and to the point. See you later!

Perl 6 Coding Contest 2019: Seeking Task Makers

Published by Moritz Lenz on 2018-11-10T23:00:01

I want to revive Carl Mäsak's Coding Contest as a crowd-sourced contest.

The contest will be in four phases:

For the first phase, development of tasks, I am looking for volunteers who come up with coding tasks collaboratively. Sadly, these volunteers, including myself, will be excluded from participating in the second phase.

I am looking for tasks that ...

This is non-trivial, so I'd like to have others to discuss things with, and to come up with some more tasks.

If you want to help with task creation, please send an email to [email protected], stating your intentions to help, and your freenode IRC handle (optional).

There are other ways to help too:

In these cases you can use the same email address to contact me, or use IRC (moritz on freenode) or twitter.

Swiss Perl Workshop 2017

Published by stmuk on 2017-08-30T17:48:17

After a perilous drive up a steep, narrow, winding road from Lake Geneva we arrived at an attractive Alpine village (Villars-sur-Ollon) to meet with fellow Perl Mongers in a small restaurant.  There followed much talk and a little clandestine drinking of exotic spirits including Swiss whisky. The following morning walking to the conference venue there was an amazing view of mountain ranges. On arrival I failed to operate the Nespresso machine which I later found was due to it simply being off.  Clearly software engineers should never try to use hardware. At least after an evening of drinking.

Wendy’s stall was piled high with swag including new Bailador (a Dancer-like framework for Perl 6) stickers, a Shadowcat booklet about Perl 6 and the new O’Reilly “Think Perl 6”. Unfortunately she had sold out of Moritz’s book “Perl 6 Fundamentals” (although there was a sample display copy present). Thankfully later that morning I discovered I had a £3 credit on Google Play Books so I bought the ebook on my phone.

The conference started early with Damian Conway’s Three Little Words.  These were “has”, “class” and “method” from Perl 6 which he liked so much that he had added them to Perl 5 with his “Dios” – “Declarative Inside-Out Syntax” module.  PPI wasn’t fast enough so he had to replace it with a 50,000 character regex PPR. Practical everyday modules mentioned included Regexp::Optimizer and Test::Expr. If the video  doesn’t appear shortly on youtube a version of his talk dating from a few weeks earlier is available at https://www.youtube.com/watch?v=ob6YHpcXmTg

Jonathan Worthington returned with his Perl 6 talk on “How does deoptimization help us go faster?” giving us insight into why Perl 6 was slow at the Virtual Machine level (specifically MoarVM). Even apparently simple and fast operations like indexing an array were slow due to powerful abstractions, late binding and many levels of Multiple Dispatch. In short the flexibility and power of such an extensible language also led to slowness due to the complexity of code paths. The AST optimizer helped with this at compile time but itself took time and it could be better to do this at a later compile time (like Just In Time).  Even with a simple program reading lines from a file it was very hard to determine statically what types were used (even with type annotations) and whether it was worth optimizing (since the file could be very short).

The solution to these dynamic problems was also dynamic but to see what was happening needed cheap logging of execution which was passed to another thread.  This logging is made visible by setting the environment variable MVM_SPESH_LOG to a filename. Better tooling for this log would be a good project for someone.

For execution planning we look for hot (frequently called) code, long blocks of bytecode (slow to run) and consider how many types are used (avoiding “megamorphic” cases with many types, which need many versions of code).  There is also analysis of the code flow between different code blocks, and SSA.  Mixins made the optimization particularly problematic.

MoarVM’s Spesh did statistical analysis of the code in order to rewrite it in faster, simpler ways. Guards (cheap checks for things like types) were placed to catch cases where it got it wrong, and if these were triggered (infrequently) it would deoptimize, hence the counterintuitive title, since “Deoptimization enables speculation”. The slides are at http://jnthn.net/papers/2017-spw-deopt.pdf with the video at https://www.youtube.com/watch?v=3umNn1KnlCY. The older and more dull witted of us (including myself) might find the latter part of the video more comprehensible at 0.75 Youtube speed.
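
The guard idea can be caricatured in a few lines of Python. This is only a sketch of the general principle (speculate on an observed type, check cheaply, fall back when wrong) and says nothing about how MoarVM actually implements it; all names are invented.

```python
def generic_add(a, b):
    """Slow path: handles any pair of operands the language allows."""
    return a + b


def make_speculative_add(expected_type):
    """Build a version specialised for one observed type, protected by a guard."""
    stats = {"deopts": 0}

    def specialised(a, b):
        if type(a) is not expected_type or type(b) is not expected_type:
            stats["deopts"] += 1      # guard failed: fall back to the generic code
            return generic_add(a, b)
        return a + b                  # stands in for the fast, type-specific code

    return specialised, stats


add, stats = make_speculative_add(int)
print(add(1, 2), stats["deopts"])       # 3 0
print(add("a", "b"), stats["deopts"])   # ab 1
```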

After a superb multi-course lunch (the food was probably the best I’d had at any Perl event) we returned promptly to hear Damian talk of “Everyday Perl 6”. He pointed out that it wasn’t necessary to code golf obfuscated extremes of Perl 6 and that the average Perl 5 programmer would see many things simpler in Perl 6.  Also a rewrite from 5 to 6 might see something like 25% fewer lines of code since 6 was more expressive in syntax (as well as more consistent) although performance problems remained (and solutions in progress as the previous talk had reminded us).

Next Liz talked of a “gross” (in the numerical sense of 12 x 12 rather than the American teen sense) of Perl 6 Weeklies as she took us down memory lane to 2014 (just about when MoarVM was launched and when unicode support was poor!)  with some selected highlights and memories of Perl 6 developers of the past (and hopefully future again!). Her talk was recorded at https://www.youtube.com/watch?v=418QCTXmvDU

Cal then spoke of Perl 6 maths which he thought was good with its Rats and FatRats but not quite good enough and his ideas of fixing it.  On the following day he showed us he had started some TDD work on TrimRats. He also told us that Newton’s Method wasn’t very good but generated a pretty fractal. See https://www.youtube.com/watch?v=3na_Cx-anvw

Lee spoke about how to detect Perl 5 memory leaks with various CPAN modules and his examples are at https://github.com/leejo/Perl_memory_talk

The day finished with Lightning Talks and a barbecue at givengain — a main sponsor.

On the second day I noticed the robotic St Bernards dog in a tourist shop window had come to life.

Damian kicked off the talks with my favourite of his talks,  “Standing on the Shoulders of Giants”, starting with the Countess of Lovelace and her Bernoulli number program. This generated a strange sequence with many zeros. The Perl 6 version, since it used rational numbers rather than floating point, got the zeros right, whereas the Perl 5 version initially suffered from floating point rounding errors (which are fixable).

Among other things he showed us how to define a new infix operator in Perl 6. He also showed us a Perl 6 sort program that looked exactly like LISP even down to the Lots of Irritating Superfluous Parentheses. I think this was quicksort (he certainly showed us a picture of Sir Tony Hoare at some point). Also shown was a very functional (Haskell-like) equivalent with heavy use of P6 Multiple Dispatch.  Also included was a demonstration of P6 “before” as a sort of typeless/multi-type comparison infix. Damian then returned to his old favourite of Quantum Computing.

My mind and notes got a bit jumbled at this point but I particularly liked the slide that explained how factorisation could work by observing the product of possible inputs since this led to a collapse that revealed the factors.  To do this on RSA etc., of course, needs real hardware support which probably only the NSA and friends have (?). Damian’s code examples are at http://www.bit.do/Perl6SOG with  an earlier version of his talk at https://www.youtube.com/watch?v=Nq2HkAYbG5o Around this point there was a road race of classic cars going on outside up the main road into the village and there were car noises in the background that strangely were more relaxing than annoying.

After Quantum Chaos Paul Johnson brought us all back down to ground with an excellent practical talk on modernising legacy Perl 5 applications based on his war stories. Hell, of course, is “Other People’s Code”, often dating from Perl’s early days and lacking documentation and sound engineering.

Often the original developers had long since departed or, in the worst cases, were still there.  Adding tests and logging (with stack traces) were particularly useful. As was moving to git (although its steep learning curve meant mentoring was needed) and handling CPAN module versioning with pinto.  Many talks had spoken of the Perl 6 future whereas this spoke of the Perl 5 past and present and the work many of us suffer to pay the bills. It’s at https://www.youtube.com/watch?v=4G5EaUNOhR0

Jonathan then spoke of reactive distributed software.  A distributed system is an async one where “Is it working?” means “some of it is working but we don’t know which bits”.  Good OO design is “tell don’t ask”: you tell a remote service to do something for you, rather than parsing the response and doing it yourself and thus breaking encapsulation.  This is particularly important in building well designed distributed systems since otherwise the systems are less responsive and reliable.  Reactive (async) works better for distributed software than interactive (blocking or sync).

We saw a table that used a Perl 6 promise for one value and a supply for many values for reactive (async) code and the equivalent (one value) and a Perl 6 Seq for interactive code. A Supply could be used for pub/sub and the Observer Pattern. A Supply could either be live (like broadcast TV) or, for most Perl 6 supplies, on-demand (like Netflix). Then samples of networking (socket) based code were discussed including a web client, web server and SSH::LibSSH (async client bindings often very useful in practical applications like port forwarding)

https://github.com/jnthn/p6-ssh-libssh

Much of the socket code had a pattern of “react { whenever {” blocks with “whenever” as a sort of async loop. He then moved on from sockets to services (using a Supply pipeline) and amazed us by announcing the release of “cro”, a microservices library that even supports HTTP/2 and Websockets, at http://mi.cro.services/.  This is installable using Perl 6 by “zef install --/test cro”.

Slides at http://jnthn.net/papers/2017-spw-sockets-services.pdf and video at https://www.youtube.com/watch?v=6CsBDnTUJ3A

Next Lee showed Burp Scanner which is payware but probably the best web vulnerabilities scanner. I wondered if anyone had dared to run it on ACT or the hotel’s captive portal.

Wendy did some cheerleading in her “Changing Image of Perl”.  An earlier version is at https://www.youtube.com/watch?v=Jl6iJIH7HdA

Sue’s talk was “Spiders, Gophers, Butterflies” although the latter were mostly noticeably absent. She promises me that a successor version of the talk will use them more extensively. Certainly any Perl 6 web spidering code is likely to fit better on one slide than the Go equivalent.

During the lightning talks Timo showed us a very pretty Perl 6 program using his SDL2::Raw to draw an animated square spiral with hypnotic colour cycling type patterns. Also there was a talk by the author about https://bifax.org/bif/, a distributed bug tracking system (which worked offline like git).

Later in the final evening many of us ate and chatted in another restaurant where we witnessed a dog fight being narrowly averted and learnt that Wendy didn’t like Perl 5’s bless for both technical and philosophical reasons.

My Ten Years of Perl 6

Published by Moritz Lenz on 2017-08-08T22:00:01

Time for some old man's reminiscence. Or so it feels when I realize that I've spent more than 10 years involved with the Perl 6 community.

How I Joined the Perl 6 Community

It was February 2007.

I was bored. I had lots of free time (crazy to imagine that now...), and I spent some of that answering (Perl 5) questions on perlmonks. There was a category of questions where I routinely had no good answers, and those were related to threads. So I decided to play with threads, and got frustrated pretty quickly.

And then I remembered that a friend in school had told me (about four years earlier) that there was this Perl 6 project that wanted to do concurrency really well, and even automatically parallelize some stuff. And this was some time ago, maybe they had gotten somewhere?

So I searched the Internet, and found out about Pugs, a Perl 6 compiler written in Haskell. And I wanted to learn more, but some of the links to the presentations were dead. I joined the #perl6 IRC channel to report the broken link.

And within three minutes I got a "thank you" for the report, the broken links were gone, and I had an invitation for a commit bit to the underlying SVN repo.

I stayed.

The Early Days

Those were the wild young days of Perl 6 and Pugs. Audrey Tang was pushing Pugs (and Haskell) very hard, and often implemented a feature within 20 minutes after somebody mentioned it. Things were unstable, broken often, and usually fixed quickly. No idea was too crazy to be considered or even implemented.

We had bots that evaluated Perl 6 and Haskell code, and gave the result directly on IRC. There were lots of cool (and sometimes somewhat frightening) automations, for example for inviting others to the SVN repo, to the shared hosting system (called feather), for searching SVN logs and so on. Since git was still obscure and very unusable, people tried to use SVK, an attempt to implement a decentralized version control system on top of the SVN protocol.

Despite some half-hearted attempts, I didn't really make inroads into compiler development. Having worked with neither Haskell nor compilers before proved to be a pretty steep step. Instead I focused on some early modules, documentation, tests, and asking and answering questions. When the IRC logger went offline for a while, I wrote my own, which is still in use today.

I felt at home in that IRC channel and the community. When the community asked for mentors for the Google Summer of Code project, I stepped up. The project was a revamp of the Perl 6 test suite, and to prepare for the mentoring task, I decided to dive deeper. That made me the maintainer of the test suite.

Pet Projects

I can't recount a full history of Perl 6 projects during that time range, but I want to reflect on some projects that I considered my pet projects, at least for some time.

It is not quite clear from this (very selected) timeline, but my Perl 6 related activity dropped around 2009 or 2010. This is when I started to work full time, moved in with my girlfriend (now wife), and started to plan a family.

Relationships

The technologies and ideas in Perl 6 are fascinating, but that's not what kept me. I came for the technology, but stayed for the community.

There were and are many great people in the Perl 6 community, some of whom I am happy to call my friends. Whenever I get the chance to attend a Perl conference, workshop or hackathon, I find a group of Perl 6 hackers to hang out and discuss with, and generally have a good time.

Four events stand out in my memory. In 2010 I was invited to the Open Source Days in Copenhagen. I missed most of the conference, but spent a day or two with (if memory serves right) Carl Mäsak, Patrick Michaud, Jonathan Worthington and Arne Skjærholt. We spent some fun time trying to wrap our minds around macros, the intricacies of human and computer language, and Japanese food. (Ok, the last one was easy). Later the same year, I attended my first YAPC::EU in Pisa, and met most of the same crowd again -- this time joined by Larry Wall, and over three or four days. I still fondly remember the Perl 6 hallway track from that conference. And in 2012 I flew to Oslo for a Perl 6 hackathon, with a close-knit, fabulous group of Perl 6 hackers. Finally, the Perl Reunification Summit in the beautiful town of Perl in Germany, which brought together Perl 5 and Perl 6 hackers in a very relaxed atmosphere.

For three of these four events, different private sponsors from the Perl and Perl 6 community covered travel and/or hotel costs, with their only motivation being meeting folks they liked, and seeing the community and technology flourish.

The Now

The Perl 6 community has evolved a lot over the last ten years, but it is still a very friendly and welcoming place. There are lots of "new" folks (where "new" is everybody who joined after me, of course :D), and a surprising number of the old guard still hang around, some more involved, some less, all of them still very friendly and supportive.

The Future

I anticipate that my family and other projects will continue to occupy much of my time, and it is unlikely that I'll be writing another Perl 6 book (after the one about regexes) any time soon. But the Perl 6 community has become a second home for me, and I don't want to miss it.

In the future, I see myself supporting the Perl 6 community through infrastructure (community servers, IRC logs, running IRC bots etc.), answering questions, writing a blog article here and there, but mostly empowering the "new" guard to do whatever they deem best.

Perl 6 Fundamentals Now Available for Purchase

Published by Moritz Lenz on 2017-07-21T22:00:01

After about nine months of work, my book Perl 6 Fundamentals is now available for purchase on apress.com and springer.com.

The ebook can be purchased right now, and comes in the epub and PDF formats (with watermarks, but DRM free). The print form can be pre-ordered from Amazon, and will become ready for shipping in about a week or two.

I will make a copy of the ebook available for free for everybody who purchased an earlier version, "Perl 6 by Example", from LeanPub.

The book is aimed at people familiar with the basics of programming; prior Perl 5 or Perl 6 knowledge is not required. It features a practical example in most chapters (no mammal hierarchies or class Rectangle inheriting from class Shape), ranging from simple input/output and text formatting to plotting with python's matplotlib libraries. Other examples include date and time conversion, a Unicode search tool and a directory size visualization.

I use these examples to explain a subset of Perl 6, with many pointers to more documentation where relevant. Perl 6 topics include the basic lexicographic structure, testing, input and output, multi dispatch, object orientation, regexes and grammars, usage of modules, functional programming and interaction with python libraries through Inline::Python.

Let me finish with Larry Wall's description of this book, quoted from his foreword:

It's not just a reference, since you can always find such materials online. Nor is it just a cookbook. I like to think of it as an extended invitation, from a well-liked and well-informed member of our circle, to people like you who might want to join in on the fun. Because joy is what's fundamental to Perl. The essence of Perl is an invitation to love, and to be loved by, the Perl community. It's an invitation to be a participant of the gift economy, on both the receiving and the giving end.

The Loss of Name and Orientation

Published by Moritz Lenz on 2017-07-10T22:00:01

The Perl 6 naming debate has started again. And I guess with good reason. Teaching people that Perl 6 is a Perl, but not the Perl requires too much effort. Two years ago, I didn't believe. Now you're reading a tired man's words.

I'm glad that this time, we're not discussing giving up the "Perl" brand, which still has very positive connotations in my mind, and in many other minds as well.

And yet, I can't bring myself to like "Rakudo Perl 6" as a name. There are two very shallow reasons for that: Going from two syllables, "Perl six", to five of them, seems a step in the wrong direction. And two, I remember the days when the name was pretty young, and people would misspell it all the time. That seems to have abated, though I don't know why.

But there's also a deeper reason, probably a sentimental old man's reason. I remember the days when Pugs was actively developed, and formed the center of a vibrant community. When kp6 and SMOP and all those weird projects were around. And then, just when it looked like only a single compiler was around, Stefan O'Rear conjured up niecza, almost single-handedly, and out of thin air. Within months, it was a viable Perl 6 compiler, that people on #perl6 readily recommended.

All of this was born out of the vision that Perl 6 was a language with no single, preferred compiler. Changing the language name to include the compiler name means abandoning this vision. How can we claim to welcome alternative implementations when the commitment to one compiler is right in the language name?

However I can't weigh this loss of vision against a potential gain in popularity. I can't decide if it's my long-term commitment to the name "Perl 6" that makes me resent the new name, or valid objections. The lack of vision mirrors my own state of mind pretty well.

I don't know where this leaves us. I guess I must apologize for wasting your time by publishing this incoherent mess.

Living on the (b)leading edge

Published by Moritz Lenz on 2017-06-24T22:00:01

Perl 6 is innovative in many ways, and sometimes we don't fully appreciate all the implications, for good or for bad.

There's one I stumbled upon recently: The use of fancy Unicode symbols for built-in stuff. In this case: the `.gist` output of Match objects. For example

my token word { \w+ }
say 'abc=def' ~~ /<word> '=' <word>/;

produces this output:

「abc=def」
 word => 「abc」
 word => 「def」

And that's where the problems start. In my current quest to write a book on Perl 6 regexes, I noticed that the PDF that LeanPub generates from my Markdown sources doesn't correctly display those pesky 「」 characters, which are

$ uni -c 「」
「 - U+0FF62 - HALFWIDTH LEFT CORNER BRACKET
」 - U+0FF63 - HALFWIDTH RIGHT CORNER BRACKET

When I copied the text from the PDF and pasted into my editor, they showed up correctly, which indicates that the characters are likely missing from the monospace font.
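
A small sketch of how one might inspect the two delimiters from within Perl 6 itself, and print a match without them. It is not from the original post. It reuses the `word` token from the example above and assumes only the built-in `.ord`, `.fmt`, `.uniname` and `.Str` methods; the variable names are made up for illustration.

```raku
# Illustration only: look at the delimiters that Match.gist uses,
# then print the same match as plain strings.
my token word { \w+ }
my $match = 'abc=def' ~~ /<word> '=' <word>/;

# Code point and Unicode name of each delimiter character
for '「', '」' -> $char {
    say $char.ord.fmt('U+%04X'), ' ', $char.uniname;
}

# .Str returns only the matched text, so no corner brackets are printed
say $match.Str;                              # prints abc=def

# <word> matched twice, so $match<word> is a list of two Match objects
say $match<word>.map(*.Str).join(', ');
```

Printing with `.Str` sidesteps the font problem for plain output, but it also drops the nested structure that `.gist` displays, so it is not a substitute when the capture tree itself is what the example is meant to show.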

The toolchain allows control over the font used for displaying code, so I tried all the monospace fonts that were available. I tried them in alphabetical order. Among the earlier fonts I tried was Deja Vu Sans Mono, which I use in my terminal, and which hasn't let me down yet. No dice. I arrived at Noto, a font designed to cover all Unicode codepoints. And it didn't work either. So it turns out these two characters are part of some Noto Sans variants, but not of the monospace font.

My terminal, and even some font viewers, use some kind of fallback where they use glyphs from other fonts to render missing characters. The book generation toolchain does not.

The Google Group for LeanPub was somewhat helpful: if I could recommend an Open Source monospace font that fit my needs, they'd likely include it in their toolchain.

So I searched and searched, learning more about fonts than I wanted to know. My circle of geek friends came up with several suggestions, one of them being Iosevka, which actually contains those characters. So now I wait for others to step up, either for LeanPub to include that font, or for the Noto maintainers to create a monospace variant of those characters (and then LeanPub updating their version of the font).

And all of that because Perl 6 was being innovative, and used two otherwise little-used characters as delimiters, in an attempt to avoid collisions between delimiters and content.

(In the meantime I've replaced the two offending characters with ones that look similar. It means the example output is technically incorrect, but at least it's readable).

Rakudo Star: Past Present and Future

Published by Steve Mynott on 2017-01-02T14:07:31

At YAPC::EU 2010 in Pisa I received a business card with "Rakudo Star" and the
date July 29, 2010 on it, which was the date of the first release a week
earlier, along with a countdown to 1200 UTC. I still have mine, although it has
a tea stain on it. I refreshed my memory over the holidays by listening again
to Patrick Michaud speaking about the launch of Rakudo Star (R*):

https://www.youtube.com/watch?v=MVb6m345J-Q

R* was originally intended as the first of a number of distribution releases (as
opposed to a compiler release) -- useable for early adopters but not initially production
quality. Other names had been considered at the time like Rakudo Beta (rejected as
sounding like "don't use this"!) and amusingly Rakudo Adventure Edition.
Finally it became Rakudo Whatever and Rakudo Star (since * means "whatever"!).

Well over 6 years later and we never did come up with a better name although there
was at least one IRC conversation about it and perhaps "Rakudo Star" is too
well established as a brand at this point anyway. R* is the Rakudo compiler, the main docs, a module installer, some modules and some further docs.

However, one radical change is happening soon and that is a move from panda to
zef as the module installer. Panda has served us well for many years but zef is
both more featureful and more actively maintained. Zef can also install Perl
6 modules off CPAN although the CPAN-side support is in its early days. There
is a zef branch (pull requests welcome!) and a tarball at:

http://pl6anet.org/drop/rakudo-star-2016.12.zef-beta2.tar.gz

Panda has been patched to warn that it will be removed and to advise the use of
zef. Of course anyone who really wants to use panda can reinstall it using zef
anyway.

The modules inside R* haven't changed much in a while. I am considering adding
DateTime::Format (shown by ecosystem stats to be widely used) and
HTTP::UserAgent (probably the best pure perl6 web client library right now).
Maybe some modules should also be removed (although this tends to be more
controversial!). I am also wondering about OpenSSL support (if the library is
available).

p6doc needs some more love as a command line utility since most of the focus
has been on the website docs and in fact some of these changes have impacted
adversely on command line use, eg. under Windows cmd.exe "perl 6" is no longer
correctly displayed by p6doc. I wonder if the website generation code should be
decoupled from the pure docs and p6doc command line (since R* has to ship any
new modules used by the website). p6doc also needs a better and faster search
(using sqlite?). R* also ships some tutorial docs including a PDF generated from perl6intro.com.
We only ship the English one and localisation to other languages could be
useful.

Currently R* is released roughly every three months (unless significant
breakage leads to a bug fix release). Problems tend to happen with the
less widely used systems (Windows and the various BSDs) and also with the
module installers and some modules. R* is useful in spotting these issues
missed by roast. Rakudo itself is still in rapid development. At some point a less frequently
updated distribution (Star LTS or MTS?) will be needed for Linux distribution
packagers and those using R* in production. There are also some question
marks over support for different language versions (6.c and 6.d).

Above all what R* (and Rakudo Perl 6 in general) needs is more people spending
more time working on it! JFDI! Hopefully this blog post might
encourage more people to get involved with github pull requests.

https://github.com/rakudo/star

Feedback, too, in the comments below is actively encouraged.


Rakudo Star 2016.11 Release Candidate

Published by Steve Mynott on 2016-11-20T14:01:22

There is a Release Candidate for Rakudo Star 2016.11 (currently RC2) available at

http://pl6anet.org/drop/

This includes binary installers for Windows and Mac.

Usually Star is released about every three months but last month's release didn't include a Windows installer so there is another release.

I'm hoping to release the final version next weekend and would be grateful if people could try this out on as many systems as possible.

Any feedback email steve *dot* mynott *at* gmail *dot* com

Full draft announce at

https://github.com/rakudo/star/blob/master/docs/announce/2016.11.md

Rakudo Star 2016.10 Release Candidate

Published by Steve on 2016-10-16T06:10:00

There is a Release Candidate for Rakudo Star 2016.10 (currently RC0) available at

http://pl6anet.org/drop/

This should be quite a bit faster than previous releases and work better on OpenBSD/FreeBSD than the previous release.

It also features "prove6" which is now used by Panda -- removing a run-time dependency on Perl 5, although it still needs Perl 5 to build.

I'm hoping to release the final version next weekend (Oct 21st) and would be grateful if people could try this out on as many systems as possible (eg. exotic systems like Solaris-like ones and Windows!) 

Full draft announce at

https://github.com/rakudo/star/blob/master/docs/announce/2016.10.md

Note compiling under Windows is possible using the gcc which comes with Strawberry Perl and gmake running under cmd.exe.  Further instructions will be added (thanks to Christopher for feedback).

Any feedback email steve *underscore* mynott *at* gmail *dot* com

You Wouldn't BELIEVE what I saw at YAPC::EU!!!

Published by Steve Mynott on 2016-08-28T18:57:57

We turned up in Cluj via Wizz Air to probably one of the best pre-YAPC parties ever, located on three levels on the rooftop of Evozon’s plush city centre offices. We were well supplied with excellent wine, snacks and the local Ursus beer and had many interesting conversations with old friends.

On the first day Tux spoke about his Text::CSV modules for both Perl 5 and 6, and I did a short talk later in the day on benchmarking Perl 6. Only Nicholas understood my trainspotter joke slide with the APT and Deltic! Sadly my talk clashed with Lee J talking about Git which I wanted to see so I await the youtube version! Jeff G then spoke about Perl 6 and parsing languages such as JavaScript. Sadly I missed Leon T’s Perl 6 talk which I also plan on watching on youtube. Tina M gave an excellent talk on writing command line tools. She also started the lightning talks with an evangelical talk about how tmux was better than screen. Geoffrey A spoke about configuring sudo to run restricted commands in one directory which seemed a useful technique to me. Dave C continued his conference tradition of dusting off his Perl Vogue cover and showing it again. The age of the image was emphasised by the amazingly young looking mst on it. And Stefan S ended with a call for Perl unification.

The main social event was in the courtyard of the main museum off the central square with free food and beer all evening and an impressive light show on the slightly crumbling facade. There were some strange chairs which resembled cardboard origami but proved more comfortable than they looked when I was finally able to sit in one. The quality of the music improved as the evening progressed (or maybe the beer helped). I was amazed to see Perl Mongers actually dancing, apparently inspired by the younger Cluj.pm members.

Day Two started with Sawyer’s State of the Velociraptor which he had, sensibly, subcontracted to various leading lights of the Perl Monger community. Sue S (former London.pm leader) was up first with a short and sweet description of London.pm. Todd R talked about Houston.pm. Aaron Crane spoke about the new improved friendlier p5p. Tina spoke about Berlin.pm and the German Perl community site she had written back in the day. This new format worked very well and it was obvious Perl Mongers groups could learn much from each other. Max M followed with a talk about using Perl and ElasticSearch to index websites and documents and Job about accessibility.

The 15:05 slot had, from the perspective of London.pm, one of the most unfortunate scheduling clashes at YAPC::EU ever, with three titans of London.pm (all former leaders) battling for audience share. I should perhaps tread carefully here lest bias become apparent but the heavyweight Sue Spence was, perhaps treacherously, talking about Go in the big room and Dave Cross and Tom talking about Perl errors and HTML forms respectively in the other rooms. This momentous event should be reproducible by playing all three talks together in separate windows once they are available.

Domm did a great talk on Postgres which made me keen to use this technology again. André W described how he got Perl 6 running on his Sailfish module phone while Larry did a good impression of a microphone stand. I missed most of Lance Wick’s talk but the bit I caught at the end made me eager to watch the whole thing.

Guinevere Nell gave a fascinating lightning talk about agent based economic modelling. Lauren Rosenfield spoke of porting (with permission) a “Python for CS” book to perl 6. Lukas Mai described his journey from Perl to Rust. Lee J talked about photography before Sue encouraged people to break the London.pm website. Outside the talk rooms on their stall Liz and Wendy had some highly cool stuffed toy Camelia butterflies produced by the Beverly Hills Teddy Bear Company and some strange “Camel Balls” bubblegum. At the end of the day Sue cat herded many Mongers to eat at the Enigma Steampunk Bar in central Cluj with the cunning ploy of free beer money (recycled from the previous year’s Sherry money).

The third day started with Larry’s Keynote in which photographs of an incredible American house “Fallingwater” and Chinese characters (including “arse rice”) featured heavily. Sweth C gave a fast and very useful introduction to Swift. Nicholas C then confused a room of people for an hour with a mixture of real Perl 5 and 6 and an alternative timeline complete with T-shirts. The positive conclusion was that even if the past had been different the present isn’t likely to have been much better for the Perl language family than it is now! Tom spoke about Code Review and Sawyer about new features in Perl 5.24. Later I heard Ilya talk about running Perl on his Raspberry Pi Model B and increasing the speed of his application very significantly to compensate for its low speed! And we finished with lightning talks where we heard about the bug tracker OTRS (which was new to me), Job spoke about assistive tech and Nine asked us to ask our bosses for money for Perl development amongst several other talks. We clapped a lot in thanks, since this was clearly a particularly well organised YAPC::EU (due to Amalia and her team!) and left to eat pizza and fly away the next day. Some stayed to visit a salt mine (which looked most impressive from the pictures!) and some stayed longer due to Lufthansa cancelling their flights back!


German Perl Workshop 2016

Published by Steve Mynott on 2016-03-15T15:36:12

The first night's meeting was in a large beer bar in the centre of Nuremberg.
We went back to the Best Western to find a certain exPumpkin already resident in the bar.
Despite several of the well named Bitburgers we managed to arrive at the
conference venue on time the following morning. Since my knowledge of German was
limited to a C grade 'O' Level last century my review will be mostly
limited to English talks. Apologies in advance to those giving German talks
(not unreasonable considering the country). Hopefully other blog posts will
cover these.

Masak spoke about the dialectic between planning (like physics) and chaos (like
biology) in software development.

http://masak.org/carl/gpw-2016-domain-modeling/talk.pdf

Tobias gave a good beginners guide to Perl 6 in German and I was able to follow
most of the slides since I knew more Perl 6 than German and even learnt a thing
or two.

After lunch Stefan told us he was dancing around drunk and naked at the turn of
the 2000s and also about communication between Perl 6 and Perl 5 and back again
via his modules Inline::Perl5 (from Perl 6) and Inline::Perl6 (from Perl 5) --
the most important take away being that "use Foo::Bar:from<Perl5>" can be used
from Perl 6 and "use Inline::Perl6" from Perl 5. The modules built bridges like
those built in the old school computer game "Lemmings".

http://niner.name/talks/Perl%205%20and%20Perl%206%20-%20a%20great%20team/Perl%205%20and%20Perl%206%20-%20a%20great%20team.odp

Max told us (in German) about his Dancer::SearchApp search
engine which was based on Elastic Search but I was able to follow along on the
English version of his slides on the web.

http://corion.net/talks/dancer-searchapp/dancer-searchapp.en.html

Sue got excited about this. Tina showed us some slides in Vim and her module
to add command line tab completion to script arguments using zsh and bash. I
wondered whether some of her code could be repurposed to add fish shell man
page parsing autocompletion to zsh. She also had a good lightning talk about
Ingy's command line utility for github.

https://github.com/perlpunk/myslides/tree/master/app-spec

Second day started early with Moritz talking about Continuous Delivery which
could mean just delivering to a staging server. He was writing a book about it
at deploybook.com with slides at:

https://deploybook.com/talks/gpw2016-continuous-delivery.pdf

Salve wanted us to write elegant code as a reply to the Perl Jam guy at CCC in
a self confessed "rant".

Sawyer described writing Ref::Util to optimise things like "ref $foo" in a
Hardcore Perl 5 XS/Core talk and Masak told us about his little 007 language
written in Perl 6 as a proof of concept playroom for future Perl 6 extended
macro support and demonstrated code written over lunch in support of this.

http://masak.org/carl/gpw-2016-big-hairy-yaks/talk.pdf

Stefan gave a great talk about CURLI and explained the complexity of what was
intended.

http://niner.name/talks/A%20look%20behind%20the%20curtains%20-%20module%20loading%20in%20Perl%206/Module%20loading%20in%20Perl%206.pdf

I gave my talk on "Simple Perl 6 Fractals and Concurrency" on Friday. It
started badly with AV issues my side but seemed well received. It was useful
speaking with people about it and I managed to speed things up *after* the talk
and I should have new material for a 2.0 version.

There were very good talks on extracting data from PDFs and writing JSON apis.

https://github.com/mickeyn/PONAPI

looked very interesting and would have saved me much coding at a recent job.

There were some great lightning talks at the end of the day. Sawyer wanted
people to have English slides and gave his talk in Hebrew to stress this.
Things ended Friday night with great food and beer in a local bar.

FOSDEM 2016

Published by Steve Mynott on 2016-02-02T19:33:44

To me it seemed a particularly good FOSDEM both for Perl 5/6 and
other talks, although very crowded as usual, and I didn't see the usual
*BSD or Tor stalls. I was struck by the statistic that there were
about 500 speakers from many thousands of people, so of the order of
one speaker per tens of attendees, which is very high.

Videos are already starting to appear at

http://video.fosdem.org/2016/

On Saturday I started with Poettering and systemd which was a keynote
and perhaps a little disappointing since he usually is a better
speaker and the audio was a little indistinct. systemd had won being
used by all distros except gentoo and slackware. They were now working
on a dns resolver component which supported DNSSEC although in
practice validating signed zone files would slow down browsing and
currently only 2% of websites had it activated. He didn't mention
strong criticisms of its security by crypto experts such as DJB.

The most amusing talk was Stark's retro running of Postgres on
NetBSD/VAX which exposed some obscure OS bugs and was livened up by a
man in an impressive Postgres Elephant costume appearing. We later
spoke to Mr Elephant who said he was both blind and very hot at the
time. I then went to the Microkernel room to hear about GNU/Hurd
progress from Thibault since this room is usually "OPEN" and he's an
excellent speaker. I noticed even this obscure room was quite crowded
as compared with previous years so I'd guess total attendees this year
were high. He stressed the advantages of running device drivers in
userspace as allowing more user "freedom" to mount fs etc. without
root and improving kernel stability since the drivers could crash and
restart without bringing down the kernel. In previous years he had
talked of his DDE patches allowing linux 2.6 hardware drivers on Hurd
and this year he was using the NetBSD Rump kernel under Hurd to add
sound support with USB support promised. His demo was RMS singing his
song on his Hurd laptop. The irony was he needed to use BSD code on a
GNU/BSD/Hurd system to do it! There had been some work on X86-64 Hurd
but it wasn't there yet since he needed more help from the community.
I then saw some lightning talks (actually 20 mins long) including a
good one on C refactoring.

The Perl dinner on Saturday night featured the usual good food and
conversation and the devroom was on Sunday. Ovid spoke about Perl 6
and its advantages (such as being able to perform maths on floats
correctly). I had a python guy sitting next to me who admitted he had
never been to a Perl talk before so that was a success in reaching
someone new. Will Braswell spoke next about his "Rperl" compiler
which translated his own quite restricted subset (no regexps yet and
no $_) of Perl 5 line by line into C++ in order to run some of the
language shootout benchmarks (a graphical animation of planetary
motion) at increased speed. I'd not seen Will before and he was an
excellent speaker who left me more impressed than I'd expected and I
hope he gets to YAPC::EU in the summer. I saw some non-Perl stuff
next for variety including a good one on the Go debugger Delve which
was aware of the go concurrency and could be used as a basic REPL. I
returned to Perl to see Bart explain some surprisingly simple X86-64
assembly language to do addition and ROT13 which he interfaced with
Perl 6 using NativeCall (although it struck me that the
CPAN P5NCI module on Perl 5 would have also worked).
Again an excellent talk and a good start to a
run of some of the best Perl talks I'd ever seen. Stevan Little's talk
was one of his most amusing ever and perl wasn't really dead.
Sawyer also did an excellent promotion of Perl 5 targeted at the
people who maybe hadn't used it since the early 2000s explaining what
had changed. Liz finished with her autobiographical account of Perl
development and some nice short Perl 6 examples. We all ate again in
the evening together, my only regrets being I'd missed the odd talk or
two (which I should be able to watch on video).
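
The remark above about maths on floats presumably refers to Perl 6 treating
decimal literals as rationals. That reading is an assumption, since the talk's
slides are not reproduced here. A minimal sketch of the behaviour, using only
built-in types:

```raku
# Decimal literals such as 0.1 are Rat (exact rational) values
say 0.1.^name;                 # Rat
say 0.1 + 0.2 == 0.3;          # True

# Literals written with an exponent are Num (floating point) values,
# so the familiar rounding error shows up again
say 0.1e0.^name;               # Num
say 0.1e0 + 0.2e0 == 0.3e0;    # False
```

The first comparison is exact because 1/10 + 2/10 is computed as a fraction;
the second behaves like the equivalent comparison of doubles in most other
languages.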

FOSDEM 2016 Perl Dev Room Lineup

Published by Steve on 2016-01-09T13:32:00

FOSDEM is a free two day conference in Brussels, Belgium on Jan 30th and 31st 2016.

The FOSDEM 2016 schedule for the Perl Dev Room on the second day (the Sunday)  has now been announced at

https://fosdem.org/2016/schedule/track/perl/

From a Perl 6 perspective it includes Ovid's "Perl 6 for those who hate Perl", Daisuke Maki on "Crust -- Perl6 Port of Plack", Jeffrey Goff on Perl 6 Grammars, Bart Wiegmans on AMD64 assembly language programming and MoarVM, Stevan Little's "Perl is not dead,... it got better" and lastly Elizabeth Mattijsen finishes with "Perl 6 -- The end of the beginning".

Perl6 and CPAN: MetaCPAN Status as of 2015-10-09

Published by jdv on 2015-10-09T07:04:00

MetaCPAN, like the rest of "CPAN", was built assuming the sole context of Perl5. Which is cool until we want to use it for Perl6 and avoid the troubles associated with different namespaces, dist mgmt, etc... To largely avoid and more easily handle these issues for MetaCPAN it's been suggested that we have separate instances. The existing Perl5 instance only needs to be changed to ignore Perl6 distributions. There has already been some breakage because it didn't ignore a Perl6 dist of mine which exists in the Perl5 world:( And the new Perl6 instance will do just the opposite and only look at Perl6 distributions.

In contrast, and relatedly, on CPAN we've designated a special spot for Perl6 distributions in order to keep them separate from the Perl5 dists. This reserved place is a Perl6 subdir in an author's dir (/author/id/*/*/*/Perl6/). Any dists in or under that spot on the fs will be considered a Perl6 dist; valid or invalid. So this is where the Perl6 MetaCPAN will look and the Perl5 instance will not.
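
As a rough sketch of that rule (not the actual MetaCPAN code), a path check along these lines would separate the two kinds of distributions. The helper name `is-perl6-dist` and the example paths are made up for illustration; only the `/author/id/*/*/*/Perl6/` pattern comes from the description above.

```raku
# Hypothetical helper: does a path fall in or under an author's
# reserved Perl6 directory (/author/id/*/*/*/Perl6/)?
sub is-perl6-dist(Str $path --> Bool) {
    # three path segments after /author/id/, then the Perl6 subdir
    so $path ~~ / ^ '/author/id/' [ <-[\/]>+ '/' ] ** 3 'Perl6/' /;
}

# Example paths are invented for this sketch
say is-perl6-dist('/author/id/J/JD/JDV/Perl6/Foo-0.01.tar.gz');   # True
say is-perl6-dist('/author/id/J/JD/JDV/Foo-0.01.tar.gz');         # False
```

A check like this says nothing about whether the distribution is valid, which matches the rule as stated: anything in or under that directory counts as a Perl6 dist.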

Current development is being done on these temporary branches:

And the main dev instance is running on hack.p6c.org. The web end is at http://hack.p6c.org:5001 and the api is at http://hack.p6c.org:5000.

So far the idea has been to iterate on the aforementioned branches and instance until we have something that works sufficiently well. At that point we'll tidy up the branches and submit them for merging. Shortly after that time the hope is that we'll be able to stand up the official Perl6 instance.

The list of requirements for being adequately cooked is:

  1. track Perl6 CPAN dists and ignore Perl5 dists
  2. import a Perl6 distribution
  3. index a Perl6 distribution for search
  4. render pod6 documentation
  5. do Perl6 syntax highlighting

All of these have been hacked in and are at various degrees of completeness. Next up is testing and fixing bugs until nothing major is left. To that end I've recently loaded up the dev instance with all the distributions from modules.perl6.org. The dist files were generated, very hackily, with https://github.com/jdv/cpan-api/blob/master/test_p6_eco_to_p6_cpan.pl. I also just loaded them all under one user, mine, for simplicity. That load looks like it has problems of its own as well as revealing a bunch of issues. So in the coming days I hope to get that all sorted out.

Perl6 and CPAN

Published by jdv on 2015-10-08T13:31:00

In the Perl5 world, just in case anyone is unaware, CPAN is a major factor. It's basically the hub of the Perl5 world.

What I am referring to here as CPAN is not just the mirrored collection of 32K+ distributions. It's the ecosystem that's built up around that collection. This ecosystem has many parts, some more important than others depending on who you talk to, but the most important parts to me are:

These are the 5 aspects of "CPAN" that I'd like to see happen for Perl6. One way to get that would be to write the whole thing from scratch in Perl6. While it may sound cool in some sort of dogfoody and/or bootstrappy kind of way to some, it sounds like a lot of work to me and we're a bit strapped for developer resources. Another way would be to add support for Perl6 to the existing CPAN bits. The hope there being, primarily, that it'd be a lot less work. The latter approach is what I've been working on lately. And if we want to refactor ourselves off the Perl5 bits in the future we can take our time doing it; later.

At this time we have:

So we can publish Perl6 distributions to CPAN and search that collection. Well, sort of on that last bit. The metacpan prototype instance is not currently tracking CPAN. It's actually been loaded up with Perl6 distributions from the Perl6 module ecosystem (modules.perl6.org) for testing. But hopefully soon we'll have an official Perl6 metacpan instance, separate from the Perl5 instance, that will track CPAN's Perl6 content as it should.

What we need next is:

If anyone is interested in working on any of this stuff please stop by #perl6 on freenode. If nobody else is able to help you I'll (jdv79) do my best.

A Little GLRer (revision 1)

Published by Steve on 2015-09-11T09:43:00

The GLR (Great List Refactor) radically changed the way lists worked in Rakudo (an implementation of Perl 6).

This blog post is a list of some one-liners to show differences between the old (pre-glr) rakudo and the new (glr) rakudo intended to aid understanding and porting of modules.

Note this was done for self-education and may contain errors or things which may change. 

Thanks to those on Freenode IRC/perl6 for help.

Further corrections and expansions welcome either on IRC or via pull request to https://github.com/stmuk/glr-html



Each section shows the pre-GLR behaviour first (where applicable), then the GLR behaviour.

PARCEL IS NOW LIST

pre GLR:

> say (1,2,3).WHAT
(Parcel)

GLR:

> say (1,2,3).WHAT
(List)

LACK OF IMPLICIT LIST FLATTENING

pre GLR:

> my @array = 1,(2,3),4
1 2 3 4
> @array.elems
4

GLR:

> my @array = 1,(2,3),4
[1 (2 3) 4]
> @array.elems
3

to flatten

> my @list := 1, [2, 3], 4
(1 [2 3] 4)
> dd @list.flat.list
(1, 2, 3, 4)

or

> my @array = (1,(2,3),4).flat
[1 2 3 4]

or more complex structures (jnthn++)

> say gather [[[[["a", "b"], "c"], "a"], "d"], "e"].deepmap(*.take)

.lol METHOD REMOVED

pre GLR:

> dd (1,2,3).lol
(1; 2; 3)

SINGLE ARG RULE

pre GLR:

> dd (1,)
(1,)
> dd [1,]
$ = [1]
> dd [[1,]]
$ = [[1]]

GLR:

> dd (1,)
(1)
> dd [1,]
[1]
> dd [[1],]
[[1],]

LIST NOW IMMUTABLE

pre GLR:

> my @array = 1,2,3
1 2 3
> @array.shift
1
> dd @array
@array = [2, 3]<>

GLR:

> my @list := 1,2,3
(1 2 3)
> @list.shift
Method 'shift' not found for invocant of class 'List'
> @list[0]
1
> dd @list
(1, 2, 3)

ARRAY IS MUTABLE AND A SUBCLASS OF LIST

GLR:

> my @array = 1,2,3
[1 2 3]
> @array[0]=0
0
> dd @array
@array = [0, 2, 3]
> say (Array).^mro
((Array) (List) (Cool) (Any) (Mu))

SLIP SUBCLASS OF LIST

GLR:

> my @a = 1, (2, 3).Slip, 4
[1 2 3 4]
> my $slip = slip(2,3)
(2 3)
> dd $slip
Slip $slip = $(2, 3)
> my @array = 1,$slip,4
[1 2 3 4]
> (1,$(2,3),4)
(1 (2 3) 4)
> (1,|(2,3),4)
(1 2 3 4)

SEQUENCE

GLR:

> my $grep = (1..4).grep(*>2); dd $grep>>.Int;
(3, 4)
> dd $grep>>.Int;
This Seq has already been iterated, and its values consumed
in block  at :1

prevent consumption

> my $grep = (1..4).grep(*>2); my $cache=$grep.cache
(3 4)
> say $cache>>.Int
(3 4)
> say $cache>>.Int
(3 4)

> my @array = 1,(2,3),4
[1 (2 3) 4]
> dd @array.flat
(1, $(2, 3), 4).Seq
> dd @array.flat.list
(1, $(2, 3), 4)

YAPC::EU 2015

Published by Steve on 2015-09-05T08:42:00

We came down to Granada on Tuesday night and (after missing the pre-conference meeting with its free pizza) made our way to the Amsterdam Bar with its massive selection of bottled import beers and rather bizarre nut and soft sweets tapas.

Wednesday morning we made our way to the venue.  The conference topic was Art and Engineering and the venue a particularly arty looking university science building with a large Foucault pendulum inside and "Bombes de Vapor" (steam engines and the like) outside.  The Arabic art influenced T shirts were the most stylish since the Pisa ones and the seats in the main hall were the most comfortable YAPC seats ever.






I first saw Leon Timmermans give some good advice about how to contribute to Perl 5 core even if you didn't know the odd C89 plus macros language in which it was written.  It was followed by Bart (brrt) Wiegmans speaking about the Just In Time (JIT) compiler for MoarVM -- perl6's main VM -- in a quite high level talk so we were spared the scary details (which I later noticed included s-expressions).  Kang-min (gugod) Liu spoke about Booking's search engine which he couldn't show us and how he indexed his email (which he could).

https://github.com/gugod/Mailsheep

The main conference dinner of tapas was that evening around the pool of a four star hotel with constant glass refills.  Thankfully no one fell in.  More sadly we learnt Jeff Goff had been injured earlier and was in hospital.

Next day started with Sawyer X's State of the [Art] Velociraptor which was upbeat and positive stressing the benefits of community.  Upasana spoke about Moose meta objects and Leonerd bravely fought AV issues to speak about how perl5 sort of resembled scheme a little bit. 

https://t.co/9KeXNzSSlJ

At the end of the day Xavier Noria, currently a Ruby programmer, spoke about how much he missed Perl since many things (like docs) were better.

Next day I got up at silly o'clock to hear Art School dropout Stevan Little compare his former subject with programming with some interesting details about painting techniques.  Kerstin Puschke talked about RabbitMQ including some live code examples using Perl 5.

https://speakerdeck.com/titanoboa/decouple-all-the-things-asynchronous-messaging-keeps-it-simple



Domm told us about his image uploading Perl 6 script

http://domm.plix.at/talks/2015_granada_potd

which uploaded pics to twitter including one of his audience.

http://domm.plix.at/potd/2015-09-04.html

Gabor talked us through a minimal port of Dancer to Perl 6 called "Bailador" (which is part of Rakudo Star).

http://perl6maven.com/bailador-app-in-a-module

perl6maven.com actually uses perl6 in production!

Herbert Breunung spoke about Functional Perl 6 using a particularly garish slide deck.  John Lightsey did a line by line audit of an old version of Module::Signature to point out some security issues.  Liz did Jonathan's Parallelism, Concurrency, and Asynchrony in Perl 6 since the original author sadly couldn't make it.  At least one thing had changed in the week since I last heard the talk!

Finally a long haired Larry compared Perl 5 and 6 with Tolkien's Hobbit and Lord of the Rings respectively  and sang a bit. Two out of the three big ticket items for Perl 6.0 were done and things were looking good for a Long Expected Christmas Party.  This was a truly great keynote and went down a storm.

Some of the final lightning talks were particularly good with one even given in rapid Japanese.  To finish off Friday night Sue organised a "Sherry Tasting" visit to a local tapas restaurant which also included much consumption of the local beer Alhambra 1925.  A large number of mongers turned up to effectively take over the whole place.  Some also stayed up all night playing card games.

Perl6 Grammars for Beginners Talk

Published by Steve on 2015-08-14T07:35:00

I gave a beginners level talk about Perl6 grammars to a meeting of London Perlmongers on Aug 13th talking about App::p6tags (Generate editor tags for perl6).

https://github.com/stmuk/p6-app-p6tags


Slides are at
https://drive.google.com/file/d/0B731WtD7iMvMbThKOEszellTTjA/view?usp=sharing

And video on youtube.





Three Tales of Second System Syndrome

Published by Brent Laabs on 2015-05-03T19:57:00


In the last decade, three major scripting languages embarked on a project to produce a major revision to each language: Perl 6, Python 3, and PHP 6. Despite surface similarities, such as the problem of Unicode support, each language ended up on a radically different track.

With the Perl 6.0.0 release officially coming this year, it's a good time to reflect on how we got to this point, and to start thinking about what comes after the release.

PHP 6

So -- and I can't believe I'm writing this -- let's see if we can learn something from PHP. Andi Gutmans, who is now the CEO of Zend Technologies, gave an interview back in February 2008. In it, he said,
"So we are anticipating a long rollout cycle for PHP 6, and we did not want to take the same route that the Perl project did, with project contributors still working on Perl 6 I think six years later. People make fun of Microsoft, but take a look at Perl 6. . . ."

To which Andy Lester of PerlBuzz replied:
"Sure, PHP 6 may have a shorter release cycle than Perl 6 has, but at the end of it all, we'll have Perl 6, and you'll still have PHP.

Just sayin'.
xoxo,
Andy"
So how did those predictions work out? Well, after a little over six years of development, we discovered that we were never going to see a PHP 6 at all. Having seen how long Perl 6 had taken, and how long PHP 6 was taking, the number 6 is associated with failure. So they cancelled PHP 6 and voted to change the name to PHP 7. Problem solved! No, really, this is some of the actual reasoning given by people on the 6 to 7 RFC. (Someone should tell the ES6 folks before the curse strikes our browsers!)

But the main intent of the renumbering was to justify a much reduced scope of new features for the next major version of PHP. PHP 7 is slated to add:
EDIT: People both here and on Hacker News have pointed out that the above feature list was from a bad source, and that much of PHP 6 was incorporated into 5.3.  See the better summary of PHP 7 features, including generator improvements, and new operators like ??.  However, much of the same analysis still applies -- the end result was very few backwards incompatible changes, not the major revision promised with major Unicode improvements.

Perl 6

Meanwhile, Perl 6 has taken 15 years to get to the 6.0.0 release slated for this Christmas.  I'm sure that there were some embarrassing quotes about when it's going to be done, but that was so long ago, I'll just link to this post forecasting that Perl 6 will be ready for production in 2027.

As it now stands, Perl 6 comes with this set of new features:
Honestly, there are a whole lot more of these features. This even excludes things that have already made it back into the Perl 5 core, like subroutine signatures and smartmatching. And these are all things that are working today.

The eerie thing is that Andy's flippant prediction came true. At the end of it, we have Perl 6, and they still have the same old PHP. Let me repeat that: we have Perl 6. It works, it will get a major release this year, and it is going to come with many more features than originally promised.

Still, Perl 6 has had its share of doubters. Some people proposed, actually seriously, that Perl 5 should leapfrog ahead to Perl 7 with the next version, and Perl 6 can go on calling itself that if it wants. Right. While this idea was rejected by the general Perl community, PHP actually skipped a version a year later. I guess it's another example of PHP stealing the worst ideas from Perl.

Python 3

The Python group, on the other hand, has tried to stay mostly on the side of sanity. Python 3 introduced a much smaller set of breaking changes, in order to keep updates rolling out. It was introduced, well, six years ago in early 2009.

New features of Python 3 included:
So how's that working out? The latest version of python preinstalled on my fully updated MacBook is 2.7.6. At least Ubuntu gives me 3.4.0; Apple is well known to be crap at updating OSS. But you'd think someone at Apple would have cared enough in six years to throw python3 in the XCode monster download; after all, Python does not have the kiss of death known as the GPLv3 license.

The flip side of availability is developer adoption; this story isn't much better. If you look at statistics from last year and this month, Python 3 adoption rates are abysmal. Hell, even 23% of people inside the Python community still think Python 3 was a mistake. Despite obvious improvements, it's still considered a tough sell.

Second Deployment Syndrome

So the takeaway from all of this is that Second System Syndrome is a real problem, but not the only problem. Successfully executing major revisions to a language is difficult, but getting widespread adoption is just as difficult with breaking changes. Second Deployment Syndrome can be just as hard to face as making the new system in the first place.

So we have three software communities that took radically different approaches to building a second system. PHP is a complete zoo of awful design, begging to be tamed. Yet the PHP community effectively voted to give up, and only offer incremental change that doesn't address PHP 6's number one issue of Unicode support. The Python folks, bless their hearts, made a smaller set of achievable changes, implemented it in 3 years, and shipped the damn thing. And despite the truly useful improvements, only a few people came.

Perl decided to stick to its vision of "break all the things once", and it's taken 15 long years. That's almost as long as the HTML 5 spec. Over this time, the design has continued to evolve, incorporating more modern needs like easily multithreaded code that would have otherwise been missed. Although the complaint of "no final spec" is common, it has been learned the hard way that the spec is the very last thing that should be finalized.

It's easy to naively say that 15 years is a ridiculous amount of development time, but it's obvious from looking at second systems for the other scripting languages that Perl 6 was never going to complete such a major transition in less than a decade. What's still unclear is whether this transition is going to work out for Perl.

Nearly everyone who tries Perl 6 from a Perl 5 background likes it immensely, which is usually followed up by a "can this not be so slow?" Optimization is still getting there, just not prematurely. In general, reception has been a net positive. And unlike the breaking changes introduced in the other languages, Inline::Perl5 allows multiple versions of Perl to coexist in the same program.

Will this be enough? It's too early to tell. Perl 5 is going to last another 5 years at the minimum, if not forever, munging text output by a shell script written by a programmer from generations ago. Perl 6 will have an uphill battle with Perl 5 for ubiquity, legacy code, and language familiarity.

Adoption rate is the next big challenge facing Perl 6. There is a very real possibility that six years from now, Perl 5 will still be the dominant form of an ever shrinking faction of Perl users. After all, Python might be in the same boat right now. Perl needs to reverse an already existing downward trend, at least partially brought on by how frakking long Perl 6 took in the first place.

The best advice I can see for ensuring Perl 6's success is for Perl developers to start writing code in Perl 6. I mean now; it's definitely stable enough. Every module available within a year of release is going to be a major argument for people to try the new version. Getting Shit Done can win a lot of arguments.

After that, it's going to be a tough slog. Is it deployed in enough places to distribute code in? Is there enough code written in it to deploy to more places? Package managers like apt and Homebrew are going to help with bootstrapping the user base, but to win, Perl 6 is going to have to get that killer app.

So for now, it's a giant gamble. In poker terms, Python 3 called, PHP 6 folded, and Perl 6 went all-in. It just might be possible that Perl 6's crazy long development process can produce the best-adopted second system around, if people decide that the overwhelming improvements are worth the hassle of upgrading.

I'll let you know how that went in six years.

Parrot 7.4.0 "Festive Amazon" released! by Bruce Gray

Published on 2015-05-20T14:59:53

On behalf of the Parrot team, I'm proud to announce Parrot 7.4.0, also known
as "Festive Amazon". Parrot (http://parrot.org/) is a virtual machine aimed
at running all dynamic languages.

Parrot 7.4.0 is available on Parrot's FTP site
(ftp://ftp.parrot.org/pub/parrot/releases/devel/7.4.0/), or by following the
download instructions at http://parrot.org/download. For those who would like
to develop on Parrot, or help develop Parrot itself, we recommend using Git to
retrieve the source code to get the latest and best Parrot code.

Parrot 7.4.0 News:
- Documentation
    + Many minor corrections
- Community
    + Coverity scans to resume RSN.


The SHA256 message digests for the downloadable tarballs are:
b191da72e668c5bd97e1792a1b5d8fe66713819066f6a2f5eef2e9bc21d92968 parrot-7.4.0.tar.gz
724868f94bf7d45ba5cda29b041b18fc7cbcd2fe5196455cc3882c2f99a84f4b parrot-7.4.0.tar.bz2

Many thanks to all our contributors for making this possible, and our sponsors
for supporting this project. Our next scheduled release is at 16 Jun 2015.

Enjoy!

Parrot 7.3.0 release announcement by Reini Urban

Published on 2015-04-21T17:58:08

On behalf of the Parrot team, I'm proud to announce Parrot 7.3.0, also
known as "Peach-faced Lovebird".
It is a supported release with a stable API until 7.6.0 end of July 2015.
Parrot (http://parrot.org/) is a virtual machine aimed at running all
dynamic languages.

Parrot 7.3.0 is available on Parrot's FTP site
(ftp://ftp.parrot.org/pub/parrot/releases/supported/7.3.0/), or by following the
download instructions at http://parrot.org/download. For those who
would like to develop on Parrot, or help develop Parrot itself, we
recommend using Git to retrieve the source code to get the latest and
best Parrot code.

Parrot 7.3.0 News:
- Build
    + Fixed windows link regression from 7.0.2 with cl.exe. #1203
    + Fixed rlimit compilation for OpenBSD
- Tests
    + Relaxed the common GC stress test and re-add the JSON.nqp variant.


The SHA256 message digests for the downloadable tarballs are:
23d2f59a0399a63a835087a192bede02a25f21fbcf5e42ed113b4c6dcdbea6b1 parrot-7.3.0.tar.gz
a40a6a21965ead120ceee7ac98e3b0ba2edbdfa2a5d8637ace91dcc7991373f2 parrot-7.3.0.tar.bz2

Many thanks to all our contributors for making this possible, and our
sponsors for supporting this project. Our next scheduled release is
at 19 May 2015.

Enjoy!

Parrot 7.2.0 "Blue-crowned racquet-tail" released! by Bruce Gray

Published on 2015-03-19T06:06:01

This is the bright candlelit room where the life-timers are
stored—shelf upon shelf of them, squat hourglasses, one for every
living person, pouring their fine sand from the future into the past.
The accumulated hiss of the falling grains makes the room roar like
the sea.

This is the owner of the room, stalking through it with a preoccupied air.
His name is Death.

But not any Death. This is the Death whose particular sphere of
operations is, well, not a sphere at all, but the Discworld, which is
flat and rides on the back of four giant elephants who stand on the
shell of the enormous star turtle Great A’Tuin, and which is bounded by
a waterfall that cascades endlessly into space.

Scientists have calculated that the chance of anything so patently
absurd actually existing are millions to one.

But magicians have calculated that million-to-one chances crop up nine
times out of ten.

-- "Mort", GNU Terry Pratchett

On behalf of the Parrot team, I'm proud to announce Parrot 7.2.0, also known
as "Blue-crowned racquet-tail". Parrot (http://parrot.org/) is a virtual machine aimed
at running all dynamic languages. The blue-crowned racket-tail (Prioniturus discurus)
is a parrot found on all the larger islands of the Philippines not starting with "P".

Parrot 7.2.0 is available on Parrot's FTP site
(ftp://ftp.parrot.org/pub/parrot/releases/devel/7.2.0/), or by following the
download instructions at http://parrot.org/download. For those who would like
to develop on Parrot, or help develop Parrot itself, we recommend using Git to
retrieve the source code to get the latest and best Parrot code.

Parrot 7.2.0 News:
- Build
    + Fix warning on Win32 (with cl.exe) when `link` is not explicitly set.


The SHA256 message digests for the downloadable tarballs are:
f4792fc1a82040dd855f73890de6fa26759aa62f4b4ad1aa468597592b7bf3bf parrot-7.2.0.tar.gz
74e5821155eaf29d7c1655fd3b5b90a84afe23361318242947c50f59da5918e1 parrot-7.2.0.tar.bz2

Many thanks to all our contributors for making this possible, and our sponsors
for supporting this project. Our next scheduled release is at 21 Apr 2015.

Enjoy!

Suspending Rakudo support for Parrot

Published by pmichaud on 2015-02-16T15:47:37

At FOSDEM 2015, Larry announced that there will likely be a Perl 6 release candidate in 2015, possibly around the September timeframe. What we’re aiming for is concurrent publication of a language specification that has been implemented and tested in at least one usable compilation environment — i.e., Rakudo Perl 6.

So, for the rest of 2015, we can expect the Rakudo development team to be highly focused on doing only those things needed to prepare for the Perl 6 release later in the year. And, from previous planning and discussion, we know that there are three major areas that need work prior to release: the Great List Refactor (GLR), Native Shaped Arrays (NSA), and Normalization Form Grapheme (NFG).

…which brings us to Parrot. Each of the above items is made significantly more complicated by Rakudo’s ongoing support for Parrot, either because Parrot lacks key features needed for implementation (NSA, NFG) or because a lot of special-case code is being used to maintain adequate performance (lists and GLR).

At present most of the current userbase has switched over to MoarVM as the backend, for a multitude of reasons. And more importantly, there currently aren’t any Rakudo or NQP developers on hand that are eager to tackle these problems for Parrot.

In order to better focus our limited resources on the tasks needed for a Perl 6 language release later in the year, we’re expecting to suspend Rakudo’s support for the Parrot backend sometime shortly after the 2015.02 release.

Unfortunately the changes that need to be made, especially for the GLR, make it impractical to simply leave existing Parrot support in place and have it continue to work at a “degraded” level. Many of the underlying assumptions will be changing. It will instead be more effective to (re)build the new systems without Parrot support and then re-establish Parrot as if it is a new backend VM for Rakudo, following the techniques that were used to create JVM, MoarVM, and other backends for Rakudo.

NQP will continue to support Parrot as before; none of the Rakudo refactorings require any changes to NQP.

If there are people that want to work on refactoring Rakudo’s support for Parrot so that it’s more consistent with the other VMs, we can certainly point them in the right direction. For the GLR this will mainly consist of migrating parrot-specific code from Rakudo into NQP’s APIs. For the NSA and NFG work, it will involve developing a lot of new code and feature capabilities that Parrot doesn’t possess.

Announce: Rakudo Star Release 2015.01 by Moritz Lenz

Published on 2015-02-07T23:23:53

# Announce: Rakudo Star Release 2015.01

## A useful, usable, "early adopter" distribution of Perl 6

On behalf of the Rakudo and Perl 6 development teams, I'm happy to
announce the January 2015 release of "Rakudo Star", a useful and usable
distribution of Perl 6. The tarball for the January 2015 release is
available from <http://rakudo.org/downloads/star/>.

This Rakudo Star release comes with support for the MoarVM
backend (all module tests pass on supported platforms) along with
experimental support for the JVM backend (some module tests fail).
Three shipped modules are known to fail on Parrot (zavolaj (NativeCall),
jsonrpc and doc).

In the Perl 6 world, we make a distinction between the language
("Perl 6") and specific implementations of the language such as
"Rakudo Perl". This Star release includes [release 2015.01.1] of the
[Rakudo Perl 6 compiler], version 7.0.1 of the [Parrot Virtual
Machine], version 2015.01 of [MoarVM], plus various modules,
documentation, and other resources collected from the Perl 6
community.

[release 2015.01.1]: https://github.com/rakudo/rakudo/blob/nom/docs/announce/2015.01.md
[Rakudo Perl 6 compiler]: http://github.com/rakudo/rakudo
[Parrot Virtual Machine]: http://parrot.org
[MoarVM]: http://moarvm.org/

Some of the new compiler features added to this release include:

+ Many improvements to Java interop for the JVM backend
+ New simple way of creating an object hash: :{}
+ Substitution now supports assignment meta-op, e.g. s[\d+] += 2
+ Many memory and CPU optimizations
+ Supply.for deprecated in favour of Supply.from-list
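
Three of the items above can be tried out directly. The following is a minimal sketch, assuming a reasonably recent Rakudo on MoarVM; the variable names and sample values are made up for illustration and do not come from the release notes.

```raku
# Object hash literal: keys keep their type instead of being stringified.
my %by-object := :{ 42 => 'an Int key' };
say %by-object.keys[0].WHAT;    # (Int)

# Substitution with an assignment meta-op: add 2 to the first number found.
my $text = 'room 40';
$text ~~ s[\d+] += 2;
say $text;                      # room 42

# Supply.from-list, the replacement for the deprecated Supply.for.
my @seen;
Supply.from-list(1, 2, 3).tap({ @seen.push($_) });
say @seen;                      # [1 2 3]
```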

Changes to modules included in Rakudo Star:

- [Bailador](https://github.com/tadzik/Bailador) handles POST and URL
params separately
- [DBIish](https://github.com/perl6/DBIish) has improved error reporting
on SQLite
- [doc](https://github.com/perl6/doc) ships with much more documentation
- [panda](https://github.com/tadzik/panda) has a new command `installdeps`
- [Pod::To::HTML](https://github.com/perl6/Pod-To-HTML) now supports
callbacks for code areas

Parrot support will likely be suspended or dropped from future Rakudo
and Rakudo Star releases, starting with the February or March releases.

In the next Rakudo Star release, modules `Math::RungeKutta` and
`Math::Model` will likely be dropped. They can still be installed with `panda`.

In future, the `nqp::` namespace will only be available after a declaration
like `use nqp;`.

There are some key features of Perl 6 that Rakudo Star does not yet
handle appropriately, although they will appear in upcoming releases.
Some of the not-quite-there features include:

* advanced macros
* threads and concurrency (in progress for the JVM and MoarVM backend)
* Unicode strings at levels other than codepoints
* interactive readline that understands Unicode
* non-blocking I/O (in progress for the JVM and MoarVM backend)
* much of Synopsis 9 and 11

There is an online resource at <http://perl6.org/compilers/features>
that lists the known implemented and missing features of Rakudo's
backends and other Perl 6 implementations.

In many places we've tried to make Rakudo smart enough to inform the
programmer that a given feature isn't implemented, but there are many
that we've missed. Bug reports about missing and broken features are
welcomed at <[email protected]>.

See <http://perl6.org/> for links to much more information about
Perl 6, including documentation, example code, tutorials, reference
materials, specification documents, and other supporting resources. A
draft of a Perl 6 book is available as docs/UsingPerl6-draft.pdf in
the release tarball.

The development team thanks all of the contributors and sponsors for
making Rakudo Star possible. If you would like to contribute, see
<http://rakudo.org/how-to-help>, ask on the <[email protected]>
mailing list, or join us on IRC \#perl6 on freenode.

Parrot 7.0.2 Hotfix released by Reini Urban

Published on 2015-01-29T14:02:38

We detected and fixed two bugs and regressions from 6.10.0 which
failed to build parrot on Microsoft Windows with Microsoft Visual
Studio C++.

- Wrong function ptr cast on win64
https://github.com/parrot/parrot/issues/1190
- Wrong SAL annotations on msvc cl < 16.00
https://github.com/parrot/parrot/issues/1192

Other minor changes in this hotfix:
- Optimize away ExtUtils::Command on posix systems. #1177
- Fix cpu config values for gcc_cmpxchg to include atomic/gcc_x86.o on amd64.
Harmonized the cpu config keys, no $platform_has_$feature
keys anymore, just HAS_$PLATFORM_$feature. #1173
- Improved msvc configuration from a mingw perl. #1191

Parrot is a virtual machine aimed at running all dynamic languages.
Parrot 7.0.2 is available on Parrot's FTP site, or by following the
download instructions. For those who want to hack on Parrot or
languages that run on top of Parrot, we recommend our organization
page on GitHub, or you can go directly to the official Parrot Git repo
on GitHub.

To clone the Parrot Git repo into a directory called 'parrot', use the
following:
git clone git://github.com/parrot/parrot.git

If you want it to be in a directory other than 'parrot', then just
give that as a second argument to clone:
git clone git://github.com/parrot/parrot.git parrot_foo

The SHA256 message digests for the downloadable tarballs are:
1a8e9e203ad8ac92c89422603d1603fa821e957aa3a9ae57420c737d93c55213 parrot-7.0.2.tar.gz
2cc8dc1eada38bb6328bf1f4648bd5e01e000b415f984b7ad6b5b6c123a15ac9 parrot-7.0.2.tar.bz2

Thanks to all our contributors for making this possible, and our
sponsors for supporting this project. Our next scheduled release is at
17 Feb 2015.
Enjoy!
--
Reini Urban
http://cpanel.net/ http://www.perl-compiler.org/

APW2014 and the Rakudo Great List Refactor

Published by pmichaud on 2014-10-15T15:01:55

This past weekend I attended the 2014 Austrian Perl Workshop and Hackathon in Salzburg, which turned out to be an excellent way for me to catch up on recent changes to Perl 6 and Rakudo. I also wanted to participate directly in discussions about the Great List Refactor, which has been a longstanding topic in Rakudo development.

What exactly is the “Great List Refactor” (GLR)? For several years Rakudo developers and users have identified a number of problems with the existing implementation of list types — most notably performance. But we’ve also observed the need for user-facing changes in the design, especially in generating and flattening lists.  So the term GLR now encompasses all of the list-related changes that seem to want to be made.

It’s a significant (“great”) refactor because our past experience has shown that small changes in the list implementation often have far-reaching effects. Almost any bit of rework of list fundamentals requires a fairly significant refactor throughout much of the codebase. This is because lists are so fundamental to how Perl 6 works internally, just like the object model. So, as the number of things that are desirable to fix or change has grown, so has the estimated size of the GLR effort, and the need to try to achieve it “all at once” rather than piecemeal.

The pressure to make progress on the GLR has been steadily increasing, and APW2014 was significant in that a lot of the key people needed for that would be in the same location. Everyone I’ve talked to agrees that APW2014 was a smashing success, and I believe that we’ve now resolved most of the remaining GLR design issues. The rest of this post will describe that.

This is an appropriate moment to recognize and thank the people behind the APW effort. The organizers did a great job.  The Techno-Z and ncm.at venues were fantastic locations for our meetings and discussions, and I especially thank ncm.at, Techno-Z, yesterdigital, and vienna.pm for their generous support in providing venues and food at the event.

So, here’s my summary of GLR issues where we were able to reach significant progress and consensus.

You are now leaving flatland

(Be sure to visit our gift shop!)

Much of the GLR discussion at APW2014 concerned flattening list context in Perl 6. Over the past few months and years Perl 6 has slowly but steadily reduced the number of functions and operators that flatten by default. In fact, a very recent (and profound) change occurred within the last couple of months, when the .[] subscript operator for Parcels switched from flattening to non-flattening. To illustrate the difference, the expression

(10,(11,12,13),(14,15)).[2]

previously would flatten out the elements to return 12, but now no longer flattens and produces (14,15). As a related consequence, .elems no longer flattens either, changing from 6 to 3.

Unfortunately, this change created an inconsistency between Parcels and Lists, because .[] and .elems on Lists continued to flatten. Since programmers often don’t know (or care) when they’re working with a Parcel or a List, the inconsistency was becoming a significant pain point. Other inconsistencies were increasing as well: some methods like .sort, .pick, and .roll have become non-flattening, while other methods like .map, .grep, and .max continue to flatten. There’s been no really good guideline to know or decide which should do which.

Flattening behavior is great when you want it, which is a lot of the time.  After all, that’s what Perl 5 does, and it’s a pretty popular language. But once a list is flattened it’s hard to get the original structure if you wanted that — flattening discards information.

So, after many animated discussions, review of lots of code snippets, and seeking some level of consistency, the consensus on Perl 6 flattening behavior seems to be:

United Parcel Severance

As a result of improvements in flattening consistency and behavior, it appears that we can eliminate the Parcel type altogether. There was almost unanimous agreement and enthusiasm at this notion, as having both the Parcel and List types is quite confusing.

Parcel was originally conceived for Perl 6 as a “hidden type” that programmers would rarely encounter, but it didn’t work out that way in practice. It’s nice that we may be able to hide it again — by eliminating it altogether. 🙂

Thus infix:<,> will now create Lists directly. It’s likely that comma-Lists will be immutable, at least in the initial implementation. Later we may relax that restriction, although immutability also provides some optimization benefits, and Jonathan points out that it may help to implement fixed-size Arrays.

Speaking of optimization, eliminating Parcel may be a big boost to performance, since Rakudo currently does a fair bit of converting Parcels to Lists and vice-versa, much of which goes away if everything is a List.

A few more times around the (loop) blocks

During a dinner discussion Jonathan reminded me that Synopsis 4 has all of the looping constructs as list generators, but Rakudo really only implements for at the moment. He also pointed out that if the loop generators are implemented, many functions that currently use gather/take could potentially use a loop instead, and this could be much more performant. After thinking on it a bit, I think Jonathan is on to something. For example, the code for IO::Handle.lines() currently does something like:

gather {
    until $!PIO.eof {
        $!ins = $!ins + 1;
        take self.get;
    }
}

With a lazy while generator, it could be written as

(while not $!PIO.eof { $!ins++; self.get });

This is lazily processed, but doesn’t involve any of the exception or continuation handling that gather/take requires. And since while might choose to not be strictly lazy, but lines() definitely should be, we may also use the lazy statement prefix:

lazy while not $!PIO.eof { $!ins++; self.get };

The lazy prefix tells the list returned from the while that it’s to generate as lazily as it possibly can, only returning the minimum number of elements needed to satisfy each request.

So as part of the GLR, we’ll implement the lazy list forms of all of the looping constructs (for, while, until, repeat, loop). In the process I also plan to unify them under a single LoopIter type, which can avoid repetition and be heavily optimized.

This new loop iterator pattern should also make it possible to improve performance of for statements when performed in sink context. Currently for statements always generate calls to .map, passing the body of the loop as a closure. But in sink context the block of a for statement could potentially be inlined. This is the way blocks in most other loops are currently generated. Inlining the block of the body could greatly increase performance of for loops in sink context (which are quite common).

Many people are aware of the problem that constructs such as for and map aren’t “consuming” their input during processing. In other words, if you’re doing .map on a temporary list containing a million elements, the entire list stays around until all have been processed, which could eat up a lot of memory.

Naive solutions to this problem just don’t work — they carry lots of nasty side effects related to binding that led us to design immutable Iterators. We reviewed a few of them at the hackathon, and came back to the immutable Iterator we have now as the correct one. Part of the problem is that the current implementation is a little “leaky”, so that references to temporary objects hang around longer than we’d like and these keep the “processed” elements alive. The new implementation will plug some of the leaks, and then some judicious management of temporaries ought to take care of the rest.

I’ve got a sinking feeling…

In the past year much work has been done to improve sink context in Rakudo, but I’ve never felt the implementation we have now is what we really want. For one, the current approach bloats the codegen by adding a call to .sink after every sink-context statement (i.e., most of them). Also, this only handles sink for the object returned by a Routine; the Routine itself has no way of knowing it’s being called in sink context such that it could optimize what it produces (and not bother to calculate or return a result).

We’d really like each Routine to know when it’s being called in sink context.  Perl 5 folks will instantly say “Hey, that’s wantarray!”, which we long ago determined isn’t generally feasible in Perl 6.

However, although a generalized wantarray is still out of reach, we can provide it for the limited case of detecting sink contexts that we’re generating now, since those are all statically determined. This means a Routine can check if it’s been called in sink context, and use that to select a different codepath or result.  Jonathan speculates that the mechanism will be a flag in the callsite, and I further speculate the Routine will have a macro-like keyword to check that flag.

Even with detecting context, we still want any objects returned by a Routine to have .sink invoked on them.  Instead of generating code for this after each sink-level statement, we can do it as part of the general return handler for Routines; a Routine in sink context invokes .sink on the object it would’ve otherwise returned to the caller.  This directly leads to other potential optimizations:  we can avoid .sink on some objects altogether by checking their type, and the return handler probably doesn’t need to do any decontainerizing on the return value.

As happy as I am to have discovered this way to pass sink context down into Routines, please don’t take this as opening an easy path to lots of other wantarray-like capabilities in Perl 6. There may be others, and we can look for them, but I believe sink context’s static nature (as well as the fact that a false negative generally isn’t harmful) makes it quite a special case.

The value of consistency

One area that has always been ambiguous in the Synopses is determining when various contextualizing methods must return a copy or are allowed to return self. For example, if I invoke .values on a List object, can I just return self, or must I return a clone that can be modified without affecting the original? What about .list and .flat on an already-flattened list?

The ultra-safe answer here is probably to always return a copy… but that can leave us with a lot of (intermediate) copies being made and lying around. Always returning self leads to unwanted action-at-a-distance bugs.

After discussion with Larry and Jonathan, I’ve decided that true contextualizers like .list and .flat are allowed to return self, but other methods are generally obligated to return an independent object.  This seems to work well for all of the methods I’ve considered thus far, and may be a general pattern that extends to contextualizers outside of the GLR.

Now it’s just a SMOPAD

(small matter of programming and documentation)

The synopses — especially Synopsis 7 — have always been problematic in describing how lists work in Perl 6. The details given for lists have often been conjectural ideas that quickly prove to epic fail in practice. The last major list implementation was done in Summer 2010, and Synopsis 7 was supposed to be updated to reflect this design. However, the ongoing inconsistencies (that have led to the GLR) really precluded any meaningful update to the synopses.

With the progress recently made at APW2014, I’m really comfortable about where the Great List Refactor is leading us. It won’t be a trivial effort; there will be significant rewrite and refactor of the current Rakudo codebase, most of which will have to be done in a branch. And of course we’ll have to do a lot of testing, not only of the Perl 6 test suite but also the impact on the module ecosystem. But now that many of the hard decisions have been made, we have a roadmap that I hope will enable most of the GLR to be complete and documented in the synopses by Thanksgiving 2014.

Stay tuned.

Camelia at Age 13: Perl 6 on the JVM Debuts

Published by Brent Laabs on 2013-06-22T01:23:00

Perl 6 is now thirteen years old.  And she's very much a teenager in attitude, self-confident yet still growing up.  This contrasts with Javascript, which emerged from Brendan Eich's head, fully-formed like Athena -- but that only shared the Zeus-sized headaches with everyone until JQuery came along.

But Camelia, as she is fondly referred to by fans of Perl 6, is growing up fast.  Both too fast, and not fast enough.  To some of the community, the prospect of major changes to the language is scary.  Perl 6 is trying all of these crazy new things -- invariant sigils, metaoperators, grammars. She's even doing subroutine signatures, because "all of her friends are doing it".

They can't stay little children forever, you know.

And teenagers are liable to do surprising things.  So it was, that this week we announced Rakudo Perl 6 now runs on the Java Virtual Machine (JVM).  It's not perfect yet, but 62% of the files in the spectest pass as of yesterday.  Given the rate things are progressing, I'm sure it's already at a higher pass percent.

And yet, I'm sure there is no small number of you whose first thought about Perl on the JVM was "Heresy!".

There are certainly good reasons to support this point of view.  Startup times are horrible at this early stage, still O(seconds), and much of that is JVM's overhead.  It has well known security issues.  And of course the major implementation is controlled by $corporation, who just wants to make money off it.  And why would we want to abandon open-source VMs?

Still, there are plenty of good reasons for the port.  One is that the JVM is ubiquitous, and many devices have a copy pre-installed.  Most of Java's stability issues have been dealt with, and it serves Perl's competitors well enough through Jython and JRuby.  And it is well-documented, with bazillions of libraries (more than fooilions, anyway).  So we can finally address the longstanding desires of the community in things like sockets and threading, because we can tell the difference between our mistakes and those of the VM.

Instead of thinking of Perl 5 as a "sister language", I like to think of it as Camelia's father instead.  A father that might be kind of upset that she brought "that kind of language" into our house.  But she has a mind of her own, and I'm afraid that this won't be the only boyfriend she'll be bringing home.  There is a GSoC grant to build a Javascript backend to Rakudo.  And Niecza Perl 6 already uses the .NET Runtime.

However, Perl 6 is not abandoning open-source and C language implementations, either.  The announcement of MoarVM shows that Perl 6 developers plan to develop a lightweight VM specifically for NQP and Rakudo.  C functions will be directly callable within Perl itself, with the NativeCall interface.

Now, if Parrot flies off on its own course, that's Parrot's call.  You know how these teenage relationships go -- this could end up in a blow-up on the quad, or just as easily turn into hot makeup coding.  What, you didn't think I was going to say something that would violate the CoC, did you?

But Perl 6 is not done growing yet.  Camelia, like other teenagers, cares about fashion and wishes she had better threads.  And, once we get tuits, this is pretty high priority.  Because any modern language needs to live in the multi-core reality.  This is something that we can still design around, that may not have received the same care ten years ago.  Many threading features are already baked into the language, like hyper operators and async blocks.

So I view the debut of the JVM port as Rakudo's real début, as with a debutante.  A treceañera, if you will.  I guess, given that she's 13, maybe it's a Bar Mitzvah -- except that she's not a boy, she's a butterfly.  But this is a chance to acknowledge Perl 6's presence in the language scene.  Of course, these coming-of-age ceremonies don't mean the teenager is truly grown up yet.

But grow up she will, and faster than some of you might like.  Perl 6 is rebellious, and changes things that those in her father's Perl 5 community don't understand.  But if you talk to the pumpkings, they sincerely hope that Camelia doesn't turn out exactly like her father.

After all, do we want to keep the ithreads model?  Do we want modules that dig into the compiler internals like XS does?  Perl 5 isn't perfect, we are just accustomed to its particular idiosyncrasies.

But for all that Perl 6 is different, she still loves her father.  We still have sigils, classes, @_ in subs (if you still want it), P5-style regexes, modules, and TIMTOWTDI.  It's still Perl.  Moreover, there are at least two efforts to run Perl 5 code in Perl 6 -- the v5 grammar/compiler in Perl 6, and access to libperl from MoarVM.  So the sky isn't falling on compatibility.

Nor is the other extreme true: Perl 6 development is in fact moving forward at a fast pace.  We know that Perl 6 is late.  Very late.  So late, in fact, that it's her father that's going to turn into a Pumpkin.  But when Perl 6 finally comes of age -- sooner than you think -- it will be something that the Perl community will be proud of.

And I apologize in advance for anything Camelia does in her bratty teenage years.

Thanking the Perl Community for an Awesome YAPC

Published by Brent Laabs on 2013-06-06T09:37:00

My first time at YAPC::NA was too incredible for words. That said, because this is a blog, I'm going to have to put it in words anyway.

So, about a month ago, I didn't even know that I was going to YAPC.  I was just talking to raiph++ in an IRC back channel, when he asked me if I was going to the conference.  I said that it would be fun, but I didn't really have money to go.  Being unemployed means lots of free time for hacking, but not so much free money for going places.

Well, raiph told diakopter++, who asked if I would be willing to go, if he found funds.  I responded, "Of course, if you think it's possible."  I soon went to sleep, and twelve hours later, I had a plane ticket to Austin in my inbox courtesy of The Perl Foundation.   So just like Peter Rabbitson's case, the Perl community eagerly gave me a chance to attend.  So thank you to all 440 attendees, and all of the sponsors for your own personal contribution to my attendance.  Even though I'm new, you all gave me a chance to participate in the community, and for that I am grateful.

And what a community it is.  I've long known that Perl programmers were a little strange.  Naturally, I fit right in.

The conference itself had quite a fun and informative series of talks.  More often than not, I had two or more that I wanted to attend at the same time.  For the most part, I stuck to the Perl 6 "track", where most of my work has been so far.  After all, it's not often that so many of the European contingent make a trip to our humble continent, so I was eager to spend time with them.

No one warned me that jnthn++ has a tendency to spring wild new features on us at YAPCs.  Reversible grammars, seriously?!  I'm still trying to wrap my head around that one.  The announcement of MoarVM was equally exciting, as it offers us a chance to start fresh with everything we learned about Perl 6 and virtual machines in the last five years.

So I have to say diakopter++ once again.  Besides introducing Moar in his talk, Matthew Wilson was constantly busy behind the scenes, making sure that everything ran smoothly the entire conference.  I think the man must be buying tuits on the black market.

YAPC also helped immensely in my hunt for a job.  The job fair brought me several contacts, and the talks helped me learn which skills I'll really need to learn in those jobs.  Person-to-person contact offers so much more in truly understanding the state of the language, and of what projects are of the greatest use right now.

Truly, Perl's community is its greatest strength.

It's the community that keeps Perl vital.  After seeing YAPC for myself, the whole "Perl is Dead" meme seems entirely baseless.  Conference attendance was up 19% over last year, which was the previous record high.  Perl feels like a growing language, with lots of experiments in how to revitalize the now 25-year-old syntax with p2, moe, and Perl 6.

The community keeps Perl relevant.  While it may not be the sole alternative to bash scripts like it once was, it is used for enterprise and homebrew projects alike, from the stock exchange to the surface of Mars.  Projects like DBIx, Moose, and Dancer provide modern frameworks to accomplish more with less work.

The community keeps Perl open.  No one seemed to be afraid to say what they felt, on CGI.pm or anything else, but everyone remained civil.  Hallway++ is a great social hack to get everyone to feel comfortable talking to each other.  So when I found myself sitting across from TimToady, instead of being intimidated as a newbie, I had a great conversation with him about supervolcanoes and nonverbal Japanese language.

And the community really wants all of the projects to succeed.  I spent a lot of time at non-profit and political events in the past, where we were all theoretically working for a common cause.  And yet scheming, conflict, and political maneuvering were inevitable.  But in Perl, where we actually have multiple implementations and modules competing for mindshare and tuits, people cheer for everything to succeed.  No one fights each other or rages against $language_of_the_week stealing our users, for the real enemy is the lack of tuits.

I overheard this at dinner last night, from a fellow first-time attendee:
"I'm just happy that the two of you liked my work." -- vanstyn
Although he was talking about DBIx, I think that captures the spirit of the conference as a whole.  All of us here -- from the n00bs to the pumpkings -- want to share our work and make something useful for others.  It's not an organization where we wait for pronouncements from on high, but one where users create endless variations and share them.  Not an organization so much as a family.

During TimToady's epistle/speech to the community, he said something like:
"We have faith, hope, and love, but the most awesome of these is love." -- Larry Wall
A line like this might seem a bit hokey out of context, but it was actually moving when I heard it.  We have faith that we can use Perl to solve our problems.  We have hope that Perl 5 and 6 will continue to get better. And we love Perl, unconditionally, despite all of her flaws.  And as Wil Wheaton says about us geeks, we just want to love our special thing the best we can, and go the extra mile to share it with others.

I just want to say that I love the Perl community right back.  You went out of your way to include me and all the other newcomers.  You all gave me a chance to learn, play, and code with you -- and to be part of your community -- and I am so glad you did.

A Perl 6 developer’s reply to a Russian Perl Podcast

Published by pmichaud on 2013-06-03T20:13:53

[This is a response to the Russian Perl Podcast transcribed by Peter Rabbitson and discussed at blogs.perl.org.]

I found this translation and podcast to be interesting and useful, thanks to all who put it together.

Since there seems to have been some disappointment that Perl 6 developers didn’t join in the discussions about “Perl 7” earlier this year, and in the podcast I’m specifically mentioned by name, I thought I’d go ahead and comment now and try to improve the record a bit.

While I can’t speak for the other Perl 6 developers, in my case I didn’t contribute to the discussion because nearly all the things I would’ve said were already being said better by others such as Larry, rjbs, mst, chromatic, etc.  I think a “Perl 7” rebrand is the wrong approach, for exactly the reasons they give.

A couple of statements  in the podcast refer to “hurting the feelings of Perl 6 developers” as being a problem resulting from a rebrand to Perl 7. I greatly appreciate that people are concerned with the possible impact of a Perl 5 rebrand on Perl 6 developers and our progress.  I believe that Perl 6’s success or failure at this point will have little to do with the fact that “6 is larger than 5”.  I don’t find the basic notion of “Perl 7” offensive or directly threatening to Perl 6.

But I fully agree with mst that “you can’t … have two successive numbers in two brands and not expect people to be confused.”  We already have problems explaining “5” and “6” — adding more small integers to the explanation would just make an existing problem even worse, and wouldn’t do anything to address the fundamental problems Perl 6 was intended to resolve.

Since respected voices in the community were already saying the things I thought about the name “Perl 7”, I felt that adding my voice to that chorus could only be more distracting than helpful to the discussion. My involvement would inject speculations on the motivations of Perl 6 developers into what is properly a discussion about how to promote progress with Perl 5.  I suspect that other Perl 6 developers independently arrived at similar conclusions and kept silent as well (Larry being a notable exception).

I’d also like to remark on a couple of @sharifulin’s comments in the podcast (acknowledging that the transcribed comments may be imprecise in the translation from Russian):

First, I’m absolutely not the “sole developer” of Perl 6 (13:23 in the podcast), or even the sole developer of Rakudo Perl 6.  Frankly I think it’s hugely disrespectful to so flippantly ignore the contributions of others in the Perl 6 development community.  Let’s put some actual facts into this discussion… in the past twelve months there have been over 6,500 commits from over 70 committers to the various Perl 6 related repositories (excluding module repositories); fewer than 4% (218) of those commits are from me. Take a look at the author lists from the Perl 6 commit logs and you may be a little surprised at some of the people you find listed there.

Second, there is not any sense in which I think that clicking “Like” on a Facebook posting could be considered “admitting defeat” (13:39 in the podcast). For one, my “Like” was actually liking rjbs’ reply to mst’s proposal, as correctly noted in the footnotes (thanks Peter!).

But more importantly, I just don’t believe that Perl 5 and Perl 6 are in a battle that requires there to be a conqueror, a vanquished, or an admission of defeat.

Pm

Porting a Module to Perl 6

Published by Brent Laabs on 2013-05-06T02:43:00

CPAN is a huge draw for Perl 5, with approximately umpteen zillion modules available for a wide array of purposes.  It's probably the biggest draw for the Perl 5 language these days, given the newer, hipper scripting languages out there like Ruby, Python, and of course INTERCAL.

The problem is, these modules are not yet usable in Perl 6 directly.  There is an ongoing project to allow Perl 5 code to run in Rakudo, but so far only the most basic code works: like basic loops, quite a few builtins, backticks, etc.  It does inherit from the Perl 6 object system, which is pretty cool, so $foo->WHAT can tell you if you have a Str, Int, or IO::Handle.

So for right now, the only practical way to use Perl 5 modules is to rewrite them in Perl 6.  I just finished porting the File::Spec module, one of Perl 5's core modules, to help deal with file paths on different operating systems. FROGGS++ did much of the initial work on it, but he's moved on to the P5 in P6 project mentioned above, so I picked up the slack. The end goal of the project is for me to integrate functionality like Perl 5's Path::Class into the core language, so that OS interoperability comes naturally when using the native functions.

As I got further into the port, I have been convinced that porting the module is a much better choice than relying on the Perl 5 code being integrated.  There are several reasons for this:

Code Cruft


There is a lot of support for operating systems that are now out of date.  This isn't a bad thing.  I'm sure that there's some hobbyist who will want to run Perl 6 on their OS/2 Warp system.  The problem comes when you look inside the code for the OS2 module:
    $path =~ s/^([a-z]:)/\l$1/s;
This little no-op snippet from canonpath (to produce the canonically correct path) converts a lowercase drive letter to lowercase.  It's not harmful, but it does illustrate the fact that no one has edited this code in 9 years.

This isn't the fault of the Perl 5 Porters -- they have plenty of better things to do than to support outdated OSes when not even bug tickets are coming in.  But translating the code sure gives a great opportunity to notice these problems.

In the end, I ended up cutting the entire OS2 module and delegating to Win32.pm, because it had support for things like UNC paths (//server/share) that OS2.pm had only half-implemented.  And so a huge block of code cruft bit the dust.

Readability and Maintainability


Part of the reason these issues happen in the first place is that it's harder to see what's going on in a given piece of code.

An example I came across was in this helper for tmpdir, a method to return the first temporary directory that's writable in a list of parameters.  In Perl 5, we get:

sub _tmpdir {
    my $self = shift;
    my @dirlist = @_;
    my $tmpdir;

    foreach (@dirlist) {
        next unless defined && -d && -w _;
        $tmpdir = $_;
        last;
    }
    return $self->canonpath($tmpdir);
}


That's actually good, idiomatic code for Perl 5, though it can look like spooky action at a distance if you're not aware of what's going on with $_, @_, and shift.

Equivalent code in Perl 6 looks like this:

method !tmpdir( *@dirlist ) {
    my $tmpdir = first { .defined && .IO.w && .IO.d }, @dirlist;
    return self.canonpath($tmpdir);
}


No messing about with parameters and keeping track of the object -- it all happens in the signature.  You no longer have to read through a loop to understand the code either -- in Perl 6 you can just say that you want the first matching candidate, and first() will lazily test the list for you.

The P6 version gets to the point much faster, and it's much closer to natural language: "set $tmpdir to the first defined writable directory in @dirlist."  Less, easier to read code is easier to maintain.

Changing Old Features


At some point, your code was working perfectly and passes all the tests.  But then the computer world changes around you, and it no longer makes any sense.  And you would like to refactor, but people rely on the old functionality.

This is exactly what happened for File::Spec's case_tolerant function.  It essentially looks at the operating system alone, and uses that to determine if the filesystem is case-sensitive.  Which in the old days made perfect sense when Macs used HFS+, Windows used FAT, and Unix used ufs or a variant.  But my computer runs Mac OS X and Windows and has several drive partitions in different formats.  Heck, the NTFS drives are case sensitive in POSIX-land, but as soon as I boot Windows they become case insensitive.

The only reasonable way to check this now is to actually check the filesystem for a specific directory, given widespread support for symlinks.  This breaks the old functionality.  But there's no time like a major language revision to break old APIs and replace them with shiny new ones.

However, there are a couple of major downsides to porting:

This is really time-consuming


Sure, you don't have to implement the algorithm from scratch, and you have plenty of tests to help your development.  It would be possible to just translate the existing code, because things aren't that different.  Change an if( $foo ) to if $foo, etc.

However, a major reason for doing the porting is to use the Perl 6 idioms instead, especially in function declarations and regular expressions where it makes a major difference in code readability.

Dependencies aren't available


Sometimes your code relies on separate modules not available, or on not yet implemented functions.  Your choice becomes to either implement the functionality yourself and embark on yet another yak-shaving expedition, or mark it as todo and wait for the appropriate functionality to arrive.

This has become a much smaller problem as of late as the core language matures.  But "done enough" is not really "done".



Now that I've written this, I've realized that my own project is a microcosm of the Perl 6 saga.  Making a better codebase takes a lot of time, but it ultimately seems worth the effort.

Of course, once I had gotten this far, I realized that File::Spec -- or something very much like it -- would be needed to implement IO::Path objects for non-unixlike OSes.  So stay tuned for the next part in this saga: How to add File::Spec to Rakudo.

Update: It ended up turning into two posts:  One was a simple guide on How to Start Hacking Rakudo Perl 6, and the other covered my follies in trying to add to the compiler for the first time.  But the short story is that IO::Path is now added to Perl 6 and implemented in Rakudo -- this means that both File::Spec and Path::Class' behavior are now available in the core language without adding modules.

How to start hacking on Rakudo Perl 6

Published by Brent Laabs on 2013-05-08T01:39:00

In the course of writing modules, I finally got the urge to start implementing features I wanted in Rakudo itself.  And since there wasn't a real guide on how to set up and patch Rakudo, I decided to share what I had learned in the process.

The nice thing about Perl 6 implementations is that a significant portion of them is written in Perl 6.  (Well, one nice thing anyway.)  This means that if you're comfortable writing Perl 6 modules and classes, you should feel pretty much at home in the source.

This guide assumes so, and that you have a basic familiarity with Github, git, and make -- enough to commit to repositories and build a software package, anyway.

Getting Started

The first thing is to get your own fork of Rakudo to work on.  So go to the Rakudo repository and click the fork button in the upper right.  Relax while Github photocopies a book.  Once that's done, find an appropriate directory to git clone it to on your own machine.

Go ahead and cd into the new rakudo directory.  There are a few setup things that you'll want to do.  First of all, go ahead and build Rakudo, using the normal steps:
    perl ./Configure.pl --gen-parrot
    make
    make install

That will pull a copy of NQP and Parrot, and make sure that everything is working okay to begin with.  Now that that's done, you'll want to add the new perl6 to your system $PATH environment variable.   Which, if you don't know how to do it -- well here's Google.  In particular, you'll need to add the full path to the rakudo/install/bin directory.
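
For anyone who would rather not consult Google, here is a minimal sketch of the $PATH step for a POSIX-style shell.  It assumes the clone lives in $HOME/rakudo; the RAKUDO_DIR and RAKUDO_BIN variable names are made up for this example, so adjust them to wherever you actually cloned.

```shell
# Assumption: the clone lives in $HOME/rakudo; set RAKUDO_DIR if yours is elsewhere.
RAKUDO_DIR="${RAKUDO_DIR:-$HOME/rakudo}"
RAKUDO_BIN="$RAKUDO_DIR/install/bin"

# Prepend the directory only if it is not already on PATH,
# so re-running this does not pile up duplicate entries.
case ":$PATH:" in
  *":$RAKUDO_BIN:"*) ;;
  *) PATH="$RAKUDO_BIN:$PATH" ;;
esac
export PATH
```

Putting these lines in your shell startup file (for example ~/.profile) should make the change stick across sessions; afterwards `command -v perl6` ought to report a path under install/bin.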

There's a couple more things you'll want to do now.  First of all:
    make spectest
You don't have to run the full tests now, but let it download the roast repository into your t/spec before hitting ^C.  You will need these tests later to make sure you didn't break anything.

Next, you'll want to set up a link back to the main Rakudo repository, so you can pull changes from there.  So do:
    git remote add upstream git://github.com/rakudo/rakudo.git

You'll also want the module installer, Panda.  Now, obviously, you shouldn't add anything to Rakudo that depends on an outside module.  But Panda is the one piece of software you really don't want to break, ever.  People will still want to be able to download modules even if functionality changes.  We will have to go through a deprecation cycle if you intentionally change something to cause Panda to start failing its tests.  So to download and install it:
    git clone git://github.com/tadzik/panda.git
    cd panda
    perl6 bootstrap.pl

This will set up Panda's dependencies, and test all of those modules.  The bootstrap script will tell you a path to add to your $PATH environment variable -- add it too, so that panda will run from anywhere.

Finally, you really should set up a new branch to work on, so you can switch back to a working Rakudo if you need to.  Move back into the rakudo directory and run:
    git checkout -b mynewbranchname

A very short overview of the source


Now that all the setup is done, let's take a quick look around.  Most of what we build into Perl 6 lives in the rakudo/src folder, so this is where you'll want to edit the contents.

Let's start hacking!


Now's the time to start changing Rakudo.  Have the appropriate amount of fun!  Be sure to commit functioning changes occasionally, so that you can git bisect for problems later.  And push your edits to Github as a free backup.  If you get stuck, drop by #perl6 on irc.freenode.net and ask questions.

If it's your first time, you have to fi^W^W^W^W you will probably make a lot of mistakes.  I know I did on my first project, as explained in detail in a previous post.   But I promise you, the learning curve is surprisingly easy, and your compiler-fu will increase to fuchsia-belt level in no time.  (What?  We're not just giving black belts away... and Camelia likes fuchsias.)

Testing and Specs


When you think you're finished with your code, the first thing you should do is merge in the upstream rakudo, and rebuild:
    git fetch upstream
    git merge upstream/nom
    perl Configure.pl
    make
    make spectest

The spectests will make sure that you didn't accidentally break the codebase.  You should pass, or at least not fail worse than the current roast data.

You should add your own tests into the roast repository about now.  You do have unit tests, right?   Writing tests is "optional", just like brushing your teeth -- you don't have to do it, but if you never do it you're in for a lot of pain later.  Here's a fine and elegantly crafted hyperlink to S24 (Testing) for reference.

When editing a file that already exists in roast, you may need to fudge the tests for Niecza and Pugs.  This tells us "we know the test failed or fails to parse, nothing has changed".  Just add lines like the following above broken tests:
    #?pugs 1 skip 'reason'
    #?niecza 1 skip 'reason'

The "1" is actually the number of tests you want to skip, but really, look at the README in roast for more details.

If you want to add a whole new test file, you'll need to add it into rakudo/t/spectest.data.  If your code fixes broken tests, then you'll want to *unfudge* by removing the #?rakudo skip lines above the relevant tests.
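
When unfudging, it helps to list the Rakudo fudge lines in a test file first.  The sketch below builds a small made-up test file (its contents and skip reasons are hypothetical, not taken from roast) and uses grep to find the lines that start with #?rakudo.

```shell
# Hypothetical sample standing in for a real file under t/spec.
sample=$(mktemp)
cat > "$sample" <<'EOF'
use Test;
plan 3;
#?rakudo 1 skip 'hypothetical reason'
is 1 + 1, 2, 'addition';
#?niecza 1 skip 'hypothetical reason'
ok True, 'truth';
#?rakudo 1 skip 'another hypothetical reason'
ok 1, 'one';
EOF

# Show each Rakudo fudge line with its line number.
# In a basic regular expression the "?" is an ordinary character.
grep -n '^[[:space:]]*#?rakudo' "$sample"

# Count them, so you know how many tests are still being skipped.
fudge_count=$(grep -c '^[[:space:]]*#?rakudo' "$sample")
echo "rakudo fudge lines: $fudge_count"
rm -f "$sample"
```

Pointing the same grep at a real file in t/spec shows which lines to delete once your patch makes those tests pass.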

You should also test that Panda is still working.  Since you'll have to rebuild panda after recompiling Rakudo anyway, just check the rebootstrap for test failures:
    perl6 panda/rebootstrap.pl

Committing to Rakudo


The easiest way to get your code merged is to push it back to Github, and then send a pull request into Rakudo.  If you're really committed to committing, consider sending in a Contributor License Agreement to The Perl Foundation.  This makes you eligible for a commit bit to push directly to the Rakudo repo.

If there's a problem, someone will get back to you pretty fast on the Github issues page.  Hopefully, these problems will be easy to fix, and a standard git commit; git push will add it to the ticket.  If there aren't any problems, someone will just merge it in a couple days.

Huzzah! \o/  A Rakudo Hacker is you!

A Rakudo Performance

Published by pmichaud on 2012-09-02T23:00:26

At YAPC::NA 2012 in Madison, WI I gave a lightning talk about basic improvements in Rakudo’s performance over the past couple of years.  Earlier today the video of the lightning talks session appeared on YouTube; I’ve clipped out my talk from the session into a separate video below.  Enjoy!

 

Roborama 2012a

Published by pmichaud on 2012-05-28T06:08:52

A couple of weeks ago I entered the Dallas Personal Robotics Group Roborama 2012a competition, and managed to come away with first place in the RoboColumbus event and Line Following event (Senior Level).  For my robot I used one of the LEGO Mindstorms sets that we’ve been acquiring for use by our First Lego League team, along with various 3rd party sensors.

The goal of the RoboColumbus event was to build a robot that could navigate from a starting point to an ending point placed as far apart as possible; robots are scored on distance to the target when the robot stops.  If multiple robots touch the finish marker (i.e., distance zero), then the time needed to complete the course determines the rankings.   This year’s event was in a long hall with the target marked by an orange traffic cone.

HiTechnic IR ball and IRSeeker sensor

Contestants are allowed to make minor modifications to the course to aid navigation, so I equipped my robot with a HiTechnic IRSeeker sensor and put an infrared (IR) electronic ball on top of the traffic cone.  The IRSeeker sensor reports the relative direction to the ball (in multiples of 30 degrees), so the robot simply traveled forward until the sensor picked up the IR signal, then used the IR to home in on the traffic cone.  You can see the results of the winning run in the video below, especially around the 0:33 mark when the robot makes its first significant IR correction:

http://youtu.be/x1GvpYAArfY
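
The homing behavior described above can be sketched as a small decision function.  This is not the program that ran on the robot; it is an illustration that assumes the sensor reports a sector number where 0 means "no IR signal" and 1 through 9 sweep from left to right in 30-degree steps, with 5 straight ahead.  The function name and output words are made up for the example.

```shell
# Assumption: sector 0 = no signal; sectors 1-9 run left to right
# in 30-degree steps with 5 straight ahead. Illustrative numbering only.
steering_for_sector() {
  sector=$1
  if [ "$sector" -eq 0 ]; then
    echo "forward"                 # no signal yet: keep driving straight
    return
  fi
  angle=$(( (sector - 5) * 30 ))   # signed bearing to the ball in degrees
  if [ "$angle" -lt 0 ]; then
    echo "left $(( -angle ))"      # ball is to the left: turn left
  elif [ "$angle" -gt 0 ]; then
    echo "right $angle"            # ball is to the right: turn right
  else
    echo "forward"                 # ball dead ahead: keep going
  fi
}

# Example readings: no signal, ball 60 degrees to the left, ball dead ahead.
for reading in 0 3 5; do
  steering_for_sector "$reading"
done
```

Because the bearing comes in coarse 30-degree steps, corrections like this are necessarily rough, which fits the visible course adjustment in the video.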

My first two runs of RoboColumbus didn’t do nearly as well; the robot kept curving to the right for a variety of reasons, and so it never got a lock on the IR ball.  Some quick program changes at the contest and adjustments to the starting direction finally made for the winning run.

For the Line Following contest, the course consisted of white vinyl tiles with electrical tape in various patterns, including line gaps and sharp angles.  I used a LineLeader sensor from mindsensors.com for basic line following, with some heuristics for handling the gap conditions.  The robot performed fine on my test tiles at home, but had difficulty with the “gap S curve” tiles used at the contest.  However, my robot was the only one that successfully navigated the right angle turns, so I still ended up with first place.  🙂

Matthew and Anthony from our FLL robotics team also won other events in the contest, and there are more videos and photos available.  The contest was a huge amount of fun and I’m already working on new robot designs for the next competition.

Many thanks to DPRG and the contest sponsors for putting on a great competition!