Raku RSS Feeds
Elizabeth Mattijsen (Libera: lizmat #raku) / 2025-01-08T09:24:30

The number 2025 has quite a few numeric properties, as Anton Antonov shows in their already second blog post this year! The first being “Doomsday clock parsing and plotting” (/r/rakulang comments). What a way to start the year!
On the title “Happy 𝚺 (^10)»³”: only two small additions would be needed to make that legal Raku code:
sub Happy ($_) { say "Happy $_!" }
sub prefix:<𝚺>(\x) { x.sum }
Happy 𝚺 (^10)»³
# Happy 2025!
And who knows, maybe 𝚺 should become a prefix operator in the Raku Programming Language, for the mathematically inclined!
If you didn’t have time to read any or all of the 2024 Advent posts about the Raku Programming Language, then you can use this handy list, which also includes references to any comments made! And there’s also an updated version of the Raku 2024 Review.
Weekly Challenge #303 is available for your perusal.
A lot of people are slowly recovering from the Holiday season. Still, 12 authors updated at least one of their modules. Also not bad!
Please keep staying safe and healthy, and keep up the good work!
Meanwhile, still: Слава Україні! Героям слава!
If you like what I’m doing, committing to a small sponsorship would mean a great deal!
This blog post demonstrates many of the mathematical properties of the number 2025 given in the Wolfram notebook “Happy 2025 == 1³+2³+3³+4³+5³+6³+7³+8³+9³ !”, [EPn1], by Ed Pegg Jr. Not all of the properties can easily be demonstrated computationally in Raku; for the full exposition of them see [EPn1].
First, let us demonstrate the most impressive numeric property of 2025:

2025 = 1³ + 2³ + 3³ + 4³ + 5³ + 6³ + 7³ + 8³ + 9³ = (1 + 2 + 3 + 4 + 5 + 6 + 7 + 8 + 9)²
This comes from the following identity, often referred to as Faulhaber’s Formula or Nicomachus’s Theorem:

∑ₖ₌₁ⁿ k³ = (∑ₖ₌₁ⁿ k)²
say "Sum of cubes of 1..9 : ", [+] (1..9)>>³;
say "Sum of 1..9 squared : ", ((1..9).sum)²;
# Sum of cubes of 1..9 : 2025
# Sum of 1..9 squared : 2025
Next, let us demonstrate the simple numeric properties of 2025.

2025 is obtained by squaring the sum of (i) the number formed by its first two digits, 20, and (ii) the number formed by its 3rd and 4th digits, 25:
(20+25) ** 2
# 2025
An odd property of the square root, 45 == 2025.sqrt, is that it is the smallest integer where the periodic part of the reciprocal is 2:
use Rat::Precise;
(1/45).precise(40)
# 0.0222222222222222222222222222222222222222
2025 is the denominator for this sum of squares:
(1/9² + 1/5²).nude
# (106 2025)
2025 is the product of the greatest common divisors of 15 with each of the numbers less than 15:
[*] (1..14).map(15 gcd *)
# 2025
2025 is the product of the proper divisors of 45:
use Math::Sequences::Integer :support;
my @proper = divisors(45).head(*-1)
# [1 3 5 9 15]
[*] |@proper
# 2025
The package “Math::Sequences” has the functions sigma (aka σ) and totient (aka φ, not to be confused with the Golden ratio ϕ) that — in principle — can be used to demonstrate the rare property:

σ₁(φ(2025³)) = φ(σ₁(2025³))

But those package functions are too slow. Instead, we use a Raku-chatbook Wolfram|Alpha cell, [AA1, AA2], to verify that formula:
#% wolfram-alpha
DivisorSigma[1, EulerPhi[2025^3]] == EulerPhi[DivisorSigma[1, 2025^3]]
Digits of 2025 represented in the Phi number system:
use Math::Sequences::Numberphile;
my @res = phi-number-system(2025);
# [15 13 10 5 3 1 -6 -11 -16]
Verification:
@res.map({ ϕ ** $_ }).sum.round(10e-11);
# 2025
Remark: We have to round (to a small power of 10) because of the approximation of the golden ratio used in “Math::Sequences”.
From [EPn1]:
There are also 2025 “good” permutations (A006717), where all rotations have a single number in the correct place.
goodPermutations = Select[Permutations[Range[9]],Union[Table[Count[RotateRight[#,k]-Range[9],0],{k,0,8}]]=={1}&];
goodPermutations//Length
Here is the corresponding Raku code:
my @good-permutations = [1..9].permutations.race(:4degree).grep( -> @p {
my @res = (^9).map( -> $k { (@p.rotate(-$k) <<->> (1..9)).grep(0).elems }).unique.sort;
@res eqv [1]
});
@good-permutations.elems
# 2025
Optimization of the code above suggested by @timo:
my @good-permutations = [1..9].permutations.race(:4degree).grep( -> @p {
my @res = (^9).map( -> $k { (@p.rotate(-$k) >>-<< (1..9)).grep(0).elems }).unique.head(2);
@res.elems == 1 && @res[0] == 1
});
@good-permutations.elems
Remark: This is an embarrassingly parallel computation for which the order of the results does not matter. Sequential-computation-wise, Wolfram Language is ≈12 times faster than Raku’s first “good permutations” finding code, and ≈2.5 times faster than the second. (On a few years old laptop.)
Here are the first of the “good” permutations:
#% html
@good-permutations.head
==> -> @p { (^9).map( -> $k { @p.rotate(-$k) }) }()
==> { $_[(1, 9, 2, 7, 6, 8, 4, 3, 5) <<->> 1]}()
==> { .map((1..9) Z=> *)».Hash.Array }()
==> to-dataset()
==> to-html(field-names => (1..9)».Str)
1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
---|---|---|---|---|---|---|---|---|
1 | 3 | 2 | 7 | 9 | 8 | 4 | 6 | 5 |
3 | 2 | 7 | 9 | 8 | 4 | 6 | 5 | 1 |
5 | 1 | 3 | 2 | 7 | 9 | 8 | 4 | 6 |
7 | 9 | 8 | 4 | 6 | 5 | 1 | 3 | 2 |
9 | 8 | 4 | 6 | 5 | 1 | 3 | 2 | 7 |
2 | 7 | 9 | 8 | 4 | 6 | 5 | 1 | 3 |
4 | 6 | 5 | 1 | 3 | 2 | 7 | 9 | 8 |
6 | 5 | 1 | 3 | 2 | 7 | 9 | 8 | 4 |
8 | 4 | 6 | 5 | 1 | 3 | 2 | 7 | 9 |
Remark: The diagonal of the table holds the digits from 1 to 9. Also, observe the “shifts” between the consecutive rows above.
[AA1] Anton Antonov, “Chatbook New Magic Cells”, (2024), RakuForPrediction at WordPress.
[AA2] Anton Antonov, “WWW::WolframAlpha”, (2024), RakuForPrediction at WordPress.
[EPn1] Ed Pegg, “Happy 2025 =1³+2³+3³+4³+5³+6³+7³+8³+9³!”, Wolfram Community, STAFFPICKS, December 30, 2024.
The Doomsday Clock is a symbolic timepiece maintained by the Bulletin of the Atomic Scientists (BAS) since 1947. It represents how close humanity is perceived to be to global catastrophe, primarily nuclear war but also including climate change and biological threats. The clock’s hands are set annually to reflect the current state of global security; midnight signifies theoretical doomsday.
In this post (notebook) we consider two tasks: (1) parsing Doomsday Clock statements into concrete clock readings, and (2) extracting and visualizing the Doomsday Clock timeline data.
The data extraction and visualization in the post (notebook) serve educational purposes or provide insights into historical trends of global threats as perceived by experts. We try to make the ingestion and processing code universal and robust, suitable for multiple evaluations now or in the (near) future.
Remark: Keep in mind that the Doomsday Clock is a metaphor and its settings are not just data points but reflections of complex global dynamics (by certain experts and a board of sponsors.)
Remark: Currently (2024-12-30) the Doomsday Clock is set at 90 seconds before midnight.
Remark: This post (notebook) is the Raku-version of the Wolfram Language (WL) notebook with the same name, [AAn1]. That is why the “standard” Raku-grammar approach is not used. (Although, in the preliminary versions of this work relevant Raku grammars were generated via both LLMs and Raku packages.)
I was very impressed by the looks and tune-ability of WL’s ClockGauge, so I programmed a similar clock gauge in Raku’s package “JavaScript::D3” (which is based on D3.js).
use LLM::Functions;
use LLM::Prompts;
use LLM::Configurations;
use Text::SubParsers;
use Data::Translators;
use Data::TypeSystem;
use Data::Importers;
use Data::Reshapers;
use Hash::Merge;
use FunctionalParsers :ALL;
use FunctionalParsers::EBNF;
use Math::DistanceFunctions::Edit;
use Lingua::NumericWordForms;
my $background = 'none';
my $stroke-color = 'Ivory';
my $fill-color = 'none';
my $format = 'html';
my $titleTextStyle = { color => 'Ivory' };
my $backgroundColor = '#1F1F1F';
my $legendTextStyle = { color => 'Silver' };
my $legend = { position => "none", textStyle => {fontSize => 14, color => 'Silver'} };
my $hAxis = { title => 'x', titleTextStyle => { color => 'Silver' }, textStyle => { color => 'Gray'}, logScale => False, format => 'scientific'};
my $vAxis = { title => 'y', titleTextStyle => { color => 'Silver' }, textStyle => { color => 'Gray'}, logScale => False, format => 'scientific'};
my $annotations = {textStyle => {color => 'Silver', fontSize => 10}};
my $chartArea = {left => 50, right => 50, top => 50, bottom => 50, width => '90%', height => '90%'};
my $background = '1F1F1F';
my sub parsing-test-table(&parser, @phrases) {
my @field-names = ['statement', 'parser output'];
my @res = @phrases.map({ @field-names Z=> [$_, &parser($_.words).raku] })».Hash.Array;
to-html(@res, :@field-names)
}
Here we ingest the Doomsday Clock timeline page and show corresponding statistics:
my $url = "https://thebulletin.org/doomsday-clock/past-announcements/";
my $txtEN = data-import($url, "plaintext");
text-stats($txtEN)
# (chars => 73722 words => 11573 lines => 756)
By observing the (plain) text of that page we see that the Doomsday Clock time setting can be extracted from the sentence(s) that begin with the following phrase:
my $start-phrase = 'Bulletin of the Atomic Scientists';
my $sentence = $txtEN.lines.first({ / ^ $start-phrase /})
# Bulletin of the Atomic Scientists, with a clock reading 90 seconds to midnight
Here is a grammar in Extended Backus-Naur Form (EBNF) for parsing Doomsday Clock statements:
my $ebnf = q:to/END/;
<TOP> = <clock-reading> ;
<clock-reading> = <opening> , ( <minutes> | [ <minutes> , [ 'and' | ',' ] ] , <seconds> ) , 'to' , 'midnight' ;
<opening> = [ { <any> } ] , 'clock' , [ 'is' ] , 'reading' ;
<any> = '_String' ;
<minutes> = <integer> <& ( 'minute' | 'minutes' ) ;
<seconds> = <integer> <& ( 'second' | 'seconds' ) ;
<integer> = '_Integer' <@ &{ $_.Int } ;
END
text-stats($ebnf)
# (chars => 364 words => 76 lines => 6)
Remark: The EBNF grammar above can be obtained with LLMs using a suitable prompt with example sentences. (We do not discuss that approach further in this notebook.)
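For instance, a hypothetical sketch of such an LLM derivation (the prompt wording, and of course the generated grammar, will vary) could look like this:

# Hypothetical sketch: ask an LLM for an EBNF grammar (not used further in this notebook)
my $ebnf-llm = llm-synthesize([
    'Write an EBNF grammar that parses Doomsday Clock statements such as:',
    '"clock is reading 90 seconds to midnight"',
    '"clock reading 2 minutes and 30 seconds to midnight"',
    llm-prompt('NothingElse')('EBNF')
]);
say $ebnf-llm;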
Here the parsing functions are generated from the EBNF string above:
my @defs = fp-ebnf-parse($ebnf, <CODE>, name => 'Doomed2', actions => 'Raku::Code').head.tail;
.say for @defs.reverse
# my &pINTEGER = apply(&{ $_.Int }, symbol('_Integer'));
# my &pSECONDS = sequence-pick-left(&pINTEGER, (alternatives(symbol('second'), symbol('seconds'))));
# my &pMINUTES = sequence-pick-left(&pINTEGER, (alternatives(symbol('minute'), symbol('minutes'))));
# my &pANY = symbol('_String');
# my &pOPENING = sequence(option(many(&pANY)), sequence(symbol('clock'), sequence(option(symbol('is')), symbol('reading'))));
# my &pCLOCK-READING = sequence(&pOPENING, sequence((alternatives(&pMINUTES, sequence(option(sequence(&pMINUTES, option(alternatives(symbol('and'), symbol(','))))), &pSECONDS))), sequence(symbol('to'), symbol('midnight'))));
# my &pTOP = &pCLOCK-READING;
Remark: The function fp-ebnf-parse has a variety of actions for generating code from EBNF strings. For example, with actions => 'Raku::Class' the generation above would produce a class, which might be more convenient for further development (via inheritance or direct changes).

Here the imperative code above — assigned to @defs — is re-written using the infix form of the parser combinators:
my &pINTEGER = satisfy({ $_ ~~ /\d+/ }) «o {.Int};
my &pMINUTES = &pINTEGER «& (symbol('minute') «|» symbol('minutes')) «o { [minute => $_,] };
my &pSECONDS = &pINTEGER «& (symbol('second') «|» symbol('seconds')) «o { [second => $_,] };
my &pANY = satisfy({ $_ ~~ /\w+/ });
my &pOPENING = option(many(&pANY)) «&» symbol('clock') «&» option(symbol('is')) «&» symbol('reading');
my &pCLOCK-READING = &pOPENING «&» (&pMINUTES «|» option(&pMINUTES «&» option(symbol('and') «|» symbol(','))) «&» &pSECONDS) «&» symbol('to') «&» symbol('midnight');
my &pTOP = &pCLOCK-READING;
We must redefine the parser pANY (corresponding to the EBNF rule “<any>”) in order to prevent pANY from gobbling the word “clock” and thereby making the parser pOPENING fail.
&pANY = satisfy({ $_ ne 'clock' && $_ ~~ /\w+/});
Here are random sentences generated with the grammar:
.say for fp-random-sentence($ebnf, 12).sort;
# clock reading 681 minutes to midnight
# clock reading 788 minutes to midnight
# clock is reading 584 seconds to midnight
# clock is reading 721 second to midnight
# clock is reading 229 minute and 631 second to midnight
# clock is reading 458 minutes to midnight
# clock is reading 727 minute to midnight
# F3V; clock is reading 431 minute to midnight
# FXK<GQ 3RJJJ clock is reading 369 seconds to midnight
# NRP FNSEE K0EQO OPE clock is reading 101 minute to midnight
# QJDV; R<K7S; JMQ>HD AA31 clock is reading 369 minute 871 second to midnight
# QKQGK FZJ@BB M8C1BD BPI;C: clock reading 45 minute 925 second to midnight
Verifications of the (sub-)parsers:
"90 seconds".words.&pSECONDS
# ((() [second => 90]))
"That doomsday clock is reading".words.&pOPENING
# ((() (((((That doomsday)) clock) (is)) reading)))
Here the “top” parser is applied:
my $str = "the doomsday clock is reading 90 seconds to midnight";
$str.words.&pTOP
# ((() ((((((((the doomsday)) clock) (is)) reading) (() [second => 90])) to) midnight)))
Here the sentence extracted above is parsed and interpreted into pairs with the keys “minute” and “second”:
$sentence.words.&pTOP.tail.flat.grep(* ~~ Pair)
# (second => 90)
Let us redefine pCLOCK-READING to return a minutes-and-seconds dictionary, and pTOP to return a corresponding date-time:
&pCLOCK-READING = &pCLOCK-READING «o { $_.flat.grep(* ~~ Pair).Hash };
&pTOP = &pCLOCK-READING «o {
Date.today.DateTime.earlier(seconds => ($_<minute> // 0) * 60 + ($_<second>// 0) )
};
Here we assign and show the results of those two parsers:
my $doom-reading = $sentence.words.&pCLOCK-READING.head.tail;
my $doom-time = $sentence.words.&pTOP.head.tail;
.say for (:$doom-reading, :$doom-time)
# doom-reading => {second => 90}
# doom-time => 2024-12-31T23:58:30Z
Using the interpretation derived above, plot the corresponding clock with js-d3-clock-gauge:
#% js
js-d3-clock-gauge($doom-time)
Let us define a map with clock-gauge plot options.
my @scale-ranges = (0, 0.01 ... 0.66).rotor(2=>-1).map({ ([0, 60], $_) });
my @scale-ranges2 = (0, 0.01 ... 0.82).rotor(2=>-1).map({ ([0, 60], $_) });
my %opts =
background => 'none',
stroke-color => 'Black', stroke-width => 0,
title-color => 'Ivory', title-font-family => 'Helvetica',
hour-hand-color => 'Orange', second-hand-color => 'Red',
color-scheme => 'Magma',
fill-color => 'AntiqueWhite',
:@scale-ranges,
color-scheme-interpolation-range => (0.11, 0.95),
margins => {top => 60, left => 20, right => 20, bottom => 60},
height => 420,
gauge-labels => {Doomsday => [0.5, 0.35], 'clock' => [0.5 ,0.28]},
gauge-labels-color => 'DarkSlateGray',
gauge-labels-font-family => 'Krungthep',
;
%opts.elems
# 16
Here are different “doomsday clock” examples:
#% js
[
{color-scheme => 'Plasma', fill-color => 'MistyRose', gauge-labels-color => 'Orchid'},
{color-scheme => 'Spectral', fill-color => '#4e65ac', stroke-color => 'DarkRed', stroke-width => 10, gauge-labels => %()},
{color-scheme => 'Cividis', fill-color => 'DarkSlateGray', gauge-labels => {Doomsday => [0.5, 0.6], 'clock' => [0.5 ,0.36]}, scale-ranges => @scale-ranges2},
].map({ js-d3-clock-gauge(:23hour, :58minute, :30second, |merge-hash(%opts.clone, $_, :!deep)) }).join("\n")
More robust parsing of Doomsday Clock statements can be obtained in these three ways: (1) handling misspellings with fuzzy word matching, (2) parsing integer names (numeric word forms) in addition to integers, and (3) falling back to an LLM.
The parser satisfy can be used to handle misspellings (via, say, edit-distance from “Math::DistanceFunctions”):
#% html
my &pDD = satisfy({ edit-distance($_, "doomsday") ≤ 2 }) «o {"doomsday"};
my @phrases = "doomsdat", "doomsday", "dumzday";
parsing-test-table(&pDD, @phrases)
statement | parser output |
---|---|
doomsdat | (((), “doomsday”),).Seq |
doomsday | (((), “doomsday”),).Seq |
dumzday | ().Seq |
But since “FunctionalParsers” provides the generic parser fuzzy-symbol (which takes a word and a distance as arguments), we use that parser below.
#% html
my &pDD2 = fuzzy-symbol("doomsday", 2);
my @phrases = "doomsdat", "doomsday", "dumzday";
parsing-test-table(&pDD2, @phrases)
statement | parser output |
---|---|
doomsdat | (((), “doomsday”),) |
doomsday | (((), “doomsday”),) |
dumzday | () |
In order to include the misspelling handling in the grammar we manually rewrite the grammar. (The grammar is small, so it is not that hard to do.)
my &pINTEGER = satisfy({ $_ ~~ /\d+/ }) «o {.Int};
my &pMINUTES = &pINTEGER «& (fuzzy-symbol('minute', 2) «|» fuzzy-symbol('minutes', 2)) «o { [minute => $_,] };
my &pSECONDS = &pINTEGER «& (fuzzy-symbol('second', 2) «|» fuzzy-symbol('seconds', 2)) «o { [second => $_,] };
my &pANY = satisfy({ edit-distance($_, 'clock') > 2 && $_ ~~ /\w+/ });
my &pOPENING = option(many(&pANY)) «&» fuzzy-symbol('clock', 1) «&» option(symbol('is')) «&» fuzzy-symbol('reading', 2);
my &pCLOCK-READING = &pOPENING «&» (&pMINUTES «|» option(&pMINUTES «&» option(symbol('and') «|» symbol(','))) «&» &pSECONDS) «&» symbol('to') «&» fuzzy-symbol('midnight', 2);
&pCLOCK-READING = &pCLOCK-READING «o { $_.flat.grep(* ~~ Pair).Hash };
&pTOP = &pCLOCK-READING «o {
Date.today.DateTime.earlier(seconds => ($_<minute> // 0) * 60 + ($_<second>// 0) )
};
Here is a verification table with correct- and incorrect spellings:
#% html
my @phrases =
"doomsday clock is reading 2 seconds to midnight",
"dooms day cloc is readding 2 minute and 22 sekonds to mildnight";
parsing-test-table(shortest(&pCLOCK-READING), @phrases)
statement | parser output |
---|---|
doomsday clock is reading 2 seconds to midnight | (((), {:second(2)}),) |
dooms day cloc is readding 2 minute and 22 sekonds to mildnight | (((), {:minute(2), :second(22)}),) |
One way to make the parsing more robust is to implement the ability to parse integer names (or numeric word forms) not just integers.
Remark: For a fuller discussion — and code — of numeric word forms parsing see the tech note “Integer names parsing” of the paclet “FunctionalParsers”, [AAp1].
First, we make an association that connects integer names with corresponding integer values:
my %worded-values = (^100).map({ to-numeric-word-form($_) => $_ });
%worded-values.elems
# 100
Remark: The function to-numeric-word-form is provided by “Lingua::NumericWordForms”, [AAp3].
Here is what some of the rules look like:
%worded-values.pick(6)
# (ninety four => 94 forty three => 43 ninety eight => 98 seventy three => 73 ninety two => 92 eleven => 11)
Here we program the integer names parser:
my &pUpTo10 = alternatives( |(^10)».&to-numeric-word-form.map({ symbol($_.trim) }) );
my &p10s = alternatives( |(10, 20 ... 90)».&to-numeric-word-form.map({ symbol($_.trim) }) );
my &pWordedInteger = (&p10s «&» &pUpTo10 «|» &p10s «|» &pUpTo10) «o { %worded-values{$_.flat.join(' ')} };
Here is a verification table of that parser:
#% html
my @phrases = "three", "fifty seven", "thirti one";
parsing-test-table(&pWordedInteger, @phrases)
statement | parser output |
---|---|
three | (((), 3),).Seq |
fifty seven | (((), 57), ((“seven”,), 50)).Seq |
thirti one | ().Seq |
There are two parsing results for “fifty seven”, because &pWordedInteger is defined with:

&p10s «&» &pUpTo10 «|» &p10s «|» ...

This can be remedied by using just or shortest:
#% html
parsing-test-table( just(&pWordedInteger), @phrases)
statement | parser output |
---|---|
three | (((), 3),).Seq |
fifty seven | (((), 57),).Seq |
thirti one | ().Seq |
Let us change &pINTEGER to parse both integers and integer names:
#% html
&pINTEGER = &satisfy({ $_ ~~ /\d+/ }) «o {.Int} «|» &pWordedInteger;
my @phrases = "12", "3", "three", "forty five";
parsing-test-table( just(&pINTEGER), @phrases)
statement | parser output |
---|---|
12 | ($((), 12),).Seq |
3 | ($((), 3),).Seq |
three | ($((), 3),).Seq |
forty five | ($((), 45),).Seq |
Remark: &pINTEGER has to be evaluated before the definitions of the rest of the parsers programmed in the previous subsection.
Let us try the new parser using integer names for the clock time:
my $str = "the doomsday clock is reading two minutes and forty five seconds to midnight";
$str.words
==> take-first(&pCLOCK-READING)()
# ((() {minute => 2, second => 45}))
There are multiple ways to employ LLMs for extracting “clock readings” from arbitrary Doomsday Clock statements, readouts, and measures. Here we use an LLM few-shot example function:
my &flop = llm-example-function([
"the doomsday clock is reading two minutes and forty five seconds to midnight" => '{"minute":2, "second": 45}',
"the clock of the doomsday gives 92 seconds to midnight" => '{"minute":0, "second": 92}',
"The bulletin atomic scientist maybe is set to a minute an 3 seconds." => '{"minute":1, "second": 3}'
],
e => $conf4o,
form => sub-parser('JSON')
)
Here is an example invocation:
&flop("Maybe the doomsday watch is at 23:58:03")
# {minute => 1, second => 57}
The following function combines the grammar-based parsing with the LLM example function — the latter is used as a fallback:
my sub get-clock-reading(Str:D $st) {
my $op = just(&pCLOCK-READING)($st.words);
my %h = $op.elems > 0 && $op.head.head.elems == 0 ?? $op.head.tail !! &flop( $st );
return Date.today.DateTime.earlier(seconds => (%h<minute> // 0) * 60 + (%h<second> // 0) )
}
# &get-clock-reading
Here is the application of the combined function above to a certain “random” Doomsday Clock statement:
my $s = "You know, sort of, that dooms-day watch is 1 and half minute be... before the big boom. (Of doom...)";
$s.&get-clock-reading
# 2024-12-31T23:58:30Z
Remark: The same type of robust grammar-and-LLM combination is explained in more detail in the video “Robust LLM pipelines (Mathematica, Python, Raku)”, [AAv1]. (See, also, the corresponding notebook [AAn1].)
In this section we extract Doomsday Clock timeline data and make a corresponding plot.
Instead of using the official Doomsday clock timeline page we use Wikipedia.
We can extract the Doomsday Clock timeline using LLMs. Here we get the plaintext of the Wikipedia page and show statistics:
my $url = "https://en.wikipedia.org/wiki/Doomsday_Clock";
my $txtWk = data-import($url, "plaintext");
text-stats($txtWk)
# (chars => 42728 words => 6231 lines => 853)
Here we get the Doomsday Clock timeline table from that page in JSON format using an LLM (or ingest a previous extraction saved as a CSV file):
my $res;
if False {
$res = llm-synthesize([
"Give the time table of the doomsday clock as a time series that is a JSON array.",
"Each element of the array is a dictionary with keys 'Year', 'MinutesToMidnight', 'Time', 'Summary', 'Description'.",
"Do not shorten or summarize the descriptions -- use their full texts.",
"The column 'Summary' should have summaries of the descriptions, each summary no more than 10 words.",
$txtWk,
llm-prompt("NothingElse")("JSON")
],
e => $conf4o,
form => sub-parser('JSON'):drop
);
} else {
my @field-names = <Year MinutesToMidnight Time Summary Description>;
my $url = 'https://raw.githubusercontent.com/antononcube/RakuForPrediction-blog/refs/heads/main/Data/doomsday-clock-timeline-table.csv';
$res = data-import($url, headers => 'auto');
$res = $res.map({ my %h = $_.clone; %h<Year> = %h<Year>.Int; %h<MinutesToMidnight> = %h<MinutesToMidnight>.Num; %h }).Array
}
deduce-type($res)
# Vector(Struct([Description, MinutesToMidnight, Summary, Time, Year], [Str, Num, Str, Str, Int]), 26)
Here the LLM result is tabulated:
#% html
my @field-names = <Year MinutesToMidnight Time Summary Description>;
$res ==> to-html(:@field-names, align => 'left')
Remark: The LLM derived summaries in the table above are based on the descriptions in the column “Reason” in the Wikipedia data table.
The tooltips of the plot below use the summaries.
In order to have an informative Doomsday Clock evolution plot, we obtain and partition the dataset’s time series into step-function pairs:
my @dsDoomsdayTimes = |$res;
my @ts0 = @dsDoomsdayTimes.map({ <Year MinutesToMidnight role:tooltip> Z=> $_<Year MinutesToMidnight Summary> })».Hash;
my @ts1 = @dsDoomsdayTimes.rotor(2=>-1).map({[
%( <Year MinutesToMidnight mark role:tooltip> Z=> $_.head<Year MinutesToMidnight MinutesToMidnight Summary>),
%( <Year MinutesToMidnight mark role:tooltip> Z=> [$_.tail<Year>, $_.head<MinutesToMidnight>, NaN, ''])
]}).map(*.Slip);
@ts1 = @ts1.push( merge-hash(@ts0.tail, {mark => @ts0.tail<MinutesToMidnight>}) );
deduce-type(@ts1):tally
# Vector(Struct([MinutesToMidnight, Year, mark, role:tooltip], [Num, Int, Num, Str]), 51)
Here callout annotations are added, indicating the year and the minutes before midnight:
my @ts2 = @ts1.map({
my %h = $_.clone;
my $s = ($_<MinutesToMidnight> * 60) mod 60;
$s = $s > 0 ?? " {$s}s" !! '';
if %h<mark> === NaN {
%h<role:annotation> = '';
} else {
%h<role:annotation> = "{%h<Year>}: {floor($_<MinutesToMidnight>)}m" ~ $s;
}
%h
});
deduce-type(@ts2):tally
# Vector(Struct([MinutesToMidnight, Year, mark, role:annotation, role:tooltip], [Num, Int, Num, Str, Str]), 51)
Finally, here is the plot:
#% html
js-google-charts('ComboChart',
@ts2,
column-names => <Year MinutesToMidnight mark role:annotation role:tooltip>,
width => 1200,
height => 500,
title => "Doomsday clock: minutes to midnight, {@dsDoomsdayTimes.map(*<Year>).Array.&{ (.min, .max).join('-') }}",
series => {
0 => {type => 'line', lineWidth => 4, color => 'DarkOrange'},
1 => {type => 'scatter', pointSize => 10, opacity => 0.1, color => 'Blue'},
},
hAxis => { title => 'Year', format => '####', titleTextStyle => { color => 'Silver' }, textStyle => { color => 'Gray'},
viewWindow => { min => 1945, max => 2026}
},
vAxes => {
0 => { title => 'Minutes to Midnight', titleTextStyle => { color => 'Silver' }, textStyle => { color => 'Gray'} },
1 => { titleTextStyle => { color => 'Silver' }, textStyle => { color => 'Gray'}, ticks => (^18).map({ [ v => $_, f => ($_ ?? "23::{60-$_}" !! '00:00' ) ] }).Array }
},
:$annotations,
:$titleTextStyle,
:$backgroundColor,
:$legend,
:$chartArea,
:$format,
div-id => 'DoomsdayClock',
:!png-button
)
Remark: The plot should be piecewise constant — simple linear interpolation between the blue points would suggest gradual change of the clock times.
Remark: By hovering with the mouse over the blue points the corresponding descriptions can be seen. We considered using clock-gauges as tooltips, but showing clock-settings reasons is more informative.
Remark: The plot was intentionally made to resemble the timeline plot in Doomsday Clock’s Wikipedia page.
Remark: The plot has deficiencies:
I gave up smoothing out those deficiencies after attempting to fix or address each of them a few times. (It is not that important to figure out the Google Charts interface settings for this kind of plot.)
As expected, parsing, plotting, or otherwise processing the Doomsday Clock settings and statements are excellent didactic subjects for textual analysis (or parsing) and temporal data visualization. The visualization could serve educational purposes or provide insights into historical trends of global threats as perceived by experts. (Remember, the clock’s settings are not just data points but reflections of complex global dynamics.)
One possible application of the code in this notebook is to make a “web service” that returns clock images with Doomsday Clock readings.
[AA1] Anton Antonov, “Geographic Data in Raku Demo”, (2024), RakuForPrediction at WordPress.
[AAn1] Anton Antonov, “Doomsday clock parsing and plotting”, (2024), Wolfram Community.
[AAn2] Anton Antonov, “Making robust LLM computational pipelines from software engineering perspective”, (2024), Wolfram Community.
[AAp1] Anton Antonov, FunctionalParsers Raku package, (2023-2024), GitHub/antononcube.
[AAp2] Anton Antonov, “FunctionalParsers”, (2023), Wolfram Language Paclet Repository.
[AAp3] Anton Antonov, Lingua::NumericWordForms Raku package, (2021-2024), GitHub/antononcube.
[AAv1] Anton Antonov, “Robust LLM pipelines (Mathematica, Python, Raku)”, (2024), YouTube/@AAA4prediction.
This is an updated version of the Raku Advent Post.
Welcome to 2025!
How time flies. Yet another year has flown by. 2024 was a year of changes, continuations and preparations. Let's start with the changes:
Edument Central Europe, the branch of Edument that is based in Prague (and led by Jonathan Worthington), decided to stop (commercial) development of its Raku related products: Comma (the IDE for the Raku Programming Language) and Cro (a set of libraries for building reactive distributed systems).
The announcement:
With the discrepancy between revenue and development cost to continue being so large, and the prevailing economic environment forcing us to focus on business activities that at least pay for themselves, we’ve made the sad decision to discontinue development of the Comma IDE.
Fortunately, Jonathan Worthington was not only able to release the final commercial version as a free download, but was also able to release all of the sources of Comma. This would allow other people to continue Comma development.
With John Haltiwanger being the now de-facto project leader, this has resulted in a beta of the open source version of a Raku plugin for IntelliJ IDEA, as described in this blog post.
The announcement:
When Edument employees were by far the dominant Cro contributors, it made sense for us to carry the overall project leadership. However, the current situation is that members of the Raku community contribute more. We don’t see this balance changing in the near future.
With that in mind, we entered into discussions with the Raku Steering Council, in order that we can smoothly transfer control of Cro and its related projects to the Raku community. In the coming weeks, we will transfer the GitHub organization and release permissions to steering council representatives, and will work with the Raku community infrastructure team with regards to the project website.
As the source code of Cro had always been open source, this was more a question of handing over responsibilities. Fortunately the Raku Community reacted: Patrick Böker has taken care of making Cro a true open source project related to Raku, and the associated web site is now being hosted on the Raku infrastructure. With many kudos to the Raku Infra Team!
Sadly, Jonathan Worthington also indicated that they would only remain minimally involved in the further development of MoarVM, NQP and Rakudo in the foreseeable future. As such, (almost) all of their modules were moved to the Raku Community Modules Adoption Center, where they were updated and re-released.
It's hard to overstate the importance of Jonathan Worthington's work in the development and implementation of the Raku Programming Language. So on behalf of the current, past and future Raku Community members: Thank You!
At the beginning of October, Elizabeth Mattijsen decided to take on the large number of open Rakudo issues at that time: 1300+. This resulted in the closing of more than 500 issues: some just needed closing, some needed tests, and some could be fixed pretty easily.
They reported on that work in the Raku Fall Issue Cleanup blog post. Of the about 800 open issues remaining, almost 300 were marked as "fixed in RakuAST", and about 100 were marked as "Will be addressed in RakuAST". Which still leaves about 400 open because of other reasons, so there's still plenty of work to be done here.
The original Raku ecosystem ("p6c") is in the process of being completely removed. Since March 2023, the ecosystem was no longer being refreshed by zef. But it was still being refreshed by the Raku Ecosystem Archiver. But this stopped in August 2024, meaning that any updates of modules in that ecosystem would go unnoticed from then on.
At that time, every author who still had at least one module in the "p6c" ecosystem was given notice by creating an issue in the repository of the first of their modules in the META.list. Luckily many authors responded, either by indicating that they would migrate their module(s) to the "zef" ecosystem, or that they were no longer interested in maintaining.
Since then, most of the modules of the authors that responded, have been migrated. And work has started on the modules of the authors that did not respond. With the following results: at the beginning of 2024, there were still 658 modules in the "p6c" ecosystem (now 412, 37% less), by 230 different authors (now 132, 42% less).
In 2024, 579 Raku modules have been updated (or first released): up from 332 in 2023 (an increase of 74%). There are now 2309 different modules installable by zef by just mentioning their name. And there are now 12239 different versions of Raku modules available from the Raku Ecosystem Archive, up from 10754 in 2023, which means more than 4 module updates on average per day in 2024.
Rakudo saw about 2000 commits (MoarVM, NQP, Rakudo, doc) this year, which is about the same as in 2023. About one third of these commits were in the development of RakuAST (down from 75% in 2023).
A lot of work was done under the hood of the various subsystems of Rakudo. For instance, the dispatcher logic was simplified by introducing several nqp:: shortcuts, which made the dispatcher code a lot more readable and maintainable.
The Meta-classes of NQP and Raku also received a bit of a makeover, as most of them hadn't been touched since the 2015 release: this resulted in better documentation, and some minor performance improvements. Support for TWEAK methods and a rudimentary dd functionality were also added to NQP.
The JVM backend also got some TLC in 2024: one under-the-hood change (by Daniel Green) made execution of Raku code on the JVM backend twice as fast!
Timo Paulssen made the interface with low-level debuggers such as gdb and lldb a lot less cumbersome on MoarVM, which makes adding / fixing MoarVM features a lot easier!
On the MoarVM backend the expression JIT (active on Intel hardware) was disabled by default: it was found to be too unreliable and did not provide any execution speed gains. This change made Rakudo on Intel hardware up to 5% faster overall.
Also on the MoarVM backend, Daniel Green completed the work on optimizing short strings started by Timo Paulssen and Bart Wiegmans, resulting in about a 2% speed improvement of the compilation of Raku code.
Work on the Remote Debugger (which so far had only really been used as part of Comma) has resumed, now with a much better command line interface. Whether remote debugging is active can now be checked in the language itself with the new VM.remote-debugging method.
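A minimal check of that new method (the output shown assumes a normal, non-debugged run):

# True only when the process was started with the MoarVM remote debugger active
say VM.remote-debugging;   # False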
Some race conditions were fixed: a particularly nasty one on the lazy deserialization of bytecode that was very hard to reproduce, as well as some infiniloops.
A lot of work was done on making the Continuous Integration testing produce fewer (and recently hardly any) false positives. This makes life for core developers a lot easier!
Two new Routine traits were added to the Raku Programming Language in 2024.
The is item trait can be used on @ and % sigilled parameters to indicate that a Positional (in the @ case) or an Associative (in the % case) is only acceptable in dispatch if it is presented as an item. It only serves as a tie-breaker, so there should always also be a dispatch candidate that would accept the argument when it is not itemized. Perhaps an example makes this more clear:
multi sub foo(@a) { say "array" }
multi sub foo(@a is item) { say "item" }
foo [1,2,3]; # array
foo $[1,2,3]; # item
The is revision-gated trait fulfils a significant part of the promise of the Raku Programming Language to be a 100-year programming language. It allows a developer to add / keep behaviour of a dispatch to a subroutine or method depending on the language level from which it is being called.

As with is item, this is implemented as a tie-breaker to be checked only if there are multiple candidates in dispatch that match a given set of arguments.

This will allow core and module developers to provide forward compatibility, as well as backward compatibility in their code (as long as the core supports a given language level, of course).

In its current implementation, the trait must be specified on the proto to allow it to work (this may change in the future), and it should specify the lowest language level it should support. An example of a module "FOO" that exports a "foo" subroutine:
unit module FOO;

proto sub foo(|) is revision-gated("v6.c") is export {*}

multi sub foo() is revision-gated("6.c") {
    say "6.c";
}
multi sub foo() is revision-gated("6.d") {
    say "6.d"
}
Then we have a program that uses the "FOO" module and calls the "foo" subroutine. This shows "6.d" because the current default language level is "6.d".
use FOO;
foo(); # 6.d
However, if this program would like to use language level 6.c semantics, it can indicate so by adding a use v6.c at the start of the program, and get a different result in an otherwise identical program:
use v6.c;
use FOO;
foo(); # 6.c
John Haltiwanger has written an extensive blog post about the background, implementation and usage of the is revision-gated trait, if you'd like to know more about it.
These are some of the more notable changes in language level 6.d: all of them add functionality, so are completely backward compatible.
The .flat method optionally takes a :hammer named argument, which will deeply flatten any data structure given:
my @a = 1, [2, [3,4]];
say @a.flat; # (1 [2 [3 4]])
say @a.flat(:hammer); # (1 2 3 4)
One can now also use HyperWhatever (aka **) in a postcircumfix [ ] for the same semantics:
my @a = 1, [2, [3,4]];
say @a[*]; # (1 [2 [3 4]])
say @a[**]; # (1 2 3 4)
The .min / .max / .minmax methods now also accept the :by named argument, to make them consistent with the sub versions. This should prevent unexpected breakage when refactoring code from a sub form to a method form (as the :by would previously be silently ignored in the method form).
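A minimal sketch of the now-consistent behaviour (the comparator and words are just illustrative):

my @words = <apple fig banana>;
say max(@words, :by(*.chars));   # banana (sub form, as before)
say @words.max(:by(*.chars));    # banana (method form now honours :by)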
The .are method now also accepts a type argument. If called in such a manner, it will return True if all members of the invocant match the given type, and False if not. Apart from allowing more readable code, it also allows shortcutting if any of the members of the invocant did not match the given type. An example:
unless @args.are(Pair) {
die "All arguments should be Pairs";
}
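And a small sketch of the return values, including the pre-existing behaviour without a type argument:

say (1, 2, 3).are(Int);       # True
say (1, 'two', 3).are(Int);   # False
say (1, 2, 3).are;            # (Int), the pre-existing behaviour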
The most notable additions to the future language level of the Raku Programming Language:
The .nomark method on Cool objects returns a string with the base characters of any composed characters, effectively removing any accents and such:
use v6.e.PREVIEW;
say "élève".nomark; # eleve
The .contains / .starts-with / .ends-with / .index / .rindex / .substr-eq methods now all accept a :smartcase named argument: a conditional :ignorecase. If specified with a True value, it will look at the needle to see if it is all lowercase. If it is, then :ignorecase will be assumed. If there are any uppercase characters in the needle, then normal semantics will be assumed:
use v6.e.PREVIEW;
say "Frobnicate".contains("frob"); # False
say "Frobnicate".contains("frob", :smartcase); # True
say "Frobnicate".contains("FROB", :smartcase); # False
The .stem method on IO::Path objects returns the .basename of the object without any extensions, or with the given number of extensions removed:
use v6.e.PREVIEW;
say "foo/bar/baz.tar.gz".IO.basename; # baz.tar.gz
say "foo/bar/baz.tar.gz".IO.stem; # baz
say "foo/bar/baz.tar.gz".IO.stem(1); # baz.tar
About one third of this year's work was done on the RakuAST (Raku Abstract Syntax Tree) project. It basically consists of 3 sub-projects, that are heavily intertwined:
Development of RakuAST classes that can be used to represent all aspects of Raku code in an object-oriented fashion.
Development of a grammar and an actions class to parse Raku source code and turn that into a tree of instantiated RakuAST objects.
Development of new features / bug fixing in the Raku Programming Language and everything else that has become a lot easier with RakuAST.
There is little more to say about the development of RakuAST classes other than that there were 440 of them at the start of the year, and 454 of them at the end of the year. As the development of these classes is still very much in flux, they are not documented yet (other than in the test-files in the /t/12rakuast directory).
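For a taste of what working with these classes looks like, here is a minimal sketch (not from the review itself; it assumes a recent Rakudo and the experimental RakuAST interface) that builds and runs a single statement:

use experimental :rakuast;

# Build the AST for:  say "Hello, RakuAST!"
my $ast = RakuAST::Call::Name.new(
    name => RakuAST::Name.from-identifier('say'),
    args => RakuAST::ArgList.new(
        RakuAST::StrLiteral.new('Hello, RakuAST!')
    )
);
say $ast.DEPARSE;   # say("Hello, RakuAST!")
$ast.EVAL;          # Hello, RakuAST!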
On the other hand, the RakuAST::Doc classes are documented, because they have a more or less stable API to allow for the development of RakuDoc Version 2.
The work on the Raku Grammar and Actions has been mostly about implementing already existing features. This is measured by the number of Rakudo (make test) and roast (make spectest) test-files that completely pass with the new Raku Grammar and Actions. And these are the changes:

make test: 110/151 (72.8%) → 140/156 (89.7%)
make spectest: 980/1356 (72.3%) → 1155/1359 (85.0%)
A lot of work was done by Stefan Seifert, picking up the compile-time handling refactor that Jonathan Worthington had started in 2023, but was unable to bring to fruition. TPRF continued the funding for this work.
By the way, DuckDuckGo donated US$ 25000 to the foundation to allow this type of development funding to go on! Hint, hint!
Like last year, there are still quite a few features left to implement. Although it must be said that many tests are hinging on the implementation of a single feature, and often cause an "avalanche" of additional test-files passing when it gets implemented.
If you'd like to try out the new Raku Grammar and Actions, you should set the RAKUDO_RAKUAST environment variable to 1. The .legacy method on the Raku class will tell you whether the legacy (older) grammar is being used or not:
$ raku -e 'say Raku.legacy'
True
$ RAKUDO_RAKUAST=1 raku -e 'say Raku.legacy'
False
A cousin language was created: Draig, allowing you to write Raku code in Welsh!
The RakuAST::Deparse::Highlight module allows customizable Raku syntax highlighting of Raku code that is syntactically correct. It is definitely not usable for real-time syntax highlighting, as it (a) requires valid Raku code, and (b) may execute Raku code in BEGIN blocks, constant specifications and use statements.
A more lenient alternative is the Rainbow module by Patrick Böker.
The final version of the RakuDoc v2.0 specification can now be completely rendered thanks to the tireless work of Richard Hainsworth and extensive feedback and proofreading by Damian Conway, as shown in these blog posts.
Sadly it has turned out to be impossible to organize a Raku Conference (neither in person nor online), nor was it possible to organize a Raku Core Summit. We're hoping for a better situation in 2025!
Completely out of the blue, a Dr Raku created about 90 Raku beginner videos on YouTube.
Looking back, again an amazing amount of work has been done in 2024! Looking forward to an even more amazing 2025!
On these last days of 2024, Anton Antonov continued their neat blogging spree in “Sparse Matrix Neat Examples in Raku“, this time mixing The Lord of the Rings, X-Men and Pirates of the Caribbean.
The final 2 blog posts of this year's Advent Calendar:
There’s also a handy overview of all Raku advent posts of 2024.
Weekly Challenge #302 is available for your perusal.
A NullPointerException on the JVM backend was fixed. In RakuAST: A, P, and Δ were added to the RakuDoc markup letters that can have meta information.
to the RakuDoc markup letters that can have meta information.A record in cool new modules! And the end of a turbulent year. With most likely more turbulence ahead in 2025. Please keep staying safe and healthy, and keep up the good work!
Meanwhile, still: Слава Україні! Героям слава!
If you like what I’m doing, committing to a small sponsorship would mean a great deal!
Sparse matrices are an essential tool in computational mathematics, allowing us to efficiently represent and manipulate large matrices that are predominantly composed of zero elements. In this blog post, we will delve into a few intriguing examples of sparse matrix utilization, specifically in the Raku programming language.
Support for sparse matrix linear algebra is a hallmark of a mature computational system. Here’s a brief timeline of when some popular systems introduced sparse matrices:
Language | Initial Introduction | Confirmed Update |
---|---|---|
MATLAB | 1992 | ~ |
Mathematica / Wolfram Language | 2003 | updated 2007 |
Python | maybe since 2004 | updated 2006 |
R | maybe since 2011 | updated 2014 |
(This setup is similar to the one used in the Graph neat examples.)
Let’s begin by examining a random graph generated using the Watts-Strogatz model. This model is particularly useful for simulating social networks.
#% js
my $gl = Graph::Random.new: Graph::Distribution::WattsStrogatz.new(20,0.06);
my $gp = Graph::Path.new: $gl.find-shortest-path('0','12'), :directed;
my $grPlot =
js-d3-graph-plot(
$gl.edges(:dataset),
highlight => [|$gp.vertex-list, |$gp.edge-list],
background => '1F1F1F',
title-color => 'Silver',
edge-thickness => 3,
vertex-size => 6,
width => 600,
force => {charge => {strength => -260, iterations => 2}, y => {strength => 0.2}, collision => {radius => 6, iterations => 10}, link => {distance => 4}}
)
In the code above, we create a random graph with 20 vertices and a connection probability of 0.06. We also find the shortest path between vertices ‘0’ and ’12’.
The adjacency matrix of this graph is a sparse matrix, where non-zero elements indicate the presence of an edge between vertices.
#% js
my $m = Math::SparseMatrix.new(edge-dataset => $gl.edges(:dataset), row-names => $gl.vertex-list.sort(*.Int));
say $m;
$m.Array ==> js-d3-matrix-plot(width => 400, margins => 15, :$tick-labels-font-size)
# Math::SparseMatrix(:specified-elements(86), :dimensions((20, 20)), :density(0.215))
Here, we visualize the graph matrix, the shortest path matrix, and the sum of these matrices:
#% js
my $m2 = Math::SparseMatrix.new(edge-dataset => $gp.edges(:dataset), row-names => $m.row-names);
my $m3 = $m.add($m2.multiply(0.75));
# Visualize
my %opts = width => 350, margins => {top => 30, left => 16, right => 16, bottom => 16}, :$tick-labels-font-size, :$tick-labels-color, :$title-color, :!tooltip, color-palette => 'Inferno';
[
js-d3-matrix-plot($m.Array, |%opts, title => 'Graph'),
js-d3-matrix-plot($m2.Array, |%opts, title => 'Shortest path graph'),
js-d3-matrix-plot($m3.Array, |%opts, title => 'Sum')
].join("\n")
The sum matrix is printed in a “plain” format:
$m3.print
By comparing the graph and the sum matrix side-by-side, we can better understand the structure and relationships within the graph:
#% js
[
$grPlot,
js-d3-matrix-plot($m3.Array, margins => 16, :$tick-labels-font-size, :$tick-labels-color, width => 400, color-palette => 'Inferno')
].join("\n")
Next, we will ingest a CSV file containing data about movies and actors. This data will be used to create a bipartite graph.
my $file = $*CWD ~ '/Sparse-matrices/dsMovieRecords.csv';
my @dsMovieRecords = data-import($file, 'csv', headers => 'auto');
deduce-type(@dsMovieRecords)
# Vector(Assoc(Atom((Str)), Atom((Str)), 6), 40)
Here is a tabular representation of the movie data:
#% html
my @field-names = <Movie Actor Genre1 Genre2 Genre3 BoxOffice>;
@dsMovieRecords ==> to-html(:@field-names)
A summary of the data:
#% html
records-summary(@dsMovieRecords, :8max-tallies, :!say)
==> to-html(:@field-names)
We construct a bipartite graph based on the movie-actor relationships.
my @rules = @dsMovieRecords.map({ $_<Movie> => $_<Actor> });
my $g = Graph.new(@rules)
# Graph(vertexes => 27, edges => 40, directed => False)
The graph is confirmed to be bipartite:
$g.is-bipartite
# True
Here is the coloring of the graph:
.say for $g.bipartite-coloring.classify(*.value)
# 1 => [X2 => 1 The Lord of the Rings: The Fellowship of the Ring => 1 Pirates of the Caribbean: The Curse of the Black Pearl => 1 The Lord of the Rings: The Return of the King => 1 Pirates of the Caribbean: At World's End => 1 X-Men: The Last Stand => 1 The Lord of the Rings: The Two Towers => 1 Pirates of the Caribbean: Dead Man's Chest => 1]
# 0 => [Sean Astin => 0 Patrick Stewart => 0 Elijah Wood => 0 Rebecca Romijn => 0 Ian McKellen => 0 Keira Knightley => 0 Orlando Bloom => 0 Famke Janssen => 0 Bill Nighy => 0 Johnny Depp => 0 Jack Davenport => 0 Hugh Jackman => 0 Liv Tyler => 0 Halle Berry => 0 Andy Serkis => 0 Geoffrey Rush => 0 Stellan Skarsgård => 0 Anna Paquin => 0 Viggo Mortensen => 0]
#% js
$g.edges(:dataset)
==> js-d3-graph-plot(
highlight => @dsMovieRecords.map(*<Actor>).List,
:$background,
title-color => 'silver',
width => 1000,
:$edge-thickness,
:$vertex-size,
vertex-color => 'Red',
vertex-label-font-size => 12,
vertex-label-color => 'Grey',
vertex-label-font-family => 'Helvetica',
:!directed,
force => {charge => {strength => -680, iterations => 2}, collision => {radius => 10, iterations => 1}, link => {minDistance => 10}}
)
We create a sparse matrix representing the movie-actor relationships:
my @allVertexNames = [|@dsMovieRecords.map(*<Movie>).unique.sort, |@dsMovieRecords.map(*<Actor>).unique.sort];
my %h = @allVertexNames Z=> ^@allVertexNames.elems;
# {Andy Serkis => 8, Anna Paquin => 9, Bill Nighy => 10, Elijah Wood => 11, Famke Janssen => 12, Geoffrey Rush => 13, Halle Berry => 14, Hugh Jackman => 15, Ian McKellen => 16, Jack Davenport => 17, Johnny Depp => 18, Keira Knightley => 19, Liv Tyler => 20, Orlando Bloom => 21, Patrick Stewart => 22, Pirates of the Caribbean: At World's End => 0, Pirates of the Caribbean: Dead Man's Chest => 1, Pirates of the Caribbean: The Curse of the Black Pearl => 2, Rebecca Romijn => 23, Sean Astin => 24, Stellan Skarsgård => 25, The Lord of the Rings: The Fellowship of the Ring => 3, The Lord of the Rings: The Return of the King => 4, The Lord of the Rings: The Two Towers => 5, Viggo Mortensen => 26, X-Men: The Last Stand => 6, X2 => 7}
The row and column names are sorted, with movie titles first, followed by actor names:
.say for @allVertexNames
# Pirates of the Caribbean: At World's End
# Pirates of the Caribbean: Dead Man's Chest
# Pirates of the Caribbean: The Curse of the Black Pearl
# The Lord of the Rings: The Fellowship of the Ring
# The Lord of the Rings: The Return of the King
# The Lord of the Rings: The Two Towers
# X-Men: The Last Stand
# X2
# Andy Serkis
# Anna Paquin
# Bill Nighy
# Elijah Wood
# Famke Janssen
# Geoffrey Rush
# Halle Berry
# Hugh Jackman
# Ian McKellen
# Jack Davenport
# Johnny Depp
# Keira Knightley
# Liv Tyler
# Orlando Bloom
# Patrick Stewart
# Rebecca Romijn
# Sean Astin
# Stellan Skarsgård
# Viggo Mortensen
The sparse matrix of the bipartite graph is constructed:
my $m = Math::SparseMatrix.new(edge-dataset => $g.edges(:dataset))
# Math::SparseMatrix(:specified-elements(80), :dimensions((27, 27)), :density(<80/729>))
#%js
$m.Array ==> js-d3-matrix-plot(width=>400)
To clearly show the bipartite nature of the matrix, we restructure it using pre-arranged row and column names:
$m = $m[@allVertexNames; @allVertexNames]
# Math::SparseMatrix(:specified-elements(80), :dimensions((27, 27)), :density(<80/729>))
The matrix plot now clearly indicates a bipartite graph:
#%js
$m.Array ==> js-d3-matrix-plot(width=>400)
For an alternative visualization, we can create an HTML “pretty print” of the sparse matrix:
#% html
$m
.to-html(:v)
.subst('<td>1</td>', '<td><b>●</b></td>', :g)
Sparse matrices are particularly useful for information retrieval operations. Here, we demonstrate how to retrieve data about an actor, such as Orlando Bloom.
#%html
my $m-actor = $m['Orlando Bloom'].transpose;
$m-actor.to-html.subst('<td>0</td>','<td> </td>'):g
#% html
$m.dot($m-actor).to-html
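In that product, each non-zero row is an actor who shares at least one movie with Orlando Bloom, and the value is the number of shared movies. A small sketch for listing them, reusing the .tuples and .row-names methods shown elsewhere in this post (the exact output shape of .tuples is assumed from those examples):

# Non-zero rows of the product are actors sharing movies with Orlando Bloom
my $co = $m.dot($m-actor);
.say for $co.tuples.map({ $m.row-names[$_[0]] => $_.tail }).sort(-*.value);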
There are two primary methods for plotting sparse matrices. The first uses a heatmap plot specification (with tooltips):
#% js
my @ds3D = $m.tuples.map({ <x y z tooltip>.Array Z=> [|$_.Array, "⎡{$m.row-names[$_[0]]}⎦ : ⎡{$m.column-names[$_[1]]}⎦ : {$_.tail}"] })».Hash;
js-d3-matrix-plot(
@ds3D,
:$tooltip-background-color,
:$tooltip-color,
:$background,
width => 400)
Here is the corresponding (“coordinates”) list plot:
#%js
$m.tuples
==> js-d3-list-plot(:$background, width => 400, :!grid-lines)
The second method visualizes the matrix as a dense array:
#%js
$m.Array
==> js-d3-matrix-plot(width => 400)
For larger matrices, a list plot might be more useful, especially if the matrix has a relatively high density.
my $gLarge = Graph::Random.new: Graph::Distribution::WattsStrogatz.new(100,0.1);
my $mLarge = Math::SparseMatrix.new(edge-dataset => $gLarge.edges(:dataset));
# Math::SparseMatrix(:specified-elements(444), :dimensions((100, 100)), :density(0.0444))
The corresponding plot:
#% js
$mLarge.tuples
==> js-d3-list-plot( :$background, width => 600, height => 600, :!grid-lines)
Remark: The list plot might be much more useful for large matrices with (relatively) high density.
Tuples dataset:
#%js
$mLarge.tuples(:dataset)
==> {rename-columns($_, (<i j x> Z=> <x y z>).Hash)}()
==> js-d3-matrix-plot(:$background, width => 600)
Lastly, we explore a dense matrix example:
#%js
my @a = random-real(10, 48) xx 12;
@a = rand > 0.5 ?? @a.map(*.sort) !! @a.&transpose.map(*.sort.Array).&transpose;
say "dimensions : ", dimensions(@a);
js-d3-matrix-plot(@a, width => 1600, margins => 1, :$tick-labels-font-size, color-palette => <Turbo Plasma Warm Inferno Oranges>.pick, :$background)
[AA1] Anton Antonov, “RSparseMatrix for sparse matrices with named rows and columns”, (2015), MathematicaForPrediction at WordPress.
[AAp1] Anton Antonov, Math::SparseMatrix Raku package, (2024), GitHub/antononcube.
[AAp2] Anton Antonov, Math::SparseMatrix::Native Raku package, (2024), GitHub/antononcube.
[AAp3] Anton Antonov, Graph Raku package, (2024), GitHub/antononcube.
[AAp4] Anton Antonov, JavaScript::D3 Raku package, (2022-2024), GitHub/antononcube.
(in chronological order, with comment references)
How time flies. Yet another year has flown by. 2024 was a year of changes, continuations and preparations. Let’s start with the changes:
Edument Central Europe, the branch of Edument that is based in Prague (and led by Jonathan Worthington), decided to stop (commercial) development of its Raku related products: Comma (the IDE for the Raku Programming Language) and Cro (a set of libraries for building reactive distributed systems).
The announcement:
With the discrepancy between revenue and development cost to continue being so large, and the prevailing economic environment forcing us to focus on business activities that at least pay for themselves, we’ve made the sad decision to discontinue development of the Comma IDE.
Fortunately, Jonathan Worthington was not only able to release the final commercial version as a free download, but was also able to release all of the sources of Comma. This would allow other people to continue Comma development.
With John Haltiwanger being the now de-facto project leader, this has resulted in a beta of the open source version of a Raku plugin for IntelliJ IDEA, as described on Day 20.
The announcement:
When Edument employees were by far the dominant Cro contributors, it made sense for us to carry the overall project leadership. However, the current situation is that members of the Raku community contribute more. We don’t see this balance changing in the near future.
With that in mind, we entered into discussions with the Raku Steering Council, in order that we can smoothly transfer control of Cro and its related projects to the Raku community. In the coming weeks, we will transfer the GitHub organization and release permissions to steering council representatives, and will work with the Raku community infrastructure team with regards to the project website.
As the source code of Cro had always been open source, this was more a question of handing over responsibilities. Fortunately the Raku Community reacted: Patrick Böker has taken care of making Cro a true open source project related to Raku, and the associated web site https://cro.raku.org is now being hosted on the Raku infrastructure. With many kudos to the Raku Infra Team!
Sadly, Jonathan Worthington also indicated that they would only remain minimally involved in the further development of MoarVM, NQP and Rakudo in the foreseeable future. As such, (almost) all of their modules were moved to the Raku Community Modules Adoption Center, where they were updated and re-released.
It’s hard to overstate the importance of Jonathan Worthington‘s work in the development and implementation of the Raku Programming Language. So on behalf of the current, past and future Raku Community members: Thank You!
At the beginning of October, Elizabeth Mattijsen decided to take on the large number of open Rakudo issues at that time: 1300+. This resulted in the closing of more than 500 issues: some just needed closing, some needed tests, and some could be fixed pretty easily.
They reported on that work in the Raku Fall Issue Cleanup blog post. Of the about 800 open issues remaining, almost 300 were marked as “fixed in RakuAST”, and about 100 were marked as “Will be addressed in RakuAST”. Which still leaves about 400 open because of other reasons, so there’s still plenty of work to be done here.
The original Raku ecosystem (“p6c”) is in the process of being completely removed. Since March 2023, the ecosystem was no longer being refreshed by zef
. But it was still being refreshed by the Raku Ecosystem Archiver. But this stopped in August 2024, meaning that any updates of modules in that ecosystem would go unnoticed from then on.
At that time, every author who still had at least one module in the “p6c” ecosystem was given notice by creating an issue in the repository of the first of their modules in the META.list
. Luckily many authors responded, either by indicating that they would migrate their module(s) to the “zef” ecosystem, or that they were no longer interested in maintaining them.
Since then, most of the modules of the authors that responded have been migrated. And work has started on the modules of the authors that did not respond. With the following results: at the beginning of 2024, there were still 658 modules in the “p6c” ecosystem (now 427, 35% less), by 230 different authors (now 138, 40% less).
In 2024, 558 Raku modules have been updated (or first released): up from 332 in 2023 (an increase of 68%). There are now 2304 different modules installable by zef
by just mentioning their name. And there are now 12181 different versions of Raku modules available from the Raku Ecosystem Archive, up from 10754 in 2023, which means almost 4 module updates / day in 2024.
Rakudo saw about 2000 commits (MoarVM, NQP, Rakudo, doc) this year, which is about the same as in 2023. About one third of these commits were in the development of RakuAST (down from 75% in 2023).
A lot of work was done under the hood of the various subsystems of Rakudo. For example, the dispatcher logic was simplified by introducing several nqp:: shortcuts, which made the dispatcher code a lot more readable and maintainable.
The Meta-classes of NQP
and Raku
also received a bit of a makeover, as most of them hadn’t been touched since the 2015 release: this resulted in better documentation, and some minor performance improvements. Support for TWEAK
methods and a rudimentary dd
functionality were also added to NQP
.
The JVM
backend also got some TLC in 2024: one under the hood change (by Daniel Green) made execution of Raku code on the JVM backend twice as fast!
Timo Paulssen made the interface with low-level debuggers such as gdb
and lldb
a lot less cumbersome on MoarVM
, which makes adding / fixing MoarVM
features a lot easier!
On the MoarVM backend the expression JIT (active on Intel hardware) was disabled by default: it was found to be too unreliable and did not provide any execution speed gains. This change made Rakudo on Intel hardware up to 5% faster overall.
Also on the MoarVM backend, Daniel Green completed the work on optimizing short strings started by Timo Paulssen and Bart Wiegmans, resulting in about a 2% speed improvement of the compilation of Raku code.
Work on the Remote Debugger (which so far had only really been used as part of Comma) has resumed, now with a much better command line interface. Whether remote debugging is active can now be checked in the language itself with the new VM.remote-debugging
method.
Some race conditions were fixed: a particularly nasty one on the lazy deserialization of bytecode that was very hard to reproduce, as well as some infiniloops.
A lot of work was done on making the Continuous Integration testing produce fewer (and recently hardly any) false positives. This makes life for core developers a lot easier!
Two new Routine
traits were added to the Raku Programming Language in 2024.
The is item
trait can be used on @
and %
sigilled parameters to indicate that a Positional
(in the @
case) or an Associative
(in the %
case) is only acceptable in dispatch if it is presented as an item. It only serves as a tie-breaker, so there should always also be a dispatch candidate that would accept the argument when it is not itemized. Perhaps an example makes this more clear:
multi sub foo(@a) { say "array" }
multi sub foo(@a is item) { say "item" }
foo [1,2,3]; # array
foo $[1,2,3]; # item
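The same tie-breaking applies to variables: an @-sigilled variable dispatches to the Positional candidate unless it is itemized first. A minimal illustration, reusing the foo candidates above:

my @a = 1, 2, 3;
foo @a;        # array
foo @a.item;   # item
foo $@a;       # item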
The is revision-gated
trait fulfils a significant part of the promise of the Raku Programming Language to be a 100-year programming language. It allows a developer to add / keep behaviour of a dispatch to a subroutine or method depending on the language level from which it is being called.
As with is item
, this is implemented as a tie-breaker to be checked only if there are multiple candidates in dispatch that match a given set of arguments.
This will allow core and module developers to provide forward compatibility, as well as backward compatibility in their code (as long as the core supports a given language level, of course).
In its current implementation, the trait must be specified on the proto
to allow it to work (this may change in the future), and it should specify the lowest language level it should support. An example of a module “FOO” that exports a “foo” subroutine:
unit module FOO;
proto sub foo(|) is revision-gated("v6.c") is export {*}
multi sub foo() is revision-gated("6.c") {
say "6.c";
}
multi sub foo() is revision-gated("6.d") {
say "6.d"
}
Then we have a program that uses the “FOO” module and calls the “foo” subroutine. This shows “6.d” because the current default language level is “6.d”.
use FOO;
foo(); # 6.d
However, if this program would like to use language level 6.c semantics, it can indicate so by adding a use v6.c
at the start of the program. And get a different result in an otherwise identical program:
use v6.c;
use FOO;
foo(); # 6.c
John Haltiwanger has written an extensive blog post about the background, implementation and usage of the is revision-gated
trait on Day 16, if you’d like to know more about it.
These are some of the more notable changes in language level 6.d: all of them add functionality, so are completely backward compatible.
The .flat
method optionally takes a :hammer
named argument, which will deeply flatten any data structure given:
my @a = 1, [2, [3,4]];
say @a.flat; # (1 [2 [3 4]])
say @a.flat(:hammer); # (1 2 3 4)
One can now also use HyperWhatever
(aka **
) in a postcircumfix [ ]
for the same semantics:
my @a = 1, [2, [3,4]];
say @a[*]; # (1 [2 [3 4]])
say @a[**]; # (1 2 3 4)
The .min
/ .max
/ .minmax
methods now also accept the :by
named argument to make it consistent with the sub versions, which should prevent unexpected breakage when refactoring code from a sub form to a method form (as the :by
would previously be silently ignored in the method form).
say min <5 45 345>, :by(*.Str); # 345
say <5 45 345>.min(*.Str); # 345
say <5 45 345>.min(:by(*.Str)); # 345
The .are
method now also accepts a type argument. If called in such a manner, it will return True
if all members of the invocant matched the given type, and False
if not. Apart from allowing better readable code, it also allows shortcutting if any of the members of the invocant did not match the given type. An example:
unless @args.are(Pair) {
die "All arguments should be Pairs";
}
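For completeness, a quick look at the boolean return values of the type form (assuming a Rakudo release that includes this addition):

say (1, 2, 3).are(Int);        # True
say (1, "two", 3).are(Int);    # False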
The most notable additions to the future language level of the Raku Programming Language:
The .nomark
method on Cool
objects returns a string with the base characters of any composed characters, effectively removing any accents and such:
use v6.e.PREVIEW;
say "élève".nomark; # eleve
The .contains
/ .starts-with
/ .ends-with
/ .index
/ .rindex
/ .substr-eq
methods now all accept a :smartcase
named argument: a conditional :ignorecase
. If specified with a True
value, it will look at the needle to see if it is all lowercase. If it is, then :ignorecase
semantics will be assumed. If there are any uppercase characters in the needle, then normal semantics will be assumed:
use v6.e.PREVIEW;
say "Frobnicate".contains("frob"); # False
say "Frobnicate".contains("frob", :smartcase); # True
say "Frobnicate".contains("FROB", :smartcase); # False
The .stem
method on IO::Path
objects returns the .basename
of the object without any extensions, or with the given number of extensions removed:
use v6.e.PREVIEW;
say "foo/bar/baz.tar.gz".IO.basename; # baz.tar.gz
say "foo/bar/baz.tar.gz".IO.stem; # baz
say "foo/bar/baz.tar.gz".IO.stem(1); # baz.tar
About one third of this year’s work was done on the RakuAST (Raku Abstract Syntax Tree) project. It basically consists of 3 sub-projects that are heavily intertwined:
There is little more to say about the development of RakuAST
classes other than that there were 440 of them at the start of the year, and 454 of them at the end of the year. As the development of these classes is still very much in flux, they are not documented yet (other than in the test-files in the /t/12rakuast
directory).
On the other hand, the RakuAST::Doc
classes are documented because they have a more or less stable API to allow for the development of RakuDoc Version 2.
The work on the Raku Grammar and Actions has been mostly about implementing already existing features. This is measured by the number of Rakudo (make test
) and roast (make spectest
) test-files that completely pass with the new Raku Grammar and Actions. And these are the changes:
- make test: 110/151 (72.8%) → 140/156 (89.7%)
- make spectest: 980/1356 (72.3%) → 1155/1359 (85%)

A lot of work was done by Stefan Seifert, picking up the compile-time handling refactor that Jonathan Worthington had started in 2023, but was unable to bring to fruition. TPRF continued the funding for this work.
By the way, DuckDuckGo donated US$ 25000 to the foundation to allow this type of development funding to go on! Hint, hint!
Like last year, there are still quite a few features left to implement. Although it must be said that many tests are hinging on the implementation of a single feature, and often cause an “avalanche” of additional test-files passing when it gets implemented.
If you’d like to try out the new Raku Grammar and Actions, you should set the RAKUDO_RAKUAST
environment variable to 1. The .legacy
method on the Raku
class will tell you whether the legacy (older) grammar is being used or not:
$ raku -e 'say Raku.legacy'
True
$ RAKUDO_RAKUAST=1 raku -e 'say Raku.legacy'
False
A cousin language was created: Draig
, allowing you to write Raku code in Welsh!
The RakuAST::Deparse::Highlight
module allows customizable Raku syntax highlighting of Raku code that is syntactically correct. It is definitely not usable for real-time syntax highlighting, as it (a) requires valid Raku code, and (b) may execute Raku code in BEGIN
blocks, constant
specifications and use
statements.
A more lenient alternative is the Rainbow
module by Patrick Böker.
The final version of the RakuDoc v2.0 specification can now be completely rendered thanks to the tireless work of Richard Hainsworth and extensive feedback and proofreading by Damian Conway, as shown in Day 1 and Day 14.
Sadly it has turned out to be impossible to organize a Raku Conference (neither in-person nor online), nor was it possible to organize a Raku Core Summit. We’re hoping for a better situation in 2025!
Completely out of the blue, a Dr Raku created about 90 Raku beginner videos on YouTube.
Looking back, again an amazing amount of work has been done in 2024!
Hopefully you will all be able to enjoy the Holiday Season with sufficient R&R, especially Kane Valentine (aka kawaii).
The next Raku Advent Blog is only 340 days away!
Amongst all of its features, what do we consider to be the essence of Raku? In this advent post we will explore what makes Raku an exciting programming language and see if we can describe its essence.
Paul Graham wrote an essay about the 100 year language, back in 2003. In it he had this to say:
Who will design the languages of the future? One of the most exciting trends in the last ten years has been the rise of open-source languages like Perl, Python, and Ruby. Language design is being taken over by hackers. The results so far are messy, but encouraging. There are some stunningly novel ideas in Perl, for example. Many are stunningly bad, but that’s always true of ambitious efforts. At its current rate of mutation, God knows what Perl might evolve into in a hundred years.
Larry Wall took Paul Graham’s essay and ran with it:
“It has to do with whether it can be extensible, whether it can evolve over time gracefully.”
Larry referred to the expressiveness of human languages as his inspiration for Raku:
- Languages evolve over time. (“It’s okay to have dialects…”)
- No arbitrary limits. (“And they naturally cover multiple paradigms”)
- External influences on style.
- Fractal dimensionality.
- Easy things should be easy, hard things should be possible.
- “And, you know, if you get really good at it, you can even speak CompSci.”
Naoum Hankache’s Raku Guide introduces Raku:
Raku is a high-level, general-purpose, gradually typed language. Raku is multi-paradigmatic. It supports Procedural, Object Oriented, and Functional programming.
The feature list on Raku.org goes further.
As we can see, Raku has many rich features. Maybe the essence of Raku lies somewhere among them.
It’s the 21st Century and programming languages are finally waking up to the fact that we live in a multi-lingual world. Raku is a Unicode centric language in a way that no other language manages.
Again programming languages are only beginning to catch up with the reality that every computer has multiple cores. Sometimes very many cores. Raku excels, alongside other modern languages, at delivering the tools required for effective concurrent and asynchronous programs.
Raku is a richly functional language that combines the procedural and object oriented encapsulation constructs to span paradigms.
The breadth of the Raku language is indeed impressive. But I don’t think it’s instructive, or even possible to try and distill an essence from such a diverse list of capabilities.
Perl and Raku both adhere to the famous motto:
TMTOWTDI (Pronounced Tim Toady): There is more than one way to do it.
This is indeed demonstrably true as shown by Damian Conway whenever he blogs about or talks about Raku. Indeed in Why I love Raku he sums up with this endorsement:
More than any other language I know, Raku lets you write code in precisely the way that suits you best, at whatever happens to be your (team’s) current level of coding sophistication, and in whichever style you will later find most readable …and therefore easiest to maintain.
How do you identify the essence of a deliberately broad multi-paradigm language? Perhaps the essence is not so much about some aspect of the core language. Perhaps it is much more about the freedom it gives you to choose, instead of the language choosing for you.
Like Raku, the community is not opinionated. If you choose a particular approach, the community is there to help you. If you want to learn what’s possible, the community will offer up multiple approaches, as shown by many Stack Overflow answers.
Which leaves me with this thought:
The essence of Raku is the freedom to choose for yourself. And the freedom to choose differently tomorrow.
Brian Duggan will be presenting “Connecting the Geospatial Dots with Raku” at FOSDEM 2025: “This talk explores Raku’s expressive and powerful style as we mesh together…with some of Raku’s unique features such as NativeCall for native libraries, Grammars for parsing, multiple modes of interacting with command line tooling, and plentiful concurrency models”. Looking forward to it already!
The 7 new blog posts in the 2024 Raku Advent Calendar the past week:
Weekly Challenge #301 is available for your perusal.
- state-declared variables to be instantiated/initialized to non-zero signed integers
- repl-mode for -e
- grep slip its arguments
- .split if at least one of the needles was a regex, and finished the re-imagination of the .trans method (which fixed several issues and made use of .trans, tr/// and TR/// up to 3x as fast)
- Dormitory by Alexey Melezhik

Much reading again this week with the Advent blog posts. And some cool new modules! Meanwhile, still: Слава Україні! Героям слава!
Please keep staying safe and healthy, and keep up the good work!
If you like what I’m doing, committing to a small sponsorship would mean a great deal!
A lead-free area
Elf Nebecaneezer (‘Neb’) was welcoming some young new workers to his domain and doing what old folks like to do: pontificate. (His grandchildren politely, but behind his back, call it “bloviating.”)
“In the old days, we used hand-set lead type, then gradually used ever-more-modern methods; now we prepare the content in PDF and send a copy to another company to print the content on real newsprint. It still takes a lot of paper, but much less ink and labor (as well as being safer1).”
“Most young people are very familiar with HTML and online documents, but, unless your browser and printer are very fancy, it’s not easy to create a paper copy of the thing you need to file so that it’s very usable. One other advantage of hard-copy products is that they can be used without the internet or electricity. Remember, the US had a hugely-bad day recently when a major internet service provider had a problem!” [He continued to speak…]
Now let’s get down to ‘brass tacks.’ You all are supposed to be competent Raku users, so I will show you some interesting things you can do with PDF modules. But first, a little background on PDF (Adobe’s Portable Document Format).
The PDF was developed by the Adobe company in 1991 to replace the PostScript (PS) format. In fact, the original PS code is at the heart of PDF. Until 2008, Adobe defined and continued to develop PDF. Beginning in 2008, the ISO defined it in its 32000 standard. That standard is about a thousand pages long and can be obtained online at https://www.pdfa-inc.org at no cost.
One other important piece of the mix is the CLI ghostscript
interpreter which can be used, among other things, to compress PDF documents.
Before we continue, remember, we are a Debian shop, and any specific system software I mention is available in current versions.
We are still improving Rakudoc (see the recent Raku Advent post by Richard Hainsworth), but it can be used now for general text to produce PDF output. However, for almost any specific printed product needed, we can create the PDF on either a one-time case or a specific design for reuse.
A good example of that is the module PDF::NameTags
which can be modified to use different layouts, colors, fonts, images, and so forth. Its unique output feature is its reversibility–when printed two-sided, the person’s name shows regardless of whether the badge flips around or not. That module was used to create the name tags we are wearing on our lanyards, and I created that module! I’ll be using parts of that module as an example as I continue.
Actually, I have a GitHub account under an alias, ‘tbrowder’, and I have published several useful modules to help my PDF work. I’ll mention some later.
Note I always publish my finished modules via ‘zef’, the standard Raku package manager. Then they are easily available for installation, and they automatically get listed on https://Raku.land and can easily be found. Why do I do that, especially since no one else may care? (As ‘@raiph’ or some other Raku user once said, modules are usually created for the author’s use.) Because, if it helps someone else, as the Golden Rule says, it may be the right thing to do (as I believe Font::Utils
does).
Fonts used to be very expensive and hard to set type with, so the modern binary font is a huge improvement! Binary fonts come in different formats, and their history is fascinating.
This shop prefers OpenType fonts for two practical reasons: (1) they provide extensive Unicode glyph coverage for multi-language use (we operate world-wide as you well know) and (2) they provide kerning for more attractive type setting.
By using the HarfBuzz
library, the final PDF will only contain the glyphs actually used by the text (the process is called ‘subsetting’). If a user is not running Debian 12 or later, then he or she can additionally ‘use’ module Compress::PDF
which provides a subroutine which uses Ghostscript to remove unused glyphs. The module also provides a CLI program, pdf-compress
, which does the same thing.
For Latin and most European languages we use the following font collections available as Debian packages:
Font | Debian package |
---|---|
GNU Free Fonts | fonts-freefont-otf |
URW Base 35 | fonts-urw-base35 |
E B Garamond | fonts-egaramond, fonts-garamond-extra |
Cantarell | fonts-cantarell |
For other languages we rely on the vast glyph coverage of Google’s Noto fonts (in TrueType format). Debian has many of those fonts, but we find it easier to find the needed font at Google and download them onto our server as zip archives. After unzipping, we move the desired files to ‘/usr/local/share/fonts/noto’. We currently have these 10 Noto fonts available (plus 175 other variants):
Font |
---|
NotoSerif-Regular.ttf |
NotoSerif-Bold.ttf |
NotoSerif-Italic.ttf |
NotoSerif-BoldItalic.ttf |
NotoSans-Regular.ttf |
NotoSans-Bold.ttf |
NotoSans-Italic.ttf |
NotoSans-BoldItalic.ttf |
NotoSansMono-Regular.ttf |
NotoSansMono-Bold.ttf |
Note the file names above are in the same family and style order as the Free Fonts in our own $HOME
directory.
To get a feel for the span of glyphs in some of the fonts, the FreeSerif font has over 1,000 glyphs and can be used for many languages. We typically use that font, and the rest of that font family, for most of our work around Europe and the Americas. For the rest of the world, Google’s Noto fonts should cover all but a tiny fraction of the population.
One of the many strengths of Raku is its handling of Unicode as a core entity. You can read about Unicode at its website. Of particular interest are the code charts at https://www.unicode.org/charts. If you select any chart you will see that the code points are shown in hexadecimal. In Raku the code points are natively in decimal. Fortunately, Raku has methods to convert between the two bases:
# convert decimal 178 to hexadecimal (base 16)
say 178.base(16); # OUTPUT: B2
# convert a hexadecimal 'A23' to decimal
say 'A23'.parse-base(16); # OUTPUT: 2595
Or we can look at Wikipedia where there are charts showing both hexadecimal and decimal code points (Unicode_chars).
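And since code points are just integers, we can go from a (hexadecimal) code point to the actual character and back with chr, ord and base:

say 0x3042.chr;          # OUTPUT: あ
say "あ".ord.base(16);   # OUTPUT: 3042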
Not released yet is my Raku module PDF::FontCollection
which encapsulates useful font collections into a single reference list. The module has routines allowing the user to get a loaded font with a short code mnemonically associated with the font collection (a digit, and a one- or two-letter character for its style). It has an installed binary to show its fonts by number, code, and name. However, my most useful module, Font::Utils
, has made this module effectively obsolete.
Font::Utils
I now introduce my almost-released module Font::Utils
which uses fonts already installed and collects them into a file called font-files.list
which is then placed into the user’s $HOME/.Font-Utils
directory.
Since our servers already have the desired OpenType fonts installed, using Font::Utils
is actually more convenient since you can arrange the font list any way you want including: (1) creating your own mnemonic keys for easy reference, (2) deleting or adding data lines, and (3) reordering data lines.
Note that the actual OpenType font files are quite large, but a good design will ensure they are not loaded until specifically called for in the program. If they are already loaded, calling the routine will be a no-op. The Font::Utils
module has one such routine.
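A minimal sketch of that lazy-loading idea (hypothetical code, not the actual Font::Utils routine): the font data is read on the first request and cached, so later calls are effectively no-ops.

my %loaded-fonts;
sub get-font(Str $key, Str $path) {
    # only hit the file system the first time a given key is requested
    %loaded-fonts{$key} //= do {
        note "loading '$key' from $path";
        $path.IO.slurp(:bin);   # stand-in for the real font-loading call
    }
}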
Other non-Latin languages are covered in many freely available font collections, including right-to-left and other orientations along with layouts for users who need that capability (the Noto fonts are a good example). As noted, those can be easily added to your Font::Utils
collection.
Let’s take a look at the first couple of lines in the default installation of my $HOME/.Font-Utils/font-files.list
:
# key basename path
1 FreeSerif.otf /usr/share/fonts/opentype/freefont/FreeSerif.otf
We see the comments and the first line of data consisting of three fields. The first field is the unique code which you may change. The second field is the font file’s basename, and the last field is the font file’s path. You may delete or reorder or add new font entries as you wish.
Now let’s say you want to publish some text in Japanese. Oh, you say you don’t know Japanese? And you don’t have a keyboard to enter the characters? No problem! There is a way to do that.
We first find from Wikipedia that some Unicode characters for the Japanese language are in the Hiragana block, which covers hexadecimal code points 3041
through 3096
and 3099
through 309F
. Then we create a space-separated string of the characters for each word. We’ll use an arbitrary list of them:
my $jword1 = "3059 306A 306B 306C 305D";
my $jword2 = "3059-305D"; # same as $jword1 but with a hyphen for a range
my $jword3 = "306B-306F";
Note that we can use a hyphen to indicate a contiguous range of code points. (We could also use decimal code points, but that’s a bit more awkward: more digits are required, and the joining ‘-‘ could be confused with subtraction.)
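To make that concrete, here is a small helper sketch (not part of any published module) that expands such a string of single code points or hyphenated ranges into the actual characters:

sub hex2chars(Str $spec --> Str) {
    $spec.words.map(-> $w {
        # a word is either "XXXX" or a range "XXXX-YYYY" of hexadecimal code points
        my ($from, $to) = $w.split('-');
        ($from.parse-base(16) .. ($to // $from).parse-base(16)).map(*.chr).join
    }).join
}
say hex2chars("3059 306A 306B 306C 305D");  # OUTPUT: すなにぬそ
say hex2chars("3041-3045");                 # OUTPUT: ぁあぃいぅ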
Oops, what font shall we use? I couldn’t find a suitable font with the searches on Debian, so I went online to Google, searched for Hiragana
, and found the font with description Noto Serif Japanese
.
I selected it, downloaded it, and got file Noto_Serif_JP.zip
. I created a directory named google-fonts
and moved the zip file there where I then unpacked them to get directory Noto_Serif_JP
with files:
README.txt
NotoSerifJP-VariableFont_wght.ttf
OFL.txt
static
static/NotoSerifJP-Bold.ttf
static/NotoSerifJP-SemiBold.ttf
static/NotoSerifJP-Medium.ttf
static/NotoSerifJP-Light.ttf
static/NotoSerifJP-ExtraLight.ttf
static/NotoSerifJP-ExtraBold.ttf
static/NotoSerifJP-Black.ttf
static/NotoSerifJP-Regular.ttf
The main font is a variable one, so I tried it to see the results.
There are many ways to lay out text on a page. The most useful for general use is to create reusable text boxes.
# define it
my PDF::Content::Text::Box $tb .= new(
:$text, :$font, :$font-size, :$height,
# style it
:WordSpacing(5), # extra spacing for Eastern languages
);
#...clone it and modify it...
$tb.clone(:content-width(200));
Use the $page.text
context to print a box:
my @bbox;
$page.text: {
.text-position = $x, $y;
# print the box and collect the resulting bounding box coordinates
    @bbox = .print: $tb;
}
I won’t go into it very much, but you can do almost anything with PDF. The aforementioned PDF::NameTags
module has many routines for drawing and clipping. Another of my modules on deck is PDF::GraphicsUtils
which will encompass many similar routines as well as the published PDF::Document
module.
I was having trouble with the varying syntax needed for testing scripts as well as test modules. Raku user and author @librasteve suggested creating an alias to do that. The result from my .bash_aliases file:
alias r='raku -I.'       # run an ad hoc script with the current directory on the module search path
alias rl='raku -I./lib'  # run an ad hoc script with ./lib on the module search path
I was having problems with modules once and @ugexe suggested always using a load test to check that all modules will compile. Now I always create a test script similar to this:
# file: t/0-load-test.t
use Test;
my @modules = <
Font::Utils
Font::Utils::FaceFreeType
Font::Utils::Misc
Font::Utils::Subs
>;
plan +@modules;
for @modules {
use-ok $_, "Module $_ can be used okay";
}
And run it like this: r t/0*t
$ r t/0*t
# OUTPUT:
1..4
ok 1 - Module Font::Utils can be used okay
ok 2 - Module Font::Utils::FaceFreeType can be used okay
ok 3 - Module Font::Utils::Misc can be used okay
ok 4 - Module Font::Utils::Subs can be used okay
=finish
Use =finish
to debug a LONG rakumod file. I use it when I’m in the process of adding a new sub or modifying one and commit some error that causes a panic failure without a clear message. I add the =finish
after the first routine and see if it compiles. If it does I move the =finish
line to follow the next sub, and so on.
When I do get a failure, I have a better idea of where the bad code is. Note I do sometimes have to reorder routines because of inter-module dependencies. It’s also a good time to move completely independent routines to a lower-level module as described next.
Sometimes, as in Font::Utils
, I create a single, long file with routines that are dependent on other routines so that it is difficult to tell what is dependent upon what. Then I start creating another, lower-level module that has routines that are non-dependent on higher level modules. You can see that structure in the load test output shown above.
BEGIN phaser
Module authors often need access to the user’s $HOME
directory, so use of the BEGIN
phaser as a block can make it easier to access it from the Build
module, as well as the base modules. Here is that code from the Font::Utils
module:
unit module Font::Utils;
use PDF::Font::Loader :load-font;
our %loaded-fonts is export;
our $HOME is export = 0;
our $user-font-list is export;
our %user-fonts is export;
BEGIN {
if %*ENV<HOME>:exists {
$HOME = %*ENV<HOME>;
}
else {
die "FATAL: The environment variable HOME is not defined";
}
if not $HOME.IO.d {
die qq:to/HERE/;
FATAL: \$HOME directory '$HOME' is not usable.
HERE
}
my $fdir = "$HOME/.Font-Utils";
mkdir $fdir;
$user-font-list = "$fdir/font-files.list";
}
INIT {
if not $user-font-list.IO.r {
create-user-font-list-file;
}
create-user-fonts-hash $user-font-list;
create-ignored-dec-codepoints-list;
}
That BEGIN
block creates globally accessible variables and enables easy access for any build script to either create a new user fonts list or check an existing one. The INIT
block then uses a routine to create the handy %user-fonts
hash after the Raku compilation stage enables it.
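The ordering of those phasers is easy to check with a tiny stand-alone example (generic Raku, unrelated to Font::Utils):

say "runtime";
INIT  say "INIT: runs when execution starts";
BEGIN say "BEGIN: runs as soon as the compiler reaches it";
# OUTPUT:
# BEGIN: runs as soon as the compiler reaches it
# INIT: runs when execution starts
# runtime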
As @lizmat and others on IRC #raku always warn about global variables: danger lurks from possible threaded use. But our ‘use cases’ should not trigger such a problem. However, other routines may, and we have used a fancy module to help with the problem: OO::Monitors
by @jnthn. See it used to good effect in the class (monitor) Font::Utils::FaceFreeType
.
Currently the PDF standard doesn’t deal with active links, but my Raku friend David Warring gave me a solution in an email. I think I’ve tried it, and I think it works, but YMMV.
David said “Hi Tom, Adding a link can be done. It’s a bit lower level than I’d like and you need to know what to look for. Also needs PDF::API6
, rather than PDF::Lite
. I’m looking at the pdfmark documentation. It’s a postscript operator and part of the Adobe SDK. It’s not understood directly by the PDF standard, but it’s just a set of data-structures, so a module could be written for it. I’ll keep looking.”
His code:
# Example:
use PDF::API6;
use PDF::Content::Color :&color, :ColorName;
use PDF::Annot::Link;
use PDF::Page;
my PDF::API6 $pdf .= new;
my PDF::Page $page = $pdf.add-page;
$page.graphics: {
.FillColor = color Blue;
.text: {
.text-position = 377, 515;
my @Border = 0,0,0; # disable border display
my $uri = 'https://raku.org';
my PDF::Action::URI $action = $pdf.action: :$uri;
my PDF::Annot::Link $link = $pdf.annotation(
:$page,
:$action, # action to follow the link
:text("Raku homepage"), # display text (optional)
:@Border,
);
}
}
$pdf.save-as: "link.pdf";
App::Mi6 and Mi6::Helper
Because of the artistic nature of our work, we often need to create our own modules for new products. In that event, we use module App::Mi6
and its binary mi6
to ease the labor of managing recurring tasks with module development. By using a logical structure and a dist.ini
configuration file, the user can create Markdown files for display on GitHub or GitLab, test his or her test files in directories t
and xt
, and publish the module on ‘Zef/Fez’ with one command each.
By using my module Mi6::Helper
, the author can almost instantly create a new module’s git source repository with its structure ready to use with mi6, and with much boilerplate already complete.
I’m also working on a dlint
program to detect structural problems in the module’s git repository. It is a linter to check the module repo for the following:
- entries in the META6.json file match those in the module’s /resources directory (see the sketch after this list)
- use X statements have matching depends entries in the META6.json file

The linter will not correct these problems, but if there is interest, that may be added in a future release.
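As a rough illustration of the first kind of check (hypothetical code, not the actual dlint implementation), one could compare the resources declared in META6.json with the files actually present:

use JSON::Fast;
my %meta     = from-json "META6.json".IO.slurp;
my @declared = |(%meta<resources> // []);
# non-recursive; a real linter would also walk subdirectories of resources/
my @on-disk  = dir("resources").map(*.relative("resources"));
say "declared but missing on disk: ", (@declared (-) @on-disk).keys;
say "on disk but not declared    : ", (@on-disk  (-) @declared).keys;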
Neb concluded his presentation:
“Bottom line: Raku’s many PDF modules provide fairly easy routines to define any pre-press content needed. Those modules continue to be developed in order to improve ease of use and efficiency. Graphic arts have always been fun to me, but now they are even ‘funner!'”.
As I always end these jottings, in the words of Charles Dickens’ Tiny Tim, “may God bless Us, Every one!” A Christmas Carol, a short story by Charles Dickens (1812-1870), a well-known and popular Victorian author whose many works include The Pickwick Papers, Oliver Twist, David Copperfield, Bleak House, Great Expectations, and A Tale of Two Cities2.
In the smoke-filled (virtual) room of the council of the high (from the smoke) elves, the wizened textualist Geoff said “All of my stuff is in boxes and containers.” Empty shelves behind him indicated he was moving house.
“When you have a complex module,” Geoff continued, “and it’s difficult to describe how to install it, do all the steps in a container, and show the Dockerfile.”
“Aha!” said the newest member, who drew his vorpal sword, and went forth to slay the Jabberwock, aka putting RakuAST::RakuDoc::Render
into a container.
“Beware the Jabberwock, my son!
The jaws that bite, the claws that catch!”
After days of wandering the murky jungle of Docker/Alpine/Github/Raku documentation, the baffled elf wondered if he was in another fantasy:
“So many rabbit holes to fall down.”
Best practice for a container is to choose an appropriate base image. Well obviously, there’s the latest Raku version by the friendly, hard-working gnomes at Rakuland. So, here’s the first attempt at a Dockerfile:
FROM docker.io/rakuland/raku
# Copy in Raku source code and build
RUN mkdir -p /opt/rakuast-rakudoc-render
COPY . /opt/rakuast-rakudoc-render
WORKDIR /opt/rakuast-rakudoc-render
RUN zef install . -/precompile-install
It failed. After peering at the failure message, it seemed that at least one of the dependent modules used by the rakuast-rakudoc-render
distribution needs a version of make.
That’s easily fixed, just add in build-essentials
, the vorpal-sworded elf thought. Something like:
FROM docker.io/rakuland/raku
# Install make, gcc, etc.
RUN apt-get update -y && \
apt-get install -y build-essential && \
apt-get purge -y
# Copy in Raku source code and build
RUN mkdir -p /opt/rakuast-rakudoc-render
COPY . /opt/rakuast-rakudoc-render
WORKDIR /opt/rakuast-rakudoc-render
RUN zef install . -/precompile-install
Failure! No apt
.
“How can there not be APT??” the Ubuntu-using elf thought in shock. Turns out that the rakuland/raku
image is built on an Alpine base, and Alpine has its own package manager apk
.
Unfortunately, build-essential is a Debian package, but at the bottom of this rabbit hole lurks an apk
equivalent package build-base, leading to:
FROM docker.io/rakuland/raku
# Install make, gcc, etc.
RUN apk add build-base
# Copy in Raku source code and build
RUN mkdir -p /opt/rakuast-rakudoc-render
COPY . /opt/rakuast-rakudoc-render
WORKDIR /opt/rakuast-rakudoc-render
RUN zef install . -/precompile-install
Lo! upon using the Podman desktop to build an image from the Dockerfile, the process came to a successful end.
But now to make things easier, there needs to be a link to the utility RenderDocs
, which takes all the RakuDoc sources from docs/
and renders them to $*CWD
(unless over-ridden by --src
or --to
, respectively). It will also render to Markdown unless an alternative format is given.
FROM docker.io/rakuland/raku
# Install make, gcc, etc.
RUN apk add build-base
# Copy in Raku source code and build
RUN mkdir -p /opt/rakuast-rakudoc-render
COPY . /opt/rakuast-rakudoc-render
WORKDIR /opt/rakuast-rakudoc-render
RUN zef install . -/precompile-install
# symlink executable to location on PATH
RUN ln -s /opt/rakuast-rakudoc-render/bin/RenderDocs /usr/local/bin/RenderDocs
# Directory where users will mount their documents
RUN mkdir /doc
# Directory where rendered files go
RUN mkdir /to
WORKDIR /
AND!!! when a container was created using this Dockerfile and run
with its own terminal, the utility RenderDocs was visible. Running
RenderDocs -h
produced the expected output (listing all the possible arguments).
Since the entire distribution is included in the container, running
RenderDocs --src=/opt/rakuast-rakudoc-render/docs README
will render README.rakudoc in --src
to /to/README.md because the default output format is Markdown.
“Fab!”, screamed the boomer-generation newbie elf. “It worked”.
“Now lets try HTML”, he thought.
RenderDocs --format=HTML --src=/opt/rakuast-rakudoc-render/docs README
Failure: no sass.
“expletive deleted“, he sighed. “The Jabberwock is not dead!”
There are two renderers for creating HTML. One produces a single file with minimal CSS so that a normal browser can load it as a file locally and it can be rendered without any internet connection. This renderer is triggered using the option --single
. Which the containerised RenderDocs handles without problem.
But the normal use case is for HTML to be online, using a CSS framework and JS libraries from CDN sources. Since the renderer is more generic, it needs to handle custom CSS in the form of SCSS. This functionality is provided by calling an external program sass, which is missing in the container.
An internet search yields the following snippet for a container.
# install a SASS compiler
ARG DART_SASS_VERSION=1.82.0
ARG DART_SASS_TAR=dart-sass-${DART_SASS_VERSION}-linux-x64.tar.gz
ARG DART_SASS_URL=https://github.com/sass/dart-sass/releases/download/${DART_SASS_VERSION}/${DART_SASS_TAR}
ADD ${DART_SASS_URL} /opt/
RUN cd /opt/ && tar -xzf ${DART_SASS_TAR} && rm ${DART_SASS_TAR}
RUN ln -s /opt/dart-sass/sass /usr/local/bin/sass
The container image builds nicely, but the RenderDocs
command STILL chokes with an unavailable sass.
Except that diving into the container’s murky depths with an ls /opt/dart-sass/
shows that sass
exists!
The newbie was stumped.
So rested he by the Tumtum tree
And stood awhile in thought.
Turns out that the Alpine distribution uses a different compiler, and the wonderful dart-sass fae provide a suitable binary so a simple change was enough to get sass working in the container.
- ARG DART_SASS_TAR=dart-sass-${DART_SASS_VERSION}-linux-x64.tar.gz
+ ARG DART_SASS_TAR=dart-sass-${DART_SASS_VERSION}-linux-x64-musl.tar.gz
Simple does not mean found at once, but the container contains RenderDocs, which produces Markdown and HTML rendered files.
One, two! One, two! And through and through
The vorpal blade went snicker-snack!
He left it dead, and with its head
He went galumphing back.
“I can publish this image so everyone can use it,” the FOSS fanatic elf proclaimed.
So the docker container image can be accessed using a FROM
or PULL
using the URL
docker.io/finanalyst/rakuast-rakudoc-render
“And hast thou slain the Jabberwock?
Come to my arms, my beamish boy!
O frabjous day! Callooh! Callay!”
“It would be great,” mused the triumphant elf, “if RakuDoc sources, say for a README, could be automatically included as the github README.md of the repo”.
“Maybe as an action?”
Github actions can use containers to process files in a repo. Essentially, in an action, the contents of a repo are copied to a github-workspace, then they can be processed in the webspace, and changes to the workspace have to be committed and pushed back to the repository.
With a container, the contents of the workspace need to be made available to the container. Despite some documentation that starting a container in a github action automatically maps the github-workspace to some container directory, the exact syntax is not clear.
In order to discover how to deal with the multitude of possibilities, a new version of RenderDocs got written, and a new image generated, and again, and again … Unsurprisingly, between one meal and another, the ever hungry elf forgot which version was being tested.
“I’ll just include a --version
argument,” thought the elf. “I can ask the Super Orcs!”.
And there behold was an SO answer to a similar question, and it was written by no lesser a high elf than the zefish package manager ugexe, not to be confused with the other Saint Nick, the package deliverer.
Mindlessly copying the spell fragment into his CLI script as:
multi sub MAIN( :version(:v($v)) = False ) {
say "Using version {$?DISTRIBUTION.meta<version>} of rakuast-rakudoc-render distribution."
if $v;
}
the elf thought all done! “Callooh! Callay!”.
Except RenderDocs -v
generated Any.
“SSSSSSSSh-,” the elf saw the ominous shadow of Father Christmas looming, “-ine a light on me”.
On the IRC channel, the strong-willed Coke pointed out that a script does not have a compile time variable such as $?DISTRIBUTION. Only a module does.
The all-knowing elf wizard @lizmat pointed out that command line scripts should be as short as possible, with the code in a module that exports a &MAIN
.
Imbibing this wisdom, our protagonist copied the entire script contents of bin/RenderDocs
to lib/RenderDocs.rakumod
, added a proto sub MAIN(|) is export { {*} }
, then made a short command line script with just use RenderDocs
.
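The wrapper script then ends up being no more than this sketch (the actual bin/RenderDocs may differ), with all the MAIN candidates, and the $?DISTRIBUTION lookup, living in the module:

#!/usr/bin/env raku
# bin/RenderDocs: just load the module; its exported MAIN candidates handle the CLI
use RenderDocs;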
Inside the container terminal:
# RenderDocs -v
Using version 0.20.0 of rakuast-rakudoc-render distribution.
With that last magical idiom, our intrepid elf was ported from one rabbit hole back to the one he had just fallen down.
“Beware the Jubjub bird, and shun
The frumious Bandersnatch!”
“I seem to be going backwards,” our re-ported elf sighed.
Once again, the github documentation was read. After much study and heartbreak, our hero discovered a sequence that worked:
- a directory docs containing a file README.rakudoc
- a file CreateDocs.yml in the .github/workflows/ directory

The content of CreateDocs.yml:
name: RakuDoc to MD
on:
# Runs on pushes targeting the main branch
push:
branches: ["main"]
# Allows you to run this workflow manually from the Actions tab
workflow_dispatch:
jobs:
container-job:
runs-on: ubuntu-latest
steps:
- name: Checkout code
uses: actions/checkout@master
with:
persist-credentials: false
fetch-depth: 0
- name: Render docs/sources
uses: addnab/docker-run-action@v3
with:
image: finanalyst/rakuast-rakudoc-render:latest
registry: docker.io
options: -v ${{github.workspace}}/docs:/docs -v ${{github.workspace}}:/to
run: RenderDocs
After examining the github actions logs, it seemed the rendered files were created, but the repository was not changed.
“Perhaps I should have used milk and not cream …” thought our fantasy elf.
There is in fact a missing step, committing and pushing from the github-workspace back to the repository. This can be done by adding the following to CreateDocs.yml
:
- name: Commit and Push changes
uses: Andro999b/[email protected]
with:
github_token: ${{ secrets.GITHUB_TOKEN }}
branch: 'main'
Even this did not work! Github refused absolutely to write changes to the repository.
The weary elf substituted Lemon grass for Lavender in step 1, and just to be certain changed the repo settings following the instructions from the Github grimoire.
The content – at this stage of the tale – of CreateDocs.yml
is
name: RakuDoc to MD
on:
# Runs on pushes targeting the main branch
push:
branches: ["main"]
# Allows you to run this workflow manually from the Actions tab
workflow_dispatch:
jobs:
container-job:
runs-on: ubuntu-latest
steps:
- name: Checkout code
uses: actions/checkout@master
with:
persist-credentials: false
fetch-depth: 0
- name: Render docs/sources
uses: addnab/docker-run-action@v3
with:
image: finanalyst/rakuast-rakudoc-render:latest
registry: docker.io
options: -v ${{github.workspace}}/docs:/docs -v ${{github.workspace}}:/to
run: RenderDocs
- name: Commit and Push changes
uses: Andro999b/[email protected]
with:
github_token: ${{ secrets.GITHUB_TOKEN }}
branch: 'main'
It worked. “The Christmas present is now available for anyone who wants it”, thought our elf.
’Twas brillig, and the slithy toves
Did gyre and gimble in the wabe:
All mimsy were the borogoves,
And the mome raths outgrabe.
(Jabberwocky, By Lewis Carroll)
Remember to git pull
for the rendered sources to appear locally as well.
“Wouldn’t it be nice to wrap the present in a ribbon? Why not put diagrams in the Markdown file? “
Our elf was on a streak, and fell down another rabbit hole: github does not allow svg in a Markdown file it renders from the repo. “It is impossible,” sighed the tired elf.
Alice laughed. “There’s no use trying,” she said: “one can’t believe impossible things.”
“I daresay you haven’t had much practice,” said the Queen. “When I was your age, I always did it for half-an-hour a day. Why, sometimes I’ve believed as many as six impossible things before breakfast.”
(Through the Looking Glass, Lewis Carroll)
Diagrams can be created using the dot
program of Graphviz, which is a package that Alpine provides. So, we can create a custom block for RakuAST::RakuDoc::Render that takes a description of a graph, sends it to dot
, gets an svg file back and inserts into the output.
Except: github will not allow svg directly in a markdown file for security reasons.
But: it will allow an svg in a file that is an asset on the repo. So, now all that is needed is to save the svg in a file, reference the file in the text, and copy the asset to the same directory as the Markdown text.
Except: the time stamps on the RakuDoc source files and the output files seem to be the same because of the multiple copying from the repo to the actions workspace to the docker container. So: add a --force
parameter to RenderDocs
.
So in Raku impossible things are just difficult.
The final content of CreateDocs.yml
is now
name: RakuDoc to MD
on:
# Runs on pushes targeting the main branch
push:
branches: ["main"]
# Allows you to run this workflow manually from the Actions tab
workflow_dispatch:
jobs:
container-job:
runs-on: ubuntu-latest
steps:
- name: Checkout code
uses: actions/checkout@master
with:
persist-credentials: false
fetch-depth: 0
- name: Render docs/sources
uses: addnab/docker-run-action@v3
with:
image: finanalyst/rakuast-rakudoc-render:latest
registry: docker.io
options: -v ${{github.workspace}}/docs:/docs -v ${{github.workspace}}:/to
run: RenderDocs --src=/docs --to=/to --force
- name: Commit and Push changes
uses: Andro999b/[email protected]
with:
github_token: ${{ secrets.GITHUB_TOKEN }}
branch: 'main'
Try adding a graph to a docs/README.rakudoc in a repo, for instance:
=begin Graphviz :headlevel(2) :caption<Simple example>
digraph G {
main -> parse -> execute;
main -> init;
main -> cleanup;
execute -> make_string;
execute -> printf
init -> make_string;
main -> printf;
execute -> compare;
}
=end Graphviz
Now you will have a README with an automatic Table of Contents, all the possibilities of RakuDoc v2, and an index at the end (if you indexed any items using X<>
markup).
(Sigh: All presents leave wrapping paper! A small file called semicolon_delimited_script is also pushed by github’s Commit and Push to the repo.)
Will Coleda and Justin DeVuyst have co-produced the eleventh Rakudo compiler release of 2024: 2024.12, with quite a few fixes and some minor improvements. Binary packages will become available shortly, as well as updates to Rakudo Star, if they are not already. This brings the effective bus factor for people able to do a Rakudo release to 2! Kudos to all involved!
The 7 new blog posts in the 2024 Raku Advent Calendar the past week:
There’s no stopping Anton this month: apart from again writing an advent blog post, Anton Antonov also published set 3 of Graph Neat Examples in Raku in which quite a few graphs are nicely visualized.
Weekly Challenge #300 is available for your perusal, a milestone! Kudos to Mohammad S. Anwar!
- Blin before the 2024.12 release, and added a VM.own-up method and an “own-up” module
- is revision-gated trait

A release, cool blog posts, so many modules salvaged from oblivion: quite a few presents! Meanwhile, still: Слава Україні! Героям слава!
Please keep staying safe and healthy, and keep up the good work!
If you like what I’m doing, committing to a small sponsorship would mean a great deal!
In this post I am showcasing some examples of using the Raku programming language to create and visualize graphs. These examples are part of a series of neat examples, which are concise or straightforward code snippets that produce compelling visual or textual outputs.
(And, hopefully, we know something is neat when we see it…)
All computational graph features discussed here are provided by the “Graph” module. The graphs are visualized using D3.js (via “JavaScript::D3”) and Graphviz DOT (via Graph.dot
), providing both interactive and static visualizations.
Here is the link to the related presentation recording “Graph neat examples in Raku (Set 3)”:
The setup for these examples is the same as in the first two sets, so it is skipped here.
Let us look first into nested graphs.
A nested graph is created by repeatedly applying a function to an initial value.
Here’s a basic example:
#% html
my $g1 = Graph::Nested.new({$_ ** 2}, 2, 3, :directed);
my $g2 = Graph::Nested.new({"f($_)"}, 'x', 3, :directed);
$g1.dot(:$background, engine => 'dot', vertex-shape => 'ellipse', vertex-width => 0.75, :svg)
~
$g2.dot(:$background, engine => 'dot', vertex-shape => 'ellipse', vertex-width => 0.75, :svg)
In this example, we start with a value and apply a function to it multiple times. The graph visually represents these nested applications.
We can extend this concept by using two functions, creating a binary tree structure:
my $g = Graph::Nested.new({["f($_)", "g($_)"]}, 'x', 3, :directed)
# Graph(vertexes => 15, edges => 14, directed => True)
#%html
$g.dot(node-width => 0.75, node-height => 0.3, node-shape => 'rect', engine => 'dot'):svg
This creates a more complex graph where each node has two children, representing the application of two different functions.
The mod graph example uses arithmetic operations to create a directed graph:
my $g = Graph::Nested.new({($_.Int ** 2 + 1) mod 10}, ^10, :directed)
# Graph(vertexes => 10, edges => 10, directed => True)
#%html
$g.dot(node-width => 0.3, node-height => 0.3, node-shape => 'circle', engine => 'sfdp'):svg
This graph shows how numbers transform through a series of operations, resulting in a directed cycle.
Finally, we have a range graph, which uses an integer range to create a graph:
my $g = Graph::Nested.new({^$_}, '9', 2, :directed)
# Graph(vertexes => 10, edges => 45, directed => True)
#%html
$g.dot(node-width => 0.4, node-height => 0.4, node-font-size => 18, engine => 'dot', size => (6, 6)):svg
This graph illustrates the concept of completeness and allows us to compute in-degrees and out-degrees of vertices.
$g.undirected-graph.is-complete
# True
say 'in : ', $g.vertex-in-degree(:pairs);
say 'out : ', $g.vertex-out-degree():p;
# in : (0 => 9 1 => 8 2 => 7 3 => 6 4 => 5 5 => 4 6 => 3 7 => 2 8 => 1 9 => 0)
# out : (0 => 0 1 => 1 2 => 2 3 => 3 4 => 4 5 => 5 6 => 6 7 => 7 8 => 8 9 => 9)
Next, let’s visualize file system structures as graphs. We’ll start by getting a list of file paths from a directory:
my @paths = paths($*HOME ~ "/MathFiles/RakuForPrediction");
my @paths2 = @paths>>.subst($*HOME.Str)>>.split("/", :skip-empty);
@paths2.elems
# 183
We then create graph edges from these paths:
my @edges = @paths2.map({ $_.rotor(2 => -1).map({ $_.head => $_.tail }) }).map(*.Slip).unique(:as({.Str}));
my $g = Graph.new(@edges, :directed)
# Graph(vertexes => 188, edges => 188, directed => True)
And visualize the graph using Graphviz:
#%html
my $preamble = q:to/END/;
fontcolor = "Ivory";
fontsize = "12";
labelloc = "t";
label = "Directory paths";
graph [size="8,8!"];
bgcolor="none";
node [style=filled, fixedsize=true, shape=circle, color="Black", fillcolor="SlateBlue", penwidth=1, fontsize=4, fontcolor="Gray", labelloc=c, width=0.08, height=0.08];
edge [color="SteelBlue", penwidth=0.6, arrowsize=0.4];
END
$g.dot(:$preamble, engine=>'twopi'):svg;
Finally, we’ll create a larger graph from the Raku documentation files:
my @paths = paths($*HOME ~ "/Downloads/doc/doc");
my @paths2 = @paths>>.subst($*HOME.Str)>>.split("/", :skip-empty);
my @edges = @paths2.map({ $_.rotor(2 => -1).map({ $_.head => $_.tail }) }).map(*.Slip).unique(:as({.Str}));
my $g2 = Graph.new(@edges, :directed)
# Graph(vertexes => 476, edges => 510, directed => True)
And plot it using D3.js:
#%js
$g2.edges ==>
js-d3-graph-plot(
width => 1100,
height => 400,
:$background,
:$title-color,
vertex-size => 3,
vertex-fill-color => 'SlateBlue',
vertex-label-font-size => 10,
vertex-label-color => 'none',
edge-thickness => 1,
directed => $g2.directed,
force => {charge => {strength => -50, iterations => 1}, y => {strength => 0.4}, collision => {radius => 2, iterations => 1}, link => {distance => 1}}
)
Same graph, but with vertex names:
This visualization demonstrates the structure of the Raku documentation as a graph, offering insights into its organization.
Thank you for joining me in exploring these neat examples of graph visualization in Raku. I hope you will use Raku’s graph-functionalities a lot and often!
Search engine DuckDuckGo has donated $25,000 to the Foundation, it being one of the organizations that share DuckDuckGo’s vision of raising the standard of trust online (HackerNews comments). A welcome addition to the foundation’s treasury! Should you (or your organization) be interested in donating, you should check out the prospectus!
The community supported plugin-version of Comma (the Raku IDE) has reached the first beta-release, thanks to the work of John Haltiwanger. User response has been very positive! Note that this community supported plugin provides all of the Comma features, including the ones that were previously only available in the paid version.
As was to be expected, 7 new blog posts in the 2024 Raku Advent Calendar the past week:
Apart from writing an advent blog post, Anton Antonov also wrote another blog post about neat graph examples in Raku as well as created a video about chess positions and knight’s tours via graphs (in Raku)!
Weekly Challenge #299 is available for your perusal.
- VM.ownup to produce backtraces of all threads and exit
- “no-remote-debugging” module
- nativecallcast
- QuantHashes infinilooping on .Str / .gist / .raku
- Hashes infinilooping on .Str
- Failures
- $*USER / $*GROUP produce sensible values on Windows (as much as possible)
- .trans method issues, with efficiency improvements of 5% to 300% for some use-cases so far
- Math::Matrix made it easy by Humberto Massa
- Set keys by seaker

So much to read and catch up on! Good stuff. Truly a Holiday Season! Meanwhile, still: Слава Україні! Героям слава!
Please keep staying safe and healthy, and keep up the good work!
If you like what I’m doing, committing to a small sponsorship would mean a great deal!
In this blog post, I will walk you through some neat examples of graph manipulation and visualization using Raku. These examples showcase the capabilities of Raku and its modules in handling graph-related tasks.
All computational graph features discussed here are provided by the “Graph” module. The graphs are visualized using D3.js (via “JavaScript::D3”) and Graphviz DOT (via Graph.dot
), providing both interactive and static visualizations.
What is a neat example?
Concise or straightforward code that produces compelling visual or textual outputs. In this context, neat examples:
- Showcase Raku programming.
- Use functionalities of different Raku modules.
- Give interesting perspectives on what is computationally possible.
Here is the link to the related presentation recording “Graph neat examples in Raku (Set 2)”:
The setup for these examples is the same as in the first set, so it is skipped here.
Here is a triangular grid graph:
#% js
use Graph::TriangularGrid;
use JavaScript::D3;
my $g = Graph::TriangularGrid.new(4, 4);
my @highlight = ($g.vertex-list Z=> $g.vertex-degree).classify(*.value).map({ $_.value».key });
js-d3-graph-plot( $g.edges(:dataset),
:@highlight,
background => 'none',
edge-thickness => 3,
vertex-size => 10,
vertex-label-color => 'none',
width => 1000,
height => 300,
margins => 5,
edge-color => 'SteelBlue',
force => {charge => {strength => -300}, y => {strength => 0.2}, link => {minDistance => 4}}
)
Here are the corresponding adjacency- and incidence matrices:
#% js
use Math::SparseMatrix;
use Data::Reshapers;
my ($amat, $imat) = $g.adjacency-matrix, $g.incidence-matrix;
my %opts = grid-lines => {width => 0.25, color => 'DimGray'}, title-color => 'Gray', color-palette => 'Greys', :!tooltip, background => 'White', height => 300;
js-d3-matrix-plot($amat, plot-label => 'Adjacency matrix', |%opts, width => 300+80, margins => {:2left, :40top, :2bottom, :80right})
~ "\n" ~
js-d3-matrix-plot($imat, plot-label => 'Incidence matrix', |%opts, width => 600, margins => {:2left, :40top, :2bottom, :2right})
Let’s start with a simple example: bipartite graph coloring. Although this might seem basic, it’s a good warm-up exercise. Here, we’re creating a grid graph.
my $gg = Graph::Grid.new(6, 16);
The method is-bipartite
checks if the graph is bipartite, meaning it can be colored using two colors such that no two adjacent vertices share the same color.
$gg.is-bipartite
# True
We can show the bipartite coloring of the grid graph using the following code:
$gg.bipartite-coloring
# {0_0 => 1, 0_1 => 0, ...
To prepare the graph for highlighting, we classify the vertices based on their bipartite coloring:
my %highlight = <SlateBlue Orange> Z=> $gg.bipartite-coloring.classify(*.value).nodemap(*».key).values;
.say for %highlight
# SlateBlue => [5_14 ...]
# Orange => [3_13 ...]
Finally, we plot the grid graph as a bipartite graph:
#%js
$gg.edges(:dataset) ==>
js-d3-graph-plot(
:%highlight,
vertex-coordinates => $gg.vertex-coordinates,
background => '#1F1F1F',
title-color => 'Silver',
edge-thickness => 5,
vertex-size => 12,
vertex-label-color => 'none',
directed => $gg.directed,
title => 'Grid graph bipartite coloring',
width => 1000,
height => 400,
margins => {top => 80},
force => {charge => {strength => -300}, x => {strength => 0.12}, link => {minDistance => 4}}
)
In this example, we create a grid graph with randomly directed edges and find its connected components.
my $g = Graph::Grid.new(10, 20, :!directed);
my $g2 = $g.directed-graph(method => 'random');
Finding connected components in a directed graph is more complex than in an undirected graph. The connected components are the subsets of the graph where each vertex is reachable from any other vertex in the same subset.
my @components = $g2.connected-components.grep(*.elems - 1);
@components».elems;
We highlight these connected components in the graph using:
#% js
$g2.edges(:dataset) ==>
js-d3-graph-plot(
vertex-coordinates => $g2.vertex-coordinates,
highlight => @components,
directed => $g2.directed,
background => '#1F1F1F',
title-color => 'Silver',
width => 1000,
height => 500,
vertex-label-color => 'none',
vertex-fill-color => 'SteelBlue',
edge-thickness => 3,
edge-color => 'Gray',
vertex-size => 14,
arrowhead-size => 4,
force => {charge => {strength => -160, iterations => 2}, collision => {radius => 1, iterations => 1}, link => {minDistance => 1}}
)
We can also visualize the graph using the Graphviz DOT language:
my $g3 = Graph::TriangularGrid.new(8, 16, scale => 0.3, :!directed);
$g3 = $g3.directed-graph(method => 'random', flip-threshold => 0.25);
#% html
$g3.dot(
highlight => $g3.connected-components.grep(*.elems - 1),
:!node-labels,
node-shape => <hexagon triangle>.pick,
node-height => 0.7,
node-width => 0.7,
edge-thickness => 4,
edge-color => 'Gray',
size => '10,6!',
engine => 'neato'
):svg
Remark: Note that the shape of graph vertices (nodes) is randomly selected.
Remark: The method .dot
takes graph vertex styling options with both prefixes “node-” and “vertex-“.
Graphviz DOT uses “node” in its specs.
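For example, per that remark, these two calls should be interchangeable (the styling values here are arbitrary, just for illustration):
$g3.dot(node-shape => 'circle', :!node-labels, engine => 'neato'):svg;
$g3.dot(vertex-shape => 'circle', :!vertex-labels, engine => 'neato'):svg;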
In this final example, we create a collage of star graphs.
my @graphs = (^40).map({ Graph::Star.new(n => (8..16).pick, prefix => "$_-") });
my $bigGraph = Graph.new( @graphs.map(*.edges).flat )
We can’t use the graph disjoint union because we need to keep the vertex prefixes for random highlights. Instead, we use a normal union:
my $bigGraph = reduce({ $^a.union($^b) }, |@graphs)
To get a range of colors, we use:
my @colors = (^14).map: { sprintf "#%02X%02X%02X", 250 - $_*10, 128 - $_*5, 114 + $_*10 };
Finally, we plot the collage of graphs:
#%js
$bigGraph.edges(:dataset) ==>
js-d3-graph-plot(
highlight => (@colors Z=> @graphs.pick(@colors.elems).map({ [|$_.vertex-list, |$_.edges] })).Hash,
background => '#1F1F1F',
title-color => 'Silver',
edge-thickness => 2,
vertex-stroke-color => 'LightSteelBlue',
vertex-size => 8,
vertex-label-color => 'none',
directed => False,
title => 'Collage graph',
width => 1200,
height => 700,
force => {charge => {strength => -40}, y => {strength => 0.2}, collision => {radius => 12}, link => {minDistance => 4}}
)
#% html
my $preamble = q:to/END/;
label = "Collage graph";
labelloc = "b";
fontcolor = "Gray";
fontsize = 42;
bgcolor = "#1F1F1F";
graph [size="9,9!"];
node [label="", shape=circle, style="filled", fillcolor="SteelBlue", color="Ivory", penwidth=3, width=0.65, height=0.65];
edge [color="SteelBlue", penwidth=4]
END
$bigGraph.dot(
highlight => (@colors Z=> @graphs.pick(@colors.elems).map({ [|$_.vertex-list, |$_.edges] }) ).Hash,
:$preamble,
engine => 'neato'):svg
These examples demonstrate the power and flexibility that Raku currently has for graph manipulation and visualization. I would say the dark-mode, dynamic JavaScript plot of the star graphs collage is one of the prettiest graph plots I have created! (Just random, not tied to anything significant…)
As I reported recently, there seems to be a mutual acceptance of raku and perl in the wake of the name change and a bit of time passing:
This is a reproduction of the talk I gave today at the awesome London Perl & Raku Workshop. I met a couple of cool rakuteers and enjoyed the relaxed congregation of perl and raku folk. Seems like the anger has subsided not least thanks to the name change from perl6 to raku. Thanks to all the sponsors, organisers and attendees.
One of the questions I was asked at the workshop by some professional perl coders was “what is Raku and why should I use it?”. I was quite flustered by the question (despite having prepared for it the day before).
OK – I’m gonna unpack that last statement a bit. By best, I mean that raku is still not really used at scale or for large projects by large teams. There are many things to recommend raku for these kinds of use cases – concurrency, strong types (optional), role-based composition, unicode and so on. Stability, bug stomping and performance are improving all the time.
But, in addition to great features, any innovative language has to be proven before a real business will be happy to commit to it as a fundamental technical building block. Businesses are, rightly, risk averse. They will consider aspects such as availability of skilled staff, core team bus factor, ecosystem health, etc. So best, for me, means most practical, most likely to succeed, most added value when real-world constraints are applied.
I run a web design consultancy – we focus on WordPress and other PHP-based technologies. Have I rewritten WordPress in raku? No indeed. But, since I am an all-in rakuteer, I have effectively used raku to streamline our business processes:
- rawp setup && rawp launch && rawp renewal. This short command chain performs nginx install, database install, WordPress install, TLS certificate generation, cron for certificate renewal – yes, all that
- rawm migrate (and a suitable yaml config)

Why do I write and share these modules as FOSS?
I like to code in raku – it is truly a pleasure. When I assemble and prove out a set of CLI commands to do a task, I am thinking “this is cool, how can I capture this recipe and run it automatically every time” (ie. a way to remember what works and refine it). And I hope that by sharing, others will be able to benefit from these potted scripts and may wish to extend and refine them in turn.
In common with the other raku modules listed above, this one works like this:
- cat xxx.pl | perl to run it

CLI::Wordpress::Migrator is a script to migrate a WordPress site from one server to another. It performs export (backup), install, import (restore) and search-replace steps according to the configuration in ~/.rawm-config/rawm-config.yaml.
This module installs the raku rawm
command for use from the local client command line. (RAku WordPress Migrator).
The process involves three systems:
- the local client where the rawm command is run
- the from server which is running the source site
- the to server which is ready for the migrated site

Here’s the raku MAIN usage:
> rawm
Usage:
./rawm [--ts=<Str>] [--backup-only] [--download-only] [--upload-only] [--restore-only] [--cleanup-only] [--dry-run] <cmd>
<cmd> One of <connect export install import search-replace migrate>
--ts=<Str> Enter timestamp of files [Str] eg. --ts='20241025-17-02-42'
--backup-only Only perform remote backup
--download-only Only perform download (requires timestamp [--ts])
--upload-only Only perform upload (requires timestamp [--ts])
--restore-only Only perform restore (requires timestamp [--ts])
--cleanup-only Only perform remote cleanup
--dry-run Do not perform replace
Here’s the (sanitised) yaml:
from:
user: myusername
subdom: sdname
domain: mydomain.com
key-pub: kpname
port: 22
to:
user: myusername
subdom: sdname
domain: mydomain.com
key-pub: kpname
port: 22
wp-config:
locale: en_GB
db:
name: dbname
user: dbuser
pass: dbpass
prefix: wp_
title: My New WP Installation
url: mysite.com
admin:
user: aduser
pass: adpass
email: ademail
Here’s the perl code to perform the remote backup:
method perl {
my $code = q:to/END/;
#!/usr/bin/perl
use strict;
use warnings;
print "Doing remote backup at %NOW%\n";
`wp db --path='../%WP-DIR%' export %BU-DB-FN%`;
`tar -czf %BU-FS-FN% ../%WP-DIR%/wp-content`;
END
$code ~~ s:g/'%NOW%' /{ $.timestamp}/;
$code ~~ s:g/'%BU-DB-FN%'/{ $.bu-db-fn }/;
$code ~~ s:g/'%BU-FS-FN%'/{ $.bu-fs-fn }/;
$code ~~ s:g/'%WP-DIR%' /{ $.server.wp-dir }/;
$code
}
And here’s the raku code that runs it:
method backup {
my $s := $.server;
my $proc = Proc::Async.new:
:w, qqw|ssh -p { $s.port } -tt -i { $s.key-path } { $s.login }|;
my $promise = $proc.start;
$proc.say("mkdir { $s.tp-dir }");
$proc.say("cd { $s.tp-dir }");
$proc.say("echo \'{ $.perl }\' > exporter.pl");
$proc.say('cat exporter.pl | perl');
sleep 30;
$proc.say("exit");
await $promise;
}
This is a snippet of the source code at https://github.com/librasteve/raku-CLI-Wordpress-Migrator if you want to see the full story.
You are welcome to review it all, fork, PR any changes you would like and use it to manage your WordPress estate!
If you have been paying close attention to my raku module history you will know that often I have the opportunity to install raku on the remote machine and to run that for various tasks (i.e. raws-ec2 setup
). But in the case of Migrator
the idea is to back up and restore remote machines hosted by a third-party firm running cPanel, with tight restrictions on installing non-standard languages. Happily, as with pretty much all modern Linux installs, their hosted systems are preloaded with perl. So automating the process of logging on, saving a generated perl file on the remote system and running it is very widely applicable.
Meantime, driving this with raku is a very natural choice. Raku features that facilitate this include qqx (and qqw), Proc::Async and q:to heredocs.
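For instance, a trivial, made-up illustration of qqx: the braces delimit a double-quote-style string that is interpolated and then run as a shell command, with the command's output returned as a Str.
my $dir     = '/tmp';
my $listing = qqx{ls -l $dir};     # interpolates $dir, runs the command, captures stdout
say $listing.lines.elems, " entries";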
Ultimately, technically, perl and raku are very complementary – combining the ubiquity of perl with the expressivity of raku to produce a practical outcome. And both have a familiar look and feel…
As usual, comments and feedback very welcome!
~librasteve
The other day, not one but two people tried to use my rak
module to create a custom file system search utility. And had problems getting it to work.
Now, truth be told: they were the first people to use that module other than myself for the App::Rak
command line interface (as described in It's time to rak!). And apparently, the use of the plumbing of App::Rak
was less straightforward than I expected, specifically with regards to the way results are returned.
It wasn't until a bit later that I realized they were reaching for the wrong tool. What they really wanted was to apply some search criteria to a list of files, as determined by some simple rules that weren't covered (yet) by App::Rak
.
Well, there's a module for that: paths
, a fast recursive file / directory finder. Which is one of the dependencies of rak
. And thus of App::Rak
.
All of the code examples here assume you have also added a
use paths;
to your code. And if the paths module is not installed yet, you should install it with zef install paths in your shell.
So how do you use it?
.say for paths;
will produce a list of all files from the current directory (recursively down), except the ones that reside in a directory that starts with a period (so, e.g. all the files and subdirectories in a .git
directory would be skipped).
So, what if you would like to get all JSON-files (as defined by their .json
extension)?
.say for paths(:file(*.ends-with(".json")));
What if you'd like to list all files in ".git" directories?
.say for paths(:dir(".git"));
Or you're just interested in directory names, not in the files inside directories?
.say for paths(:!file);
The :!file
indicates that you're not interested in files. This is Raku's way of specifying the named argument "file" with a False
value. Some would write this as file => False
, which would also work in Raku.
All of the above examples assumed the current directory as a starting place. The paths
subroutine also takes an optional positional parameter: the directory from which to start. So if you want to know all of the directories on your computer, you could start from the root directory:
.say for paths("/", :!file);
This may take a while!
The Raku Programming Language has the IO::Path
object, which conceptually consists of a volume, a directory, and a basename. It supports both purely textual operations, and operations that access the filesystem, e.g. to resolve a path, or to read all the content of a file.
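For instance (a quick illustration with a made-up path; these particular methods are purely textual, so the file does not need to exist):
my $path = "/home/alice/notes/todo.json".IO;
say $path.volume;     # usually an empty string on POSIX systems
say $path.dirname;    # /home/alice/notes
say $path.basename;   # todo.json
say $path.extension;  # json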
Unfortunately, creating such an object is relatively expensive, so paths
has chosen to just provide absolute paths as strings. If you want to work with IO::Path
objects, the only thing that needs to be done, is to call the .IO
method on the path.
For instance, if you would like to know the name of each file that contains the string "frobnicate", you could do:
.say for paths.grep: *.IO.slurp.contains("frobnicate");
The .IO
method call turns the path string into an IO::Path
object, the .slurp
method call reads the whole contents of the file into memory as a string assuming UTF-8 encoding, and the .contains
returns True
if the given string was found in its invocant.
If you're surprised by the *.IO... syntax: that is called Whatever priming. In this case, the syntax is short for { .IO.slurp.contains("frobnicate") }.
Now, if you do that, there's a good chance that this will end in an execution error, something like Malformed UTF-8 near byte 8b at line 1 col 2
. That's because there's a good chance that at least one of the files is a binary file. Which is generally not valid UTF-8.
You could just ignore those cases with:
.say for paths.grep: { .contains("frobnicate") with .IO.slurp }
The slurp
method will return a Failure
if it couldn't complete the reading of the file. The with
then will only topicalize the value if it got something defined (and Failure
s are considered to not be defined in this context). Then the contains
method is called as before and we get either True
or False
from that.
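Here is a standalone illustration of that with / Failure pattern (the file name is made up and will generally not exist):
my $content = "/no/such/file.txt".IO.slurp;   # returns a Failure rather than throwing outright
with $content {
    say "read {.chars} characters";
}
else {
    say "could not read the file";            # this branch runs, and the Failure counts as handled
}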
But doing it this way may just be a little expensive resource wise. If resource usage is an issue for your program, then maybe there's a better way to find out whether something contains text or binary information. And there is: with the sister module path-utils
.
The path-utils
module contains 41 subroutines that take a path string and then perform some check on that path. Let's look at path-is-text
: "Returns 1 if path looks like it contains text, 0 if not".
use path-utils <path-is-text>;
.say for paths.grep: { path-is-text($_) && .IO.slurp.contains("frobnicate") }
But what if you'd only like to look up texts in PDF files? Well, the selection part can be done efficiently by path-utils
as well, with the path-is-pdf
subroutine.
use path-utils <path-is-pdf>;
.say for paths.grep: { path-is-pdf($_) }
but that would only show the files that appear to be PDF files. To actually search in them, you could for instance use Steve Roe's PDF::Extract
module.
use path-utils <path-is-pdf>;
use PDF::Extract;
.say for paths.grep: { path-is-pdf($_) && Extract.new(:file($_)).text.contains("frobnicate") }
It is always important to really understand the question, and to ask further if you don't understand the question. And make sure that the question askers understand your reply. And keep repeating that until you and the question asker are on the same page.
In this case, pointing these two Raku developers to the paths
module, made their project suddenly (almost) a piece of cake.
And for me, it was a fine reason to highlight these cool modules in the Raku ecosystem.
If you like what I'm doing, committing to a small sponsorship would mean a great deal to me!
So, let's say you're a golang developer and want to use pure Go to write some CICD task:
cat task.go
package main
import "fmt"
func main() {
fmt.Println("Hello, pipeline")
}
Go is cool, but there is one thing that makes it difficult to use in high level scenarios - its verbosity. Passing parameters to Go tasks and returning them back to the main scenario takes some effort and a lot of boilerplate code. It'd be good to keep the main code concise and easy to read.
Raku, on the other hand, is a perfect language when it comes to munging data in and out, due to its extreme flexibility and expressiveness.
In this post I am going to show how to embed golang tasks into CICD pipelines, with a little help from the Sparrow framework.
Let's, first of all, modify our golang task code; the new version will be:
cat task.go
package main
import (
"fmt"
"github.com/melezhik/sparrowgo"
)
func main() {
type Params struct {
Message string
}
type Result struct {
Message string
}
var params Params
// sparrowgo takes care about passing
// input parameters from Raku to Go
sparrowgo.Config(&params)
// read params from pipeline
fmt.Printf("Task params: %s\n", params.Message)
// return results back to pipeline
sparrowgo.UpdateState(&Result{Message: "Hello from Go"})
}
All we've done here is use the Sparrowgo package, which "converts" a golang task into a Sparrow task, with the benefit of passing data in from and returning data back to Raku.
Finally, this is how our pipeline will look; here is the Raku part:
#!raku
my $s = task-run ".", %(
:message<Hello from Raku>
);
say "Result: ", $s<Message>;
High level design.
Now, once we have some proof of concept code in place, we can get a high level picture of what our pipeline system could look like:
[ Raku scenario to pass and handle data in and out ]
\ \ \
task.go -> result -> task.go -> result -> task.go -> ...
So, we have the best of both worlds - Raku to write scenarios with less code and Golang to do all the heavy lifting where performance and strict type checking are required.
I published my new book: A Language a Day, which is a collection of brief overviews of 21 programming languages.
This book provides a concise overview of 21 different programming languages. Each language is introduced using the same approach: solving several programming problems to showcase its features and capabilities. Languages covered in the book: C++, Clojure, Crystal, D, Dart, Elixir, Factor, Go, Hack, Hy, Io, Julia, Kotlin, Lua, Mercury, Nim, OCaml, Raku, Rust, Scala, and TypeScript.
Each chapter covers the essentials of a different programming language. To make the content more consistent and comparable, I use the same structure for each language, focusing on the following mini projects:
Each language description follows—where applicable—this pattern:
You can find all the code examples in this book on GitHub: github.com/ash/a-language-a-day.
You can buy it on Amazon or LeanPub as an electronic or Kindle edition, or as a paper hardcover or paperback version. More information with the links to the shops.
While going through all currently open Rakudo issues (as described in Raku Fall Issue Cleanup), I was reminded of a project that I started in April 2020 to make the semantics of the ...
aka "sequence operator" more sane and more performant. And that I came to the conclusion that continuing that project (within the confines of Rakudo) would be a losing battle.
So I created a repository, worked on that for about two weeks. And then stopped. And then forgot all about it until a few days ago.
I have no idea why I stopped. It was early in the Covid pandemic, so maybe I just got depressed. And disheartened by the immense number of still failing tests.
It was weird seeing quite a lot of code that I had written then (more than 4.5 years ago), that looked familiar, but of which I had no active recollection. The code was pre new-dispatch. And pre zef-ecosystem.
It definitely had my style of coding, generally. Still it was oddly unfamiliar in some aspects. Weird how one's coding style changes over the years. More experience? Learned better ways of building class hierarchies? Possibly.
In any case, in order to familiarize myself with the code again, and whether it would be worthwhile to bring that project to fruition, I decided to bring the code up to my own current style / "standards". With about 1200 lines of code, and much of it NQP, that initially felt a bit daunting. But once done, it felt familiar again and some of the concepts that I had been using felt again as the right move forward.
After having done all that, and actually fixing some of the test failures, it was time to look at the still unsolved Rakudo issues that were marked with "... magical madness". The first issue I looked at was about the elucidation of sequences with complex numbers. That was actually an easy fix: it was almost more work to add a test for it.
With that done, it was time to release the module to the Raku ecosystem for the first time: Sequence::Generator
. Which gave me the opportunity to comment on an open Rakudo issue, stating it would be fixed if they'd zef install
and use
this module.
There's still a lot of work to be done on the module though. But at least it's now usable and if your case of using the ...
operator is covered by this module, then you'd have 4x to 20x faster support for it.
And if you've found a bug, or missed some functionality, there's now a dedicated place to let your grievances be known. And a good chance they will be acted upon more quickly.
So how much faster is the module really?
Let's look at a simple benchmark producing all of the odd numbers below 1000. Because imports in the Raku Programming Language are always lexical, it is possible to test both versions of the infix ...
operator in a single program:
# number of time to run code
my int $times = 10000;
# simple timer logic returning the elapsed time for all iterations
sub timer(&code) {
my $then = now;
(^$times).map: &code;
now - $then
}
# using the core infix ... logic
my $old = timer {
my @a = 1,3,5 ... 1000;
}
# using the Sequence::Generator infix ... logic
my $new = timer {
use Sequence::Generator;
my @a = 1,3,5 ... 1000;
}
# the null loop
my $nil = timer { Nil }
# the result
printf "%.2fx as fast\n", ($old - $nil) / ($new - $nil);
shows up on my computer as:
5.37x as fast
Of course, your mileage may vary. But with this code you should be able to do your own ...
benchmarks more easily.
It was good to recoup the investment I had made in that part of the Rakudo code in 2020. And hopefully it sets more steps towards integrating this functionality and these semantics into a future language version of the Raku Programming Language.
If you like what I'm doing, committing to a small sponsorship would mean a great deal to me!
Last Saturday, October 26, 2024, another edition of the London Perl & Raku Workshop took place. It was a reopening after a few years' break. The London Perl Workshop has a long history, having started in 2004, so this year it was formally 20 years old.
For me, it was also a Perl-and-or-Raku event after a break. The last in-person event was PerlCon 2019, an annual European Perl conference that was held in Riga that year. Since then, after Covid struck, I have organised a couple of (well, three) Raku Conferences, but those were online in Zoom. Online due to lockdown first, and then due to people slowly reinventing the joy of travel.
In-between, there were at least two in-person gatherings, another edition of the German Perl and Raku Workshop in April and the American Perl and Raku Conference in Las Vegas in June. I did not attend either of them, but looked at the photos and videos, and was envious enough because I missed them.
You should not forget that right now, right today, at this very moment, an ongoing war against my beloved Ukraine is taking place. During the first months, it was not possible to think about anything else, not to mention any entertainment trips or conferences. But time keeps going… and you see fewer and fewer blue-and-yellow flags on the streets. Even at Waterstones, the books about that war are located on the very bottom shelves due to the unfortunate nature of alphabetical sorting.
When this year’s London Perl Workshop was announced, I decided to go. Then, I decided not to go. In the end, one week before the event, I thought that I’d better go. Because of my uncertainty, I did not manage to submit a talk on time, so even a small 5-minute lightning talk submission (which I did on the day of the event, ha-ha) did not have a chance to go through. The venue, apparently, was rented until 6pm sharp, and there was no room for anything else.
So, here are the slides of my talks. I’ve added some comments so that you can enjoy it by following my message.
My journey was quite short, I only had a couple of days in London, but the weather made its presence felt and the plane from Amsterdam landed in Southampton instead. So, I took a chance to see the old walls there and to learn that they offer beans for breakfast at IKEA’s canteen.
I really enjoy London, it’s so huge that you feel extremely comfortable in it, however weird that may sound. Despite only having a limited couple of days in London, I tried to use the time to wander around just as a tourist.
I also escaped from the middle of the conference day to see the urban life around the venue’s location.
On the day before the conference, I met with an ex-colleague, with whom I worked at two different places, both Perl-related. For a few years, he’s been living in London, and I noticed that he looked so much happier since I last saw him. I hope that is because he lives in this great city now, not because he’s programming in Go, not Perl anymore.
Of course, I was happy to see people I know from the Perl and/or Raku communities. I had not seen most of them since at least 2019, so some of them required a few seconds to recognise me, LOL.
The workshop was really well attended. Even taking into account the significant drop in numbers since the beginning of the 2000s, there were more than 100 people, and there were two full tracks of talks.
Let me not list the talks I attended, as the full schedule is online, and the recordings should appear soon.
There was also a third track, partially filled with regular talks, but partially with broader-scoped events such as the Science Perl Talks. I’ve no idea what they talked about there, but it’s somehow connected with the recently-published first issue of the Science Perl Journal and a traditional drama around its founder.
But the appearance of the new publication is a good thing in the end. I was also surprised that, in 2024, a few new publications about Perl appeared or are planned to be published. I suspect that some of those issued during the previous years are AI-generated texts, but the rest are quite normal and useful books.
There is no decision yet on whether there will be another London Perl Workshop in 2025. On the bright side, there should soon be an announcement of the date and location of the German Perl Workshop 2025. Also, there are plans to have a Dutch Perl Workshop 2025 in Utrecht.
I am wondering whether it’s time to revitalise the annual European Perl and Raku Conference.
This is a reproduction of the talk I gave today at the awesome London Perl & Raku Workshop. I met a couple of cool rakuteers and enjoyed the relaxed congregation of perl and raku folk. Seems like the anger has subsided not least thanks to the name change from perl6 to raku. Thanks to all the sponsors, organisers and attendees.
The more observant members of the raku community will notice that I released a new raku module HTML::Functional a couple of weeks back.
The general idea for this came from the Software Unscripted podcast created by Richard Feldman … one of the dev team on the Elm language – A delightful language
for reliable web applications.
Elm is a functional language that streamlines the development of websites by making the code simpler, more composable, more expressive and more maintainable than raw HTML. I want this for raku!
Actually I want to code my HTML on the server side with raku whereas Elm compiles to JavaScript and runs in the browser. BUT I do like the Elm code style.
Here’s the HTML from the Elm example…
[the code examples in this post may be found in https://github.com/librasteve/raku-HTML-Functional … they are presented as images here for the syntax highlighting… zoom Cmd+ your browser to get a better view]
And here it is in Elm…
This functional code style is cleaner and more expressive than classic HTML tags. Elm avoids the need to repeat tags at start and end of a block and eliminates visually noisy angle brackets.
And now for the raku…
To get this module simply go zef install HTML::Functional
and then use HTML::Functional;
in your header.
This is even cleaner than the Elm and employs a lot of raku functional features implicitly:
- HTML tags such as div, h1, p and so on are provided as raku functions
- parens () are optional in raku function calls
- HTML attributes are passed as raku named arguments :name<value>
- tag contents (eg of h1) are passed as raku positional arguments
- multiple @inners are passed as a literal Array [] – div contains h1 and p
- inner functions are evaluated first – strong is evaluated before p, before div and so on
- the semicolon ; is used as the Array literal separator to suppress nesting of tags

Normally the items in a raku literal Array are comma , separated. Raku precedence considers that div [h1 x, p y]; is equivalent to div( h1(x, p(y) ) ) … so the p tag is embedded within the h1 tag unless parens are used to clarify. But replace the comma , with a semicolon ; and the predisposition to nest is reversed. So div [h1 x; p y]; is equivalent to div( h1(x), p(y) ). Boy, that Larry Wall was smart!
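Here is a minimal sketch of that style in action (the page content is made up; the tag functions follow the module's one-function-per-tag convention described above):
use HTML::Functional;

my $page =
    div [
        h1 'Welcome';
        p  ['This paragraph has some ', strong('bold'), ' text.'];
    ];

say $page;   # should emit the nested <div><h1>…</h1><p>…</p></div> markup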
The raku example also shows the power of the raku Q lang at work:
""
interpolate their contents"{fn x}"
~
is for Str concatenationq:to/END/;
can be used for verbatim text blocksTo see this in action, view the demo video:
The video showcases the IntelliJ IDEA Universal IDE with the latest comma-plugin 2.0, courtesy of @ab5stract.
HTML::Functional does this:
The clean HTML code it delivers is in tension with modern event driven web frameworks such as React. As we have seen in other posts, HTML::Functional is even more maintainable if used in conjunction with HTMX.
And let’s end with a quote (sadly I did not record the originator)…
I think i just expressed my thought in a wrong way, haha. I am a functional freak, and the first thing i did was check out Raku’s functional patterns. I was amazed. Raku can be extremely functional, but in my opinion language can be called functional when there’s no other way other than functional for the most part. Raku has great functional support, but the language doesn’t force you into anything, you can do basically anything! A sandbox language, and i am loving it.
anon
Comments and feedback welcome!
~librasteve
About 3 weeks ago I thought it was time to go through the outstanding Rakudo compiler issues (an implementation of the Raku Programming Language) to see how many of them would have been fixed by now with the new Raku grammar.
Why? Because we have reached 85+% of test coverage of the new Raku Grammar in roast, the official Raku test-suite.
Success as defined by the number of test files that completely pass without any test failures. Most other test files also have tests passing, but they're not 100% clean.
At that point there were 1312 open Rakudo issues, with the oldest being from 2017.
There are now 778 issues left open. So you could say that's 534 issues closed. Actually, it is a few more as during this 3 week period, a few new issues were posted.
In this blog post I'll be describing the types of issues that I've seen, and how I handled them.
Sadly, there's not a lot I could do about those specific issues, as my JVM foo is close to zero. There are currently 22 open issues for the JVM backend. If you have any JVM chops, it would be greatly appreciated if you would apply them to solving these Rakudo issues!
When I started, the oldest open issue in the Rakudo repository was about 7.5 years old, and effectively lost in the mist of time. In any case they predated significant changes, such as the new dispatch mechanism. Not to mention the Raku Programming Language still had a different name then.
If you're really interested in the mists of time, you could check out the old issues that were migrated from the venerable RT system, with the oldest open issue from 2010!
So going from the oldest issue to the newest, it was a matter of checking if the problem still existed. And if it appeared fixed, and if it was possible to write a test for it, writing a test for it and closing the issue. And sometimes there even was a PR for a test already, so that was even more of a no-brainer.
Quite a number of issues were actually marked as fixed, but were also marked as needing tests. Sadly, the original author of the issue had not done that or didn't do that after the issue was fixed. In most cases it was just a few minutes to write a test, test it and commit it.
After 18 months of work in 2020 and 2021, the new dispatch mechanism became default in the 2021.10 release of the Rakudo compiler. Most of the multi method dispatch related issues that were made before that time, appeared fixed. So it just was a matter of writing the tests (if there weren't any yet) and commit them.
While testing all of these issues, I always also tested whether they were fixed in the new Raku grammar, based on the RakuAST project (which is why I started doing this bug hunting streak in the first place).
Running code with the new Raku grammar, is as easy as prefixing
RAKUDO_RAKUAST=1
to your call toraku
. For instanceraku -e '{ FIRST say "first" }'
does not output anything with the legacy grammar. But withRAKUDO_RAKUAST=1 raku -e '{ FIRST say "first" }'
it will say "first" because theFIRST
phaser fires for any block when it is first executed with the new Raku grammar.
And to my surprise, in many cases they were! For those cases a special test file is being reserved to which tests are added for issues that have been fixed with the new Raku grammar in RakuAST.
These issues are then marked as being fixed in RakuAST but left open, so that people are hopefully prevented from creating duplicate issues for problems that apparently haven't been fixed yet.
A large percentage of these issues appear fixed because they were essentially static optimizer issues, and the new Raku grammar doesn't have any of these compile-time optimisations yet. So it's important to be able to check for regressions of this type once optimizations are being added back in. In turn, these static optimizer issues were often caused by the optimizer not having enough or the correct information for doing optimizations. Which in turn was one of the reasons to start with RakuAST to begin with.
And then there were the issues that were simply still reporting an existing problem. Some of them, with the knowledge that I acquired over the years, looked like easy to fix. So I put in some effort to fix them. A non-exhaustive list:
- $:F as placeholder variable with use isms
- Failure objects when numerically comparing non-numerical values
- +permutations(30)
- allowing @*ARGS to contain other Cool values (apart from strings)
- Rats with 0 denominator (e.g. <1/0> <=> <-1/0>)
- ⁰¹²³⁴⁵⁶⁷⁸⁹ superscript characters
- the --repl-mode=interactive CLI argument always forcing an interactive REPL
- any junctions in regex interpolation
- val() (such as ⅓)
- rlwrap as line editor in REPL if no other modules installed

About 50 of the outstanding issues look like they should be fixable without turning into large projects, so I will be looking at these in the coming days / weeks.
Some of the open issues were basically feature requests. Sometimes I felt that they could be easily implemented (such as several error message improvements) so I implemented them. Others I created a Pull Request for. And for still others I felt a problem-solving issue would be needed (which I then created). And some I closed, knowing almost 100% they would never be accepted.
If this was one of your issues, and you still feel that feature should become part of the Raku Programming Language, please don't be discouraged! Looking at 60+ issues for 3 weeks in a row sometimes made me a bit grumpy at the end of the day. Please make a new problem-solving issue in that case!
Many issues looked like they would be more easily solvable in the new Raku grammar with RakuAST. There are now 289 of them. These will be next on my list.
It was an interesting ride through memory lane the past weeks. With about 200 commits, that's not bad at all!
Note that with these numbers of issues, if I had an error rate of only 1%, there are at least 5 issues that were closed when they shouldn't have been closed. If you feel that an issue has been closed incorrectly, please leave a comment and I'll re-open them if you cannot do that yourself.
Sadly, because of the additional tests that I wrote, the number of roast test-files passing has now dropped again below the 85% mark. Still, I do think this is progress, as the errors that they check for would have been encountered during the development of the Raku grammar sooner or later anyway.
Anyway, it was fun being able to close as many issues as I did! Want to join in the fun? There are still 778 issues open waiting for someone to close them!
If you like what I'm doing, committing to a small sponsorship would mean a great deal to me!
Late Edit – thanks to a genius suggestion from @wamba, I have made some late changes to improve the code – specifically the concerns I mentioned in v1 about when
clause and lisp ((((
are now fixed. Excellent input!!
Regular followers of this blog will know that I am on a bit of a Functional tack … I’m highly motivated to improve my functional chops because I am giving a talk at the London Perl and Raku Conference shortly entitled Raku HTML::Functional
and I just realised that the audience is going to be a bunch of deep perl and raku (linux) experts who know Functional coding inside out. Yikes … hope I don’t get any tough questions from folks who really know their stuff!
By contrast, I am a dabbler in Functional. I like the feel of using .map
and .grep
and so on, but I am on the learning curve. And I am resistant to languages that constantly get in my way since largely I am trying to make code that works rather than to wrestle with compile errors. (And no I do not work in large teams since you ask).
So when I saw a recent post on HN, written in F#, I felt challenged to work out what was going on and to try and relate to Raku.
type Meat = Chicken | Beef | Pork | Fish | Veggie
type Ingredient =
| Cheese | Rice | Beans | Salsa | Guacamole | SourCream | Lettuce
| Tomato | Onion | Cilantro | PicoDeGallo
type Burrito = Meat option * Ingredient list
let (>>=) burrito f =
match burrito with
| Some meat, ingredients -> f (Some meat, ingredients)
| None, _ -> None, []
let returnBurrito (meat, ingredients) = meat, ingredients
let tortilla = returnBurrito (Some Veggie, [])
let addMeat meat (m, ingredients) = Some meat, ingredients
let addIngredient ingredient (meat, ingredients) =
meat, ingredient :: ingredients
let addMissionBurritoIngredients (meat, ingredients) =
meat, Cheese :: Rice :: Beans :: ingredients
let holdThe ingredient (meat, ingredients) =
meat, List.filter (fun i -> i <> ingredient) ingredients
let burrito =
tortilla
>>= addMeat Chicken
>>= addMissionBurritoIngredients
>>= holdThe Cheese
>>= addIngredient PicoDeGallo
>>= addIngredient Salsa
>>= addIngredient Guacamole
>>= addIngredient SourCream
printfn "%A" burrito
This is from the OP by William Cotton.
Since we are talking Monads, I realised that the raku Definitely module written by masukomi
would come in handy. This module arose from a post I made here some time back, so it was a good time to revisit.
https://github.com/librasteve/raku-Burrito/blob/main/burrito-dm.raku
use Definitely;
enum Meat <Chicken Beef Pork Fish Veggie>;
enum Ingredient <Cheese Rice Beans Salsa Guacamole SourCream Lettuce
Tomato Onion Cilantro PicoDeGallo>;
sub returnBurrito($meat, @ingredients) {
$meat, @ingredients
}
sub tortilla {
returnBurrito(something(Veggie), [])
}
sub add-meat($meat, ($, @ingredients)) {
something($meat), @ingredients
}
sub add-ingredient($ingredient, ($meat, @ingredients)) {
$meat, [$ingredient, |@ingredients]
}
sub add-mission-burrito-ingredients(($meat, @ingredients)) {
$meat, [Cheese, Rice, Beans, |@ingredients]
}
sub hold-the($ingredient, ($meat, @ingredients)) {
($meat, [@ingredients.grep(* != $ingredient)]);
}
multi infix:«>>=»((None $, @), +@ ) is prec(prec => 'f=') {
nothing(),[]
}
multi infix:«>>=»($burrito, +(&f, *@args)) is prec(prec => 'f=') {
f( |@args, $burrito )
}
tortilla()
>>= (&add-meat, Beef)
>>= (&add-mission-burrito-ingredients)
>>= (&hold-the, Cheese)
>>= (&add-ingredient, PicoDeGallo)
>>= (&add-ingredient, Salsa)
>>= (&add-ingredient, Guacamole)
>>= (&add-ingredient, SourCream)
==> say();
I hope that you will agree that Raku does a generally solid job of handling the translation from F#.
There are a couple of raised eyebrows around the (EDIT – NOW FIXED) when {...} clauses and the handling of variadic arity of the passed-in function in the match, and the lisp-like (((((( parens in the application of the custom binder. Otherwise, it is pretty smooth.
The Definitely module works well here; I have also tried with Rawley Fowler's Monad::Result module, which was similarly successful.
In this self-study, I leaned on the excellent Wikipedia Monad page, which mentions that a true Monad implementation has three operations: a type constructor that wraps a value, a unit (return) operation, and a bind (>>=) operation for chaining.
And it shows chaining of this halve
function as an example of chaining with the bind operator in Haskell:
halve :: Int -> Maybe Int
halve x
| even x = Just (x `div` 2)
| odd x = Nothing
-- This code halves x twice. it evaluates to Nothing if x is not a multiple of 4
halve x >>= halve
So, to improve the Definitely module, I have added a binding operator to be used like this:
use Definitely;
sub halve(Int $x --> Maybe[Int]) {
given $x {
when * %% 2 { something( $x div 2 ) }
when ! * %% 2 { nothing(Int) }
}
}
say (halve 4) >>= &halve; #1
say (something 32) >>= &halve >>= &halve >>= &halve; #4
say (halve 3) ~~ None; #True
Note that the Monad::Result module already has bind and map operators provided.
For now this is a PR; feel free to install directly from my fork here if you would like to try it. Now released to the zef package installer ecosystem…
zef install https://github.com/librasteve/Definitely.git
As usual all comments and feedback welcome!
~librasteve
NOTE: There is an issue when opening existing Comma projects that were created in earlier versions. Please use New project from Existing Sources...
rather than Open
and make sure to select Yes
when it prompts you about overwriting an existing .idea
file in the project directory.
This release represents a major shift for the Comma project in many ways.
From the bottom of my heart, I want to express the deepest gratitude and thanks to Jonathan Worthington (jnthn++
),
Edument, and all past and future contributors to the Comma project. There's been
so much effort put into this codebase and it was an honor to be able to work on it.
The most major change is the shift to the IntelliJ Platform Gradle Plugin 2.0
. This
allows Comma to be built (as a plugin) without cloning the
intellij-community
repo and downloading its entire dependency tree!
This does seem to preclude building Comma as a standalone IDE, at least for the time being. That appears to be a different beast entirely and we will have to investigate that as the time and tuits allow.
Other major changes included updating the code to correct for broken and (some) deprecated
API changes, as well as the significant cosmetic adjustment of migrating Perl6
to Raku
.
The latter should be almost entirely finished, but there might be some stragglers that I've
missed.
Building should be as simple as opening this repository in IntelliJ IDEA
(using version 2024.2
or greater), and selecting
build > build
from the Gradle build target options. Or, for more immediate gratification, you can select intellij platform > runIde
.
Update: If you don't feel like building it yourself, you can now simply download the plugin zip from GitHub. From inside IntelliJ IDEA, open the Settings > Plugins
, find the gear icon, and select Install Plugin from Disk...
.
Next steps:
<Your wishlists go here!>
Happy hacking! :D
Back in ’21 I asked the question Can Raku replace HTML? As expected that rather click-baity title got a lot of complaints. So I couldn’t resist repeating the meme.
If you are wondering, Raku can replace PHP literally…
"PHP".subst(/PHP/, 'Raku').say; #Raku
BUT that’s beside the point. Just my sense of -Ofun getting out of hand.
In recent posts, I have been digging in to HTMX and Raku Cro…
And while in the web application frame of mind, I started to think maybe I can use Raku with WordPress, perhaps initially to just write some front end with Raku and HTMX served with Cro and to talk to the WP database backend. (This kind of combination is already a thing with WordPress and React).
And then that made me think yeah well WordPress (and Laravel, OJS, etc.) continue to be popular and lend PHP a kind of ongoing zombie existence. PHP is not likely to suddenly bust out of its web language niche, so likely over time it will gradually fade away in popularity. And much of the gravity in web development is going to drag PHPers towards JavaScript. And, since I am a PHP coder in my day job, I realised that (like me) many fellow PHP travellers would rather not get dragged into the JavaScript / React / Composer / Node black hole of complexity. And so maybe Raku and HTMX would one day become a good upgrade path from PHP since it has roots in perl – the original web language – with a friendlier syntax (eg for OO). Even the $
sigil for variables, {}
curlies and the ;
semicolon make for a smooth transition from PHP. Maybe in this niche Raku can ultimately replace PHP…
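For instance (a trivial, made-up comparison), a PHP-shaped habit like $total = 0; foreach ($prices as $p) { $total += $p; } carries over almost unchanged:
my @prices = 9.99, 4.50, 12.00;
my $total = 0;
for @prices -> $p { $total += $p; }   # sigils, curlies and semicolons, just like home
say $total;                           # 26.49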
Then I started to think about what made PHP the goto language for web developers originally. How would Raku stack up?
Remember this:
<body>
<div class="container">
<h1>Welcome to My Simple PHP Page!</h1>
<p>
Today is:
<?php
// Get the current date and time
echo date("l, F j, Y, g:i a");
?>
</p>
<p>
Random number:
<?php
// Generate a random number between 1 and 100
echo rand(1, 100);
?>
</p>
</div>
</body>
</html>
The full source of this index.php file is in this gist … Simple PHP HTML Page
To serve this page, you can run a server like this:
php -S localhost:8000 -t /path/to/directory
Horrible though it is, this intertwining of PHP and HTML is what made PHP the goto web language in its heyday. And that got me thinking, could this be done with Raku?
So, knowing Raku to be a very flexible language, I made a new module Cro::WebApp::Evaluate. Here’s the synopsis:
<body>
<div class="container">
<h1>Welcome to My Simple Raku Page!</h1>
<p>
Today is:
<?raku
#Get the current date and time
use DateTime::Format;
say strftime('%Y-%m-%d %H:%M:%S', DateTime.now);
?>
</p>
<p>
Random number:
<?raku
#Generate a random number between 1 and 100
say (^100).pick;
?>
</p>
</div>
</body>
And here’s how to serve this as index.raku using the Raku Cro web framework.
use Cro::HTTP::Router;
use Cro::WebApp::Template;
use Cro::WebApp::Evaluate;
sub routes() is export {
route {
evaluate-location 'evaluates';
get -> {
evaluate 'index.raku';
}
get -> *@path {
static 'static', @path;
}
}
}
I leave it as an exercise for the reader to show how to have Cro render and serve index.php files in a parallel directory and route structure – perhaps for an incremental migration effort.
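One hedged way that exercise might go is to shell out to the php CLI for anything under a (made-up) legacy directory; this sketch ignores hardening concerns such as path traversal and is not a drop-in replacement for php -S:
use Cro::HTTP::Router;

sub legacy-php-routes() is export {
    route {
        get -> 'legacy', *@path {
            # run the legacy page through the php interpreter and return its output
            my $file = 'legacy'.IO.add(@path.join('/'));
            if $file.e && $file.extension eq 'php' {
                my $proc = run 'php', $file.Str, :out;
                content 'text/html', $proc.out.slurp(:close);
            }
            else {
                not-found;
            }
        }
    }
}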
Do I expect this new module to be embraced by the PHP community? No. In most cases, I think that hybrid PHP/HTML pages like this have been replaced by templating systems or web frameworks.
Am I a little ashamed to have made this module? Yes. Honestly, I would not encourage coders to start using Raku like this – Cro Templates would be a better solution for most projects.
Are there some point needs where this approach can be applied? Maybe. Since this was a seminal feature of early PHP, I expect that there are some point cases where embedding Raku and HTML will be the cleanest way to (re)package some code. For example where a single dynamic page uses PHP for a database query, wrapping the results as json and then passing the data into a JavaScript function … for some client-side logic with the (eg.) Google Maps API and then dynamic presentation.
Is this a module prerequisite for PHPers to migrate to Raku? Probably not. However, I think that the presence of this module can bring some comfort to PHP coders that anything that can be done in PHP can be (re)done in Raku.
As usual comments & feedback welcome!
~librasteve
This post is kind of part 3, coming off last week’s thrilling episode.
I am a simple soul; I want to reduce the cognitive load in my web projects. The general idea is to go back to the halcyon early days of the web before Netscape dropped the JS-bomb. You know, HTML for the layout and CSS for the style. An elegant division of roles.
When I read about HTMX it was clear that Raku and Cro are ideal candidates for the back end HTML assembly, defining routes and serving RESTful APIs. As we have seen in the previous posts, HTMX eliminates the need for JS to make dynamic web content. Lovely.
Remember – we are talking simpler ways to build attractive, dynamic, modern websites. While HTMX is well suited to 90% of this, if you are building a webapp like FaceBook or Google Maps, then it’s not for you.
But what to do about style and CSS?
Well HTMX is neutral to CSS … it can be used with Bootstrap, Tailwind, SASS and so on. But many of these CSS tools have evolved to jam more stuff into the HTML tag attributes.
In my mind, the ideal would be something like this for a simple navbar:
<nav>
<ul>
<li><a href="#">About</a></li>
<li><a href="#">Services</a></li>
<li><a href="#">Products</a></li>
</ul>
</nav>
I had heard that Pico CSS was often used in HTMX projects. And sure enough, the Pico Components have this feel…
Here’s Bootstrap for contrast:
<nav class="navbar navbar-expand-lg navbar-light bg-light">
<div class="collapse navbar-collapse" id="navbarNav">
<ul class="navbar-nav ms-auto">
<li class="nav-item">
<a class="nav-link" href="#">About</a>
</li>
<li class="nav-item">
<a class="nav-link" href="#">Services</a>
</li>
<li class="nav-item">
<a class="nav-link" href="#">Product</a>
</li>
</ul>
</div>
</nav>
What about Tailwind, also for contrast:
<nav class="bg-gray-100">
<div class="flex justify-end">
<ul class="flex space-x-3">
<li class="nav-item">
<a class="text-gray-700 hover:text-gray-900" href="#">About</a>
</li>
<li class="nav-item">
<a class="text-gray-700 hover:text-gray-900" href="#">Services</a>
</li>
<li class="nav-item">
<a class="text-gray-700 hover:text-gray-900" href="#">Product</a>
</li>
</ul>
</div>
</nav>
Bootstrap and Tailwind come at the cost of “more stuff in the HTML tags”.
Here’s the Pico CSS example:
So for our goals, it looks like Pico CSS is on a good track. Their website says:
What’s that?
Looks like my goal all along has been Semantic HTML (not that I knew at the time).
By adding semantic HTML tags to your pages, you provide additional information that helps define the roles and relative importance of the different parts of your page.
(As opposed to non-semantic HTML, which uses tags that don’t directly convey meaning.)
So having more powerful Semantic HTML is a win.
Hopefully the figure above is enough of an eye opener for now. There’s much more info out there if you are curious. But obviously read the rest of my post first.
[For hardcore aficionados, I plan to look into Web Components in a future post. I also think that Bootstrap and Tailwind and SASS in general are good companions to HTMX and Raku — but my project and this series of posts starts by using Pico CSS to minimize the cognitive load on the style side – later we will come back to these other styling tools].
So lets see how this looks in action. All the code for these posts is on GitHub for your perusal and collaboration.
I have been using Pico CSS as part of my project to rebuild the HTMX examples for Raku / Cro largely by translating the Python / Flask examples. This post draws on the Tabs HATEOAS one in particular, since I have in mind that I will want a Tab Component in my toolbag but that Pico CSS does not provide one out of the box. Pico does have Accordions so there is some prior art for inspiration.
Anywho, here’s the way the final code ended up.
tabs/index.crotmp:
<nav>
<ul>
<li>
<a href="#" hx-get="/tabs/tab1" hx-target="#tab">Tab 1</a>
</li>
<li>
<a href="#" hx-get="/tabs/tab2" hx-target="#tab">Tab 2</a>
</li>
<li>
<a href="#" hx-get="/tabs/tab3" hx-target="#tab">Tab 3</a>
</li>
</ul>
</nav>
<div id="tab" hx-get="/tabs/tab1" hx-trigger="load"></div>
/tabs/tab1.crotmp (tab2 and tab 3 are much the same so I won’t bore you)
<section>
<figure>
<blockquote>
"When you're new to something, you bring an ignorance that can
be highly innovative."
<footer>
<cite>– Rick Rubin</cite>
</footer>
</blockquote>
</figure>
</section>
/Routes/Examples/Tabs.rakumod to fulfil the hx-get attrs.
use Cro::HTTP::Router;
use Cro::WebApp::Template;
sub tabs-routes() is export {
route {
template-location 'templates/tabs';
get -> {
template 'index.crotmp';
}
get -> 'tab1' {
template 'tab1.crotmp';
}
get -> 'tab2' {
template 'tab2.crotmp';
}
get -> 'tab3' {
template 'tab3.crotmp';
}
}
}
And the proof… oh yeah, Pico has built-in dark mode
Thanks for tuning in, please feel free to like, share or comment.
You can find me on the Raku Discord and IRC Channels.
~librasteve
Recent Sparky releases have introduced a lot of features for building user interfaces for internal web applications.
In a nutshell Sparky is a web platform to run automation tasks, and Sparky equips you with a rich feature set to build a frontend to launch tasks with parameters.
Let's brew some coffee, even though I've been trying to slow down on caffeine recently.
allow_manual_run: true
vars:
-
name: Flavor
default: "latte"
type: select
values: [ espresso, amerikano, latte ]
-
name: Topic
default: "milk"
type: select
values: [ milk, cream, cinnamon ]
multiple: true
-
name: Step3
default: "boiled water"
type: input
This simple Sparky job definition will spin up a simple UI with 3 controls: two select dropdowns (Flavor and Topic, the latter allowing multiple choices) and a text input (Step3).
In the Raku scenario, those input parameters are handled like this:
my $flavor = tags()<Flavor>;
my $water = tags()<Step3>;
my @topics = [];
# tags() needs :array modifier
# as multiple choices are passed
for tags(:array)<Topic><> -> $t {
@topics.push: $t;
};
Simple, huh?
One more coffee, please!
Another cool feature of Sparky is multi-scenario flows, or group variables. Let's say you would like to give a choice of coffee or tea; instead of having 2 job definitions with different sets of variables, let's create just one:
vars:
# tea vars
-
name: Flavor
default: "black"
type: select
values: [ black, green ]
group: [ tea ]
-
name: Topic
default: "milk"
type: select
values: [ milk, cream ]
group: [ tea ]
# coffee vars
-
name: Flavor
default: "latte"
type: select
values: [ espresso, amerikano, latte ]
group: [ coffee ]
-
name: Topic
default: "milk"
type: select
values: [ milk, cream, cinnamon ]
group: [ coffee ]
multiple: true
# common vars
-
name: Step3
default: "boiled water"
type: input
group: [ tea, coffee ]
# every var
# in group vars
# is a separate
# scenario
group_vars:
- tea
- coffee
Now, when the job gets run, we have a choice:
And then (when clicking on the proper link), let's enjoy some tea:
Sparky is a nice web console to build internal automation services with HTML and write automation scenarios with Raku. Thanks for reading.
TL; DR: How to deploy docker to minikube when one does not need anything fancy, but pure Raku.
So, you have your own pet K8s cluster deployed as minikube and you want to play with it. You have a few microservices to build and you don't want to bother with low-level kubernetes commands at all.
On the other hand, your setup is complex enough that expressing it in a bunch of yaml files or kubectl commands becomes a chore. Here is an elegant way to handle this in pure Raku, and it's called Sparky ...
|-------------------------------------|
| Sparky -> kubectl -> MiniKube |
| /\ /\ /\ |
| pod pod pod |
|-------------------------------------|
So the infrastructure part is simple - on the same host we install minikube and Sparky, which under the hood uses kubectl to deploy containers into the k8s cluster.
As usually Sparky job is a pure Raku code, but this time some plugins will be of use as well ...
Sparky is integrated with - Sparrowhub - https://sparrowhub.io - repository of Sparrow plugins - useful building blocks for any sort of automation.
Let's use a couple of them - k8s-deployment and k8s-pod-check to deploy and check Kubernetes pods. From Sparky point of view those are just Raku functions, with some input parameters.
task-run "dpl create", "k8s-deployment", %(
:deployment_name<nginx>,
:app_name<nginx>,
:image<nginx:1.14.2>,
:3replicas,
);
# give it some time to allow all pods to start ...
sleep(5);
task-run "nginx pod check", "k8s-pod-check", %(
:name<nginx>,
:namespace<default>,
:die-on-check-fail,
:3num,
);
For the tutorial purpose we are going to deploy nginx server with 3 replicas, by using k8s-deployment plugin.
Let's give it a try.
And the very first deploy ... fails:
... some output ...
11:03:34 :: deployment.apps/nginx created
11:03:34 :: [repository] - installing k8s-pod-check, version 0.000012
11:03:34 :: [repository] - install Data::Dump to /home/astra/.sparrowdo/minikube/sparrow6/plugins/k8s-pod-check/raku-lib
All candidates are currently installed
No reason to proceed. Use --force-install to continue anyway
[task run: task.pl6 - nginx pod check]
[task stdout]
11:03:41 :: ${:die-on-check-fail(Bool::True), :name("nginx"), :namespace("default"), :num(3)}
11:03:41 :: ===========================
11:03:41 :: NAME READY STATUS RESTARTS AGE
11:03:41 :: nginx-77d8468669-5gxbf 0/1 ErrImagePull 0 5s
11:03:41 :: nginx-77d8468669-c5vbl 0/1 ErrImagePull 0 5s
11:03:41 :: nginx-77d8468669-lhc54 0/1 ErrImagePull 0 5s
11:03:41 :: ===========================
11:03:41 :: nginx-77d8468669-5gxbf POD_NOT_OK
11:03:41 :: nginx-77d8468669-c5vbl POD_NOT_OK
11:03:41 :: nginx-77d8468669-lhc54 POD_NOT_OK
[task check]
stdout match <^^ 'nginx' \S+ \s+ POD_OK $$> False
---
Although the Kubernetes deployment has been successfully created, the subsequent k8s-pod-check failed to verify that all pods are running.
Use of the die-on-check-fail option made the job stop straight away after this point.
The reason is ErrImagePull - the nginx docker image is not accessible from within minikube, which is a known minikube DNS issue and easy to fix.
All we need to do is to load the nginx docker image manually, so that minikube will pick it up from the file cache:
minikube image load nginx:1.14.2
Now, when we have restarted the failed job, we get this:
... some output ...
11:05:42 :: deployment.apps/nginx unchanged
[task run: task.pl6 - nginx pod check]
[task stdout]
11:05:47 :: ${:die-on-check-fail(Bool::False), :name("nginx"), :namespace("default"), :num(3)}
11:05:47 :: ===========================
11:05:47 :: NAME READY STATUS RESTARTS AGE
11:05:47 :: nginx-77d8468669-5gxbf 1/1 Running 0 2m10s
11:05:47 :: nginx-77d8468669-c5vbl 1/1 Running 0 2m10s
11:05:47 :: nginx-77d8468669-lhc54 1/1 Running 0 2m10s
11:05:47 :: ===========================
11:05:47 :: nginx-77d8468669-5gxbf POD_OK
11:05:47 :: nginx-77d8468669-c5vbl POD_OK
11:05:47 :: nginx-77d8468669-lhc54 POD_OK
[task check]
stdout match <^^ 'nginx' \S+ \s+ POD_OK $$> True
<3 pods are running> True
---
The deployment itself has not changed (which is denoted by the "deployment.apps/nginx unchanged" line), as we did not change anything; however, minikube is now able to pick up the recently loaded docker image and all pods are running.
Congratulations on the very first successful deployment to Kubernetes via Sparky!
Finally, let's remove our test pods using the k8s-deployment plugin:
task-run "dpl delete", "k8s-deployment", %(
:deployment_name<nginx>,
:action<delete>,
);
This simple scenario should give you some ideas on how to deploy to Kubernetes in an imperative way using pure Raku. I personally like this approach better, as having a bunch of Helm charts and YAML files seems overkill when one just needs to deploy some non-production code. However, as always, YMMV. Thanks for reading ...
So, you have a hundred VMs you need to manage, and you have ... Ansible? I should stop here, as this is the standard tool for configuration management nowadays, but I dare to continue and say there is a better alternative to it.
But before we get into it, why am I so frustrated with Ansible? Here are my points:
The YAML-based declarative DSL really stinks for complex tasks, as it lacks the flexibility that imperative languages have.
YAML is not even a programming language, and you are going to pay the price very soon.
To keep Ansible code clean and simple, extra effort is required: one needs to refactor all the complexity out of YAML into Python modules, which feels like "why did I even start using a YAML DSL?"
Ansible reports are frustrating, as I always need to add debug tasks to show the real STDOUT/STDERR emitted from commands, where it should just work out of the box.
Ansible ties me to the idea of "running on a host," whereas sometimes I need to run tasks not tied to hosts; yes, you can still use "ansible_connection=local", but this feels awkward.
So, meet Sparky - an elegant, efficient and batteries-included automation tool. It's written in the powerful and modern Raku language, with a Bulma CSS frontend and web sockets.
To install Sparky, install Rakudo first and then install Sparky itself as a Raku module:
curl https://rakubrew.org/install-on-perl.sh | sh
eval "$(~/.rakubrew/bin/rakubrew init Bash)"
rakubrew download moar-2024.05
git clone https://github.com/melezhik/sparky.git
cd sparky/
# install Sparky and its dependencies
zef install --/test .
# init sparky sqlite database
raku db-init.raku
# run sparky job runner
nohup sparkyd >~/.sparkyd.log < /dev/null &
# run sparky web console
cro run
This simple scenario gets it up and running; if you go to http://127.0.0.1:4000 you'll see a nice Sparky web console. We use the console to run sparky jobs.
So we have a control plane that would manage many hosts over ssh, using push mode:
---------------
| CP , Sparky |
---------------
[ssh]
/ / | \ \
host host host host host
This is pretty much what ansible does ...
Now say, we have 5 NGINX servers we need to restart, let's drop a simple Sparky job to do this in pure Raku language:
use Sparky::JobApi;
class Pipeline does Sparky::JobApi::Role {
method stage-main {
for 1..5 -> $i {
my $j = self.new-job :workers<5>;
$j.queue: %(
sparrowdo => %(
bootstrap => true,
host => "nginx_{$i}.local.domain"
),
tags => %(
stage => "child",
i => $i
)
);
}
}
method stage-child {
service-restart "nginx"
}
}
In this scenario, Sparky will run five parallel jobs that restart nginx on five hosts. Simple and elegant.
Moreover, those five jobs will appear as five separate reports in the Sparky UI …
Of course, this is only a quick glance at Sparky's architecture and features; there are more things to cover further:
Sparky project - https://github.com/melezhik/sparky
The question has been raised how to get named arguments into sub EXPORT via a use-statement. The ever helpful raiph provided an answer, which in turn left me with the question why he didn’t just use a Capture to move the data around. Well, because that doesn’t work. The compiler actually evaluates the expression \(:1a, :2b) into (1, 2) before passing it on to EXPORT.
If it’s hard, do it functional!
# foo.raku
use v6.d;
constant &transporter = sub { \(:1a, :2b); }
use foo &transporter;
# lib/foo.rakumod
use v6.d;
proto sub EXPORT(|) { * }
multi sub EXPORT(&transporter) {
&EXPORT(|transporter);
}
multi sub EXPORT(:$a, :$b) {
dd $a, $b;
Map.new
}
The idea is to hand a function to use to be called by EXPORT, and then redispatch the value that is produced by that function, to take advantage of Raku’s excellent signature binding. The proto and referring to sub EXPORT explicitly are needed because there is also a predefined (and in this case hidden) package called EXPORT.
I’m passing on named arguments to EXPORT, but all kinds of stuff could be returned by &transporter, so long as everything is known pretty early on at compile time. The use-statement is truly an early bird.
As the title states, I made Raku bigger, because lol context (that’s what the Synopsis calls **@) makes supporting feed operators fairly easy. I wonder if Larry added this syntax to Signature with that goal in mind. With PR#5532 the following becomes possible.
<abc bbc cbc> ==> trans('a' => 'x', 'b' => 'i') ==> say();
# OUTPUT: (xic iic cic)
Armed with this I can make a script of mine a little simpler.
use MONKEY-TYPING;
augment class IO::Path {
method trans(|c) {
my $from = self.Str;
my $to = self.Str.trans(|c);
self.rename($to) unless $from eq $to
}
}
sub rename-whitespace(IO::Path $dir where *.d){
dir($dir).grep({ .d || .f && .rw })
==> trans("\c[space]" => "\c[no-break space]", "\c[apostrophe]" => "\c[prime]")
==> sub (*@a) { print '.' for @a}();
dir($dir).grep({ .d && .rw })».&?ROUTINE;
}
rename-whitespace('.'.IO);
put '';
I don’t like spaces in filenames, as they are often found with audio or video files. Having auto-complete-friendly names makes using a CLI less bumpy. By teaching IO::Path to rename files according to rules as they are understood by Str.trans, I can use a feed operator to get the job done. (I wouldn’t be surprised to learn that anonymous subs DWIMing here is emergent behaviour in Raku.)
Having another PR that adds .trans to IO::Path is tempting but requires more thought.
A follow up to the Welsh dragon.
Firing up another localisation
Steps to Ryuu
Comments on the Raku program
More generally about localisation of coding
If you want to make Ryuu better?
In my previous blog about Y Ddraig, I created a localisation of the Raku Language in Welsh. During a recent conversation, someone mentioned there may be interest in a Japanese localisation, so I thought I would try the same techniques.
I do not speak or read or have ever studied Japanese. The localisation given below will be about as clunky and awkward as any can be. I imagine there may be some hilarious stupidities as well.
So to be clear, this article is about a proof of concept rather than a real effort to create a production-ready program.
However, it took me 40 minutes from start to finish, including setting up the github repo.
Since I like dragons, I named the Japanese cousin to Raku 'Ryuu'. It's a whimsy, not to be treated with much seriousness.
Basically I created a github repo, copied my existing Welsh localisation and changed CY to JA, and draig to ryuu.
Within the automation/ directory I used the translation technique explained for Welsh to create the JA file from the template. The translated.txt file needed some manual cleaning, because the English word 'for' has multiple Japanese equivalents; I chose one more or less at random. In addition, Google Translate did some strange things to the order of words and numbers in a line.
After adapting the files in the bin/ directory, and installing the distribution with Raku's zef utility, I ran tr2ryuu on the Raku program simple.raku.
A comment about my Welsh blog was that the program in Y Ddraig was not all in Welsh. And here the program is not all in Japanese.
Remember that the user-facing part of a program will be in the language of the user, in this case it is English. However, the coder-facing part of the program will be in the language of the coder. Below, the coder interface is in Japanese (or rather my ham-fisted attempt at Japanese).
The following is the result (which I put in a file called simple.ryuu):
私の $choice;
私の $continue;
私の @bad = <damn stupid nutcase>;
リピート {
$choice = プロンプト "Type something, like a number, or a string: ";
言う "You typed in 「" ~ ($choice ~~ 任意(@bad) ?? "*" × $choice.文字 !! $choice) ~ "」";
与えられた $choice {
いつ "dragon" {
言う "which is 'draig' in Welsh"
}
いつ 任意(@bad) {
言う "wash your mouth with soap"
}
いつ IntStr {
言う "which evaluates to an integer ", $choice
}
いつ RatStr {
言う "which evaluates to a rational number ", $choice
}
デフォルト {
言う "which does not evaluate to a number "
}
}
$continue = プロンプト "Try again? If not type N: "
} まで $continue 当量 任意(<N n>)
What is amazing to me is that when I ran ryuu simple.ryuu
, the program ran without error.
The simple.raku
program is obviously trivial, but what I wanted to show are some interesting Raku features. Note how I created an array of words with @bad = <damn stupid nutcase>;
, and then later I tested to see whether an input word was one of the array elements.
The Raku idiom いつ 任意(@bad), or in English when any( @bad ), compares the topic variable, in this case the input value, with each array element and creates a junction of Boolean results. The 'any' effectively or's the results to collapse the junction.
Junctions are not common in programming languages, so I thought that if there were going to be problems, it would be there. So I was surprised to find my Raku program works without error in another language.
All the major coding languages are in English. There are, however, coders from all over the world, and the majority of those from non-English speaking nations would have needed to learn English before (or at the same time as) they learnt coding.
We are thus creating a new technological elite: those who can understand English (or some subset of it), and those who cannot. The more coding becomes an essential part of life, the greater the ability divide between coders (who speak English) and non-coders will become.
The aim of localising a programming language is to provide an entry into coding in a form that is more accessible to every human being, whatever their natural language.
However, the aim of this approach is not to eliminate English at every level of complexity, but to provide a sufficiently rich language for most normal coding and educational needs.
In addition, by having a canonical language (Raku, which is based on English) into which all localised languages can be translated, what we get is a universal auxiliary language together with a universality of being able to code.
Having a single auxiliary language means that a non-English speaking person writing in a localised coding language can translate the program with the problem into Raku, have a developer on the other side of the globe find the problem, and suggest a solution in code, then for that solution to be translated back into the local language.
Naturally, a person who wants to learn more about coding, or who needs to delve deeper into the workings of a module, will need to learn English. Learning wider to learn deeper is a normal part of the educational experience.
Ryuu, or however it should be called, is absolutely in need of tender loving care. Please feel free to use the github issues or PR processes to suggest better translations.
At some stage, Ryuu will join the official Raku localisations.
Actually creating a localization of an existing programming language in an existing human language
Introduction
Considerations
The plan for y Ddraig (the dragon in Welsh)
Constraints and first steps
Forwards into Draig and running
Completing the translation
Backwards to canonical form
Drawbacks
Nearly all programming languages that are widely used in the world today have English as their base human language.
This means that a young person living in a non-English environment must first learn English (if only a limited sub-set of English), and then learn the skills needed for coding. This puts the majority of the humanity at a disadvantage.
Would it not be useful to create programming languages that use the script and words of human languages, but which compile into programs that will run with state of the art computer software?
Here is how I created a Welsh cousin of Raku, and I called it y Ddraig - or The dragon.1
There are some practical obstacles to creating any new programming language, and here are some of the ameliorating reasons why the Raku Programming Language is a good choice to base a new one on.
Different human languages use different writing systems and most need extra letters not covered by the ASCII set, which supplies programming punctuation such as ;, ,, and {}.
Different operating systems
All professional programmers are proficient in English, and so can answer questions about program errors in English. The number of programmers speaking Welsh is quite small, and the same would be true for many other human languages.
Whilst the plan is to create y Ddraig as a language that can be used with as little English as possible, there are several stages:
First is to create a localization (L10N) of Raku, in the form of a module called L10N::CY.
Next, the operating system has to be adapted so that an executable called draig is available, which will also mean that in a graphical interface (GUI), double-clicking on a file with a file extension of .draig will run Raku with the L10N::CY module already loaded. This is trivial.
For personal reasons, I stopped using Windows on my PC, and I use Ubuntu Linux exclusively. So, where there are terminal sessions, I shall be showing how I created Y ddraig using a Linux terminal.
Since Y ddraig is a Raku cousin, or technically a Raku localization, the Raku language needs to be installed. In addition, it needs to be a version of the language released after December 2023. Information about the installation of Raku, and its package manager zef, can be found on the Raku website.
The first stage is to create the L10N::CY
module. It is simply a normal Raku module, which is then installed with the zef
package manager.
Raku module development is conventionally done by creating a github repository. Working with git is quite simple for the basic functionality, but there is a long learning curve when working with others. But none of that is the topic here.
Elizabeth Mattijsen, who is responsible for all this Raku internationalization magic, has created a template internationalization module for the Klingon language (yep: aliens get to be the first to use localizations of a Terran computer language)2.
So I git cloned the Klingon, and created a github repo for the Welsh. My git nick is finanalyst, so here's the terminal code lines:
git clone https://github.com/lizmat/L10N-TLH.git rakuast-L10N-Klingon
git clone https://github.com/finanalyst/rakuast-L10N-CY.git rakuast-L10N-Welsh
In the following, I shall call Elizabeth's repo the Klingon repo, and mine the Welsh repo. If you want to create your own language, the convention being followed is to name the language according to an ISO 639-1 supported language code, at least for the foreseeable future. You should also think of a filename extension (like .draig here) for programs in the new language (Raku cousin).
The two critical parts of the module are update-localization, and a root text file which we will call the localization map. It should be named by the language code: here it is called CY for Cymraeg, the Welsh language; for Klingon, it is TLH.
The update-localization utility from the Klingon repo looks for a file in the repo root directory with 2 or 3 upper-case characters in its name. This is taken as the localization map and is automatically converted into all the magical modules.
The biggest step is to translate the terms to be stored in CY
. The template for the localization map can be found at Github Raku localizations. To get this as a local text file, I used the following terminal code to download the template in to my working directory.
curl 'https://raw.githubusercontent.com/Raku/L10N/main/TEMPLATE' > CY
The pristine form of CY
contains a few lines of comment (starting with the characters '# ', note the space), and then a number of sections starting with
# KEY TRANSLATION
Within each section there is a key and then an English Raku keyword, eg.
#adverb-pc-delete delete
Note that it has been commented out with a single #. This means that the update-localization utility will ignore the line.
Now comes the translation part. Each significant commented line (a line with #
and no space at the start) has two parts: a KEY and a TRANSLATION, with some spaces between them. The translation process is to substitute the English Raku keyword with the Welsh word, and remove the #
. For example, the first significant line becomes
adverb-pc-delete dileu
When starting the translation process, and to see how the system works, it is sufficient to translate a minimum number of keys. (Eg., for the Draig program below, I only need eleven words.)
Once I have enough key words for the program, all that is needed is to run ./update-localization
. This then creates a directory tree under lib/
.
Here is a short program in Raku (English cousin), which we store in a file called 'simple.raku' in the root directory of the repo.
my $choice;
my $continue;
my @bad = <damn stupid nutcase>;
repeat {
$choice = prompt 'Type something, like a number, or a string: ';
say 'You typed in 「' ~ ( $choice ~~ any( @bad ) ?? '*' x $choice.chars !! $choice) ~ '」';
given $choice {
when 'dragon' { say "which is 'draig' in Welsh" }
when any( @bad ) { say "wash your mouth with soap" }
when IntStr { say "which evaluates to an integer ", $choice }
when RatStr { say "which evaluates to a rational number ", $choice }
default { say "which does not evaluate to a number "}
}
$continue = prompt 'Try again? If not type N: ';
} until $continue eq any(<N n>) ;
Try running it in a terminal where the working directory is the root directory of the repo, thus:
raku simple.raku
If you input some words, it will tell you the input is a string; if you input something naughty (well, only one of the three words 'damn stupid nutcase'), you will get another response; and then there are responses depending on whether the number is an integer or a rational.
The code uses 11 keywords, which I translated and put into CY
. Obviously, there are many strings that form the user interface, and these are hard-coded in this program in English. We are concerned at the moment with the infrastructure keywords that form the programming language.
Now let's translate the Raku program using a simple Raku utility called tr2draig.
We shall specify here that the Raku program is of the form somename.raku and that we want a Draig program of the form somename.draig.
The utility is the following Raku script:
#!/usr/bin/env raku
sub MAIN(
$filename where *.IO.f #= source file to be localized to Welsh
) {
$filename.IO.extension('draig').spurt: $filename.IO.slurp.AST.DEPARSE("CY")
}
Breaking the program down, #!/usr/bin/env raku
is standard for a script with execute permission.
$filename where *.IO.f #= ...
is a nice Raku idiom for a program called from a terminal. The program expects a string that names a file. It checks that the filename exists and is of type 'f'. If not, then an error message will be provided from the comment following #=
.
$filename.IO.extension('draig').spurt:
takes the filename, creates a new file with the extension 'draig' replacing the previous extension (which was 'raku'), then spurts text into it, the text it uses being generated by the expression after the :
.
$filename.IO.slurp.AST.DEPARSE("CY") takes the filename (which has extension 'raku'), makes it into a filehandle, slurps (sucks) in the text that is in the file, parses the text as a Raku program into an Abstract Syntax Tree (AST), and then deparses the syntax tree using the new Welsh keywords into a new program in Welsh.
For reasons related to distributing Raku software, I have placed the utility in the bin/ directory. There are two ways to get a copy of these files: either by creating a clone of my Github repository (the url is given above), or by installing the Raku distribution, as zef install "L10N::CY". If zef is set up in a typical way, then the utilities below can be run without specifying the path.
The translation utility is run like this
bin/tr2draig simple.raku
This produces a file simple.draig
, which contains
fy $choice;
fy $continue;
fy @bad = <damn stupid nutcase>;
ailadrodd {
$choice = prydlon "Type something, like a number, or a string: ";
dywedyd "You typed in 「" ~ ($choice ~~ unrhyw(@bad) ?? "*" x $choice.golosg !! $choice) ~ "」";
a-roddwyd $choice {
pryd "dragon" {
dywedyd "which is 'draig' in Welsh"
}
pryd unrhyw(@bad) {
dywedyd "wash your mouth with soap"
}
pryd IntStr {
dywedyd "which evaluates to an integer ", $choice
}
pryd RatStr {
dywedyd "which evaluates to a rational number ", $choice
}
rhagosodedig {
dywedyd "which does not evaluate to a number "
}
}
$continue = prydlon "Try again? If not type N: "
} hyd $continue eq unrhyw(<N n>)
Now we want a way to run draig programs. The easiest way is to create another Raku program, draig, which we place in the bin/ directory. bin/draig has the following content:
#!/usr/bin/env raku
sub draig(*@_) {
%*ENV<RAKUDO_RAKUAST> = 1;
%*ENV<RAKUDO_OPT> = '-ML10N::CY';
run $*EXECUTABLE, @_;
}
multi sub MAIN() {
draig
}
multi sub MAIN(
$filename where *.IO.f #= source file to be run in Welsh
) {
draig $filename
}
Here's a gloss of the program:
sub draig(*@_) {... is a helper subroutine called later. It sets up environment variables, and preloads the localization module, before running Raku with the Welsh keywords.
multi sub MAIN() runs the sub draig (above) when no program is given. This puts the user into a REPL, where statements can be input directly, parsed and run immediately. However, draig will run using the Welsh keywords.
multi sub MAIN($filename where *.IO.f #= source file to be run in Welsh) handles the case when draig is given a filename. As explained above, the filename is tested for existence.
Now try running bin/draig simple.draig in a terminal.
If the RakuAST-L10N-CY distribution has been installed with zef, then all you will need is draig simple.draig.
The running code produces exactly the same output as the English Raku program. The user interface output is still in English, and for completeness, I should translate all of the text strings to Welsh as well.
At this point, we can translate any English version of a Raku program into a Draig program, and draig
will run it, but only if the Raku program uses the 11 keywords I translated.
In order to create a full localization, all of the Translation values need to be converted to Welsh. The first step (and I really must re-emphasise it is a first step) is to use an automated translation tool. A correct localization will need first-language Welsh speakers to go through the CY
file and correct the translations.
At the time of writing, the localization has not been properly verified, so it has not yet been added to the official Raku localizations.
For the automated translation, I have created the directory automation/
. I again downloaded the TEMPLATE into a CY
file in the automation/
directory.
I have written some automation helper utilities, namely:
find-untranslated, which takes a CY file and splits it into two new files, with line numbers at the start of each line to help match them up later. One file is partial.txt with the starting key and comment lines, and the second file is to-be-translated.txt. Both contain approximately 700 lines.
combine-translated, which takes partial.txt and another file translated.txt (see below) to create a new CY file.
Next I copy/pasted the lines for translation (from the file to-be-translated.txt) into Google's translate-to-Welsh page. The operation took a couple of copy/pastes due to size limitations, but the text is not overly large.
The translated text can be copied straight back to a new file (translated.txt), and then recombined with partial.txt to create CY.
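The real utilities live in the repo's automation/ directory; what follows is only a minimal sketch of the find-untranslated idea, written against the localization-map layout described above (file names as given, everything else is illustrative):
# Hypothetical sketch, not the repo's actual utility: split the localization map
# into the part we keep and the part we send to a translator, prefixing each
# line with its number so the two files can be matched up again afterwards.
sub MAIN($map = 'CY') {
    my (@partial, @to-translate);
    for $map.IO.lines.kv -> $n, $line {
        if $line ~~ /^ '#' \S / {
            # a significant commented line: "#key   english-word"
            my ($key, $word) = $line.substr(1).words;
            @partial.push: "$n $key";
            @to-translate.push: "$n $word";
        }
        else {
            @partial.push: "$n $line";
        }
    }
    'partial.txt'.IO.spurt: @partial.join("\n") ~ "\n";
    'to-be-translated.txt'.IO.spurt: @to-translate.join("\n") ~ "\n";
}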
As mentioned above, suppose a Welsh-speaker using y Ddraig
runs into a programming problem, a syntax error or logic not working as the programmer assumes. An English speaking programmer will probably not be able to help.
But ... a .draig program can be retranslated back to the canonical form of Raku. This is done by a utility called tr2raku. It is almost the inverse of tr2draig, but instead of replacing the file extension .draig with .raku, we add .raku on to the filename so that it is clear it is a canonicalisation of a Raku cousin.
The utility bin/tr2raku has the following contents.
#!/usr/bin/env raku
sub MAIN(
$filename where *.IO.f #= Welsh source file to be turned to canonical form
) {
$filename.IO.extension('raku', :0parts).spurt: $filename.IO.slurp.AST("CY").DEPARSE
}
The difference is that the language signifier (CY) is now a parameter to the AST method, rather than the DEPARSE method.
There should be no reason why this recipe cannot be applied to Mandarin, Hindi, or Japanese.
The problems stem from the development history of Raku. Error messages are in English, and so Raku cousins, like Draig, will have English error messages.
The problem is not insurmountable, but it will take a lot of translator hours.
Another problem concerns helper modules, for example JSON::Fast, which imports/exports structured data from/to .json files into Raku data structures. The module has two main routines, to-json and from-json. These names are set by the module, not by Raku.
A program in y Ddraig
will be able to access all Raku modules without restriction, but it will need to use the canonical (English) names.
However, if many Raku localizations come into being, and a user base for them develops, these are all soluble problems.
Footnotes
A reader may wonder why the language is Y ddraig, but draig is given in dictionaries as the translation for dragon. Well ..., draig is a feminine word, and the definite particle Y triggers a mutation in the next feminine word, so d mutates to dd.
My next project is to create a localization with Egyptian hieroglyphs
Sparky is a task runner that allows teams to automate their daily tasks by creating #Rakulang scenarios and customizable UIs.
mkdir -p ~/.sparky/projects/build-rakudo/
nano ~/.sparky/projects/build-rakudo/sparky.yaml
sparrowdo:
no_sudo: true
no_index_update: true
bootstrap: false
format: default
allow_manual_run: true
vars:
-
name: version
default: "2023.12"
type: input
-
name: arch
values: [ alpine, debian, ubuntu ]
type: select
default: alpine
nano ~/.sparky/projects/build-rakudo/sparrowfile
#!raku
task-run "files/build-rakudo", %(
rakudo_version => tags()<version>,
arch => tags()<arch>,
);
In this imaginary scenario we want to build a Rakudo docker image for a specific Rakudo version and Linux distribution:
mkdir -p ~/.sparky/projects/build-rakudo/files/build/
nano ~/.sparky/projects/build-rakudo/files/build/task.bash
cat << HERE > $cache_root_dir/install-rakudo.sh
mkdir ~/rakudo && cd $_
curl -LJO https://rakudo.org/dl/rakudo/rakudo-$1.tar.gz
tar -xvzf rakudo-*.tar.gz
cd rakudo-*
perl Configure.pl --backend=moar --gen-moar
make
make install
HERE
if test $(config arch) == "alpine"; then
cat << HERE > $cache_root_dir/Dockerfile
FROM alpine:latest
ARG rakudo_version=2023.12
RUN apk update && apk add git make gcc musl-dev
RUN adduser -D -h /home/worker -s /bin/bash -G wheel worker
USER worker
ENV PATH="/home/worker/rakudo-$rakudo_version/install/bin:/home/worker/rakudo-$rakudo_version/install/share/perl6/vendor/bin:/home/worker/rakudo-$rakudo_version/install/share/perl6/core/bin:/home/worker/rakudo-$rakudo_version/install/share/perl6/site/bin:/home/worker/.raku/bin:${PATH}"
COPY install-rakudo.sh .
RUN sh ./install-rakudo.sh $rakudo_version
HERE
elif test $(config arch) == "debian"; then
cat << HERE > $cache_root_dir/Dockerfile
FROM debian:latest
ARG rakudo_version=2023.12
ENV DEBIAN_FRONTEND=noninteractive
RUN apt-get update -q -o Dpkg::Use-Pty=0
RUN apt-get install -q -y -o Dpkg::Use-Pty=0 build-essential curl git
RUN useradd -m -d /home/worker --shell /bin/bash worker
USER worker
ENV PATH="/home/worker/rakudo-$rakudo_version/install/bin:/home/worker/rakudo-$rakudo_version/install/share/perl6/vendor/bin:/home/worker/rakudo-$rakudo_version/install/share/perl6/core/bin:/home/worker/rakudo-$rakudo_version/install/share/perl6/site/bin:/home/worker/.raku/bin:${PATH}"
COPY install-rakudo.sh .
RUN sh ./install-rakudo.sh $rakudo_version
HERE
elif test $(config arch) == "ubuntu"; then
cat << HERE > $cache_root_dir/Dockerfile
FROM ubuntu:latest
ARG rakudo_version=2023.12
ENV DEBIAN_FRONTEND=noninteractive
RUN apt-get update -q -o Dpkg::Use-Pty=0
RUN apt-get install -q -y -o Dpkg::Use-Pty=0 build-essential curl git
RUN useradd -m -d /home/worker --shell /bin/bash worker
USER worker
ENV PATH="/home/worker/rakudo-$rakudo_version/install/bin:/home/worker/rakudo-$rakudo_version/install/share/perl6/vendor/bin:/home/worker/rakudo-$rakudo_version/install/share/perl6/core/bin:/home/worker/rakudo-$rakudo_version/install/share/perl6/site/bin:/home/worker/.raku/bin:${PATH}"
COPY install-rakudo.sh .
RUN sh ./install-rakudo.sh $rakudo_version
HERE
else
echo "$(config arch) is not supported"
exit 1
fi
docker build $cache_root_dir/ \
-f $cache_root_dir/Dockerfile \
--build-arg rakudo_version=$(config rakudo_version) \
-t team/rakudo:$(config arch)-$(config rakudo_version)
docker push team/rakudo:$(config arch)-$(config rakudo_version)
Once we've created all necessary files we can navigate to a "build-rakudo" project in Sparky UI and hit "build now" button:
By choosing a Rakudo version and Linux distribution and launching a new build, within a few minutes we get a new Rakudo docker image published to an internal docker registry.
This is just a simple example of how one can use Sparky for automation; there is more to it than that, but the main idea is to spin up jobs quickly with simple web interfaces generated from YAML specifications, resulting in various kinds of centralized tools available for the needs of your team.
For scaling, it's even possible to convert scenarios into plain Raku modules or Sparrow plugins and distribute them across many teams.
Conclusion
Sparky is a versatile task runner that enables small teams of developers on a self-hosted platform to automate all the boring and manual stuff they might have during the development cycle. Check it out for more details - https://raku.land/zef:melezhik/Sparky
Over on Reddit zeekar wasn’t too happy about Raku’s love of Seq. Its immutability can be hindering indeed.
my @nums = [ [1..10], ];
@nums[0] .= grep: * % 2;
@nums[0].push(11); # We can't push to a Seq.
I provided a solution I wasn’t happy with. It doesn’t DWIM and is anything but elegant. So while heavily digesting on my sofa (it is this time of the year), the problem kept rolling around in my head. At first I wanted to wrap Array.grep(), but that would be rather intrusive and likely break Rakudo itself. After quite a bit of thinking, I ended up with the question: how can I have an indexable container (aka Array) that will turn each value on assignment into a (sub-)Array?
my Array() @foo = [ 1..10, ];
dd @foo;
# Array[Array(Any)] @foo = Array[Array(Any)].new($[1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
@foo[0] .= grep: * % 2;
@foo[1] = 42;
dd @foo;
# Array[Array(Any)] @foo = Array[Array(Any)].new($[1, 3, 5, 7, 9], $[42])
The answer is obvious. By telling the compiler what I want! Coercion types have become really hard to distinguish from magic.
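Coercion types are not limited to typed variable declarations; they work in ordinary routine signatures too. A tiny aside of my own, not from the original post:
sub answer(Int() $n) { $n + 1 }   # Int() coerces whatever comes in to an Int
say answer("41");                 # 42 -- the Str "41" is coerced before the body runs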
I wish you all a Merry Christmas and the very best questions for 2024.
Note: This post is also available as a gist if you find that format more readable.
This research was conducted while preparing an upcoming Raku Advent Calendar post. The Raku code uses a basic supply pipeline to feed $volume
objects through a validation stage that requires a CRC32 check before going to the output sink, which prints the processing time of the validation stage.
The "reaction graph" is designed to simulate a stream processing flow, where inputs arrive and depart via Candycane™ queues (that's the name of Santa's Workshop Software's queueing service, in case you weren't familiar).
The entire scenario is contrived in that CRC32 was chosen due to native implementation availability in both Raku and Zig, allowing comparison. It's not an endorsement of using CRC32 in address validation to deliver Santa's, or anyone's, packages.
Also, thanks to the very helpful folks at ziggit.dev for answering my newbie question in depth.
The source code:
At larger volumes, Raku struggles with the initialization speed of the $volume
objects that are instantiated. I replaced the native Raku class with one written in Zig, using the is repr('CStruct')
trait in Raku and the extern struct
qualifier in Zig.
In Zig I use a combination of an arena allocator (for the string passed from Raku) and a memory pool (designed to quicklymake copies of a single type, exactly fitting our use case) to construct Package
objects.
Additionally, for Raku+Zig the CRC32 hashing routine from Zig's stdlib
is used via a tiny wrapper function.
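The post does not show the binding code itself, but the general shape of a CStruct class plus a small native wrapper looks roughly like the sketch below; the library name, field names and the wrapper's signature are my assumptions, not the author's actual code.
use NativeCall;

# Hypothetical Raku-side mirror of the Zig extern struct; real field names are not shown in the post.
class Package is repr('CStruct') {
    has Str    $.address;
    has uint32 $.crc32;
}

# Hypothetical tiny wrapper exported from the Zig library with the C ABI,
# delegating to Zig's stdlib CRC32 implementation.
sub zig_crc32(Str $data --> uint32) is native('packagecheck') { * }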
A --bad-packages
option is provided by both Raku scripts, which makes 10% of the objects have a mismatched address/CRC32 pair.
The library tested was compiled with -Doptimize=ReleaseFast
.
Batches are repeated $batch
times, which defaults to 5.
All results from an M2 MacBook Pro.
This test and its results are only intended to reflect the case where an object is constructed in Zig based on input from Raku. It is not intended to be a test of Zig's native speed in the creation of structs.
There is a call to sleep
that gives time -- 0.001
seconds -- to get the react
block up and running before emitting the first True
on the $ticker-supplier
. This affects overall runtime but not the batch or initialization metrics.
The speed of Raku+Zig was so fast that the tool used to measure these details (cmdbench
) could not find results in ps
for the execution because it had already finished. These are marked as Unmeasured
.
In the next iteration of this research, there should be two additional entries in the data tables below for:
| Volume | Edition | Runtime | Batch Time | Initialization | Max bytes |
|---|---|---|---|---|---|
| 10,000 | Raku | 1.072s | 1: 0.146596686s, 2: 0.138983732s, 3: 0.142380065s, 4: 0.136050775s, 5: 0.134760525s | 0.008991746s | 180240384 |
| 10,000 | Raku+Zig | 0.44s | 1: 0.010978411s, 2: 0.006575705s, 3: 0.004145623s, 4: 0.004280415s, 5: 0.00468929s | 0.020358033s | Unmeasured |
| 10,000 | Raku (bad-packages) | 1.112s | 1: 0.157788932s, 2: 0.149544686s, 3: 0.156293433s, 4: 0.151365477s, 5: 0.147947436s | 0.008059955s | 196263936 |
| 10,000 | Raku+Zig (bad-packages) | 0.463s | 1: 0.031300276s, 2: 0.01006562s, 3: 0.010693328s, 4: 0.011056994s, 5: 0.010770828s | 0.010954495s | Unmeasured |
The Raku+Zig solution wins on performance, but loses the initialization race. Raku makes a decent showing here, which reflects how far it has come performance-wise.
| Volume | Edition | Overall | Batch Time | Initialization | Max bytes |
|---|---|---|---|---|---|
| 100,000 | Raku | 7.163s | 1: 1.360029456s, 2: 1.32534014s, 3: 1.353072834s, 4: 1.346668338s, 5: 1.351110502s | 0.062402473s | 210173952 |
| 100,000 | Raku+Zig | 0.75s | 1: 0.079802007s, 2: 0.073638176s, 3: 0.053291894s, 4: 0.05087652s, 5: 0.050394687s | 0.05855585s | 241205248 |
| 100,000 | Raku (bad-packages) | 7.89s | 1: 1.496982355s, 2: 1.484494027s, 3: 1.497365023s, 4: 1.490810525s, 5: 1.492416774s | 0.060026016s | 209403904 |
| 100,000 | Raku+Zig (bad-packages) | 1.076s | 1: 0.16960934s, 2: 0.111172493s, 3: 0.110844786s, 4: 0.113021202s, 5: 0.111713535s | 0.051436311s | 242450432 |
We see Raku+Zig take first place in everything but memory consumption, which we can assume is a function of using the NativeCall bridge, not to mention my new-ness as a Zig programmer.
| Volume | Edition | Overall | Batch Time | Initialization | Max bytes |
|---|---|---|---|---|---|
| 1,000,000 | Raku | 68.081s | 1: 13.475302627s, 2: 13.161153845s, 3: 13.293998956s, 4: 13.364662217s, 5: 13.474755295s | 0.95481884s | 417103872 |
| 1,000,000 | Raku+Zig | 3.758s | 1: 0.788083286s, 2: 0.509883905s, 3: 0.492898873s, 4: 0.500868284s, 5: 0.498677495s | 0.575087671s | 514064384 |
| 1,000,000 | Raku (bad-packages) | 75.796s | 1: 14.940173822s, 2: 14.632683637s, 3: 14.866796226s, 4: 15.272903792s, 5: 15.027481448s | 0.704549212s | 396656640 |
| 1,000,000 | Raku+Zig (bad-packages) | 6.553s | 1: 1.362189763s, 2: 1.061496504s, 3: 1.069134685s, 4: 1.062746049s, 5: 1.061096044s | 0.528011288s | 462766080 |
Raku's native CRC32 performance is clearly lagging here. Raku+Zig keeps its domination except in the realm of memory usage. It would be hard to justify using the native Raku version strictly on its reduced memory usage, considering the performance advantage on display here.
A "slow first batch" problem begins to affect Raku+Zig. Running with bad-packages enabled slows down the Raku+Zig crc32 loop, hinting that there might be some optimizations on either the Raku or the Zig/clang side of things that can't kick in when the looped data is heterogeneous.
Dynamic runtime optimization sounds more like a Rakudo thing than a Zig thing, though.
| Volume | Edition | Runtime | Batch Time | Initialization | Max bytes |
|---|---|---|---|---|---|
| 10,000,000 | Raku | 704.852s | 1: 136.588638184s, 2: 136.851019628s, 3: 138.44696743s, 4: 139.777040922s, 5: 139.490784317s | 13.299274221s | 2055012352 |
| 10,000,000 | Raku+Zig | 38.505s | 1: 8.843459877s, 2: 4.84300835s, 3: 4.991842433s, 4: 5.077245603s, 5: 4.939533707s | 9.375436134s | 2881126400 |
| 10,000,000 | Raku (bad-packages) | 792.1s | 1: 162.333803401s, 2: 174.815386318s, 3: 168.299796081s, 4: 162.643428135s, 5: 163.205406678s | 10.252639311s | 2124267520 |
| 10,000,000 | Raku+Zig (bad-packages) | 65.174s | 1: 14.41616445s, 2: 11.078961309s, 3: 10.662389991s, 4: 11.20240076s, 5: 10.614430063s | 6.778600235s | 2861596672 |
Pure Raku really struggles with a volume of this order of magnitude. But if you add in just a little bit of Zig, you can reasonably supercharge Raku's capabilities.
The "slow first batch" for Raku+Zig has been appearing in more understated forms in other tests. Here the first batch is over double the runtime of the second batch. What is causing this?
This doesn't seem to work. At least, I'm not patient enough. The process seems to stall, growing and shrinking memory but never finishing.
This is a preliminary report in blog post form based on a contrived code sample written for another, entirely different blog post. More data and deeper analysis will have to come later.
Zig's C ABI compatibility is clearly no put-on. It works seamlessly with Raku's NativeCall. Granted, we haven't really pushed the boundaries of what the C ABI can look like, but one of the core takeaways is that with Zig we can design that interface. In other words, we are in charge of how ugly, or not, it gets. Considering how dead simple the extern struct <-> is repr('CStruct') support is, I don't think the function signatures need to get nearly as gnarly as they do in C.
Sussing out the truth of that supposition will take some time and effort in learning Zig. I'm looking forward to it. My first stop will probably be a JSON library that uses Zig. I'm also going to be looking into using Zig as the compiler for Rakudo, as it might simplify our releases significantly.
According to Larry, laziness is a programmer’s virtue. The best way to be lazy is having somebody else do it. By my request, SmokeMachine kindly did so. This is not fair. We both should have been lazy and offloaded the burden to the CORE-team.
Please consider the following code.
my @many-things = (1..10).List;
sub doing-one-thing-at-a-time($foo) { ... }
say doing-one-thing-at-a-time(@many-things.all);
Rakudo goes out of its way to create the illusion that sub doing-one-thing-at-a-time can deal with a Junction. It can’t; the dispatcher does all the work of running the code in parallel. There are tricks we can play to untangle a Junction, but there is no guarantee that all values are produced. Junctions are allowed to short-circuit.
This was bouncing around in my head for quite some time, until it collided with my thoughts about Range
. We may be handling HyperSeq
and RaceSeq
wrong.
my @many-things = (1..10).List;
sub doing-one-thing-at-a-time($foo) { ... }
say doing-one-thing-at-a-time(@many-things.hyper(:degree<10>));
As with Junctions doing dispatch-magic to make hyper/race just work, moving the handling to the dispatcher would move the decision from the callee to the caller and, as such, from the author of a module to its user. We can do that by hand already with .hyper.grep(*.foo) or other forms of boilerplate. In Raku-land we should be able to do better and provide a generalisation of transforming calls with the help of the dispatcher.
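For comparison, this is roughly the boilerplate the caller has to write today to get the parallelism explicitly, without any dispatcher magic (my own example):
my @many-things = (1..10).List;
sub doing-one-thing-at-a-time($foo) { $foo² }

# the caller opts in to parallelism explicitly and maps the sub over the hyper sequence
say @many-things.hyper(:degree(4)).map(&doing-one-thing-at-a-time);
# (1 4 9 16 25 36 49 64 81 100)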
I now know what to ask Santa for this year.
My version of JSON::Class is now released. The previous post explains why this is worth a note.
Lately, some unhappiness has popped up about Range and its incomplete numericaliness. Having just one blog post about it is clearly not enough, given how big Ranges can be.
say (-∞..∞).elems;
# Cannot .elems a lazy list
in block <unit> at tmp/2021-03-08.raku line 2629
I don’t quite agree with Rakudo here. There are clearly ∞ elements in that lazy list. This could very well be special-cased.
The argument has been made that many operators in Raku tell you what type the returned value will have. Is that so? (This question is always silly or unnecessary.)
say (1 + 2&3).WHAT;
# (Junction)
Granted, Junction is quite special. But so are Ranges. Yet Raku covers the former everywhere, while the latter feels incomplete. Please consider the following code.
multi sub infix:<±>(Numeric \n, Numeric \variance --> Range) {
(n - variance) .. (n + variance)
}
say 2.6 > 2 ± 0.5;
# True
my @heavy-or-light = 25.6, 50.3, 75.4, 88.8;
@heavy-or-light.map({ $_ ≤ 75 ± 0.5 ?? „$_ is light“ !! „$_ is heavy“ }).say;
# (25.6 is heavy 50.3 is heavy 75.4 is heavy 88.8 is heavy)
To me that looks like it should DWIM. It doesn’t, because &infix:«≤» defaults to coercing to Real and then comparing numerically.
This could easily be fixed by adding a few more multis and I don’t think it would break any production code. We already provide quite a few good tools for scientists. And those scientists do love their error bars — which are ranges. I would love for them to have another reason to use Raku over … that other language.
This will be a short one. I have recently released a family of WWW::GCloud modules for accessing Google Cloud services. Their REST API is, apparently, JSON-based. So, I made use of the existing JSON::Class. Unfortunately, it was missing some features critically needed for my work project. I implemented a couple of workarounds, but still felt that this is not the way it has to be. Something akin to LibXML::Class would be great to have…
There was a big “but” in this. We already have XML::Class, LibXML::Class, and the current JSON::Class. All are responsible for doing basically the same thing: de-/serializing classes. If I wanted another JSON serializer then I had to take into account that the name JSON::Class is already taken. There are three ways to deal with it:
Pick a different name for the new module.
Take over JSON::Class and re-implement it as a backward-incompatible version.
Release the new module under the same name, but with a different auth.
The first two options didn’t appeal to me. The third one is now about to happen.
I expect it to be a stress test for the Raku ecosystem as, to my knowledge, it's going to be the first case where two different modules share the same name but not the publisher.
As a little reminder:
Those who want to keep using the current module should refer to JSON::Class:auth<zef:jonathanstowe> in their dependencies and, perhaps, in their use statement.
My upcoming module will be JSON::Class:auth<zef:vrurg>.
There is still some time before I publish it because the documentation is not ready yet.
Let’s 🤞🏻.
In this post I will show you the solution to the Task 2 of this week’s Weekly Challenge 233. This time I was using the help of ChatGPT, which offered a very nice solution.
You are given an array of integers.
Write a script to sort the given array in increasing order based on the frequency of the values. If multiple values have the same frequency then sort them in decreasing order.
Example 1
Input: @ints = (1,1,2,2,2,3)
Output: (3,1,1,2,2,2)
'3' has a frequency of 1
'1' has a frequency of 2
'2' has a frequency of 3
Example 2
Input: @ints = (2,3,1,3,2)
Output: (1,3,3,2,2)
'2' and '3' both have a frequency of 2, so they are sorted in decreasing order.
Example 3
Input: @ints = (-1,1,-6,4,5,-6,1,4,1)
Output: (5,-1,4,4,-6,-6,1,1,1)
I asked ChatGPT the following: ‘Create a program. Here is the description of the task.’ To make sure the result is in the Raku programming language, I also added the following instruction in my local settings so that it knows which programming language I prefer by default.
The solution generated by ChatGPT was quite good, apart from a couple of small mistakes.
ChatGPT: Sure, here’s a Raku program that implements the described sorting algorithm for the given task:
sub frequency-sort(@arr) {
    my %freq;
    %freq{$_}++ for @arr;
    my @sorted = @arr.sort: { %freq{$^b} <=> %freq{$^a} || $^b <=> $^a };
    return @sorted;
}

my @ints = (1, 1, 2, 2, 2, 3);
my @sorted = frequency-sort(@ints);
say "Input: @ints";
say "Output: @sorted";
First, Raku does not interpolate bare arrays, so say "Input: @ints" prints the string literally: Input: @ints. Second, in the sorting rule, sorting by frequency must be ascending rather than descending, so we need to swap the placeholder variables $^b and $^a.
For the rest, ChatGPT did a good job. It computed the frequency using a loop, and I especially liked how it addressed the requirement to sort by two dimensions.
Here is the full program with the above-mentioned corrections:
sub frequency-sort(@arr) {
    my %freq;
    %freq{$_}++ for @arr;
    my @sorted = @arr.sort: { %freq{$^a} <=> %freq{$^b} || $^b <=> $^a };
    return @sorted;
}

my @tests = (1, 1, 2, 2, 2, 3),
            (2, 3, 1, 3, 2),
            (-1, 1, -6, 4, 5, -6, 1, 4, 1);

say frequency-sort($_) for @tests;
This program prints the following output for the given test cases:
$ raku ch-2.raku
[3 1 1 2 2 2]
[1 3 3 2 2]
[5 -1 4 4 -6 -6 1 1 1]
In this post, I will demonstrate my solution to another Task of The Weekly Challenge, week 233. Here’s how it reads:
You are given an array of words made up of alphabets only.
Write a script to find the number of pairs of similar words. Two words are similar if they consist of the same characters.
Example 1
Input: @words = ("aba", "aabb", "abcd", "bac", "aabc")
Output: 2
Pair 1: similar words ("aba", "aabb")
Pair 2: similar words ("bac", "aabc")
Example 2
Input: @words = ("aabb", "ab", "ba")
Output: 3
Pair 1: similar words ("aabb", "ab")
Pair 2: similar words ("aabb", "ba")
Pair 3: similar words ("ab", "ba")
Example 3
Input: @words = ("nba", "cba", "dba")
Output: 0
There’s a subtle point that may need extra comments. In the second example, all three words are constructed of the same two letters, a and b. So, all three words match the definition of a ‘similar’ word. But as the task needs to find pairs, we need to construct all the possible pairs out of those three words.
In my solution, I chose to use the handy classify method. For an array, it creates a hash where the keys are the classification values and the values are the lists of the input elements that fall into that class.
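A tiny illustration of classify on its own (my example, not from the post):
say (1..10).classify({ $_ %% 2 ?? 'even' !! 'odd' });
# {even => [2 4 6 8 10], odd => [1 3 5 7 9]} (key order may vary)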
Here is the whole first program together with all the test cases provided in the description. The program maps every word to a corresponding string that consists of the sorted unique letters in the word.
my @tests = ["aba", "aabb", "abcd", "bac", "aabc"], ["aabb", "ab", "ba"], ["nba", "cba", "dba"]; for @tests -> @words { say @words.classify(*.comb.unique.sort.join).grep(*.value.elems > 1); }
For example, the word aba will be associated with the key ab. The program prints the following output:
$ raku ch-1.raku
(ab => [aba aabb] abc => [bac aabc])
(ab => [aabb ab ba])
()
The format of the output differs from the examples, but it can be enhanced if needed. My goal was to create a compact solution.
But I would assume that you’d be interested in looking at what classify
produces. I am also curious. For the same @tests
, it returns the following three hashes:
{ab => [aba aabb], abc => [bac aabc], abcd => [abcd]}
{ab => [aabb ab ba]}
{abc => [cba], abd => [dba], abn => [nba]}
As you see, each string was put into one of the classification bins.
The second part of the task is to find pairs. After the grep, we have already filtered out everything that has fewer than two elements, so if data passed through this filter, there will be at least one pair. For bigger arrays, we can use another of Raku’s built-in mechanisms: the combinations method.
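As a quick illustration of what combinations gives us (my own example):
say <aabb ab ba>.combinations(2);
# ((aabb ab) (aabb ba) (ab ba))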
The updated main loop of the program now looks like this.
for @tests -> @words { say "Test case: ", @words; my %classification = @words.classify(*.comb.unique.sort.join).grep(*.value.elems > 1); my $pairs = 0; for %classification.kv -> $k, $v { my @pairs = $v.combinations(2); $pairs += @pairs.elems; say "$k: ", @pairs; } say "Answer: $pairs pair{$pairs == 1 ?? '' !! 's'}.\n"; }
The ‘redundant’ code here is added just to have a more detailed output so that we can see which pairs were actually found. Let us look at the output for the initial test cases:
$ raku ch-1.raku
Test case: [aba aabb abcd bac aabc]
ab: [(aba aabb)]
abc: [(bac aabc)]
Answer: 2 pairs.

Test case: [aabb ab ba]
ab: [(aabb ab) (aabb ba) (ab ba)]
Answer: 3 pairs.

Test case: [nba cba dba]
Answer: 0 pairs.
On this page, I’ll briefly cover the solutions to the tasks for this week’s Weekly Challenge #231.
You are given an array of distinct integers.
Write a script to find all elements that is neither minimum nor maximum. Return -1 if you can’t.
Example 1
Input: @ints = (3, 2, 1, 4)
Output: (3, 2)
The minimum is 1 and maximum is 4 in the given array. So (3, 2) is neither min nor max.
Example 2
Input: @ints = (3, 1)
Output: -1
Example 3
Input: @ints = (2, 1, 3)
Output: (2)
The minimum is 1 and maximum is 3 in the given array. So 2 is neither min nor max.
Here is my original solution in the Raku programming language.
sub solve(@data) { @data.grep: * != (@data.min, @data.max).any }
As the task requires that we print -1 when there are no elements in the output, let us update the code to satisfy this requirement:
sub solve(@data) { (@data.grep: * != (@data.min, @data.max).any) || -1 }
The * in this code effectively stands in for the $_ variable. If you prefer, you may use $_ instead, but you’ll need parentheses in that case. So, instead of @data.grep: * != ..., you need @data.grep({$_ != ...}), which may be less clear for some people.
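Both spellings behave the same; a quick check (my example):
say (1..5).grep: * %% 2;        # (2 4)
say (1..5).grep({ $_ %% 2 });   # (2 4)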
Finally, let us use some math notation and replace calling the .any
method with a ‘contains’ operator:
sub solve(@data) { (@data.grep: * ∉ (@data.min, @data.max)) || -1 }
Well, actually, ‘does not contain’. And this is my final solution.
Note that you may want to use the .minmax method instead of two calls to .min and .max, but .minmax returns a range, which is not that suitable for this task.
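For completeness, this is what .minmax returns for the first test case (my example):
say (3, 2, 1, 4).minmax;    # 1..4 -- a Range, not a two-element list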
Adding some test cases and passing them to the solve
function:
my @tests = (3, 2, 1, 4), (3, 1), (2, 1, 3);
say solve($_) for @tests;
The program prints the expected output:
$ raku ch-1.raku
(3 2)
-1
(2)
You are given a list of passenger details in the form “9999999999A1122”, where 9 denotes the phone number, A the sex, 1 the age and 2 the seat number.
Write a script to return the count of all senior citizens (age >= 60).
Input: @list = ("7868190130M7522","5303914400F9211","9273338290F4010")
Output: 2
The age of the passengers in the given list are 75, 92 and 40.
So we have only 2 senior citizens.
Input: @list = ("1313579440F2036","2921522980M5644")
Output: 0
Apparently, the solution requires extracting information from a string in a specific format. It is not quite clear from the description whether the strings always contain the same number of characters, and thus whether the age and seat number are always two-digit values, but let’s use this assumption.
As we do not need any other information from the ticket code, there is no need to parse it fully, so I preferred anchoring around the only letter in the string and considering the next two digits as the age. Of course, you may make it simpler and just extract the two digits counting from the end of the string.
sub is-senior($ticket) { ~($ticket ~~ / <alpha> (\d\d) /)[0] >= 60 }
Unlike Perl 5, Raku ignores spaces in regexes by default, so I added some air to it. On the other hand, extracting matches may seem a bit more complicated.
For the first given example (see task’s description), the Match object contains the following information:
「M75」
 alpha => 「M」
 0 => 「75」
So, I am taking the 0th element using [0] and stringifying it with the ~ prefix operator.
In essence, the task has been solved. Let’s add the test cases and run them:
my @tests = ('7868190130M7522', '5303914400F9211', '9273338290F4010'),
            ('1313579440F2036', '2921522980M5644');

for @tests -> @tickets {
    say +@tickets.grep({is-senior($_)});
}
The program prints:
$ raku ch-2.raku
2
0
* * *
The second stage in the process to update RakuDoc is now over and the third (GAMMA review) stage is starting. In order not to repeat some history, please take a look at Revising Rakudoc.
An online version is available of the proposed RakuDoc language.
The whole of the Raku documentation suite is written in RakuDoc.
About half of the original design ideas outlined in S26 were documented in current POD6. Some of the ideas were available, but not documented. Some instructions were not realised at all.
It should be remembered that RakuDoc is parsed by the compiler (eg. Rakudo) as part of a Raku program, and is then rendered by the renderer (eg. Raku::Pod::Render
) into (for example) HTML. When I use the word 'implemented', I mean that a RakuDoc instruction is properly parsed and rendered. Some of the instructions defined in S26 were parsed by Rakudo, but not rendered, and some were not parsed properly or at all, so could not be rendered.
The revision process has therefore identified and rectified the parsing deficiencies, and identified the rendering flaws. RakuDoc is correctly parsed only on the most recent versions of Rakudo, which at the time of writing have yet to be released. Raku::Pod::Render
still does not handle RakuDoc in its entirety.
It became clear that RakuDoc serves two inter-related use cases: documenting the program in which it is embedded, and providing material that is rendered separately into (for example) HTML or an ebook.
RakuDoc had a simple table markup, which is very similar to the Markdown syntax. It worked, but the simplicity of the syntax was at the cost of flexibility.
Looking around at other ways of specifying a table, we identified two paradigms (there may be more), namely the one used by HTML and the one used by the GTK grid widget. Both of them allow for cells that span more than one column or row, and both allow for embedding (eg. a table inside a cell of a table).
After several iterations, a new procedural model was created and rendered. The design allows for spanning and embedding, but it also allows an author to specify a table row by row, or column by column, or even using a mixture of both.
An example showing a markup using both rows and columns can be seen in the online draft.
A semantic block is a section of text that should be easily available to another software tool, or can be moved around the final document.
For example, a section on the authors of a document (including contact or affiliations) is most easily written at the top of the document, but often it is better to place the information towards the bottom of the text.
This is done by creating a semantic block (simply by naming the block in uppercase letters). The block can be hidden from view by adding the metadata option :hidden
. All the data is placed in a special structure.
The rendered text can be placed in the document later using the P<>
instruction, or it can be accessed by another tool that may only be wanting the VERSION or LICENSE.
One of the strengths of RakuDoc is the ability to add optional metadata to blocks of text.
The new version of the defining document explains this concept in more detail. Metadata options are optional, with reasonable defaults being assumed. This means that a short form of the block is sufficient in most cases.
In the description above, the option :hidden
was mentioned. Another example, is :caption
. Suppose you want to write a semantic block called =AUTHORS
at the start of the document, but you want it to appear later in the document as Article authors
, then you could specify it as follows:
=for AUTHORS :caption<Article authors> :hidden
A. N. Writer, socMedia nic @psuedonym
M. Z. Orator, socMedia nic @politician
Article text continues
Pages later
P<semantic: AUTHORS>
It is possible to include a link L<for reference see | #A very long title somewhere in the text>
where the text on the right-hand side of the |
is a heading. However, this can become tiresome if you want to include several links to the same place.
So, a metadata option :id
can be included in a heading. This allows you to do the following:
=for head3 :id<lnk>
How to correctly link to other places in a manual
Pages of text
Properly linking is important, L<see for example|#lnk>
RakuDoc has instructions for block level text, such as headings, paragraphs, code.
Typically blocks will be included in the Table of Contents.
It also has markup instructions that work inline, and which do not (typically) affect the ToC.
For example, a simple markup instruction is C< text >
, which renders like text
. I have used the Markdown equivalent here. In RakuDoc, everything between the C<
and >
is verbatim and styled differently to normal text, just like the Markdown code quotes. However, RakuDoc also has V< text >
which treats everything inside the angle brackets as verbatim but does not style it differently.
A new markup instruction in RakuDoc is M< text | metadata>
. A renderer will place the text in the rendered text, but will also provide a mechanism for the user to take the metadata and provide new functionality. For instance, M< fa-copy | font awesome v5 >
could be interpreted to insert the font-awesome
icon called fa-copy
into the text. Or M< Buy now | PayPal, database-id >
could expose the API for the PayPal payment platform.
RakuDoc is inherently customisable. It is also designed to be output neutral (although at the moment HTML is the most common output form). Semantic blocks can be invented within a document, and a renderer can allow for other user-defined blocks and markup instructions to be created.
However, RakuDoc is specific about naming rules. A built-in block must be all lower case, and renderers should not allow user-defined blocks to use all lower case. A semantic block is all upper case. And a user-defined block must have at least one upper-case letter and one lower-case letter.
All markup instructions, which are inline instructions, must be a single Unicode character with the property UPPER. Built-in markup instructions are the ASCII characters and Δ
. All other codes can be used.
The naming rules have been created to ensure that even if a user-defined block or markup becomes popular, it is not a part of the RakuDoc standard. Renderers are only required to implement the RakuDoc standard, and may render other blocks, or not.
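To make the naming rules concrete, here is a small hypothetical sketch: code is a built-in (all lower case) block, SYNOPSIS is a semantic (all upper case) block, and MyDiagram is an invented user-defined block (mixed case) that only a suitably customised renderer would know about:
=begin code
say 'hello';
=end code

=begin SYNOPSIS
A one-line summary of the module.
=end SYNOPSIS

=begin MyDiagram
Content to be handled by a user-supplied template.
=end MyDiagram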
These are some of the interesting additions to RakuDoc that are being proposed. There are more.
Since the Gamma review stage is now underway, it is almost certain that there will be more changes, because the revision is now open to the Raku community for comment and requests. Discussion is open both for the language design and for the explanation of the design.
Admittedly, community requests for changes to the overall design will face significant resistance from the main authors, in order to maintain backwards compatibility with the previous version of RakuDoc and the integrity of the underlying paradigms. New block or inline instructions will be more readily considered, but requests for examples, explanation, and greater clarity will be very much appreciated.
I was always concerned about making things easier.
No, not this way. A technology must be easy to start with, but it must also make its advanced or fine-tunable features easy to access. Let’s have an example of the former.
This post is a quick hack, no proof-reading or error checking is done. Please, feel free to report any issue.
Part of my ongoing project is to deal with JSON data and deserialize it into Raku classes. This is certainly a task
for JSON::Class
. So far, so good.
The keys of JSON structures tend to use lower camel case, which is OK, but we like
kebabing in Raku. Why not, there is
JSON::Name
. But using it means annotating every attribute by hand.
There are roles. At the point I came to the final solution I was already doing something like this:
class SomeStructure does JSONRecord {...}
Then there is AttrX::Mooish
, which is my life vest on many occasions:
use AttrX::Mooish;
class Foo {
has $.foo is mooish(:alias<bar>);
}
my $obj = Foo.new: bar => "the answer";
say $obj.foo; # the answer
Apparently, this way it would still mean a lot of manual interaction with aliasing, and that’s what I was doing for a while until I realized that there is a better way. But we’ll get back to this later…
And, eventually, there are traits and MOP.
That’s the easiest part. What I want is to make makeThisName
look like make-this-name
. Ha, big deal!
unit module JSONRecord::Utils;
our sub kebabify-attr(Attribute:D $attr) {
if $attr.name ~~ /<.lower><.upper>/ {
my $alias = (S:g/<lower><upper>/$<lower>-$<upper>/).lc given $attr.name.substr(2);
...
}
}
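Just to see the transformation itself in action, outside of any attribute handling (a quick standalone check):
my $name = 'aLongAttrName';
say (S:g/<lower><upper>/$<lower>-$<upper>/).lc given $name;  # a-long-attr-name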
I don’t export the sub because it’s mostly for internal use. Should somebody need it for other purposes, it’s a rare enough case that a long name like JSONRecord::Utils::kebabify-attr($attr) should not be an issue.
The sub is not optimal; it’s what I came up with while experimenting with the approach. The number of method calls and regexes can be reduced.
I’ll get back later to the yada-yada-yada up there.
Now we need a bit of MOP magic. To handle all attributes of a class we need to iterate over them and apply the aliasing. The first thing that comes to mind is to use the role body, because it is invoked early, at class composition time:
unit role JSONRecord;
for ::?CLASS.^attributes(:local) -> $attr {
# take care of it...
}
Note the word “early” I used above. It actually means that when the role’s body is executed there are likely more roles waiting for their turn to be composed into the class. So, there are likely more attributes still to be added to the class.
But we can override the compose_attributes method of our target ::?CLASS’s Metamodel::ClassHOW and rest assured that none will be missed:
unit role JSONRecordHOW;
use JSONRecord::Utils;
use AttrX::Mooish; # needed for the AttrX::Mooish::Attribute check below
method compose_attributes(Mu \obj, |) {
for self.attributes(obj, :local) -> $attr {
# Skip if it already has `is mooish` trait applied – we don't want to mess up with user's intentions.
next if $attr ~~ AttrX::Mooish::Attribute;
JSONRecord::Utils::kebabify-attr($attr);
}
nextsame
}
Basically, that’s all we currently need to finalize the solution. We can still use the role’s body to implement the key elements of it:
unit role JSONRecord;
use JSONRecordHOW;
unless ::?CLASS.HOW ~~ JSONRecordHOW {
::?CLASS.HOW does JSONRecordHOW;
}
Job done! Don’t worry, I haven’t forgotten about the yada-yada-yada above!
But…
The original record role name itself is even longer than JSONRecord
, and it consists of three parts. I’m lazy. There are a lot of JSON structures and I want less typing for each. A trait? is jrecord
?
unit role JSONRecord;
use JSONRecordHOW;
multi sub trait_mod:<is>(Mu:U \type, Bool:D :$jrecord) is export {
unless type.HOW ~~ JSONRecordHOW {
type.HOW does JSONRecordHOW;
type.^add_role(::?ROLE);
}
}
Now, instead of class SomeRecord does JSONRecord
I can use class SomeRecord is jrecord
. In the original case the win is even bigger.
There is absolutely nothing funny about it. Just a common way to keep a reader interested!
Seriously.
The reason for the yada in that snippet is to avoid a distraction from the primary purpose of the example. Here is what is going on there:
I want AttrX::Mooish
to do the dirty work for me. Eventually, what is needed is to apply the is mooish
trait, as shown above. But traits are just subs. Therefore all that is needed now is:
&trait_mod:<is>($attr, :mooish(:$alias));
Because this is what Raku does internally when it encounters is mooish(:alias(...))
. The final version of the kebabifying sub is:
our sub kebabify-attr(Attribute:D $attr) {
if $attr.name ~~ /<.lower><.upper>/ {
my $alias = (S:g/<lower><upper>/$<lower>-$<upper>/).lc given $attr.name.substr(2);
&trait_mod:<is>($attr, :mooish(:$alias));
}
}
Since the sub is used by the HOW above, we can say that the &trait_mod:<is> would be called at compile time.
Now, it used to be:
class SomeRecord does JSONRecord {
has $.aLongAttrName is mooish(:alias<a-long-attr-name>);
has $.shortname;
}
Where, as you can see, I had to transfer JSON key names to attribute names, decide where aliasing was needed, add it, and make sure no mistakes were made and no attributes were missed.
With the above rather simple tweaks:
class SomeRecord is jrecord {
has $.aLongAttrName;
has $.shortname;
}
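Assuming the JSONRecord role and jrecord trait sketched above, construction can now use the kebab-cased keys directly, thanks to the generated aliases; a minimal check might look like:
my $rec = SomeRecord.new: a-long-attr-name => 'value', shortname => 42;
say $rec.aLongAttrName;   # value
say $rec.shortname;       # 42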
Job done.
Before I came to this solution I already had 34 record classes implemented using the old approach. Some are little, some are quite big. But it most certainly would have taken much less time had I had this trait at my disposal back then…
I have managed to finish one more article in the Advanced Raku For Beginners series, this time about type and object composition in Raku.
It’s likely to be a long time before I can write another.
Once, long ago, coincidentally a few people were asking the same question: how do I get a method object of a class?
Answers to the question would depend on particular circumstances of the code where this functionality is needed. One
would be about using MOP methods like .^lookup
, the other is to use the method name and indirect resolution on the invocant:
self."$method-name"(...)
. Both are the most useful approaches, in my view. But sometimes declaring a method as our
can be
helpful too:
class Foo {
our method bar {}
}
say Foo::<&bar>.raku;
Just don’t forget that this way we always get the method of class Foo
, even if a subclass overrides method bar
.
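A small sketch of that caveat, using hypothetical classes:
class Foo {
    our method bar { 'Foo.bar' }
}
class Bar is Foo {
    method bar { 'Bar.bar' }
}

my $m = Foo::<&bar>;   # always the &bar installed in Foo's package
say $m(Bar.new);       # Foo.bar - the override in Bar is not considered
say Bar.new.bar;       # Bar.bar - normal method dispatch still works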
In the earliest days of Raku, Damian Conway specified a documentation markup language to accompany it. Since it was modeled on Perl's POD it was called <sound of trumpets and dramatic pause>
POD6.
The Specification of POD6 (S26) was mostly incorporated without much extra explanation in the documentation suite. In this way, the description of POD6 was itself an illustration of many of the features it documented, and some that it did not document.
Since Raku is defined by its test suite, and not its documentation, there were other details of POD6 in the tests that were not documented, even in S26.
Raku developed and morphed, but POD6 remained. The tooling for rendering the documentation sources needed updating, and the documentation site had to be modernised.
A project of mine was to upgrade the basic renderer that would transform POD6 to HTML, but allow for developers to customise the templates for each type of POD6 block type. (The first Pod::To::HTML
renderer hard-coded representations of POD6 markup, eg. B<this is bold>
was <strong>this is bold</strong>
and could not be changed.)
It turned out that S26 allowed for much more than had been included in the first documentation sources, including custom blocks and custom markup.
The project to upgrade the original HTML renderer morphed into Raku::Pod::Render, and transforming a directory full of individual documentation sources into an interlinked and searchable set of documents required another layer of tooling, Collection. For example, collecting together all the pages that can be grouped as tutorials, or reference, or language, and creating a separate page for them automatically.
I covered these two projects in a presentation to RakuCon 2022.
Some of the original ideas in S26 had not been implemented, such as aliases and generic numbering. Other ideas had become outdated, such as a way to specify document encoding, which is now solved with Unicode.
In addition, RakuAST (see RakuAST for early adopters ) is on the horizon, which will radically change the speed of documentation processing.
There are also two implementations of POD6, one in Raku and one in Javascript, namely Alexandr Zahatski's Podlite.
This was an ideal time to revisit POD6 and recast it into Rakudoc, the new name for the markup language, with its new file extension ".rakudoc".
I was invited to the first Raku Core Summit and I put together a presentation about the changes I thought needed to be made based on my own experience, but also using comments from other developers.
We came to a number of consensus agreements about the minimal changes that were needed, and some extra functionality to handle new questions, such as documentation versioning.
It was also clear that Rakudoc (aka POD6) has two separate parts: components that interact closely with the program being documented, and components that will be rendered separately into HTML (or an ebook). The documentation file needs to make this clear.
I have now written the first draft of the revision and the documentation file that encapsulates it. An HTML version can be found at new-raku.finanalyst.org/language/rakudoc, alongside the old documentation file and the simple table implementation. I am planning future blogs to describe some of the proposed revisions.
However, none of the revisions will break existing POD6, so Rakudoc should be backwards compatible with POD6. The version at new-raku
is a VERY early first draft, and it will go through several review stages.
The first Raku Core Summit was organised by Elizabeth Mattijsen and hosted by Elizabeth and Wendy at their home. It was a really good meeting and I am sincerely grateful for their generosity and hospitality. The summit was also supported by The Perl and Raku Foundation, Rootprompt, and Edument.
The first Raku Core Summit, a gathering of folks who work on “core” Raku things, was held on the first weekend of June, and I was one of those invited to attend. It’s certainly the case that I’ve been a lot less active in Raku things over the last 18 months, and I hesitated for a moment over whether to go. However, even if I’m not so involved day to day in Raku things at the moment, I’m still keen to see the language and its ecosystem move forward, and – having implemented no small amount of the compiler and runtime since getting involved in 2007 – I figured I’d find something useful to do there!
The area I was especially keen to help with is RakuAST, something I started, and that I’m glad I managed to bring far enough that others could see the potential and were excited enough to pick it up and run with it.
One tricky aspect of implementing Raku is the whole notion of BEGIN time (of course, this is also one of the things that makes Raku powerful and thus is widely used). In short, BEGIN time is about running code during the compile time, and in Raku there’s no separate meta-language; anything you can do at runtime, you can (in principle) do at compile time too. The problem at hand was what to do about references from code running at compile time to lexically scoped symbols in the surrounding scope. Of note, that lexical scope is still being compiled, so doesn’t really exist yet so far as the runtime is concerned. The current compiler deals with this by building up an entire flattened table of everything that is visible, and installing it as a fake outer scope while running the BEGIN-time code. This is rather costly, and the hope in RakuAST was to avoid this kind of approach in general.
A better solution seemed to be at hand by spotting such references during compilation, resolving them, and fixating them – that is, they get compiled as if they were lookups into a constant table. (This copies the suggested approach for quasiquoted code that references symbols in the lexical scope of where the quasiquoted code appears.) This seemed promising, but there’s a problem:
my $x = BEGIN %*ENV<DEBUG> ?? -> $x { note "Got $x"; foo($x) } !! -> $x { foo($x) };
It’s fine to post-declare subs, and so there’s no value to fixate. Thankfully, the generalized dispatch mechanism can ride to the rescue; we can:
When compiling Raku code, timing is everything. I knew this and tried to account for it in the RakuAST design from the start, but a couple of things in particular turned out a bit awkward.
I got a decent way into this restructuring work during the core summit, and hope to find time soon to get it a bit further along (I’ve been a mix of busy, tired, and had an eye infection to boot since getting back from the summit, so thus far there’s not been time for it).
I also took part in various other discussions and helped with some other things; those that are probably most worth mentioning are:
Thanks goes to Liz for organizing the summit, to Wendy for keeping everyone so well fed and watered, to the rest of attendees for many interesting discussions over the three days, to TPRF and Rootprompt for sponsoring the event, and to Edument for supporting my attendance.
Hi hackers! Today the MoarVM JIT project is nearly 9 years old. I was inspired by Jonathan's presentation reflecting on the development of MoarVM, to do the same for the MoarVM JIT, for which I have been responsible.
For those who are unfamiliar, what is commonly understood as 'JIT compilation' for virtual machines is performed by two components in MoarVM.
This post refers only to the native code generation backend component. It, too, is split into two mostly-independent systems:
On the one hand, as a result of my limited experience, time and resources; on the other hand, as a result of the design of MoarVM.
MoarVM was originally designed as a traditional interpreter for a high level language (much like the Perl interpreter). Meaning that it has a large number of different instructions and many instructions operate on high-level data structures like strings, arrays and maps (as opposed to pointers and machine words).
This is by no means a bad or outdated design. Frequently executed routines (string manipulation, hash table lookups etc.) are implemented using an efficient language (C) and driven by a language that is optimized for usability (Raku). This design is also used in modern machine learning frameworks. More importantly, this was a reasonable design because it is a good target for the Rakudo compiler.
For the JIT compiler, this means two things:
The machine code generated by the JIT compiler will then mostly consist of consecutive function calls to VM routines, which is not the type of code where a compiler can really improve performance much.
In other words, suppose 50% of runtime is spent in interpretation overhead (instruction decoding and dispatch), and 50% is spent in VM routines, then removing interpretation overhead via JIT compilation will at best result in a twofold increase in performance. For many programs, the observed performance increase will be even less.
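The arithmetic behind that claim is just Amdahl's law; a quick sketch with the 50/50 split assumed above:
my $interpretation-share = 0.5;        # fraction of runtime removed by eliminating interpretation overhead
say 1 / (1 - $interpretation-share);   # 2 - i.e. at best a twofold speedup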
Mind that I'm specifically referring to the improvement due to machine code generation, and not to those due to type specialization, inlining etc. (the domain of 'spesh'). These latter features have resulted in much more significant performance improvements.
For me personally, it was a tremendously valuable learning experience which led directly to my current career, writing SQL compilers for Google Cloud.
For the Raku community, even if we never realized the performance improvements that I might have hoped at the start, I hope that the JIT project (as it exists) has been valuable, if for no other reason than identifying the challenges of JIT compilation for MoarVM. A future effort may be able to do better based on what we learned; and I hope my blog posts are a useful resource from that perspective.
Assuming that time and resources were not an issue:
If any of this comes to pass, you'll find my report on it right here. Thanks for reading, and until then!
It's been over a month since I first came across -- finally -- a clean way to present anyone who runs Linux with a simple, clean, non-"virtualenv" installation of raku
: rakudo-pkg
to the rescue!
rakudo-pkg
vs a virtual environment like rakubrew
There was always a bit of an icky feeling related to relying on rakubrew
(and rakudobrew
before it): it required inquiring minds to first ignore what their system offered them through official channels and instead install something extra in order to have access to anything remotely resembling an up-to-date version of the raku
runtime (or perl6
before it).
Unfortunately, in the case of most official package repositories, the latest officially available versions were often ancient1. It's heartening to note, however, that this situation has improved significantly since the official debut of the "finished" language as 6.c
half a decade ago. Still, the official repositories lag far behind the improvements that are made, even today.
In my opinion, it is one thing to encounter a virtualenv
-style tool after you have hit some limitation with running the system installation of a language. But being asked to add a whole new mothball to your home directory and login shell configuration just to try out a language is not a strong look from an advocacy perspective.
Having a dedicated system path for the tools also fixes issues related to tools that do not inherit environment variables created by executing all of those tweaks stashed in .bash_profile
or (in my case) .config/fish/config.fish
.
A virtualenv approach is also particularly undesirable because the problem it works around is potentially resolvable through guarantees made at the language design layer, in Raku’s approach to module and language versioning.
Raku naturally shows its previous life as that caterpillar formerly-known-as-Perl-6 most strongly when you encounter its own versioning.
use v6.c
is guaranteed to access a historical standard of Raku behavior. use v6.*
optimistically says "use *Whatever* version you consider the newest". use v6.d
gives you guarantees that the language won't start spitting deprecation warnings pertaining to later versions, starting with v6.e
, while also doing everything exactly as v6.d
intended even on a newer release.
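For example, a script can pin the language level it was written against, as described above (a minimal sketch):
use v6.d;               # ask for 6.d semantics, even on a newer compiler
say $*RAKU.version;     # v6.d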
It would be interesting to stress test the implicit and explicit language-level guarantees of Raku by dog-fooding an old-fashioned “smoke test” of the claims made in the design of language versioning and of the module repositories and authorities concepts. A sort of “distributed DarkPAN simulator” for Raku in the 2020s.
The CompUnit repositories and module authorities are ideas that intend to make backward compatibility easier in a world where sometimes you want to run a locally patched variant of a public module that is otherwise identical (or even wildly incompatible) and other times you want to be able to run two different versions of a library side-by-side -- at the same time.
A/A testing of library upgrades at vanguard for a bit before rolling out to the fleet, anyone? (That's a different, likely far more profitable, library idea for you, my intrepid reader).
Check out the blog post announcement of the new GitHub Task based release flow and the latest iteration of the rakudo-pkg
project.
It was a long road to the first official release, so it is not at all fair to blame distribution maintenance teams for not bothering to ensure that the bleeding edge version of a still-baking language was--or is--easily accessible. Things have gotten better since the release of 6.c
.↩
I've been using Windows 10 for a while as I wait to install a new m2 SSD in this laptop to provide a dedicated place for Linux. I've noticed some very strange and disappointing issues with Unicode characters when running Raku from a terminal.
Thanks to #raku on Freenode, I managed to find a solution:
chcp 65001
This changes the Unicode code page to 65001 and magically fixes the issues I was seeing.
To make the change more permanent, it is possible to change some registry key values under HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Nls\CodePage
. Modify ACP
, MACCP
, and OEMCP
all to value 65001
, give the OS a reboot, et voila!
Thanks to the ever-present raiph for his reddit comment which pointed me to a Stack Overflow question from a user facing the same problem, which in turn pointed to the solution provided for a question from a C# programmer.
Around 18 months ago, I set about working on the largest set of architectural changes that the Raku runtime MoarVM has seen since its inception. The work was most directly triggered by the realization that we had no good way to fix a certain semantic bug in dispatch without either causing huge performance impacts across the board or increasing complexity even further in optimizations that were already riding their luck. However, the need for something like this had been apparent for a while: a persistent struggle to optimize certain Raku language features, the pain of a bunch of performance mechanisms that were all solving the same kind of problem but each for a specific situation, and a sense that, with everything learned since I founded MoarVM, it was possible to do better.
The result is the development of a new generalized dispatch mechanism. An overview can be found in my Raku Conference talk about it (slides, video); in short, it gives us a far more uniform architecture for all kinds of dispatch, allowing us to deliver better performance on a range of language features that have thus far been glacial, as well as opening up opportunities for new optimizations.
Today, this work has been merged, along with the matching changes in NQP (the Raku subset we use for bootstrapping and to implement the compiler) and Rakudo (the full Raku compiler and standard library implementation). This means that it will ship in the October 2021 releases.
In this post, I’ll give an overview of what you can expect to observe right away, and what you might expect in the future as we continue to build upon the possibilities that the new dispatch architecture has to offer.
The biggest improvements involve language features that we’d really not had the architecture to do better on before. They involved dispatch – that is, getting a call linked to a destination efficiently – but the runtime didn’t provide us with a way to “explain” to it that it was looking at a dispatch, let alone with the information needed to have a shot at optimizing it.
The following graph captures a number of these cases, and shows the level of improvement, ranging from a factor of 3.3 to 13.3 times faster.
Let’s take a quick look at each of these. The first, new-buf
, asks how quickly we can allocate Buf
s.
for ^10_000_000 {
Buf.new
}
Why is this a dispatch benchmark? Because Buf
is not a class, but rather a role. When we try to make an instance of a role, it is “punned” into a class. Up until now, it worked roughly as follows: looking up the new method went through the role’s find_method, which would, if needed, create a pun of the role and cache it, and then hand back a forwarding closure along the lines of -> $role-discarded, |args { $pun."$name"(|args) }, which was then invoked with the original arguments.
This had a number of undesirable consequences:
With the new dispatch mechanism, we have a means to cache constants at a given program location and to replace arguments. So the first time we encounter the call, we create and cache the pun if needed, resolve the new method on the class punned from the role, and record a dispatch program that reuses that resolved method and rewrites the argument list so the pun is passed in place of the role.
For the next thousands of calls, we interpret this dispatch program. It’s still some cost, but the method we’re calling is already resolved, and the argument list rewriting is fairly cheap. Meanwhile, after we get into some hundreds of iterations, on a background thread, the optimizer gets to work. The argument re-ordering cost goes away completely at this point, and new
is so small it gets inlined – at which point the buffer allocation is determined dead and so goes away too. Some remaining missed opportunities mean we still are left with a loop that’s not quite empty: it busies itself making sure it’s really OK to do nothing, rather than just doing nothing.
Next up, multiple dispatch with where
clauses.
multi fac($n where $n <= 1) { 1 }
multi fac($n) { $n * fac($n - 1) }
for ^1_000_000 {
fac(5)
}
These were really slow before, since we could not use the dispatch cache when there was a where clause involved, and we’d run the where clauses twice in the event the candidate was chosen: once to see if we should choose that multi candidate, and once again when we entered it.
With the new mechanism, we build a type-filtered list of candidates and invoke the one with the where clause in a mode whereby, if the signature fails to bind, it triggers a dispatch resumption. (If it does bind, it runs to completion.)
Once again, after the setup phase, we interpret the dispatch programs. In fact, that’s as far as we get with running this faster for now, because the specializer doesn’t yet know how to translate and further optimize this kind of dispatch program. (That’s how I know it currently stands no chance of turning this whole thing into another empty loop!) So there’s more to be had here also; in the meantime, I’m afraid you’ll just have to settle for a factor of ten speedup.
Here’s the next one:
proto with-proto(Int $n) { 2 * {*} }
multi with-proto(Int $n) { $n + 1 }
sub invoking-nontrivial-proto() {
for ^10_000_000 {
with-proto(20)
}
}
Again, on top form, we’d turn this into an empty loop too, but we don’t quite get there yet. This case wasn’t so terrible before: we did get to use the multiple dispatch cache, however to do that we also ended up having to allocate an argument capture. The need for this also blocked any chance of inlining the proto
into the caller. Now that is possible. Since we cannot yet translate dispatch programs that resume an in-progress dispatch, we don’t yet get to further inline the called multi
candidate into the proto
. However, we now have a design that will let us implement that.
This whole notion of a dispatch resumption – where we start doing a dispatch, and later need to access arguments or other pre-calculated data in order to do a next step of it – has turned out to be a great unification. The initial idea for it came from considering things like callsame
:
class Parent {
method m() { 1 }
}
class Child is Parent {
method m() { 1 + callsame }
}
for ^10_000_000 {
Child.m;
}
Once I started looking at this, and then considering that a complex proto
also wants to continue with a dispatch at the {*}
, and in the case a where
clauses fails in a multi
it also wants to continue with a dispatch, I realized this was going to be useful for quite a lot of things. It will be a bit of a headache to teach the optimizer and JIT to do nice things with resumes – but a great relief that doing that once will benefit multiple language features!
Anyway, back to the benchmark. This is another “if we were smart, it’d be an empty loop” one. Previously, callsame
was very costly, because each time we invoked it, it would have to calculate what kind of dispatch we were resuming and the set of methods to call. We also had to be able to locate the arguments. Dynamic variables were involved, which cost a bit to look up too, and – despite being an implementation detail – these also leaked out in introspection, which wasn’t ideal. The new dispatch mechanism makes this all rather more efficient: we can cache the calculated set of methods (or wrappers and multi candidates, depending on the context) and then walk through it, and there are no dynamic variables involved (and thus no leakage of them). This sees the biggest speedup of the lot – and since we cannot yet inline away the callsame
, it’s (for now) measuring the speedup one might expect on using this language feature. In the future, it’s destined to optimize away to an empty loop.
A module that makes use of callsame
on a relatively hot path is OO::Monitors
, so I figured it would be interesting to see if there is a speedup there also.
use OO::Monitors;
monitor TestMonitor {
method m() { 1 }
}
my $mon = TestMonitor.new;
for ^1_000_000 {
$mon.m();
}
A monitor
is a class that acquires a lock around each method call. The module provides a custom meta-class that adds a lock attribute to the class and then wraps each method such that it acquires the lock. There are certainly costly things in there besides the involvement of callsame
, but the improvement to callsame
is already enough to see a 3.3x speedup in this benchmark. Since OO::Monitors
is used in quite a few applications and modules (for example, Cro uses it), this is welcome (and yes, a larger improvement will be possible here too).
I’ve seen some less impressive, but still welcome, improvements across a good number of other microbenchmarks. Even a basic multi dispatch on the +
op:
my $i = 0;
for ^10_000_000 {
$i = $i + $_;
}
Comes out with a factor of 1.6x speedup, thanks primarily to us producing far tighter code with fewer guards. Previously, we ended up with duplicate guards in this seemingly straightforward case. The infix:<+>
multi candidate would be specialized for the case of its first argument being an Int
in a Scalar
container and its second argument being an immutable Int
. Since a Scalar
is mutable, the specialization would need to read it and then guard the value read before proceeding, otherwise it may change, and we’d risk memory safety. When we wanted to inline this candidate, we’d also want to do a check that the candidate really applies, and so also would dereference the Scalar
and guard its content to do that. We can and do eliminate duplicate guards – but these guards are on two distinct reads of the value, so that wouldn’t help.
Since in the new dispatch mechanism we can rewrite arguments, we can now quite easily do caller-side removal of Scalar
containers around values. So easily, in fact, that the change to do it took me just a couple of hours. This gives a lot of benefits. Since dispatch programs automatically eliminate duplicate reads and guards, the read and guard by the multi-dispatcher and the read in order to pass the decontainerized value are coalesced. This means less repeated work prior to specialization and JIT compilation, and also only a single read and guard in the specialized code after it. With the value to be passed already guarded, we can trivially select a candidate taking two bare Int
values, which means there’s no further reads and guards needed in the callee either.
A less obvious benefit, but one that will become important with planned future work, is that this means Scalar
containers escape to callees far less often. This creates further opportunities for escape analysis. While the MoarVM escape analyzer and scalar replacer is currently quite limited, I hope to return to working on it in the near future, and expect it will be able to give us even more value now than it would have been able to before.
The benchmarks shown earlier are mostly of the “how close are we to realizing that we’ve got an empty loop” nature, which is interesting for assessing how well the optimizer can “see through” dispatches. Here are a few further results on more “traditional” microbenchmarks:
The complex number benchmark is as follows:
my $total-re = 0e0;
for ^2_000_000 {
my $x = 5 + 2i;
my $y = 10 + 3i;
my $z = $x * $x + $y;
$total-re = $total-re + $z.re
}
say $total-re;
That is, just a bunch of operators (multi dispatch) and method calls, where we really do use the result. For now, we’re tied with Python and a little behind Ruby on this benchmark (and a surprising 48 times faster than the same thing done with Perl’s Math::Complex
), but this is also a case that stands to see a huge benefit from escape analysis and scalar replacement in the future.
The hash read benchmark is:
my %h = a => 10, b => 12;
my $total = 0;
for ^10_000_000 {
$total = $total + %h<a> + %h<b>;
}
And the hash store one is:
my @keys = 'a'..'z';
for ^500_000 {
my %h;
for @keys {
%h{$_} = 42;
}
}
The improvements are nothing whatsoever to do with hashing itself, but instead look to be mostly thanks to much tighter code all around due to caller-side decontainerization. That can have a secondary effect of bringing things under the size limit for inlining, which is also a big help. Speedup factors of 2x and 1.85x are welcome, although we could really do with the same level of improvement again for me to be reasonably happy with our results.
The line-reading benchmark is:
my $fh = open "longfile";
my $chars = 0;
for $fh.lines { $chars = $chars + .chars };
$fh.close;
say $chars
Again, nothing specific to I/O got faster, but when dispatch – the glue that puts together all the pieces – gets a boost, it helps all over the place. (We are also decently competitive on this benchmark, although we tend to be slower the moment the UTF-8 decoder can’t take its “NFG can’t possibly apply” fast path.)
I’ve also started looking at larger programs, and hearing results from others about theirs. It’s mostly encouraging:
The Text::CSV benchmark test-t has seen roughly 20% improvement (thanks to lizmat for measuring), a Cro::HTTP test application gets through about 10% more requests per second, and the compilation of CORE.setting, the standard library, has also seen an improvement. However, a big pinch of salt is needed here: the compiler itself has changed in a number of places as part of the work, and there were a couple of things tweaked based on looking at profiles that aren’t really related to dispatch.
One unpredicted (by me), but also welcome, improvement is that profiler output has become significantly smaller. Likely reasons for this include:
Previously, we would always call the sink method when a value was in sink context. Now, if we see that the type simply inherits that method from Mu, we elide the call entirely (again, it would inline away, but a smaller call graph is a smaller profile).
Previously, a multiple dispatch would also call the proto when the cache was missed, but would then not call an onlystar proto again when it got cache hits in the future. This meant the call tree under many multiple dispatches was duplicated in the profile. This wasn’t just a size issue; it was a bit annoying to have this effect show up in the profile reports too.
To give an example of the difference, I took profiles from Agrammon to study why it might have become slower. The one from before the dispatcher work weighed in at 87MB; the one with the new dispatch mechanism is under 30MB. That means less memory used while profiling, less time to write the profile out to disk afterwards, and less time for tools to load the profiler output. So now it’s faster to work out how to make things faster.
I’m afraid so. Startup time has suffered. While the new dispatch mechanism is more powerful, pushes more complexity out of the VM into high level code, and is more conducive to reaching higher peak performance, it also has a higher warmup time. At the time of writing, the impact on startup time seems to be around 25%. I expect we can claw some of that back ahead of the October release.
Changes of this scale always come with an amount of risk. We’re merging this some weeks ahead of the next scheduled monthly release in order to have time for more testing, and to address any regressions that get reported. However, even before reaching the point of merging it, we have used blin to run the tests of ecosystem modules. This is a standard step when preparing Rakudo releases, but in this case we’ve aimed it at the new-disp branches. This found a number of regressions caused by the switch to the new dispatch mechanism, which have been addressed.
As I’ve alluded to in a number of places in this post, while there are improvements to be enjoyed right away, there are also new opportunities for further improvement. Some things that are on my mind include:
One is attaching resumption handling at a more useful point on the callstack; the callsame one here is a perfect example! The point we do the resumption of a dispatch is inside callsame, so all the inline cache entries of resumptions throughout the program stack up in one place. What we’d like is to have them attached a level down the callstack instead. Otherwise, the level of callsame improvement seen in micro-benchmarks will not be enjoyed in larger applications. This applies in a number of other situations too.
Another is that a call which ends up in a FALLBACK method could have its callsite easily rewritten to do that directly, opening the way to inlining. And there is further work to be done around Ints (which needs a great deal of care in memory management, as they may box a big integer, not just a native integer).
I would like to thank TPF and their donors for providing the funding that has made it possible for me to spend a good amount of my working time on this effort.
While I’m to blame for the overall design and much of the implementation of the new dispatch mechanism, plenty of work has also been put in by other MoarVM and Rakudo contributors – especially over the last few months as the final pieces fell into place, and we turned our attention to getting it production ready. I’m thankful to them not only for the code and debugging contributions, but also much support and encouragement along the way. It feels good to have this merged, and I look forward to building upon it in the months and years to come.
I recently wrote about the new MoarVM dispatch mechanism, and in that post noted that I still had a good bit of Raku’s multiple dispatch semantics left to implement in terms of it. Since then, I’ve made a decent amount of progress in that direction. This post contains an overview of the approach taken, and some very rough performance measurements.
Of all the kinds of dispatch we find in Raku, multiple dispatch is the most complex. Multiple dispatch allows us to write a set of candidates, which are then selected by the number of arguments:
multi ok($condition, $desc) {
say ($condition ?? 'ok' !! 'not ok') ~ " - $desc";
}
multi ok($condition) {
ok($condition, '');
}
Or the types of arguments:
multi to-json(Int $i) { ~$i }
multi to-json(Bool $b) { $b ?? 'true' !! 'false' }
And not just one argument, but potentially many:
multi truncate(Str $str, Int $chars) {
$str.chars < $chars ?? $str !! $str.substr(0, $chars) ~ '...'
}
multi truncate(Str $str, Str $after) {
with $str.index($after) -> $pos {
$str.substr(0, $pos) ~ '...'
}
else {
$str
}
}
We may write where
clauses to differentiate candidates on properties that are not captured by nominal types:
multi fac($n where $n <= 1) { 1 }
multi fac($n) { $n * fac($n - 1) }
Every time we write a set of multi
candidates like this, the compiler will automatically produce a proto
routine. This is what is installed in the symbol table, and holds the candidate list. However, we can also write our own proto
, and use the special term {*}
to decide at which point we do the dispatch, if at all.
proto mean($collection) {
$collection.elems == 0 ?? Nil !! {*}
}
multi mean(@arr) {
@arr.sum / @arr.elems
}
multi mean(%hash) {
%hash.values.sum / %hash.elems
}
Candidates are ranked by narrowness (using topological sorting). If multiple candidates match, but they are equally narrow, then that’s an ambiguity error. Otherwise, we call the narrowest one. The candidate we choose may then use callsame
and friends to defer to the next narrowest candidate, which may do the same, until we reach the most general matching one.
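A tiny illustration of the ranking and the ambiguity case (hypothetical candidates):
multi f(Int $x, Any $y) { 'left'  }
multi f(Any $x, Int $y) { 'right' }
say f(1, 'a');   # left
say f('a', 1);   # right
say f(1, 2);     # dies: neither candidate is narrower, so this is an ambiguous dispatch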
Raku leans heavily on multiple dispatch. Most operators in Raku are compiled into calls to multiple dispatch subroutines. Even $a + $b
will be a multiple dispatch. This means doing multiple dispatch efficiently is really important for performance. Given the riches of its semantics, this is potentially a bit concerning. However, there’s good news too.
The overwhelmingly common case is that we have:
where
clausesproto
callsame
This isn’t to say the other cases are unimportant; they are really quite useful, and it’s desirable for them to perform well. However, it’s also desirable to make what savings we can in the common case. For example, we don’t want to eagerly calculate the full set of possible candidates for every single multiple dispatch, because the majority of the time only the first one matters. This is not just a time concern: recall that the new dispatch mechanism stores dispatch programs at each callsite, and if we store the list of all matching candidates at each of those, we’ll waste a lot of memory too.
The situation in Rakudo today is as follows:
proto
holding a “dispatch cache”, a special-case mechanism implemented in the VM that uses a search tree, with one level per argument.proto
, it’s not too bad either, though inlining isn’t going to be happening; it can still use the search tree, thoughwhere
clauses, it’ll be slow, because the search tree only deals in finding one candidate per set of nominal types, and so we can’t use itcallsame
; it’ll be slow tooEffectively, the situation today is that you simply don’t use where
clauses in a multiple dispatch if its anywhere near a hot path (well, and if you know where the hot paths are, and know that this kind of dispatch is slow). Ditto for callsame
, although that’s less commonly reached for. The question is, can we do better with the new dispatcher?
Let’s start out with seeing how the simplest cases are dealt with, and build from there. (This is actually what I did in terms of the implementation, but at the same time I had a rough idea where I was hoping to end up.)
Recall this pair of candidates:
multi truncate(Str $str, Int $chars) {
$str.chars < $chars ?? $str !! $str.substr(0, $chars) ~ '...'
}
multi truncate(Str $str, Str $after) {
with $str.index($after) -> $pos {
$str.substr(0, $pos) ~ '...'
}
else {
$str
}
}
We then have a call truncate($message, "\n")
, where $message
is a Str
. Under the new dispatch mechanism, the call is made using the raku-call
dispatcher, which identifies that this is a multiple dispatch, and thus delegates to raku-multi
. (Multi-method dispatch ends up there too.)
The record phase of the dispatch – on the first time we reach this callsite – works through the candidates looking for one that matches the arguments, and ends by delegating to the raku-invoke dispatcher with the chosen candidate.
Thus the whole of the dispatch processes – at least for this simple case where we only have types and arity – can be “explained” to the virtual machine as “if the arguments have these exact types, invoke this routine”. It’s pretty much the same as we were doing for method dispatch, except there we only cared about the type of the first argument – the invocant – and the value of the method name. (Also recall from the previous post that if it’s a multi-method dispatch, then both method dispatch and multiple dispatch will guard the type of the first argument, but the duplication is eliminated, so only one check is done.)
Coming up with good abstractions is difficult, and therein lies much of the challenge of the new dispatch mechanism. Raku has quite a number of different dispatch-like things. However, encoding all of them directly in the virtual machine leads to high complexity, which makes building reliable optimizations (or even reliable unoptimized implementations!) challenging. Thus the aim is to work out a comparatively small set of primitives that allow for dispatches to be “explained” to the virtual machine in such a way that it can deliver decent performance.
It’s fairly clear that callsame
is a kind of dispatch resumption, but what about the custom proto
case and the where
clause case? It turns out that these can both be neatly expressed in terms of dispatch resumption too (the where
clause case needing one small addition at the virtual machine level, which in time is likely to be useful for other things too). Not only that, but encoding these features in terms of dispatch resumption is also quite direct, and thus should be efficient. Every trick we teach the specializer about doing better with dispatch resumptions can benefit all of the language features that are implemented using them, too.
Recall this example:
proto mean($collection) {
$collection.elems == 0 ?? Nil !! {*}
}
Here, we want to run the body of the proto
, and then proceed to the chosen candidate at the point of the {*}
. By contrast, when we don’t have a custom proto
, we’d like to simply get on with calling the correct multi
.
To achieve this, I first moved the multi candidate selection logic from the raku-multi
dispatcher to the raku-multi-core
dispatcher. The raku-multi
dispatcher then checks if we have an “onlystar” proto
(one that does not need us to run it). If so, it delegates immediately to raku-multi-core
. If not, it saves the arguments to the dispatch as the resumption initialization state, and then calls the proto
. The proto
‘s {*}
is compiled into a dispatch resumption. The resumption then delegates to raku-multi-core
. Or, in code:
nqp::dispatch('boot-syscall', 'dispatcher-register', 'raku-multi',
# Initial dispatch, only setting up resumption if we need to invoke the
# proto.
-> $capture {
my $callee := nqp::captureposarg($capture, 0);
my int $onlystar := nqp::getattr_i($callee, Routine, '$!onlystar');
if $onlystar {
# Don't need to invoke the proto itself, so just get on with the
# candidate dispatch.
nqp::dispatch('boot-syscall', 'dispatcher-delegate', 'raku-multi-core', $capture);
}
else {
# Set resume init args and run the proto.
nqp::dispatch('boot-syscall', 'dispatcher-set-resume-init-args', $capture);
nqp::dispatch('boot-syscall', 'dispatcher-delegate', 'raku-invoke', $capture);
}
},
# Resumption means that we have reached the {*} in the proto and so now
# should go ahead and do the dispatch. Make sure we only do this if we
# are signalled to that it's a resume for an onlystar (resumption kind 5).
-> $capture {
my $track_kind := nqp::dispatch('boot-syscall', 'dispatcher-track-arg', $capture, 0);
nqp::dispatch('boot-syscall', 'dispatcher-guard-literal', $track_kind);
my int $kind := nqp::captureposarg_i($capture, 0);
if $kind == 5 {
nqp::dispatch('boot-syscall', 'dispatcher-delegate', 'raku-multi-core',
nqp::dispatch('boot-syscall', 'dispatcher-get-resume-init-args'));
}
elsif !nqp::dispatch('boot-syscall', 'dispatcher-next-resumption') {
nqp::dispatch('boot-syscall', 'dispatcher-delegate', 'boot-constant',
nqp::dispatch('boot-syscall', 'dispatcher-insert-arg-literal-obj',
$capture, 0, Nil));
}
});
Deferring to the next candidate (for example with callsame
) and trying the next candidate because a where
clause failed look very similar: both involve walking through a list of possible candidates. There’s some details, but they have a great deal in common, and it’d be nice if that could be reflected in how multiple dispatch is implemented using the new dispatcher.
Before that, a slightly terrible detail about how things work in Rakudo today when we have where
clauses. First, the dispatcher does a “trial bind”, where it asks the question: would this signature bind? To do this, it has to evaluate all of the where
clauses. Worse, it has to use the slow-path signature binder too, which interprets the signature, even though we can in many cases compile it. If the candidate matches, great, we select it, and then invoke it…which runs the where
clauses a second time, as part of the compiled signature binding code. There is nothing efficient about this at all, except for it being by far more efficient on developer time, which is why it happened that way.
Anyway, it goes without saying that I’m rather keen to avoid this duplicate work and the slow-path binder where possible as I re-implement this using the new dispatcher. And, happily, a small addition provides a solution. There is an op assertparamcheck
, which any kind of parameter checking compiles into (be it type checking, where
clause checking, etc.) This triggers a call to a function that gets the arguments, the thing we were trying to call, and can then pick through them to produce an error message. The trick is to provide a way to invoke a routine such that a bind failure, instead of calling the error reporting function, will leave the routine and then do a dispatch resumption! This means we can turn failure to pass where
clause checks into a dispatch resumption, which will then walk to the next candidate and try it instead.
This gets us most of the way to a solution, but there’s still the question of being memory and time efficient in the common case, where there is no resumption and no where
clauses. I coined the term “trivial multiple dispatch” for this situation, which makes the other situation “non-trivial”. In fact, I even made a dispatcher called raku-multi-non-trivial
! There are two ways we can end up there.
where
clauses. As soon as we see this is the case, we go ahead and produce a full list of possible candidates that could match. This is a linked list (see my previous post for why).callsame
happens, we end up in the trivial dispatch resumption handler, which – since this situation is now non-trivial – builds the full candidate list, snips the first item off it (because we already ran that), and delegates to raku-multi-non-trivial
.Lost in this description is another significant improvement: today, when there are where
clauses, we entirely lose the ability to use the MoarVM multiple dispatch cache, but under the new dispatcher, we store a type-filtered list of candidates at the callsite, and then cheap type guards are used to check it is valid to use.
I did a few benchmarks to see how the new dispatch mechanism did with a couple of situations known to be sub-optimal in Rakudo today. These numbers do not reflect what is possible, because at the moment the specializer does not have much of an understanding of the new dispatcher. Rather, they reflect the minimal improvement we can expect.
Consider this benchmark using a multi
with a where
clause to recursively implement factorial.
multi fac($n where $n <= 1) { 1 }
multi fac($n) { $n * fac($n - 1) }
for ^100_000 {
fac(10)
}
say now - INIT now;
This needs some tweaks (and to be run under an environment variable) to use the new dispatcher; these are temporary, until such time as I switch Rakudo over to using the new dispatcher by default:
use nqp;
multi fac($n where $n <= 1) { 1 }
multi fac($n) { $n * nqp::dispatch('raku-call', &fac, $n - 1) }
for ^100_000 {
nqp::dispatch('raku-call', &fac, 10);
}
say now - INIT now;
On my machine, the first runs in 4.86s, the second in 1.34s. Thus under the new dispatcher this runs in little over a quarter of the time it used to – a quite significant improvement already.
A case involving callsame
is also interesting to consider. Here it is without using the new dispatcher:
multi fallback(Any $x) { "a$x" }
multi fallback(Numeric $x) { "n" ~ callsame }
multi fallback(Real $x) { "r" ~ callsame }
multi fallback(Int $x) { "i" ~ callsame }
for ^1_000_000 {
fallback(4+2i);
fallback(4.2);
fallback(42);
}
say now - INIT now;
And with the temporary tweaks to use the new dispatcher:
use nqp;
multi fallback(Any $x) { "a$x" }
multi fallback(Numeric $x) { "n" ~ new-disp-callsame }
multi fallback(Real $x) { "r" ~ new-disp-callsame }
multi fallback(Int $x) { "i" ~ new-disp-callsame }
for ^1_000_000 {
nqp::dispatch('raku-call', &fallback, 4+2i);
nqp::dispatch('raku-call', &fallback, 4.2);
nqp::dispatch('raku-call', &fallback, 42);
}
say now - INIT now;
On my machine, the first runs in 31.3s, the second in 11.5s, meaning that with the new dispatcher we manage it in a little over a third of the time that current Rakudo does.
These are both quite encouraging, but as previously mentioned, a majority of multiple dispatches are of the trivial kind, not using these features. If I make the most common case worse on the way to making other things better, that would be bad. It’s not yet possible to make a fair comparison of this: trivial multiple dispatches already receive a lot of attention in the specializer, and it doesn’t yet optimize code using the new dispatcher well. Of note, in an example like this:
multi m(Int) { }
multi m(Str) { }
for ^1_000_000 {
m(1);
m("x");
}
say now - INIT now;
Inlining and other optimizations will turn this into an empty loop, which is hard to beat. There is one thing we can already do, though: run it with the specializer disabled. The new dispatcher version looks like this:
use nqp;
multi m(Int) { }
multi m(Str) { }
for ^1_000_000 {
nqp::dispatch('raku-call', &m, 1);
nqp::dispatch('raku-call', &m, "x");
}
say now - INIT now;
The results are 0.463s and 0.332s respectively. Thus, the baseline execution time – before the specializer does its magic – is less using the new general dispatch mechanism than it is using the special-case multiple dispatch cache that we currently use. I wasn’t sure what to expect here before I did the measurement. Given we’re going from a specialized mechanism that has been profiled and tweaked to a new general mechanism that hasn’t received such attention, I was quite ready to be doing a little bit worse initially, and would have been happy with parity. Running in 70% of the time was a bigger improvement than I expected at this point.
I expect that once the specializer understands the new dispatch mechanism better, it will be able to also turn the above into an empty loop – however, since more iterations can be done per-optimization, this should still show up as a win for the new dispatcher.
With one relatively small addition, the new dispatch mechanism is already handling most of the Raku multiple dispatch semantics. Furthermore, even without the specializer and JIT really being able to make a good job of it, some microbenchmarks already show a factor of 3x-4x improvement. That’s a pretty good starting point.
There’s still a good bit to do before we ship a Rakudo release using the new dispatcher. However, multiple dispatch was the biggest remaining threat to the design: it’s rather more involved than other kinds of dispatch, and it was quite possible that an unexpected shortcoming could trigger another round of design work, or reveal that the general mechanism was going to struggle to perform compared to the more specialized one in the baseline, unoptimized case. So far, there’s no indication of either of these, and I’m cautiously optimistic that the overall design is about right.
My goodness, it appears I’m writing my first Raku internals blog post in over two years. Of course, two years ago it wasn’t even called Raku. Anyway, without further ado, let’s get on with this shared brainache.
I use “dispatch” to mean a process by which we take a set of arguments and end up with some action being taken based upon them. Some familiar examples include:

- Making a method call, such as $basket.add($product, $quantity). We might traditionally call just $product and $quantity the arguments, but for my purposes, all of $basket, the method name 'add', $product, and $quantity are arguments to the dispatch: they are the things we need in order to make a decision about what we’re going to do.
- Making a subroutine call, such as uc($youtube-comment). Since Raku sub calls are lexically resolved, in this case the arguments to the dispatch are &uc (the result of looking up the subroutine) and $youtube-comment.
- Calling a multi sub or method, where the arguments are used to decide which of the candidates should be invoked.

At first glance, perhaps the first two seem fairly easy and the third a bit more of a handful – which is sort of true. However, Raku has a number of other features that make dispatch rather more, well, interesting. For example:
- wrap allows us to wrap any Routine (sub or method); the wrapper can then choose to defer to the original routine, either with the original arguments or with new arguments
- We can write a proto routine that gets to choose when – or even if – the call to the appropriate candidate is made
- We can use callsame in order to defer to the next candidate in the dispatch. But what does that mean? If we’re in a multiple dispatch, it would mean the next most applicable candidate, if any. If we’re in a method dispatch then it means a method from a base class. (The same thing is used to implement going to the next wrapper or, eventually, to the originally wrapped routine too). And these can be combined: we can wrap a multi method, meaning we can have 3 levels of things that all potentially contribute the next thing to call!

Thanks to this, dispatch – at least in Raku – is not always something we do and produce an outcome, but rather a process that we may be asked to continue with multiple times!
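As a reminder of what that looks like from the language side, here is a small illustrative example (not from the original post) of callsame deferring to a method in a base class:

class Greeting { method render() { "Hello" } }
class LoudGreeting is Greeting {
    # callsame defers to the next candidate – here, the parent's method –
    # with the same (in this case, empty) argument list.
    method render() { callsame() ~ "!" }
}

say LoudGreeting.new.render;  # Hello!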
Finally, while the examples I’ve written above can all quite clearly be seen as examples of dispatch, a number of other common constructs in Raku can be expressed as a kind of dispatch too. Assignment is one example: the semantics of it depend on the target of the assignment and the value being assigned, and thus we need to pick the correct semantics. Coercion is another example, and return value type-checking yet another.
Dispatch is everywhere in our programs, quietly tying together the code that wants stuff done with the code that does stuff. Its ubiquity means it plays a significant role in program performance. In the best case, we can reduce the cost to zero. In the worst case, the cost of the dispatch is high enough to exceed that of the work done as a result of the dispatch.
To a first approximation, when the runtime “understands” the dispatch the performance tends to be at least somewhat decent, but when it doesn’t there’s a high chance of it being awful. Dispatches tend to involve an amount of work that can be cached, often with some cheap guards to verify the validity of the cached outcome. For example, in a method dispatch, naively we need to walk a linearization of the inheritance graph and ask each class we encounter along the way if it has a method of the specified name. Clearly, this is not going to be terribly fast if we do it on every method call. However, a particular method name on a particular type (identified precisely, without regard to subclassing) will resolve to the same method each time. Thus, we can cache the outcome of the lookup, and use it whenever the type of the invocant matches that used to produce the cached result.
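As a toy sketch of that idea in Raku (nothing like MoarVM’s actual implementation), one might cache the resolved method keyed on the exact type of the invocant plus the method name:

my %method-cache;
sub cached-find-method($invocant, Str $name) {
    # Key on the exact type name of the invocant; reuse the previously
    # resolved Method object whenever the same type/name pair shows up again.
    %method-cache{$invocant.^name ~ '::' ~ $name} //= $invocant.^find_method($name);
}

my $meth = cached-find-method("hi", 'uc');
say $meth("hi");   # HI (a Method can be invoked with the invocant as first argument)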
When one starts building a runtime aimed at a particular language, and has to do it on a pretty tight budget, the most obvious way to get somewhat tolerable performance is to bake various hot-path language semantics into the runtime. This is exactly how MoarVM started out. Thus, if we look at MoarVM as it stood several years ago, we find things like:
- A multiple dispatch cache keyed on argument types (multiple dispatch involving where comes at a very high cost)
- A method resolution cache per type
- An invocation protocol (the Sub object has a private attribute in it that holds the low-level code handle identifying the bytecode to run)

These are all still there today, however they are also all on the way out. What’s most telling about this list is what isn’t included. Things like:
- The “call me maybe” method call ($obj.?method-name())
- Dispatch qualified with a class to start looking in ($obj.SomeType::method-name())
- Private method calls in roles, which can’t be resolved statically

A few years back I started to partially address this, with the introduction of a mechanism I called “specializer plugins”. But first, what is the specializer?
When MoarVM started out, it was a relatively straightforward interpreter of bytecode. It only had to be fast enough to beat the Parrot VM in order to get a decent amount of usage, which I saw as important to have before going on to implement some more interesting optimizations (back then we didn’t have the kind of pre-release automated testing infrastructure we have today, and so depended much more on feedback from early adopters). Anyway, soon after being able to run pretty much as much of the Raku language as any other backend, I started on the dynamic optimizer. It gathered type statistics as the program was interpreted, identified hot code, put it into SSA form, used the type statistics to insert guards, used those together with static properties of the bytecode to analyze and optimize, and produced specialized bytecode for the function in question. This bytecode could elide type checks and various lookups, as well as using a range of internal ops that make all kinds of assumptions, which were safe because of the program properties that were proved by the optimizer. This is called specialized bytecode because it has had a lot of its genericity – which would allow it to work correctly on all types of value that we might encounter – removed, in favor of working in a particular special case that actually occurs at runtime. (Code, especially in more dynamic languages, is generally far more generic in theory than it ever turns out to be in practice.)
This component – the specializer, known internally as “spesh” – delivered a significant further improvement in the performance of Raku programs, and with time its sophistication has grown, taking in optimizations such as inlining and escape analysis with scalar replacement. These aren’t easy things to build – but once a runtime has them, they create design possibilities that didn’t previously exist, and make decisions made in their absence look sub-optimal.
Of note, those special-cased language-specific mechanisms, baked into the runtime to get some speed in the early days, instead become something of a liability and a bottleneck. They have complex semantics, which means they are either opaque to the optimizer (so it can’t reason about them, meaning optimization is inhibited) or they need special casing in the optimizer (a liability).
So, back to specializer plugins. I reached a point where I wanted to take on the performance of things like $obj.?meth
(the “call me maybe” dispatch), $obj.SomeType::meth()
(dispatch qualified with a class to start looking in), and private method calls in roles (which can’t be resolved statically). At the same time, I was getting ready to implement some amount of escape analysis, but realized that it was going to be of very limited utility because assignment had also been special-cased in the VM, with a chunk of opaque C code doing the hot path stuff.
But why did we have the C code doing that hot-path stuff? Well, because it’d be too expensive to have every assignment call a VM-level function that does a bunch of checks and logic. Why is that costly? Because of function call overhead and the costs of interpretation. This was all true once upon a time. But, some years of development later – with the specializer able to inline such calls and optimize across them – those costs largely melt away.
I solved the assignment problem and the dispatch problems mentioned above with the introduction of a single new mechanism: specializer plugins. A plugin is registered under a name; the first time a callsite using it is reached, the plugin function is called with the arguments, records guards about them (their types, definedness, and so on), and returns a resolution, which is then cached and reused for as long as the recorded guards hold.
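As a rough sketch of what that looked like in practice – reconstructed from how the plugins were described when they were introduced, so treat the exact names (nqp::speshreg, nqp::speshguardtype) and details as illustrative – the “call me maybe” plugin was along these lines:

nqp::speshreg('perl6', 'maybemeth', sub ($obj, str $name) {
    # Record a guard on the exact type of the invocant...
    nqp::speshguardtype($obj, $obj.WHAT);
    # ...then resolve once; the outcome is reused while the guard holds.
    my $meth := $obj.HOW.find_method($obj, $name);
    nqp::isconcrete($meth)
        ?? $meth
        !! -> *@pos, *%named { Nil }
});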
The vast majority of cases are monomorphic, meaning that only one set of guards are produced and they always succeed thereafter. The specializer can thus compile those guards into the specialized bytecode and then assume the given target invocant is what will be invoked. (Further, duplicate guards can be eliminated, so the guards a particular plugin introduces may reduce to zero.)
Specializer plugins felt pretty great. One new mechanism solved multiple optimization headaches.
The new MoarVM dispatch mechanism is the answer to a fairly simple question: what if we get rid of all the dispatch-related special-case mechanisms in favor of something a bit like specializer plugins? The resulting mechanism would need to be more powerful than specializer plugins. Further, I could learn from some of the shortcomings of specializer plugins. Thus, while they will go away after a relatively short lifetime, I think it’s fair to say that I would not have been in a place to design the new MoarVM dispatch mechanism without that experience.
All the method caching. All the multi dispatch caching. All the specializer plugins. All the invocation protocol stuff for unwrapping the bytecode handle in a code object. It’s all going away, in favor of a single new dispatch instruction. Its name is, boringly enough, dispatch
. It looks like this:
dispatch_o result, 'dispatcher-name', callsite, arg0, arg1, ..., argN
Which means:

- Find the dispatcher registered under dispatcher-name
- Have it process the dispatch, with the arguments arg0, arg1, ..., argN as described by callsite
- Place whatever the dispatch produces into the register result
(Aside: this implies a new calling convention, whereby we no longer copy the arguments into an argument buffer, but instead pass the base of the register set and a pointer into the bytecode where the register argument map is found, and then do a lookup registers[map[argument_index]]
to get the value for an argument. That alone is a saving when we interpret, because we no longer need a loop around the interpreter per argument.)
Some of the arguments might be things we’d traditionally call arguments. Some are aimed at the dispatch process itself. It doesn’t really matter – but it is more optimal if we arrange to put arguments that are only for the dispatch first (for example, the method name), and those for the target of the dispatch afterwards (for example, the method parameters).
The new bootstrap mechanism provides a small number of built-in dispatchers, whose names start with “boot-“. They are:
- boot-value – take the first argument and use it as the result (the identity function, except discarding any further arguments)
- boot-constant – take the first argument and produce it as the result, but also treat it as a constant value that will always be produced (thus meaning the optimizer could consider any pure code used to calculate the value as dead)
- boot-code – take the first argument, which must be a VM bytecode handle, and run that bytecode, passing the rest of the arguments as its parameters; evaluate to the return value of the bytecode
- boot-syscall – treat the first argument as the name of a VM-provided built-in operation, and call it, providing the remaining arguments as its parameters
- boot-resume – resume the topmost ongoing dispatch

That’s pretty much it. Every dispatcher we build, to teach the runtime about some other kind of dispatch behavior, eventually terminates in one of these.
Teaching MoarVM about different kinds of dispatch is done using nothing less than the dispatch mechanism itself! For the most part, boot-syscall
is used in order to register a dispatcher, set up the guards, and provide the result that goes with them.
Here is a minimal example, taken from the dispatcher test suite, showing how a dispatcher that provides the identity function would look:
nqp::dispatch('boot-syscall', 'dispatcher-register', 'identity', -> $capture {
nqp::dispatch('boot-syscall', 'dispatcher-delegate', 'boot-value', $capture);
});
sub identity($x) {
nqp::dispatch('identity', $x)
}
ok(identity(42) == 42, 'Can define identity dispatch (1)');
ok(identity('foo') eq 'foo', 'Can define identity dispatch (2)');
In the first statement, we call the dispatcher-register
MoarVM system call, passing a name for the dispatcher along with a closure, which will be called each time we need to handle the dispatch (which I tend to refer to as the “dispatch callback”). It receives a single argument, which is a capture of arguments (not actually a Raku-level Capture
, but the idea – an object containing a set of call arguments – is the same).
Every user-defined dispatcher should eventually use dispatcher-delegate
in order to identify another dispatcher to pass control along to. In this case, it delegates immediately to boot-value
– meaning it really is nothing except a wrapper around the boot-value
built-in dispatcher.
The sub identity
contains a single static occurrence of the dispatch
op. Given we call the sub twice, we will encounter this op twice at runtime, but the two times are very different.
The first time is the “record” phase. The arguments are formed into a capture and the callback runs, which in turn passes it along to the boot-value
dispatcher, which produces the result. This results in an extremely simple dispatch program, which says that the result should be the first argument in the capture. Since there’s no guards, this will always be a valid result.
The second time we encounter the dispatch
op, it already has a dispatch program recorded there, so we are in run mode. Turning on a debugging mode in the MoarVM source, we can see the dispatch program that results looks like this:
Dispatch program (1 temporaries)
Ops:
Load argument 0 into temporary 0
Set result object value from temporary 0
That is, it reads argument 0 into a temporary location and then sets that as the result of the dispatch. Notice how there is no mention of the fact that we went through an extra layer of dispatch; those have zero cost in the resulting dispatch program.
Argument captures are immutable. Various VM syscalls exist to transform them into new argument captures with some tweak, for example dropping or inserting arguments. Here’s a further example from the test suite:
nqp::dispatch('boot-syscall', 'dispatcher-register', 'drop-first', -> $capture {
my $capture-derived := nqp::dispatch('boot-syscall', 'dispatcher-drop-arg', $capture, 0);
nqp::dispatch('boot-syscall', 'dispatcher-delegate', 'boot-value', $capture-derived);
});
ok(nqp::dispatch('drop-first', 'first', 'second') eq 'second',
'dispatcher-drop-arg works');
This drops the first argument before passing the capture on to the boot-value
dispatcher – meaning that it will return the second argument. Glance back at the previous dispatch program for the identity function. Can you guess how this one will look?
Well, here it is:
Dispatch program (1 temporaries)
Ops:
Load argument 1 into temporary 0
Set result string value from temporary 0
Again, while in the record phase of such a dispatcher we really do create capture objects and make a dispatcher delegation, the resulting dispatch program is far simpler.
Here’s a slightly more involved example:
my $target := -> $x { $x + 1 }
nqp::dispatch('boot-syscall', 'dispatcher-register', 'call-on-target', -> $capture {
my $capture-derived := nqp::dispatch('boot-syscall',
'dispatcher-insert-arg-literal-obj', $capture, 0, $target);
nqp::dispatch('boot-syscall', 'dispatcher-delegate',
'boot-code-constant', $capture-derived);
});
sub cot() { nqp::dispatch('call-on-target', 49) }
ok(cot() == 50,
'dispatcher-insert-arg-literal-obj works at start of capture');
ok(cot() == 50,
'dispatcher-insert-arg-literal-obj works at start of capture after link too');
Here, we have a closure stored in a variable $target
. We insert it as the first argument of the capture, and then delegate to boot-code-constant
, which will invoke that code object and pass the other dispatch arguments to it. Once again, at the record phase, we really do build the derived capture with the code object inserted at the start, and really do delegate through the boot-code-constant dispatcher.
And the resulting dispatch program? It’s this:
Dispatch program (1 temporaries)
Ops:
Load collectable constant at index 0 into temporary 0
Skip first 0 args of incoming capture; callsite from 0
Invoke MVMCode in temporary 0
That is, load the constant bytecode handle that we’re going to invoke, set up the args (which are in this case equal to those of the incoming capture), and then invoke the bytecode with those arguments. The argument shuffling is, once again, gone. In general, whenever the arguments we do an eventual bytecode invocation with are a tail of the initial dispatch arguments, the arguments transform becomes no more than a pointer addition.
All of the dispatch programs seen so far have been unconditional: once recorded at a given callsite, they shall always be used. The big missing piece to make such a mechanism have practical utility is guards. Guards assert properties such as the type of an argument or if the argument is definite (Int:D
) or not (Int:U
).
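For those less familiar with Raku’s definedness types, a quick illustration of what Int:D and Int:U accept:

multi describe(Int:D $x) { "a defined Int: $x" }
multi describe(Int:U $)  { "the Int type object itself" }

say describe(42);   # a defined Int: 42
say describe(Int);  # the Int type object itself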
Here’s a somewhat longer test case, with some explanations placed throughout it.
# A couple of classes for test purposes
my class C1 { }
my class C2 { }
# A counter used to make sure we're only invoking the dispatch callback as
# many times as we expect.
my $count := 0;
# A type-name dispatcher that maps a type into a constant string value that
# is its name. This isn't terribly useful, but it is a decent small example.
nqp::dispatch('boot-syscall', 'dispatcher-register', 'type-name', -> $capture {
# Bump the counter, just for testing purposes.
$count++;
# Obtain the value of the argument from the capture (using an existing
# MoarVM op, though in the future this may go away in place of a syscall)
# and then obtain the string typename also.
my $arg-val := nqp::captureposarg($capture, 0);
my str $name := $arg-val.HOW.name($arg-val);
# This outcome is only going to be valid for a particular type. We track
# the argument (which gives us an object back that we can use to guard
# it) and then add the type guard.
my $arg := nqp::dispatch('boot-syscall', 'dispatcher-track-arg', $capture, 0);
nqp::dispatch('boot-syscall', 'dispatcher-guard-type', $arg);
# Finally, insert the type name at the start of the capture and then
# delegate to the boot-constant dispatcher.
nqp::dispatch('boot-syscall', 'dispatcher-delegate', 'boot-constant',
nqp::dispatch('boot-syscall', 'dispatcher-insert-arg-literal-str',
$capture, 0, $name));
});
# A use of the dispatch for the tests. Put into a sub so there's a single
# static dispatch op, which all dispatch programs will hang off.
sub type-name($obj) {
nqp::dispatch('type-name', $obj)
}
# Check with the first type, making sure the guard matches when it should
# (although this test would pass if the guard were ignored too).
ok(type-name(C1) eq 'C1', 'Dispatcher setting guard works');
ok($count == 1, 'Dispatch callback ran once');
ok(type-name(C1) eq 'C1', 'Can use it another time with the same type');
ok($count == 1, 'Dispatch callback was not run again');
# Test it with a second type, both record and run modes. This ensures the
# guard really is being checked.
ok(type-name(C2) eq 'C2', 'Can handle polymorphic sites when guard fails');
ok($count == 2, 'Dispatch callback ran a second time for new type');
ok(type-name(C2) eq 'C2', 'Second call with new type works');
# Check that we can use it with the original type too, and it has stacked
# the dispatch programs up at the same callsite.
ok(type-name(C1) eq 'C1', 'Call with original type still works');
ok($count == 2, 'Dispatch callback only ran a total of 2 times');
This time two dispatch programs get produced, one for C1
:
Dispatch program (1 temporaries)
Ops:
Guard arg 0 (type=C1)
Load collectable constant at index 1 into temporary 0
Set result string value from temporary 0
And another for C2:
Dispatch program (1 temporaries)
Ops:
Guard arg 0 (type=C2)
Load collectable constant at index 1 into temporary 0
Set result string value from temporary 0
Once again, no leftovers from capture manipulation, tracking, or dispatcher delegation; the dispatch program does a type guard against an argument, then produces the result string. The whole call to $arg-val.HOW.name($arg-val)
is elided, the dispatcher we wrote encoding the knowledge – in a way that the VM can understand – that a type’s name can be considered immutable.
This example is a bit contrived, but now consider that we instead look up a method and guard on the invocant type: that’s a method cache! Guard the types of more of the arguments, and we have a multi cache! Do both, and we have a multi-method cache.
The latter is interesting in so far as both the method dispatch and the multi dispatch want to guard on the invocant. In fact, in MoarVM today there will be two such type tests until we get to the point where the specializer does its work and eliminates these duplicated guards. However, the new dispatcher does not treat the dispatcher-guard-type
as a kind of imperative operation that writes a guard into the resultant dispatch program. Instead, it declares that the argument in question must be guarded. If some other dispatcher already did that, it’s idempotent. The guards are emitted once all dispatch programs we delegate through, on the path to a final outcome, have had their say.
Fun aside: those being especially attentive will have noticed that the dispatch mechanism is used as part of implementing new dispatchers too, and indeed, this ultimately will mean that the specializer can specialize the dispatchers and have them JIT-compiled into something more efficient too. After all, from the perspective of MoarVM, it’s all just bytecode to run; it’s just that some of it is bytecode that tells the VM how to execute Raku programs more efficiently!
A resumable dispatcher needs to do two things:

- Register a resumption callback alongside its normal dispatch callback
- In the dispatch callback, set the resume initialization state: a capture of whatever would be needed in order to continue with the dispatch later
When a resumption happens, the resume callback will be called, with any arguments for the resumption. It can also obtain the resume initialization state that was set in the dispatch callback. The resume initialization state contains the things needed in order to continue with the dispatch the first time it is resumed. We’ll take a look at how this works for method dispatch to see a concrete example. I’ll also, at this point, switch to looking at the real Rakudo dispatchers, rather than simplified test cases.
The Rakudo dispatchers take advantage of delegation, duplicate guards, and capture manipulations all having no runtime cost in the resulting dispatch program to, in my mind at least, quite nicely factor what is a somewhat involved dispatch process. There are multiple entry points to method dispatch: the normal boring $obj.meth()
, the qualified $obj.Type::meth()
, and the call me maybe $obj.?meth()
). These have common resumption semantics – or at least, they can be made to, provided we always carry a starting type in the resume initialization state, which is the type of the object that we do the method dispatch on.
Here is the entry point to dispatch for a normal method dispatch, with the boring details of reporting missing method errors stripped out.
# A standard method call of the form $obj.meth($arg); also used for the
# indirect form $obj."$name"($arg). It receives the decontainerized invocant,
# the method name, and the args (starting with the invocant including any
# container).
nqp::dispatch('boot-syscall', 'dispatcher-register', 'raku-meth-call', -> $capture {
# Try to resolve the method call using the MOP.
my $obj := nqp::captureposarg($capture, 0);
my str $name := nqp::captureposarg_s($capture, 1);
my $meth := $obj.HOW.find_method($obj, $name);
# Report an error if there is no such method.
unless nqp::isconcrete($meth) {
!!! 'Error reporting logic elided for brevity';
}
# Establish a guard on the invocant type and method name (however the name
# may well be a literal, in which case this is free).
nqp::dispatch('boot-syscall', 'dispatcher-guard-type',
nqp::dispatch('boot-syscall', 'dispatcher-track-arg', $capture, 0));
nqp::dispatch('boot-syscall', 'dispatcher-guard-literal',
nqp::dispatch('boot-syscall', 'dispatcher-track-arg', $capture, 1));
# Add the resolved method and delegate to the resolved method dispatcher.
my $capture-delegate := nqp::dispatch('boot-syscall',
'dispatcher-insert-arg-literal-obj', $capture, 0, $meth);
nqp::dispatch('boot-syscall', 'dispatcher-delegate',
'raku-meth-call-resolved', $capture-delegate);
});
Now for the resolved method dispatcher, which is where the resumption is handled. First, let’s look at the normal dispatch callback (the resumption callback is included but empty; I’ll show it a little later).
# Resolved method call dispatcher. This is used to call a method, once we have
# already resolved it to a callee. Its first arg is the callee, the second and
# third are the type and name (used in deferral), and the rest are the args to
# the method.
nqp::dispatch('boot-syscall', 'dispatcher-register', 'raku-meth-call-resolved',
# Initial dispatch
-> $capture {
# Save dispatch state for resumption. We don't need the method that will
# be called now, so drop it.
my $resume-capture := nqp::dispatch('boot-syscall', 'dispatcher-drop-arg',
$capture, 0);
nqp::dispatch('boot-syscall', 'dispatcher-set-resume-init-args', $resume-capture);
# Drop the dispatch start type and name, and delegate to multi-dispatch or
# just invoke if it's single dispatch.
my $delegate_capture := nqp::dispatch('boot-syscall', 'dispatcher-drop-arg',
nqp::dispatch('boot-syscall', 'dispatcher-drop-arg', $capture, 1), 1);
my $method := nqp::captureposarg($delegate_capture, 0);
if nqp::istype($method, Routine) && $method.is_dispatcher {
nqp::dispatch('boot-syscall', 'dispatcher-delegate', 'raku-multi', $delegate_capture);
}
else {
nqp::dispatch('boot-syscall', 'dispatcher-delegate', 'raku-invoke', $delegate_capture);
}
},
# Resumption
-> $capture {
... 'Will be shown later';
});
There’s an arguable cheat in raku-meth-call
: it doesn’t actually insert the type object of the invocant in place of the invocant. It turns out that it doesn’t really matter. Otherwise, I think the comments (which are to be found in the real implementation also) tell the story pretty well.
One important point that may not be clear – but follows a repeating theme – is that the setting of the resume initialization state is also more of a declarative rather than an imperative thing: there isn’t a runtime cost at the time of the dispatch, but rather we keep enough information around in order to be able to reconstruct the resume initialization state at the point we need it. (In fact, when we are in the run phase of a resume, we don’t even have to reconstruct it in the sense of creating a capture object.)
Now for the resumption. I’m going to present a heavily stripped down version that only deals with the callsame
semantics (the full thing has to deal with such delights as lastcall
and nextcallee
too). The resume initialization state exists to seed the resumption process. Once we know we actually do have to deal with resumption, we can do things like calculating the full list of methods in the inheritance graph that we want to walk through. Each resumable dispatcher gets a single storage slot on the call stack that it can use for its state. It can initialize this in the first step of resumption, and then update it as we go. Or more precisely, it can set up a dispatch program that will do this when run.
A linked list turns out to be a very convenient data structure for the chain of candidates we will walk through. We can work our way through a linked list by keeping track of the current node, meaning that there need only be a single thing that mutates, which is the current state of the dispatch. The dispatch program mechanism also provides a way to read an attribute from an object, and that is enough to express traversing a linked list into the dispatch program. This also means zero allocations.
So, without further ado, here is the linked list (rather less pretty in NQP, the restricted Raku subset, than it would be in full Raku):
# A linked list is used to model the state of a dispatch that is deferring
# through a set of methods, multi candidates, or wrappers. The Exhausted class
# is used as a sentinel for the end of the chain. The current state of the
# dispatch points into the linked list at the appropriate point; the chain
# itself is immutable, and shared over (runtime) dispatches.
my class DeferralChain {
has $!code;
has $!next;
method new($code, $next) {
my $obj := nqp::create(self);
nqp::bindattr($obj, DeferralChain, '$!code', $code);
nqp::bindattr($obj, DeferralChain, '$!next', $next);
$obj
}
method code() { $!code }
method next() { $!next }
};
my class Exhausted {};
And finally, the resumption handling.
nqp::dispatch('boot-syscall', 'dispatcher-register', 'raku-meth-call-resolved',
# Initial dispatch
-> $capture {
    ... 'Presented earlier';
},
# Resumption. The resume init capture's first two arguments are the type
# that we initially did a method dispatch against and the method name
# respectively.
-> $capture {
# Work out the next method to call, if any. This depends on if we have
# an existing dispatch state (that is, a method deferral is already in
# progress).
my $init := nqp::dispatch('boot-syscall', 'dispatcher-get-resume-init-args');
my $state := nqp::dispatch('boot-syscall', 'dispatcher-get-resume-state');
my $next_method;
if nqp::isnull($state) {
# No state, so just starting the resumption. Guard on the
# invocant type and name.
my $track_start_type := nqp::dispatch('boot-syscall', 'dispatcher-track-arg', $init, 0);
nqp::dispatch('boot-syscall', 'dispatcher-guard-type', $track_start_type);
my $track_name := nqp::dispatch('boot-syscall', 'dispatcher-track-arg', $init, 1);
nqp::dispatch('boot-syscall', 'dispatcher-guard-literal', $track_name);
# Also guard on there being no dispatch state.
my $track_state := nqp::dispatch('boot-syscall', 'dispatcher-track-resume-state');
nqp::dispatch('boot-syscall', 'dispatcher-guard-literal', $track_state);
# Build up the list of methods to defer through.
my $start_type := nqp::captureposarg($init, 0);
my str $name := nqp::captureposarg_s($init, 1);
my @mro := nqp::can($start_type.HOW, 'mro_unhidden')
?? $start_type.HOW.mro_unhidden($start_type)
!! $start_type.HOW.mro($start_type);
my @methods;
for @mro {
my %mt := nqp::hllize($_.HOW.method_table($_));
if nqp::existskey(%mt, $name) {
@methods.push(%mt{$name});
}
}
# If there's nothing to defer to, we'll evaluate to Nil (just don't set
# the next method, and it happens below).
if nqp::elems(@methods) >= 2 {
# We can defer. Populate next method.
@methods.shift; # Discard the first one, which we initially called
$next_method := @methods.shift; # The immediate next one
# Build chain of further methods and set it as the state.
my $chain := Exhausted;
while @methods {
$chain := DeferralChain.new(@methods.pop, $chain);
}
nqp::dispatch('boot-syscall', 'dispatcher-set-resume-state-literal', $chain);
}
}
elsif !nqp::istype($state, Exhausted) {
# Already working through a chain of method deferrals. Obtain
# the tracking object for the dispatch state, and guard against
# the next code object to run.
my $track_state := nqp::dispatch('boot-syscall', 'dispatcher-track-resume-state');
my $track_method := nqp::dispatch('boot-syscall', 'dispatcher-track-attr',
$track_state, DeferralChain, '$!code');
nqp::dispatch('boot-syscall', 'dispatcher-guard-literal', $track_method);
# Update dispatch state to point to next method.
my $track_next := nqp::dispatch('boot-syscall', 'dispatcher-track-attr',
$track_state, DeferralChain, '$!next');
nqp::dispatch('boot-syscall', 'dispatcher-set-resume-state', $track_next);
# Set next method, which we shall defer to.
$next_method := $state.code;
}
else {
# Dispatch already exhausted; guard on that and fall through to returning
# Nil.
my $track_state := nqp::dispatch('boot-syscall', 'dispatcher-track-resume-state');
nqp::dispatch('boot-syscall', 'dispatcher-guard-literal', $track_state);
}
# If we found a next method...
if nqp::isconcrete($next_method) {
# Call with same (that is, original) arguments. Invoke with those.
# We drop the first two arguments (which are only there for the
# resumption), add the code object to invoke, and then leave it
# to the invoke or multi dispatcher.
my $just_args := nqp::dispatch('boot-syscall', 'dispatcher-drop-arg',
nqp::dispatch('boot-syscall', 'dispatcher-drop-arg', $init, 0),
0);
my $delegate_capture := nqp::dispatch('boot-syscall',
'dispatcher-insert-arg-literal-obj', $just_args, 0, $next_method);
if nqp::istype($next_method, Routine) && $next_method.is_dispatcher {
nqp::dispatch('boot-syscall', 'dispatcher-delegate', 'raku-multi',
$delegate_capture);
}
else {
nqp::dispatch('boot-syscall', 'dispatcher-delegate', 'raku-invoke',
$delegate_capture);
}
}
else {
# No method, so evaluate to Nil (boot-constant disregards all but
# the first argument).
nqp::dispatch('boot-syscall', 'dispatcher-delegate', 'boot-constant',
nqp::dispatch('boot-syscall', 'dispatcher-insert-arg-literal-obj',
$capture, 0, Nil));
}
});
That’s quite a bit to take in, and quite a bit of code. Remember, however, that this is only run for the record phase of a dispatch resumption. It also produces a dispatch program at the callsite of the callsame
, with the usual guards and outcome. Implicit guards are created for the dispatcher that we are resuming at that point. In the most common case this will end up monomorphic or bimorphic, although situations involving nestings of multiple dispatch or method dispatch could produce a more morphic callsite.
The design I’ve picked forces resume callbacks to deal with two situations: the first resumption and the latter resumptions. This is not ideal in a couple of ways:

- It is a bit inconvenient for those writing resume callbacks, which have to handle both situations
- It adds a little more cost to dispatches that actually do get resumed
Only the second of these really matters. The reason for the non-uniformity is to make sure that the overwhelming majority of calls, which never lead to a dispatch resumption, incur no per-dispatch cost for a feature that they never end up using. If the result is a little more cost for those using the feature, so be it. In fact, early benchmarking shows callsame
with wrap
and method calls seems to be up to 10 times faster using the new dispatcher than in current Rakudo, and that’s before the specializer understands enough about it to improve things further!
Everything I’ve discussed above is implemented, except that I may have given the impression somewhere that multiple dispatch is fully implemented using the new dispatcher, and that is not the case yet (no handling of where
clauses and no dispatch resumption support).
Getting the missing bits of multiple dispatch fully implemented is the obvious next step. The other missing semantic piece is support for callwith
and nextwith
, where we wish to change the arguments that are being used when moving to the next candidate. A few other minor bits aside, that in theory will get all of the Raku dispatch semantics at least supported.
Currently, all standard method calls ($obj.meth()
) and other calls (foo()
and $foo()
) go via the existing dispatch mechanism, not the new dispatcher. Those will need to be migrated to use the new dispatcher also, and any bugs that are uncovered will need fixing. That will get things to the point where the new dispatcher is semantically ready.
After that comes performance work: making sure that the specializer is able to deal with dispatch program guards and outcomes. The goal, initially, is to get steady state performance of common calling forms to perform at least as well as in the current master
branch of Rakudo. It’s already clear enough there will be some big wins for some things that to date have been glacial, but it should not come at the cost of regression on the most common kinds of dispatch, which have received plenty of optimization effort before now.
Furthermore, NQP – the restricted form of Raku that the Rakudo compiler and other bits of the runtime guts are written in – also needs to be migrated to use the new dispatcher. Only when that is done will it be possible to rip out the current method cache, multiple dispatch cache, and so forth from MoarVM.
An open question is how to deal with backends other than MoarVM. Ideally, the new dispatch mechanism will be ported to those. A decent amount of it should be possible to express in terms of the JVM’s invokedynamic
(and this would all probably play quite well with a Truffle-based Raku implementation, although I’m not sure there is a current active effort in that area).
While my current focus is to ship a Rakudo and MoarVM release that uses the new dispatcher mechanism, that won’t be the end of the journey. Some immediate ideas:
- The handles (delegation) and FALLBACK (handling missing method call) mechanisms can be made to perform better using the new dispatcher
- assuming – used to curry or otherwise prime arguments for a routine – is not ideal, and an implementation that takes advantage of the argument rewriting capabilities of the new dispatcher would likely perform a great deal better

Some new language features may also be possible to provide in an efficient way with the help of the new dispatch mechanism. For example, there’s currently not a reliable way to try to invoke a piece of code, just run it if the signature binds, or to do something else if it doesn’t. Instead, things like the Cro router have to first do a trial bind of the signature, and then do the invoke, which makes routing rather more costly. There’s also the long suggested idea of providing pattern matching via signatures with the when
construct (for example, when * -> ($x) {}; when * -> ($x, *@tail) { }
), which is pretty much the same need, just in a less dynamic setting.
Working on the new dispatch mechanism has been a longer journey than I first expected. The resumption part of the design was especially challenging, and there’s still a few important details to attend to there. Something like four potential approaches were discarded along the way (although elements of all of them influenced what I’ve described in this post). Abstractions that hold up are really, really, hard.
I also ended up having to take a couple of months away from doing Raku work at all, felt a bit crushed during some others, and have been juggling this with the equally important RakuAST project (which will be simplified by being able to assume the presence of the new dispatcher, and also offers me a range of softer Raku hacking tasks, whereas the dispatcher work offers few easy pickings).
Given all that, I’m glad to finally be seeing the light at the end of the tunnel. The work that remains is enumerable, and the day we ship a Rakudo and MoarVM release using the new dispatcher feels a small number of months away (and I hope writing that is not tempting fate!)
The new dispatcher is probably the most significant change to MoarVM since I founded it, in so far as it sees us removing a bunch of things that have been there pretty much since the start. RakuAST will also deliver the greatest architectural change to the Rakudo compiler in a decade. Both are an opportunity to fold years of learning things the hard way into the runtime and compiler. I hope when I look back at it all in another decade’s time, I’ll at least feel I made more interesting mistakes this time around.
Many years back, Larry Wall shared his thesis on the nature of scripting. Since recently even Java gained 'script' support I thought it would be fitting to revisit the topic, and hopefully relevant to the perl and raku language community.
The weakness of Larry's treatment (which, to be fair to the author, I think is more intended to be enlightening than to be complete) is the contrast of scripting with programming. This contrast does not permit a clear separation because scripts are programs. That is to say, no matter how long or short, scripts are written commands for a machine to execute, and I think that's a pretty decent definition of a program in general.
A more useful contrast - and, I think, the intended one - is between scripts and other sorts of programs, because that allows us to compare scripting (writing scripts) with 'programming' (writing non-script programs). And to do that we need to know what other sorts of programs there are.
The short version of that answer is - systems and applications, and a bunch of other things that aren't really relevant to the working programmer, like (embedded) control algorithms, spreadsheets and database queries. (The definition I provided above is very broad, by design, because I don't want to get stuck on boundary questions). Most programmers write applications, some write systems, virtually all write scripts once in a while, though plenty of people who aren't professional programmers also write scripts.
I think the defining features of applications and systems are, respectively:

- an application provides an interface through which users interact with some model of a domain
- a system provides functionality or services to other programs
Consider for instance a mail client (like thunderbird) in comparison to a mailer daemon (like sendmail) - one provides an interface to read and write e-mails (the model) and the other provides functionality to send that e-mail to other servers.
Note that under this (again, broad) definition, libraries are also system software, which makes sense, considering that their users are developers (just as for, say, PostgreSQL) who care about things like performance, reliability, and correctness. Incidentally, libraries as well as 'typical' system software (such as database engines and operating system kernels) tend to be written in languages like C and C++ for much the same reasons.
What, then, are the differences between scripts, applications, and systems? I think the following is a good list:

- scripts tend to be short, where applications and systems tend to be long
- scripts tend to be ad-hoc, written for the task at hand, where applications and systems tend to be general purpose
- scripts tend to be custom, written for their author's own use, where applications and systems are developed to be distributed to others
Obviously these distinctions aren't really binary - 'short' versus 'long', 'ad-hoc' versus 'general purpose' - and can't be used to conclusively settle the question whether something is a script or an application. (If, indeed, that question ever comes up). More important is that for the 10 or so scripts I've written over the past year - some professionally, some not - all or most of these properties held, and I'd be surprised if the same isn't true for most readers.
And - finally coming at the point that I'm trying to make today - these features point to a specific niche of programs more than to a specific technology (or set of technologies). To be exact, scripts are (mostly) short, custom programs to automate ad-hoc tasks, tasks that are either too specific or too small to develop and distribute another program for.
This has further implications on the preferred features of a scripting language (taken to mean, a language designed to enable the development of scripts). In particular:
This niche doesn't always exist. In computing environments where everything of interest is adequately captured by an application, or which lack the ability to effectively automate ad-hoc tasks (I'm thinking in particular of Windows before PowerShell), the practice of scripting tends not to develop. Similarly, in a modern 'cloud' environment, where system setup is controlled by a state machine hosted by another organization, scripting doesn't really have much of a future.
To put it another way, scripting only thrives in an environment that has a lot of 'scriptable' tasks; meaning tasks for which there isn't already a pre-made solution available, environments that have powerful facilities available for a script to access, and whose users are empowered to automate those tasks. Such qualities are common on Unix/Linux 'workstations' but rather less so on smartphones and (as noted before) cloud computing environments.
Truth be told I'm a little worried about that development. I could point to, and expound on, the development and popularity of languages like Go and Rust, which aren't exactly scripting languages, or the replacement of JavaScript with TypeScript, to make the point further, but I don't think that's necessary. At the same time I could point to the development of data science as a discipline to demonstrate that scripting is alive and well (and indeed perhaps more economically relevant than before).
What should be the conclusion for perl 5/7 and raku? I'm not quite sure, mostly because I'm not quite sure whether the broader perl/raku community would prefer their sister languages to be scripting or application languages. (As implied above, I think the Python community decided that they wanted Python 3 to be an application language, and this was not without consequences for their users).
Raku adds a number of features common to application languages (I'm thinking of its powerful type system in particular), continuing a trend that perl 5 arguably pioneered. This is indeed a very powerful strategy - a language can be introduced for scripts and some of those scripts are then extended into applications (or even systems), thereby ensuring its continued usage. But for it to work, a new perl family language must be introduced on its scripting merits, and there must be a plentiful supply of scriptable tasks to automate, some of which - or a combination of which - grow into an application.
For myself, I would like to see scripting have a bright future. Not just because scripting is the most accessible form of programming, but also because an environment that permits, even requires, scripting is one where not all interesting problems have been solved, one where its users ask it to do tasks so diverse that there isn't an app for that, yet. One where the true potential of the wonderful devices that surround us can be explored.
In such a world there might well be a bright future for scripting.
I’d like to thank everyone who voted for me in the recent Raku Steering Council elections. By this point, I’ve been working on the language for well over a decade, first to help turn a language design I found fascinating into a working implementation, and since the Christmas release to make that implementation more robust and performant. Overall, it’s been as fun as it has been challenging – in a large part because I’ve found myself sharing the journey with a lot of really great people. I’ve also tried to do my bit to keep the community around the language kind and considerate. Receiving a vote from around 90% of those who participated in the Steering Council elections was humbling.
Alas, I’ve today submitted my resignation to the Steering Council, on personal health grounds. For the same reason, I’ll be taking a step back from Raku core development (Raku, MoarVM, language design, etc.) Please don’t worry too much; I’ll almost certainly be fine. It may be I’m ready to continue working on Raku things in a month or two. It may also be longer. Either way, I think Raku will be better off with a fully sized Steering Council in place, and I’ll be better off without the anxiety that I’m holding a role that I’m not in a place to fulfill.
I want to revive Carl Mäsak's Coding Contest as a crowd-sourced contest.
The contest will be in four phases:
For the first phase, development of tasks, I am looking for volunteers who come up with coding tasks collaboratively. Sadly, these volunteers, including myself, will be excluded from participating in the second phase.
I am looking for tasks that ...
This is non-trivial, so I'd like to have others to discuss things with, and to come up with some more tasks.
If you want to help with task creation, please send an email to [email protected], stating your intentions to help, and your freenode IRC handle (optional).
There are other ways to help too:
In these cases you can use the same email address to contact me,
or use IRC (moritz
on freenode) or twitter.
After a perilous drive up a steep, narrow, winding road from Lake Geneva we arrived at an attractive Alpine village (Villars-sur-Ollon) to meet with fellow Perl Mongers in a small restaurant. There followed much talk and a little clandestine drinking of exotic spirits including Swiss whisky. The following morning walking to the conference venue there was an amazing view of mountain ranges. On arrival I failed to operate the Nespresso machine which I later found was due to it simply being off. Clearly software engineers should never try to use hardware. At least after an evening of drinking.
Wendy’s stall was piled high with swag including new Bailador (Perl 6 dancer like framework) stickers, a Shadowcat booklet about Perl 6 and the new O’Reilly “Thinking in Perl 6”. Unfortunately she had sold out of Moritz’s book “Perl 6 Fundamentals” (although there was a sample display copy present). Thankfully later that morning I discovered I had a £3 credit on Google Play Books so I bought the ebook on my phone.
The conference started early with Damian Conway’s Three Little Words. These were “has”, “class” and “method” from Perl 6 which he liked so much that he had added them to Perl 5 with his “Dios” – “Declarative Inside-Out Syntax” module. PPI wasn’t fast enough so he had to replace it with a 50,000 character regex PPR. Practical everyday modules mentioned included Regexp::Optimizer and Test::Expr. If the video doesn’t appear shortly on youtube a version of his talk dating from a few weeks earlier is available at https://www.youtube.com/watch?v=ob6YHpcXmTg
Jonathan Worthington returned with his Perl 6 talk on “How does deoptimization help us go faster?” giving us insight into why Perl 6 was slow at the Virtual Machine level (specifically MoarVM). Even apparently simple and fast operations like indexing an array were slow due to powerful abstractions, late binding and many levels of Multiple Dispatch. In short the flexibility and power of such an extensible language also led to slowness due to the complexity of code paths. The AST optimizer helped with this at compile time but itself took time and it could be better to do this at a later compile time (like Just In Time). Even with a simple program reading lines from a file it was very hard to determine statically what types were used (even with type annotations) and whether it was worth optimizing (since the file could be very short).
The solution to these dynamic problems was also dynamic but to see what was happening needed cheap logging of execution which was passed to another thread. This logging is made visible by setting the environment variable MVM_SPESH_LOG to a filename. Better tooling for this log would be a good project for someone.
For execution planning we look for hot (frequently called) code, long blocks of bytecode (slow to run) and consider how many types are used (avoiding “megamorphic” cases with many types which needs many versions of code). Also analysis of the code flow between different code blocks and SSA. Mixins made the optimization particularly problematic.
MoarVM’s Spesh did statistical analysis of the code in order to rewrite it in faster, simpler ways. Guards (cheap checks for things like types) were placed to catch cases where it got it wrong, and if these were triggered (infrequently) it would deoptimize as well, hence the counterintuitive title, since “Deoptimization enables speculation”. The slides are at http://jnthn.net/papers/2017-spw-deopt.pdf with the video at https://www.youtube.com/watch?v=3umNn1KnlCY The older and more dull-witted of us (including myself) might find the latter part of the video more comprehensible at 0.75 Youtube speed.
After a superb multi-course lunch (the food was probably the best I’d had at any Perl event) we returned promptly to hear Damian talk of “Everyday Perl 6”. He pointed out that it wasn’t necessary to code golf obfuscated extremes of Perl 6 and that the average Perl 5 programmer would see many things simpler in Perl 6. Also a rewrite from 5 to 6 might see something like 25% fewer lines of code since 6 was more expressive in syntax (as well as more consistent) although performance problems remained (and solutions in progress as the previous talk had reminded us).
Next Liz talked of a “gross” (in the numerical sense of 12 x 12 rather than the American teen sense) of Perl 6 Weeklies as she took us down memory lane to 2014 (just about when MoarVM was launched and when unicode support was poor!) with some selected highlights and memories of Perl 6 developers of the past (and hopefully future again!). Her talk was recorded at https://www.youtube.com/watch?v=418QCTXmvDU
Cal then spoke of Perl 6 maths which he thought was good with its Rats and FatRats but not quite good enough and his ideas of fixing it. On the following day he showed us he had started some TDD work on TrimRats. He also told us that Newton’s Method wasn’t very good but generated a pretty fractal. See https://www.youtube.com/watch?v=3na_Cx-anvw
Lee spoke about how to detect Perl 5 memory leaks with various CPAN modules and his examples are at https://github.com/leejo/Perl_memory_talk
The day finished with Lightning Talks and a barbecue at givengain — a main sponsor.
On the second day I noticed the robotic St Bernards dog in a tourist shop window had come to life.
Damian kicked off the talks with my favourite of his talks, “Standing on the Shoulders of Giants”, starting with the Countess of Lovelace and her Bernoulli number program. This generated a strange sequence with many zeros. The Perl 6 version since it used rational numbers not floating point got the zeros right whereas the Perl 5 version initially suffered from floating point rounding errors (which are fixable).
Among other things he showed us how to define a new infix operator in Perl 6. He also showed us a Perl 6 sort program that looked exactly like LISP, even down to the Lots of Irritating Superfluous Parentheses. I think this was quicksort (he certainly showed us a picture of Sir Tony Hoare at some point). There was also a very functional (Haskell-like) equivalent with heavy use of Perl 6 multiple dispatch, and a demonstration of Perl 6 “before” as a sort of typeless/multi-type comparison infix. Damian then returned to his old favourite of Quantum Computing.
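On the infix-operator point, defining and using one takes only a line or two (my own toy example, not Damian’s):

sub infix:<±>(Numeric \n, Numeric \delta) { (n - delta) .. (n + delta) }
say 5 ± 2;           # 3..7
say 6 ~~ (5 ± 2);    # True
say 9 ~~ (5 ± 2);    # False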
My mind and notes got a bit jumbled at this point but I particularly liked the slide that explained how factorisation could work by observing the product of possible inputs since this led to a collapse that revealed the factors. To do this on RSA etc., of course, needs real hardware support which probably only the NSA and friends have (?). Damian’s code examples are at http://www.bit.do/Perl6SOG with an earlier version of his talk at https://www.youtube.com/watch?v=Nq2HkAYbG5o Around this point there was a road race of classic cars going on outside up the main road into the village and there were car noises in the background that strangely were more relaxing than annoying.
After Quantum Chaos Paul Johnson brought us all back down to ground with an excellent practical talk on modernising legacy Perl 5 applications based on his war stories. Hell, of course, is “Other People’s Code”, often dating from Perl’s early days and lacking documentation and sound engineering.
Often the original developers had long since departed or, in the worst cases, were still there. Adding tests and logging (with stack traces) were particularly useful, as was moving to git (although its steep learning curve meant mentoring was needed) and handling CPAN module versioning with Pinto. Many talks had spoken of the Perl 6 future whereas this spoke of the Perl 5 past and present and the work many of us suffer to pay the bills. It’s at https://www.youtube.com/watch?v=4G5EaUNOhR0
Jonathan then spoke of reactive distributed software. A distributed system is an async one where “Is it working?” means “some of it is working but we don’t know which bits”. Good OO design is “tell, don’t ask”: you tell a remote service to do something for you rather than pulling back its data, parsing the response and doing the work yourself, which breaks encapsulation. This is particularly important in building well-designed distributed systems, since otherwise the systems are less responsive and reliable. Reactive (async) works better for distributed software than interactive (blocking or sync).
We saw a table pairing the reactive (async) types (a Perl 6 Promise for one value and a Supply for many values) with their interactive (sync) equivalents (a plain value and a Perl 6 Seq). A Supply could be used for pub/sub and the Observer Pattern. A Supply could either be live (like broadcast TV) or, for most Perl 6 supplies, on-demand (like Netflix). Then samples of networking (socket) based code were discussed, including a web client, a web server and SSH::LibSSH (async client bindings often very useful in practical applications like port forwarding):
https://github.com/jnthn/p6-ssh-libssh
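Returning to that one-value/many-values table, here is a tiny sketch of the reactive pairing (my own example, not code from the talk):

my $one = start { 6 * 7 };        # asynchronous single value: a Promise
say await $one;                   # 42
my $many = Supply.interval(1);    # asynchronous stream of values: a Supply
$many.tap: { say "tick $_" };     # Observer Pattern: subscribe to the stream
sleep 3.5;                        # let a few ticks arrive before exiting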
Much of the socket code had a pattern of “react { whenever { ... } }” blocks, with “whenever” as a sort of async loop. He then moved on from sockets to services (using a Supply pipeline) and amazed us by announcing the release of “cro”, a microservices library that even supports HTTP/2 and WebSockets, at http://mi.cro.services/. This is installable using Perl 6 by “zef install --/test cro”.
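That pattern looks roughly like this (a minimal sketch of an async echo server; the port number is arbitrary and this is not code from the talk):

react {
    whenever IO::Socket::Async.listen('127.0.0.1', 3333) -> $conn {
        whenever $conn.Supply.lines -> $line {
            await $conn.print("echo: $line\n");   # echo each line back
        }
    }
}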
Slides at http://jnthn.net/papers/2017-spw-sockets-services.pdf and video at https://www.youtube.com/watch?v=6CsBDnTUJ3A
Next Lee showed Burp Scanner, which is payware but probably the best web vulnerability scanner. I wondered if anyone had dared to run it on ACT or the hotel’s captive portal.
Wendy did some cheerleading in her “Changing Image of Perl”. An earlier version is at https://www.youtube.com/watch?v=Jl6iJIH7HdA
Sue’s talk was “Spiders, Gophers, Butterflies” although the latter were mostly noticeably absent. She promises me that a successor version of the talk will use them more extensively. Certainly any Perl 6 web spidering code is likely to fit better on one slide than the Go equivalent.
During the lightning talks Timo showed us a very pretty Perl 6 program using his SDL2::Raw to draw an animated square spiral with hypnotic colour-cycling patterns. There was also a talk by the author about https://bifax.org/bif/ — a distributed bug tracking system (which works offline, like git).
Later in the final evening many of us ate and chatted in another restaurant where we witnessed a dog fight being narrowly averted and learnt that Wendy didn’t like Perl 5’s bless for both technical and philosophical reasons.
Time for some old man's reminiscence. Or so it feels when I realize that I've spent more than 10 years involved with the Perl 6 community.
It was February 2007.
I was bored. I had lots of free time (crazy to imagine that now...), and I spent some of that answering (Perl 5) questions on perlmonks. There was a category of questions where I routinely had no good answers, and those were related to threads. So I decided to play with threads, and got frustrated pretty quickly.
And then I remember that a friend in school had told me (about four years earlier) that there was this Perl 6 project that wanted to do concurrency really well, and even automatically parallelize some stuff. And this was some time ago, maybe they had gotten anywhere?
So I searched the Internet, and found out about Pugs, a Perl 6 compiler written in Haskell. And I wanted to learn more, but some of the links to the presentations were dead. I joined the #perl6 IRC channel to report the broken link.
And within three minutes I got a "thank you" for the report, the broken links were gone, and I had an invitation for a commit bit to the underlying SVN repo.
I stayed.
Those were the wild young days of Perl 6 and Pugs. Audrey Tang was pushing Pugs (and Haskell) very hard, and often implemented a feature within 20 minutes after somebody mentioned it. Things were unstable, broken often, and usually fixed quickly. No idea was too crazy to be considered or even implemented.
We had bots that evaluated Perl 6 and Haskell code, and gave the result directly on IRC. There were lots of cool (and sometimes somewhat frightening) automations, for example for inviting others to the SVN repo, to the shared hosting system (called feather), for searching SVN logs and so on. Since git was still obscure and very hard to use, people tried to use SVK, an attempt to implement a decentralized version control system on top of the SVN protocol.
Despite some half-hearted attempts, I didn't really make inroads into compiler developments. Having worked with neither Haskell nor compilers before proved to be a pretty steep step. Instead I focused on some early modules, documentation, tests, and asking and answering questions. When the IRC logger went offline for a while, I wrote my own, which is still in use today.
I felt at home in that IRC channel and the community. When the community asked for mentors for the Google Summer of Code project, I stepped up. The project was a revamp of the Perl 6 test suite, and to prepare for the mentoring task, I decided to dive deeper. That made me the maintainer of the test suite.
I can't recount a full history of Perl 6 projects during that time range, but I want to reflect on some projects that I considered my pet projects, at least for some time.
It is not quite clear from this (very selected) timeline, but my Perl 6 related activity dropped around 2009 or 2010. This is when I started to work full time, moved in with my girlfriend (now wife), and started to plan a family.
The technologies and ideas in Perl 6 are fascinating, but that's not what kept me. I came for the technology, but stayed for the community.
There were and are many great people in the Perl 6 community, some of whom I am happy to call my friends. Whenever I get the chance to attend a Perl conference, workshop or hackathon, I find a group of Perl 6 hackers to hang out and discuss with, and generally have a good time.
Four events stand out in my memory. In 2010 I was invited to the Open Source Days in Copenhagen. I missed most of the conference, but spent a day or two with (if memory serves right) Carl Mäsak, Patrick Michaud, Jonathan Worthington and Arne Skjærholt. We spent some fun time trying to wrap our minds around macros, the intricacies of human and computer language, and Japanese food. (Ok, the last one was easy). Later the same year, I attended my first YAPC::EU in Pisa, and met most of the same crowd again -- this time joined by Larry Wall, and over three or four days. I still fondly remember the Perl 6 hallway track from that conference. And in 2012 I flew to Oslo for a Perl 6 hackathon, with a close-knit, fabulous group of Perl 6 hackers. Finally, the Perl Reunification Summit in the beautiful town of Perl in Germany, which brought together Perl 5 and Perl 6 hackers in a very relaxed atmosphere.
For three of these four events, different private sponsors from the Perl and Perl 6 community covered travel and/or hotel costs, with their only motivation being meeting folks they liked, and seeing the community and technology flourish.
The Perl 6 community has evolved a lot over the last ten years, but it is still a very friendly and welcoming place. There are lots of "new" folks (where "new" is everybody who joined after me, of course :D), and a surprising number of the old guard still hang around, some more involved, some less, all of them still very friendly and supportive.
I anticipate that my family and other projects will continue to occupy much of my time, and it is unlikely that I'll be writing another Perl 6 book (after the one about regexes) any time soon. But the Perl 6 community has become a second home for me, and I don't want to miss it.
In the future, I see myself supporting the Perl 6 community through infrastructure (community servers, IRC logs, running IRC bots etc.), answering questions, writing a blog article here and there, but mostly empowering the "new" guard to do whatever they deem best.
After about nine months of work, my book Perl 6 Fundamentals is now available for purchase on apress.com and springer.com.
The ebook can be purchased right now, and comes in the epub and PDF formats (with watermarks, but DRM free). The print form can be pre-ordered from Amazon, and will become ready for shipping in about a week or two.
I will make a copy of the ebook available for free for everybody who purchased an earlier version, "Perl 6 by Example", from LeanPub.
The book is aimed at people familiar with the basics of programming; prior Perl 5 or Perl 6 knowledge is not required. It features a practical example in most chapters (no mammal hierarchies or class Rectangle inheriting from class Shape), ranging from simple input/output and text formatting to plotting with Python's matplotlib libraries. Other examples include date and time conversion, a Unicode search tool and a directory size visualization.

I use these examples to explain a subset of Perl 6, with many pointers to more documentation where relevant. Perl 6 topics include the basic lexicographic structure, testing, input and output, multi dispatch, object orientation, regexes and grammars, usage of modules, functional programming and interaction with Python libraries through Inline::Python.
Let me finish with Larry Wall's description of this book, quoted from his foreword:
It's not just a reference, since you can always find such materials online. Nor is it just a cookbook. I like to think of it as an extended invitation, from a well-liked and well-informed member of our circle, to people like you who might want to join in on the fun. Because joy is what's fundamental to Perl. The essence of Perl is an invitation to love, and to be loved by, the Perl community. It's an invitation to be a participant of the gift economy, on both the receiving and the giving end.
The Perl 6 naming debate has started again. And I guess with good reason. Teaching people that Perl 6 is a Perl, but not the Perl requires too much effort. Two years ago, I didn't believe. Now you're reading a tired man's words.
I'm glad that this time, we're not discussing giving up the "Perl" brand, which still has very positive connotations in my mind, and in many other minds as well.
And yet, I can't bring myself to like "Rakudo Perl 6" as a name. There are two very shallow reasons for that: one, going from two syllables, "Perl six", to five of them seems a step in the wrong direction; and two, I remember the days when the name was pretty young, and people would misspell it all the time. That seems to have abated, though I don't know why.
But there's also a deeper reason, probably a sentimental old man's reason. I remember the days when Pugs was actively developed, and formed the center of a vibrant community. When kp6 and SMOP and all those weird projects were around. And then, just when it looked like only a single compiler was left, Stefan O'Rear conjured up niecza, almost single-handedly, and out of thin air. Within months, it was a viable Perl 6 compiler that people on #perl6 readily recommended.
All of this was born out of the vision that Perl 6 was a language with no single, preferred compiler. Changing the language name to include the compiler name means abandoning this vision. How can we claim to welcome alternative implementations when the commitment to one compiler is right in the language name?
However I can't weigh this loss of vision against a potential gain in popularity. I can't decide if it's my long-term commitment to the name "Perl 6" that makes me resent the new name, or valid objections. The lack of vision mirrors my own state of mind pretty well.
I don't know where this leaves us. I guess I must apologize for wasting your time by publishing this incoherent mess.
Perl 6 is innovative in many ways, and sometimes we don't fully appreciate all the implications, for good or for bad.
There's one I stumbled upon recently: The use of fancy Unicode symbols for built-in stuff. In this case: the `.gist` output of Match objects. For example
my token word { \w+ }
say 'abc=def' ~~ /<word> '=' <word>/;

produces this output:

「abc=def」
 word => 「abc」
 word => 「def」
And that's where the problems start. In my current quest to write a book on Perl 6 regexes, I noticed that the PDF that LeanPub generates from my Markdown sources doesn't correctly display those pesky 「」 characters, which are

$ uni -c 「」
「 - U+0FF62 - HALFWIDTH LEFT CORNER BRACKET
」 - U+0FF63 - HALFWIDTH RIGHT CORNER BRACKET
When I copied the text from the PDF and pasted into my editor, they showed up correctly, which indicates that the characters are likely missing from the monospace font.
The toolchain allows control over the font used for displaying code, so I tried all the monospace fonts that were available. I tried them in alphabetical order. Among the earlier fonts I tried was Deja Vu Sans Mono, which I use in my terminal, and which hasn't let me down yet. No dice. I arrived at Noto, a font designed to cover all Unicode codepoints. And it didn't work either. So it turns out these two characters are part of some Noto Sans variants, but not of the monospace font.
My terminal, and even some font viewers, use some kind of fallback where they use glyphs from other fonts to render missing characters. The book generation toolchain does not.
The Google Group for Leanpub was somewhat helpful: if I could recommend an Open Source monospace font that fit my needs, they'd likely include it in their toolchain.
So I searched and searched, learning more about fonts than I wanted to know. My circle of geek friends came up with several suggestions, one of them being Iosevka, which actually contains those characters. So now I wait for others to step up, either for LeanPub to include that font, or for the Noto maintainers to create a monospace variant of those characters (and then LeanPub updating their version of the font).
And all of that because Perl 6 was being innovative, and used two otherwise little-used characters as delimiters, in an attempt to avoid collisions between delimiters and content.
(In the mean time I've replaced the two offending characters with ones that look similar. It means the example output is technically incorrect, but at least it's readable).
At YAPC::EU 2010 in Pisa I received a business card with "Rakudo Star" and the
date July 29, 2010 which was the date of the first release -- a week earlier
with a countdown to 1200 UTC. I still have mine, although it has a tea stain
on it and I refreshed my memory over the holidays by listening again to Patrick
Michaud speaking about the launch of Rakudo Star (R*):
https://www.youtube.com/watch?v=MVb6m345J-Q
R* was originally intended as the first of a number of distribution releases (as
opposed to a compiler release) -- usable for early adopters but not initially production
quality. Other names had been considered at the time like Rakudo Beta (rejected as
sounding like "don't use this"!) and amusingly Rakudo Adventure Edition.
Finally it became Rakudo Whatever and Rakudo Star (since * means "whatever"!).
Well over 6 years later and we never did come up with a better name although there
was at least one IRC conversation about it and perhaps "Rakudo Star" is too
well established as a brand at this point anyway. R* is the Rakudo compiler, the main docs, a module installer, some modules and some further docs.
However, one radical change is happening soon and that is a move from panda to
zef as the module installer. Panda has served us well for many years but zef is
both more featureful and more actively maintained. Zef can also install Perl
6 modules off CPAN although the CPAN-side support is in its early days. There
is a zef branch (pull requests welcome!) and a tarball at:
http://pl6anet.org/drop/rakudo-star-2016.12.zef-beta2.tar.gz
Panda has been patched to warn that it will be removed and to advise the use of
zef. Of course anyone who really wants to use panda can reinstall it using zef
anyway.
The modules inside R* haven't changed much in a while. I am considering adding
DateTime::Format (shown by ecosystem stats to be widely used) and
HTTP::UserAgent (probably the best pure perl6 web client library right now).
Maybe some modules should also be removed (although this tends to be more
controversial!). I am also wondering about OpenSSL support (if the library is
available).
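For what it's worth, typical HTTP::UserAgent usage looks something like this (a hedged sketch, assuming the module is installed and the network is reachable):

use HTTP::UserAgent;
my $ua = HTTP::UserAgent.new;
my $response = $ua.get('https://perl6.org');
if $response.is-success {
    say "fetched {$response.content.chars} characters";
}
else {
    die $response.status-line;
}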
p6doc needs some more love as a command line utility since most of the focus
has been on the website docs and in fact some of these changes have impacted
adversely on command line use, e.g. under Windows cmd.exe "perl 6" is no longer
correctly displayed by p6doc. I wonder if the website generation code should be
decoupled from the pure docs and p6doc command line (since R* has to ship any
new modules used by the website). p6doc also needs a better and faster search
(using sqlite?). R* also ships some tutorial docs including a PDF generated from perl6intro.com.
We only ship the English one and localisation to other languages could be
useful.
Currently R* is released roughly every three months (unless significant
breakage leads to a bug fix release). Problems tend to happen with the
less widely used systems (Windows and the various BSDs) and also with the
module installers and some modules. R* is useful in spotting these issues
missed by roast. Rakudo itself is still in rapid development. At some point a less frequently
updated distribution (Star LTS or MTS?) will be needed for Linux distribution
packagers and those using R* in production. There are also some question
marks over support for different language versions (6.c and 6.d).
Above all what R* (and Rakudo Perl 6 in general) needs is more people spending
more time working on it! JDFI! Hopefully this blog post might
encourage more people to get involved with github pull requests.
https://github.com/rakudo/star
Feedback, too, in the comments below is actively encouraged.
There is a Release Candidate for Rakudo Star 2016.11 (currently RC2) available at
http://pl6anet.org/drop/
This includes binary installers for Windows and Mac.
Usually Star is released about every three months but last month's release didn't include a Windows installer so there is another release.
I'm hoping to release the final version next weekend and would be grateful if people could try this out on as many systems as possible.
Any feedback email steve *dot* mynott *at* gmail *dot* com
Full draft announce at
https://github.com/rakudo/star/blob/master/docs/announce/2016.11.md
We turned up in Cluj via Wizz Air to probably one of the best pre YAPC parties ever located on three levels on the rooftop of Evozon’s plush city centre offices. We were well supplied with excellent wine, snacks and the local Ursus beer and had many interesting conversations with old friends.
On the first day Tux spoke about his Text::CSV modules for both Perl 5 and 6, and I did a short talk later in the day on benchmarking Perl 6. Only Nicholas understood my trainspotter joke slide with the APT and Deltic! Sadly my talk clashed with Lee J talking about Git, which I wanted to see, so I await the YouTube version! Jeff G then spoke about Perl 6 and parsing languages such as JavaScript. Sadly I missed Leon T’s Perl 6 talk which I also plan on watching on YouTube. Tina M gave an excellent talk on writing command line tools. She also started the lightning talks with an evangelical talk about how tmux was better than screen. Geoffrey A spoke about configuring sudo to run restricted commands in one directory, which seemed a useful technique to me. Dave C continued his conference tradition of dusting off his Perl Vogue cover and showing it again. The age of the image was emphasised by the amazingly young looking mst on it. And Stefan S ended with a call for Perl unification.
The main social event was in the courtyard of the main museum off the central square, with free food and beer all evening and an impressive light show on the slightly crumbling facade. There were some strange chairs which resembled cardboard origami but proved more comfortable than they looked when I was finally able to sit in one. The quality of the music improved as the evening progressed (or maybe the beer helped). I was amazed to see Perl Mongers actually dancing, apparently inspired by the younger Cluj.pm members.
Day Two started with Sawyer’s State of the Velociraptor which he had, sensibly, subcontracted to various leading lights of the Perl Monger community. Sue S (former London.pm leader) was up first with a short and sweet description of London.pm. Todd R talked about Houston.pm. Aaron Crane spoke about the new improved friendlier p5p. Tina about Berlin.pm and the German Perl community site she had written back in the day. This new format worked very well and it was obvious Perl Mongers groups could learn much from each other. Max M followed with a talk about using Perl and ElasticSearch to index websites and documents and Job about accessibility.
The 15:05 slot had, from the perspective of London.pm, one of the most unfortunate scheduling clashes at YAPC::EU ever, with three titans of London.pm (all former leaders) battling for audience share. I should perhaps tread carefully here lest bias become apparent, but the heavyweight Sue Spence was, perhaps treacherously, talking about Go in the big room while Dave Cross and Tom were talking about Perl errors and HTML forms respectively in the other rooms. This momentous event should be reproducible by playing all three talks together in separate windows once they are available.
Domm did a great talk on Postgres which made me keen to use this technology again. André W described how he got Perl 6 running on his Sailfish mobile phone while Larry did a good impression of a microphone stand. I missed most of Lance Wick’s talk but the bit I caught at the end made me eager to watch the whole thing.
Guinevere Nell gave a fascinating lightning talk about agent-based economic modelling. Lauren Rosenfield spoke of porting (with permission) a “Python for CS” book to Perl 6. Lukas Mai described his journey from Perl to Rust. Lee J talked about photography before Sue encouraged people to break the London.pm website. Outside the talk rooms on their stall Liz and Wendy had some highly cool stuffed toy Camelia butterflies produced by the Beverly Hills Teddy Bear Company and some strange “Camel Balls” bubblegum. At the end of the day Sue cat-herded many Mongers to eat at the Enigma Steampunk Bar in central Cluj with the cunning ploy of free beer money (recycled from the previous year’s sherry money).
The third day started with Larry’s keynote, in which photographs of an incredible American house “Fallingwater” and Chinese characters (including “arse rice”) featured heavily. Sweth C gave a fast and very useful introduction to Swift. Nicholas C then confused a room of people for an hour with a mixture of real Perl 5 and 6 and an alternative timeline complete with T-shirts. The positive conclusion was that even if the past had been different, the present isn’t likely to have been much better for the Perl language family than it is now! Tom spoke about code review and Sawyer about new features in Perl 5.24. Later I heard Ilya talk about running Perl on his Raspberry Pi Model B and increasing the speed of his application very significantly to compensate for its low speed! And we finished with lightning talks where we heard about the bug tracker OTRS (which was new to me), Job spoke about assistive tech and Nine asked us to ask our bosses for money for Perl development, amongst several other talks. We clapped a lot in thanks, since this was clearly a particularly well organised YAPC::EU (due to Amalia and her team!) and left to eat pizza and fly away the next day. Some stayed to visit a salt mine (which looked most impressive from the pictures!) and some stayed longer due to Lufthansa cancelling their flights back!
The meeting first night was in a large beer bar in the centre of Nuremberg.
We went back to the Best Western to find a certain exPumpkin already resident in the bar.
Despite several of the well named Bitburgers we managed to arrive at the
conference venue on time the following morning. Since my knowledge of German was
limited to a C grade 'O' Level last century, my reviews will be mostly
limited to English talks. Apologies in advance to those giving German talks
(not unreasonable considering the country). Hopefully other blog posts will
cover these.
Masak spoke about the dialectic between planning (like physics) and chaos (like
biology) in software development.
http://masak.org/carl/gpw-2016-domain-modeling/talk.pdf
Tobias gave a good beginners guide to Perl 6 in German and I was able to follow
most of the slides since I knew more Perl 6 than German and even learnt a thing
or two.
After lunch Stefan told us he was dancing around drunk and naked on the turn of
the 2000s and also about communication between Perl 6 and Perl 5 and back again
via his modules Inline::Perl5 (from Perl 6) -- the most important take away
being that "use Foo::Bar:from<Perl5>" can be used from Perl 6 and "use
Inline::Perl6" from Perl 5. The modules built bridges like those built in the
old school computer game "Lemmings".
http://niner.name/talks/Perl%205%20and%20Perl%206%20-%20a%20great%20team/Perl%205%20and%20Perl%206%20-%20a%20great%20team.odp
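In practice that looks something like this (a minimal sketch; it assumes Inline::Perl5 and a Perl 5 interpreter are installed, and Foo::Bar stands in for any installed Perl 5 module):

use Inline::Perl5;
my $p5 = Inline::Perl5.new;
$p5.run('print "Hello from Perl 5\n"');   # evaluate arbitrary Perl 5 code

# load a Perl 5 module as if it were a Perl 6 one:
# use Foo::Bar:from<Perl5>;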
Max told us (in German) about his Dancer::SearchApp search
engine which is based on Elasticsearch, but I was able to follow along on the
English version of his slides on the web.
http://corion.net/talks/dancer-searchapp/dancer-searchapp.en.html
Sue got excited about this. Tina showed us some slides in Vim and her module
to add command line tab completion to script arguments using zsh and bash. I
wondered whether some of her code could be repurposed to add fish shell man
page parsing autocompletion to zsh. She also had a good lightning talk about
Ingy's command line utility for github.
https://github.com/perlpunk/myslides/tree/master/app-spec
Second day started early with Moritz talking about Continuous Delivery which
could mean just delivering to a staging server. He was writing a book about it
at deploybook.com with slides at:
https://deploybook.com/talks/gpw2016-continuous-delivery.pdf
Salve wanted us to write elegant code as a reply to the Perl Jam guy at CCC in
a self-confessed "rant".
Sawyer described writing Ref::Util to optimise things like "ref $foo" in a
Hardcore Perl 5 XS/Core talk and Masak told us about his little 007 language
written in Perl 6 as a proof of concept playroom for future Perl 6 extended
macro support and demonstrated code written over lunch in support of this.
http://masak.org/carl/gpw-2016-big-hairy-yaks/talk.pdf
Stefan gave a great talk about CURLI and explained the complexity of what was
intended.
I gave my talk on "Simple Perl 6 Fractals and Concurrency" on Friday. It
started badly with AV issues on my side but seemed well received. It was useful
speaking with people about it and I managed to speed things up *after* the talk
and I should have new material for a 2.0 version.
There were very good talks on extracting data from PDFs and writing JSON apis.
https://github.com/mickeyn/PONAPI
looked very interesting and would have saved me much coding at a recent job.
There were some great lightning talks at the end of the day. Sawyer wanted
people to have English slides and gave his talk in Hebrew to stress this.
Things ended Friday night with great food and beer in a local bar.
To me it seemed a particularly good FOSDEM both for Perl 5/6 and for other
talks, although very crowded as usual, and I didn't see the usual *BSD or Tor
stalls. I was struck by the statistic that there were about 500 speakers out
of many thousands of attendees, so on the order of one speaker per tens of
attendees, which is very high.
Videos are already starting to appear at
On Saturday I started with Poettering and systemd, which was a keynote and
perhaps a little disappointing since he usually is a better speaker and the
audio was a little indistinct. systemd had won, being used by all distros
except Gentoo and Slackware. They were now working on a DNS resolver
component which supported DNSSEC, although in practice validating signed
zone files would slow down browsing and currently only 2% of websites had it
activated. He didn't mention strong criticisms of its security by crypto
experts such as DJB.
The most amusing talk was Stark's retro running of Postgres on
NetBSD/VAX which exposed some obscure OS bugs and was livened up by a
man in an impressive Postgres Elephant costume appearing. We later
spoke to Mr Elephant who said he was both blind and very hot at the
time. I then went to the Microkernel room to hear about GNU/Hurd
progress from Thibault since this room is usually "OPEN" and he's an
excellent speaker. I noticed even this obscure room was quite crowded
as compared with previous years so I'd guess total attendees this year
were high. He stressed the advantages of running device drivers in
userspace as allowing more user "freedom" to mount fs etc. without
root and improving kernel stability since the drivers could crash and
restart without bringing down the kernel. In previous years he had
talked of his DDE patches allowing linux 2.6 hardware drivers on Hurd
and this year he was using the NetBSD Rump kernel under Hurd to add
sound support with USB support promised. His demo was RMS singing his
song on his Hurd laptop. The irony was he needed to use BSD code on a
GNU/BSD/Hurd system to do it! There had been some work on X86-64 Hurd
but it wasn't there yet since he needed more help from the community.
I then saw some lightning talks (actually 20 mins long) including a
good one on C refactoring.
The Perl dinner on Saturday night featured the usual good food and
conversation and the devroom was on Sunday. Ovid spoke about Perl 6
and its advantages (such as being able to perform maths on floats
correctly). I had a python guy sitting next to me who admitted he had
never been to a Perl talk before so that was a success in reaching
someone new. Will Braswell spoke next about his "Rperl" compiler
which translated his own quite restricted subset (no regexps yet and
no $_) of Perl 5 line by line into C++ in order to run some of the
language shootups benchmarks (a graphical animation of planetary
motion) at increased speed. I'd not seen Will before and he was an
excellent speaker who left me more impressed than I'd expected and I
hope he gets to YAPC::EU in the summer. I saw some non-Perl stuff
next for variety including a good one on the Go debugger Delve which
was aware of the go concurrency and could be used as a basic REPL. I
returned to Perl to see Bart explain some surprisingly simple X86-64
assembly language to do addition and ROT13 which he interfaced with
Perl 6 using NativeCall (although it struck me that the
CPAN P5NCI module on Perl 5 would have also worked).
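For reference, the shape of such a NativeCall binding is roughly this (my own
minimal sketch, assuming a POSIX system where getpid(2) is provided by the
already-loaded C library):

use NativeCall;
# passing the Str type object means "search the symbols already loaded into
# the process", which includes the C library
sub getpid() returns int32 is native(Str) {*}
say "this Perl 6 process has pid ", getpid();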
Again an excellent talk and a good start to a
run of some of the best Perl talks I'd ever seen. Stevan Little's talk
was one of his most amusing ever and Perl wasn't really dead.
Sawyer also did an excellent promotion of Perl 5 targeted at the
people who maybe hadn't used it since the early 2000s explaining what
had changed. Liz finished with her autobiographical account of Perl
development and some nice short Perl 6 examples. We all ate again in
the evening together my only regrets being I'd missed the odd talk or
two (which I should be able to watch on video).
MetaCPAN, like the rest of "CPAN", was built assuming the sole context of Perl5. Which is cool until we want to use it for Perl6 and avoid the troubles associated with different namespaces, dist mgmt, etc... To largely avoid and more easily handle these issues for MetaCPAN it's been suggested that we have separate instances. The existing Perl5 instance only needs to be changed to ignore Perl6 distributions. There has already been some breakage because it didn't ignore a Perl6 dist of mine which exists in the Perl5 world:( And the new Perl6 instance will do just the opposite and only look at Perl6 distributions.
In contrast, and relatedly, on CPAN we've designated a special spot for Perl6 distributions in order to keep them separate from the Perl5 dists. This reserved place is a Perl6 subdir in an author's dir (/author/id/*/*/*/Perl6/). Any dists in or under that spot on the fs will be considered a Perl6 dist; valid or invalid. So this is where the Perl6 MetaCPAN will look and the Perl5 instance will not.
Current development is being done on these temporary branches:
And the main dev instance is running on hack.p6c.org. The web end is at http://hack.p6c.org:5001 and the api is at http://hack.p6c.org:5000.
So far the idea has been to iterate on the aforementioned branches and instance until we have something that works sufficiently well. At that point we'll tidy up the branches and submit them for merging. Shortly after that time the hope is that we'll be able to stand up the official Perl6 instance.
The list of requirements for being adequately cooked is:
All of these have been hacked in and are at various degrees of completeness. Next up is testing and fixing bugs until nothing major is left. To that end I've recently loaded up the dev instance with all the distributions from modules.perl6.org. The dist files were generated, very hackily, with https://github.com/jdv/cpan-api/blob/master/test_p6_eco_to_p6_cpan.pl. I also just loaded them all under one user, mine, for simplicity. That load looks like it has problems of its own as well as revealing a bunch of issues. So in the coming days I hope to get that all sorted out.
In the Perl5 world, just in case anyone is unaware, CPAN is a major factor. It's basically the hub of the Perl5 world.
What I am referring to here as CPAN is not just the mirrored collection of 32K+ distributions. It's the ecosystem that's built up around that collection. This ecosystem has many parts, some more important than others depending on who you talk to, but the most important parts to me are:
These are the 5 aspects of "CPAN" that I'd like to see happen for Perl6. One way to get that would be to write the whole thing from scratch in Perl6. While it may sound cool in some sort of dogfoody and/or bootstrappy kind of way to some, it sounds like a lot of work to me and we're a bit strapped for developer resources. Another way would be to add support for Perl6 to the existing CPAN bits. The hope there being, primarily, that it'd be a lot less work. The latter approach is what I've been working on lately. And if we want to refactor ourselves off the Perl5 bits in the future we can take our time doing it; later.
At this time we have:

So we can publish Perl6 distributions to CPAN and search that collection. Well, sort of on that last bit. The metacpan prototype instance is not currently tracking CPAN. It's actually been loaded up with Perl6 distributions from the Perl6 module ecosystem (modules.perl6.org) for testing. But hopefully soon we'll have an official Perl6 metacpan instance, separate from the Perl5 instance, that will track CPAN's Perl6 content as it should.
What we need next is:

If anyone is interested in working on any of this stuff please stop by #perl6 on freenode. If nobody else is able to help you I'll (jdv79) do my best.
Thanks to those on Freenode IRC/perl6 for help.
Further corrections and expansions welcome, either on IRC or via pull request to https://github.com/stmuk/glr-html
Each section below shows the pre-GLR behaviour first, then the GLR behaviour.

PARCEL IS NOW LIST

pre GLR:
> say (1,2,3).WHAT
(Parcel)

GLR:
> say (1,2,3).WHAT
(List)

LACK OF IMPLICIT LIST FLATTENING

pre GLR:
> my @array = 1,(2,3),4
1 2 3 4
> @array.elems
4

GLR:
> my @array = 1,(2,3),4
[1 (2 3) 4]
> @array.elems
3

to flatten:
> my @list := 1, [2, 3], 4
(1 [2 3] 4)
> dd @list.flat.list
(1, 2, 3, 4)

or:
> my @array = (1,(2,3),4).flat
[1 2 3 4]

or, for more complex structures (jnthn++):
say gather [[[[["a", "b"], "c"], "a"], "d"], "e"].deepmap(*.take)

.lol METHOD REMOVED

pre GLR:
> dd (1,2,3).lol
(1; 2; 3)

SINGLE ARG RULE

pre GLR:
> dd (1,)
(1,)
> dd [1,]
$ = [1]
> dd [[1,]]
$ = [[1]]

GLR:
> dd (1,)
(1)
> dd [1,]
[1]
> dd [[1],]
[[1],]

LIST NOW IMMUTABLE

pre GLR:
> my @array = 1,2,3
1 2 3
> @array.shift
1
> dd @array
@array = [2, 3]<>

GLR:
> my @list := 1,2,3
(1 2 3)
> @list.shift
Method 'shift' not found for invocant of class 'List'
> @list[0]
1
> dd @list
(1, 2, 3)

ARRAY IS MUTABLE AND A SUBCLASS OF LIST

GLR:
> my @array = 1,2,3
[1 2 3]
> @array[0]=0
0
> dd @array
@array = [0, 2, 3]
> say (Array).^mro
((Array) (List) (Cool) (Any) (Mu))

SLIP SUBCLASS OF LIST

GLR:
> my @a = 1, (2, 3).Slip, 4
[1 2 3 4]
> my $slip = slip(2,3)
(2 3)
> dd $slip
Slip $slip = $(2, 3)
> my @array = 1,$slip,4
[1 2 3 4]
> (1,$(2,3),4)
(1 (2 3) 4)
> (1,|(2,3),4)
(1 2 3 4)

SEQUENCE

GLR:
> my $grep = (1..4).grep(*>2); dd $grep>>.Int;
(3, 4)
> dd $grep>>.Int;
This Seq has already been iterated, and its values consumed in block

to prevent consumption:
> my $grep = (1..4).grep(*>2); my $cache=$grep.cache
(3 4)
> say $cache>>.Int
(3 4)
> say $cache>>.Int
(3 4)

> my @array = 1,(2,3),4
[1 (2 3) 4]
> dd @array.flat
(1, $(2, 3), 4).Seq
> dd @array.flat.list
(1, $(2, 3), 4)
So we are anticipating a long rollout cycle for PHP 6, and we did not want to take the same route that the Perl project did, with project contributors still working on Perl 6 I think six years later. People make fun of Microsoft, but take a look at Perl 6. . . .
Sure, PHP 6 may have a shorter release cycle than Perl 6 has, but at the end of it all, we'll have Perl 6, and you'll still have PHP.

So how did those predictions work out? Well, after a little over six years of development, we discovered that we were never going to see a PHP 6 at all. Having seen how long Perl 6 had taken, and how long PHP 6 was taking, people came to associate the number 6 with failure. So they cancelled PHP 6 and voted to change the name to PHP 7. Problem solved! No, really, this is some of the actual reasoning given by people on the 6 to 7 RFC. (Someone should tell the ES6 folks before the curse strikes our browsers!)
Just sayin'.
xoxo,
Andy
print removed in favor of function print(), ostensibly to make a consistent API but really just to mess with people.

At FOSDEM 2015, Larry announced that there will likely be a Perl 6 release candidate in 2015, possibly around the September timeframe. What we’re aiming for is concurrent publication of a language specification that has been implemented and tested in at least one usable compilation environment — i.e., Rakudo Perl 6.
So, for the rest of 2015, we can expect the Rakudo development team to be highly focused on doing only those things needed to prepare for the Perl 6 release later in the year. And, from previous planning and discussion, we know that there are three major areas that need work prior to release: the Great List Refactor (GLR), Native Shaped Arrays (NSA), and Normalization Form Grapheme (NFG).
…which brings us to Parrot. Each of the above items is made significantly more complicated by Rakudo’s ongoing support for Parrot, either because Parrot lacks key features needed for implementation (NSA, NFG) or because a lot of special-case code is being used to maintain adequate performance (lists and GLR).
At present most of the current userbase has switched over to MoarVM as the backend, for a multitude of reasons. And more importantly, there currently aren’t any Rakudo or NQP developers on hand that are eager to tackle these problems for Parrot.
In order to better focus our limited resources on the tasks needed for a Perl 6 language release later in the year, we’re expecting to suspend Rakudo’s support for the Parrot backend sometime shortly after the 2015.02 release.
Unfortunately the changes that need to be made, especially for the GLR, make it impractical to simply leave existing Parrot support in place and have it continue to work at a “degraded” level. Many of the underlying assumptions will be changing. It will instead be more effective to (re)build the new systems without Parrot support and then re-establish Parrot as if it is a new backend VM for Rakudo, following the techniques that were used to create JVM, MoarVM, and other backends for Rakudo.
NQP will continue to support Parrot as before; none of the Rakudo refactorings require any changes to NQP.
If there are people that want to work on refactoring Rakudo’s support for Parrot so that it’s more consistent with the other VMs, we can certainly point them in the right direction. For the GLR this will mainly consist of migrating Parrot-specific code from Rakudo into NQP’s APIs. For the NSA and NFG work, it will involve developing a lot of new code and feature capabilities that Parrot doesn’t possess.
This past weekend I attended the 2014 Austrian Perl Workshop and Hackathon in Salzburg, which turned out to be an excellent way for me to catch up on recent changes to Perl 6 and Rakudo. I also wanted to participate directly in discussions about the Great List Refactor, which has been a longstanding topic in Rakudo development.
What exactly is the “Great List Refactor” (GLR)? For several years Rakudo developers and users have identified a number of problems with the existing implementation of list types — most notably performance. But we’ve also observed the need for user-facing changes in the design, especially in generating and flattening lists. So the term GLR now encompasses all of the list-related changes that seem to want to be made.
It’s a significant (“great”) refactor because our past experience has shown that small changes in the list implementation often have far-reaching effects. Almost any bit of rework of list fundamentals requires a fairly significant refactor throughout much of the codebase. This is because lists are so fundamental to how Perl 6 works internally, just like the object model. So, as the number of things that are desirable to fix or change has grown, so has the estimated size of the GLR effort, and the need to try to achieve it “all at once” rather than piecemeal.
The pressure to make progress on the GLR has been steadily increasing, and APW2014 was significant in that a lot of the key people needed for that would be in the same location. Everyone I’ve talked to agrees that APW2014 was a smashing success, and I believe that we’ve now resolved most of the remaining GLR design issues. The rest of this post will describe that.
This is an appropriate moment to recognize and thank the people behind the APW effort. The organizers did a great job. The Techno-Z and ncm.at venues were fantastic locations for our meetings and discussions, and I especially thank ncm.at, Techno-Z, yesterdigital, and vienna.pm for their generous support in providing venues and food at the event.
So, here’s my summary of GLR issues where we were able to reach significant progress and consensus.
(Be sure to visit our gift shop!)
Much of the GLR discussion at APW2014 concerned flattening list context in Perl 6. Over the past few months and years Perl 6 has slowly but steadily reduced the number of functions and operators that flatten by default. In fact, a very recent (and profound) change occurred within the last couple of months, when the .[] subscript operator for Parcels switched from flattening to non-flattening. To illustrate the difference, the expression

(10,(11,12,13),(14,15)).[2]

previously would flatten out the elements to return 12, but now no longer flattens and produces (14,15). As a related consequence, .elems no longer flattens either, changing from 6 to 3.
Unfortunately, this change created an inconsistency between Parcels and Lists, because .[] and .elems on Lists continued to flatten. Since programmers often don’t know (or care) when they’re working with a Parcel or a List, the inconsistency was becoming a significant pain point. Other inconsistencies were increasing as well: some methods like .sort, .pick, and .roll have become non-flattening, while other methods like .map, .grep, and .max continue to flatten. There’s been no really good guideline to know or decide which should do which.
Flattening behavior is great when you want it, which is a lot of the time. After all, that’s what Perl 5 does, and it’s a pretty popular language. But once a list is flattened it’s hard to get the original structure if you wanted that — flattening discards information.
So, after many animated discussions, review of lots of code snippets, and seeking some level of consistency, the consensus on Perl 6 flattening behavior seems to be:
- The [ ] array constructor is unchanged; it continues to flatten its input elements. (Arrays are naturally flat.)
- Iteration flattens: for @a,@b { ... } flattens @a,@b and applies the block to each element of @a followed by each element of @b. Note that flattening can easily be suppressed by itemization, thus for @a, $@b { ... } flattens @a but does all of @b in a single iteration.
- For methods such as .map, .grep, and .first, the programmer will have to use .flat.grep and .flat.first to flatten the list invocant. Notably, .map will no longer flatten its invocant — a significant change — but we’re introducing .for as a shortcut for .flat.map to preserve a direct isomorphism with the for statement.

There’s ongoing conjecture of creating an operator or syntax for flattening, likely a postfix of some sort, so that something like .|grep would be a convenient alternative to .flat.grep, but it doesn’t appear that decision needs to be made as part of the GLR itself.

((1,2), 3, (4,5)).map({...})   # iterates over three elements
map {...}, ((1,2),3,(4,5))     # iterates over five elements
(@a, @b, @c).pick(1)           # picks one of three arrays
pick 1, @a, @b, @c             # flatten arrays and pick one element
As a result of improvements in flattening consistency and behavior, it appears that we can eliminate the Parcel type altogether. There was almost unanimous agreement and enthusiasm at this notion, as having both the Parcel and List types is quite confusing.
Parcel was originally conceived for Perl 6 as a “hidden type” that programmers would rarely encounter, but it didn’t work out that way in practice. It’s nice that we may be able to hide it again — by eliminating it altogether.
Thus infix:<,> will now create Lists directly. It’s likely that comma-Lists will be immutable, at least in the initial implementation. Later we may relax that restriction, although immutability also provides some optimization benefits, and Jonathan points out that it may help to implement fixed-size Arrays.
Speaking of optimization, eliminating Parcel may be a big boost to performance, since Rakudo currently does a fair bit of converting Parcels to Lists and vice-versa, much of which goes away if everything is a List.
During a dinner discussion Jonathan reminded me that Synopsis 4 has all of the looping constructs as list generators, but Rakudo really only implements for at the moment. He also pointed out that if the loop generators are implemented, many functions that currently use gather/take could potentially use a loop instead, and this could be much more performant. After thinking on it a bit, I think Jonathan is on to something. For example, the code for IO::Handle.lines() currently does something like:

gather { until not $!PIO.eof { $!ins = $!ins + 1; take self.get; } }
With a lazy while generator, it could be written as

(while not $!PIO.eof { $!ins++; self.get });

This is lazily processed, but doesn’t involve any of the exception or continuation handling that gather/take requires. And since while might choose to not be strictly lazy, but lines() definitely should be, we may also use the lazy statement prefix:

lazy while not $!PIO.eof { $!ins++; self.get };

The lazy prefix tells the list returned from the while that it’s to generate as lazily as it possibly can, only returning the minimum number of elements needed to satisfy each request.
So as part of the GLR, we’ll implement the lazy list forms of all of the looping constructs (for, while, until, repeat, loop). In the process I also plan to unify them under a single LoopIter type, which can avoid repetition and be heavily optimized.

This new loop iterator pattern should also make it possible to improve performance of for statements when performed in sink context. Currently for statements always generate calls to .map, passing the body of the loop as a closure. But in sink context the block of a for statement could potentially be inlined. This is the way blocks in most other loops are currently generated. Inlining the block of the body could greatly increase performance of for loops in sink context (which are quite common).
Many people are aware of the problem that constructs such as for and map aren’t “consuming” their input during processing. In other words, if you’re doing .map on a temporary list containing a million elements, the entire list stays around until all have been processed, which could eat up a lot of memory.
Naive solutions to this problem just don’t work — they carry lots of nasty side effects related to binding that led us to design immutable Iterators. We reviewed a few of them at the hackathon, and came back to the immutable Iterator we have now as the correct one. Part of the problem is that the current implementation is a little “leaky”, so that references to temporary objects hang around longer than we’d like and these keep the “processed” elements alive. The new implementation will plug some of the leaks, and then some judicious management of temporaries ought to take care of the rest.
In the past year much work has been done to improve sink context in Rakudo, but I’ve never felt the implementation we have now is what we really want. For one, the current approach bloats the codegen by adding a call to .sink after every sink-context statement (i.e., most of them). Also, this only handles sink for the object returned by a Routine — the Routine itself has no way of knowing it’s being called in sink context such that it could optimize what it produces (and not bother to calculate or return a result).

We’d really like each Routine to know when it’s being called in sink context. Perl 5 folks will instantly say “Hey, that’s wantarray!”, which we long ago determined isn’t generally feasible in Perl 6.

However, although a generalized wantarray is still out of reach, we can provide it for the limited case of detecting sink contexts that we’re generating now, since those are all statically determined. This means a Routine can check if it’s been called in sink context, and use that to select a different codepath or result. Jonathan speculates that the mechanism will be a flag in the callsite, and I further speculate the Routine will have a macro-like keyword to check that flag.

Even with detecting context, we still want any objects returned by a Routine to have .sink invoked on them. Instead of generating code for this after each sink-level statement, we can do it as part of the general return handler for Routines; a Routine in sink context invokes .sink on the object it would’ve otherwise returned to the caller. This directly leads to other potential optimizations: we can avoid .sink on some objects altogether by checking their type, and the return handler probably doesn’t need to do any decontainerizing on the return value.
As happy as I am to have discovered this way to pass sink context down into Routines, please don’t take this as opening an easy path to lots of other wantarray-like capabilities in Perl 6. There may be others, and we can look for them, but I believe sink context’s static nature (as well as the fact that a false negative generally isn’t harmful) makes it quite a special case.
One area that has always been ambiguous in the Synopses is determining when various contextualizing methods must return a copy or are allowed to return self. For example, if I invoke .values on a List object, can I just return self, or must I return a clone that can be modified without affecting the original? What about .list and .flat on an already-flattened list?

The ultra-safe answer here is probably to always return a copy… but that can leave us with a lot of (intermediate) copies being made and lying around. Always returning self leads to unwanted action-at-a-distance bugs.

After discussion with Larry and Jonathan, I’ve decided that true contextualizers like .list and .flat are allowed to return self, but other methods are generally obligated to return an independent object. This seems to work well for all of the methods I’ve considered thus far, and may be a general pattern that extends to contextualizers outside of the GLR.
(small matter of programming and documentation)
The synopses — especially Synopsis 7 — have always been problematic in describing how lists work in Perl 6. The details given for lists have often been conjectural ideas that quickly prove to epic fail in practice. The last major list implementation was done in Summer 2010, and Synopsis 7 was supposed to be updated to reflect this design. However, the ongoing inconsistencies (that have led to the GLR) really precluded any meaningful update to the synopses.
With the progress recently made at APW2014, I’m really comfortable about where the Great List Refactor is leading us. It won’t be a trivial effort; there will be a significant rewrite and refactor of the current Rakudo codebase, most of which will have to be done in a branch. And of course we’ll have to do a lot of testing, not only of the Perl 6 test suite but also of the impact on the module ecosystem. But now that most of the hard decisions have been made, we have a roadmap that I hope will enable most of the GLR to be complete and documented in the synopses by Thanksgiving 2014.
Stay tuned.
"I'm just happy that the two of you liked my work." -- vanstynAlthough he was talking about DBIx, I think that captures the spirit of conference as a whole. All of us here -- from the n00bs to the pumpkings -- want to share our work and make something useful for others. It's not an organization where we wait for pronouncements from on high, but one where users create endless variations and share them. Not an organization so much as a family.
"We have faith, hope, and love, but the most awesome of these is love." -- Larry WallA line like this might seem a bit hokey out of context, but it was actually moving when I heard it. We have faith that we can use Perl to solve our problems. We have hope that Perl 5 and 6 will continue to get better. And we love Perl, unconditionally, despite all of her flaws. And as Wil Wheaton says about us geeks, we just want to love our special thing the best we can, and go the extra mile to share it with others.
[This is a response to the Russian Perl Podcast transcribed by Peter Rabbitson and discussed at blogs.perl.org.]
I found this translation and podcast to be interesting and useful, thanks to all who put it together.
Since there seems to have been some disappointment that Perl 6 developers didn’t join in the discussions about “Perl 7” earlier this year, and in the podcast I’m specifically mentioned by name, I thought I’d go ahead and comment now and try to improve the record a bit.
While I can’t speak for the other Perl 6 developers, in my case I didn’t contribute to the discussion because nearly all the things I would’ve said were already being said better by others such as Larry, rjbs, mst, chromatic, etc. I think a “Perl 7” rebrand is the wrong approach, for exactly the reasons they give.
A couple of statements in the podcast refer to “hurting the feelings of Perl 6 developers” as being a problem resulting from a rebrand to Perl 7. I greatly appreciate that people are concerned with the possible impact of a Perl 5 rebrand on Perl 6 developers and our progress. I believe that Perl 6’s success or failure at this point will have little to do with the fact that “6 is larger than 5”. I don’t find the basic notion of “Perl 7” offensive or directly threatening to Perl 6.
But I fully agree with mst that “you can’t … have two successive numbers in two brands and not expect people to be confused.” We already have problems explaining “5” and “6” — adding more small integers to the explanation would just make an existing problem even worse, and wouldn’t do anything to address the fundamental problems Perl 6 was intended to resolve.
Since respected voices in the community were already saying the things I thought about the name “Perl 7”, I felt that adding my voice to that chorus could only be more distracting than helpful to the discussion. My involvement would inject speculations on the motivations of Perl 6 developers into what is properly a discussion about how to promote progress with Perl 5. I suspect that other Perl 6 developers independently arrived at similar conclusions and kept silent as well (Larry being a notable exception).
I’d also like to remark on a couple of @sharifulin’s comments in the podcast (acknowledging that the transcribed comments may be imprecise in the translation from Russian):
First, I’m absolutely not the “sole developer” of Perl 6 (13:23 in the podcast), or even the sole developer of Rakudo Perl 6. Frankly I think it’s hugely disrespectful to so flippantly ignore the contributions of others in the Perl 6 development community. Let’s put some actual facts into this discussion… in the past twelve months there have been over 6,500 commits from over 70 committers to the various Perl 6 related repositories (excluding module repositories); less than 4% (218) of those commits are from me. Take a look at the author lists from the Perl 6 commit logs and you may be a little surprised at some of the people you find listed there.
Second, there is not any sense in which I think that clicking “Like” on a Facebook posting could be considered “admitting defeat” (13:39 in the podcast). For one, my “Like” was actually liking rjbs’ reply to mst’s proposal, as correctly noted in the footnotes (thanks Peter!).
But more importantly, I just don’t believe that Perl 5 and Perl 6 are in a battle that requires there to be a conquerer, a vanquished, or an admission of defeat.
Pm
$foo->WHAT can tell you if you have a Str, Int, or IO::Handle.

$path =~ s/^([a-z]:)/\l$1/s;

(//server/share) that OS2.pm had only half-implemented. And so a huge block of code cruft bit the dust.

sub _tmpdir {
    my $self = shift;
    my @dirlist = @_;
    my $tmpdir;
    foreach (@dirlist) {
        next unless defined && -d && -w _;
        $tmpdir = $_;
        last;
    }
    return $self->canonpath($tmpdir);
}
$_, @_, and shift.

method !tmpdir( *@dirlist ) {
    my $tmpdir = first { .defined && .IO.w && .IO.d }, @dirlist;
    return self.canonpath($tmpdir);
}
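As an aside, the same first-match idiom works on any list of candidate paths; here is a standalone snippet, with purely hypothetical candidates:

my @candidates = 'C:/no-such-dir', '/tmp', '/var/tmp';
my $tmp = @candidates.first({ .defined && .IO.d && .IO.w });
say $tmp;   # likely '/tmp' on a typical Unix system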
The Perl 6 version reads as "set $tmpdir to the first defined writable directory in @dirlist." Less, easier-to-read code is easier to maintain.

Changing if( $foo ) to if $foo, etc.

You'll need git, and make -- enough to commit to repositories and build a software package, anyway.

git clone it onto your own machine; that will give you a rakudo directory. There are a few setup things that you'll want to do. First of all, go ahead and build Rakudo, using the normal steps:

perl ./Configure.pl --gen-parrot
make
make install
Next, you'll want to add the newly built binaries to your $PATH environment variable. Which, if you don't know how to do it -- well, here's Google. In particular, you'll need to add the full path to the rakudo/install/bin directory.

You may also want to run make spectest, at least long enough for it to download the test suite into t/spec before hitting ^C. You will need these tests later to make sure you didn't break anything.

git remote add upstream git://github.com/rakudo/rakudo.git
git clone git://github.com/tadzik/panda.git
cd panda
perl6 bootstrap.pl
git checkout -b mynewbranchname
The Rakudo source code lives in the rakudo/src folder, so this is where you'll want to edit the contents.

The vm directory contains files specific to the virtual machines Rakudo runs under. At the time of this writing, there's only one thing in there, parrot, but very soon there will also be a jvm directory. Exciting! Most of the purpose of this code is to map functions to lower-level operations, in either Parrot or Java.

The Perl6 directory contains the grammar and actions used to build the language, as well as the object metamodel. The contents of this folder are written in NQP, or Not Quite Perl. This section determines how the language is parsed.

The core directory contains the files that will be built into the core setting. You'll find classes or subroutines in here for just about everything in Perl: strings, operators like eq, filehandles, sets, and more. Individual files look similar to modules, but these are "modules" that are available to every Perl 6 program.

The gen directory contains files that are created in the compilation process. The core setting lives here, creatively named CORE.setting. And if you look at it, it's just a concatenation of the files in core, put together in the order specified in rakudo/tools/build/Makefile.in. While these files can and do get overwritten in the build process, it's often a good idea to keep a copy of CORE.setting open so you can find what you're looking for faster -- and then go edit it in core.
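To get a feel for what lives in core, here is a hypothetical, much-simplified routine written in the proto/multi style you'll encounter in the core setting files (this is not actual Rakudo source, just an illustration):

proto sub greet(|) {*}
multi sub greet(Str:D $name) { "Hello, $name!" }
multi sub greet()            { greet('world') }

say greet();          # Hello, world!
say greet('Perl 6');  # Hello, Perl 6!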
Commit early and often, so that you can use git bisect for problems later. And push your edits to GitHub as a free backup. If you get stuck, drop by #perl6 on irc.freenode.net and ask questions.

git fetch upstream
git merge upstream/nom
perl Configure.pl
make
make spectest
#?pugs 1 skip 'reason'
#?niecza 1 skip 'reason'
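These fudge markers tell the test harness that a particular implementation should skip (or otherwise special-case) the tests that follow. As a purely hypothetical illustration (the method name is made up, and the usual use Test; preamble of a spec test file is assumed):

#?rakudo 1 skip 'frobnicate not yet implemented'
is 'abc'.frobnicate, 'cba', 'frobnicate reverses a simple string';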
The list of spec test files that actually get run lives in rakudo/t/spectest.data. If your code fixes broken tests, then you'll want to *unfudge* by removing the #?rakudo skip lines above the relevant tests.

perl6 panda/rebootstrap.pl
git commit; git push
will add it to the ticket. If there aren't any problems, someone will just merge it in a couple days.

At YAPC::NA 2012 in Madison, WI I gave a lightning talk about basic improvements in Rakudo’s performance over the past couple of years. Earlier today the video of the lightning talks session appeared on YouTube; I’ve clipped out my talk from the session into a separate video below. Enjoy!
A couple of weeks ago I entered the Dallas Personal Robotics Group Roborama 2012a competition, and managed to come away with first place in the RoboColumbus event and Line Following event (Senior Level). For my robot I used one of the LEGO Mindstorms sets that we’ve been acquiring for use by our First Lego League team, along with various 3rd party sensors.
The goal of the RoboColumbus event was to build a robot that could navigate from a starting point to an ending point placed as far apart as possible; robots are scored on distance to the target when the robot stops. If multiple robots touch the finish marker (i.e., distance zero), then the time needed to complete the course determines the rankings. This year’s event was in a long hall with the target marked by an orange traffic cone.
Contestants are allowed to make minor modifications to the course to aid navigation, so I equipped my robot with a HiTechnic IRSeeker sensor and put an infrared (IR) electronic ball on top of the traffic cone. The IRSeeker sensor reports the relative direction to the ball (in multiples of 30 degrees), so the robot simply traveled forward until the sensor picked up the IR signal, then used the IR to home in on the traffic cone. You can see the results of the winning run in the video below, especially around the 0:33 mark when the robot makes its first significant IR correction:
http://youtu.be/x1GvpYAArfY
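The homing behaviour described above boils down to a very small control loop. Here is a rough Raku-flavoured sketch, with stub routines standing in for the actual NXT sensor and motor calls (all of the names below are hypothetical):

sub read-ir-direction() { (-2..2).pick }   # stub: relative direction to the IR ball, 0 = no signal
sub at-target()         { Bool.pick }      # stub: pretend we have a touch/distance check
sub drive(*%how)        { say %how }       # stub: just report what the robot would do

loop {
    my $dir = read-ir-direction();
    if $dir == 0 { drive(:forward)    }    # no IR signal yet: keep going straight
    else         { drive(:turn($dir)) }    # steer toward the IR ball on the cone
    last if at-target();
}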
My first two runs of RoboColumbus didn’t do nearly as well; the robot kept curving to the right for a variety of reasons, and so it never got a lock on the IR ball. Some quick program changes at the contest and adjustments to the starting direction finally made for the winning run.
For the Line Following contest, the course consisted of white vinyl tiles with electrical tape in various patterns, including line gaps and sharp angles. I used a LineLeader sensor from mindsensors.com for basic line following, with some heuristics for handling the gap conditions. The robot performed fine on my test tiles at home, but had difficulty with the “gap S curve” tiles used at the contest. However, my robot was the only one that successfully navigated the right angle turns, so I still ended up with first place.
Matthew and Anthony from our FLL robotics team also won other events in the contest, and there are more videos and photos available. The contest was a huge amount of fun and I’m already working on new robot designs for the next competition.
Many thanks to DPRG and the contest sponsors for putting on a great competition!