Adjective Noun

Page 2

Using matplotlib in Perl 6 (part 4)

2017-03-08 00:00, Tags: perl python matplotlib

This is Part 4 in a series. You can start at the Intro here.

The last post of this series was kind of more about replicating numpy's linspace function in Perl 6 than it was about testing the limits of this matplotlib wrapper. In this part I am going to hit one of those limits and encounter a minor short-coming of the wrapper as it currently exists.

If you're following along, I had decided to take a shot at one of the histograms and jumped into the first one named histogram_demo_features. I took a glance at the Python code and froze...

import numpy as np
import matplotlib.mlab as mlab
import matplotlib.pyplot as plt

Straight off the bat I'm in trouble: Queue dramatic music. The Matplotlib wrapper I'd been using was really a wrapper for matplotlib's sub-module pyplot. My wrapper gave me no access to other sub-modules like mlab. I was going to have to modify the wrapper.

In Python, importing a package doesn't necessarily give me access to the sub-packages, eg. If I import the matplotlib base package, I can't use matplotlib.pyplot or matplotlib.mlab... but! If I import only matplotlib.pyplot, I can now also use matplotlib.mlab. Maybe (probably) I just don't understand Python packaging.

Stefan Seifert, the guy who created the Inline::Python module is a clever guy. Me, I'm not so clever, so I just kinda hacked away until things worked, and this is what I landed on. I imported matplotlib at the package level (which will run when the wrapped modules is used) and defined pyplot and mlab as their own class.

use Inline::Python;
my $py = Inline::Python.new();
$py.run('import matplotlib.pyplot');

class Matplotlib::Mlab {
    method FALLBACK($name, |c) {
        $py.call('matplotlib.mlab', $name, |c);
    }
}

class Matplotlib::Plot {
    method FALLBACK($name, |c) {
        $py.call('matplotlib.pyplot', $name, |c);
    }
}

class Matplotlib {
    method FALLBACK($name, |c) {
        $py.call('matplotlib', $name, |c);
    }
}

Oh, I also dropped the py from pyplot in my module because... reasons. I've also got a class for the top-level matplotlib package. I'm not sure if there's methods in there I will need to call, but it doesn't hurt to be prepared.

I don't even know if jumped this hurdle, or just kinda kicked it over and stumbled ahead. Let me know if you have a more sane way to I could have done this. In any case, what mattered to me most at the time was that it worked and I could move on to playing with plots. To use this fancy new wrapper in all my previous examples, all I need to do is change the class instantiation from this

my $plt = Matplotlib.new;

... to this

my $plt = Matplotlib::Plot;

Which actually maps closer the Python code, anyways. With that out of the way I can move on the next few lines of code.

np.random.seed(0)

# example data
mu = 100  # mean of distribution
sigma = 15  # standard deviation of distribution
x = mu + sigma * np.random.randn(437)

I can guess what random.seed does. Pseudo-random number generators (or PRNG's) use an algorithm to compute a random number; this is the "pseudo" part of pseudo-random. Provided you start the algorithm at the same number (the seed) each time, the result is always the same. How the seed is obtained normally (and how the random numbers are generated) differs between operating systems and programming languages. The seed function in Perl is called srand, so that part's easy.

Then we come to randn. A quick search led me to this StackOverflow post where I learned that it creates a "normal distribution." That link jumps to one of the replies, which is from an actual statistician! This helpful human explains that a normal distribution is "a distribution where the values are more likely to occur near the mean value". So, think bell curve.

I'm not a stats guy. Heck, I'm not even a maths guy... So I headed to RosettaCode to grab a normal distribution function in Perl 6. I modified it slightly (hopefully without breaking it) so that behaves like a very simple clone of numpy.random.randn, and like numpy, stuck it in it's own sub-module to the Numpl module I created in Part 3.

class Numpl::Random {
    method randn($n) {
        sqrt( -2 × log(rand) ) × cos( τ × rand ) xx $n;
    }
}
class Numpl {
    # linspace stuff ...

    method random {
        Numpl::Random.new();
    }
}

Which means I could now do this

my $np = Numpl.new;
my $x = $np.random.randn(437)

The next few lines are pretty straight-forward, so moving now to mlab.normpdf. Ok, so the comment there tells me that this thing adds a line of "best fit", but what the heck does it have to do with PDF? Being curious, I did a search and found out it stands for 'Probability Density Function'. With my curiosity quenched, I converted the rest of the code to Perl without much fanfare.

use Numpl;
use Matplotlib;

my $np   = Numpl.new;
my $plt  = Matplotlib::Plot.new;
my $mlab = Matplotlib::Mlab.new;

srand(0);

# example data
my $mu = 100;    # mean of distribution
my $sigma = 15;  # standard deviation of distribution
my $x = $np.random.randn(437).map( * × $sigma + $mu );

my $num_bins = 50;

my ( $fig, $ax ) = $plt.subplots();

# the histogram of the data
my ( $n, $bins, $patches ) = $ax.hist( $x, $num_bins, :normed(1) );

# add a 'best fit' line
my $y = $bins.map(-> $value {
    $mlab.normpdf( $value, $mu, $sigma )
});
$ax.plot($bins, $y, '--');
$ax.set_xlabel('Smarts');
$ax.set_ylabel('Probability density');
$ax.set_title('Histogram of IQ: $\mu=100$, $\sigma=15$');

# Tweak spacing to prevent clipping of ylabel
$fig.tight_layout();
$plt.show();

So, um, yeah... Not much to say here that hasn't been covered. I'm using a map again on the results of randn and norpdf. The rest is pretty standard translation stuff, and here's the result.

Even though I am seeding the PRNG, Perl will generate random numbers differently than Python, so this doesn't look exactly like the one in the gallery. You can remove the srand to get a different graph each time. The colours, however, are a little... academic. I think next I'll try applying one of the style sheets to a graph.

To be continued...

Using matplotlib in Perl 6 (part 3)

2017-03-06 00:15, Tags: perl python matplotlib

This is Part 3 in a series. You can start at the Intro here.

So far this Perl 6 wrapper class for matplotlib is going well. With the first graph in the gallery under my belt, I moved on to second example, fill_demo.py. Here's the Python code

import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(0, 1, 500)
y = np.sin(4 * np.pi * x) * np.exp(-5 * x)

fig, ax = plt.subplots()

ax.fill(x, y, zorder=10)
ax.grid(True, zorder=5)
plt.show()

Being a rather simple graph, you might think that the Perl code would be fairly straight forward, and you'd be right. The only minor curveball is another numpy function linspace, and that rather hairy looking operation for the y value.

To reiterate what I've said in previous parts, I'm not familiar with numpy at all. Or at least, I wasn't before I started working on these graphs. I inserted a print(x) into the Python code to see what linspace did. In hindsight, the name is obvious; It simply creates a linearly spaced sequence of numbers. Here's a quick demonstration

>>> numpy.linspace(0, 1, 3)
array([ 0. ,  0.5,  1. ])
>>> numpy.linspace(0, 1, 5)
array([ 0.  ,  0.25,  0.5 ,  0.75,  1.  ])
>>> numpy.linspace(0, 1, 6)
array([ 0. ,  0.2,  0.4,  0.6,  0.8,  1. ])

In the case of x in the graph, it's a sequence of 500 numbers evenly spaced from 0 to 1. So how would I do this in Perl. It is a monotonic sequence, so maybe the Sequence operator can help us again. So far we've seen a couple simple Sequences. The most simple sequence defines the start of a number series, and Perl lazily generates the rest.

> 0, 2, 4, 8 ... 1024
(0 2 4 8 16 32 64 128 256 512 1024)

This is nice and all, but I had to define 4 numbers for Perl to understand that sequence. If I dropped the 8, then it would have generated a sequence of even numbers. Instead, I can define a function which calculates how the next value should be generated.

> 0, 2, * × 2 ... 1024
(0 2 4 8 16 32 64 128 256 512 1024)

Here we see our old friends, the Whatever *. This time I'm asking for the Sequence starting 0, 2, then I define a mini-function that takes Whatever the last number in the Sequence was and multiplies is by 2. Sequences are a fascinating part of Perl 6 that could occupy a blog post all their own, but for now, that's enough of a foundation to create a simplified linspace function.

sub linspace( $start, $end, $steps ) {
    my $step = ( $end - $start ) ÷ ( $steps - 1 );
    return $start, * + $step ... $end;
}

linspace( 0, 1, 3 ); # Result: (0, 0.5, 1)
linspace( 0, 1, 5 ); # Result: (0, 0.25, 0.5, 0.75, 1)
linspace( 0, 1, 6 ); # Result: (0, 0.2, 0.4, 0.6, 0.8, 1)

This function does what I need, but it could stand to be a little more robust. I'll make a module to house this function, just in case I need it for future plots. I'll also add type constraints to the parameters, and multiple dispatch functions to handle steps of 0 or 1. I'm calling my library Numpl. Any resemblance to actual libraries is purely coincidental.

class Numpl {

    proto method linspace( Real $start, Real $end, Int $steps ) { * }

    multi method linspace( $start, $end, 0 ) { Empty }

    multi method linspace( $start, $end, 1 ) { $start }

    multi method linspace( $start, $end, Int $steps ) {
        my $step = ( $end - $start ) ÷ ( $steps - 1 );
        return $start, * + $step ... $end
    }
}

So that was a very scenic tour around the Sequence operator, but now that we have a linspace function, the rest is smooth sailing. Getting back to the plot, the final code now looks like this.

use Matplotlib;
use Numpl;

my $plt = Matplotlib.new;

my @x = Numpl.linspace( 0, 1, 500 );
my @y = @x.map(-> $x {
    sin( 4 × π × $x ) × exp( -5 × $x )
});

my ( $fig, $ax ) = $plt.subplots();

$ax.fill( $@x, $@y, :zorder(10) );
$ax.grid( True, :zorder(5) );
$plt.show();

Oh yeah, there was that hairy looking operation for the y values. To refresh your memory, the python looked like this

y = np.sin(4 * np.pi * x) * np.exp(-5 * x)

I can pretty much copy this operation exactly as it appears and put it inside the map. I've put the map operation on it's own line as I think this aids readability.

You might have also noticed I'm using proper @arrays in this one, and there is a valid reason for this. Sequences are lazy, and typically can only be iterated over once. x is iterated over in the map and that would prevent any later iterations. Here I've assigned the Sequence to a fully reified Array, which also means the Sequence is evaluated eagerly at assignment (ie. not lazily). As covered in part 2, Python doesn't like Perl arrays as positional arguments, so I coerce them to scalar values when I pass them.

The other way I could have handled this was to coerce my Sequence to a List on assignment like so.

my $x = Numpl.linspace( 0, 1, 500 ).List;

Then I could have assigned it to a scalar variable and still be free to iterate over it as many times as I like. If you have no need for laziness from your linspace function, you could modify it to return a reified List either coercing the return value as above, or by instructing the Sequence to be eagerly evaluated.

return ( $start, * + $step ... $end ).List
# or
return eager $start, * + $step ... $end

But I like to be able to control the laziness of my linspace result from outside the function.

Whichever way you do it, the end result should look like this

The wrapper module is working out well... a little too well. I wanted to tackle something a little more complex, and scrolling through the matplotlib gallery I saw it. It's high time for a histogram.

To be continued...

Using matplotlib in Perl 6 (Part 2)

2017-03-05 23:00, Tags: perl python matplotlib

This is Part 2 in a series. You can start at the Intro here.

In Part 1 of this series, I managed to use matplotlib in Perl 6 to output a simple graph from the matplotlib tutorial. In this post, I will tackle the first graph in the matplotlib gallery, barh_demo.py.

import matplotlib.pyplot as plt
import numpy as np

plt.rcdefaults()
fig, ax = plt.subplots()

# Example data
people = ('Tom', 'Dick', 'Harry', 'Slim', 'Jim')
y_pos = np.arange(len(people))
performance = 3 + 10 * np.random.rand(len(people))
error = np.random.rand(len(people))

ax.barh(y_pos, performance, xerr=error, align='center',
        color='green', ecolor='black')
ax.set_yticks(y_pos)
ax.set_yticklabels(people)
ax.invert_yaxis()  # labels read top-to-bottom
ax.set_xlabel('Performance')
ax.set_title('How fast do you want to go today?')

plt.show()

Again, numpy is being used, but I figured Perl 6 can probably do whatever is needed natively, so I proceeded to convert the imports to Perl parlance (perlance?) and got started.

use Matplotlib;
my $plt = Matplotlib.new;

$plt.rcdefaults();
my ( $fig, $ax ) = $plt.subplots();

As mentioned in the tail end of Part 1, Python likes scalar values, so from here on in I'm going to stick to using mostly scalar variables in my Perl code. I may also provide alternative methods of doing things if I think they are noteworthy or interesting. Now on with the show!

I need to create the example data... people is just your average list of strings. Perl has always had nice quoting construct to create a list of words, so that ones is as simple as

my $people = < Tom Dick Harry Slim Jim >;

Now what's numpy.arange doing? I knew that len(people) was 5, but I didn't know what arange was doing with that 5. It turns out it simply creates an array of values starting at 0, incrementing by one and up to 5, ie. [0, 1, 2, 3, 4]. As seen in Part 1, I can use the handy Sequence operator for that, but creating lists like this is fairly common which is why Perl 6 also provides a convenient shortcut for it:

my $list = (0 ...^ 5); # Result: (0, 1, 2, 3, 4)
my $list = ^5;         # same thing

Of course, I can use a variable instead of a literal integer. As covered in Part 1, lists used in a numerical context are coerced to their number of elements. These features combined mean that my y_pos assignment looks like this.

my $y_pos = ^$people;

For the next two lists of values, performance and error, numpy is being used to generate some random values. Both values reference len(people), and this equates to the number of random values returned: 5. Perl has a fairly standard rand function that - without arguments - returns a random value between 0 and 1. However, I can use the "list repetition" operator (xx) to create a list of random values. Again referring to Part 1, I know that mathematical operations on a numpy array are equivalent to mapping that operation across each element, so I ended up with this.

my $performance = ( rand xx $people ).map( 10 × * + 3 );
my $xerr = rand xx $people;

In those 3 lines of code, The $people list variable is used in a numerical context so it is evaluated to it's number of elements: 5. You could always be more explicit in your code and say $people.elems if you prefer.

You may have noticed I called my list of error values xerr instead of error and there's a good reason. Perl 6 provides a convenient syntax for passing named arguments (kwargs in Python) when a variable has the same name as the named argument.
Oh, and one more thing. Remember that < word quoting > syntax I used before? I can use that when passing named arguments too. Here's a quick illustration using an example function foo() that takes 1 positional parameter, and 1 named parameter.

sub foo( $positional, :$named ) { ... }

# Calling the function
foo( 'bar', :named('baz') );

# Using the word-quoting construct
foo( 'bar', :named<baz> );

# or if I had a $named variable
my $named = 'baz';
foo( 'bar', :$named );

To some, it may seem odd to have so many ways to pass a named argument (and truthfully there's more) but I hope what comes through here is that the syntax tries to consistent in how it's used... But I digress... Less plodding and more plotting!

With the above out of the way, the rest of the code is pretty straight forward. Here's the final code.

use Matplotlib;
my $plt = Matplotlib.new;

$plt.rcdefaults();
my ( $fig, $ax ) = $plt.subplots();

# Example data
my $people = < Tom Dick Harry Slim Jim >;
my $y_pos = ^$people;
my $performance = ( rand xx $people ).map( 10 × * + 3 );
my $xerr = rand xx $people;

$ax.barh(
    $y_pos, $performance,
    :$xerr, :align<center>, :color<green>, :ecolor<black>
);
$ax.set_yticks($y_pos);
$ax.set_yticklabels($people);
$ax.invert_yaxis(); # labels read top-to-bottom
$ax.set_xlabel('Performance');
$ax.set_title('How fast do you want to go today?');

$plt.show();

This graph is based on random data, so the output will vary slightly with each run, but ultimately I get something along the lines of this when I run it.

Well that was mostly painless. The wrapper module is working as expected, and I was able to emulate a basic function of numpy with relative ease. I glanced back to the the Matplotlib gallery... Next in line was the fill_demo. Let's go!

Using matplotlib in Perl 6 (part 1)

2017-03-04 20:30, Tags: perl python matplotlib

This is Part of a series. You can start at the Intro here

Looking at the graphs in the Matplotlib tutorial, the first one is a straight line (boring). The second is a few red dots plotting a curve (getting better). The 3rd graph looked a more interesting: Three curved plots consisting of different patterns and colors. Here's the Python code.

#!/usr/bin/env python

import numpy as np
import matplotlib.pyplot as plt

# evenly sampled time at 200ms intervals
t = np.arange(0., 5., 0.2)

# red dashes, blue squares and green triangles
plt.plot(t, t, 'r--', t, t**2, 'bs', t, t**3, 'g^')
plt.show()

I ran it in python to confirm it worked as expected (it did). I then proceeded to copy and paste the code into a new Perl 6 file and changed the import statements to use statements.

use Matplotlib;
my $plt = Matplotlib.new;

Now for the data... The t values were generated by a numpy function called arange. I didn't know what that did, and rather than wrap numpy as well, I figured I could probably do whatever it was doing with Perl 6. As the comment suggested, it was a simple number sequence. Perl 6 has a handy Sequence operator for just this sort of thing.

my @t  = 0, 0.2 ...^ 5;

This defines a Sequence starting at 0, followed by 0.2, then increasing at the same interval until is gets up to (^) 5. If I wanted to include 5 in the Sequence, I would just drop the ^. Using some well placed print statements, I confirmed by @t values were the same as the python version.

I then pasted the rest of the Python code into my Perl script, inserted a @ in front of each t and ran it. Things did not go as planned. Well, I got a graph, but it wasn't quite right

Some of you might have already guessed where I went wrong. Spoiler, I don't have any experience with numpy arrays (or numpy for that matter). A few more print statements let me know where I was going wrong. You see, when referring to a list (or array) in a numerical context, Perl will coerce the array to it's number of elements

my @t = (2, 3, 4);
@t ** 2; # Result: 9

However numpy arrays will apply that operation to each value in the array. I could create an object in Perl 6 that behaves the same way, but for now I'll just use Perl's map function which is perfectly apt for this sort of thing. Perl 6 provides a number of ways to map values in an array: I can be very explicit, or very succinct.

@t.map(-> $x { $x ** 2 }) # Result: (4, 9, 16)
@t.map(*²)                # same thing

Note the lack of curlies {} in the second example. This is a special syntax where * takes on the value of whatever value it was passed. In fact, Perl 6 actually refers to that * as Whatever. This WhateverCode is best suited for simple operations, which perfectly describes a binary mathematical operation such as exponentiation.

Additionally, Perl contains quite a few unicode operators; My favourites are × for multiplication and ÷ for division, instead of the historical compromise of * and /. In the second example above, I am using the superscript of 2 (²) to raise Whatever (*) to the power of 2.

I modified my arguments accordingly and ran it again. Another graph! This one also had 2 dots... But they were in a different place to last time, so that's progress, right? I looked at the gist again and noticed the arguments to plot were inside square brackets. Accepting it was probably due a difference in how each language handled positional arguments, I diligently wrapped my arguments to plot in square brackets.

$plt.plot([
    @t, @t,         'r--',    # red dashes
    @t, @t.map(*²), 'bs',     # blue squares
    @t, @t.map(*³), 'g^'      # green triangles
]);

$plt.show();

This time when I ran it and was delighted when a graph identical to the one on the tutorial appeared before my eyes.

Wondering about those square brackets, I played around a little with the call to plot. I quickly learned that Python prefers to be given what Perl calls "scalar" variables. In simple terms, a scalar is a variable that Perl will treats as a single "thing". Passing a Perl array of 3 @things to a Python function would cause those 3 values to get expanded out to 3 positional arguments. Creating a scalar list $things or coercing $@things to be passed in a scalar context works much better.

With regard to that map, it seems that by giving the map directly to plot, it was more "listy" than scalar. By assigning the result to a scalar first, I could then pass those values to plot without having to wrap all the arguments in square brackets, like so..

my $t  = [ 0, 0.2 ...^ 5 ];
my $t2 = $t.map(*²);
my $t3 = $t.map(*³);

$plt.plot(
    $t, $t,  'r--',    # red dashes
    $t, $t2, 'bs',     # blue squares
    $t, $t3, 'g^'      # green triangles
);

$plt.show();

So that about wraps up Part 1. This graph was kind of interesting, but I knew matplotlib was capable of more. I felt no great excitement from the other images in the tutorial, so headed off to the gallery to see what else I could sink my teeth into.

To be continued...

Later Earlier