Recently, I attended YAPC::NA in Asheville, NC. One of the most interesting technical talks I saw was use types; by Reini Urban. In it, he describes new possibilities for having type-safe Perl constructs honored by the Perl interpreter.
This is interesting for a number of reasons. Say you have an array of items which you know are going to be native integers: you could store them in a whole lot less memory than in the current situation:
Under Reini’s proposal, this could be reduced to:
Yes, that’s about two data structures for the entire (fixed size) array. This is the most dramatic of the memory reductions and would bring Perl quite close to C in terms of memory use for data without resorting to relatively slow pack/unpack systems like Judy. It would also be very fast.
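For comparison, here is a rough sketch of the kind of workaround available today: packing the integers into one flat string with vec(). This is my own illustration rather than anything from Reini’s talk, and the element count and values are made up. Every access goes through vec(), which is the sort of overhead a natively typed array would avoid.

```perl
use strict;
use warnings;

my $count = 1000;                    # arbitrary size, for illustration only
my $buf   = "\0" x (4 * $count);     # one flat buffer: 4 bytes per element

# Store 32-bit unsigned integers by index; each element is i * 10.
vec($buf, $_, 32) = $_ * 10 for 0 .. $count - 1;

# Fetch one element back and report how big the whole buffer is.
printf "element 10 = %d, buffer = %d bytes\n", vec($buf, 10, 32), length $buf;
```

The whole array lives in a single scalar, at the cost of a function call per element and a fixed element width.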
There are also provisions for other typed scalar values (such as floats, strings, etc.) and also a solution for hashes called “perfect hashes”.
Already this is all very good stuff, but it was the treatment of :const that caught my eye:
my int $x : const;
This guarantees that once Perl has allocated and initialized the value, it may not change. This would allow one of the worst problems with the existing threads::tbb library to be avoided.
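The closest thing in today’s Perl is the read-only flag, which is a much weaker promise than the proposed :const. As a minimal sketch, assuming the core but internal helper Internals::SvREADONLY is acceptable for a demonstration (it is not a public API), you can see assignment being refused at run time:

```perl
use strict;
use warnings;
use Scalar::Util qw(readonly);

my $x = 42;
Internals::SvREADONLY($x, 1);    # internal helper: set the read-only flag on $x

# Assignment to a read-only scalar dies at run time, so trap it with eval.
my $ok  = eval { $x = 43; 1 };
my $err = $@;

print "read-only: ", (readonly($x) ? "yes" : "no"), "\n";
print "assignment ", ($ok ? "succeeded\n" : "failed: $err");
```

This only blocks ordinary assignment from Perl code. It is a stand-in for illustration and carries none of the guarantees the proposal describes.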
Currently, the data structures that you share through threads::tbb::concurrent and other containers must not be updated after they are saved to the container. However, it is very difficult to ensure this, and some operations that you wouldn’t think of as writes do change memory. Even something as innocuous as printing an integer may upgrade the data structure from an IV (integer only) to a PVIV (integer + string “dualvar”) to cache the conversion. The current API mostly avoids this problem by copying memory structures very quickly on the first access, so that subsequent accesses work with the copied data. But it is not foolproof: any access to shared data structures may still cause another thread to segfault the process through an unsafe data structure access.
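The hidden write is easy to observe with the core B module. The sketch below is my own illustration (the helper name sv_class is made up): it asks for the internal class of a scalar before and after it is read in string context, which is what printing does.

```perl
use strict;
use warnings;
use B ();

# Hypothetical helper: report the internal SV class of a scalar,
# given a reference to it.
sub sv_class { return ref B::svref_2object($_[0]) }

my $n = 42;
my $before = sv_class(\$n);
my $text   = "$n";             # a read in string context, as print would do
my $after  = sv_class(\$n);

print "before: $before\nafter:  $after\n";
```

On the perls I have tried, the class changes from the plain integer form to the integer-plus-string form, even though the code never assigns to $n.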
This proposal would provide the perfect API for ensuring that sharing is thread-safe: the container would require that only :const values may be put inside it. I would posit that the result would not only be easier to work with and safer by design, but would also use less memory and run faster, as the values would not need to be copied to share them between threads in the first place.
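To give a feel for the shape of such an API, here is a toy container that refuses anything not flagged read-only. The class ConstOnlyBox and its methods are hypothetical and are not part of threads::tbb, and today’s read-only flag is only standing in for the proposed :const.

```perl
use strict;
use warnings;
use Scalar::Util ();

package ConstOnlyBox;    # hypothetical name, for illustration only

sub new { my ($class) = @_; return bless { items => [] }, $class }

# Accept a reference to a scalar, but only if that scalar is read-only.
sub add {
    my ($self, $ref) = @_;
    die "value must be read-only\n"
        unless Scalar::Util::readonly(${$ref});
    push @{ $self->{items} }, $ref;
    return $self;
}

sub count { return scalar @{ $_[0]{items} } }

package main;

my $frozen = 42;
Internals::SvREADONLY($frozen, 1);    # internal helper: flag as read-only
my $loose = 7;                        # ordinary, writable scalar

my $box = ConstOnlyBox->new;
$box->add(\$frozen);                          # accepted
my $ok = eval { $box->add(\$loose); 1 };      # rejected, so $ok stays undef

print "items stored: ", $box->count, "\n";
print "writable value ", ($ok ? "accepted" : "rejected"), "\n";
```

The check happens once, at insertion, which is the point: after that the container could hand out the values without copying them.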
But it gets even better.
With “perfect” hashes, :const, and typed arrays and scalars, the Moose-like idiom of __PACKAGE__->make_immutable, which forbids a range of run-time dynamic behavior, could go even further: it could mark the actual package stash (symbol table) as :perfect or :const. If the Perl_clone() API takes this into account, then thread creation in Perl could get a whole lot faster. Currently, creating a new Perl thread basically duplicates the entire heap of your program. This comes from its design as an emulation of the Unix copy-on-write fork(), which Windows never provided. If modules can mark which parts of them have finished changing, and may no longer change, then those parts don’t need to be duplicated. Even if they are not :const, just being :perfect could allow the perfect hash keys to be shared, and so on.
One of the senior Perl figures I talked to about this was excited that this would be “real” :const, not the “humpty-dumpty const” behavior required by C++. In C++, the const keyword can be overridden by mutable. This was intended for things like reference counts, but in any case those are probably best handled by the relevant API calls specific to reference counts (which can use atomic types).
The summary is that Moose programs which use the make_immutable API would be able to start new threads using far less CPU time and memory, on top of the other benefits of this work, such as using less memory for object instances.
Of course, this is all very much a roadmap. It requires a lot of community acceptance of the idea, and working proposals in the form of patch sets and so on. However, Reini is hopeful of targeting Perl 5.16 for the first few changes, and 5.18 for more of them. At the current rate of Perl releases, that could mean this hitting a production Perl release as early as 2012.
Excellent call to action. I will put clone #17 to work on it immediately. Meanwhile, we need terminology. As you know, something as innocuous as an m///g or a substr() on a Unicode string needs to attach MAGIC to an SV, and this is probably a feature; m///g on a ‘const’ string seems like a good thing. On the other hand we want to encourage thread-friendly code, right? And this code is also fork-friendly, because it doesn’t trigger COW as much.
Anyway we need two names, if only so we can talk about it.
Yay hallway chats. yapc++
There has also been some discussion on Google+ of the paper.