Parallelized list-processing with stream.py

Version 0.8 of stream.py is out with a lot of goodies to make it easy to parallelize a pipeline. I have written previously that by expressing a list-processing task as a pipeline (or at least thinking of it as such), we can have parallelism for free. Well, at least as free as Python allows. We can choose between threading and multiprocessing as our concurrency model: with the first, we have shared memory but no real concurrency between threads (thanks to the GIL), and with the second, we have real multitasking but only under the tax of ...

Using doctest.testmod() properly

The usual, library-recommended way of running all doctests in a Python module is:

if __name__ == '__main__':
    import doctest
    doctest.testmod()

This, however, has a flaw: the script always exits with status 0, which in Unix parlance signals that it ran without errors, even when some test cases have failed. So instead, you should write:

if __name__ == '__main__':
    import doctest
    if doctest.testmod().failed:
        import sys
        sys.exit(1)

This way, a shell script can check programmatically whether the doctests have succeeded, which is very useful when you have many test scripts to run. ■
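To see the exit status at work, here is a minimal sketch that plays the role of the calling script, written in Python rather than shell. It writes two throwaway test scripts that use the idiom above, one with a passing doctest and one with a deliberately wrong expectation, and runs each in a child interpreter. The function double, the helper run_script, and the file name example_test.py are all made up for this illustration.

```python
import os
import subprocess
import sys
import tempfile
import textwrap

# Hypothetical test script: one doctest plus the exit-status idiom.
# {expected} is filled in with the value the doctest should expect.
TEMPLATE = textwrap.dedent('''
    def double(x):
        """
        >>> double(2)
        {expected}
        """
        return 2 * x

    if __name__ == '__main__':
        import doctest
        if doctest.testmod().failed:
            import sys
            sys.exit(1)
''')


def run_script(source):
    """Write `source` to a temporary file, run it, and return its exit status."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "example_test.py")
        with open(path, "w") as f:
            f.write(source)
        # Discard stdout, which is where doctest prints its failure report.
        proc = subprocess.run([sys.executable, path], stdout=subprocess.DEVNULL)
        return proc.returncode


passing = run_script(TEMPLATE.format(expected="4"))  # doctest is correct
failing = run_script(TEMPLATE.format(expected="5"))  # doctest is wrong on purpose
print(passing, failing)
```

The first script should report status 0 and the second a non-zero status, which is exactly what a caller needs in order to stop at the first failing test script.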

Streamlined list processing aka. pipes in Python

One of the most widely admired contributions of Unix to the culture of operating systems and command languages is the pipe, as used in a pipeline of commands. — Dennis M. Ritchie [dmr]

If the pipe is so great, why, then, does your favorite modern, dynamic programming language not have pipes built in?

If the idea is merely to direct the output of some computation to be the input of another, any programming language with a procedure abstraction would let you do that simply by composition. Like this:

f(g(h(input)))
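As a concrete instance of such a composition, here is a small sketch in Python. The bodies of f, g and h and the sample data are invented for illustration; each function consumes the complete result of the one nested inside it.

```python
def h(lines):
    """Strip surrounding whitespace from every line."""
    return [line.strip() for line in lines]


def g(lines):
    """Drop the lines that are empty."""
    return [line for line in lines if line]


def f(lines):
    """Count the lines that remain."""
    return len(lines)


data = ["  alpha ", "", " beta", "   "]

# The innermost call runs first; its whole result is handed outward.
count = f(g(h(data)))
print(count)  # 2
```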

However, that is not a pipeline! The procedure f can ...