Hyphen

Hyphen allows one to access Haskell modules from Python (3 or better). (More precisely, it allows one to access Haskell modules compiled with GHC from CPython.) It is in some sense the dual of the cpython package on Hackage, which allows Haskell code to access Python modules.

For instance:

>>> import hyphen, hs.Prelude
>>> hs.Prelude.drop(1, [1,2,3])
<hs.GHC.Types.[] object of Haskell type [GHC.Integer.Integer], containing '[2,3]'>
>>> list(hs.Prelude.drop(1, [1,2,3]))   # Convert back to Python list
[2, 3]
>>> hs.Prelude.id(3)
3

Why the name?

The other obvious portmanteau is 'Pascal', which is taken.

Building

For a guide to building Hyphen, see BUILDING.md. Hyphen has been successfully built and used on Mac OS X, Ubuntu, Windows (32 bit and 64 bit) and Windows+Cygwin (32 bit and 64 bit).

Basic usage

Once you have imported hyphen, you can import Haskell modules as though they were Python modules with an hs prefix, so for example

>>> import hyphen
>>> import hs.Prelude
>>> import hs.Data.Text

The functions defined in these modules can now be called from Python. For instance, the Haskell function Prelude.drop can be referred to from python as hs.Prelude.drop. The only slightly fiddly thing is that some Haskell functions can have names that Python won't accept. (So for instance there's a function Prelude.(+), but Python won't let you refer to hs.Prelude.(+).) In such cases, you can find these symbols as, for example, hs.Prelude._["+"]. (Note that hs.Prelude._["drop"] works too.)

(Quick note: so far we're just talking about how to access Haskell code from installed modules. In this usage pattern, the Haskell code that you want to access from Python should have Cabal install it to the (system or user) library, and you'll be able to import it from there. Hyphen also supports just having a file Whatever.hs in the same directory as your python script and importing functions from there into python: see 'non-library modules' below for more.)

When you call a Haskell function, you can call it with arguments that are themselves Haskell objects, but you can also call it with python objects as arguments, in which case we will try to marshall those arguments into Haskell objects before calling the Haskell function. For instance

>>> hs.Prelude._["+"] (1, 2)
3
>>> hs.Prelude.sum([1,2,3]) # list converted to Haskell list
6
>>> hs.Prelude.drop(5, "Hello, world")
', world'

In these three cases, the return value has been marshalled back from Haskell to Python. Our philosophy is to aggressively try to marshall parameters to Haskell types, but only to convert the very simplest types (String, Text, Int, Integer, Float, Complex) back to Python types on return. More complicated objects stay as Haskell objects. For instance:

>>> hs.Prelude.drop(1, [1,2,3])
<hs.GHC.Types.[] object of Haskell type [GHC.Integer.Type.Integer], containing '[2,3]'>

These Haskell objects can be used from python in many ways. The basic way of using them is to pass them to further Haskell functions; and this certainly works, for example:

>>> my_list = hs.Prelude.drop(1, [1,2,3])
>>> hs.Prelude.sum(my_list)
5

But we can also do other things. For instance, if the return value is a Haskell list, its python representation will be a python iterable:

>>> for x in my_list:
...     print(x)
...
2
3

...and if the return value is a Haskell Map or HashMap, the resulting Haskell object will behave like a python dict...

>>> import hs.Data.Map
>>> my_map = hs.Data.Map.fromList([(1, 'Hello'), (2, 'World')])
>>> my_map[1]
'Hello'
>>> print(sorted([key for key in my_map]))
[1, 2]

...and if the return value is a function object, we can call it. (Indeed, the Haskell function objects we get returned by calling Haskell-functions-that-return-functions are the exact same kind of thing as the Haskell function objects that get imported into modules when we import hs.something; in other words, they're the same as the objects we've been calling so far.)

>>> my_func = hs.Prelude.const(4)
>>> my_func
<hyphen.HsFunObj object of Haskell type b_0 -> GHC.Integer.Type.Integer>
>>> my_func('Hello')
4

In a similar vein, if the Haskell object's type is an instance of Ord or Eq then the corresponding python object will support comparisons or equality tests, and if the Haskell object is hashable, the python object will be too.

As you'd expect, you can partially apply any function just by calling it with fewer than the full number of arguments, and you'll get a function which accepts the remaining arguments later.

In addition, Haskell objects that represent IO actions can be induced to actually perform the action by calling .act on them. This returns whatever the return type of the action might be.

>>> hs.Prelude.putStrLn("Test") # Construct IO action, but don't perform it
<hs.GHC.Types.IO object of Haskell type GHC.Types.IO ()>
>>> hs.Prelude.putStrLn("Test").act()
Test
<hs.GHC.Tuple.() object of Haskell type (), containing '()'>

It goes without saying that it is important to remember to call .act: if your code doesn't seem to be doing what it's meant to be doing, it's possible that you're constructing an IO action but then discarding it without performing it!

Finally, if we find a Haskell type T and a function f :: T -> <something> which is defined in the same module, then this gives rise to a member function on the Python representation of objects of type T, such that foo.f(*args) means the same as f(foo, *args). For example, if we have Haskell code in the module Test as follows:

data Test = Test Integer deriving (Typeable, Show)

extract_number :: Test -> Integer
extract_number (Test i) = i

make_sum :: Test -> Integer -> Integer
make_sum (Test i) j = i + j

Then from python we can do:

>>> import hyphen
>>> hyphen.find_and_load_haskell_source()
>>> from hs.Test import Test
>>> my_test_obj = Test(3)
>>> my_test_obj
<hs.Test.Test object of Haskell type Test.Test, containing 'Test 3'>
>>> my_test_obj.extract_number
<bound method Test.extract_number of <hs.Test.Test object of Haskell type Test.Test, containing 'Test 3'>>
>>> my_test_obj.extract_number()
3
>>> my_test_obj.make_sum(4)
7

Names of Haskell functions that begin with an underscore are exempt from this rule because we don't want users to accidentally end up defining python members like __getitem__ which can change the behavior of an object quite dramatically. If you want to create such members deliberately (which you might: perhaps you're trying to use Haskell to build a python type that has interesting non-standard behaviors), then you can escape the exemption as follows: if you define a member in Haskell with a name like hy__<type-name>__<something>__, then this will be used to create a python member called __<something>__.

For instance, continuing the example above, if the Haskell code continues

hy__Test__getitem__ :: Test -> Integer -> Integer
hy__Test__getitem__ (Test i) j = i + j

Then the python example could have continued

>>> my_test_obj[5]
8
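The naming rule can be summarized in a few lines of plain Python. This is only an illustration of the convention described above, not hyphen's actual implementation: the helper name python_member_name is made up for this sketch, and it ignores the other requirement (that the function has type T -> <something> and lives in the same module as T).

```python
def python_member_name(haskell_name, type_name):
    """Return the Python member name that a Haskell function name maps to
    under the convention described above, or None if the name is exempt.
    Hypothetical helper for illustration only; not part of hyphen."""
    prefix = "hy__" + type_name + "__"
    rest = haskell_name[len(prefix):]
    # Escape hatch: hy__<type-name>__<something>__ becomes __<something>__.
    if haskell_name.startswith(prefix) and rest.endswith("__") and len(rest) > 2:
        return "__" + rest
    # Any other name beginning with an underscore is exempt from the rule.
    if haskell_name.startswith("_"):
        return None
    # Ordinary names become members under their own name.
    return haskell_name


print(python_member_name("extract_number", "Test"))       # extract_number
print(python_member_name("hy__Test__getitem__", "Test"))  # __getitem__
print(python_member_name("_internal_helper", "Test"))     # None
```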

More detail about marshalling to Haskell

We have already seen some examples of how objects are marshalled from Python to Haskell. A key point about this process is that a single Python object could be used to construct Haskell objects of various different types, depending on the type expected by the Haskell function which we're applying. For instance:

>>> import hyphen, hs.Prelude, hs.Data.Text
>>> hs.Prelude.drop(6, "Hello world")   # Python string -> Haskell String
'world'
>>> hs.Data.Text.drop(6, "Hello world") # Python string -> Haskell Text
'world'
>>> hs.Prelude.drop(1, (1, 2))          # Python tuple  -> Haskell list
<hs.GHC.Types.[] object of Haskell type [GHC.Integer.Type.Integer], containing '[2]'>
>>> hs.Prelude.snd((1, 2))              # Python tuple  -> Haskell tuple
2

On the other hand, you can apply polymorphic Haskell functions to Python objects, and the type of the Python object to which we apply the function will be used to determine what Haskell type should be used for the polymorphic arguments; for instance:

>>> hs.Prelude._['+'] (1, 2)            # Select Integer version
3
>>> hs.Prelude._['+'] (1+0j, 2+3j)      # Select Complex Float version
(3+3j)
>>> hs.Prelude.id([1, 2, 3])            # Invoke version of id for lists of integers
<hs.GHC.Types.[] object of Haskell type [GHC.Integer.Type.Integer], containing '[1,2,3]'>

When a Python object could have been converted into multiple Haskell types, we will 'break the tie' and convert it to some preferred type:

>>> hs.Prelude.id((1, 2))  # Prefer to convert Python tuples to Haskell tuples, not lists
<hs.GHC.Tuple.(,) object of Haskell type (GHC.Integer.Type.Integer, GHC.Integer.Type.Integer), containing '(1,2)'>
>>> hs.Prelude.id((1, "Test")) # Prefer to convert Python strings to Haskell Text
<hs.GHC.Tuple.(,) object of Haskell type (GHC.Integer.Type.Integer, Data.Text.Internal.Text), containing '(1,"Test")'>

This is not foolproof however; for instance, we get an error in the following case:

>>> hs.Prelude._['+'] (1, 2+3j)
Traceback (most recent call last):
...
TypeError: Incompatible types: cannot resolve object of type
    a -> a -> a
to type
    GHC.Integer.Type.Integer -> Data.Complex.Complex GHC.Types.Float -> a

Before we close this section, we'll cover two other behaviors of the marshalling code that are important or useful.

One key point is that Python functions can be marshalled into Haskell functions. For example:

>>> hs.Prelude.foldr((lambda x, y: x + y), 0, [1, 2, 3])
6

You should be careful when doing this, though: Haskell idioms make full use of Haskell's ability to recurse to an effectively unbounded depth, whereas Python has a finite stack depth. This can cause problems in examples like:

>>> hs.Prelude.foldr((lambda x, y: x + y), 0, range(10000))
...
RuntimeError: maximum recursion depth exceeded while calling a Python object
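The same limit can be seen without hyphen at all. The sketch below is plain Python (the function names are made up for illustration): a right fold written recursively uses one Python frame per element and fails on long inputs, while the same fold written as a loop does not. On Python 3.5 and later the error is reported as RecursionError, which is a subclass of RuntimeError.

```python
import sys
from functools import reduce


def foldr_recursive(f, z, xs, i=0):
    """Right fold written recursively: one Python frame per element."""
    if i == len(xs):
        return z
    return f(xs[i], foldr_recursive(f, z, xs, i + 1))


def foldr_iterative(f, z, xs):
    """The same right fold as a loop, using constant Python stack depth."""
    return reduce(lambda acc, x: f(x, acc), reversed(xs), z)


def add(x, y):
    return x + y


# Long enough to exceed whatever the current recursion limit is.
long_input = list(range(sys.getrecursionlimit() * 2))

try:
    foldr_recursive(add, 0, long_input)
    overflowed = False
except RecursionError:
    overflowed = True

iterative_total = foldr_iterative(add, 0, long_input)
print(overflowed, iterative_total == sum(long_input))
```

If you run into this with hyphen, one option that may help is to keep the whole fold on one side of the bridge, for instance by using a Haskell function such as hs.Prelude.sum rather than folding with a Python lambda.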

Similarly, Python functions can be marshalled into IO actions. In this case the Python function will be called with no arguments. For example, we can take the usual 'replicate' function in the Prelude, force its type to be Int -> IO Text -> [IO Text] (see the section on types below for more on this), then play with it as follows:

>>> hs.Prelude.replicate
<hyphen.HsFunObj object of Haskell type GHC.Types.Int -> a -> [a]>
>>> specialized_repl = hs.Prelude.replicate.subst(a=hs.Prelude.IO(hs.Data.Text.Text()))
>>> specialized_repl
<hyphen.HsFunObj object of Haskell type GHC.Types.Int -> GHC.Types.IO Data.Text.Internal.Text -> [GHC.Types.IO Data.Text.Internal.Text]>
>>> specialized_repl(4, (lambda : input()))
<hs.GHC.Types.[] object of Haskell type [GHC.Types.IO Data.Text.Internal.Text]>
>>> hs.Prelude.sequence(specialized_repl(4, (lambda : input())))
<hs.GHC.Types.IO object of Haskell type GHC.Types.IO [Data.Text.Internal.Text]>
>>> hs.Prelude.sequence(specialized_repl(4, (lambda : input()))).act()
Fee
Fi
Fo
Fum
<hs.GHC.Types.[] object of Haskell type [Data.Text.Internal.Text], containing '["Fee","Fi","Fo","Fum"]'>

(The reason we have to force the type is that otherwise we have no way of knowing that we should marshall the Python function into an IO action, nor what that IO action should return.)

The other useful behavior is that if we can marshall a python object to a Haskell object of type T, then we can generally also marshall it to Haskell type Maybe T (in which case we insert an implicit Just); we can also marshall python None to any Haskell type Maybe x, in which case we render it as Nothing.

For example:

>>> import hyphen, hs.Prelude, hs.Data.Maybe, hs.Data.Text
>>> identity_on_Maybe_Text = hs.Prelude.id.subst(a=hs.Data.Maybe.Maybe(hs.Data.Text.Text()))
>>> identity_on_Maybe_Text
<hyphen.HsFunObj object of Haskell type GHC.Base.Maybe Data.Text.Internal.Text -> GHC.Base.Maybe Data.Text.Internal.Text>
>>> identity_on_Maybe_Text("Hello")
<hs.GHC.Base.Just object of Haskell type GHC.Base.Maybe Data.Text.Internal.Text, containing 'Just "Hello"'>
>>> identity_on_Maybe_Text(None)
<hs.GHC.Base.Nothing object of Haskell type GHC.Base.Maybe Data.Text.Internal.Text, containing 'Nothing'>

Don't cross the streams

In general, the goal with hyphen is that things Just Work (TM). I'll leave it to the reader to decide whether that goal is met in general, but there's one really nasty case where it seems pretty impossible to make things Just Work in the best possible way. So now I'll give the single most important warning about using Hyphen: you must not construct reference loops that consist of Haskell and python objects. If you do this, then neither the Haskell garbage collection nor the Python garbage collection will be able to collect the objects in the loop, because neither will be able to see the 'whole loop' and recognize that it can be collected.

To be clear, it's perfectly fine to have Python objects refer to Haskell objects. (Whenever you get a return value from a Haskell function which is not marshalled back to Python, and you store that result somewhere in a Python object, you're building Python objects which refer to Haskell objects.) It's perfectly fine to have Haskell objects refer to Python objects. (When you marshall a python closure over to Haskell for use as a Haskell function, that closure (and hence, indirectly, any python objects which that closure refers to) will now be referred to from Haskell.) It's perfectly fine, even, to have Haskell objects refer to Python objects which themselves refer to Haskell objects which refer to Python objects which refer to Haskell objects and so on. What you can't do is have (say) a Haskell object refer to a Python object which refers to the original Haskell object.
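A common way to end up with such a loop is to store, on a Python object, a Haskell function built from a closure that captures that same object. One possible way to avoid closing the loop, sketched below in plain Python, is to have the closure capture only a weak reference. This is an assumption-laden illustration rather than a hyphen feature: Counter and make_callback are hypothetical names, and the callback attribute merely stands in for the slot where a Haskell object would be stored.

```python
import gc
import weakref


class Counter:
    def __init__(self):
        self.total = 0
        self.callback = None  # in real use this might hold a Haskell object


def make_callback(counter):
    """Build a closure that refers to `counter` only weakly, so storing the
    closure (or anything built from it) on `counter` creates no strong loop."""
    counter_ref = weakref.ref(counter)

    def add(x):
        target = counter_ref()
        if target is None:
            raise ReferenceError("the counter has already been collected")
        target.total += x
        return target.total

    return add


counter = Counter()
counter.callback = make_callback(counter)
first_total = counter.callback(5)

probe = weakref.ref(counter)
del counter
gc.collect()
collected = probe() is None  # nothing strong points back at the counter
print(first_total, collected)
```

The price of this pattern is that the callback can fail if it is invoked after the Python object has gone away, so it only suits cases where that is acceptable.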

Importing non-library modules

In general, the preferred way of accessing Haskell code from Python via hyphen is to install the Haskell code as a ghc-pkg visible library; such libraries can always be directly imported into Python with hyphen (as we have been doing so far).

But you might want to have some Haskell source code in the same directory as your python source, and then magically import Haskell functions from that source. This is possible, but there are some limitations. The basic issue is that to ensure type-safety, the hyphen system must enforce that we only ever compile and import from source once per program run. (Reason: Haskell will recompile the source if it has changed, which might lead us to have binary-incompatible objects floating around which the type system thinks can be freely inter-substituted, leading to Problems.) So we have to have a system for deciding which Haskell modules we want compiled as part of this 'one shot' compilation.

We provide two options for this.

Option one is enabled by calling hyphen.find_and_load_haskell_source(). This looks in the directory of the running script for Haskell source, and (if any is found) we will also check subdirectories for more source, and (if any is found in a subdirectory) we check sub-sub directories recursively. All the source we find, we try to compile; if there are .hs files lying around which are not valid source, we will get errors. Once compiled, we can import the contents of these files from the hs.* namespace as usual.

This is meant to cover the case when you're writing a little script and you want to import a little bespoke Haskell routine. It assumes that you can control the contents of the directory where your script lives; not an unreasonable assumption.

Option two is not really recommended. It covers the case where you're writing a python library which is being imported from somewhere on the python path and which (in turn) wants to import a little Haskell piece of code that lives in the same directory. (As we've said, the recommended way of handling this is to install the Haskell code as a Haskell library visible from ghc-pkg, then to import it from python... but we'll assume this isn't possible for you.) To enable option two, call find_and_load_haskell_source(check_full_path=True) as your python library module is imported. This will check the entire python path for Haskell files (using the same rules as were used to check the script directory above, including recursively reading subdirectories). We then compile them and again they may be imported. This (a) is somewhat slow, and (b) runs the risk that we'll come across an .hs file somewhere in the path which isn't valid Haskell and die.
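Because a stray .hs file anywhere in the search can break compilation, it may be useful to list the candidate files before enabling either option. The sketch below is a hypothetical pre-flight helper in plain Python, written from one reading of the search rule described above (a directory's subdirectories are only examined if the directory itself contained source). It is not hyphen's implementation, and the real search may differ in details.

```python
import os
import tempfile


def haskell_sources(directory):
    """List .hs files under `directory`, descending into a directory's
    subdirectories only if that directory itself contained source.
    Hypothetical pre-flight helper; not part of hyphen."""
    entries = sorted(os.listdir(directory))
    found = [os.path.join(directory, name) for name in entries
             if name.endswith(".hs")
             and os.path.isfile(os.path.join(directory, name))]
    if not found:
        return []  # no source here, so stop descending
    for name in entries:
        path = os.path.join(directory, name)
        if os.path.isdir(path):
            found.extend(haskell_sources(path))
    return found


# Build a throwaway directory tree to try the helper on.
with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "Utils", "Deep"))
    os.makedirs(os.path.join(root, "data", "Hidden"))
    for relative in ["Main.hs", "Utils/Helper.hs",
                     "Utils/Deep/More.hs", "data/Hidden/Skipped.hs"]:
        with open(os.path.join(root, *relative.split("/")), "w") as handle:
            handle.write("-- placeholder\n")
    relative_found = [os.path.relpath(path, root).replace(os.sep, "/")
                      for path in haskell_sources(root)]

print(relative_found)
```

In this example data/Hidden/Skipped.hs is not listed, because the data directory holds no source of its own and so the sketch never looks below it.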

What happens to Haskell type constructors and data constructors?

So far, we've talked a lot about what happens to objects imported from Haskell modules. But Haskell modules can also contain type constructors and data constructors. What happens to them? The answer is that they are both transformed into python type objects. So if we have a Haskell type constructor with a couple of data constructors, we will end up with a little class hierarchy on the Python side. For example, if the Haskell module Test has:

data Example = ExampleWithInt    Int
             | ExampleWithString String deriving (Typeable, Show)

we'll end up with python classes arranged as follows

hyphen.HsObj (base class of all Haskell objects viewed from python via hyphen)
|
\--- hs.Test.Example
        |
        \---- hs.Test.ExampleWithInt
        |
        \---- hs.Test.ExampleWithString

As we can see from Python:

>>> hs.Test.Example
<class 'hs.Test.Example'>
>>> type(hs.Test.Example)
<class 'type'>
>>> hs.Test.ExampleWithInt
<class 'hs.Test.ExampleWithInt'>
>>> hs.Test.ExampleWithString
<class 'hs.Test.ExampleWithString'>
>>> hs.Test.ExampleWithString.__bases__
(<class 'hs.Test.Example'>,)
>>> hs.Test.ExampleWithInt.__bases__
(<class 'hs.Test.Example'>,)

We can call the data constructors to make Haskell objects of the relevant types:

>>> hs.Test.ExampleWithInt(1)
<hs.Test.ExampleWithInt object of Haskell type Test.Example, containing 'ExampleWithInt 1'>
>>> hs.Test.ExampleWithString("hello")
<hs.Test.ExampleWithString object of Haskell type Test.Example, containing 'ExampleWithString "hello"'>

Things behave as you'd expect:

>>> isinstance(hs.Test.ExampleWithInt(1), hs.Test.ExampleWithInt)
True
>>> isinstance(hs.Test.ExampleWithInt(1), hs.Test.ExampleWithString)
False
>>> isinstance(hs.Test.ExampleWithInt(1), hs.Test.Example)
True
>>> type(hs.Test.ExampleWithInt(1))
<class 'hs.Test.ExampleWithInt'>

It's also worth remarking that you can take apart a Haskell object to see what arguments were applied to its constructor as follows:

>>> hs.Test.ExampleWithInt(1)._components
(1,)

One potentially confusing thing: you need to realize that, for data constructors which take no parameters, there's a big difference between the python representation of the data constructor itself and the value you get when you call it. (Whereas in Haskell they're basically the same thing.) For instance:

>>> hs.Prelude.LT
<class 'hs.GHC.Types.LT'>
>>> hs.Prelude.LT()
<hs.GHC.Types.LT object of Haskell type GHC.Types.Ordering, containing 'LT'>
>>> type(hs.Prelude.LT)
<class 'type'>
>>> type(hs.Prelude.LT())
<class 'hs.GHC.Types.LT'>

Another thing to bear in mind is that since a Haskell module's type/type constructor namespace and object namespace are completely separate things, it is possible to have the same name defined in both namespaces (in such cases, because of capitalization rules, the object will necessarily be a data constructor). Python has only a single namespace, so we must somehow resolve this contention. We generally do this by creating a lightweight 'doublet' object; then you can do hs.Foo.Bar.MyName.the_tycon or hs.Foo.Bar.MyName.the_dacon to specify exactly what you mean. (There is one exception to this, which is when you have a type constructor with only one data constructor, and which has the same name as the type constructor. This is a common pattern in Haskell and it's annoying to have to do the .the_dacon stuff in this case. So in this case alone we have MyName refer to the data constructor, which is probably what you mean, and allow you to write MyName.the_type to get the type constructor.)

The difference between Python types and Haskell types

We've just described how hyphen creates python types to represent Haskell type constructors and data constructors. Every Haskell object viewed from python via hyphen will be given a python type using these types. (Exception: if the type constructor is invisible because it's not exported from anywhere, then the type will just be HsObj.)

It follows from all this that the Python type of a Haskell object viewed through hyphen can be rather different to its Haskell type. On the one hand, the Python type will depend on exactly which data constructor was used. On the other hand, if the Haskell type is something like Map Int Int, the python type will not pick up the arguments (Int and Int) provided to the type constructor; the python type will just be hs.Data.Map.Map, and so:

>>> map1 = hs.Prelude.id({1 :1, 2:2})
>>> map1
<hs.Data.Map.Base.Map object of Haskell type Data.Map.Base.Map GHC.Integer.Type.Integer GHC.Integer.Type.Integer, containing 'fromList [(1,1),(2,2)]'>
>>> map2 = hs.Prelude.id({1:'one', 2:'two'})
>>> map2
<hs.Data.Map.Base.Map object of Haskell type Data.Map.Base.Map GHC.Integer.Type.Integer Data.Text.Internal.Text, containing 'fromList [(1,"one"),(2,"two")]'>
>>> type(map1) == type(map2)
True

If you care to know the full Haskell type of a Haskell object viewed from hyphen, we can certainly do that; it's visible as the .hstype member. This will be a python object of python type hyphen.HsType:

>>> map1.hstype
hs.Data.Map.Map(hs.GHC.Integer.Integer(), hs.GHC.Integer.Integer())
>>> str(map1.hstype) # NB: str(...) representation easier to read than repr, more closely matches Haskell notation
'<hyphen.HsType object representing Data.Map.Base.Map GHC.Integer.Type.Integer GHC.Integer.Type.Integer>'
>>> str(map2.hstype)
'<hyphen.HsType object representing Data.Map.Base.Map GHC.Integer.Type.Integer Data.Text.Internal.Text>'
>>> map1.hstype == map2.hstype
False

HsType objects can be manipulated in a few ways. You can break them up by calling the head and tail members. (head will return a triple giving the name/module/package key of a type constructor, or just a string in the case that the head of the type is a type variable.) You can build HsTypes by invoking type constructors (pass other HsTypes as the parameters, and strings for type variables). To build HsTypes representing type variables do HsType('foo') (or HsType('foo', kind='* -> *') etc. for variables of more complicated kinds), and if you want to create an odd HsType with a type variable as its head and a nontrivial tail (legal Haskell, but very odd), you can do something like HsType('a', hs.Prelude.Integer(), kind="*->*").

>>> map1.hstype.head
('Map', 'Data.Map.Base', 'conta_LKCPrTJwOTOLk4OU37YmeN')
>>> map1.hstype.tail
(hs.GHC.Integer.Integer(), hs.GHC.Integer.Integer())
>>> import hs.Data.Map
>>> hs.Data.Map.Map('a', 'b')
hs.Data.Map.Map(hyphen.HsType("a"), hyphen.HsType("b"))
>>> hyphen.HsType("a")
hyphen.HsType("a")
>>> hyphen.HsType("a", kind="* -> *")
hyphen.HsType("a", kind="* -> *")
>>> hyphen.HsType("a").head
'a'
>>> hyphen.HsType('a', hs.Prelude.Integer(), kind="*->*")
hyphen.HsType("a", hs.Prelude.Integer(), kind="* -> *")

In this connexion, it's important when there are nullary type constructors in play to be mindful of the difference between the python object we create to represent the type constructor and the HsType you get by invoking it:

>>> hs.Prelude.Integer
<class 'hs.GHC.Integer.Type.Integer'>
>>> type(hs.Prelude.Integer)
<class 'type'>
>>> hs.Prelude.Integer()
hs.Prelude.Integer()
>>> str(hs.Prelude.Integer())
'<hyphen.HsType object representing GHC.Integer.Type.Integer>'
>>> type(hs.Prelude.Integer())
<class 'hyphen.HsType'>

There are a few other interesting things to mention. One is that you can call subst on HsTypes to provide substitutions for type variables therein. You can call kind to get the kind of the type. And you can call fvs to get the free variables (as a dictionary mapping variables to strings representing their kinds).

>>> my_hstype = hs.Prelude.id.hstype
>>> my_hstype
hs.GHC.Prim._['(->)'](hyphen.HsType("a"), hyphen.HsType("a"))
>>> str(my_hstype)
'<hyphen.HsType object representing a -> a>'
>>> my_hstype.fvs
{'a': '*'}
>>> my_hstype.kind
'*'
>>> my_hstype2 = my_hstype.subst(a=map1.hstype)
>>> str(my_hstype2)
'<hyphen.HsType object representing Data.Map.Base.Map GHC.Integer.Type.Integer GHC.Integer.Type.Integer -> Data.Map.Base.Map GHC.Integer.Type.Integer GHC.Integer.Type.Integer>'

Once you know how to construct and play with HsType objects, this also allows you to do some new things with HsObj. You can also call subst on HsObjs, in order to substitute for the type variables that occur in their types, and you can call narrow_type to narrow an object to a specific type.

>>> int_identity = hs.Prelude.id.subst(a=hs.Prelude.Int())
>>> int_identity
<hyphen.HsFunObj object of Haskell type GHC.Types.Int -> GHC.Types.Int>
>>> int_identity('Foo')
Traceback (most recent call last):
...
TypeError: an integer is required (got type str)
>>> int_identity(1)
1
>>> hs.Prelude.id.narrow_type(int_identity.hstype)
<hyphen.HsFunObj object of Haskell type GHC.Types.Int -> GHC.Types.Int>

What about exceptions?

We try to be sensible about converting Haskell exceptions to Python exceptions whenever they escape from Haskell code. For instance:

>>> hs.Prelude.quot(1, 0)
Traceback (most recent call last):
...
ZeroDivisionError: divide by zero

(here the Haskell DivideByZero was converted to a python ZeroDivisionError). Python exceptions raised by python functions that were marshalled into Haskell functions will propagate through Haskell code and finally come out to the python code on the other side.

Notes on customizability and the low-level layer

A brief point about the implementation of hyphen. Basically, hyphen consists of two rather separate components. The first component is a fairly low-level bridge between Haskell and Python. This low-level bridge is written mostly in Haskell with a little bit of C. The second component is built on top of this and gives the high-level bridge we've just described. This second part is in fact implemented in Python. There are two nice results of this. First, if something goes wrong in the high-level layer, you'll get a nice traceback (at least until the problem hits the low-level layer), which may help you to diagnose what's up and what can be done to fix it. Second, the high-level layer is full of hooks that can be manipulated in Python: so if you, the Python user, want to add functionality (say) to marshall python objects to some particular Haskell type in some particular way, that's totally something you can do, from pure python, by just customizing the high-level layer via its hooks.

If this is something that interests you, the best way to start is by reading the sources of the high-level layer, especially marshall_ctor.py, marshall_obj_to_hs.py and marshall_obj_to_py.py. There are comments in each of those files which provide a guide for customizing behavior.

For more documentation on the low-level layer, see LOWLEVEL.md.

Notes on efficiency

The previous section describes the implementation of hyphen in terms of a high-level layer sitting on top of a low-level layer. This allows us to segue into the question of efficiency. Obviously, if you call a Haskell function from Python via hyphen then once we actually get through to running Haskell code, things will be just as efficient as running any other Haskell code anywhere else. So as long as you don't have context switches between Haskell and python inside your inner loop (the intended case for using hyphen is that you should not have such things!), then you're probably fine.

If you do want to do lots of context switches in an inner or nearly-inner loop, then you might have a problem. The python high-level layer which handles marshalling of objects between the two languages is not designed to be super fast, so your code may be slow. If it's not possible to avoid having context switches close to your inner loop, you can try bypassing the high-level layer and using the low-level layer directly. If you do that, then things will be pretty efficient: if you invoke a Haskell function from Python using the low-level layer, then it's basically a type check followed by a direct jump from python to a C function and from that C function to the Haskell machine code. So it should be reasonably fast.

One final point in this connection is that, even going via the low-level layer, resolving polymorphic Haskell objects down to monomorphic ones is reasonably expensive. So you should resolve any polymorphism once and for all, outside your inner loop, then call the monomorphic functions from inside the inner loop.

Notes on the GIL

Python notoriously has a locking structure, called the Global Interpreter Lock or GIL, which must be held whenever the python interpreter is doing anything. Hyphen allows you to optionally release this lock before calling in to Haskell code, and re-acquire it on returning from the Haskell code. This is often a good idea (it allows other python threads to make progress while the Haskell code is doing its work); but it might not be (releasing and re-acquiring the GIL is a little expensive and if all your Haskell calls return quickly and/or your python code isn't multithreaded anyway, then there's no benefit to releasing the GIL).

If you want to have hyphen release the GIL while running Haskell code, do hyphen.hslowlevel.set_GIL_mode_fancy(). If you want it not to bother to release the GIL, do hyphen.hslowlevel.set_GIL_mode_lazy(). If you want to know the current state, do hyphen.hslowlevel.get_GIL_mode() (which will return 'fancy' or 'lazy' as appropriate).
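If only one part of a program benefits from releasing the GIL, the three calls above can be wrapped so that the previous mode is restored afterwards. The following is a minimal sketch, not something hyphen provides: gil_mode is a hypothetical helper, and StandInLowLevel is a stand-in for hyphen.hslowlevel so that the example runs without hyphen installed (its initial mode of 'lazy' is arbitrary and says nothing about hyphen's default).

```python
from contextlib import contextmanager


@contextmanager
def gil_mode(lowlevel, mode):
    """Switch to GIL mode 'fancy' or 'lazy' for the duration of a with
    block, then restore the previous mode. `lowlevel` is expected to be
    hyphen.hslowlevel. Hypothetical helper; not part of hyphen."""
    setters = {"fancy": lowlevel.set_GIL_mode_fancy,
               "lazy": lowlevel.set_GIL_mode_lazy}
    previous = lowlevel.get_GIL_mode()
    setters[mode]()
    try:
        yield
    finally:
        setters[previous]()


class StandInLowLevel:
    """Stand-in for hyphen.hslowlevel that only records the mode."""

    def __init__(self):
        self._mode = "lazy"  # arbitrary starting point for the demo

    def set_GIL_mode_fancy(self):
        self._mode = "fancy"

    def set_GIL_mode_lazy(self):
        self._mode = "lazy"

    def get_GIL_mode(self):
        return self._mode


lowlevel = StandInLowLevel()  # with hyphen installed: hyphen.hslowlevel
with gil_mode(lowlevel, "fancy"):
    mode_inside = lowlevel.get_GIL_mode()  # long Haskell calls would go here
mode_after = lowlevel.get_GIL_mode()
print(mode_inside, mode_after)
```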

Notes on signals and Keyboard Interrupts

This section only applies on unix-like OSs.

Sometimes, the operating system will send a signal to a running program, telling it (say) to terminate, or to interrupt what it's doing. For instance, when you press ctrl+C in the console, the running program receives an interrupt signal. Python allows you to install a Python signal handler for any signal, which is basically some python code which will get executed when the signal is sent by the operating system. A default handler is installed for the interrupt signal, which basically raises a KeyboardInterrupt exception which then propagates outward through your Python code, causing the code to stop unless the exception is caught.

Basically the same story is true in Haskell: Haskell allows you to install a Haskell signal handler for any signal, which is basically some Haskell code which will get executed when the signal is sent by the operating system. Again, a default handler is installed for the interrupt signal, which basically raises a UserInterrupt exception which then propagates outward as before.

The wrinkle in all this is the following. It turns out that the way Python implements python signal handlers goes like this. The OS provides a low-level primitive that lets you install some C code to run when a signal is delivered (we'll call that an OS signal handler). Unfortunately, what you're allowed to do inside an OS signal handler is extremely limited. So when you ask python to install a python signal handler, python will (in turn) install an OS signal handler, but all that handler does is to set a flag somewhere. Then python constantly polls that flag as it interprets code, and if it's set, Python suspends whatever execution is currently in progress and rushes to call the (python) signal handler code.

(The same is true for Haskell.)

What does this have to do with Hyphen? Well, when you use hyphen to jump into Haskell code, then while that Haskell code is running, Python's OS signal handler is still installed, but no one is checking the flag. So if you (say) ctrl-C your program, the flag will get set, but the program will not stop running because no one will check the flag, and so KeyboardInterrupt will not be raised, so the program will not stop. (Until, that is, the Haskell routine returns and Python finally checks the flag.) This is Not Ideal.
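The effect of 'no one is checking the flag' can be seen in a toy model. The sketch below is plain Python with time in abstract units; it is only a model of the flag-and-poll idea described above, not a description of how CPython or GHC are implemented, and handler_delay is a made-up name.

```python
def handler_delay(step_durations, signal_time):
    """Return how long a pending signal waits before its handler runs, in
    a toy model where the flag is polled only between steps. Returns None
    if the signal arrives after the last poll."""
    clock = 0
    for duration in step_durations:
        clock += duration           # the step runs to completion
        if clock >= signal_time:    # the flag is polled between steps
            return clock - signal_time
    return None


# Twenty short interpreter steps: the flag is noticed straight away.
interpreted_only = handler_delay([1] * 20, signal_time=5)

# The same program, except that the fifth step is one long foreign call
# (for example into Haskell) during which nobody polls the flag.
with_foreign_call = handler_delay([1] * 4 + [1000] + [1] * 15, signal_time=5)

print(interpreted_only, with_foreign_call)
```

In the second run the signal arrives just after the long call starts and is only serviced once that call returns, which is the situation the rest of this section sets out to improve.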

There are two ways around this. One option is to have Haskell periodically check python's 'has a signal been received' flags, and (if a signal has been received), go service the python signal handler. If the signal handler (in turn) raises an exception (like KeyboardInterrupt), then we propagate that exception to the running Haskell code, which will pass it up the stack, and the exception will eventually escape to the python code that called the Haskell code, which is what we want. This is generally the best option, but it does require some multithreading overhead on the Haskell side (we have to spawn another Haskell thread just to service the Python signal handlers), and while multithreading in Haskell is cheap compared to python, it's not free. (Another proviso is that Haskell will only check the flag every 50ms or so, which is not nearly as often as Python would check it normally.)

An alternative is the following. While Haskell code is running, we can install Haskell's OS signal handler together with a Haskell signal handler that processes ctrl-C. (We then restore python's OS signal handler at the end of the Haskell code.) Then, if a ctrl-C is received during the execution of some Haskell code, Haskell's flag will be set to say that there's a signal waiting, and (since Haskell is constantly polling that flag while Haskell code is running), Haskell will notice that ctrl-C was pressed, which will invoke the Haskell signal handler, which will raise UserInterrupt, which will propagate up through the Haskell code and finally be translated to KeyboardInterrupt when it crosses over into python. This again will have the effect that our Haskell code is properly interrupted, and the flag will be polled much more frequently than under the first option. The disadvantage is that which signal handlers run depends on whether the signal arrives while Python or Haskell code is running. This may be quite confusing! But if all the signal handlers actually end up doing is raising UserInterrupt or KeyboardInterrupt, it's probably manageable.

(A final alternative is to do nothing, and just live with the fact that signals that are received while hyphen is executing Haskell code will not be processed by python until the Haskell code returns. This is obviously the lowest-overhead option.)

If you want to have hyphen service python signal handlers while Haskell code is running, do hyphen.hslowlevel.set_signal_mode_python(). If you want to have hyphen replace the OS signal handler while Haskell code is running so that Haskell signal handlers will process any signals received while Haskell code is running, do hyphen.hslowlevel.set_signal_mode_haskell(). If you want to be lazy and not check for signals at all until Haskell returns control to python, do hyphen.hslowlevel.set_signal_mode_lazy(). If you want to know the current state, do hyphen.hslowlevel.get_signal_mode() (which will return 'python', 'haskell' or 'lazy' as appropriate).
