Commit b5f0210

Merge pull request #1300 from well-typed/bolt12/827
Updated LowLevel/Functions manual documentation
2 parents 4851c7d + be83ba1 commit b5f0210

1 file changed

+260
-7
lines changed

manual/LowLevel/Functions.md

Lines changed: 260 additions & 7 deletions
@@ -2,13 +2,33 @@
22

33
## Introduction
44

5-
TODO
5+
This chapter discusses how `hs-bindgen` generates bindings for C functions.
66

7-
## Function pointers
7+
## Safe vs unsafe foreign imports
88

9-
TODO: introduction to function pointers
9+
When importing a C function, GHC allows us to choose between two safety
10+
levels, `safe` and `unsafe` (referred to below as calling conventions). The distinction is important:
1011

11-
1. For every C function, generate an additional binding for the address of that C function.
12+
* **Safe** foreign imports may call back into Haskell code. They are more
13+
  expensive, because the runtime must be prepared for Haskell code to run
  while the C call is in progress.
14+
* **Unsafe** foreign imports may not call back into Haskell. They are faster
15+
because they avoid this overhead, but using `unsafe` incorrectly can lead to
16+
undefined behavior.
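
As a minimal sketch of the difference, the same C function can be imported both ways. The example below uses `cos` from the C standard library rather than anything generated by `hs-bindgen`, and the names `cosSafe`, `cosUnsafe` and `sameResult` are made up for illustration.

```haskell
{-# LANGUAGE ForeignFunctionInterface #-}

import Foreign.C.Types (CDouble (..))

-- Unsafe import: lowest call overhead, but the C function must never call
-- back into Haskell. (cos never does, so this is fine here.)
foreign import ccall unsafe "math.h cos"
  cosUnsafe :: CDouble -> CDouble

-- Safe import of the same function: more overhead per call, but the callee
-- would be allowed to invoke Haskell callbacks.
foreign import ccall safe "math.h cos"
  cosSafe :: CDouble -> CDouble

-- Both imports compute the same result; only the call mechanism differs.
sameResult :: CDouble -> Bool
sameResult x = cosUnsafe x == cosSafe x
```

For a function like `cos` the `unsafe` import is the natural choice; `safe` becomes necessary as soon as the C function may invoke a Haskell callback.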
17+
18+
To give users control over this choice, `hs-bindgen` generates two separate
19+
modules for function bindings:
20+
21+
* `ModuleName.Safe` - contains all function imports using the `safe` calling
22+
convention
23+
* `ModuleName.Unsafe` - contains all function imports using the `unsafe`
24+
calling convention
25+
26+
Both modules export identical APIs, differing only in their calling convention.
27+
Users can import from whichever module best suits their needs.
28+
29+
## Function addresses
30+
31+
### Function pointers
1232

1333
In theory every C function is a candidate for being passed to other functions as
1434
a function pointer. For example, consider the following two (contrived)
@@ -63,7 +83,7 @@ main = do
6383

6484
[globals]:./Globals.md#Guidelines-for-binding-generation
6585

66-
## Implicit function to pointer conversion
86+
### Implicit function to pointer conversion
6787

6888
In C, functions are not "first-class citizens", but *pointers to functions* can
6989
be passed around freely. Typically, C code is explicit about the fact that it
@@ -193,13 +213,246 @@ apply1_union :: Apply1Union
193213
[creference:fun-decl]: https://en.cppreference.com/w/c/language/function_declaration.html#Explanation
194214
[creference:fun-ptr-conv]: https://en.cppreference.com/w/c/language/conversion.html#Function_to_pointer_conversion
195215
216+
## Conversion between Haskell functions and C functions
217+
218+
Beyond generating type definitions for function pointers and handling implicit
219+
conversions, `hs-bindgen` generates the additional FFI imports needed to
220+
convert between Haskell functions and C function pointers in both directions.
221+
222+
### Auxiliary `_Deref` types
223+
224+
For each typedef function pointer type in the C API, `hs-bindgen` generates
225+
two related types. Given:
226+
227+
```c
228+
typedef void (*ProgressUpdate)(int percentComplete);
229+
```
230+
231+
We generate:
232+
233+
```hs
234+
newtype ProgressUpdate_Deref = ProgressUpdate_Deref
235+
  { un_ProgressUpdate_Deref :: CInt -> IO ()
236+
  }
237+
238+
newtype ProgressUpdate = ProgressUpdate
239+
  { un_ProgressUpdate :: FunPtr ProgressUpdate_Deref
240+
  }
241+
```
242+
243+
The `_Deref` auxiliary type represents the Haskell function signature, while
244+
the main type wraps the `FunPtr` to that signature. This separation mirrors
245+
how C distinguishes between a function pointer and the function it points to.
246+
247+
### Wrapper and dynamic imports
248+
249+
For function pointer types that are actually used by the C API, `hs-bindgen`
250+
generates both `"wrapper"` and `"dynamic"` foreign import stubs. These provide
251+
bidirectional conversion between Haskell functions and C function pointers:
252+
253+
```hs
254+
-- Create a C-callable function pointer from a Haskell function
255+
foreign import ccall "wrapper" toProgressUpdate_Deref ::
256+
     ProgressUpdate_Deref
257+
  -> IO (FunPtr ProgressUpdate_Deref)
258+
259+
-- Convert a C function pointer back to a Haskell function
260+
foreign import ccall "dynamic" fromProgressUpdate_Deref ::
261+
     FunPtr ProgressUpdate_Deref
262+
  -> ProgressUpdate_Deref
263+
```
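
The two stubs can be exercised without any C code by sending a Haskell function through a C function pointer and back. The following is a sketch only: it repeats the declarations above by hand so that it is self-contained, and `roundTrip` is a hypothetical helper, not something `hs-bindgen` generates.

```haskell
{-# LANGUAGE ForeignFunctionInterface #-}

import Control.Exception (bracket)
import Data.IORef (newIORef, readIORef, writeIORef)
import Foreign.C.Types (CInt (..))
import Foreign.Ptr (FunPtr, freeHaskellFunPtr)

-- Hand-written copy of the generated auxiliary type.
newtype ProgressUpdate_Deref = ProgressUpdate_Deref
  { un_ProgressUpdate_Deref :: CInt -> IO ()
  }

foreign import ccall "wrapper" toProgressUpdate_Deref ::
     ProgressUpdate_Deref
  -> IO (FunPtr ProgressUpdate_Deref)

foreign import ccall "dynamic" fromProgressUpdate_Deref ::
     FunPtr ProgressUpdate_Deref
  -> ProgressUpdate_Deref

-- Wrap a Haskell callback, call it through the resulting C function pointer,
-- and report the value the callback received.
roundTrip :: CInt -> IO CInt
roundTrip percent = do
  seen <- newIORef 0
  let callback = ProgressUpdate_Deref (writeIORef seen)
  bracket (toProgressUpdate_Deref callback) freeHaskellFunPtr $ \funPtr ->
    un_ProgressUpdate_Deref (fromProgressUpdate_Deref funPtr) percent
  readIORef seen
```

Pointers created by a `"wrapper"` import stay allocated until `freeHaskellFunPtr` is called, which is why the sketch uses `bracket`.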
264+
265+
These stubs are abstracted over two type classes in order to offer a better
266+
API to the end user. The following instances are also generated:
267+
268+
```hs
269+
instance ToFunPtr ProgressUpdate_Deref where
270+
  toFunPtr = toProgressUpdate_Deref
271+
272+
instance FromFunPtr ProgressUpdate_Deref where
273+
  fromFunPtr = fromProgressUpdate_Deref
274+
```
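
To make the shape of these instances concrete, here is a self-contained sketch. The class definitions and `withToFunPtr` below are assumptions inferred from the instances above; the real definitions live in the `hs-bindgen` runtime library and may differ. `Scale_Deref` and `doubleViaFunPtr` are hypothetical names used only for this example.

```haskell
{-# LANGUAGE ForeignFunctionInterface #-}

import Control.Exception (bracket)
import Foreign.C.Types (CInt (..))
import Foreign.Ptr (FunPtr, freeHaskellFunPtr)

-- Assumed class shapes, inferred from the generated instances.
class ToFunPtr a where
  toFunPtr :: a -> IO (FunPtr a)

class FromFunPtr a where
  fromFunPtr :: FunPtr a -> a

-- Assumed bracket-style helper: the pointer is freed when the action ends,
-- even if the action throws.
withToFunPtr :: ToFunPtr a => a -> (FunPtr a -> IO b) -> IO b
withToFunPtr f = bracket (toFunPtr f) freeHaskellFunPtr

-- Hypothetical function type standing in for a generated `_Deref` newtype.
newtype Scale_Deref = Scale_Deref { un_Scale_Deref :: CInt -> IO CInt }

foreign import ccall "wrapper" toScale_Deref ::
  Scale_Deref -> IO (FunPtr Scale_Deref)

foreign import ccall "dynamic" fromScale_Deref ::
  FunPtr Scale_Deref -> Scale_Deref

instance ToFunPtr Scale_Deref where
  toFunPtr = toScale_Deref

instance FromFunPtr Scale_Deref where
  fromFunPtr = fromScale_Deref

-- Wrap a doubling function, call it through the pointer, then free it.
doubleViaFunPtr :: CInt -> IO CInt
doubleViaFunPtr n =
  withToFunPtr (Scale_Deref (\x -> pure (2 * x))) $ \funPtr ->
    un_Scale_Deref (fromFunPtr funPtr) n
```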
275+
276+
A function pointer will have a `ToFunPtr` and `FromFunPtr` instance if at
277+
least one of its arguments contains at least one domain-specific type. This
278+
check is done recursively, so higher-order functions will be inspected
279+
correctly.
280+
281+
For the purpose of instance generation, **domain-specific types** are types
282+
defined in the generated bindings for the specific C library being bound,
283+
such as:
284+
- Structs and their fields
285+
- Enums and typedefs
286+
- Function pointer wrapper types
287+
288+
Conversely, **non-domain-specific types** are standard FFI types from GHC's
289+
base libraries, such as `CInt`, `CDouble`, `Ptr a`, `IO ()`, etc.
290+
291+
For example:
292+
- `ProgressUpdate_Deref` with type `CInt -> IO ()` **will** get instances
293+
because `ProgressUpdate_Deref` itself is domain-specific.
294+
- A hypothetical function pointer type `CInt -> IO CInt` **will not** get
295+
instances because both `CInt` and `IO CInt` are non-domain-specific.
296+
297+
This distinction is important to avoid orphan instances and to prevent
298+
generating multiple instances for the same type signature when binding
299+
different C libraries.
300+
301+
#### Wrapping Haskell functions
302+
303+
To pass a Haskell function as a callback to C, use `toFunPtr` or the
304+
`withToFunPtr` bracket combinator:
305+
306+
```hs
307+
import Control.Exception (bracket)
import Foreign.Ptr (freeHaskellFunPtr)
import HsBindgen.Runtime.FunPtr (withToFunPtr)
308+
309+
myCallback :: ProgressUpdate_Deref
310+
myCallback = ProgressUpdate_Deref $ \progress ->
311+
  putStrLn $ "Progress: " ++ show progress ++ "%"
312+
313+
-- Preferred: automatic cleanup with withToFunPtr
314+
withToFunPtr myCallback $ \funPtr -> do
315+
  onProgressChanged (ProgressUpdate funPtr)
316+
317+
-- Or manually manage the function pointer lifetime with bracket
318+
bracket
319+
  (toFunPtr myCallback)
320+
  freeHaskellFunPtr
321+
  (\funPtr -> onProgressChanged (ProgressUpdate funPtr))
322+
```
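
The same pattern can be tried against a real C function that takes a callback. The sketch below uses `qsort` from the C standard library instead of generated bindings, so the `Compare` type, `mkCompare`, `c_qsort` and `sortWithC` are all hand-written, hypothetical names. Note that `qsort` must be imported `safe`, because it calls back into Haskell through the comparator.

```haskell
{-# LANGUAGE ForeignFunctionInterface #-}

import Control.Exception (bracket)
import Foreign.C.Types (CInt (..), CSize (..))
import Foreign.Marshal.Array (peekArray, withArrayLen)
import Foreign.Ptr (FunPtr, Ptr, freeHaskellFunPtr)
import Foreign.Storable (peek, sizeOf)

-- Comparator type expected by qsort, specialised to arrays of C ints.
type Compare = Ptr CInt -> Ptr CInt -> IO CInt

foreign import ccall "wrapper"
  mkCompare :: Compare -> IO (FunPtr Compare)

-- Must be `safe`: qsort invokes the Haskell comparator during the call.
foreign import ccall safe "stdlib.h qsort"
  c_qsort :: Ptr CInt -> CSize -> CSize -> FunPtr Compare -> IO ()

-- Haskell comparison exposed to C: negative, zero or positive result.
compareInts :: Compare
compareInts pa pb = do
  a <- peek pa
  b <- peek pb
  pure $ case compare a b of
    LT -> -1
    EQ -> 0
    GT -> 1

-- Sort a list in C, freeing the callback pointer once qsort has returned.
sortWithC :: [CInt] -> IO [CInt]
sortWithC xs =
  withArrayLen xs $ \len arr ->
    bracket (mkCompare compareInts) freeHaskellFunPtr $ \cmp -> do
      c_qsort arr (fromIntegral len) (fromIntegral (sizeOf (0 :: CInt))) cmp
      peekArray len arr
```

Freeing the pointer right after the call is only correct because `qsort` does not keep the callback; a C API that stores the pointer needs it to stay alive for as long as C may call it.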
323+
324+
#### Unwrapping function pointers
325+
326+
To call a function pointer returned from C, use `fromFunPtr`:
327+
328+
```hs
329+
do
330+
  validatorFunPtr <- getValidator
331+
  -- validatorFunPtr :: DataValidator
332+
333+
  -- Extract the FunPtr and convert to Haskell function
334+
  let validator = fromFunPtr (un_DataValidator validatorFunPtr)
335+
  result <- un_DataValidator_Deref validator 42
336+
```
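
A self-contained way to see a `"dynamic"` import at work is to take the address of an existing C function and call it through that pointer. This sketch uses `abs` from the C standard library; `absPtr`, `callIntFun` and `absViaPointer` are hypothetical names, not generated code.

```haskell
{-# LANGUAGE ForeignFunctionInterface #-}

import Foreign.C.Types (CInt (..))
import Foreign.Ptr (FunPtr)

-- Address of the C standard library's `abs`, imported as a function pointer.
foreign import ccall "stdlib.h &abs"
  absPtr :: FunPtr (CInt -> IO CInt)

-- "dynamic" import: turns a function pointer of this type into a callable
-- Haskell function.
foreign import ccall "dynamic"
  callIntFun :: FunPtr (CInt -> IO CInt) -> CInt -> IO CInt

-- Call `abs` indirectly, through its address.
absViaPointer :: CInt -> IO CInt
absViaPointer = callIntFun absPtr
```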
337+
338+
### Example: struct with function pointer fields
339+
340+
Function pointers frequently appear as struct fields for registering handlers:
341+
342+
```c
343+
struct MeasurementHandler {
344+
    void (*onReceived)(struct Measurement *data);
345+
    int (*validate)(struct Measurement *data);
346+
    void (*onError)(int errorCode);
347+
};
348+
349+
void registerHandler(struct MeasurementHandler *handler);
350+
```
351+
352+
In Haskell we can use the `ToFunPtr` class to construct the
353+
`MeasurementHandler` record.
354+
355+
```hs
356+
alloca $ \handlerPtr -> do
357+
  onReceivedPtr <- toFunPtr $ OnReceived_Deref $ \dataPtr -> do
358+
    measurement <- peek dataPtr
359+
    print measurement
360+
361+
  validatePtr <- toFunPtr $ Validate_Deref $ \dataPtr -> do
362+
    -- validation logic
363+
    return 1
364+
365+
  onErrorPtr <- toFunPtr $ OnError_Deref $ \errorCode ->
366+
    putStrLn $ "Error: " ++ show errorCode
367+
368+
  poke handlerPtr $ MeasurementHandler
369+
    { measurementHandler_onReceived = onReceivedPtr
370+
    , measurementHandler_validate = validatePtr
371+
    , measurementHandler_onError = onErrorPtr
372+
    }
373+
374+
  registerHandler handlerPtr
375+
```
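
The snippet above never frees the function pointers it creates. One possible way to tie their lifetime to a scope is sketched below, under several assumptions: `Handler` is a simplified two-field record written by hand (not generated output), `Measurement` is an opaque stand-in, and the `Storable` instance relies on all function pointers having the same size and alignment. Instead of calling into C, the sketch reads the struct back and calls the `validate` field itself.

```haskell
{-# LANGUAGE ForeignFunctionInterface #-}

import Control.Exception (bracket)
import Foreign.C.Types (CInt (..))
import Foreign.Marshal.Alloc (alloca)
import Foreign.Ptr (FunPtr, Ptr, freeHaskellFunPtr, nullPtr)
import Foreign.Storable (Storable (..))

-- Hypothetical opaque stand-in for `struct Measurement`.
data Measurement

type Validate = Ptr Measurement -> IO CInt
type OnError  = CInt -> IO ()

-- Simplified two-field handler (hand-written, not generated output).
data Handler = Handler
  { handlerValidate :: FunPtr Validate
  , handlerOnError  :: FunPtr OnError
  }

-- Byte offset of the second field: one function pointer past the first.
slot :: Int
slot = sizeOf (undefined :: FunPtr ())

-- Function pointers share size and alignment, so the fields sit back to back.
instance Storable Handler where
  sizeOf _    = 2 * slot
  alignment _ = alignment (undefined :: FunPtr ())
  peek p = Handler <$> peekByteOff p 0 <*> peekByteOff p slot
  poke p (Handler v e) = pokeByteOff p 0 v >> pokeByteOff p slot e

foreign import ccall "wrapper" mkValidate :: Validate -> IO (FunPtr Validate)
foreign import ccall "wrapper" mkOnError  :: OnError  -> IO (FunPtr OnError)
foreign import ccall "dynamic" callValidate :: FunPtr Validate -> Validate

-- Build the struct, run the action, then free both callbacks.
withHandler :: Validate -> OnError -> (Ptr Handler -> IO a) -> IO a
withHandler validate onError action =
  bracket (mkValidate validate) freeHaskellFunPtr $ \v ->
    bracket (mkOnError onError) freeHaskellFunPtr $ \e ->
      alloca $ \handlerPtr -> do
        poke handlerPtr (Handler v e)
        action handlerPtr

-- Stand-in for C code: read the struct and call its `validate` field.
validateThroughStruct :: IO CInt
validateThroughStruct =
  withHandler (\_ -> pure 1) (\_ -> pure ()) $ \handlerPtr -> do
    handler <- peek handlerPtr
    callValidate (handlerValidate handler) nullPtr
```

This scoped style only fits C functions that use the handler during the call. If the C library keeps the struct pointer or the callbacks after the registration function returns, both need to be allocated with a longer lifetime.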
376+
196377
## Userland CAPI
197378
198-
TODO
379+
GHC's foreign function interface has limitations on which C functions can be
380+
imported directly. For example, GHC cannot import functions that take or return
381+
structs by value. To work around these limitations, `hs-bindgen` generates C
382+
wrapper functions that can be imported by GHC, and then generates Haskell
383+
bindings to these wrappers instead of to the original C functions.
384+
385+
This approach is similar to GHC's `capi` calling convention, which also
386+
generates C wrappers to handle features that the FFI cannot express directly.
387+
However, by generating these wrappers ourselves at the userland level, we can
388+
extend the set of supported function signatures beyond what GHC's `capi`
389+
provides. For instance, we can handle by-value struct arguments and return
390+
values, which `capi` does not support.
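
For comparison, this is what GHC's own `capi` calling convention looks like in use. The sketch is independent of `hs-bindgen`: it imports the `INT_MAX` macro and the `abs` function from the C standard library, and the Haskell names `c_INT_MAX` and `c_abs` are chosen for illustration.

```haskell
{-# LANGUAGE CApiFFI #-}

import Foreign.C.Types (CInt (..))

-- INT_MAX is a preprocessor macro, so there is no symbol for `ccall` to link
-- against; with `capi`, GHC generates a small C stub that reads it.
foreign import capi "limits.h value INT_MAX"
  c_INT_MAX :: CInt

-- `abs` imported through a GHC-generated C wrapper instead of a direct call.
foreign import capi "stdlib.h abs"
  c_abs :: CInt -> CInt
```

The wrappers generated by `hs-bindgen` play the same role as these stubs, but are emitted by the tool rather than by GHC.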
391+
392+
The generated wrappers use hash-based names to prevent potential name
393+
collisions. For example, a C function `print_point` might have a wrapper named
394+
`hs_bindgen_test_example_a1b2c3d4e5f6g7h8`:
395+
396+
```c
397+
// Generated C wrapper
398+
void hs_bindgen_test_example_a1b2c3d4e5f6g7h8 ( struct point * arg1 ) {
399+
  print_point ( arg1 );
400+
}
401+
```
402+
403+
```hs
404+
-- Generated Haskell import
405+
foreign import ccall "hs_bindgen_test_example_a1b2c3d4e5f6g7h8"
406+
  print_point :: Ptr Point -> IO ()
407+
```
408+
409+
This hash-based naming ensures that even if multiple functions have similar
410+
names or signatures, their wrappers will have unique names, avoiding linker
411+
errors from symbol collisions.
412+
413+
This userland CAPI approach is used for all function imports in `hs-bindgen`,
414+
not just those with features GHC cannot handle directly. This provides a
415+
uniform interface and makes it straightforward to add support for additional
416+
C features in the future.
199417
200418
### By-value `struct` arguments or return values
201419
202-
TODO
420+
The GHC FFI does not support passing structs by value to or from C functions.
421+
For example, consider:
422+
423+
```c
424+
struct point byval ( struct point p );
425+
```
426+
427+
This function cannot be imported directly. Instead, `hs-bindgen` generates a C
428+
wrapper that accepts and returns structs by pointer, performing the necessary
429+
conversions:
430+
431+
```c
432+
void hs_bindgen_example_9a8b7c6d5e4f3210 ( struct point * arg
433+
, struct point * res ) {
434+
  * res = byval (* arg );
435+
}
436+
```
437+
438+
This wrapper is then imported in Haskell:
439+
440+
```hs
441+
foreign import ccall safe "hs_bindgen_example_9a8b7c6d5e4f3210"
442+
  byval_wrapper :: Ptr Point -> Ptr Point -> IO ()
443+
```
444+
445+
Finally, we generate a Haskell wrapper function that recovers the original
446+
by-value semantics using `with`, `alloca`, and `peek`:
447+
448+
```hs
449+
byval :: Point -> IO Point
450+
byval p =
451+
  with p $ \arg ->
452+
    alloca $ \res -> do
453+
      byval_wrapper arg res
454+
      peek res
455+
```
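
The marshalling pattern can be tried without compiling any C by replacing the imported wrapper with a Haskell stand-in. In the sketch below, `Point` and its `Storable` instance are written by hand, assuming `struct point { int x; int y; }`, and `byval_wrapper` is a pure-Haskell substitute that swaps the two coordinates; the real generated code would import the C wrapper instead.

```haskell
import Foreign.C.Types (CInt)
import Foreign.Marshal.Alloc (alloca)
import Foreign.Marshal.Utils (with)
import Foreign.Ptr (Ptr)
import Foreign.Storable (Storable (..))

-- Hand-written stand-in for `struct point { int x; int y; }`.
data Point = Point { point_x :: CInt, point_y :: CInt }
  deriving (Eq, Show)

-- Byte offset of the second field: one C int past the first.
yOffset :: Int
yOffset = sizeOf (undefined :: CInt)

instance Storable Point where
  sizeOf _    = 2 * yOffset
  alignment _ = alignment (undefined :: CInt)
  peek p = Point <$> peekByteOff p 0 <*> peekByteOff p yOffset
  poke p (Point x y) = pokeByteOff p 0 x >> pokeByteOff p yOffset y

-- Stand-in for the imported C wrapper: reads *arg and writes the point with
-- swapped coordinates to *res.
byval_wrapper :: Ptr Point -> Ptr Point -> IO ()
byval_wrapper arg res = do
  Point x y <- peek arg
  poke res (Point y x)

-- Same shape as the generated Haskell wrapper: copy the argument into
-- temporary memory, reserve space for the result, call, and read it back.
byval :: Point -> IO Point
byval p =
  with p $ \arg ->
    alloca $ \res -> do
      byval_wrapper arg res
      peek res
```

Both temporary buffers are released as soon as `with` and `alloca` return, so the caller only ever sees ordinary Haskell values.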
203456
204457
### Static inline functions
205458
