|
2 | 2 |
|
3 | 3 | ## Introduction |
4 | 4 |
|
5 | | -TODO |
| 5 | +This chapter discusses how `hs-bindgen` generates bindings for C functions. |
6 | 6 |
|
7 | | -## Function pointers |
| 7 | +## Safe vs unsafe foreign imports |
8 | 8 |
|
9 | | -TODO: introduction to function pointers |
| 9 | +When importing a C function, GHC allows us to choose between two calling |
| 10 | +conventions: `safe` and `unsafe`. The distinction is important: |
10 | 11 |
|
11 | | -1. For every C function, generate an additional binding for the address of that C function. |
| 12 | +* **Safe** foreign imports may call back into Haskell code. They are more |
| 13 | + expensive, because the runtime does extra bookkeeping around each call. |
| 14 | +* **Unsafe** foreign imports may not call back into Haskell. They are faster |
| 15 | + because they skip this bookkeeping, but using `unsafe` incorrectly can lead to |
| 16 | + undefined behavior. |
| 17 | + |
| 18 | +To give users control over this choice, `hs-bindgen` generates two separate |
| 19 | +modules for function bindings: |
| 20 | + |
| 21 | +* `ModuleName.Safe` - contains all function imports using the `safe` calling |
| 22 | + convention |
| 23 | +* `ModuleName.Unsafe` - contains all function imports using the `unsafe` |
| 24 | + calling convention |
| 25 | + |
| 26 | +Both modules export identical APIs, differing only in their calling convention. |
| 27 | +Users can import from whichever module best suits their needs. |
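As a minimal sketch of what the two calling conventions look like at the source level, the following imports the same C function twice. It uses `cos` from the C math library purely for illustration; the Haskell names `cosSafe` and `cosUnsafe` are hypothetical and are not output of `hs-bindgen`.

```hs
import Foreign.C.Types (CDouble)

-- Safe import: the callee is allowed to call back into Haskell,
-- at the price of extra work around every call.
foreign import ccall safe "math.h cos"
  cosSafe :: CDouble -> CDouble

-- Unsafe import: cheaper per call, but the callee must never
-- call back into Haskell.
foreign import ccall unsafe "math.h cos"
  cosUnsafe :: CDouble -> CDouble
```

Both bindings have the same Haskell type, which is why two modules with identical APIs can differ only in the calling convention.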
| 28 | + |
| 29 | +## Function addresses |
| 30 | + |
| 31 | +### Function pointers |
12 | 32 |
|
13 | 33 | In theory every C function is a candidate for being passed to other functions as |
14 | 34 | a function pointer. For example, consider the following two (contrived) |
@@ -63,7 +83,7 @@ main = do |
63 | 83 |
|
64 | 84 | [globals]:./Globals.md#Guidelines-for-binding-generation |
65 | 85 |
|
66 | | -## Implicit function to pointer conversion |
| 86 | +### Implicit function to pointer conversion |
67 | 87 |
|
68 | 88 | In C, functions are not "first-class citizens", but *pointers to functions* can |
69 | 89 | be passed around freely. Typically, C code is explicit about the fact that it |
@@ -193,13 +213,246 @@ apply1_union :: Apply1Union |
193 | 213 | [creference:fun-decl]: https://en.cppreference.com/w/c/language/function_declaration.html#Explanation |
194 | 214 | [creference:fun-ptr-conv]: https://en.cppreference.com/w/c/language/conversion.html#Function_to_pointer_conversion |
195 | 215 |
|
| 216 | +## Conversion between Haskell functions and C functions |
| 217 | +
|
| 218 | +Beyond generating type definitions for function pointers and handling implicit |
| 219 | +conversions, `hs-bindgen` generates the additional FFI imports needed to |
| 220 | +convert between Haskell functions and C function pointers in both directions. |
| 221 | +
|
| 222 | +### Auxiliary `_Deref` types |
| 223 | +
|
| 224 | +For each typedef function pointer type in the C API, `hs-bindgen` generates |
| 225 | +two related types. Given: |
| 226 | +
|
| 227 | +```c |
| 228 | +typedef void (*ProgressUpdate)(int percentComplete); |
| 229 | +``` |
| 230 | +
|
| 231 | +We generate: |
| 232 | +
|
| 233 | +```hs |
| 234 | +newtype ProgressUpdate_Deref = ProgressUpdate_Deref |
| 235 | + { un_ProgressUpdate_Deref :: CInt -> IO () |
| 236 | + } |
| 237 | +
|
| 238 | +newtype ProgressUpdate = ProgressUpdate |
| 239 | + { un_ProgressUpdate :: FunPtr ProgressUpdate_Deref |
| 240 | + } |
| 241 | +``` |
| 242 | +
|
| 243 | +The `_Deref` auxiliary type represents the Haskell function signature, while |
| 244 | +the main type wraps the `FunPtr` to that signature. This separation mirrors |
| 245 | +how C distinguishes between a function pointer and the function it points to. |
| 246 | +
|
| 247 | +### Wrapper and dynamic imports |
| 248 | +
|
| 249 | +For function pointer types that are actually used by the C API, `hs-bindgen` |
| 250 | +generates both `"wrapper"` and `"dynamic"` foreign import stubs. These provide |
| 251 | +bidirectional conversion between Haskell functions and C function pointers: |
| 252 | +
|
| 253 | +```hs |
| 254 | +-- Create a C-callable function pointer from a Haskell function |
| 255 | +foreign import ccall "wrapper" toProgressUpdate_Deref :: |
| 256 | + ProgressUpdate_Deref |
| 257 | + -> IO (FunPtr ProgressUpdate_Deref) |
| 258 | +
|
| 259 | +-- Convert a C function pointer back to a Haskell function |
| 260 | +foreign import ccall "dynamic" fromProgressUpdate_Deref :: |
| 261 | + FunPtr ProgressUpdate_Deref |
| 262 | + -> ProgressUpdate_Deref |
| 263 | +``` |
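To see the two directions work together without any C code, here is a small self-contained sketch. It assumes a plain type synonym `ProgressFn` instead of the generated `_Deref` newtype, and the names `mkProgressFn`, `callProgressFn`, and `roundTrip` are made up for this illustration.

```hs
import Data.IORef (newIORef, readIORef, writeIORef)
import Foreign.C.Types (CInt)
import Foreign.Ptr (FunPtr, freeHaskellFunPtr)

-- Hypothetical stand-in for the function type behind a callback.
type ProgressFn = CInt -> IO ()

-- Haskell function -> C function pointer
foreign import ccall "wrapper"
  mkProgressFn :: ProgressFn -> IO (FunPtr ProgressFn)

-- C function pointer -> Haskell function
foreign import ccall "dynamic"
  callProgressFn :: FunPtr ProgressFn -> ProgressFn

-- Wrap a Haskell callback, call it through the pointer, and
-- return the value the callback recorded.
roundTrip :: CInt -> IO CInt
roundTrip n = do
  ref <- newIORef 0
  fp  <- mkProgressFn (writeIORef ref)
  callProgressFn fp n
  freeHaskellFunPtr fp  -- not exception-safe; fine for a sketch
  readIORef ref
```

Pointers created by a `"wrapper"` import must be released with `freeHaskellFunPtr` once C no longer needs them.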
| 264 | +
|
| 265 | +These stubs are abstracted over two type classes in order to offer a better |
| 266 | +API to the end user. The following instances are also generated: |
| 267 | +
|
| 268 | +```hs |
| 269 | +instance ToFunPtr ProgressUpdate_Deref where |
| 270 | + toFunPtr = toProgressUpdate_Deref |
| 271 | +
|
| 272 | +instance FromFunPtr ProgressUpdate_Deref where |
| 273 | + fromFunPtr = fromProgressUpdate_Deref |
| 274 | +``` |
| 275 | +
|
| 276 | +A function pointer type gets `ToFunPtr` and `FromFunPtr` instances if at |
| 277 | +least one of its arguments contains at least one domain-specific type. This |
| 278 | +check is recursive, so higher-order functions are inspected |
| 279 | +correctly. |
| 280 | +
|
| 281 | +For the purpose of instance generation, **domain-specific types** are types |
| 282 | +defined in the generated bindings for the specific C library being bound, |
| 283 | +such as: |
| 284 | +- Structs and their fields |
| 285 | +- Enums and typedefs |
| 286 | +- Function pointer wrapper types |
| 287 | +
|
| 288 | +Conversely, **non-domain-specific types** are standard FFI types from GHC's |
| 289 | +base libraries, such as `CInt`, `CDouble`, `Ptr a`, `IO ()`, etc. |
| 290 | +
|
| 291 | +For example: |
| 292 | +- `ProgressUpdate_Deref` with type `CInt -> IO ()` **will** get instances |
| 293 | + because `ProgressUpdate_Deref` itself is domain-specific. |
| 294 | +- A hypothetical function pointer type `CInt -> IO CInt` **will not** get |
| 295 | + instances because both `CInt` and `IO CInt` are non-domain-specific. |
| 296 | +
|
| 297 | +This distinction is important to avoid orphan instances and to prevent |
| 298 | +generating multiple instances for the same type signature when binding |
| 299 | +different C libraries. |
| 300 | +
|
| 301 | +#### Wrapping Haskell functions |
| 302 | +
|
| 303 | +To pass a Haskell function as a callback to C, use `toFunPtr` or the |
| 304 | +`withToFunPtr` bracket combinator: |
| 305 | +
|
| 306 | +```hs |
| 307 | +import HsBindgen.Runtime.FunPtr (withToFunPtr) |
| 308 | +
|
| 309 | +myCallback :: ProgressUpdate_Deref |
| 310 | +myCallback = ProgressUpdate_Deref $ \progress -> |
| 311 | + putStrLn $ "Progress: " ++ show progress ++ "%" |
| 312 | +
|
| 313 | +-- Preferred: automatic cleanup with withToFunPtr |
| 314 | +withToFunPtr myCallback $ \funPtr -> do |
| 315 | + onProgressChanged (ProgressUpdate funPtr) |
| 316 | +
|
| 317 | +-- Or manually manage the function pointer lifetime with bracket |
| 318 | +bracket |
| 319 | + (toFunPtr myCallback) |
| 320 | + freeHaskellFunPtr |
| 321 | + (\funPtr -> onProgressChanged (ProgressUpdate funPtr)) |
| 322 | +``` |
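The class-based API can be approximated in a self-contained way as follows. This is only a sketch: the two classes are hypothetical stand-ins with method types inferred from the instances shown earlier, not the actual definitions in `HsBindgen.Runtime.FunPtr`, and `Callback`, `mkRaw`, `callRaw`, `withFunPtrSketch`, and `callThroughPointer` are invented names.

```hs
import Control.Exception (bracket)
import Foreign.C.Types (CInt)
import Foreign.Ptr (FunPtr, castFunPtr, freeHaskellFunPtr)

-- Hypothetical stand-ins for the runtime classes (assumption).
class ToFunPtr a where
  toFunPtr :: a -> IO (FunPtr a)

class FromFunPtr a where
  fromFunPtr :: FunPtr a -> a

-- Plays the role of a generated `_Deref` newtype.
newtype Callback = Callback { unCallback :: CInt -> IO CInt }

foreign import ccall "wrapper"
  mkRaw :: (CInt -> IO CInt) -> IO (FunPtr (CInt -> IO CInt))

foreign import ccall "dynamic"
  callRaw :: FunPtr (CInt -> IO CInt) -> CInt -> IO CInt

instance ToFunPtr Callback where
  toFunPtr (Callback f) = castFunPtr <$> mkRaw f

instance FromFunPtr Callback where
  fromFunPtr fp = Callback (callRaw (castFunPtr fp))

-- Bracket-style helper: allocate the pointer, run the action,
-- and free the pointer even if the action throws.
withFunPtrSketch :: ToFunPtr a => a -> (FunPtr a -> IO b) -> IO b
withFunPtrSketch x = bracket (toFunPtr x) freeHaskellFunPtr

-- Wrap a doubling callback and call it back through its pointer.
callThroughPointer :: CInt -> IO CInt
callThroughPointer n =
  withFunPtrSketch (Callback (\x -> pure (x * 2))) $ \fp ->
    unCallback (fromFunPtr fp) n
```

The bracket form is only appropriate when C does not keep the pointer after the call returns. A callback that C stores for later use has to be freed manually at a later point.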
| 323 | +
|
| 324 | +#### Unwrapping function pointers |
| 325 | +
|
| 326 | +To call a function pointer returned from C, use `fromFunPtr`: |
| 327 | +
|
| 328 | +```hs |
| 329 | +do |
| 330 | + validatorFunPtr <- getValidator |
| 331 | + -- validatorFunPtr :: DataValidator |
| 332 | +
|
| 333 | + -- Extract the FunPtr and convert to Haskell function |
| 334 | + let validator = fromFunPtr (un_DataValidator validatorFunPtr) |
| 335 | + result <- un_DataValidator_Deref validator 42 |
| 336 | +``` |
| 337 | +
|
| 338 | +### Example: struct with function pointer fields |
| 339 | +
|
| 340 | +Function pointers frequently appear as struct fields for registering handlers: |
| 341 | +
|
| 342 | +```c |
| 343 | +struct MeasurementHandler { |
| 344 | + void (*onReceived)(struct Measurement *data); |
| 345 | + int (*validate)(struct Measurement *data); |
| 346 | + void (*onError)(int errorCode); |
| 347 | +}; |
| 348 | +
|
| 349 | +void registerHandler(struct MeasurementHandler *handler); |
| 350 | +``` |
| 351 | +
|
| 352 | +In Haskell, we can use the `ToFunPtr` class to construct the |
| 353 | +`MeasurementHandler` record: |
| 354 | +
|
| 355 | +```hs |
| 356 | +alloca $ \handlerPtr -> do |
| 357 | + onReceivedPtr <- toFunPtr $ OnReceived_Deref $ \dataPtr -> do |
| 358 | + measurement <- peek dataPtr |
| 359 | + print measurement |
| 360 | +
|
| 361 | + validatePtr <- toFunPtr $ Validate_Deref $ \dataPtr -> do |
| 362 | + -- validation logic |
| 363 | + return 1 |
| 364 | +
|
| 365 | + onErrorPtr <- toFunPtr $ OnError_Deref $ \errorCode -> |
| 366 | + putStrLn $ "Error: " ++ show errorCode |
| 367 | +
|
| 368 | + poke handlerPtr $ MeasurementHandler |
| 369 | + { measurementHandler_onReceived = onReceivedPtr |
| 370 | + , measurementHandler_validate = validatePtr |
| 371 | + , measurementHandler_onError = onErrorPtr |
| 372 | + } |
| 373 | +
|
| 374 | + registerHandler handlerPtr |
| 375 | +``` |
| 376 | +
|
196 | 377 | ## Userland CAPI |
197 | 378 |
|
198 | | -TODO |
| 379 | +GHC's foreign function interface has limitations on which C functions can be |
| 380 | +imported directly. For example, GHC cannot import functions that take or return |
| 381 | +structs by value. To work around these limitations, `hs-bindgen` generates C |
| 382 | +wrapper functions that can be imported by GHC, and then generates Haskell |
| 383 | +bindings to these wrappers instead of to the original C functions. |
| 384 | +
|
| 385 | +This approach is similar to GHC's `capi` calling convention, which also |
| 386 | +generates C wrappers to handle features that the FFI cannot express directly. |
| 387 | +However, by generating these wrappers ourselves at the userland level, we can |
| 388 | +extend the set of supported function signatures beyond what GHC's `capi` |
| 389 | +provides. For instance, we can handle by-value struct arguments and return |
| 390 | +values, which `capi` does not support. |
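For comparison, this is roughly what GHC's built-in `capi` convention looks like next to a plain `ccall` import. The sketch uses `abs` from `stdlib.h` only as a convenient example; the binding names `absViaCapi` and `absViaCcall` are hypothetical.

```hs
{-# LANGUAGE CApiFFI #-}
import Foreign.C.Types (CInt)

-- With `capi`, GHC generates a small C stub that includes the
-- header and calls `abs` through it.
foreign import capi "stdlib.h abs"
  absViaCapi :: CInt -> CInt

-- With `ccall`, GHC calls the symbol directly and trusts the
-- Haskell type signature.
foreign import ccall "stdlib.h abs"
  absViaCcall :: CInt -> CInt
```

The userland approach described here follows the same idea as the `capi` stub, except that the stub is written out by `hs-bindgen` rather than by GHC.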
| 391 | +
|
| 392 | +The generated wrappers use hash-based names to prevent potential name |
| 393 | +collisions. For example, a C function `print_point` might have a wrapper named |
| 394 | +`hs_bindgen_test_example_a1b2c3d4e5f6g7h8`: |
| 395 | +
|
| 396 | +```c |
| 397 | +// Generated C wrapper |
| 398 | +void hs_bindgen_test_example_a1b2c3d4e5f6g7h8 ( struct point * arg1 ) { |
| 399 | + print_point ( arg1 ); |
| 400 | +} |
| 401 | +``` |
| 402 | +
|
| 403 | +```hs |
| 404 | +-- Generated Haskell import |
| 405 | +foreign import ccall "hs_bindgen_test_example_a1b2c3d4e5f6g7h8" |
| 406 | + print_point :: Ptr Point -> IO () |
| 407 | +``` |
| 408 | +
|
| 409 | +This hash-based naming ensures that even if multiple functions have similar |
| 410 | +names or signatures, their wrappers will have unique names, avoiding linker |
| 411 | +errors from symbol collisions. |
| 412 | +
|
| 413 | +This userland CAPI approach is used for all function imports in `hs-bindgen`, |
| 414 | +not just those with features GHC cannot handle directly. This provides a |
| 415 | +uniform interface and makes it straightforward to add support for additional |
| 416 | +C features in the future. |
199 | 417 |
|
200 | 418 | ### By-value `struct` arguments or return values |
201 | 419 |
|
202 | | -TODO |
| 420 | +The GHC FFI does not support passing structs by value to or from C functions. |
| 421 | +For example, consider: |
| 422 | +
|
| 423 | +```c |
| 424 | +struct point byval ( struct point p ); |
| 425 | +``` |
| 426 | +
|
| 427 | +This function cannot be imported directly. Instead, `hs-bindgen` generates a C |
| 428 | +wrapper that accepts and returns structs by pointer, performing the necessary |
| 429 | +conversions: |
| 430 | +
|
| 431 | +```c |
| 432 | +void hs_bindgen_example_9a8b7c6d5e4f3210 ( struct point * arg |
| 433 | + , struct point * res ) { |
| 434 | + * res = byval (* arg ); |
| 435 | +} |
| 436 | +``` |
| 437 | +
|
| 438 | +This wrapper is then imported in Haskell: |
| 439 | +
|
| 440 | +```hs |
| 441 | +foreign import ccall safe "hs_bindgen_example_9a8b7c6d5e4f3210" |
| 442 | + byval_wrapper :: Ptr Point -> Ptr Point -> IO () |
| 443 | +``` |
| 444 | +
|
| 445 | +Finally, we generate a Haskell wrapper function that recovers the original |
| 446 | +by-value semantics using `with`, `alloca`, and `peek`: |
| 447 | +
|
| 448 | +```hs |
| 449 | +byval :: Point -> IO Point |
| 450 | +byval p = |
| 451 | + with p $ \ arg -> |
| 452 | + alloca $ \ res -> do |
| 453 | + byval_wrapper arg res |
| 454 | + peek res |
| 455 | +``` |
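The marshalling pattern can be tried out without a C compiler by replacing the imported wrapper with a Haskell stand-in. In the sketch below, the `Point` layout (two `CInt` fields), its `Storable` instance, and `byvalWrapperStub` are all assumptions made for illustration; in real bindings the wrapper is the foreign import and the instance is generated.

```hs
import Foreign.C.Types (CInt)
import Foreign.Marshal.Alloc (alloca)
import Foreign.Marshal.Utils (with)
import Foreign.Ptr (Ptr)
import Foreign.Storable (Storable (..))

-- Assumed layout: struct point { int x; int y; };
data Point = Point { pointX :: CInt, pointY :: CInt }
  deriving (Eq, Show)

instance Storable Point where
  sizeOf    _ = 2 * sizeOf (undefined :: CInt)
  alignment _ = alignment (undefined :: CInt)
  peek p =
    Point <$> peekByteOff p 0
          <*> peekByteOff p (sizeOf (undefined :: CInt))
  poke p (Point x y) = do
    pokeByteOff p 0 x
    pokeByteOff p (sizeOf (undefined :: CInt)) y

-- Stand-in for the imported C wrapper: reads *arg and writes a
-- point with swapped coordinates to *res.
byvalWrapperStub :: Ptr Point -> Ptr Point -> IO ()
byvalWrapperStub arg res = do
  Point x y <- peek arg
  poke res (Point y x)

-- Same shape as the generated Haskell wrapper: copy the argument
-- to temporary memory, reserve space for the result, read it back.
byval :: Point -> IO Point
byval p =
  with p $ \arg ->
    alloca $ \res -> do
      byvalWrapperStub arg res
      peek res
```

Both `with` and `alloca` free their temporary memory when the continuation returns, so the result has to be read with `peek` before leaving the inner block.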
203 | 456 |
|
204 | 457 | ### Static inline functions |
205 | 458 |
|
|