Commit 77bdab6

Prepare for 0.6 (#513)
* Update README
* Update changelog
1 parent 4a0f504 commit 77bdab6

3 files changed: +139 -8 lines changed


CHANGELOG.md

Lines changed: 39 additions & 0 deletions
@@ -1,5 +1,44 @@
 # Changelog

+## [0.6.0] - 2024-04-21
+
+### New! :sparkles:
+
+- Class-based API + concurrent streams + column selections + File reader by @H-Plus-Time in https://github.com/kylebarron/parquet-wasm/pull/407. This added a new `ParquetFile` API for working with files at remote URLs without downloading them first.
+- Conditional exports in `package.json`. This should make it easier to use across Node and browser.
+- Improved documentation for how to use different entry points.
+
+### Breaking Changes:
+
+- arrow2 and parquet2-based implementation has been removed.
+- Layout of files has changed. Your import may need to change.
+- Imports are now `parquet-wasm`, `parquet-wasm/esm`, `parquet-wasm/bundler`, and `parquet-wasm/node`.
+
+## What's Changed
+
+- Add conditional exports by @kylebarron in https://github.com/kylebarron/parquet-wasm/pull/382
+- CI production build size summary by @H-Plus-Time in https://github.com/kylebarron/parquet-wasm/pull/401
+- Remove arrow2 implementation by @kylebarron in https://github.com/kylebarron/parquet-wasm/pull/446
+- feat: add lz4_raw support for `arrow1` by @fspoettel in https://github.com/kylebarron/parquet-wasm/pull/466
+- Highlight that esm entry point needs await of default export by @kylebarron in https://github.com/kylebarron/parquet-wasm/pull/487
+- Fixes for both report builds and PR comment workflow by @H-Plus-Time in https://github.com/kylebarron/parquet-wasm/pull/495
+- fix package exports by @kylebarron in https://github.com/kylebarron/parquet-wasm/pull/414
+- Object store wasm usage by @H-Plus-Time in https://github.com/kylebarron/parquet-wasm/pull/490
+- Set Parquet key-value metadata by @kylebarron in https://github.com/kylebarron/parquet-wasm/pull/503
+- Read parquet with options by @kylebarron in https://github.com/kylebarron/parquet-wasm/pull/506
+- Documentation updates for 0.6 by @kylebarron in https://github.com/kylebarron/parquet-wasm/pull/507
+- Avoid bigint for metadata queries by @kylebarron in https://github.com/kylebarron/parquet-wasm/pull/508
+- Update async API by @kylebarron in https://github.com/kylebarron/parquet-wasm/pull/510
+- Add test to read empty file by @kylebarron in https://github.com/kylebarron/parquet-wasm/pull/512
+- bump arrow libraries to version 51 by @jdoig in https://github.com/kylebarron/parquet-wasm/pull/496
+
+## New Contributors
+
+- @fspoettel made their first contribution in https://github.com/kylebarron/parquet-wasm/pull/466
+- @jdoig made their first contribution in https://github.com/kylebarron/parquet-wasm/pull/496
+
+**Full Changelog**: https://github.com/kylebarron/parquet-wasm/compare/v0.5.0...v0.6.0
+
 ## [0.5.0] - 2023-10-21

 ## What's Changed

README.md

Lines changed: 95 additions & 8 deletions
@@ -22,20 +22,107 @@ npm install parquet-wasm

 ## API

-### Choice of bundles
+Parquet-wasm has both a synchronous and asynchronous API. The sync API is simpler but requires fetching the entire Parquet buffer in advance, which is often prohibitive.

-| Entry point | Description | Documentation |
-| ---------------------- | ------------------------------------------------------- | -------------------- |
-| `parquet-wasm` | ESM, to be used directly from the Web as an ES Module | [Link][esm-docs] |
-| `parquet-wasm/esm` | ESM, to be used directly from the Web as an ES Module | [Link][esm-docs] |
-| `parquet-wasm/bundler` | "Bundler" build, to be used in bundlers such as Webpack | [Link][bundler-docs] |
-| `parquet-wasm/node` | Node build, to be used with `require` in NodeJS | [Link][node-docs] |
+### Sync API
+
+Refer to these functions:
+
+- [`readParquet`](https://kylebarron.dev/parquet-wasm/functions/esm_parquet_wasm.readParquet.html): Read a Parquet file synchronously.
+- [`readSchema`](https://kylebarron.dev/parquet-wasm/functions/esm_parquet_wasm.readSchema.html): Read an Arrow schema from a Parquet file synchronously.
+- [`writeParquet`](https://kylebarron.dev/parquet-wasm/functions/esm_parquet_wasm.writeParquet.html): Write a Parquet file synchronously.
+
+### Async API
+
+- [`readParquetStream`](https://kylebarron.dev/parquet-wasm/functions/esm_parquet_wasm.readParquetStream.html): Create a [ReadableStream](https://developer.mozilla.org/en-US/docs/Web/API/ReadableStream) that emits Arrow RecordBatches from a Parquet file.
+- [`ParquetFile`](https://kylebarron.dev/parquet-wasm/classes/esm_parquet_wasm.ParquetFile.html): A class for reading portions of a remote Parquet file. Use [`fromUrl`](https://kylebarron.dev/parquet-wasm/classes/esm_parquet_wasm.ParquetFile.html#fromUrl) to construct from a remote URL or [`fromFile`](https://kylebarron.dev/parquet-wasm/classes/esm_parquet_wasm.ParquetFile.html#fromFile) to construct from a [`File`](https://developer.mozilla.org/en-US/docs/Web/API/File) handle. Note that when you're done using this class, you'll need to call [`free`](https://kylebarron.dev/parquet-wasm/classes/esm_parquet_wasm.ParquetFile.html#free) to release any memory held by the ParquetFile instance itself.
+
+
+Both sync and async functions return or accept a [`Table`](https://kylebarron.dev/parquet-wasm/classes/bundler_parquet_wasm.Table.html) class, an Arrow table in WebAssembly memory. Refer to its documentation for moving data into/out of WebAssembly.
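
Because a `ParquetFile` keeps WebAssembly memory alive until `free` is called, one way to avoid leaks is to wrap its use in `try`/`finally`. The following is a minimal sketch, not part of parquet-wasm: `withParquetFile` is a hypothetical helper name, `open` is any function that returns (or resolves to) an object with a `free` method, for example `() => ParquetFile.fromUrl(url)`, and `use` is your own callback.

```js
// Hypothetical helper, not part of parquet-wasm: run `use` against the
// object produced by `open`, and always call `free()` afterwards.
async function withParquetFile(open, use) {
  // `open` may be sync or async, e.g. () => ParquetFile.fromUrl(url)
  const file = await open();
  try {
    return await use(file);
  } finally {
    // Release the memory held by the instance itself, even if `use` threw.
    file.free();
  }
}
```

The callback's return value is passed through, so the helper can be awaited wherever the result of reading is needed.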
+
+## Entry Points
+
+
+| Entry point | Description | Documentation |
+| ------------------------------------------------------------------------- | ------------------------------------------------------- | -------------------- |
+| `parquet-wasm`, `parquet-wasm/esm`, or `parquet-wasm/esm/parquet_wasm.js` | ESM, to be used directly from the Web as an ES Module | [Link][esm-docs] |
+| `parquet-wasm/bundler` | "Bundler" build, to be used in bundlers such as Webpack | [Link][bundler-docs] |
+| `parquet-wasm/node` | Node build, to be used with synchronous `require` in NodeJS | [Link][node-docs] |

 [bundler-docs]: https://kylebarron.dev/parquet-wasm/modules/bundler_parquet_wasm.html
 [node-docs]: https://kylebarron.dev/parquet-wasm/modules/node_parquet_wasm.html
 [esm-docs]: https://kylebarron.dev/parquet-wasm/modules/esm_parquet_wasm.html
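
To make the mapping from entry point to file concrete, here is a deliberately simplified sketch of how a conditional `exports` map can be resolved. It is not Node's actual resolution algorithm (no subpath patterns, no validation), the function names are hypothetical, and the `exportsMap` literal is an illustrative assumption modeled on the entry points listed above rather than a copy of the published `package.json`.

```js
// Simplified resolver: walk a target, picking the first key that is an
// active condition (or "default"), in the map's own key order.
function resolveTarget(target, conditions) {
  if (typeof target === "string") return target;
  if (target === null || typeof target !== "object") return undefined;
  for (const key of Object.keys(target)) {
    if (key === "default" || conditions.includes(key)) {
      const resolved = resolveTarget(target[key], conditions);
      if (resolved !== undefined) return resolved;
    }
  }
  return undefined;
}

// Look up one subpath ("." or "./node", ...) under the given conditions.
function resolveExport(exportsMap, subpath, conditions) {
  return resolveTarget(exportsMap[subpath], conditions);
}

// Illustrative map only; the real package.json may differ.
const exportsMap = {
  "./bundler": { default: "./bundler/parquet_wasm.js" },
  "./node": { default: "./node/parquet_wasm.js" },
  "./esm/parquet_wasm.js": { default: "./esm/parquet_wasm.js" },
  ".": {
    node: { default: "./node/parquet_wasm.js" },
    default: "./esm/parquet_wasm.js",
  },
};
```

Under these assumptions, `resolveExport(exportsMap, ".", ["node"])` selects the Node build, while any environment without the `node` condition falls through to the ESM build.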

-**Note that when using the `esm` bundles, the default export must be awaited**. Otherwise, you'll get an error `TypeError: Cannot read properties of undefined`. See [here](https://rustwasm.github.io/docs/wasm-bindgen/examples/without-a-bundler.html) for an example.
+### ESM
+
+The `esm` entry point is the primary entry point. It is the default export from `parquet-wasm`, and is also accessible at `parquet-wasm/esm` and `parquet-wasm/esm/parquet_wasm.js` (for symmetric imports [directly from a browser](#using-directly-from-a-browser)).
+
+**Note that when using the `esm` bundles, you must manually initialize the WebAssembly module before using any APIs**. Otherwise, you'll get an error `TypeError: Cannot read properties of undefined`. There are multiple ways to initialize the WebAssembly code:
+
+#### Asynchronous initialization
+
+The primary way to initialize is by awaiting the default export.
+
+```js
+import wasmInit, {readParquet} from "parquet-wasm";
+
+await wasmInit();
+```
+
+Without any parameter, this will try to fetch a file named `parquet_wasm_bg.wasm` from the same location as `parquet-wasm`, resolving it with `new URL('parquet_wasm_bg.wasm', import.meta.url)`.
+
+Note that you can also pass in a custom URL if you want to host the `.wasm` file on your own servers.
+
+```js
+import wasmInit, {readParquet} from "parquet-wasm";
+
+// Update this version to match the version you're using.
+const wasmUrl = "https://cdn.jsdelivr.net/npm/parquet-wasm@0.6.0/esm/parquet_wasm_bg.wasm";
+await wasmInit(wasmUrl);
+```
+
+#### Synchronous initialization
+
+The `initSync` named export allows for synchronous initialization from a buffer that already holds the contents of the `.wasm` file.
+
+```js
+import {initSync, readParquet} from "parquet-wasm";
+
+// The contents of esm/parquet_wasm_bg.wasm in an ArrayBuffer
+const wasmBuffer = new ArrayBuffer(...);
+
+// Initialize the Wasm synchronously
+initSync(wasmBuffer);
+```
+
+Async initialization should be preferred over downloading the Wasm buffer and then initializing it synchronously, as [`WebAssembly.instantiateStreaming`](https://developer.mozilla.org/en-US/docs/WebAssembly/JavaScript_interface/instantiateStreaming_static) is the most efficient way to both download and initialize Wasm code.
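
Since every API call depends on initialization having completed, code with several entry points may want to share a single initialization promise. The following is a minimal sketch under stated assumptions: `createInitOnce` is a hypothetical helper (not part of parquet-wasm), `init` stands for the default export of the `esm` entry point, and the optional `wasmUrl` is forwarded only when provided so the no-argument lookup still applies otherwise.

```js
// Hypothetical helper: wrap an async init function so that repeated
// calls share one in-flight (or completed) initialization.
function createInitOnce(init, wasmUrl) {
  let pending = null;
  return function initOnce() {
    if (pending === null) {
      // Forward the URL only when one was given.
      const started = wasmUrl === undefined ? init() : init(wasmUrl);
      // Clear the cached promise on failure so a later call can retry.
      pending = Promise.resolve(started).catch((err) => {
        pending = null;
        throw err;
      });
    }
    return pending;
  };
}
```

A module could then export `const ensureWasm = createInitOnce(wasmInit)` and have each caller `await ensureWasm()` before touching the API.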
+
+### Bundler
+
+The `bundler` entry point doesn't require manual initialization of the WebAssembly blob, but needs setup with whatever bundler you're using. [Refer to the Rust Wasm documentation for more info](https://rustwasm.github.io/docs/wasm-bindgen/reference/deployment.html#bundlers).
+
+### Node
+
+The `node` entry point can be loaded synchronously from Node.
+
+```js
+const {readParquet} = require("parquet-wasm");
+
+const wasmTable = readParquet(...);
+```
+
+### Using directly from a browser
+
+You can load the `esm/parquet_wasm.js` file directly from a CDN:
+
+```js
+const parquet = await import(
+  "https://cdn.jsdelivr.net/npm/parquet-wasm@0.6.0/esm/parquet_wasm.js"
+)
+await parquet.default();
+
+const wasmTable = parquet.readParquet(...);
+```
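
When both the JavaScript module and the `.wasm` file are loaded from a CDN, keeping their versions in sync matters. As a small sketch, assuming the jsDelivr layout `parquet-wasm@<version>/esm/...`, a hypothetical helper (not part of parquet-wasm) can derive both URLs from one version string:

```js
// Hypothetical helper: build matching jsDelivr URLs for a single version,
// so the JS glue code and the .wasm binary cannot drift apart.
function cdnUrls(version) {
  const base = `https://cdn.jsdelivr.net/npm/parquet-wasm@${version}/esm`;
  return {
    module: `${base}/parquet_wasm.js`,
    wasm: `${base}/parquet_wasm_bg.wasm`,
  };
}
```

The `module` URL would be passed to a dynamic `import()` and the `wasm` URL to the module's default export.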

 ### Debug functions

templates/package.json

Lines changed: 5 additions & 0 deletions
@@ -20,6 +20,7 @@
     "webassembly",
     "arrow"
   ],
+  "$comment": "We export ./esm/parquet_wasm.js so that code can work the same bundled and directly on the frontend",
   "exports": {
     "./bundler": {
       "types": "./bundler/parquet_wasm.d.ts",
@@ -33,6 +34,10 @@
       "types": "./node/parquet_wasm.d.ts",
       "default": "./node/parquet_wasm.js"
     },
+    "./esm/parquet_wasm.js": {
+      "types": "./esm/parquet_wasm.d.ts",
+      "default": "./esm/parquet_wasm.js"
+    },
     ".": {
       "node": {
         "types": "./node/parquet_wasm.d.ts",
