CHANGELOG.md: 39 additions & 0 deletions

@@ -1,5 +1,44 @@
 # Changelog

+## [0.6.0] - 2024-04-21
+
+### New! :sparkles:
+
+- Class-based API + concurrent streams + column selections + File reader by @H-Plus-Time in https://github.com/kylebarron/parquet-wasm/pull/407. This added a new `ParquetFile` API for working with files at remote URLs without downloading them first.
+- Conditional exports in `package.json`. This should make it easier to use across Node and browser.
+- Improved documentation for how to use different entry points.
+
+### Breaking Changes:
+
+- arrow2 and parquet2-based implementation has been removed.
+- Layout of files has changed. Your import may need to change.
+- Imports are now `parquet-wasm`, `parquet-wasm/esm`, `parquet-wasm/bundler`, and `parquet-wasm/node`.
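The new import paths rely on Node-style conditional exports. As a rough illustration of how a resolver picks a file for each path, the sketch below walks a hypothetical `exports` map. The map contents (including the `node/parquet_wasm.js` and `bundler/parquet_wasm.js` paths) are assumptions for illustration and are not copied from the package's actual `package.json`; `resolveExport` is likewise a hypothetical helper, and real resolvers also handle nested conditions and subpath patterns.

```javascript
// Hypothetical exports map; the real package.json may differ.
const exportsMap = {
  ".": {
    import: "./esm/parquet_wasm.js",
    require: "./node/parquet_wasm.js",
  },
  "./esm": "./esm/parquet_wasm.js",
  "./bundler": "./bundler/parquet_wasm.js",
  "./node": "./node/parquet_wasm.js",
};

// Simplified resolution: for a conditional target, return the first key
// (in object order) that is an active condition or "default".
function resolveExport(map, subpath, conditions) {
  const target = map[subpath];
  if (target === undefined) {
    throw new Error(`Subpath ${subpath} is not exported`);
  }
  if (typeof target === "string") {
    return target; // unconditional export
  }
  for (const [condition, file] of Object.entries(target)) {
    if (condition === "default" || conditions.includes(condition)) {
      return file;
    }
  }
  throw new Error(`No matching condition for ${subpath}`);
}
```

Under this assumed map, `resolveExport(exportsMap, ".", ["import"])` selects the ESM file, while the same path with `["require"]` selects the Node file.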
+
+## What's Changed
+
+- Add conditional exports by @kylebarron in https://github.com/kylebarron/parquet-wasm/pull/382
+- CI production build size summary by @H-Plus-Time in https://github.com/kylebarron/parquet-wasm/pull/401
+- Remove arrow2 implementation by @kylebarron in https://github.com/kylebarron/parquet-wasm/pull/446
+- feat: add lz4_raw support for `arrow1` by @fspoettel in https://github.com/kylebarron/parquet-wasm/pull/466
+- Highlight that esm entry point needs await of default export by @kylebarron in https://github.com/kylebarron/parquet-wasm/pull/487
+- Fixes for both report builds and PR comment workflow by @H-Plus-Time in https://github.com/kylebarron/parquet-wasm/pull/495
+- fix package exports by @kylebarron in https://github.com/kylebarron/parquet-wasm/pull/414
+- Object store wasm usage by @H-Plus-Time in https://github.com/kylebarron/parquet-wasm/pull/490
+- Set Parquet key-value metadata by @kylebarron in https://github.com/kylebarron/parquet-wasm/pull/503
+- Read parquet with options by @kylebarron in https://github.com/kylebarron/parquet-wasm/pull/506
+- Documentation updates for 0.6 by @kylebarron in https://github.com/kylebarron/parquet-wasm/pull/507
+- Avoid bigint for metadata queries by @kylebarron in https://github.com/kylebarron/parquet-wasm/pull/508
+- Update async API by @kylebarron in https://github.com/kylebarron/parquet-wasm/pull/510
+- Add test to read empty file by @kylebarron in https://github.com/kylebarron/parquet-wasm/pull/512
+- bump arrow libraries to version 51 by @jdoig in https://github.com/kylebarron/parquet-wasm/pull/496
+
+## New Contributors
+
+- @fspoettel made their first contribution in https://github.com/kylebarron/parquet-wasm/pull/466
+- @jdoig made their first contribution in https://github.com/kylebarron/parquet-wasm/pull/496
README.md: 95 additions & 8 deletions

@@ -22,20 +22,107 @@ npm install parquet-wasm

 ## API

-### Choice of bundles
+Parquet-wasm has both a synchronous and asynchronous API. The sync API is simpler but requires fetching the entire Parquet buffer in advance, which is often prohibitive.
[...]
-| `parquet-wasm` | ESM, to be used directly from the Web as an ES Module | [Link][esm-docs] |
-| `parquet-wasm/esm` | ESM, to be used directly from the Web as an ES Module | [Link][esm-docs] |
-| `parquet-wasm/bundler` | "Bundler" build, to be used in bundlers such as Webpack | [Link][bundler-docs] |
-| `parquet-wasm/node` | Node build, to be used with `require` in NodeJS | [Link][node-docs] |
+### Sync API
+
+Refer to these functions:
+
+- [`readParquet`](https://kylebarron.dev/parquet-wasm/functions/esm_parquet_wasm.readParquet.html): Read a Parquet file synchronously.
+- [`readSchema`](https://kylebarron.dev/parquet-wasm/functions/esm_parquet_wasm.readSchema.html): Read an Arrow schema from a Parquet file synchronously.
+- [`writeParquet`](https://kylebarron.dev/parquet-wasm/functions/esm_parquet_wasm.writeParquet.html): Write a Parquet file synchronously.
+
+### Async API
+
+- [`readParquetStream`](https://kylebarron.dev/parquet-wasm/functions/esm_parquet_wasm.readParquetStream.html): Create a [ReadableStream](https://developer.mozilla.org/en-US/docs/Web/API/ReadableStream) that emits Arrow RecordBatches from a Parquet file.
+- [`ParquetFile`](https://kylebarron.dev/parquet-wasm/classes/esm_parquet_wasm.ParquetFile.html): A class for reading portions of a remote Parquet file. Use [`fromUrl`](https://kylebarron.dev/parquet-wasm/classes/esm_parquet_wasm.ParquetFile.html#fromUrl) to construct from a remote URL or [`fromFile`](https://kylebarron.dev/parquet-wasm/classes/esm_parquet_wasm.ParquetFile.html#fromFile) to construct from a [`File`](https://developer.mozilla.org/en-US/docs/Web/API/File) handle. Note that when you're done using this class, you'll need to call [`free`](https://kylebarron.dev/parquet-wasm/classes/esm_parquet_wasm.ParquetFile.html#free) to release any memory held by the ParquetFile instance itself.
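Because `ParquetFile` instances must be released with `free`, it can help to tie the call to a `try`/`finally`. The following is a minimal sketch of that pattern, not part of parquet-wasm: `withParquetFile` is a hypothetical helper, and `open` stands for any async function that returns an object with a `free()` method (for example a call to `ParquetFile.fromUrl`).

```javascript
// Hypothetical helper: opens a file-like object, runs `use` on it, and
// always calls free() afterwards, even if `use` throws.
async function withParquetFile(open, use) {
  const file = await open();
  try {
    return await use(file);
  } finally {
    // Release the memory held by the instance itself.
    file.free();
  }
}
```

A call might look like `await withParquetFile(() => ParquetFile.fromUrl(url), (file) => doSomething(file))`, where `url` and `doSomething` are placeholders for your own code.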
+
+
+Both sync and async functions return or accept a [`Table`](https://kylebarron.dev/parquet-wasm/classes/bundler_parquet_wasm.Table.html) class, an Arrow table in WebAssembly memory. Refer to its documentation for moving data into/out of WebAssembly.
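As a small sketch of moving data out of WebAssembly memory after a synchronous read: the helper below takes `readParquet` as a parameter so it works with any entry point. It assumes the returned `Table` exposes an `intoIPCStream()` method that serializes the table to Arrow IPC stream bytes; check the `Table` documentation for the version you use. `parquetToArrowIPC` is a hypothetical name.

```javascript
// Hypothetical helper: read Parquet bytes and return Arrow IPC stream bytes.
// Assumption: Table has intoIPCStream(), which copies the data out of Wasm
// memory as a Uint8Array in Arrow IPC stream format.
function parquetToArrowIPC(readParquet, parquetBytes) {
  const wasmTable = readParquet(parquetBytes);
  return wasmTable.intoIPCStream();
}
```

The resulting bytes could then be handed to an Arrow implementation in JavaScript, for example `tableFromIPC` from the `apache-arrow` package.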
[...]
-**Note that when using the `esm` bundles, the default export must be awaited**. Otherwise, you'll get an error `TypeError: Cannot read properties of undefined`. See [here](https://rustwasm.github.io/docs/wasm-bindgen/examples/without-a-bundler.html) for an example.
+### ESM
+
+The `esm` entry point is the primary entry point. It is the default export from `parquet-wasm`, and is also accessible at `parquet-wasm/esm` and `parquet-wasm/esm/parquet_wasm.js` (for symmetric imports [directly from a browser](#using-directly-from-a-browser)).
+
+**Note that when using the `esm` bundles, you must manually initialize the WebAssembly module before using any APIs**. Otherwise, you'll get an error `TypeError: Cannot read properties of undefined`. There are multiple ways to initialize the WebAssembly code:
+
+#### Asynchronous initialization
+
+The primary way to initialize is by awaiting the default export.
+
+```js
+import wasmInit, {readParquet} from "parquet-wasm";
+
+await wasmInit();
+```
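When several modules in an application need the Wasm code, one option is to wrap the initializer so that it runs only once and every caller awaits the same promise. This is a generic sketch, not something the README prescribes; `once` is a hypothetical helper and `init` stands for the default export shown above.

```javascript
// Hypothetical helper: wrap an async initializer so repeated calls share
// one promise instead of triggering initialization again.
function once(init) {
  let promise;
  return (...args) => {
    if (promise === undefined) {
      promise = init(...args);
    }
    return promise;
  };
}
```

With this, `const ensureWasm = once(wasmInit);` followed by `await ensureWasm();` at each call site would initialize at most once. Note that a rejected promise is cached as well, so a failed initialization is not retried in this sketch.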
+
+Without any parameter, this will try to fetch a file named `'parquet_wasm_bg.wasm'` at the same location as `parquet-wasm`. (E.g. this snippet `input = new URL('parquet_wasm_bg.wasm', import.meta.url);`).
+
+Note that you can also pass in a custom URL if you want to host the `.wasm` file on your own servers.
+
+```js
+import wasmInit, {readParquet} from "parquet-wasm";
+
+// Update this version to match the version you're using.
[...]
+// The contents of esm/parquet_wasm_bg.wasm in an ArrayBuffer
+const wasmBuffer = new ArrayBuffer(...);
+
+// Initialize the Wasm synchronously
+initSync(wasmBuffer)
+```
+
+Async initialization should be preferred over downloading the Wasm buffer and then initializing it synchronously, as [`WebAssembly.instantiateStreaming`](https://developer.mozilla.org/en-US/docs/WebAssembly/JavaScript_interface/instantiateStreaming_static) is the most efficient way to both download and initialize Wasm code.
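To make the difference concrete, here is a minimal sketch of the two instantiation paths using only the standard `WebAssembly` JavaScript API. It is not parquet-wasm's actual glue code; the function names and the tiny empty module are assumptions for illustration.

```javascript
// Smallest valid Wasm module: the "\0asm" magic number plus version 1.
const EMPTY_WASM = new Uint8Array([0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00]);

// Synchronous path: all bytes must already be in memory before compiling.
function instantiateSync(bytes, imports = {}) {
  const module = new WebAssembly.Module(bytes);
  return new WebAssembly.Instance(module, imports);
}

// Streaming path: compilation can begin while the response is downloading.
// `source` is a Response (or a promise of one) served as application/wasm.
async function instantiateStreamed(source, imports = {}) {
  const { instance } = await WebAssembly.instantiateStreaming(source, imports);
  return instance;
}
```

In a browser, the streaming variant would typically be called as `instantiateStreamed(fetch(wasmUrl))`, where `wasmUrl` is a placeholder for the location of the `.wasm` file.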
+
+### Bundler
+
+The `bundler` entry point doesn't require manual initialization of the WebAssembly blob, but needs setup with whatever bundler you're using. [Refer to the Rust Wasm documentation for more info](https://rustwasm.github.io/docs/wasm-bindgen/reference/deployment.html#bundlers).
+
+### Node
+
+The `node` entry point can be loaded synchronously from Node.
+
+```js
+const {readParquet} = require("parquet-wasm");
+
+const wasmTable = readParquet(...);
+```
+
+### Using directly from a browser
+
+You can load the `esm/parquet_wasm.js` file directly from a CDN