Commit fd45316

Update darray.md
1 parent 4887abd commit fd45316

1 file changed

+159
-0
lines changed

docs/src/darray.md

Lines changed: 159 additions & 0 deletions
@@ -211,6 +211,165 @@ across the workers in the Julia cluster in a relatively even distribution;
future operations on a `DArray` may produce a different distribution from the
one chosen by previous calls.

<!-- -->

### Explicit Processor Mapping of DArray Blocks

This feature lets you control how `DArray` blocks (chunks) are assigned to specific processors or threads within the cluster. Fine-grained control over data locality can be crucial for optimizing the performance of certain distributed algorithms.

You specify the mapping with the optional `assignment` argument of the `DArray` constructor functions (`DArray`, `DVector`, and `DMatrix`) and of the `distribute` function. In the examples below it is passed as the last positional argument.

The `assignment` argument accepts the following values:

* `:arbitrary` (default):

    * If `assignment` is omitted or set to the symbol `:arbitrary`, Dagger's scheduler assigns blocks to processors automatically.
* `:blockcyclic`:

    * If `assignment` is set to `:blockcyclic`, `DArray` blocks are assigned to processors in a block-cyclic manner. Blocks are distributed cyclically across processors, iterating through the processors in increasing rank along the *last* dimension of the block distribution.
    * Any other symbol used for `assignment` results in an error.
* `AbstractArray{<:Int, N}`:

    * Provide an N-dimensional array of integer worker IDs. The dimension `N` must match the number of dimensions of the `DArray`.
    * Dagger maps blocks to worker IDs in a block-cyclic manner. The block at index `(i, j, ...)` is assigned to the first thread of the worker with ID `assignment[i, j, ...]`. When there are more blocks than entries in `assignment` along a dimension, the pattern repeats cyclically until all blocks are assigned.
* `AbstractArray{<:Processor, N}`:

    * Provide an N-dimensional array of `Processor` objects. The dimension `N` must match the number of dimensions of the `DArray`.
    * Blocks are mapped in a block-cyclic manner according to the `Processor` objects in the `assignment` array. The block at index `(i, j, ...)` is assigned to the processor at `assignment[i, j, ...]`. When there are more blocks than entries in `assignment` along a dimension, the pattern repeats cyclically until all blocks are assigned.

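The block-cyclic rule for array-valued assignments can be sketched in plain Julia. This is only an illustration of the index arithmetic described above, not Dagger's implementation; `tile_assignment` is a hypothetical helper name, and the sketch assumes the pattern wraps around with 1-based modular indexing.

```julia
# Illustrative sketch only: `tile_assignment` is a hypothetical helper, not part of Dagger.
# It tiles an N-dimensional assignment array over an N-dimensional grid of blocks.
function tile_assignment(assignment::AbstractArray{T,N}, nblocks::NTuple{N,Int}) where {T,N}
    # mod1 wraps each 1-based block index back into the bounds of `assignment`
    return [assignment[mod1.(Tuple(I), size(assignment))...] for I in CartesianIndices(nblocks)]
end

# Five blocks along one dimension, four entries in the assignment vector:
# the fifth block wraps around to the first entry.
owners = tile_assignment([2, 3, 1, 4], (5,))
println(owners)
```

The same helper works for any number of dimensions, as long as the assignment array and the block grid have the same `N`.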
#### Examples and Usage

The `assignment` argument works the same way for `DArray`, `DVector`, `DMatrix`, and the `distribute` function. The key difference lies in the dimensionality of the resulting distributed array:

* `DArray`: For N-dimensional distributed arrays.

* `DVector`: Specifically for 1-dimensional distributed arrays.

* `DMatrix`: Specifically for 2-dimensional distributed arrays.

* `distribute`: General function to distribute arrays of any dimensionality.

The examples below use a setup with one master process and three worker processes, so the process IDs are 1 through 4.

First, let's create some sample arrays:

```julia
using Distributed; addprocs(3)  # three workers, so process IDs are 1 through 4
@everywhere using Dagger

A = rand(7, 11)    # 2D array
v = rand(15)       # 1D array
M = rand(5, 5, 5)  # 3D array
```

1. **Arbitrary Assignment:**

```julia
Ad = distribute(A, Blocks(2, 2), :arbitrary)
# or: DMatrix(A, Blocks(2, 2), :arbitrary)

vd = distribute(v, Blocks(3), :arbitrary)
# or: DVector(v, Blocks(3), :arbitrary)

Md = distribute(M, Blocks(2, 2, 2), :arbitrary)
# or: DArray(M, Blocks(2, 2, 2), :arbitrary)
```

This creates distributed arrays with the specified block sizes and lets Dagger choose the processor for each block, so the layout is not fixed in advance. For example, the assignment for `Ad` might look like this:

```julia
4×6 Matrix{Dagger.ThreadProc}:
ThreadProc(4, 1) ThreadProc(3, 1) ThreadProc(3, 1) ThreadProc(2, 1) ThreadProc(4, 1) ThreadProc(3, 1)
ThreadProc(3, 1) ThreadProc(4, 1) ThreadProc(3, 1) ThreadProc(4, 1) ThreadProc(2, 1) ThreadProc(2, 1)
ThreadProc(2, 1) ThreadProc(2, 1) ThreadProc(2, 1) ThreadProc(3, 1) ThreadProc(4, 1) ThreadProc(4, 1)
ThreadProc(2, 1) ThreadProc(4, 1) ThreadProc(4, 1) ThreadProc(3, 1) ThreadProc(2, 1) ThreadProc(3, 1)
```

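The 4×6 shape of the assignment matrix for `Ad` follows from the array size and the block size: each dimension is divided by the block length and rounded up, so the trailing blocks are smaller than 2×2. A quick check in plain Julia (no Dagger calls involved; the variable names are illustrative):

```julia
# Number of blocks per dimension for a 7×11 array split into 2×2 blocks.
array_size = (7, 11)
block_size = (2, 2)

# cld is ceiling division: a partial trailing block still counts as a block
nblocks = cld.(array_size, block_size)
println(nblocks)
```

By the same arithmetic, the 15-element vector with `Blocks(3)` has 5 blocks, and the 5×5×5 array with `Blocks(2, 2, 2)` has 3 blocks along each dimension.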
2. **Block-Cyclic Assignment:**

```julia
Ad = distribute(A, Blocks(2, 2), :blockcyclic)
# or: DMatrix(A, Blocks(2, 2), :blockcyclic)

vd = distribute(v, Blocks(3), :blockcyclic)
# or: DVector(v, Blocks(3), :blockcyclic)

Md = distribute(M, Blocks(2, 2, 2), :blockcyclic)
# or: DArray(M, Blocks(2, 2, 2), :blockcyclic)
```

This assigns blocks cyclically across the available processors, in order of increasing rank, along the last dimension of the block grid. For the 2D case (`Ad`), every block in a given column lands on the same processor, and the assignment looks like this:

```julia
4×6 Matrix{Dagger.ThreadProc}:
ThreadProc(1, 1) ThreadProc(2, 1) ThreadProc(3, 1) ThreadProc(4, 1) ThreadProc(1, 1) ThreadProc(2, 1)
ThreadProc(1, 1) ThreadProc(2, 1) ThreadProc(3, 1) ThreadProc(4, 1) ThreadProc(1, 1) ThreadProc(2, 1)
ThreadProc(1, 1) ThreadProc(2, 1) ThreadProc(3, 1) ThreadProc(4, 1) ThreadProc(1, 1) ThreadProc(2, 1)
ThreadProc(1, 1) ThreadProc(2, 1) ThreadProc(3, 1) ThreadProc(4, 1) ThreadProc(1, 1) ThreadProc(2, 1)
```

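The `:blockcyclic` layout shown for `Ad` can be reproduced with a few lines of plain Julia. This is a sketch that mirrors the documented output, assuming four processes with IDs 1 through 4; it is not how Dagger computes the assignment internally.

```julia
# Sketch: cycle through worker IDs along the last dimension of the block grid.
nworkers = 4        # assumed: process IDs 1 through 4
nblocks = (4, 6)    # block grid of `Ad`

# Only the last coordinate of each block index selects the worker
owners = [mod1(last(Tuple(I)), nworkers) for I in CartesianIndices(nblocks)]
display(owners)
```

Each row of `owners` reads `1 2 3 4 1 2`, which matches the worker IDs in the `ThreadProc` matrix for `Ad`.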
3. **Block-Cyclic Assignment with Integer Array:**

```julia
assignment_2d = [3 1; 4 2]
Ad = distribute(A, Blocks(2, 2), assignment_2d)
# or: DMatrix(A, Blocks(2, 2), [3 1; 4 2])

assignment_1d = [2, 3, 1, 4]
vd = distribute(v, Blocks(3), assignment_1d)
# or: DVector(v, Blocks(3), [2, 3, 1, 4])

assignment_3d = cat([1 2; 3 4], [4 3; 2 1], dims=3)
Md = distribute(M, Blocks(2, 2, 2), assignment_3d)
# or: DArray(M, Blocks(2, 2, 2), cat([1 2; 3 4], [4 3; 2 1], dims=3))
```

Here, each assignment array defines a grid of worker IDs that is tiled over the block grid. For example, `assignment_2d` is a 2×2 grid of workers for the 2D array: block `(2, 1)` goes to worker 4, and the pattern repeats every two blocks in each dimension.

The assignment for `Ad` would be:

```julia
4×6 Matrix{Dagger.ThreadProc}:
ThreadProc(3, 1) ThreadProc(1, 1) ThreadProc(3, 1) ThreadProc(1, 1) ThreadProc(3, 1) ThreadProc(1, 1)
ThreadProc(4, 1) ThreadProc(2, 1) ThreadProc(4, 1) ThreadProc(2, 1) ThreadProc(4, 1) ThreadProc(2, 1)
ThreadProc(3, 1) ThreadProc(1, 1) ThreadProc(3, 1) ThreadProc(1, 1) ThreadProc(3, 1) ThreadProc(1, 1)
ThreadProc(4, 1) ThreadProc(2, 1) ThreadProc(4, 1) ThreadProc(2, 1) ThreadProc(4, 1) ThreadProc(2, 1)
```

4. **Block-Cyclic Assignment with Processor Array:**

```julia
assignment_2d = [Dagger.ThreadProc(3, 1) Dagger.ThreadProc(1, 1);
                 Dagger.ThreadProc(4, 1) Dagger.ThreadProc(2, 1)]
Ad = distribute(A, Blocks(2, 2), assignment_2d)
# or: DMatrix(A, Blocks(2, 2), assignment_2d)

assignment_1d = [Dagger.ThreadProc(2, 1), Dagger.ThreadProc(3, 1), Dagger.ThreadProc(1, 1), Dagger.ThreadProc(4, 1)]
vd = distribute(v, Blocks(3), assignment_1d)
# or: DVector(v, Blocks(3), assignment_1d)

assignment_3d = cat([Dagger.ThreadProc(1, 1) Dagger.ThreadProc(2, 1); Dagger.ThreadProc(3, 1) Dagger.ThreadProc(4, 1)],
                    [Dagger.ThreadProc(4, 1) Dagger.ThreadProc(3, 1); Dagger.ThreadProc(2, 1) Dagger.ThreadProc(1, 1)], dims=3)
Md = distribute(M, Blocks(2, 2, 2), assignment_3d)
# or: DArray(M, Blocks(2, 2, 2), assignment_3d)
```

When the assignment is an array of `Processor` objects, each block is placed directly on the processor given by the tiled pattern. For `Ad`, the assignment is:

```julia
4×6 Matrix{Dagger.ThreadProc}:
ThreadProc(3, 1) ThreadProc(1, 1) ThreadProc(3, 1) ThreadProc(1, 1) ThreadProc(3, 1) ThreadProc(1, 1)
ThreadProc(4, 1) ThreadProc(2, 1) ThreadProc(4, 1) ThreadProc(2, 1) ThreadProc(4, 1) ThreadProc(2, 1)
ThreadProc(3, 1) ThreadProc(1, 1) ThreadProc(3, 1) ThreadProc(1, 1) ThreadProc(3, 1) ThreadProc(1, 1)
ThreadProc(4, 1) ThreadProc(2, 1) ThreadProc(4, 1) ThreadProc(2, 1) ThreadProc(4, 1) ThreadProc(2, 1)
```

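Because the 4×6 block grid of `Ad` is an exact multiple of the 2×2 processor grid, the tiled layout above is simply the processor grid repeated twice vertically and three times horizontally. A minimal check in plain Julia, using integer worker IDs as stand-ins for the `ThreadProc` objects (an assumption made so the snippet runs without Dagger):

```julia
# Stand-in for the 2×2 processor grid: worker IDs instead of ThreadProc objects.
ids = [3 1; 4 2]

# Repeat the grid 2 times along rows and 3 times along columns to cover 4×6 blocks
layout = repeat(ids, outer=(2, 3))
display(layout)
```

When the block grid is not an exact multiple of the assignment array, the repeated pattern is cut off at the edge of the block grid.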
<!-- -->

## Broadcasting

As the `DArray` is a subtype of `AbstractArray` and generally satisfies Julia's
