10 changes: 5 additions & 5 deletions README.md
@@ -48,15 +48,15 @@ In the GIFs below, Modin (left) and pandas (right) perform *the same pandas oper
</tbody>
</table>

-The charts below show the speedup you get by replacing pandas with Modin based on the examples above. The example notebooks can be found [here](examples/jupyter). To learn more about the speedups you could get with Modin and try out some examples on your own, check out our [10-minute quickstart guide](https://modin.readthedocs.io/en/latest/getting_started/quickstart.html) to try out some examples on your own!
+The charts below show the speedup you get by replacing pandas with Modin based on the examples above. The example notebooks can be found [here](examples/jupyter). To learn more about the speedups you could get with Modin and try out some examples on your own, check out our [10-minute quickstart guide](https://modin.readthedocs.io/en/latest/getting_started/quickstart.html).

<img src="https://github.com/modin-project/modin/raw/7c009c747caa90554607e30b9ac2bd1b190b8c7d/docs/img/Modin_Speedup.svg" style="display: block;margin-left: auto;margin-right: auto;" width="100%"></img>

### Installation

#### From PyPI

-Modin can be installed with `pip` on Linux, Windows and MacOS:
+Modin can be installed with `pip` on Linux, Windows and macOS:

```bash
pip install "modin[all]" # (Recommended) Install Modin with Ray and Dask engines.
@@ -84,7 +84,7 @@ Modin automatically detects which engine(s) you have installed and uses that for

#### From conda-forge

-Installing from [conda forge](https://github.com/conda-forge/modin-feedstock) using `modin-all`
+Installing from [conda-forge](https://github.com/conda-forge/modin-feedstock) using `modin-all`
will install Modin and three engines: [Ray](https://github.com/ray-project/ray), [Dask](https://github.com/dask/dask) and
[MPI through unidist](https://github.com/modin-project/unidist).

@@ -114,7 +114,7 @@ To speed up conda installation we recommend using libmamba solver. To do this in
conda install -n base conda-libmamba-solver
```

-and then use it during istallation either like:
+and then use it during installation either like:

```bash
conda install -c conda-forge modin-ray --experimental-solver=libmamba
@@ -161,7 +161,7 @@ _Note: You should not change the engine after your first operation with Modin as

#### Which engine should I use?

-On Linux, MacOS, and Windows you can install and use either Ray, Dask or MPI through unidist. There is no knowledge required
+On Linux, macOS, and Windows you can install and use either Ray, Dask or MPI through unidist. There is no knowledge required
to use either of these engines as Modin abstracts away all of the complexity, so feel
free to pick either!

2 changes: 1 addition & 1 deletion docs/flow/modin/core/execution/dispatching.rst
@@ -8,7 +8,7 @@ Factories Module Description

Brief description
'''''''''''''''''
-Modin has several execution engines and storage formats, combining them together forms certain executions.
+Modin has several execution engines and storage formats; combining them forms certain executions.
Calling any :py:class:`~modin.pandas.dataframe.DataFrame` API function will end up in some execution-specific method. The responsibility of dispatching high-level API calls to
execution-specific function belongs to the :ref:`QueryCompiler <query_compiler_def>`, which is determined at the time of the dataframe's creation by the factory of
the corresponding execution. The mission of this module is to route IO function calls from
@@ -27,7 +27,7 @@ the :py:class:`~modin.core.execution.ray.implementations.pandas_on_ray.dataframe
generic functionality from the ``GenericRayDataframe`` and the :py:class:`~modin.core.dataframe.pandas.dataframe.dataframe.PandasDataframe`.

..
-TODO: insert a link to ``GenericRayDataframe`` once we add an implementatiton of the class
+TODO: insert a link to ``GenericRayDataframe`` once we add an implementation of the class

PandasOnRay Dataframe implementation
------------------------------------
@@ -79,4 +79,4 @@ and a new query compiler with the data read is returned.
When writing data to a CSV file, for example, the :py:class:`~modin.core.execution.ray.implementations.pandas_on_ray.io.PandasOnRayIO` processes
the user query to execute it on Ray workers. Then, the :py:class:`~modin.core.execution.ray.implementations.pandas_on_ray.io.PandasOnRayIO` asks the
:py:class:`~modin.core.execution.ray.implementations.pandas_on_ray.dataframe.PandasOnRayDataframe` to decompose the data into row-wise partitions
that will be written into the file in parallel in Ray workers.
that will be written into the file in parallel in Ray workers.
@@ -28,7 +28,7 @@ the :py:class:`~modin.core.execution.unidist.implementations.pandas_on_unidist.d
generic functionality from the ``GenericUnidistDataframe`` and the :py:class:`~modin.core.dataframe.pandas.dataframe.dataframe.PandasDataframe`.

..
-TODO: insert a link to ``GenericUnidistDataframe`` once we add an implementatiton of the class
+TODO: insert a link to ``GenericUnidistDataframe`` once we add an implementation of the class

PandasOnUnidist Dataframe implementation
----------------------------------------
@@ -80,4 +80,4 @@ and a new query compiler with the data read is returned.
When writing data to a CSV file, for example, the :py:class:`~modin.core.execution.unidist.implementations.pandas_on_unidist.io.PandasOnUnidistIO` processes
the user query to execute it on Unidist workers. Then, the :py:class:`~modin.core.execution.unidist.implementations.pandas_on_unidist.io.PandasOnUnidistIO` asks the
:py:class:`~modin.core.execution.unidist.implementations.pandas_on_unidist.dataframe.PandasOnUnidistDataframe` to decompose the data into row-wise partitions
that will be written into the file in parallel in Unidist workers.
that will be written into the file in parallel in Unidist workers.
14 changes: 7 additions & 7 deletions docs/getting_started/installation.rst
@@ -17,7 +17,7 @@ Installing with pip
Stable version
""""""""""""""

-Modin can be installed with ``pip`` on Linux, Windows and MacOS.
+Modin can be installed with ``pip`` on Linux, Windows and macOS.
To install the most recent stable release run the following:

.. code-block:: bash
@@ -96,7 +96,7 @@ Modin can be used with Google Colab_ via the ``pip`` command, by running the fol

!pip install "modin[all]"

-Since Colab preloads several of Modin's dependencies by default, we need to restart the Colab environment once Modin is installed by either clicking on the :code:`"RESTART RUNTIME"` button in the installation output or by run the following code:
+Since Colab preloads several of Modin's dependencies by default, we need to restart the Colab environment once Modin is installed by either clicking on the :code:`"RESTART RUNTIME"` button in the installation output or by running the following code:

.. code-block:: python

@@ -120,13 +120,13 @@ it is possible to install modin with chosen engine(s) alongside. Current options
+---------------------------------+---------------------------+-----------------------------+
| **Package name in conda-forge** | **Engine(s)** | **Supported OSs** |
+---------------------------------+---------------------------+-----------------------------+
-| modin | Dask_ | Linux, Windows, MacOS |
+| modin | Dask_ | Linux, Windows, macOS |
+---------------------------------+---------------------------+-----------------------------+
-| modin-dask | Dask | Linux, Windows, MacOS |
+| modin-dask | Dask | Linux, Windows, macOS |
+---------------------------------+---------------------------+-----------------------------+
| modin-ray | Ray_ | Linux, Windows |
+---------------------------------+---------------------------+-----------------------------+
-| modin-mpi | MPI_ through unidist_ | Linux, Windows, MacOS |
+| modin-mpi | MPI_ through unidist_ | Linux, Windows, macOS |
+---------------------------------+---------------------------+-----------------------------+
| modin-all | Dask, Ray, Unidist | Linux |
+---------------------------------+---------------------------+-----------------------------+
@@ -156,7 +156,7 @@ or explicitly:
Refer to `Installing with conda`_ section of the unidist documentation
for more details on how to install a specific MPI implementation to run on.

-``conda`` may be slow installing ``modin-all`` or combitations of execution engines so we currently recommend using libmamba solver for the installation process.
+``conda`` may be slow installing ``modin-all`` or combinations of execution engines, so we currently recommend using the libmamba solver for the installation process.
To do this install it in a base environment:

.. code-block:: bash
@@ -167,7 +167,7 @@ Then it can be used during installation either like

.. code-block:: bash

-conda install -c conda-forge modin-ray modin- --experimental-solver=libmamba
+conda install -c conda-forge modin-ray modin-dask modin-mpi --experimental-solver=libmamba

or starting from conda 22.11 and libmamba solver 22.12 versions

2 changes: 1 addition & 1 deletion docs/release_notes/release_notes-0.16.0.rst
@@ -72,7 +72,7 @@ Key Features and Updates
* PERF-#4773: Compute `lengths` and `widths` in `put` method of Dask partition like Ray do (#4780)
* PERF-#4732: Avoid overwriting already-evaluated `PandasOnRayDataframePartition._length_cache` and `PandasOnRayDataframePartition._width_cache` (#4754)
* PERF-#4862: Don't call `compute_sliced_len.remote` when `row_labels/col_labels == slice(None)` (#4863)
-* PERF-#4713: Stop overriding the ray MacOS object store size limit (#4792)
+* PERF-#4713: Stop overriding the Ray macOS object store size limit (#4792)
* PERF-#4851: Compute `dtypes` for binary operations that can only return bool type and the right operand is not a Modin object (#4852)
* PERF-#4842: `copy` should not trigger any previous computations (#4843)
* PERF-#4849: Compute `dtypes` in `concat` also for ROW_WISE case when possible (#4850)