Better way to read a lot of CSV files #3

@riedel

Currently I am using a bag that reduces to a local data frame. See my Stack Overflow question and answer: https://stackoverflow.com/questions/64512040/how-to-aggregate-large-number-of-small-csv-files-50k-efficiently-code-size/64517641

With a partitioning strategy, it should be possible to build a distributed data frame instead (needed when the data is not reduced heavily enough to fit in a local data frame).
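
One possible partitioning strategy, as a rough sketch rather than a settled design: group the small files into fixed-size batches and let each batch become one partition of the distributed data frame. This assumes Dask is the library in use (the "bag" wording suggests it, but the issue does not say so) and that all files share the same columns. The names `chunk_paths`, `load_batch`, `build_distributed_frame`, and the default of 500 files per partition are hypothetical.

```python
from itertools import islice


def chunk_paths(paths, files_per_partition):
    """Yield lists of at most `files_per_partition` paths, in order."""
    if files_per_partition < 1:
        raise ValueError("files_per_partition must be at least 1")
    it = iter(paths)
    while batch := list(islice(it, files_per_partition)):
        yield batch


def load_batch(batch):
    """Read every CSV in one batch and concatenate into a pandas frame."""
    import pandas as pd  # imported lazily so chunk_paths stays stdlib-only

    return pd.concat((pd.read_csv(p) for p in batch), ignore_index=True)


def build_distributed_frame(paths, files_per_partition=500):
    """Build a Dask data frame with one partition per batch of files.

    Assumption: Dask is installed and every file has the same schema.
    Nothing is read here except what Dask needs to infer column types.
    """
    import dask
    import dask.dataframe as dd

    parts = [
        dask.delayed(load_batch)(batch)
        for batch in chunk_paths(paths, files_per_partition)
    ]
    return dd.from_delayed(parts)
```

With 50k files and 500 files per partition this would give 100 partitions, so the task graph stays small while the result no longer has to fit into a single local data frame. The batch size is a tuning knob, not a recommendation.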
