    Repositories list

    • ParSEval

      Public
      Plan-aware Test Database Generation for SQL Equivalence Evaluation
      Python
      0200 Updated Nov 23, 2025
    • qParser

      Public
      A simple SQL parser built using Apache Calcite.
      Java
      0000 Updated Nov 15, 2025
    • Fastest library to load data from DB to DataFrames in Rust and Python
      Rust
      1982.5k20416 Updated Nov 10, 2025
    • CleanAgent

      Public
      An experimental demo repository of an agent for data cleaning tasks
      Python
      67031 Updated Jun 4, 2025
    • accio

      Public
      Java
      0300 Updated Apr 8, 2025
    • lineagex

      Public
      Python
      58121 Updated Feb 25, 2025
    • dataprep

      Public
      Open-source low-code data preparation library in Python. Collect, clean, and visualize your data in Python with a few lines of code.
      Python
      2182.2k14520 Updated Jun 27, 2024
    • Python
      33620 Updated Apr 25, 2024
    • Apache Wayang (incubating) is the first cross-platform data processing system.
      Java
      107000 Updated Mar 13, 2024
    • FeatAug

      Public
      Python
      0100 Updated Mar 2, 2024
    • C++
      68000 Updated Jun 12, 2023
    • deeperlib

      Public
      Deep Web Crawler for Data Enrichment
      Python
      83001 Updated May 22, 2023
    • Big Data Programming II
      Jupyter Notebook
      1314700 Updated Apr 11, 2023
    • A curated list of example code to collect data from Web APIs using DataPrep.Connector.
      Python
      2636105 Updated Mar 25, 2023
    • 0000 Updated Mar 17, 2023
    • BI benchmark with user generated data and queries
      PLpgSQL
      23000 Updated Mar 10, 2023
    • Python
      297231 Updated Jan 20, 2023
    • Code repository of paper "Auto-FP: An Experimental Study of Automated Feature Preprocessing for Tabular Data"
      Python
      2000 Updated Dec 14, 2022
    • Website for DataPrep
      HTML
      0300 Updated Nov 3, 2022
    • SQLGen

      Public
      An Automated SQL Query Generation Framework for Scalable Feature Discovery
      Python
      1300 Updated Oct 1, 2022
    • Python
      0000 Updated May 8, 2022
    • cmpt354

      Public
      CMPT354: Database System I
      HTML
      12410 Updated Apr 11, 2022
    • A place to submit conda recipes before they become fully fledged conda-forge feedstocks
      Python
      5.6k000 Updated Jan 11, 2022
    • Code for FedRain and Frog, VLDB 2022
      Jupyter Notebook
      1000 Updated Jun 15, 2021
    • Jupyter Notebook
      1000 Updated Mar 5, 2021
    • BOExplain

      Public
      Explaining Inference Queries with Bayesian Optimization
      Jupyter Notebook
      51000 Updated Feb 11, 2021
    • Data repository for dataprep
      0000 Updated Nov 28, 2020
    • C
      0000 Updated Nov 17, 2020
    • quicksel

      Public
      Mirror from quicksel
      HTML
      0010 Updated Oct 30, 2020
    • A list of high-quality open datasets for COVID-19 data analysis
      HTML
      336190 Updated Oct 29, 2020