Benchmarking synthetic data generation methods. (Python, updated Jul 2, 2024)
Desbordante is a high-performance data profiler that can discover many different patterns in data using various algorithms. It also lets you run data-cleaning scenarios built on these algorithms. Desbordante has a console version and an easy-to-use web application.
Portfolio of data science projects showing my skill range and competencies.
Open-source version of the TDspora synthetic data generation algorithm.
Software for evaluating the quality of synthetic data compared with real data.
A terminal spreadsheet multitool for discovering and arranging data
A standard framework for modelling Deep Learning Models for tabular data
Python library for embedding inference of relational tables.
A range of helpful utilities for Python developers including streaming tabular data, date parsing, JSON and YAML handling, dictionary and list utilities
Fast and Accurate ML in 3 Lines of Code
a Stellar Dynamics Toolbox (Not Everybody Must Observe)
Get classification risk scores on tabular tasks using LLMs
Miller is like awk, sed, cut, join, and sort for name-indexed data such as CSV, TSV, and tabular JSON
The modern React DataGrid for building apps — faster
Algorithms for outlier, adversarial and drift detection
End-to-end ML project for tabular data.
Characterization of relational table embeddings (VLDB 2024).
A zero-config, fast and small (~3kB) virtual list (and grid) component for React, Vue, Solid and Svelte.
A comprehensive toolkit and benchmark for tabular data learning, featuring over 20 deep methods, more than 10 classical methods, and 300 diverse tabular datasets.
MSBoost is a gradient boosting algorithm that improves performance by selecting the best model from multiple parallel-trained models for each layer, excelling in small and noisy datasets.
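A minimal sketch of the per-layer model selection idea in the MSBoost description above, not the MSBoost implementation itself. Everything here is an assumption made for illustration: the three toy candidate learners (`fit_mean`, `fit_line`, `fit_stump`), the choice to pick the winner by squared error on the training residuals, the learning rate, and the function name `boost`. It uses one-dimensional inputs and only the Python standard library.

```python
from statistics import fmean


def fit_mean(xs, rs):
    """Candidate 1: predict the mean residual everywhere."""
    m = fmean(rs)
    return lambda x: m


def fit_line(xs, rs):
    """Candidate 2: least-squares straight line through the residuals."""
    mx, mr = fmean(xs), fmean(rs)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxr = sum((x - mx) * (r - mr) for x, r in zip(xs, rs))
    slope = sxr / sxx if sxx else 0.0
    return lambda x: mr + slope * (x - mx)


def fit_stump(xs, rs):
    """Candidate 3: one-split decision stump minimising squared error."""
    best = None
    for t in sorted(set(xs))[:-1]:  # dropping the largest x keeps the right side non-empty
        left = [r for x, r in zip(xs, rs) if x <= t]
        right = [r for x, r in zip(xs, rs) if x > t]
        lm, rm = fmean(left), fmean(right)
        sse = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
        if best is None or sse < best[0]:
            best = (sse, t, lm, rm)
    if best is None:  # all x values identical, so no split is possible
        return fit_mean(xs, rs)
    _, t, lm, rm = best
    return lambda x: lm if x <= t else rm


# Hypothetical candidate pool; a real system would use stronger learners.
CANDIDATES = {"mean": fit_mean, "line": fit_line, "stump": fit_stump}


def mse(model, xs, rs):
    """Mean squared error of a fitted candidate against the residuals."""
    return fmean([(r - model(x)) ** 2 for x, r in zip(xs, rs)])


def boost(xs, ys, n_layers=20, lr=0.5):
    """Gradient boosting for squared loss with a per-layer model choice."""
    base = fmean(ys)
    preds = [base] * len(xs)
    layers = []
    for _ in range(n_layers):
        # For squared loss, the negative gradient is the plain residual.
        residuals = [y - p for y, p in zip(ys, preds)]
        # Fit every candidate on the same residuals (sequentially here;
        # the fits are independent, so they could run in parallel).
        fitted = {name: fit(xs, residuals) for name, fit in CANDIDATES.items()}
        # Keep only the candidate with the lowest error for this layer.
        name = min(fitted, key=lambda k: mse(fitted[k], xs, residuals))
        model = fitted[name]
        layers.append((name, model))
        preds = [p + lr * model(x) for p, x in zip(preds, xs)]

    def predict(x):
        return base + lr * sum(m(x) for _, m in layers)

    return predict, [n for n, _ in layers]


# Toy data: a linear trend with a jump at x = 10.
xs = [float(i) for i in range(20)]
ys = [0.5 * x + (3.0 if x >= 10 else 0.0) for x in xs]
predict, chosen = boost(xs, ys)
print(chosen)  # names of the candidate selected at each layer
```

Selecting on training residuals keeps the sketch short. A held-out validation split would be a more cautious selection criterion, particularly for the small and noisy datasets the description mentions.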