SynthForge ML Sample Pack
=========================

This pack contains four sample ML datasets generated by SynthForge,
each with a ready-to-run Jupyter notebook for exploration and modeling.

Datasets
--------

1. healthcare_readmission/
   Binary classification - predict hospital readmission
   2000 rows, 6 features, stratified train/test split

2. housing_price/
   Regression - predict house prices
   2000 rows, 6 features, feature correlations

3. imperfect_binary/
   Binary classification with data imperfections
   1500 rows, 4 features, missing values + duplicates + outliers

4. correlated_regression/
   Regression with inter-feature correlations
   2000 rows, 5 features, log-normal target distribution
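
The imperfections listed for imperfect_binary/ (missing values,
duplicates, outliers) can be counted before modeling. The sketch below is
one possible check, not the method SynthForge uses for
quality_report.json. The function name summarize_imperfections and the
1.5 x IQR outlier rule are assumptions made for illustration, and no
column names from the pack are assumed.

```python
import pandas as pd


def summarize_imperfections(df: pd.DataFrame, iqr_factor: float = 1.5) -> dict:
    """Count missing cells, duplicate rows, and IQR outliers per numeric column.

    Hypothetical helper: the IQR rule is an assumed definition of "outlier",
    not necessarily the one used to generate or report on the data.
    """
    summary = {
        "missing_cells": int(df.isna().sum().sum()),
        "duplicate_rows": int(df.duplicated().sum()),
        "outliers": {},
    }
    for col in df.select_dtypes(include="number").columns:
        q1, q3 = df[col].quantile([0.25, 0.75])
        spread = q3 - q1
        low = q1 - iqr_factor * spread
        high = q3 + iqr_factor * spread
        # Comparisons with NaN are False, so missing values are not
        # counted as outliers here.
        is_outlier = (df[col] < low) | (df[col] > high)
        summary["outliers"][col] = int(is_outlier.sum())
    return summary
```

Calling summarize_imperfections(pd.read_csv("imperfect_binary/train.csv"))
returns a dictionary with the three counts. Duplicate rows are counted as
exact repeats of an earlier row.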

Getting Started
---------------

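1. Install dependencies:
   pip install pandas numpy matplotlib seaborn scikit-learn jupyter

   Reading the Parquet files with pandas also requires a Parquet engine
   such as pyarrow (pip install pyarrow).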

2. Open any notebook, for example:
   cd healthcare_readmission/
   jupyter notebook analysis.ipynb

Each dataset directory contains:
  train.csv / test.csv - Training and test data
  train.parquet / test.parquet - Parquet format
  config.json - Generation configuration
  quality_report.json - Data quality analysis
  baseline_evaluation.json - Baseline model results
  analysis.ipynb - Jupyter notebook
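
The files above can also be loaded outside the notebook. The following is
a minimal sketch that relies only on the file names listed here. The
helper name load_dataset is hypothetical, and nothing is assumed about
the column names or the keys inside config.json.

```python
import json
from pathlib import Path

import pandas as pd


def load_dataset(dataset_dir):
    """Load the CSV splits and the generation config from one dataset directory.

    Hypothetical helper: it only assumes that train.csv, test.csv, and
    config.json exist in dataset_dir, as listed in this README.
    """
    root = Path(dataset_dir)
    train = pd.read_csv(root / "train.csv")
    test = pd.read_csv(root / "test.csv")
    with open(root / "config.json", encoding="utf-8") as fh:
        config = json.load(fh)
    return train, test, config
```

For example, train, test, config = load_dataset("healthcare_readmission")
run from the pack's root directory returns two DataFrames and the parsed
configuration. To load the Parquet files instead, swap pd.read_csv for
pd.read_parquet and change the file extensions.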

Generated by SynthForge