Library And CLI

In addition to the web app, this repository now exposes a lightweight layer-1 interface: an importable Python library and a command-line tool.

Python API

Use the importable API when you want to call the synthesizer from a notebook or another Python project.

import pandas as pd

from privsyn_tabular import synthesize_dataframe

df = pd.read_csv("input.csv")
result = synthesize_dataframe(
    df,
    method="privsyn",
    epsilon=1.0,
    dataset_name="toy",
)

result.synthesized_df.to_csv("synthetic.csv", index=False)

The return value includes:

  • synthesized_df
  • domain_data
  • info_data
  • config

If domain_data and info_data are not provided, the library infers them with the same metadata inference helpers that the web workflow uses.
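The metadata fields on the return value can be saved next to the synthetic CSV, much as the CLI does with --write-domain-json and --write-info-json. The following is a minimal sketch, not part of the library: save_metadata is a hypothetical helper, and it assumes that domain_data and info_data are plain, JSON-serializable structures. The demo_result object is a stand-in with made-up contents so the snippet runs without the package installed.

```python
import json
from pathlib import Path
from types import SimpleNamespace


def save_metadata(result, out_dir):
    """Write result.domain_data and result.info_data as JSON files in out_dir.

    Hypothetical helper: assumes both attributes are JSON-serializable.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    paths = {}
    for name in ("domain_data", "info_data"):
        path = out / f"{name}.json"
        # default=str is a fallback for values json cannot encode natively.
        text = json.dumps(getattr(result, name), indent=2, default=str)
        path.write_text(text, encoding="utf-8")
        paths[name] = path
    return paths


# Stand-in for a real result; the field contents here are invented.
demo_result = SimpleNamespace(
    domain_data={"age": {"type": "numerical"}},
    info_data={"name": "toy"},
)
```

With a real result from synthesize_dataframe, the call would look like save_metadata(result, "temp_synthesis_output"), provided the assumption about the field types holds.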

CLI

You can run the same flow from the shell:

python -m privsyn_tabular synthesize \
  --input sample_data/adult.csv \
  --output temp_synthesis_output/adult_synth.csv \
  --method privsyn \
  --epsilon 1.0 \
  --write-domain-json temp_synthesis_output/domain.json \
  --write-info-json temp_synthesis_output/info.json
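The same command can be driven from a Python script, for example in a batch job. This sketch only uses the flags shown above. The names build_synthesize_command and run_synthesize are hypothetical wrappers, not part of the package, and the defaults for method and epsilon simply copy the example values.

```python
import subprocess
import sys


def build_synthesize_command(input_csv, output_csv, method="privsyn",
                             epsilon=1.0, domain_json=None, info_json=None):
    """Return the argument list for `python -m privsyn_tabular synthesize`."""
    cmd = [
        sys.executable, "-m", "privsyn_tabular", "synthesize",
        "--input", str(input_csv),
        "--output", str(output_csv),
        "--method", method,
        "--epsilon", str(epsilon),
    ]
    # The metadata outputs are only requested when a path is given.
    if domain_json is not None:
        cmd += ["--write-domain-json", str(domain_json)]
    if info_json is not None:
        cmd += ["--write-info-json", str(info_json)]
    return cmd


def run_synthesize(*args, **kwargs):
    """Run the CLI; check=True raises CalledProcessError on a non-zero exit."""
    return subprocess.run(build_synthesize_command(*args, **kwargs), check=True)
```

Using sys.executable keeps the subprocess in the same interpreter, and therefore the same virtual environment, as the calling script.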

After packaging or an editable install, the console script is also available as:

privsyn-tabular synthesize --input input.csv --output synthetic.csv

Installation

For local development, the simplest path is:

python3 -m venv .venv
source .venv/bin/activate
python3 -m pip install -r requirements.txt
python3 -m pip install -e .

That keeps the web app, library, and CLI using the same checked-out source tree.
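A quick way to confirm that the editable install is visible to the active interpreter is to look the package up without importing it. This is a generic sketch built on the standard library; is_installed is a hypothetical helper name, and it is meant for top-level module names only.

```python
import importlib.util


def is_installed(module_name):
    """Return True if a top-level module can be found by the import system."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    # Expected to report True inside the activated .venv after `pip install -e .`
    print("privsyn_tabular importable:", is_installed("privsyn_tabular"))
```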

Why This Layer Exists

The library/CLI layer is intentionally kept separate from the web deployment:

  • it is the easiest prototype to run on local machines, notebooks, or RC software environments,
  • it keeps the synthesis code usable even when web hosting is not available,
  • it provides the lowest-friction delivery layer for campus or project-team adoption.