Delivery Layers

This project should be discussed as four separate layers rather than one all-or-nothing system.

At A Glance

| Layer | User experience | Core dependencies | Permissions / approvals to ask about |
|---|---|---|---|
| 1. Python library / CLI | Notebook, script, or shell command | Python environment and method dependencies | Software environment, project storage, Slurm allocation if jobs run on RC compute |
| 2. Internal web interface | Small browser workflow for trusted/internal users | Lightweight web hosting, background execution, limited-access storage | Container or web hosting, service-to-Slurm submission, internal ingress |
| 3. Full managed internal system | Reliable shared internal service | Login, durable database, durable object storage, job/session persistence | SSO/OIDC, database, storage, service identity, audit/retention expectations |
| 4. Broader / external service | Service beyond a small internal audience | Hardened hosting, stronger security review, support plan | Public ingress, security review, operating ownership, support boundary |

The meeting with RC should explicitly ask which rows RC is comfortable supporting, and which of those rows are self-service versus RC-administered.

Layer 1: Python Library / CLI

  • Goal: let researchers call the synthesizer directly from Python or the command line.
  • Surface:
      • importable Python API
      • simple CLI
      • local CSV in, synthetic CSV out
  • Dependencies:
      • only a Python environment and the method's dependencies
      • no web hosting, auth, or persistent session layer
  • Questions for RC:
      • can this run inside a group-managed software environment?
      • can it submit batch jobs on Rivanna if needed?
      • where should example inputs and outputs live?

This is the lowest-friction path for a prototype or for users who already work in notebooks and scripts.
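The CSV-in, CSV-out surface could look like the sketch below. This is illustrative only: `synthesize` is a placeholder for the actual method (here it just copies rows), and the CLI argument names are assumptions, not the project's real interface.

```python
import argparse
import csv


def synthesize(rows):
    """Placeholder for the actual synthesis method: here it simply
    copies the input rows. The real method would fit a model to the
    input and sample new synthetic records."""
    return list(rows)


def main(argv=None):
    parser = argparse.ArgumentParser(
        description="Read a CSV, write a synthetic CSV.")
    parser.add_argument("input_csv")
    parser.add_argument("output_csv")
    args = parser.parse_args(argv)

    with open(args.input_csv, newline="") as f:
        rows = list(csv.reader(f))

    header, body = rows[0], rows[1:]
    synthetic = synthesize(body)

    with open(args.output_csv, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(header)
        writer.writerows(synthetic)


if __name__ == "__main__":
    main()
```

Because the entry point takes an `argv` list, the same code doubles as the importable API for notebook users and as the shell command.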

Layer 2: Internal Web Interface

  • Goal: give internal users a small browser-based workflow.
  • Surface:
      • upload
      • metadata confirmation
      • job status
      • artifact download
  • Dependencies:
      • lightweight web hosting
      • background execution
      • limited-access storage
  • Questions for RC:
      • is there an RC microservice or container platform for this?
      • can that service submit sbatch / squeue / scancel?
      • does RC prefer campus-only access first?

This can still avoid full login and long-term account management if access is restricted to a project team or a campus-only audience.
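If RC confirms the service can call Slurm client tools directly, the submission path is small. The sketch below builds an `sbatch` command and parses the job id from its stdout; the partition name and time limit are placeholder values, since the real ones depend on the cluster allocation.

```python
import re
import subprocess


def build_sbatch_command(script_path, partition="standard",
                         time_limit="01:00:00"):
    # Placeholder partition/time values; real ones depend on the
    # allocation RC grants the service.
    return ["sbatch", f"--partition={partition}",
            f"--time={time_limit}", script_path]


def parse_job_id(sbatch_stdout):
    """On success, sbatch prints 'Submitted batch job <id>'."""
    match = re.search(r"Submitted batch job (\d+)", sbatch_stdout)
    if match is None:
        raise ValueError(f"unexpected sbatch output: {sbatch_stdout!r}")
    return int(match.group(1))


def submit(script_path):
    """Submit a job script and return the Slurm job id.
    Requires a host with the Slurm client tools installed."""
    result = subprocess.run(build_sbatch_command(script_path),
                            capture_output=True, text=True, check=True)
    return parse_job_id(result.stdout)
```

Whether the web service is allowed to run these commands at all (and under which service identity) is exactly the approval question in the table above.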

Layer 3: Full Managed Internal System

  • Goal: make the tool reliable for ongoing internal use.
  • Adds:
      • login
      • durable data storage
      • session and job persistence
      • quota / lifecycle control
      • clearer ownership and auditability
  • Questions for RC:
      • what is the right UVA SSO / NetBadge path?
      • is there a standard Postgres or equivalent metadata store?
      • what retention / access controls are expected for uploaded datasets?

This is the first layer that really needs a platform decision about identity, persistent data services, and operations boundaries.
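The metadata store at this layer does not need to be elaborate. The sketch below uses sqlite3 purely as a stand-in for whatever Postgres-equivalent service RC recommends; the table and column names are illustrative, not a settled schema.

```python
import sqlite3

# Illustrative schema only: one row per synthesis job, keyed to an
# owner identity (e.g. a NetBadge computing ID) for auditability.
SCHEMA = """
CREATE TABLE IF NOT EXISTS jobs (
    job_id     INTEGER PRIMARY KEY,
    owner      TEXT NOT NULL,
    dataset    TEXT NOT NULL,
    status     TEXT NOT NULL DEFAULT 'queued',
    created_at TEXT NOT NULL DEFAULT (datetime('now'))
);
"""


def open_store(path=":memory:"):
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def record_job(conn, owner, dataset):
    cur = conn.execute(
        "INSERT INTO jobs (owner, dataset) VALUES (?, ?)",
        (owner, dataset))
    conn.commit()
    return cur.lastrowid


def set_status(conn, job_id, status):
    conn.execute(
        "UPDATE jobs SET status = ? WHERE job_id = ?",
        (status, job_id))
    conn.commit()
```

Keeping job state in a database rather than in the web process is what makes sessions and jobs survive restarts, which is the point of this layer.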

Layer 4: Broader / External Service

  • Goal: expose the system beyond a small internal user group.
  • Adds:
      • stronger security review
      • access policy decisions
      • broader hosting / ingress decisions
      • support and operational expectations
  • Questions for RC:
      • would RC even want this hosted on RC-managed infrastructure?
      • what review is required before external exposure?
      • who would be the operational owner after the prototype phase?

This layer should be treated as a separate decision from the prototype and from the internal research-service use case.

What To Ask RC

For the meeting with UVA Research Computing, the key question is not simply "can we deploy this?" but:

  1. Which of these four layers can RC support?
  2. Which layers are self-service?
  3. Which layers require RC-administered infrastructure or approvals?
  4. Which layer is realistic for the prototype within proposal timelines?