RC Deployment Handoff Checklist

This checklist is intended for handoff to UVA Research Computing staff or collaborators who will help host the PrivSyn research prototype on RC-managed infrastructure.

1. Application Scope

The deployed prototype is expected to support:

CSV upload
metadata inference and confirmation
synthetic tabular data generation
job status polling
synthetic CSV download

The system is not intended to be a production public SaaS deployment.

2. Recommended Target Architecture

long-running frontend/API service on RC-managed microservice or Kubernetes infrastructure
synthesis execution on Rivanna through Slurm
shared job/output storage visible to both the API service and Rivanna compute jobs

The application should not be left running persistently on a Rivanna login node.

3. What Is Already Prepared In This Repository

FastAPI backend with job manager
local executor for workstation testing
Slurm executor for direct Slurm submission
SSH-backed Slurm executor for remote API hosting
Dockerfile for the API + bundled frontend
RC helper scripts under deploy/rc/
Kubernetes starter manifests under deploy/rc/k8s/
NSF deployment summary under docs/nsf-prototype-deployment.md

4. Inputs Needed From RC

target namespace/project for RC microservice hosting
approved persistent storage class and PVC sizing
external hostname or ingress route
secret-management mechanism for SSH credentials
preferred image registry and image pull flow
confirmation of how the shared jobs directory should be mounted
any RC-required network policy or egress restrictions

5. Inputs Needed From The Project Team

container image to deploy
SSH keypair dedicated to Slurm submission
Rivanna submit host information
Slurm account and partition settings
initial job resource defaults
expected upload size and job duration envelope
retention policy for job inputs/outputs/logs

6. Runtime Configuration To Finalize

EXECUTION_MODE=slurm
SLURM_ACCOUNT=dplab
SLURM_PARTITION=standard
SLURM_SSH_TARGET=<submit-host>
SLURM_REMOTE_PROJECT_ROOT=<runner-checkout>
SLURM_REMOTE_JOBS_ROOT=<shared-jobs-root>
SLURM_REMOTE_RUNNER_COMMAND=<python-runner-command>
JOBS_ROOT=<shared-jobs-root-mounted-in-service>
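
Before launch it can help to confirm that none of these settings is missing or still holds a `<placeholder>` value. The following is a minimal sketch of such a check. The variable names come from the list above, but the helper `find_unfinalized` and its validation rules are assumptions, not part of the repository.

```python
from typing import List, Mapping

# Variable names are taken from the checklist; the checks themselves are assumptions.
REQUIRED_VARS = (
    "EXECUTION_MODE",
    "SLURM_ACCOUNT",
    "SLURM_PARTITION",
    "SLURM_SSH_TARGET",
    "SLURM_REMOTE_PROJECT_ROOT",
    "SLURM_REMOTE_JOBS_ROOT",
    "SLURM_REMOTE_RUNNER_COMMAND",
    "JOBS_ROOT",
)


def find_unfinalized(env: Mapping[str, str]) -> List[str]:
    """Return names of settings that are missing, empty, or still a <placeholder>."""
    problems = []
    for name in REQUIRED_VARS:
        value = env.get(name, "").strip()
        is_placeholder = value.startswith("<") and value.endswith(">")
        if not value or is_placeholder:
            problems.append(name)
    return problems
```

Calling `find_unfinalized(os.environ)` at service startup (after `import os`) and refusing to start when the result is non-empty would surface configuration gaps early.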

7. Storage Requirements

The final RC deployment should avoid using Rivanna /scratch as the durable system of record for jobs.

The target shared storage should support:

per-job input parquet
metadata JSON
synthetic CSV output
runner stdout/stderr logs
Slurm script persistence
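
One way to verify that the shared storage holds everything listed above is to check a finished job directory for the expected artifacts. This is a sketch only: the file names in `EXPECTED_ARTIFACTS` are hypothetical, and the repository's job manager may use different names or layouts.

```python
from pathlib import Path
from typing import List

# Hypothetical file names, one per artifact type listed in the checklist.
EXPECTED_ARTIFACTS = (
    "input.parquet",   # per-job input parquet
    "metadata.json",   # metadata JSON
    "synthetic.csv",   # synthetic CSV output
    "stdout.log",      # runner stdout log
    "stderr.log",      # runner stderr log
    "job.sbatch",      # persisted Slurm script
)


def missing_artifacts(job_dir: Path) -> List[str]:
    """Return the expected artifact names that are not regular files in job_dir."""
    return [name for name in EXPECTED_ARTIFACTS if not (job_dir / name).is_file()]
```

Running the same check from both the API service mount and a Rivanna compute node would also confirm that the two sides see the same directory.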

8. Security Requirements

SSH private key must be mounted as a secret, not embedded in the image
known_hosts should be pinned
uploaded datasets should remain scoped to the project’s authorized environment
public ingress should be limited to the intended audience if this is still an internal prototype
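
To illustrate the first two requirements, the sketch below builds an SSH command line that reads the private key from a mounted secret path and refuses hosts that are not in a pinned `known_hosts` file. The function name and all paths are hypothetical, and the repository's SSH-backed Slurm executor may construct its command differently. The OpenSSH options (`IdentitiesOnly`, `BatchMode`, `StrictHostKeyChecking`, `UserKnownHostsFile`) and the `sbatch` flags (`--parsable`, `--account`, `--partition`) are standard.

```python
import shlex
from typing import List


def build_ssh_sbatch_command(
    ssh_target: str,
    key_path: str,
    known_hosts_path: str,
    account: str,
    partition: str,
    remote_script: str,
) -> List[str]:
    """Build an argv list that runs sbatch on the submit host over SSH."""
    # Quote the remote command so paths with spaces survive the remote shell.
    remote_command = shlex.join(
        [
            "sbatch",
            "--parsable",
            f"--account={account}",
            f"--partition={partition}",
            remote_script,
        ]
    )
    return [
        "ssh",
        "-i", key_path,                                # key mounted from a secret
        "-o", "IdentitiesOnly=yes",                    # use only the given key
        "-o", "BatchMode=yes",                         # never prompt interactively
        "-o", "StrictHostKeyChecking=yes",             # reject unknown host keys
        "-o", f"UserKnownHostsFile={known_hosts_path}",  # pinned known_hosts
        ssh_target,
        remote_command,
    ]
```

The returned list can be passed to `subprocess.run(..., capture_output=True, text=True)`; with `--parsable`, `sbatch` prints the job ID on standard output.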

9. Acceptance Tests Before Launch

frontend loads successfully through the RC ingress
upload and metadata inference succeed
/generate submits a Slurm job
/status/{job_id} transitions correctly
output CSV is generated and downloadable
a failing job reports the failed status cleanly, with logs available
job files appear in the shared jobs directory
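
The status-transition test can be automated with a small polling helper. The sketch below is an assumption-laden illustration: only `failed` is named in this checklist, so the other status strings are hypothetical, and `fetch_status` stands in for whatever call retrieves the status from `/status/{job_id}`.

```python
import time
from typing import Callable, List

# Hypothetical terminal states; only "failed" appears in the checklist.
TERMINAL_STATES = {"completed", "failed"}


def poll_until_terminal(
    fetch_status: Callable[[], str],
    interval_s: float = 5.0,
    timeout_s: float = 3600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> List[str]:
    """Poll until a terminal state; return the distinct states in order seen."""
    deadline = time.monotonic() + timeout_s
    seen: List[str] = []
    while True:
        status = fetch_status()
        # Record a state only when it differs from the previous one.
        if not seen or seen[-1] != status:
            seen.append(status)
        if status in TERMINAL_STATES:
            return seen
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job still {status!r} after {timeout_s} s")
        sleep(interval_s)
```

An acceptance test could then assert that the returned sequence matches the expected progression for a successful job and ends in `failed` for a deliberately broken input.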

10. Operational Follow-Up

define job retention/cleanup policy
define storage quota monitoring
decide whether completed jobs should be archived or pruned automatically
decide whether authentication should be added before broader access
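
Once a retention window is agreed, pruning candidates could be identified with something like the sketch below. It assumes one directory per job directly under the shared jobs root and uses the directory's modification time as a rough age signal; both are assumptions, and the function only lists candidates without deleting anything.

```python
import time
from pathlib import Path
from typing import List, Optional


def find_expired_jobs(
    jobs_root: Path,
    max_age_days: float,
    now: Optional[float] = None,
) -> List[Path]:
    """List job directories last modified more than max_age_days ago."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400  # seconds per day
    # A directory's mtime changes only when entries are added or removed,
    # so this is a simplification of "time since the job finished".
    return sorted(
        p for p in jobs_root.iterdir()
        if p.is_dir() and p.stat().st_mtime < cutoff
    )
```

Whether the listed directories are then archived or removed is the policy decision noted above.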

11. Repository Pointers

deploy/rc/README.md
deploy/rc/env.rivanna.example
deploy/rc/sync_to_rivanna.sh
deploy/rc/bootstrap_rivanna.sh
deploy/rc/k8s/
docs/nsf-prototype-deployment.md
