The CloudsAI Platform

One deployment contract. Every accelerator. Every environment.

Built for heterogeneous AI stacks spanning CUDA, ROCm/HIP, and SYCL/oneAPI, with optimized networking, storage, and runtime paths across public cloud, private, sovereign, and air-gapped environments.

cloudsai-deploy.yaml
# one spec — every environment
kind: AIFactory
spec:
  accelerator: amd-mi300
  runtime: rocm-hip
  environment: air-gapped
  fabric: infiniband-rdma
  storage:
    checkpoint: parallel-fs
    vector: nvme-object
  agent:
    gateway: agent-secure
    harness: sandboxed
    runtime: policy-enforced

The real problem

The accelerators are the easy part.

Enterprises can buy powerful silicon. The painful part is everything around it: tooling, deployment patterns, portability, tuning, and operations. Each environment turns into its own bespoke project, and the developer experience drifts from team to team.

The platform's job is to make the environment irrelevant to the people building on it.

What the platform is

A deployment contract, not a deployment project.

One spec

A single specification that already encodes the accelerator, fabric, storage, and runtime tunables for each target — promoted unchanged from cloud pilot to sovereign or air-gapped production.
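
As a rough illustration of what a single spec makes possible, the sketch below checks that a spec's accelerator and runtime agree before it is promoted to another environment. The field names follow the sample cloudsai-deploy.yaml on this page. The `COMPATIBLE_RUNTIMES` table, the `validate_spec` helper, and the vendor-prefix naming rule are hypothetical and are not part of any published CloudsAI API.

```python
# Hypothetical pre-promotion check; not a CloudsAI API.
# Assumed pairing of accelerator vendors to runtimes, following the
# vendor/runtime pairs named on this page.
COMPATIBLE_RUNTIMES = {
    "nvidia": {"cuda"},
    "amd": {"rocm-hip"},
    "intel": {"sycl-oneapi"},
}

REQUIRED_FIELDS = ("accelerator", "runtime", "environment", "fabric")


def validate_spec(spec: dict) -> list:
    """Return a list of human-readable problems; an empty list means the spec passes."""
    problems = [f"missing field: {name}" for name in REQUIRED_FIELDS if name not in spec]
    if problems:
        return problems
    # Assumption: accelerator names start with the vendor, e.g. "amd-mi300" -> "amd".
    vendor = spec["accelerator"].split("-", 1)[0]
    allowed = COMPATIBLE_RUNTIMES.get(vendor)
    if allowed is None:
        problems.append(f"unknown accelerator vendor: {vendor}")
    elif spec["runtime"] not in allowed:
        problems.append(
            f"runtime {spec['runtime']!r} does not match accelerator {spec['accelerator']!r}"
        )
    return problems


spec = {
    "accelerator": "amd-mi300",
    "runtime": "rocm-hip",
    "environment": "air-gapped",
    "fabric": "infiniband-rdma",
}
print(validate_spec(spec))  # prints []
```

A check like this would run once against the spec itself, so the same result holds whether the target is a cloud pilot or an air-gapped enclave.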

One developer experience

Preconfigured JupyterLab, runtimes, drivers, and kernels — the same CLI, notebooks, and API clients in every environment, with zero code changes between them.

Self-managed, owned

Runs inside the customer's own VPC, datacenter, sovereign region, or air-gapped enclave. You hold the keys, the data, and the control plane.

Heterogeneous accelerators

Optimization paths across the silicon landscape.

First-class support and tuning paths for today's accelerators — and an extensible contract for what comes next.

NVIDIA
H100-class and beyond
CUDA
AMD
Instinct MI200 / MI250 / MI300
ROCm · HIP
Intel
Gaudi-class infrastructure
SYCL · oneAPI
Qualcomm
Cloud AI 100 · edge
Server · Edge
ARM
ARM-based systems
Portable

Portability across CUDA · ROCm/HIP · SYCL/oneAPI — plus an extensible path to emerging and custom accelerators.

Optimized data & networking paths

The full-stack promise, made specific.

Deploy-anywhere only matters if it performs everywhere. The platform tunes the paths that decide whether a cluster actually delivers — not just the accelerator on paper.

Storage

Data paths

  • Checkpointing pipelines sized for large-model training
  • Object, file, and parallel filesystem choices per workload
  • Vector-serving data paths for retrieval and inference
  • Tiering and large-scale data movement

Networking & fabric

East-west performance

  • RDMA over InfiniBand or Ethernet fabric
  • East-west traffic patterns for distributed training
  • Collective-communication and low-latency design
  • Lossless, congestion-aware fabric tuning
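
One way to reason about east-west traffic is the bandwidth term of a ring all-reduce, a common collective in distributed training. The sketch below uses the standard result that each node sends 2(N-1)/N times the message size; it ignores latency and overlap, and the node count, message size, and link speed are assumed example values.

```python
def ring_allreduce_bytes_per_node(message_bytes: float, num_nodes: int) -> float:
    """Bytes each node sends in a ring all-reduce (reduce-scatter + all-gather)."""
    if num_nodes < 2:
        return 0.0  # a single node has nothing to exchange
    return 2 * (num_nodes - 1) / num_nodes * message_bytes


def ring_allreduce_seconds(message_bytes: float, num_nodes: int,
                           link_bytes_per_s: float) -> float:
    """Bandwidth-only lower bound; real runs add per-step latency."""
    return ring_allreduce_bytes_per_node(message_bytes, num_nodes) / link_bytes_per_s


# Assumed example: 1 GB of gradients, 8 nodes, 400 Gb/s links (50 GB/s).
print(ring_allreduce_bytes_per_node(1e9, 8))  # 1750000000.0
print(ring_allreduce_seconds(1e9, 8, 50e9))   # 0.035
```

Because the per-node volume approaches twice the message size as nodes are added, link bandwidth and congestion behavior set the floor on step time rather than node count.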

Runtime & topology

Where the gains hide

  • KV-cache, batch sizing, and memory residency
  • MIG / partitioning and NVLink / fabric topology
  • Accelerator-aware workload placement
  • Multi-node cluster bottleneck analysis
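
KV-cache and batch sizing can be estimated with simple arithmetic. The sketch below uses the usual formula for a transformer KV cache (one key and one value tensor per layer); the model shape and the 40 GiB memory budget are assumed example values, and the function names are illustrative.

```python
GIB = 2**30


def kv_cache_bytes(layers: int, kv_heads: int, head_dim: int,
                   seq_len: int, batch: int, bytes_per_elem: int = 2) -> int:
    """KV-cache size: 2 tensors (K and V) per layer, per token, per sequence."""
    return 2 * layers * kv_heads * head_dim * seq_len * batch * bytes_per_elem


def max_batch(budget_bytes: int, **shape) -> int:
    """Largest batch whose KV cache fits in the given memory budget."""
    per_sequence = kv_cache_bytes(batch=1, **shape)
    return budget_bytes // per_sequence


# Assumed shape: 32 layers, 8 KV heads, head_dim 128, 8192-token context, fp16.
shape = dict(layers=32, kv_heads=8, head_dim=128, seq_len=8192)
print(kv_cache_bytes(batch=16, **shape) // GIB)  # 16
print(max_batch(40 * GIB, **shape))              # 40
```

The budget here is whatever accelerator memory remains after weights and activations, which is why memory residency and batch size have to be tuned together.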

Model · harness · runtime

An agent is a model, a harness, and a runtime.

The model reasons. The harness gives it skills, memory, and tools. The runtime governs what it can see, do, and reach. CloudsAI provides that secure runtime and the deploy-anywhere infrastructure beneath it. It stays model-, harness-, and runtime-agnostic, enforces policy out of the agent's reach, and behaves identically in public cloud, sovereign, and air-gapped environments.

Sandboxed execution

Per-session system and network isolation: agents can install packages and run code, but nothing they run can touch the host. Sandboxes are created and torn down on demand.

Out-of-process policy

Deny-by-default across filesystem, network, and process. The runtime checks every action before it runs, and enforcement sits outside the agent's reach, so it holds even if the agent is compromised.
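
The decision logic behind a deny-by-default policy can be sketched in a few lines. This is a minimal illustration only: the `Rule` and `Policy` names and the glob-based matching are assumptions, and the sketch runs in-process, whereas the enforcement described above sits outside the agent.

```python
import posixpath
from dataclasses import dataclass
from fnmatch import fnmatchcase


@dataclass(frozen=True)
class Rule:
    kind: str     # "filesystem", "network", or "process"
    pattern: str  # glob matched against the requested target


class Policy:
    """Hypothetical allow-list: anything without a matching rule is denied."""

    def __init__(self, allow_rules):
        self._rules = tuple(allow_rules)

    def is_allowed(self, kind: str, target: str) -> bool:
        if kind == "filesystem":
            # Collapse ".." segments so a path cannot escape an allowed prefix.
            target = posixpath.normpath(target)
        return any(
            rule.kind == kind and fnmatchcase(target, rule.pattern)
            for rule in self._rules
        )


policy = Policy([
    Rule("filesystem", "/workspace/*"),
    Rule("network", "pypi.org:443"),
])
print(policy.is_allowed("filesystem", "/workspace/notes.txt"))      # True
print(policy.is_allowed("filesystem", "/workspace/../etc/passwd"))  # False
print(policy.is_allowed("process", "bash"))                         # False
```

Note that no rule mentions `process` at all, and the request is still refused. That is the point of deny-by-default: the absence of a rule is itself a denial.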

Credential brokering

The agent never sees raw tokens. Scoped, per-agent credentials are injected at the sandbox boundary, so even a compromised agent has no raw secrets to exfiltrate.
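
A brokering pattern along these lines can be sketched as follows. The `CredentialBroker` class, its method names, and the scope strings are hypothetical; the sketch only shows the idea that the agent holds an opaque handle while the real token is attached at the boundary.

```python
import secrets
import time
from typing import Optional


class CredentialBroker:
    """Hypothetical broker: keeps the real token, hands agents opaque handles."""

    def __init__(self, real_token: str) -> None:
        self._real_token = real_token
        self._grants: dict = {}  # handle -> (agent_id, scope, expires_at)

    def issue(self, agent_id: str, scope: str, ttl_seconds: float = 300.0,
              now: Optional[float] = None) -> str:
        """Give the agent a random handle that carries no secret material."""
        now = time.time() if now is None else now
        handle = secrets.token_urlsafe(16)
        self._grants[handle] = (agent_id, scope, now + ttl_seconds)
        return handle

    def headers_for(self, handle: str, agent_id: str, scope: str,
                    now: Optional[float] = None) -> Optional[dict]:
        """Called at the sandbox boundary: attach the real token only if the
        handle is known, unexpired, and matches this agent and scope."""
        now = time.time() if now is None else now
        grant = self._grants.get(handle)
        if grant is None:
            return None
        granted_agent, granted_scope, expires_at = grant
        if granted_agent != agent_id or granted_scope != scope or now >= expires_at:
            return None
        return {"Authorization": f"Bearer {self._real_token}"}


broker = CredentialBroker("real-secret-token")
handle = broker.issue("agent-1", "repo:read", now=0.0)
print(broker.headers_for(handle, "agent-1", "repo:read", now=10.0) is not None)  # True
print(broker.headers_for(handle, "agent-1", "repo:write", now=10.0))             # None
```

If the agent leaks its handle, the exposure is limited to one scope for a short window, and the underlying token never has to be rotated.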

Durable state & snapshots

Skills, memory, and sessions persist across rebuilds — snapshot, tear down, restore, and the agent resumes exactly where it left off.

Model & provider routing

Keep sensitive context on local models; route to frontier models only when policy allows. Swap models without touching the harness.
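
Policy-gated routing might look something like the sketch below. The sensitivity labels, the `RoutingPolicy` fields, and the backend names "local" and "frontier" are assumptions made for illustration; the harness only ever calls `choose_backend`, so swapping models does not touch it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RoutingPolicy:
    """Hypothetical policy: which data labels may leave for a frontier model."""
    allow_frontier: bool
    frontier_ok_labels: frozenset = frozenset({"public"})


def choose_backend(label: str, policy: RoutingPolicy) -> str:
    """Default to the local model; use a frontier model only when policy allows."""
    if policy.allow_frontier and label in policy.frontier_ok_labels:
        return "frontier"
    return "local"


open_policy = RoutingPolicy(allow_frontier=True)
air_gapped_policy = RoutingPolicy(allow_frontier=False)

print(choose_backend("public", open_policy))        # frontier
print(choose_backend("confidential", open_policy))  # local
print(choose_backend("public", air_gapped_policy))  # local
```

Unknown labels fall through to the local model, which keeps the failure mode conservative.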

Observability & audit

Full traces of every tool call, handoff, and allow/deny decision — the record regulated and sovereign environments require.

Run popular agent harnesses and secure agent runtimes unmodified, on open models served via NIM, vLLM, or your own self-hosted endpoint, across NVIDIA, AMD, Intel, and beyond.

Developer experience

Identical everywhere it runs.

Preinstalled, tested, and fully supported — so the experience doesn't move when the infrastructure does.

JupyterLab & Python tooling
GPU drivers + CUDA / ROCm
Pre-built notebook kernels
The same CLI everywhere
Identical API clients
Zero code changes between environments

Runs across five deployment models.

Let's talk

See the platform on your stack.

Bring your accelerators, your environment, and your constraints. We'll walk through how the platform deploys, tunes, and operates against them.