Document · v1.0 · Final Confidential
NUS Enterprise Website

A lean stack,
built for editors
and AI agents.

A redesigned NUS Enterprise website — roughly 200–500 pages across a handful of departments. Clean, modern, headless CMS, AI-assisted writing, and semantic search powered by a third-party knowledge base. Off-the-shelf where possible, custom only where the product requires it.

§ 01

Overview

The brief calls for a content-first site that academic staff can edit without technical help, that an AI assistant can write alongside them inside the CMS, and that surfaces the right answer fast — both to humans searching the site and to AI agents fetching content on a user's behalf. The stack below is the smallest set of pieces that delivers all three without locking NUS into anything heavy to maintain.

§ 02

Recommendation & Options Considered

Three candidate stacks were evaluated against the same criteria: editorial UX for non-technical academic staff, AI-assisted writing inside the CMS, semantic search via the third-party knowledge base, content negotiation for AI agents, data residency, total cost of ownership, and time-to-launch.

Criteria Sanity + Cloudflare Headless WordPress Markdown / AI-native
Editorial UX Good Familiar but dated in places Limited Suits Git-comfortable teams
Custom AI in editor Rough Limited by WordPress's editor Strong For repo-based agents & structured content
Approval workflow Plugins Via third-party Git PRs CODEOWNERS & branch protection
Licensing cost Free & open source No hosted CMS fee
Time to launch On plan At risk Tight

2.2 — Headless WordPress

WordPress is the world's most widely used content system — most editors have used it before, and it's free. A large ecosystem of add-ons covers SSO, multilingual, approval workflows, and SEO tooling without custom development. The AI writing experience is the biggest concession: WordPress's editor is harder to customise deeply, so the assistance we can offer is decent but not as polished as Sanity allows. WordPress also needs constant attention — security patches, plugin updates, compatibility checks.

When to pick this: budget is the binding constraint, or NUS IT requires the content system to be hosted entirely on NUS's own infrastructure.

2.3 — Markdown / AI-native

The most experimental option. Content stored as text files in a Git repository, edited through a lightweight web tool. Appealing properties: nothing to pay for, full control over where content lives, and the format is naturally friendly to AI agents. But the editing experience is the weakest of the three for non-technical staff, the approval process essentially becomes a developer workflow, and a lot of editing conveniences (media management, previews, validation) would have to be built from scratch.

Why not now: editing experience and timeline risk outweigh the appeal. Worth revisiting after launch if content strategy shifts toward AI-agent-first distribution — the recommended stack already supports this direction, so the door stays open.
§ 03

Stack at a Glance

LayerChoice
FrontendAstro + Tailwind CSS
CMSSanity (hosted)
AI modelOpenRouter (Multi-Model) or OpenAI
AI SidecarCustom Node / Python service
Knowledge baseThird-party platform
Adapter serviceThin KB wrapper — can be colocated with Sidecar
Search UIAstro / React component → KB search API
AnalyticsReuse existing (confirm with client) or Plausible
Frontend hostingCloudflare Workers
§ 04

Architecture

EDITORS Academic staff Sanity Studio + AI Editor Plugin + Content Releases Headless CMS AI Sidecar + KB Adapter Node / Python Custom build LLM OpenRouter Multi-Model Knowledge Base Third-party platform Vectorisation · Search Astro Build Static + ISR Webhook-triggered CDN Edge — Cloudflare / Vercel · Singapore PoP Static pages · Content negotiation (HTML / Markdown) No origin hit for regular page loads Browsers · HTML AI Agents · Markdown on publish ingest / query

Key data flows

  1. Content delivery Sanity pushes content on publish via webhook, triggering an Astro build or ISR revalidation. Static pages are served from the CDN edge — no origin hit for regular page loads.
  2. AI writing in CMS The Sanity Studio AI editor plugin calls the AI Sidecar, which pulls relevant context from the KB, assembles a prompt, and calls the AI API. Suggestions stream back into the editor panel.
  3. Semantic search The search UI queries the KB search API through the Sidecar/Adapter. No RAG, no generated answers — just well-ranked semantic results. Fast, predictable, no hallucination risk on a front-facing page.
  4. Publish to KB sync On content publish, a webhook fires to the Sidecar/Adapter, which forwards the new content to the KB ingest API for indexing.
  5. AI agent content Crawlers requesting Accept: text/markdown hit the same URLs, served markdown by the Astro middleware or edge function. No origin logic beyond content negotiation.
§ 05

Frontend — Astro

Astro is content-first, outputs minimal JS by default, and partial hydration means interactive components only ship JS where needed.

Styling
Tailwind CSS
Components
shadcn/ui or a lightweight custom set — to be decided once design direction is clearer
Rendering
Static generation for the bulk of the site; ISR for pages that update frequently (news, events)
Hosting
Vercel or Cloudflare Workers — both have Singapore PoPs
Accessibility
WCAG 2.1 AA compliance assumed — needs formal confirmation with client

5.1 Caching

  • Static pages: long Cache-Control headers (e.g., s-maxage=86400, stale-while-revalidate). Content changes trigger revalidation via webhook rather than TTL expiry.
  • Search API responses: short TTL or no-cache. If KB response times are slow, a Redis read-through cache in the Sidecar can cache frequent queries at ~60s TTL.
  • Custom cache logic: any invalidation beyond what the CDN handles natively goes through CDN cache headers. Avoid custom in-app cache logic — it creates consistency issues.

5.2 Deployment

  • Git-based workflow (GitHub/GitLab), PR previews via Vercel/CF branch deployments
  • Sanity publish webhook triggers a production build/revalidation
  • Environment secrets managed via the hosting provider's secrets store — not in committed .env files
  • Staging environment mirrors production, pointed at the KB staging endpoint

5.3 Analytics

If the client has existing analytics (Google Analytics or similar), reuse and confirm the setup during onboarding. If starting fresh, Plausible is the recommended default — lightweight, privacy-friendly, no cookie banner required, which is cleaner for an academic institution.

5.4 Domain migration scope

Migration work needed to preserve search equity: URL mapping, permanent 301 redirects, canonical updates, sitemap submission, and post-launch monitoring in Search Console. We will also keep the destination pages aligned to the intent of the old URLs so external links continue to resolve cleanly.

§ 06

CMS — Sanity

6.1 Why Sanity

The shortlist was Sanity, Contentful, Strapi, Payload, and headless WordPress. Git-based CMSs were ruled out early — routing academic admins through a Git-based approval flow isn't viable.

The decision came down to one thing: Sanity Studio is a React app you fully control. The AI editor plugin can live directly inside the editing experience — no separate tool, no context switching for non-technical editors.

Capability Sanity Contentful Strapi Payload
Custom AI UI in editorLocked UILimitedGood (React)
Non-technical editor UXGoodFunctionalMinimal OOTB
Approval workflowsBuilt-inCustom buildCustom build
Hosted optionYesImmatureSelf-host only
Data residencyUS/EUFull controlFull control
Maintenance burdenLowMediumMed–High

Payload is the strongest open-source alternative if data residency becomes a hard requirement — TypeScript-native, fully hackable admin UI. But the gap in editorial UX and workflow tooling makes Sanity the better starting point. This can be solved by a custom Payload UI, but it requires development.

6.2 Content model

Sanity's schema is defined in code, so the shape of each content type needs to be agreed with faculty stakeholders before CMS configuration begins. Content is mostly text and images — a short workshop with department reps is usually enough. Typical content types:

  • Pages (general, flexible layout)
  • People (faculty bios, staff profiles)
  • News and announcements
  • Events
  • Research outputs and publications
  • Departments / organisational units
  • FAQ pages — question-answer pairs targeting queries AI systems get asked
  • Comparison pages — side-by-side program content

6.3 Approval workflow

Two-stage, per department. Sanity's Content Releases feature handles this natively on the Growth tier and above.

RolePermissions
EditorDraft and edit within own department
Head of DepartmentReview and approve department content
Web AdminFull access across all departments

6.4 SSO

Sanity supports external identity providers on the Team tier and above. Academic users will expect to log in with university credentials — retrofitting SSO after launch is painful. Needs confirmation early. In case there is SSO, every user will have to create a new Sanity account.

6.5 AI editor plugin

A custom Sanity Studio plugin adds a panel alongside the rich text editor. The plugin calls the AI Sidecar — not the LLM directly — so all context assembly and prompt management stays out of the CMS.

  • Writing suggestions based on the KB context
  • Tone and style guidance aligned to faculty brand voice
  • Knowledge base lookups (facts, policies, people profiles)
  • Chat-like interface for free-form writing assistance
  • Alt text suggestions for uploaded images (supports WCAG compliance)
§ 07

AI Layer

7.1 OpenRouter APIs

OpenRouter is a model router that lets us call Claude, GPT, or other models through a single API. For content-generation tasks we default to Claude — it handles long-form structured writing well and follows complex instructions reliably, which matters when enforcing institutional tone or department-specific constraints via the system prompt. If NUS has a preference or existing procurement with OpenAI, the Sidecar can target either provider with minimal reconfiguration.

7.2 AI Sidecar service

A small Node or Python service sitting between the CMS plugin, the OpenRouter API, and the KB. Responsibilities:

  • Context assembly: pulls relevant content from the KB, selects what goes into the context window, and formats the prompt
  • Prompt management: maintains system prompts per department or content type
  • Response streaming: pipes the LLM's response back to the Studio plugin
  • Adapter function: wraps the KB API — auth, rate limiting, normalisation, and optional Redis cache
  • Content management: markdown as a secondary representation. Sidecar writes .md snapshots for easier manipulation with AI agents and future reuse.

The Sidecar is intentionally decoupled. New AI features — automated tagging, summarisation, translation, accessibility checks, multi-step agents — get added here without touching the CMS or frontend. If the client wants more ambitious features later, the Sidecar is the foundation for that.

7.3 Adapter service

Rather than a separate deployment, the KB adapter function can live inside the Sidecar. It exposes a stable internal contract to the rest of the stack and handles auth, rate limiting, retry logic, request/response normalisation, and an optional Redis read-through cache for frequent search queries. If the KB API turns out to be complex, splitting the Adapter into its own service is a clean refactor — the interface doesn't change, just the deployment boundary.

7.4 Semantic search

The search bar on the homepage and site-wide is a semantic search UI backed by the KB's search API. No RAG, no generated summaries, no chatbot — just well-ranked semantic results. Fast and deterministic. The KB team is expected to handle vectorisation and reranking internally.

7.5 Content negotiation (Markdown for AI agents)

End-users will increasingly encounter content through AI agents, not browsers. All pages serve clean markdown when an Accept: text/markdown header is present. Browsers get HTML, AI crawlers get markdown — same URLs, same sitemap, no separate subdomain. Implemented as Astro middleware or an edge function. Markdown output strips nav, chrome, footers, and JS, returning only prose, headings, lists, and tables.

7.6 Structured data and AI agent markup

Schema.org JSON-LD per page type, generated from Sanity fields at build time: EducationalOrganization, Course, Person, Event, ResearchProject. Key pages also include a <meta name="ai-agent-instruction"> tag with a concise machine-readable summary of the page content. Entity naming is consistent across markup and visible text — build-time linting catches mismatches so AI systems don't split the institution into two entities.

7.7 llms.txt and AI agent endpoint

A /llms.txt at site root: plain-text machine-readable summary of the organisation, programs, and site structure. No build step, no CMS dependency. Optionally, the Sidecar exposes a /api/ask endpoint — a POST endpoint accepting a JSON question, returning a structured answer about programs, admission, research, and faculty.

7.8 Answer-first content architecture

CMS content model extended with optional AEO fields: canonical question, short direct answer (1–3 sentences), related concepts. Pages output question as heading, answer as first paragraph — the structure answer engines extract most reliably. FAQ and Comparison page templates follow the same pattern. Program pages get typed metadata fields (eligibility, duration, outcomes, fit boundaries) stored as structured Sanity fields. Build-time validation: AEO pages must have a direct answer under three sentences.

§ 08

Knowledge Base Integration

The KB platform (third-party) owns vectorised content, data access control, role checks, and search retrieval. From this project's perspective, it's a black box with an API — we push content in and query results out.

8.1 Integration points

DirectionTriggerPath
CMS → KBOn content publishSanity webhook → Sidecar/Adapter → KB ingest API
KB → CMS (AI)Editor requests a suggestionSanity plugin → Sidecar → KB search API
KB → Search UIUser submits search querySearch component → Sidecar/Adapter → KB search API

8.2 Sync mechanism

Preferred: webhooks on publish — near real-time indexing, low complexity. Fallback: delta endpoint (fetch everything changed since timestamp X). Last resort: full-export cron job, which means search results can lag content changes by hours.

§ 09

Infrastructure

ServicePlatformNotes
FrontendVercel / Cloudflare WorkersSG edge; branch previews; build webhooks
CMSSanity CloudGrowth tier minimum; Team if SSO required; US/EU data region — confirm with NUS IT
AI Sidecar + AdapterRailway, Render, or AWS LambdaStateless; low sustained traffic
Redis (optional)Upstash or Railway RedisRead-through cache in Sidecar; short TTL
Claude APIAnthropic (external)Confirm data handling agreement
AnalyticsExisting (TBC) or PlausibleConfirm with the client
§ 10

What gets built vs. off-the-shelf

Off-the-shelf

  • Astro, Tailwind, shadcn/ui
  • Sanity Studio + Content Releases workflow
  • Claude API / OpenRouter
  • Knowledge base platform (third-party)
  • Vercel / Cloudflare Workers
  • Plausible (if adopted)

Custom build

  • Sanity Studio AI editor plugin (React)
  • AI Sidecar + Adapter service (Node/Python)
  • Search UI component (Astro/React)
  • CMS → KB publish webhook handler
  • Astro frontend templates & design system
  • Astro content negotiation middleware
  • /api/ask endpoint in Sidecar (optional)
  • AEO build-time validation rules
§ 11

Costs

Monthly recurring SaaS and infrastructure costs at ~100k page views/month. Development and maintenance labour is not included. Sanity figures are based on vendor pricing; WordPress and Markdown figures are ballpark estimates.

Sanity + Cloudflare Headless WordPress Markdown / AI-native
Per month ~$500 ~$500
Per year ~$6,000 ~$6,000

The Sanity high estimate is dominated by the SSO add-on (~$1,399/mo). Without SSO, the high estimate drops to ~$950/mo — still the most capable editorial platform but at a meaningful premium. WordPress and Markdown are significantly cheaper in direct costs, but carry higher maintenance burden and weaker editor tooling that isn't captured in these figures. For a detailed breakdown with adjustable inputs, see the cost calculator.

§ 12

Open Questions

Raised by the technical team to close design gaps before architecture is finalised. Answers to these will materially affect scope, cost, or timeline.

12.1 — Infrastructure & procurement

  • Cloudflare hosting approval. Does NUS IT require a formal security or procurement review before we can host the frontend on Cloudflare? If so, what is the typical lead time? A lengthy review could push the project start by weeks.
  • Data residency. Is Singapore-only data residency a hard legal/technical requirement, or a preference? Several stack components (Sanity Cloud, OpenRouter) default to US/EU regions — a hard SG requirement changes the CMS shortlist.
  • LLM inference provider. Is there a preference or existing procurement relationship with a specific LLM provider (OpenAI, Anthropic, etc.)? Our stack supports OpenRouter (multi-model) or direct OpenAI integration — knowing this upfront avoids rework on the AI Sidecar configuration.

12.2 — CMS licensing & seat count

  • Active editors. Of the anticipated 15–30 editor accounts, how many would be active simultaneously, and how quickly could that number grow? The preferred CMS is per-seat priced; at 30+ seats it becomes a meaningful cost line and may tip toward a self-hosted alternative. The current tier caps at 50 users.
  • Recurring subscription billing. Sanity CMS and several infrastructure services (hosting, AI API usage) are billed as monthly SaaS subscriptions, typically paid by credit card. Is this procurement model acceptable for NUS, or does the institution require annual invoicing / purchase-order-based payment? If credit-card billing is a blocker, we need to evaluate self-hosted or locally procureable alternatives early.

12.3 — Multi-tenancy & future scope

  • Multi-tenant model. When BLOCK71, PIER71, and TIG onboard later, should they share one CMS with separate content spaces, or run fully independent deployments? These are very different architectures, and the decision shapes today's content model and hosting setup even if rollout is future scope.

12.4 — SSO

  • SSO scope. Is SSO needed for CMS editors only, or also for site visitors accessing gated content? Which identity provider does NUS use? Editor-only SSO requires the CMS Team tier and IdP configuration; visitor-facing SSO adds identity federation and session management — very different in cost and timeline.

12.5 — Knowledge base & search

  • Search API under load. Predictable traffic spikes (NOC applications, Open House, SWITCH) could push search volume well above baseline. Does the KB platform have rate limits or SLA commitments for the search API during peak windows? If the API degrades under load, the search experience breaks at the worst possible moment.

12.6 — Salesforce & integrations

  • Salesforce data flow. Is the integration one-way (form submissions push to Salesforce) or bidirectional (Salesforce data surfaces on-site for personalisation, gated content, or event registration)? Bidirectional is significantly more scope and requires NUS IT involvement.
  • Donation / gifting page. Should the existing donation page be fully migrated onto the new platform, or embedded/linked from the existing Sitefinity page? Full migration means touching Salesforce, Sitefinity, and payment flows — a separate project in itself.
0 / 0 answered