PromptSmith interface preview
CASE STUDY
Full-Stack Context Engineering · Serverless · Multi-Provider AI

PromptSmith
Turn weak prompts into
production-ready context.

A zero-database, serverless-friendly prompt engineering studio that diagnoses raw prompts, builds structured 7-section context scaffolding, and routes optimised queries to Gemini, Groq, OpenRouter, or Together AI — all from shared hosting with no npm, no Composer.

PHP 7.4+ · Vanilla JS · Gemini 2.0 / 2.5 Pro · Groq · Llama 3.3 70B · OpenRouter Free Tier · Together AI · CSRF-Protected API · Live Quality Scoring · Model Fit Ranker · Word-Level Diff View · Apache · Shared Hosting
4+
AI providers supported out-of-the-box
1–2
Iterations to usable output in internal testing*
$0/mo
Infra cost — runs on any shared host
0 deps
No npm, no Composer, no database
Solo Build · 2025
Every layer — one developer
Design
Architecture
Engineering
Security
01 · Problem Space

Most raw prompts fail before they reach the model.

Developers and non-technical users alike write under-specified prompts — missing context, vague goals, no constraints. The result is unpredictable AI output that requires multiple back-and-forths to refine.

Ambiguity at scale

Words like "good", "modern", and "clean" appear in 70%+ of first-draft prompts, yet give the model zero actionable direction.

Missing context blocks

Audience, output format, constraints, and persona are almost always absent from raw prompts — leaving the model to guess all four.

Model-agnostic output

Flash models need brevity; Pro models need full Markdown structure. A single prompt format cannot serve both well.

Core Design Problem

"How might we intercept a raw prompt before it reaches an AI provider and automatically enrich it with the structured context, persona, and constraints the model needs — without requiring the user to understand prompt engineering?"

PromptSmith answers this with a 4-stage pipeline: diagnose → scaffold → optimise → route.

HMW 01

How might we detect intent and missing context from a prompt automatically, before any LLM call is made?

HMW 02

How might we rewrite prompts differently for Flash vs Pro models, without building separate UIs?

HMW 03

How might we give users real-time feedback on prompt quality without a server round-trip per keystroke?

02 · User Value

Who PromptSmith helps and what improves.

PromptSmith reduces prompt guesswork by turning vague asks into structured, model-ready context before the first API call.

Primary Users
  • AI builders who need consistent output across model families
  • Product designers validating AI UX with repeatable prompts
  • Non-technical operators running workflow prompts without prompt expertise
Jobs To Be Done
  • "I want predictable outputs without becoming a prompt engineer."
  • "I want fewer iteration loops before I get usable output."
  • "I need output structure to match a target format on first pass."
Before / After Proof

Before Prompt

Write a landing page for my app

After PromptSmith

Objective: Create a high-conversion landing page
Persona: SaaS founder targeting early adopters
Constraints: Keep sections scannable, include trust signals
Output: Hero + 3 value blocks + CTA variants

+65%

Structure completeness

5 → 2

Typical iterations to usable output

1st pass

Format alignment in internal testing

*Directional internal testing across 20 prompts (content, product, and engineering tasks).

03 · Solution Overview

A serverless context engineering studio.

PromptSmith acts as an intelligent middleware layer between the user and the AI provider. Every prompt passes through a 4-stage enrichment pipeline before the first API token is ever sent.

Diagnosis Engine

Classifies intent into 8 categories, detects 5 missing context types, flags vague qualifiers, and scores complexity 1–10 — all via regex, no LLM call required.

Context Scaffold Builder

Generates a 7-section Markdown scaffold — Objective, Persona, Domain, Inputs, Constraints, Output Format, Edge Cases — injected between role and task.

Model-Aware Optimiser

Rewrites prompts using 3 distinct strategies: brevity-first bullets for Flash, full Markdown sections for Pro, numbered step-by-step for legacy gemini-pro.

Multi-Provider Router

A single callProvider() dispatcher routes to Gemini, Groq, OpenRouter, or Together AI based on session config — each via curlPost() with unified error normalisation.

Design Philosophy

PromptSmith is designed around a zero-persistence model. API keys live only in PHP $_SESSION — never written to disk or a database. No framework. No npm. No Composer. Deployable in minutes on InfinityFree or any cPanel shared host.

Output Tabs

Response · Optimised · Context · Diff · Model Fit · History · JSON
04 · Core Pipeline

Every prompt passes through 4 stages.

The enrichment pipeline executes entirely server-side before any provider API call. Client-side scoring runs in parallel on every keystroke, debounced at 350ms.

Raw Prompt

User Input

Diagnosis

prompt-analyzer.php

Scaffold

context-builder.php

Optimise

model-optimizer.php

Route

generate.php

Response

JSON payload

1
Prompt Diagnosis Engine

Intent classified into 8 categories (Build, Explain, Fix, Write, Analyse, Convert, List, Test) via verb-keyword regex. Task domain mapped to 9 buckets. Missing context and ambiguity flagged. Complexity scored 1–10.
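The verb-keyword classification above can be sketched as follows. The eight categories and the vague-qualifier idea come from this case study; the keyword lists and the `diagnose()` name are illustrative assumptions, not the shipped prompt-analyzer.php regexes.

```javascript
// Illustrative verb-keyword intent classifier (keyword lists are assumptions).
const INTENT_PATTERNS = {
  Build:   /\b(build|create|make|implement|develop)\b/i,
  Explain: /\b(explain|describe|what is|how does)\b/i,
  Fix:     /\b(fix|debug|solve|repair|resolve)\b/i,
  Write:   /\b(write|draft|compose)\b/i,
  Analyse: /\b(analy[sz]e|review|evaluate|compare)\b/i,
  Convert: /\b(convert|translate|transform|migrate)\b/i,
  List:    /\b(list|enumerate|name)\b/i,
  Test:    /\b(test|verify|validate)\b/i,
};

// Vague qualifiers that give the model no actionable direction.
const VAGUE = /\b(good|modern|clean|nice|better|great)\b/gi;

function diagnose(prompt) {
  // First matching category wins; "Unknown" if no verb keyword matches.
  const intent = Object.keys(INTENT_PATTERNS)
    .find((k) => INTENT_PATTERNS[k].test(prompt)) || "Unknown";
  const vagueTerms = prompt.match(VAGUE) || [];
  return { intent, vagueTerms };
}
```

Because this stage is pure pattern matching, it runs in microseconds and costs nothing, which is what makes an LLM-free diagnosis pass viable.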

2
Structured Context Builder

Generates a 7-section Markdown scaffold — Objective, User Persona, Domain Context, Inputs (with [MISSING] annotations), Constraints, Output Format, Edge Cases — injected between role and task.
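A minimal sketch of that scaffold generation, assuming a simple field map as input (the section names match the case study; `buildScaffold()` and its input shape are illustrative, not the context-builder.php API):

```javascript
// The seven scaffold sections, in the order listed above.
const SECTIONS = [
  "Objective", "User Persona", "Domain Context",
  "Inputs", "Constraints", "Output Format", "Edge Cases",
];

function buildScaffold(fields) {
  // Any section the diagnosis could not fill is annotated [MISSING]
  // so the user (or a later stage) knows what still needs supplying.
  return SECTIONS
    .map((name) => `## ${name}\n${fields[name] || "[MISSING]"}`)
    .join("\n\n");
}
```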

3
Model-Aware Optimiser

Rewrites the scaffold using model-specific strategies: *-flash → brevity-first bullets; *-pro / 2.x → full Markdown sections; gemini-pro legacy → numbered step-by-step. Other providers receive pass-through.
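The routing rule above reduces to a model-name match. A sketch, assuming the strategy names (the real model-optimizer.php selection logic may differ):

```javascript
// Pick a rewrite strategy from the model name, per the rules above.
function pickStrategy(model) {
  if (/-flash/.test(model)) return "brevity-bullets";
  if (/-pro|^gemini-2\./.test(model)) {
    // Exact "gemini-pro" is the legacy model; 1.5-pro / 2.x are modern.
    return model === "gemini-pro" ? "numbered-steps" : "full-markdown";
  }
  // Non-Gemini providers receive the base structured prompt unchanged.
  return "pass-through";
}
```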

4
Multi-Provider Router

callProvider() dispatches to Gemini's native API or OpenAI-compatible endpoints (Groq, OpenRouter, Together AI) based on session config. All share a curlPost() utility with 60s timeout and unified error normalisation.
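The dispatcher itself lives in PHP; for illustration, here is the same routing decision as a JavaScript sketch. The OpenAI-compatible base URLs are taken from the provider cards in this document; the Gemini URL shape is an assumption based on its public API, and the function names are hypothetical.

```javascript
// Provider table: Gemini uses its native API, the rest share the
// OpenAI-compatible chat/completions shape.
const PROVIDERS = {
  gemini:     { style: "gemini", base: "https://generativelanguage.googleapis.com" },
  groq:       { style: "openai", base: "https://api.groq.com/openai/v1" },
  openrouter: { style: "openai", base: "https://openrouter.ai/api/v1" },
  together:   { style: "openai", base: "https://api.together.xyz/v1" },
};

function routeRequest(provider, model) {
  const p = PROVIDERS[provider];
  if (!p) throw new Error(`Unknown provider: ${provider}`);
  return p.style === "openai"
    ? { url: `${p.base}/chat/completions`, model }
    : { url: `${p.base}/v1beta/models/${model}:generateContent`, model };
}
```

One dispatch table plus one shared HTTP helper is what keeps four providers behind a single interface.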

Client-Side Scoring · Live

Runs on every keystroke (debounced 350ms) — zero server calls. computeMetrics() in app.js.

Clarity
0–25 pts
Context
0–25 pts
Structure
0–20 pts
Readiness
0–30 pts
Ambiguity −
up to −20

Readiness Labels

≥ 90 — Production Ready
≥ 70 — Usable
≥ 40 — Needs Refinement
< 40 — Unusable
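The bands and labels above can be sketched as a simple aggregate. The real computeMetrics() in app.js computes the sub-scores itself; this sketch assumes they are already given and only shows how the documented weights and labels combine:

```javascript
// Combine sub-scores: clarity/context 0-25, structure 0-20,
// readiness 0-30, minus an ambiguity penalty of up to 20.
function scorePrompt({ clarity, context, structure, readiness, ambiguityPenalty }) {
  const total = clarity + context + structure + readiness - ambiguityPenalty;
  return Math.max(0, Math.min(100, total));
}

// Map a 0-100 score to the readiness labels listed above.
function readinessLabel(score) {
  if (score >= 90) return "Production Ready";
  if (score >= 70) return "Usable";
  if (score >= 40) return "Needs Refinement";
  return "Unusable";
}
```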

Model Fit Ranker

Detects prompt characteristics (long, structured, creative, short) and scores all available models 0–100 per characteristic, ranking them in real time.
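A sketch of that ranking step, assuming a per-model score table (the table values here are made up for illustration, not the shipped fit scores):

```javascript
// Sum each model's scores for the detected characteristics and
// sort descending, so the best-fitting model ranks first.
function rankModels(characteristics, fitTable) {
  return Object.entries(fitTable)
    .map(([model, scores]) => ({
      model,
      fit: characteristics.reduce((sum, c) => sum + (scores[c] || 0), 0),
    }))
    .sort((a, b) => b.fit - a.fit);
}
```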

05 · Architecture

Framework-free. Serverless-ready.

Three layers: a single-page browser shell, Apache PHP with utility modules, and outbound cURL to provider APIs. Nothing else.

File Structure

PromptSmith/
├── .htaccess            # Security + routing
├── public/
│   ├── index.html       # SPA shell
│   ├── styles.css       # UI layer
│   └── app.js           # All client logic
├── api/
│   ├── generate.php     # Main pipeline
│   ├── set-config.php   # Key + model config
│   └── csrf.php         # Token endpoint
└── utils/               # Blocked by .htaccess
    ├── prompt-analyzer.php
    ├── context-builder.php
    ├── model-optimizer.php
    └── curl-helper.php
Browser Layer

Single HTML file, CSS, Vanilla JS. Live metrics run client-side. All state changes go via CSRF-protected fetch() calls.

SPA
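The CSRF-protected fetch() pattern mentioned above looks roughly like this (endpoint paths follow the file structure shown earlier; the helper names are illustrative):

```javascript
// Build the headers every state-changing request must carry.
function csrfHeaders(token) {
  return {
    "Content-Type": "application/json",
    "X-CSRF-Token": token,
  };
}

// Fetch a token from the token endpoint, then POST with it attached.
async function postWithCsrf(url, body) {
  const { token } = await fetch("/api/csrf.php").then((r) => r.json());
  return fetch(url, {
    method: "POST",
    headers: csrfHeaders(token),
    body: JSON.stringify(body),
  });
}
```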
Apache · PHP 7.4+

Session management, CSRF tokens, pipeline orchestration. Requires curl, json, session extensions only.

Shared Hosting
Provider APIs

Gemini native API + three OpenAI-compatible endpoints. Unified via curlPost() with SSL verification, 60s timeout, normalised errors.

Multi-provider
06 · Providers & Models

Four providers. One interface.

Every provider is accessible via the Settings panel — select, key, test connection. Model selectors are dynamically populated per provider.

Google
Gemini

gemini-2.0-flash
gemini-2.5-pro-preview
gemini-1.5-pro · 1.5-flash
gemini-pro (legacy)

Model-aware optimisation — only provider with custom rewrite strategies
Fast Inference
Groq

llama-3.3-70b-versatile
llama-3.1-8b-instant
mixtral-8x7b-32768
gemma2-9b-it

OpenAI-compatible · api.groq.com/openai/v1
Free Tier
OpenRouter

mistral-7b-instruct:free
llama-3.2-3b:free
gemma-3-1b-it:free
qwen3-8b:free

Free models via :free suffix · openrouter.ai/api/v1
Wide Selection
Together AI

Llama-3.2-11B-Vision-Turbo
Meta-Llama-3.1-8B-Turbo
Mixtral-8x7B-Instruct
gemma-2-9b-it

OpenAI-compatible · api.together.xyz/v1

Generation Parameters (All Providers)

Parameter | Gemini | OpenAI-compat | Notes
temperature | 0.7 | 0.7 | Balanced creativity vs determinism
max_tokens | 2048 | 2048 | Full response budget
topP | 0.95 | not set | Gemini-native parameter
Safety filters | BLOCK_MEDIUM_AND_ABOVE | Provider-managed | 4 categories on Gemini
API version | v1 (gemini-pro) / v1beta (others) | n/a | Auto-selected by model name
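The parameter table implies two request shapes: Gemini's native generationConfig and the OpenAI-compatible body shared by Groq, OpenRouter, and Together AI. A sketch, assuming the public field names for each API (the exact payloads built in generate.php may differ; Gemini's token budget field is maxOutputTokens rather than max_tokens):

```javascript
// Build a request body in the style the target provider expects.
function buildPayload(style, model, prompt) {
  if (style === "gemini") {
    return {
      contents: [{ parts: [{ text: prompt }] }],
      generationConfig: { temperature: 0.7, maxOutputTokens: 2048, topP: 0.95 },
    };
  }
  // OpenAI-compatible shape (Groq, OpenRouter, Together AI).
  return {
    model,
    messages: [{ role: "user", content: prompt }],
    temperature: 0.7,
    max_tokens: 2048,
  };
}
```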
07 · Security Architecture

Secure by design.

API keys are never written to disk, never returned to the client, never logged. Every state-changing request is CSRF-protected with same-origin enforcement.

Threat Mitigations
Threat | Mitigation
API key leakage | Stored only in $_SESSION — never returned to client or logged
CSRF attacks | Tokens on every state-changing request via X-CSRF-Token header
Cross-origin requests | requireSameOrigin() checks Origin/Referer against HTTP_HOST
Session hijacking | HTTPOnly, SameSite=Strict, strict-mode session cookies
Directory traversal | Options -Indexes — directory listing disabled globally
Direct util access | .htaccess FilesMatch blocks utils/*.php — returns 403
Model injection | set-config.php validates model against allowlist per provider
Production Checklist
Not covered in v1.0

Rate limiting (no server-side throttle — add at CDN/server layer), API key rotation (one key per provider per session), and audit logging are not implemented in the current release.

08 · Known Limitations

Honest constraints.

PromptSmith v1.0 is opinionated about simplicity. These trade-offs are deliberate — they keep the stack zero-dependency and shared-host friendly.

Deliberate Trade-offs
Decision | Why It Was Chosen | Trade-off
Regex diagnosis (not LLM) | Fast, free, deterministic | Less nuance than semantic parsing
No database | Shared-host simplicity | No persistent history by default
Multi-provider routing | Flexibility and model choice | More adapter complexity to maintain
Session-scoped state

All config (API key, provider, model) is stored in PHP $_SESSION only. Closing the browser or session timeout requires re-entry. No localStorage, no cookies beyond session ID.

Approximate token counting

Token estimates use length / 4 — a rough heuristic. Real counts vary significantly by tokeniser, especially for non-English text or dense code. compression_pct is often negative since scaffolding adds content.
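The length/4 heuristic and the compression figure described above, as a sketch (function names are assumptions):

```javascript
// Rough token estimate: ~4 characters per token. Real tokeniser
// counts vary significantly, especially for non-English text or code.
function estimateTokens(text) {
  return Math.ceil(text.length / 4);
}

// Percentage change from raw to optimised prompt. Negative when the
// scaffold adds more content than it trims, which is the common case.
function compressionPct(rawPrompt, optimisedPrompt) {
  const before = estimateTokens(rawPrompt);
  const after = estimateTokens(optimisedPrompt);
  return Math.round(((before - after) / before) * 100);
}
```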

Optimiser routing — Gemini only

Model-specific rewriting (optimisePromptByModel()) is implemented for Gemini model classes only. Groq, OpenRouter, and Together AI all receive a pass-through of the base structured prompt without model-specific adaptation.

No persistence or streaming

Optimisation history is in-memory JavaScript only — a page refresh clears it completely. Responses are delivered as a single full JSON payload; there is no streaming output to the frontend. Every generation is stateless — no conversation history.

No rate limiting

No server-side request throttling is implemented. The 60-second cURL timeout is the only throttle. Production deployments should add rate limiting at the server or CDN layer independently.

09 · Outcomes

What PromptSmith delivers.

A production-grade prompt engineering layer that any developer can deploy in minutes on infrastructure they already own.

7
Context sections auto-generated per prompt — Objective, Persona, Domain, Inputs, Constraints, Output, Edge Cases
$0
Infrastructure cost — no database, no framework, no paid hosting required
4
AI providers accessible from one settings panel — no code changes to switch
0ms
Server calls for live quality scoring — all computed client-side, debounced 350ms
Zero-dependency deployment

Runs on any Apache shared host with PHP 7.4+. No npm, no Composer, no CLI tools. Upload and visit — that's the entire deployment story.

Security-first architecture

CSRF tokens, same-origin enforcement, .htaccess hardening, and HTTPOnly/SameSite session cookies — all configured out-of-the-box.

Model-adaptive rewriting

The same raw prompt is transformed differently for Flash (brevity), Pro (full Markdown), and legacy models — without user intervention.

Live quality feedback

Quality score, readiness label, context coverage, token estimate, and 3 improvement hints update on every keystroke — no server round-trips.

10 · What's Next

Roadmap opportunities.

The zero-persistence, framework-free architecture opens clear extension points without breaking the shared-hosting constraint.

Roadmap 01

Extend model-aware optimisation to Groq/OpenRouter/Together AI — each provider's top models have distinct optimal prompt structures.

Roadmap 02

Add optional SQLite persistence for session history — zero server dependencies, but survives page refresh without external infrastructure.

Roadmap 03

Implement streaming responses via Server-Sent Events — the architecture is stateless enough to support SSE without framework changes.

Roadmap 04

Replace length/4 token estimation with provider-specific tokeniser approximations for Llama, Mistral, and Gemini BPE vocabularies.

Roadmap 05

Add server-side rate limiting via PHP session counters — no Redis or database required, compatible with shared hosting constraints.

Roadmap 06

Multi-turn conversation support by maintaining a message history array in session — preserving the stateless-per-request architecture while enabling context threading.

Reflection

PromptSmith demonstrates that meaningful AI tooling doesn't require a cloud-native stack. By treating prompt engineering as a structured pipeline — diagnose, scaffold, optimise, route — and running all of it on zero-cost infrastructure, it makes production-quality context engineering accessible to any developer with a shared hosting account and an API key.

v1.0 · PHP 7.4+ · MIT License · Shared Hosting Compatible