8x
deploy frequency post-rebuild
Test architecture rebuild
Replace flaky Selenium / Cypress suites with deterministic Playwright + Vitest matrix. Page-object architecture, deterministic data, and CI parallelism that keeps suite times reasonable.
Quality & Security · GUARDIAN + CITADEL
Senior QA engineering — test automation, performance, accessibility, visual regression — built into the CI you ship through. Engineers who write test code alongside application code, not contractors handed scripts to execute.
The problem
The shape: a Selenium suite from 2020, a parallel Cypress effort that never finished, an end-to-end test job that fails 30% of the time and gets retried until it passes, performance tests run quarterly and forgotten, accessibility checks mentioned in design reviews and never enforced, and a release process that moves through the QA suite the way airline passengers move through TSA — wearily, hoping nothing flags. QA stops being a release gate once the team has trained itself to ignore it.
We rebuild test architecture so that flake rates drop, signal trust returns, and the suite earns its place in the deployment pipeline. Senior QA engineers (G6+) writing test code that lives alongside the application, runs in the same CI, and is reviewed under the same code-review standard. Performance tests with explicit budgets enforced in CI. Accessibility checks at the component, page, and journey layers. Most engagements convert hour-long flaky suites into 10-minute deterministic ones — and restore deploy confidence as a side effect.
Where it ships
Specific applications we’ve built and operated. Not speculative — every example below is grounded in a real shipped engagement.
k6 / Gatling load testing with explicit budget enforcement in CI. Capacity planning, sustained-load profiling, and chaos engineering for failure modes ahead of peak season.
axe-core / Pa11y / Lighthouse in CI, manual NVDA / VoiceOver / Dragon validation on release gates, and the WCAG 2.2 AA conformance bar your release process actually enforces.
Chromatic / Percy / Playwright snapshots with explicit baseline governance, design-system-aware diffing, and the review process that survives day-to-day UI iteration.
Pact / consumer-driven contract testing for service boundaries. Schema evolution managed through CI gates rather than post-deploy panic.
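Consumer-driven contract testing boils down to the consumer publishing the response shape it depends on, and CI verifying the provider still satisfies it before deploy. A minimal hand-rolled sketch of that verification step (illustrative only — not Pact's actual API; the endpoint and field names are assumptions):

```typescript
// Minimal consumer-driven contract check (illustrative, not Pact's API).
// The consumer declares the fields and types it relies on; CI runs this
// against the provider's actual response before deploy.
type FieldType = "string" | "number" | "boolean";
type Contract = Record<string, FieldType>;

function violations(contract: Contract, response: Record<string, unknown>): string[] {
  const errors: string[] = [];
  for (const [field, expected] of Object.entries(contract)) {
    if (!(field in response)) {
      errors.push(`missing field: ${field}`);
    } else if (typeof response[field] !== expected) {
      errors.push(`field ${field}: expected ${expected}, got ${typeof response[field]}`);
    }
  }
  return errors; // empty array: the provider still honors the contract
}

// Hypothetical consumer contract for an /orders endpoint.
const orderContract: Contract = { id: "string", total: "number", paid: "boolean" };

violations(orderContract, { id: "ord-1", total: 42, paid: false }); // → []
violations(orderContract, { id: "ord-1", total: "42" });            // → two violations
```

Note the asymmetry that makes this "consumer-driven": extra provider fields are ignored, so the provider can evolve its schema freely as long as every field a consumer declared stays intact.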
How we engage
Each phase has a deliverable, an owner, and an acceptance criterion. Not slogans — operating rules.
Discovery: current test pyramid, flake rates per layer, suite times, signal trust, deployment-gate effectiveness. Output is honest assessment with prioritized remediation — including 'this suite needs a rebuild, not more contributors'.
Test data deterministic by construction (factories, snapshots, fixed time). Tests isolated by default; shared-state tests explicit and minimized. Parallelism by design. Most flake comes from non-deterministic data and shared state — we engineer it out.
Tests run in the same CI that ships application code. Pull-request gates with explicit pass criteria. Flake quarantine with an SLA on root-cause investigation. The suite is a release gate, not a parallel artifact.
Weekly test-health review (flake rate, suite time, coverage). Monthly performance budget review. Quarterly test-pyramid retrospective. Test code reviewed and refactored on the same cadence as application code.
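The flake-quarantine SLA in the CI model above can be enforced mechanically: each quarantined test carries a quarantine date and a tracking ticket, and the gate fails once any entry outlives the investigation window. A minimal sketch — the record shape, test IDs, and 14-day SLA are all assumptions for illustration:

```typescript
// Sketch of a flake-quarantine gate. Record shape and 14-day SLA are assumptions.
// Quarantined tests are skipped in the main suite, but CI fails the gate
// once any entry exceeds the root-cause-investigation SLA.
interface QuarantineEntry {
  testId: string;
  quarantinedAt: string; // ISO date
  issue: string;         // tracking ticket for the root-cause investigation
}

const SLA_DAYS = 14;

function expiredEntries(entries: QuarantineEntry[], now: Date): QuarantineEntry[] {
  const cutoff = now.getTime() - SLA_DAYS * 24 * 60 * 60 * 1000;
  return entries.filter((e) => new Date(e.quarantinedAt).getTime() < cutoff);
}

const quarantine: QuarantineEntry[] = [
  { testId: "checkout.e2e#applies-coupon", quarantinedAt: "2024-03-01", issue: "QA-101" },
  { testId: "search.e2e#fuzzy-match",      quarantinedAt: "2024-03-20", issue: "QA-117" },
];

// With "now" fixed at 2024-03-25, only the 2024-03-01 entry has outlived the SLA.
const expired = expiredEntries(quarantine, new Date("2024-03-25"));
const overdue = expired.map((e) => `${e.testId} (${e.issue})`);
// In CI: fail the build when overdue.length > 0, naming the stalled investigations.
```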
Capabilities
Stack
Selected work
Common questions
Augment is the most common pattern. We embed senior QA engineers alongside your team, share the same backlog, and ship test code through the same review process. Where we do greenfield work end-to-end (e.g., rebuilding a flaky suite), we hand off with documentation and a 90-day shadowing period.
Playwright by default for new engagements. Cypress where the existing org has deep Cypress investment that's worth preserving. Both are capable; Playwright wins on multi-context testing, parallelism, and modern browser support; Cypress wins on developer ergonomics for component testing. We tell you which fits, in writing.
Deterministic by construction. Factories for known shapes, fixed-time clocks, snapshot-based fixtures for stable data, and ephemeral test environments where the test surface warrants it. We engineer out shared-state and time-based flake — not patch around it.
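"Deterministic by construction" can be made concrete in a few lines: the factory takes an injected clock and a counter instead of reading wall time or generating random IDs, so two runs produce byte-identical fixtures. The names and shapes here are illustrative, not a specific library:

```typescript
// Illustrative factory: deterministic because time and IDs are injected,
// never read from the environment.
type User = { id: string; email: string; createdAt: string };

function makeUserFactory(clock: () => Date, seed = 0) {
  let n = seed;
  return (overrides: Partial<User> = {}): User => {
    n += 1;
    return {
      id: `user-${n}`,
      email: `user-${n}@example.test`,
      createdAt: clock().toISOString(), // fixed clock → stable timestamps
      ...overrides,
    };
  };
}

// Fixed-time clock: every test run sees the same "now".
const fixedClock = () => new Date("2024-01-15T00:00:00Z");
const makeUser = makeUserFactory(fixedClock);

makeUser(); // → { id: "user-1", email: "user-1@example.test", createdAt: "2024-01-15T00:00:00.000Z" }
makeUser({ email: "admin@example.test" }); // counter advances, override applies
```

Because nothing in the factory touches ambient state, the same assertions hold on a laptop, in CI, and in any parallel shard — which is exactly where time- and shared-state flake normally creeps in.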
Yes — and we'll tell you when 'realistic load' is being defined optimistically. k6 / Gatling / Locust against staging environments shaped to production scale, sustained-load profiling, capacity planning with explicit budgets enforced in CI, and chaos engineering for failure modes ahead of peak season.
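A budget "enforced in CI" usually means a post-run step that compares the load tool's summary against hard numbers and fails the build on breach. A sketch against an assumed summary shape (modeled loosely on percentile output like k6's, not the exact format of any tool):

```typescript
// Sketch of a CI performance-budget gate. The summary shape is an assumption
// modeled on k6-style percentile output, not any tool's exact format.
interface LatencySummary {
  p95Ms: number;
  errorRate: number; // 0..1
}

interface Budget {
  maxP95Ms: number;
  maxErrorRate: number;
}

function budgetBreaches(summary: LatencySummary, budget: Budget): string[] {
  const breaches: string[] = [];
  if (summary.p95Ms > budget.maxP95Ms) {
    breaches.push(`p95 ${summary.p95Ms}ms exceeds budget ${budget.maxP95Ms}ms`);
  }
  if (summary.errorRate > budget.maxErrorRate) {
    breaches.push(`error rate ${summary.errorRate} exceeds budget ${budget.maxErrorRate}`);
  }
  return breaches; // CI fails the build if this is non-empty
}

// Example budget: checkout must stay under 500ms p95 with under 1% errors.
const result = budgetBreaches(
  { p95Ms: 620, errorRate: 0.004 },
  { maxP95Ms: 500, maxErrorRate: 0.01 }
); // → one breach: p95 over budget
```

k6 can express the same idea natively through its `thresholds` option, which aborts a run on breach; the external gate above is the pattern for tools without built-in budget support.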
Three layers. Automated (axe-core / Lighthouse / Pa11y) in CI catches the easy half. Manual assistive-technology testing (NVDA, VoiceOver, Dragon NaturallySpeaking) on release gates catches the rest. Component-library accessibility audits enforced in the design system. Most engagements achieve WCAG 2.2 AA conformance within 6 months.
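Of the "easy half" the automated layer catches, color contrast is the canonical example: WCAG 2.x defines contrast as a ratio of relative luminances, and AA requires at least 4.5:1 for normal text. A sketch of that formula (the formula is from the WCAG 2.x definition; the helper names are ours):

```typescript
// WCAG 2.x contrast ratio: (L1 + 0.05) / (L2 + 0.05), where L1/L2 are the
// relative luminances of the lighter and darker colors. AA requires >= 4.5:1
// for normal text. Formula per the WCAG 2.x definition; helper names are ours.
function relativeLuminance(r: number, g: number, b: number): number {
  // sRGB channel linearization as specified by WCAG 2.x.
  const lin = (c: number) => {
    const s = c / 255;
    return s <= 0.03928 ? s / 12.92 : Math.pow((s + 0.055) / 1.055, 2.4);
  };
  return 0.2126 * lin(r) + 0.7152 * lin(g) + 0.0722 * lin(b);
}

function contrastRatio(fg: [number, number, number], bg: [number, number, number]): number {
  const l1 = relativeLuminance(...fg);
  const l2 = relativeLuminance(...bg);
  const [lighter, darker] = l1 >= l2 ? [l1, l2] : [l2, l1];
  return (lighter + 0.05) / (darker + 0.05);
}

contrastRatio([0, 0, 0], [255, 255, 255]);       // ≈ 21, the maximum possible ratio
contrastRatio([119, 119, 119], [255, 255, 255]); // #777 on white: just under 4.5, fails AA
```

Checks like this are mechanical, which is why the automated layer owns them; what automation cannot judge — reading order, focus behavior, whether alt text is actually meaningful — is what the manual assistive-technology layer is for.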
Test pyramid audit: 2–3 weeks, $20K–$60K. Test architecture rebuild: 3–6 months, $200K–$700K. Performance engineering program: 3–5 months, $200K–$500K. Accessibility test program build: 4–6 months, $250K–$600K. Embedded QA engineers: $20K–$60K per month per engineer. Brackets published honestly so visitors self-qualify before the first call.
Within Quality & Security
Talk to us
A senior engineer plus the GUARDIAN + CITADEL department lead joins the first call. No discovery gauntlet, no junior reps.