It's 4:58 PM on a Friday. Your team merges a critical PR, and the CI pipeline groans to life. Forty-five minutes later, the Playwright E2E suite finally finishes—with three mysterious failures that pass locally. The deployment is blocked, the team is waiting, and you're debugging browser inconsistencies on a GitHub Actions runner you'll never touch.
This chaos has a solution. By combining Playwright with Docker and modern CI parallelization, you can turn that 45-minute ordeal into an 8-minute feedback loop. Let's build that pipeline, step by step.
The Baseline: A CI-Ready Playwright Project
First, ensure your project is configured for headless execution and artifact collection. Your playwright.config.ts should look something like this:
import { defineConfig, devices } from '@playwright/test';

export default defineConfig({
  testDir: './tests',
  fullyParallel: true, // Enable parallelization
  retries: process.env.CI ? 2 : 0, // More retries in CI
  workers: process.env.CI ? 2 : undefined, // Conservative worker count for now
  reporter: [
    ['html', { outputFolder: 'playwright-report' }],
    ['list'],
  ],
  use: {
    trace: 'on-first-retry',
    screenshot: 'only-on-failure',
    video: 'retain-on-failure',
  },
  projects: [
    {
      name: 'chromium',
      use: { ...devices['Desktop Chrome'] },
    },
  ],
});
Key settings: fullyParallel: true, CI-specific retries, and HTML reporting. Store sensitive data like auth tokens using your CI platform's secrets, never in the repo.
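For example, a token stored in your CI platform's secret store can be surfaced as an environment variable and read at config time. A minimal sketch, assuming a hypothetical secret named API_TOKEN:

```typescript
// Read a hypothetical API_TOKEN injected by the CI platform's secret store.
// Locally, a .env file (never committed) can provide the same variable.
const apiToken = process.env.API_TOKEN ?? '';

// Spread into use.extraHTTPHeaders in playwright.config.ts so every request
// carries the credential without it ever touching the repository.
export const authHeaders = {
  Authorization: `Bearer ${apiToken}`,
};
```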
Improvement: The Docker Container
Environment consistency is the antidote to "but it works on my machine." Playwright provides official Docker images with all browsers pre-installed. Create this Dockerfile in your project root:
# Official Playwright image; pin the tag to the @playwright/test version in package.json
FROM mcr.microsoft.com/playwright:v1.58.2-noble
# Set working directory
WORKDIR /app
# Copy package files first for better layer caching
COPY package.json package-lock.json ./
# Install dependencies
RUN npm ci
# Copy the rest of the application and tests
COPY . .
# Browsers are already baked into the image; this is a fast no-op when versions
# match, and catches a mismatch between the image tag and the lockfile
RUN npx playwright install --with-deps chromium
# Command to run tests (overridden in CI)
CMD ["npx", "playwright", "test"]
Build and test it locally:
docker build -t playwright-tests .
docker run --rm playwright-tests
You should see your tests execute in a pristine environment. The official image bundles the browser binaries and every system library they depend on. No more browsers failing to launch because of a runner's missing libgtk.
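One complementary tweak: a .dockerignore keeps node_modules, old reports, and git history out of the build context, so the COPY . . layer stays small and cache-friendly. A minimal sketch; extend it for your project:

```
node_modules
playwright-report
test-results
.git
```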
Production-Hardening: Parallelization and Caching
A single CI job running tests sequentially leaves performance on the table. We'll use two techniques: workers (parallel processes on one machine) and sharding (splitting the suite across multiple jobs).
First, update your config to use more workers in CI. Change the workers line:
workers: process.env.CI ? 4 : undefined,
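If your runners vary in size, you can derive the worker count from the machine instead of hard-coding it. A sketch; the cap of 4 is an assumption tuned for typical 2-4 core hosted runners:

```typescript
import * as os from 'os';

// One worker per CPU core, capped to avoid memory pressure on small runners,
// and floored at 1 so the value is always usable.
export const ciWorkers = Math.max(1, Math.min(os.cpus().length, 4));

// In playwright.config.ts: workers: process.env.CI ? ciWorkers : undefined
```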
But for large suites, you need sharding. Here's a GitHub Actions workflow that splits your tests across 3 parallel jobs, caches dependencies, and merges the reports:
name: Playwright Tests

on: [push]

jobs:
  playwright-tests:
    name: 'Playwright (Shard ${{ matrix.shard }}/${{ matrix.total }})'
    runs-on: ubuntu-latest
    strategy:
      fail-fast: false
      matrix:
        total: [3]
        shard: [1, 2, 3]
    steps:
      - name: Checkout
        uses: actions/checkout@v4

      - name: Set up Docker Buildx
        uses: docker/setup-buildx-action@v3

      - name: Cache Docker layers
        uses: actions/cache@v4
        with:
          path: /tmp/.buildx-cache
          key: playwright-docker-${{ runner.os }}-${{ hashFiles('package-lock.json') }}
          restore-keys: |
            playwright-docker-${{ runner.os }}-

      # buildx must be told to read and write the cached layers; a plain
      # `docker build` would silently ignore the cache directory.
      - name: Build Docker image
        run: |
          docker buildx build \
            --cache-from type=local,src=/tmp/.buildx-cache \
            --cache-to type=local,dest=/tmp/.buildx-cache,mode=max \
            --load \
            -t playwright-tests .

      # The blob reporter produces a per-shard archive that merge-reports can
      # combine later; HTML reports cannot be merged across shards.
      - name: Run sharded tests
        run: |
          docker run --rm \
            -v $(pwd)/blob-report:/app/blob-report \
            -v $(pwd)/test-results:/app/test-results \
            -e CI=true \
            playwright-tests npx playwright test \
            --reporter blob \
            --shard=${{ matrix.shard }}/${{ matrix.total }}

      - name: Upload test results
        uses: actions/upload-artifact@v4
        if: always()
        with:
          name: shard-${{ matrix.shard }}-results
          path: |
            blob-report/
            test-results/

  merge-reports:
    name: Merge reports
    needs: [playwright-tests]
    runs-on: ubuntu-latest
    if: always()
    steps:
      - name: Checkout
        uses: actions/checkout@v4

      # merge-reports is part of @playwright/test, so dependencies are needed
      - name: Install dependencies
        run: npm ci

      - name: Download all shard results
        uses: actions/download-artifact@v4
        with:
          path: all-results
          pattern: shard-*-results
          merge-multiple: true

      - name: Merge blob reports into HTML
        run: npx playwright merge-reports --reporter html ./all-results/blob-report

      - name: Upload merged report
        uses: actions/upload-artifact@v4
        with:
          name: playwright-merged-report
          path: playwright-report/
This workflow demonstrates several optimizations. Docker layer caching cuts build times from minutes to seconds on subsequent runs. Setting fail-fast: false ensures all shards complete even if one fails. The separate merge-reports job consumes the results from each shard and produces a unified HTML report you can browse.
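Conceptually, a shard takes a deterministic slice of the suite. A minimal TypeScript sketch of a round-robin split; note this is only an illustration, since Playwright's actual partitioning balances groups of tests by file:

```typescript
// Illustrative round-robin split: shard k of n takes every n-th item,
// starting at index k-1, so the shards are disjoint and cover everything.
function shardOf<T>(items: T[], shard: number, total: number): T[] {
  return items.filter((_, i) => i % total === shard - 1);
}

// Three shards of five test files cover each file exactly once.
const files = ['a.spec.ts', 'b.spec.ts', 'c.spec.ts', 'd.spec.ts', 'e.spec.ts'];
const shard1 = shardOf(files, 1, 3); // a.spec.ts and d.spec.ts
```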
For authenticated test runs, inject tokens or service-account keys at runtime from your CI platform's secrets, for example as environment variables on docker run. Avoid passing secrets as Docker build arguments: build args are recorded in the image history and can be recovered from the image. Never commit them.
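In GitHub Actions, that means injecting the secret as an environment variable when the container starts. A sketch, assuming a hypothetical secret named API_TOKEN; adapt the step to your workflow:

```yaml
- name: Run tests with a runtime secret
  run: |
    docker run --rm \
      -e CI=true \
      -e API_TOKEN=${{ secrets.API_TOKEN }} \
      playwright-tests
```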
What to Measure Next
Your pipeline is now stable and fast. Prove it. Track these metrics over the next 10 deployments:
- Total suite duration: From pipeline start to merged report. Aim for a 70% reduction.
- Failure rate: Tests failing in CI but passing locally. Should approach 0%.
- Flakiness score: Percentage of tests that pass on retry without code changes. Target <2%.
- Artifact size: Screenshots, traces, videos. Set alerts if they balloon.
- Cost: CI minutes consumed. Parallelization often reduces total compute time.
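The flakiness score is straightforward to compute from retry data. A hedged sketch with a hypothetical RunRecord shape; adapt the fields to whatever your reporter actually emits:

```typescript
// Hypothetical shape: one record per test per CI run.
interface RunRecord {
  testId: string;
  passedFirstTry: boolean;
  passedOnRetry: boolean; // failed first, then passed on a retry
}

// Percentage of tests that only passed after a retry — the flaky ones.
function flakinessScore(records: RunRecord[]): number {
  if (records.length === 0) return 0;
  const flaky = records.filter((r) => !r.passedFirstTry && r.passedOnRetry).length;
  return (flaky / records.length) * 100;
}
```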
When you see a test fail, you'll have a Docker image to replicate it exactly, traces to replay the interaction, and the confidence that it's a real bug—not a CI ghost.