Test Grouping in Playwright: Organizing Your Test Suite

Playwright has become a favorite for end-to-end (E2E) testing, and for good reason. It's fast, reliable, and works easily across Chromium, Firefox, and WebKit. Plus, its built-in test runner takes the hassle out of writing, executing, and reporting tests—something every QA engineer, SDET, and developer can appreciate.

But here's the thing: your test structure matters just as much as the framework itself. In large-scale projects with hundreds (or even thousands) of test cases, a messy test suite leads to slow debugging, unclear reports, and inefficient CI/CD pipelines. A well-structured one? It makes everything faster, cleaner, and far easier to maintain.

In this post, we'll break down how to structure Playwright tests, from projects and suites to test cases and steps, with illustrative examples modeled on Amazon and Gmail. We'll also compare Playwright with Selenium and Cypress.

So let's jump in!

The Fundamentals of Playwright Test Hierarchy

Playwright's test runner uses a hierarchical model to organize your end-to-end tests. At the top level, tests are grouped into projects, which in turn contain suites (files or describe blocks). Inside these suites are test cases, and each test can also contain steps for more granular reporting.

1. Projects

A Project in Playwright represents a specific configuration for your tests. For instance, you can have a project for Chromium, one for Firefox, and one for WebKit—each with unique setup options (e.g., viewport sizes or base URLs). This is particularly beneficial when you need cross-browser testing or different environments (such as staging vs. production).

You define these projects in playwright.config.ts or playwright.config.js.

Here's a simple example in TypeScript:

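// playwright.config.ts
import { defineConfig } from "@playwright/test";

export default defineConfig({
  testDir: "tests",

  // Each entry is one project; every test file runs once per project.
  projects: [
    {
      name: "chromium",
      use: { browserName: "chromium", headless: true },
    },
    {
      name: "firefox",
      use: { browserName: "firefox", headless: true },
    },
    // Add more projects if needed
  ],
});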

When you run your tests (e.g., npx playwright test), Playwright automatically creates one Project Suite for each defined project. If you have two projects, your tests run in both contexts, often in parallel. This built-in capability significantly eases cross-browser coverage and speeds up test feedback.
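
Projects are not limited to browsers. As a rough sketch of the staging vs. production idea mentioned earlier, a config might look something like the following. The project names and both base URLs are hypothetical placeholders, and the viewport size is arbitrary; devices is Playwright's built-in registry of device descriptors.

```typescript
// playwright.config.ts (illustrative sketch; URLs and project names are assumptions)
import { defineConfig, devices } from "@playwright/test";

export default defineConfig({
  testDir: "tests",
  projects: [
    {
      name: "staging-chromium",
      use: {
        ...devices["Desktop Chrome"],
        baseURL: "https://staging.example.com", // hypothetical staging host
      },
    },
    {
      name: "production-webkit",
      use: {
        ...devices["Desktop Safari"],
        baseURL: "https://www.example.com", // hypothetical production host
        viewport: { width: 1280, height: 720 }, // arbitrary size for illustration
      },
    },
  ],
});
```

With a config like this, npx playwright test --project=staging-chromium would run only that project, while a plain npx playwright test would run both.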

2. Suites (Files and Describe Blocks)

A Suite is a collection of test cases. Playwright defines suites in two primary ways:

Each test file becomes its own suite by default.
You can explicitly create nested suites within a file using test.describe().

When you have multiple projects, each test file (and its internal describe blocks) is repeated per project. This means if you have five test files and three projects, you effectively have 15 suites (five file-based suites per project).
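
To make the arithmetic concrete, here is a small standalone sketch that enumerates the file-level suites for five files and three projects. It does not use Playwright at all; the function name and the file and project names are made up for illustration.

```typescript
// Illustrative only: list the file-level suites created for each project.
// File and project names are assumptions for this example.
function fileSuites(files: string[], projects: string[]): string[] {
  const suites: string[] = [];
  for (const project of projects) {
    for (const file of files) {
      suites.push(`[${project}] ${file}`);
    }
  }
  return suites;
}

const files = [
  "search.spec.ts",
  "cart.spec.ts",
  "checkout.spec.ts",
  "payments.spec.ts",
  "account.spec.ts",
];
const projects = ["chromium", "firefox", "webkit"];

const suites = fileSuites(files, projects);
console.log(suites.length); // 15
console.log(suites[0]); // [chromium] search.spec.ts
```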

File-based suites generally group tests around a theme—commonly a feature or functionality (e.g., search.spec.ts for a search feature). By splitting features into multiple files, you allow parallel execution (since Playwright can run different files concurrently). Within each file, you can further categorize tests with test.describe(). For example:

import { test, expect } from "@playwright/test";
 
test.describe("Shopping Cart", () => {
  test("Add item to cart", async ({ page }) => {
    // ...
  });
 
  test("Remove item from cart", async ({ page }) => {
    // ...
  });
});

This grouping helps structure your test logic, clarifies reports, and allows you to run or skip entire blocks easily (e.g., npx playwright test -g "Shopping Cart").
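
Describe blocks can also be nested, and a whole block can be skipped at once. The sketch below shows both; the suite and test names, including the gift wrapping feature, are hypothetical.

```typescript
import { test } from "@playwright/test";

test.describe("Shopping Cart", () => {
  // Nested suite: shows up under "Shopping Cart" in reports.
  test.describe("Quantity updates", () => {
    test("Increase quantity", async ({ page }) => {
      // ...
    });
  });

  // Every test inside a skipped describe block is reported as skipped.
  test.describe.skip("Gift wrapping (hypothetical feature)", () => {
    test("Add gift wrap to an item", async ({ page }) => {
      // ...
    });
  });
});
```

Because -g matches against the full title, including describe names, npx playwright test -g "Quantity updates" would select only the nested block.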

3. Test Cases

At the lowest level are test cases themselves, written with test(<name>, <callback>). Each test case focuses on a single scenario—e.g., "Login with valid credentials," "Search product and add to cart," or "Show error for invalid payment details."

test("Should add item to cart", async ({ page }) => {
  await page.goto("https://example.com/");
 
  // ...
 
  await expect(page.locator("#cart-count")).toHaveText("1");
});

Within a file or describe block, you can have multiple test cases. By default, tests in the same file run sequentially in a single worker, while tests in different files run in parallel (across multiple workers). This separation promotes more efficient use of CPU resources, especially in continuous integration (CI) environments.
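
If the tests in a file are fully independent, you can opt in to running them in parallel as well. A minimal sketch, with made-up test names:

```typescript
import { test } from "@playwright/test";

// Opt in to running the tests of this file in parallel.
// Only safe when the tests do not depend on each other's state.
test.describe.configure({ mode: "parallel" });

test("Search by keyword", async ({ page }) => {
  // ...
});

test("Search by category", async ({ page }) => {
  // ...
});
```

The same behavior can be enabled for the whole suite by setting fullyParallel: true in the config, at the cost of losing any implicit ordering between tests in a file.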

4. Test Steps

Playwright also provides test steps to annotate specific sections within a single test. You can wrap parts of your test in test.step("Description", async () => { ... }), which gives reports finer granularity when a test fails or when you want to follow its progress step by step:

test("Full checkout flow", async ({ page }) => {
  await test.step("Login as user", async () => {
    // ...
  });
 
  await test.step("Add items to cart", async () => {
    // ...
  });
 
  await test.step("Proceed to checkout", async () => {
    // ...
  });
});

Though optional, steps can be indispensable for complex tests, because they reveal precisely which segment fails if the test breaks.

How Large-Scale Teams Organize Their Tests: Real-World Playwright Test Grouping

Organizing Playwright tests well is the difference between a manageable test suite and a debugging nightmare. In large applications, tests pile up fast—if they aren't structured properly, maintenance becomes painful, debugging takes longer, and CI/CD pipelines slow down.

To illustrate how test grouping works in real-world scenarios, let's look at two different approaches:

Feature-based test organization (Amazon style)
Workflow-based test organization (Gmail style)

These aren't actual test structures from Amazon or Gmail, but they reflect the best practices teams use when testing complex applications.

Amazon.com (Feature-Based Test Organization)

Amazon's website has many features and user workflows. A sensible way to organize Playwright tests for an application like Amazon is by major feature or user journey. For instance, you might create separate test files (suites) for:

Search Functionality – tests for searching products, applying filters, etc.
Shopping Cart – tests for adding/removing items in the cart, cart view updates.
Checkout Process – tests for the checkout workflow (entering address, choosing shipping, etc.).
Payments – tests for payment methods, handling payment errors, etc.
User Account – tests for login, logout, profile updates, if applicable.
...and so on (each feature or module gets its own suite).

Your project structure could look like this (feature-based grouping):

tests/
└── amazon/
    ├── search.spec.ts
    ├── cart.spec.ts
    ├── checkout.spec.ts
    └── payments.spec.ts

Each of these files would contain the test cases relevant to that feature. Inside, you might further group tests using test.describe(). For example, in checkout.spec.ts, you could group tests by different scenarios of checkout:

// tests/amazon/checkout.spec.ts
 
import { test, expect } from "@playwright/test";
 
test.describe("Checkout Process", () => {
  test("guest checkout with credit card", async ({ page }) => {
    // ... steps for guest checkout using a credit card ...
  });
 
  test("authenticated checkout with saved address", async ({ page }) => {
    // ... steps for logged-in user checkout ...
  });
 
  test("should show error for invalid payment details", async ({ page }) => {
    // ... steps to trigger a payment error and verify message ...
  });
});

In this snippet, all three test cases belong to the "Checkout Process" suite. We could nest describes further if needed (for example, to group all payment-related checks), but this level is often sufficient. Structuring tests this way lets one team work on the search tests while another works on checkout without stepping on each other's toes, and you can run just amazon/cart.spec.ts when a bug is reported in the cart. If you use Playwright projects for different browsers, each file suite also runs under each browser project, so the checkout process (for example) is verified on Chromium, Firefox, and WebKit in one go.

Gmail.com (Workflow-Based Test Organization)

For a webmail service like Gmail, you would similarly organize tests by feature areas. Key workflows could include:

Login/Authentication – tests for successful login, logout, incorrect password, two-factor auth, etc.
Email Composition – tests for composing an email, adding recipients, sending, drafts autosave.
Attachments – tests for attaching files, attachment size limits, downloading attachments.
Settings – tests for user settings like theme change, filters, forwarding, etc.
Mailbox Operations – (optional) tests for labeling emails, archiving, deleting, searching emails.

Your test files might be structured as:

tests/
└── gmail/
    ├── login.spec.ts
    ├── compose.spec.ts
    ├── attachments.spec.ts
    └── settings.spec.ts

Inside login.spec.ts, for example, you could have:

// tests/gmail/login.spec.ts
 
import { test, expect } from "@playwright/test";
 
test.describe("Gmail Login", () => {
  test("should log in with valid credentials", async ({ page }) => {
    await page.goto("https://mail.google.com/");
 
    // ... perform login with valid user ...
 
    await expect(page).toHaveURL(/mail\.google\.com\/mail\/u\/0/); // inbox loaded
  });
 
  test("should show an error for wrong password", async ({ page }) => {
    await page.goto("https://mail.google.com/");
 
    // ... attempt login with incorrect password ...
 
    const errorMsg = page.getByText("Wrong password");
 
    await expect(errorMsg).toBeVisible();
  });
 
  test("should prevent login for unregistered email", async ({ page }) => {
    // ... attempt login with an email that doesn't exist ...
    // ... verify appropriate error is shown ...
  });
});

In this setup, all login-related test cases fall under the "Gmail Login" suite. Likewise, compose.spec.ts groups tests under "Compose Email", keeping functionality-specific tests in one place. This feature-based structuring ensures clarity and maintainability.

Large applications often refine this further with subfolders or tags. For example, compose.spec.ts might separate plain text emails from emails with attachments, making tests easy to locate and manage.

Both the Amazon and Gmail examples follow a simple rule: test grouping mirrors product features. This makes it easier for QA engineers and developers to find what they need—whether it's payments, attachments, or login—and enables targeted test execution. A developer working on Gmail's attachments can run just the attachments.spec.ts suite instead of the full test suite, streamlining debugging and validation.

Best Practices for Structuring Playwright Tests for Scalability

When implementing your test organization, consider using the Page Object Model pattern to further improve maintainability. This approach can significantly reduce the effort required to update tests when your UI changes, addressing one of the main challenges faced by test automation engineers. Having said that, there isn't a one-size-fits-all solution, but here are key factors to guide your structuring decisions:

Parallel Execution & Speed
- Playwright runs test files in parallel by default. Splitting your suite into logical files (e.g., by feature) boosts concurrency.
- If you keep too many tests in a single file, you limit parallelism. Meanwhile, if you over-split (e.g., creating a new file for every single test), you may get diminishing returns or complexity in management.
- Grouping logically by features or modules often achieves a good balance, letting you run certain features in parallel while also maintaining code clarity.
Maintainability & Scalability
- Over time, your test suite grows. You'll want an easy way to add, remove, or refactor tests without confusion.
- Feature-based organization typically scales well: if the "cart" feature changes, it's clear where updates belong.
- Consistent naming conventions for files (e.g., search.spec.ts, checkout.spec.ts) help new team members find their way around quickly.
Reusability with Page Object Model (POM)
- POM is a design pattern separating page-level locators and actions into "page object" classes. Each page or component has its own class, so test files remain concise and readable.
- Rather than repeating selectors or page actions in every test, you define them once. This synergy complements good test grouping: your test directories focus on business logic (like "login" or "checkout"), and your page objects (in a /pages/ folder) handle the underlying interactions.
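
As a rough illustration of the pattern, a page object for a login screen might look like the sketch below. The file path, class name, labels, button name, and the /login route are all assumptions made for this example, not part of any real application.

```typescript
// pages/login-page.ts (hypothetical file; labels and route are assumptions)
import type { Locator, Page } from "@playwright/test";

export class LoginPage {
  private readonly page: Page;
  private readonly email: Locator;
  private readonly password: Locator;
  private readonly submit: Locator;

  constructor(page: Page) {
    this.page = page;
    // Locators are defined once here instead of in every test.
    this.email = page.getByLabel("Email");
    this.password = page.getByLabel("Password");
    this.submit = page.getByRole("button", { name: "Sign in" });
  }

  async goto(): Promise<void> {
    // A relative path is resolved against baseURL from the config.
    await this.page.goto("/login");
  }

  async login(email: string, password: string): Promise<void> {
    await this.email.fill(email);
    await this.password.fill(password);
    await this.submit.click();
  }
}
```

A test in login.spec.ts could then create new LoginPage(page), call goto() and login(...), and keep only the assertions in the test body. If the sign-in button is renamed, only this class needs to change.
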
Setup and Teardown
- Playwright supports shared setup and teardown for a suite (test.beforeAll() and test.afterAll()) as well as per test (test.beforeEach() and test.afterEach()).
- Consider how to group tests that share a resource-heavy setup, such as logging in or seeding data. You might place them together in one file with a test.beforeAll() hook to reduce repeated overhead. But note, these tests then run sequentially in that file, potentially slowing overall concurrency if the file is large.
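
A minimal sketch of how these hooks fit together in one describe block is shown below. The suite name, the /account path, and the heading text are hypothetical, and the seeding and cleanup bodies are left as placeholders.

```typescript
import { test, expect } from "@playwright/test";

test.describe("User Account", () => {
  test.beforeAll(async () => {
    // Runs once per worker before the tests in this block, e.g. to seed data.
    // ...
  });

  test.beforeEach(async ({ page }) => {
    // Runs before every test so each one starts from a known page.
    await page.goto("/account"); // hypothetical path, resolved against baseURL
  });

  test.afterAll(async () => {
    // Clean up whatever beforeAll created.
    // ...
  });

  test("Update display name", async ({ page }) => {
    // ...
    await expect(page.getByRole("heading", { name: "Account" })).toBeVisible();
  });
});
```
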
Test-type vs. Feature-based
- Some teams separate tests by type: "smoke," "regression," or "accessibility." Others group purely by feature or domain.
- If you do test-type-based grouping, you can still rely on descriptive naming or subfolders (e.g., smoke/checkout.spec.ts, regression/checkout.spec.ts).
- Feature-based grouping often aligns well with how real-world teams and features are organized, but each project can adapt the approach best fitting their development process.
Tagging and Filtering
- Playwright supports tags: add a token such as @smoke to a test title (or, in newer versions, pass a tag option in the test details) and filter with --grep, e.g., npx playwright test --grep @smoke. You can also filter by title with CLI patterns (npx playwright test -g "Checkout") and use .only() or .skip() while debugging locally.
- Think about how you'll selectively run certain tests in your CI/CD pipeline. If you rely on dynamic filtering, consistent test naming and file structure become even more important.
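
One way to make filtering less dependent on file names is to tag tests and select them with --grep. The sketch below shows two forms; the test names and the @smoke and @regression tags are just examples.

```typescript
import { test } from "@playwright/test";

// Tag via the test details object (available in newer Playwright versions).
test("Guest checkout with credit card", { tag: "@smoke" }, async ({ page }) => {
  // ...
});

// Tag via the title, which --grep can match in any version.
test("Checkout with expired card @regression", async ({ page }) => {
  // ...
});
```

Running npx playwright test --grep @smoke would select only the first test, and npx playwright test --grep-invert @regression would exclude the second.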

How a Well-Structured Test Suite Improves Execution and Reporting

1. Parallelization & Test Speed

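File-level parallelism: Each test file can run in its own worker process. For instance, if you have 10 test files and 2 CPU cores, Playwright can spread the files across multiple workers, reducing total runtime significantly. The number of workers is configurable through the workers option in the config or the --workers flag.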
Avoiding unnecessary serialization: If you group many scenarios in one file with test.describe.serial(), you lose parallelism. Use serial execution sparingly, only for tests that truly depend on shared state.
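
For the cases where tests really do build on each other, a serial block might be sketched as follows. The suite name, test names, and /checkout path are hypothetical; the shared page is what makes these tests dependent on one another.

```typescript
import { test } from "@playwright/test";
import type { Page } from "@playwright/test";

test.describe("Checkout wizard", () => {
  // Tests in this block run one after another in the same worker;
  // if one fails, the remaining ones are skipped.
  test.describe.configure({ mode: "serial" });

  let page: Page;

  test.beforeAll(async ({ browser }) => {
    page = await browser.newPage(); // one page shared by the dependent tests
  });

  test.afterAll(async () => {
    await page.close();
  });

  test("Step 1: enter address", async () => {
    await page.goto("/checkout"); // hypothetical path, resolved against baseURL
    // ...
  });

  test("Step 2: choose shipping", async () => {
    // Continues from the state left behind by step 1.
    // ...
  });
});
```

Keeping such blocks small limits how much of the suite is forced to run sequentially.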

2. Clearer Test Reports & Debugging

Readable suite and test names: By breaking down tests into relevant files (e.g., payments.spec.ts) and giving them descriptive names, the final report clearly indicates which part of the application is failing.
Test steps: Detailed step segmentation appears in HTML and trace reports. So if a test fails at step "Add item to cart," you immediately know where to start debugging.

3. Maintaining CI/CD Efficiency

Separate or combined runs: If you've configured multiple projects for cross-browser testing, you can run them all at once or split them across multiple CI agents for even faster feedback.
Targeted runs: A well-structured suite allows you to run only the tests you need for a particular pipeline event (e.g., running a smoke testing suite on every commit, running full regression nightly). Good grouping ensures your build scripts can easily filter, skip, or batch certain test files as needed.
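
As a sketch of what targeted runs could look like in a build script, the helper below maps a pipeline event to Playwright CLI arguments. The helper itself, the event names, the tests/<feature>.spec.ts layout, and the @smoke tag are all assumptions for illustration; only the test command and the --grep flag come from Playwright.

```typescript
// Hypothetical CI helper: maps a pipeline event to Playwright CLI arguments.
// Event names, file layout, and the @smoke tag are assumptions for this sketch.
type PipelineEvent = "commit" | "nightly";

function playwrightArgs(event: PipelineEvent, feature?: string): string[] {
  const args = ["playwright", "test"];
  if (feature) {
    // Limit the run to one feature's spec file, e.g. tests/cart.spec.ts
    args.push(`tests/${feature}.spec.ts`);
  }
  if (event === "commit") {
    // Fast feedback on every commit: only tests tagged @smoke
    args.push("--grep", "@smoke");
  }
  return args;
}

console.log(["npx", ...playwrightArgs("commit")].join(" "));
// npx playwright test --grep @smoke
console.log(["npx", ...playwrightArgs("nightly", "cart")].join(" "));
// npx playwright test tests/cart.spec.ts
```

A helper like this only works when file names and tags are consistent, which is the practical payoff of the grouping conventions described above.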

Comparing Playwright, Selenium, and Cypress for Cross-Browser Testing

Playwright vs. Selenium

Selenium has been around forever—it's practically the OG of browser automation. But it doesn't come with a built-in test runner. To structure your tests properly, you need to pair it with JUnit, TestNG, or NUnit, which handle grouping and parallel execution. While this gives you flexibility, it also means more manual setup just to match what Playwright offers out of the box—automatic cross-browser execution and an integrated test runner.

With Selenium, how you structure your tests depends heavily on the third-party framework you choose. In contrast, Playwright has a clear, built-in hierarchy—projects, suites, test cases, and steps—making it much easier to organize tests efficiently without extra configuration.

Playwright vs. Cypress

Cypress is another solid choice, and it also encourages logical test structuring using spec files, describe() blocks, and test cases. However, there's a catch—Cypress runs in one browser at a time, meaning you can't run tests across Chromium, Firefox, and WebKit simultaneously like you can with Playwright.

While Cypress allows you to parallelize test specs across multiple machines, Playwright takes things further. With Playwright's project configuration, you can define one project per browser and run them all concurrently with a single command. This makes Playwright a stronger choice for large-scale, multi-browser test suites where broad coverage and execution speed matter.

At the end of the day, both Selenium and Cypress have their strengths, but if speed, scalability, and built-in structure are what you're after, Playwright makes your life a whole lot easier.

Structure test files by feature (e.g., login.spec.ts, cart.spec.ts).
Use test.describe() to group related tests within a file.
Keep files reasonably sized—too few files limit concurrency, too many become a nightmare to manage.

3. Write Focused Test Cases & Use Steps for Better Debugging

Each test case should focus on one scenario (e.g., "User can log in with valid credentials").
Use steps (test.step()) to highlight key actions—so when a test fails, you know exactly where the issue is.

4. Parallelization & Reporting = Faster, Clearer Feedback

Playwright runs test files in parallel by default—take advantage of that!
Structuring tests well makes reports more readable and debugging easier (meaning less time spent staring at logs).

5. Keep Tests Maintainable (Use the Page Object Model)

Use the Page Object Model to separate UI interactions from test logic.
Reusable components mean less duplication and easier updates when the UI changes.

6. Balance Speed & Complexity Thoughtfully

If tests share state or need to run in a set order, group them carefully (e.g., test.describe.serial()).
Don't overuse serialization—it slows things down unnecessarily.

7. Optimize for CI/CD Pipelines

Structure your tests so you can run subsets easily (e.g., only cart tests when updating that feature).
Parallel test runs across multiple browsers in one command help speed up feedback cycles.

At the end of the day, structured Playwright tests mean fewer issues, faster runs, and a test suite that scales as your product grows.

As your test suite grows, maintaining optimal structure can become time-consuming. Consider using intelligent test prioritization and autonomous testing agents to help manage your test organization more efficiently.

Happy testing!