Operations Playbook

This playbook is for support and operations teams handling ECTS runtime issues.

A. Daily Health Checks

1. Access checks

  1. Open https://ects.keshiholdings.com/login.
  2. Confirm that the Web Portal tile and the Keycloak handoff are reachable.
  3. Log in with a test ops account and verify that the top-level modules load.

2. Functional smoke checks

Validate at least one page in each major domain:

  • Cargo: /cargo/unassigned-cargo
  • Journey: /journey/active
  • Routes: /routes/list
  • Inventory: /inventories/inventory-summary
  • Alerts: /alerts/open-alerts
  • Devices: /device/device-list
  • Reports: /reports
  • Settings: /settings/users

3. API checks

Validate representative backend endpoints:

  • /api/session
  • /api/server
  • /api/statistics
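
The API checks above can be scripted. The following is a minimal Python sketch, not official tooling. It assumes that the API is served from the same host as the login page and that it accepts the Keycloak access token as a Bearer header; neither is confirmed by this playbook, and the function names are illustrative.

```python
import urllib.error
import urllib.request

# Assumption: the API shares the host of the login URL in section A.1.
BASE_URL = "https://ects.keshiholdings.com"
ENDPOINTS = ["/api/session", "/api/server", "/api/statistics"]


def build_request(base_url, path, token):
    """Build a GET request that carries the access token as a Bearer header."""
    url = base_url.rstrip("/") + path
    return urllib.request.Request(url, headers={"Authorization": f"Bearer {token}"})


def check_endpoint(base_url, path, token, timeout=10):
    """Return (path, status), where status is an HTTP code or an error string."""
    req = build_request(base_url, path, token)
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return path, resp.status
    except urllib.error.HTTPError as exc:
        # 4xx/5xx responses still tell us the endpoint is reachable.
        return path, exc.code
    except urllib.error.URLError as exc:
        return path, f"unreachable: {exc.reason}"


def run_api_checks(token, base_url=BASE_URL):
    """Check every representative endpoint and return the list of results."""
    return [check_endpoint(base_url, path, token) for path in ENDPOINTS]
```

Calling run_api_checks(token) with a token from the test ops account returns one (path, status) pair per endpoint. If the backend relies on a session cookie rather than a Bearer header, build_request would need to send that cookie instead.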

B. Incident Patterns and Actions

1. Login works but modules are missing

Likely causes:

  • role claim missing from the token (ects_view_*, ects_edit_*)
  • route guard in the frontend SessionManager expects a different role
  • backend @RolesAllowed annotation names a different role

Actions:

  1. Decode the access token and list its role claims.
  2. Confirm the role required for the route in the frontend route config.
  3. Confirm the @RolesAllowed annotation on the failing backend endpoint.
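
Decoding the token claims can be done locally without sending the token to a third-party site. The sketch below decodes the JWT payload (without verifying the signature) and lists roles with the ects_view_ and ects_edit_ prefixes. It assumes the roles sit in Keycloak's default realm_access.roles and resource_access.<client>.roles claims; this deployment may map them elsewhere, so treat the claim paths as an assumption.

```python
import base64
import json


def decode_claims(token):
    """Decode the payload segment of a JWT. The signature is NOT verified."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("expected a JWT with three dot-separated segments")
    payload = parts[1]
    payload += "=" * (-len(payload) % 4)  # restore stripped base64url padding
    return json.loads(base64.urlsafe_b64decode(payload))


def ects_roles(claims):
    """Collect role names that start with ects_view_ or ects_edit_.

    Assumption: roles live under Keycloak's default claim layout.
    """
    roles = set(claims.get("realm_access", {}).get("roles", []))
    for client in claims.get("resource_access", {}).values():
        roles.update(client.get("roles", []))
    return sorted(r for r in roles if r.startswith(("ects_view_", "ects_edit_")))
```

An empty result from ects_roles(decode_claims(token)) for a user who should see modules points to the first likely cause; a non-empty result shifts attention to the frontend route guard or the backend annotation.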

2. Unauthorized/forbidden API responses (HTTP 401/403)

Likely causes:

  • expired or invalid token
  • OIDC issuer/client mismatch
  • auth config drift in backend

Actions:

  1. Validate token exp, issuer, audience, and client mapping.
  2. Verify Keycloak client and realm settings.
  3. Recheck backend OpenID/auth settings.
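
The first action can be sketched as a small check over already-decoded token claims (a dict). The claim names exp, iss, aud, and azp are standard JWT/OIDC claims; the expected issuer, audience, and client values are deployment-specific and must be supplied by the operator. The function name is illustrative.

```python
import time


def token_problems(claims, expected_issuer, expected_audience, expected_client, now=None):
    """Return a list of human-readable problems found in decoded token claims."""
    now = time.time() if now is None else now
    problems = []

    exp = claims.get("exp")
    if exp is None:
        problems.append("missing exp claim")
    elif exp <= now:
        problems.append(f"token expired {int(now - exp)}s ago")

    if claims.get("iss") != expected_issuer:
        problems.append(f"issuer mismatch: {claims.get('iss')!r}")

    # aud may be a single string or a list of strings.
    aud = claims.get("aud", [])
    audiences = [aud] if isinstance(aud, str) else list(aud)
    if expected_audience not in audiences:
        problems.append(f"audience mismatch: {audiences!r}")

    # azp identifies the client the token was issued to.
    if claims.get("azp") != expected_client:
        problems.append(f"client mismatch: {claims.get('azp')!r}")

    return problems
```

An empty list means the token itself looks consistent, which shifts suspicion toward the Keycloak client and realm settings or the backend auth configuration.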

3. Forms and drafts fail to save

Likely causes:

  • draft endpoint permission issue
  • validation failure on payload
  • frontend/backend payload drift

Actions:

  1. Capture failing request payload from browser network panel.
  2. Replay request against API with same token.
  3. Compare payload contract with backend DTO and validators.
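
For the comparison in step 3, a quick field-level diff often exposes drift before anyone reads validator code. The sketch below compares the top-level keys of a captured payload with the field names the backend DTO expects. The expected field list has to be copied from the DTO by hand; nested objects and validator rules are not checked, and the function name is illustrative.

```python
def payload_drift(payload, expected_fields):
    """Compare captured payload keys with the field names the DTO expects."""
    sent = set(payload)
    expected = set(expected_fields)
    return {
        # Fields the backend expects but the frontend did not send.
        "missing": sorted(expected - sent),
        # Fields the frontend sent that the DTO does not declare.
        "unexpected": sorted(sent - expected),
        # Expected fields that were sent as null, a common validation failure.
        "null_valued": sorted(k for k in sent & expected if payload[k] is None),
    }
```

A payload parsed with json.loads from the browser network panel can be passed in directly. Non-empty "missing" or "unexpected" lists suggest frontend/backend payload drift; a non-empty "null_valued" list suggests a validation failure.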

C. Live Validation Notes (March 2, 2026)

During a live capture session with a super-admin account for documentation, the following routes showed runtime errors:

  • /routes/route-devices -> frontend exception (undefined ... map)
  • /settings/server -> frontend exception (undefined ... speedUnit)

Recommended handling:

  1. Treat both as product defects.
  2. Raise bug tickets with route path, console stack trace, and screenshot artifacts.
  3. Keep these routes out of operator SOPs until fixed.

D. Release Readiness Checklist

  • Auth flow validated against target Keycloak realm
  • Role matrix smoke-tested with representative personas
  • Critical cargo/journey/route/inventory flows verified
  • Migration and DB connectivity confirmed
  • Alerts and websocket updates confirmed
  • Frontend route-level error scan completed

E. Escalation Packet Template

When escalating to engineering, include:

  • exact environment and timestamp
  • user role claims (redacted token)
  • failing route and API endpoint
  • screenshot and console stack trace
  • network request/response payload excerpts
  • expected behavior vs observed behavior
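
To keep packets consistent, the template can be captured as a small data structure. This is an illustrative sketch only: the class, field names, and output labels are assumptions, not an existing ECTS tool. redact_token drops the JWT signature so the token cannot be replayed; the payload may still contain personal data, so review it before attaching.

```python
from dataclasses import dataclass


def redact_token(token):
    """Keep header and payload for claim inspection; replace the signature."""
    parts = token.split(".")
    if len(parts) != 3:
        return "<redacted>"
    return ".".join(parts[:2] + ["<signature-redacted>"])


@dataclass
class EscalationPacket:
    environment: str
    timestamp: str
    role_claims: list
    route: str
    api_endpoint: str
    artifacts: list  # paths to the screenshot and console stack trace
    payload_excerpt: str
    expected: str
    observed: str

    def render(self):
        """Render the packet as plain text, one template item per line."""
        return "\n".join([
            f"Environment: {self.environment}",
            f"Timestamp: {self.timestamp}",
            f"Role claims: {', '.join(self.role_claims)}",
            f"Failing route: {self.route}",
            f"API endpoint: {self.api_endpoint}",
            f"Artifacts: {', '.join(self.artifacts)}",
            f"Payload excerpt: {self.payload_excerpt}",
            f"Expected: {self.expected}",
            f"Observed: {self.observed}",
        ])
```

Because every field is required, constructing a packet fails early when an item from the template is missing.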