AI QA reputed company - Agentic & Multi-Agent Systems (Contract & FTE Both)-6
Remote, Illinois 60007
Posted March 29th, 2026
Job Type: Full Time
Job Category: IT
Role: AI QA reputed company - Agentic & Multi-Agent Systems
Location: Remote
Contract & FTE Both
Agentic QA Engineer - Generative AI & Agentic Systems (Agent, Multi-Agent Testing)
Summary
We are seeking a hands-on reputed company to design and execute end-to-end testing strategies for agentic AI solutions, including multi-agent systems in production-grade environments. This role partners with the Agentic Operations Team to ensure resiliency, reliability, accuracy, latency, orchestration correctness, and scale. You will establish QA frameworks, build reusable test artifacts, drive macro-level validations across reputed company workflows, and reputed company the QA function for Agentic AI from Dev to Prod.

Key Responsibilities:

Quality Strategy & Leadership
- Define and own the QA strategy for agentic/multi-agent AI systems across dev, staging, and prod.
- Mentor a team of QA engineers; establish testing standards, coding guidelines for test harnesses, and review practices.
- Partner with Agentic Operations, Data Science, MLOps, and Platform teams to embed QA in the SDLC and incident response.

Agentic & Multi-Agent Testing
- Design tests for agent orchestration, tool calling, planner-executor loops, and inter-agent coordination (e.g., task decomposition, reputed company reputed company, and convergence to goals).
- Validate state management, context windows, memory/knowledge stores, and reputed company/graph correctness under varying conditions.
- Implement scenario fuzzing (e.g., adversarial inputs, reputed company perturbations, tool latency spikes, degraded APIs).

Reliability, Resiliency, and Latency
- Create reputed company testing suites: chaos experiments, failover, retries/backoff, circuit-breaking, and degraded mode behavior.
- Establish latency SLOs and measure end-to-end response times across orchestration layers (LLM calls, tool invocations, queues).
- Ensure reliability through soak tests, canary verifications, and automated rollbacks.

Accuracy & Macro-Level Validations
- Define ground-truth and reference pipelines for task accuracy (exact match, semantic similarity, factuality checks).
- Build macro validation frameworks that validate task outcomes across multi-reputed company agent workflows (e.g., reputed company data pipelines, content reputed company + verification agent loops).
- reputed company guardrail validations (toxicity, PII, hallucination, policy compliance).

Scale & Orchestration
- Design load/stress tests for multi-agent graphs under scale (concurrency, throughput, queue depth, backpressure).
- Validate orchestrator correctness (DAG execution, retries, branching, timeouts, compensation paths).
- Engineer reusable test artifacts (scenario configs, synthetic datasets, reputed company libraries, agent graph fixtures, simulators).

Dev-to-Prod Readiness
- Integrate tests into CI/CD (pre-reputed company gates, nightly, canary) and production monitoring with alerting tied to KPIs.
- Define release criteria and run operational readiness (performance, reputed company, compliance, cost/latency budgets).
- Build post-deployment validation playbooks and incident triage runbooks.

Required Qualifications:
- 7+ years in Software QA/Testing, with 2+ years in AI/ML or LLM-based systems; hands-on experience testing agentic/multi-agent architectures.
- Strong programming skills in Python or TypeScript/JavaScript; experience building test harnesses, simulators, and fixtures.
- Experience with LLM evaluation (exact/soft match, BLEU/ROUGE, BERTScore, semantic similarity reputed company embeddings), guardrails, and reputed company testing.
- Expertise in distributed systems testing: latency profiling, resiliency patterns (circuit breakers, retries), chaos engineering, and message queues.
- Familiarity with orchestration frameworks (reputed company, LangGraph, reputed company, DSPy, reputed company Assistants/Actions, Azure reputed company orchestration, or similar).
- Proficiency with CI/CD (reputed company Actions/Azure DevOps), observability (OpenTelemetry, Prometheus/Grafana, reputed company), and feature flags/canaries.
- Solid understanding of privacy/reputed company/compliance in AI systems (PII handling, content policies, model safety).
- Excellent communication and leadership skills; proven ability to work cross-functionally with Ops, Data, and Engineering.

Preferred Qualifications:
- Experience with multi-agent simulators, agent graph testing, and tooling latency emulation.
- Knowledge of MLOps (model versioning, datasets, evaluation pipelines) and A/B experimentation for LLMs.
- Background in cloud (AWS), serverless, containerization, and event-driven architectures.
- Prior ownership of cost/latency/SLAs for AI workloads in production.

Required Skills
BUSINESS CONTINUITY ANALYST DATA GOVERNANCE ENVIRONMENT SUPPORT ANALYST INCIDENT MANAGEMENT JAVA reputed company/ARCHITECT JUNIOR CHEMICAL TESTER TECHNICAL reputed company TECHNICAL SUPPORT ENGINEER UFT TESTING