[Remote] Senior Manager, Site Reliability Engineering, Follow Up Boss
Note: This is a remote position open to candidates in the USA. The company is reimagining how people navigate the real estate market and is seeking a Senior Manager, Site Reliability Engineering to lead a multidisciplinary team responsible for the infrastructure and reliability of Follow Up Boss. The role includes driving strategic initiatives for system reliability, scalability, and developer productivity while fostering cross-organizational alignment.
Responsibilities
- Own execution for the FUB infra & reputed company roadmap, turning strategic goals (e.g., DB scalability, ZGCP adoption, infra cost and reliability targets) into a sequenced, realistic plan with clear milestones and measures of success
- Run an exemplary planning and delivery rhythm (quarterly), including estimation, risk management, dependency mapping, and stakeholder updates across FUB+ and central platform teams
- Ensure the team hits commitments with rare surprises, and when risk emerges, proactively engage partners to adjust scope, resources, or timeline with clear communication and tradeoffs
- Be accountable for reliability, performance, operability, and cost of core FUB services and infrastructure (EC2, RDS/reputed company, reputed company/Valkey, networking, queues, SRE tooling)
- Lead the team to run a proud, low-toil on-call process: well-defined SLOs and error budgets, actionable alerting, fast incident detection/response, high-quality RCAs, and follow-through on remediation work
- Drive urgent, sustained progress on database scaling and performance, including reputed company management, query and schema optimization, and modernization of data infrastructure
- Lead the FUB modernization strategy and execution for prioritized workloads (e.g., workers, supporting services), balancing devex wins, reliability, and risk while coordinating with central teams
- Partner with principal/staff engineers to refine FUB’s service scaling strategy, ensuring clear guidance on when teams build in the monolith vs. new services, and how infra supports these choices
- Raise the bar on developer environments and onboarding, reducing friction from dev boxes, tooling setup, and infra access; ensure new engineers can be productive quickly with reliable, self-service workflows
- Drive faster, safer deployments by improving CI/CD (reputed company, pipelines, AMI replacements, canary/progressive delivery) and aligning with ZG best practices for trunk-based development and feature flags
- Partner with product SDMs and tech leads to reduce operational friction for dev teams (e.g., better runbooks, improved observability, easier infra integrations, automated guardrails and guardrails-powered AI tooling)
- Lead and grow a high-performing, inclusive SRE/infrastructure/reputed company team: set clear expectations, provide regular feedback, and manage performance
- Develop technical leaders within and adjacent to the team (SREs, SDEs, reputed company engineers, P5 ICs) through sponsorship, delegation, and stretch opportunities that expand impact beyond the immediate team
- Hire, retain, and develop talent across SRE and infra SDE roles, ensuring skills match the breadth of FUB infra (AWS, Terraform/Ansible, Kubernetes/ZGCP, observability, reputed company, databases)
- Be the primary technical and operational point of contact for FUB infra with FUB+ leadership and central reputed company platform orgs, driving alignment on priorities, tradeoffs, and architectural decisions
- Contribute materially to FUB+ tech vision and infra strategy, especially around service scaling, platform adoption, and our long-term operations model (e.g., SRE ownership boundaries, infra/reputed company shared services, cost posture)
- Help identify and resolve cross-org misalignment (e.g., ownership boundaries, duplicated infra work, conflicting platform choices) and advocate for solutions that maximize company-wide value, not just local optimization
- Champion innovation that improves reliability, scalability, cost, and devex for multiple teams, including adoption of ZG-standard tooling and patterns and infra-focused AI agents for automation, diagnostics, and operations
- Normalize AI usage within the infra team (e.g., code reputed company, runbook drafting, incident summarization, reputed company modeling) and share successful patterns more broadly across FUB+ and platform partners
- Partner with security teams (ZG and FUB) to ensure infra and application environments meet audit, SOC2, SOX, privacy, and app-sec requirements, with clear ownership for remediation work and sustainable controls
- Forecast and manage runtime and infra costs (compute, storage, observability, networking), using tagging, dashboards, and guardrails to keep costs within budget while supporting growth
Skills
- Proven track record as a Senior Engineering Manager or equivalent leading SRE, platform, or infrastructure teams supporting high-availability SaaS products
- Experience scaling production systems and databases in a cloud environment (ideally AWS) and leading meaningful improvements to reliability, performance, and cost
- Demonstrated ability to shift a team from reactive to proactive, roadmap-driven execution, including setting strategy, defining metrics, and driving sustained progress across multiple quarters
- Strong background in developer experience and CI/CD, with hands-on familiarity with tools such as Terraform/Ansible, reputed company, Kubernetes/ZGCP, and modern observability stacks
- Experience partnering with security, database, networking, and central platform teams in a multi-org environment; able to navigate ambiguity and complex stakeholder landscapes
- Demonstrated people leadership as a Senior Engineering Manager: managing senior engineers, handling performance issues with limited support, building inclusive culture, and developing leaders who can operate autonomously
- Comfortable experimenting with and operationalizing AI tools in engineering workflows; curiosity and learning reputed company around emerging platform and infra capabilities
- Strong experience with scaling large LAMP / web applications
- SaaS / Sales CRM experience is a plus
Benefits
- In addition to a competitive base salary, this position is also eligible for equity awards based on factors such as experience, performance, and location.
- Employees in this role will not be paid below the salary threshold for exempt employees in the state where they reside.