
Headquarters: Washington, DC
URL: http://mybluebird.app
SENIOR INFRASTRUCTURE & SECURITY ENGINEER
DevOps | Site Reliability | Cloud Security
The Opportunity
We process millions of SMS and MMS messages daily across a distributed platform built on Google Cloud — Cloud Run microservices, Pub/Sub event pipelines, Spanner databases, and Memorystore for Redis. Our infrastructure auto-scales aggressively to meet campaign demand, our data pipelines handle real-time delivery tracking at high velocity, and our systems must be fast, secure, and reliable around the clock.
We’re looking for a Senior Infrastructure & Security Engineer to own the reliability, security, and operational maturity of this platform. You’ll be the first dedicated infrastructure hire, working directly with the CTO to shape the technical foundation as we scale. This isn’t a role where you’ll maintain someone else’s runbooks — you’ll define the roadmap, make architectural decisions, and build the systems that keep our platform running and our customers’ data safe.
What You’ll Own
Infrastructure as Code & Cloud Architecture
Own and evolve our Terraform-managed GCP infrastructure spanning a Shared VPC host project and multiple service projects. Design for cost efficiency, resilience, and scalability across Cloud Run, Spanner, Pub/Sub, Cloud Storage, Memorystore for Redis, and Cloud Tasks. You’ll manage environment promotion across dev, staging, and production.
Reliability & Observability
Build comprehensive monitoring, alerting, and incident response capabilities using Cloud Monitoring, Cloud Logging, and Cloud Trace. Establish SLIs and SLOs for critical message delivery paths. Reduce mean time to detection and recovery. Design health checks and auto-healing patterns for Cloud Run services processing millions of daily messages.
Cloud Security
Harden our platform across network, application, and data layers. This includes VPC firewall rules and network policies, IAM role design and service account management, secrets management via Secret Manager, Cloud Armor policies for DDoS and rate limiting, API Gateway security configurations, and dependency scanning. Lead security reviews and own incident response for security events.
CI/CD & Developer Experience
Maintain and improve our GitHub Actions-based deployment pipelines for a TypeScript monorepo deploying to Cloud Run. Ensure the engineering team can ship safely and quickly with automated testing, linting, container builds, and environment-specific deployments. Optimize build times and deployment reliability.
Performance & Auto-Scaling
Tune Cloud Run autoscaling policies including min/max instances and concurrency settings for both public-facing API services and private Pub/Sub processing workers. Optimize Spanner query performance and node allocation. Ensure our distributed rate-limiting infrastructure using Redis handles coordination across horizontally scaling instances with sub-millisecond overhead.
Compliance & Data Protection
Help establish and maintain compliance practices relevant to messaging platforms, including TCPA requirements, carrier-specific policies, data retention and encryption standards, and audit logging. Ensure our platform meets the security and data handling expectations of enterprise customers.
What We’re Looking For
Required
5+ years in infrastructure, DevOps, or SRE roles with increasing scope and ownership
Deep Google Cloud Platform experience, specifically with Cloud Run, VPC networking, IAM, and at least one managed database service
Strong Terraform skills in production — you’ve authored and maintained multi-environment, modular Terraform codebases, not just run applies
Hands-on cloud security experience: network security design (firewall rules, private networking, VPC peering), IAM policy architecture, secrets management, and vulnerability assessment
GitHub Actions proficiency — you’ve built and maintained CI/CD pipelines for containerized applications deploying to cloud infrastructure
Experience operating distributed systems that process high message or event volumes with strict latency and reliability requirements
Strong Linux fundamentals, networking knowledge (DNS, TLS, load balancing), and comfort debugging production issues across the stack
Security-first mindset — you think about attack surfaces, least privilege, encryption in transit and at rest, and incident response as part of every design decision
Comfort with on-call ownership and incident response in a small-team environment
Preferred
Experience with Spanner, Pub/Sub, Memorystore for Redis, Cloud Tasks, or Cloud Armor specifically
Background in messaging or telecom infrastructure — carrier API integrations, throughput management, rate limiting at scale
Experience with TypeScript/Node.js application ecosystems (you don’t need to be a full-stack developer, but understanding the runtime helps)
Monorepo CI/CD experience — managing builds, tests, and deployments across multiple services in a single repository
Familiarity with compliance frameworks relevant to communications platforms (TCPA, SOC 2, carrier security requirements)
Experience as the sole or primary infrastructure engineer at a growing company — you’ve owned it end-to-end
Certifications: Google Cloud Professional Cloud Security Engineer or Professional Cloud Architect (valued but not required)
What Makes This Role Different
Ownership, not maintenance. You’ll be the first dedicated infrastructure hire. You won’t inherit a playbook — you’ll write it. Your decisions will directly shape how the platform evolves.
Real scale, small team. Millions of messages daily, multi-region considerations, carrier-level SLAs — but a team small enough that your work has immediate, visible impact.
Interesting problems. Distributed rate limiting across auto-scaling Cloud Run instances. High-throughput Pub/Sub pipelines with dead-letter handling and retry strategies. Sharded counter patterns in Spanner for real-time campaign metrics. These aren’t contrived challenges.
Direct CTO collaboration. You’ll work alongside a technically hands-on CTO who has built this infrastructure and understands the tradeoffs. You’ll have context, support, and the authority to make decisions.
Autonomy over process. We care about outcomes: uptime, security posture, deployment velocity, cost efficiency. How you get there is up to you.
Compensation & Benefits
Competitive base salary commensurate with experience (range available upon request)
Bonus program
Remote-first with flexible working hours
Direct reporting line to the CTO
Our Stack at a Glance
Cloud: Google Cloud Platform
IaC: Terraform
CI/CD: GitHub Actions
Languages: TypeScript on the Node.js runtime
Networking: Shared VPC, Global External HTTPS Load Balancer, API Gateway, Cloud Armor
Monitoring: Cloud Monitoring, Cloud Logging, Cloud Trace
Architecture: Event-driven microservices
Go to posting –> https://weworkremotely.com/remote-jobs/bluebird-technologies-senior-infrastructure-security-engineer