Cloud Reliability & Recovery Engineer

AlphaSense

Remote - India | Remote | Corporate Services, Technology, & Security

About the role

About AlphaSense:

The world’s most sophisticated companies rely on AlphaSense to remove uncertainty from decision-making. With market intelligence and search built on proven AI, AlphaSense delivers insights that matter from content you can trust. Our universe of public and private content includes equity research, company filings, event transcripts, expert calls, news, trade journals, and clients’ own research content.

The acquisition of Tegus by AlphaSense in 2024 advances our shared mission to empower professionals to make smarter decisions through AI-driven market intelligence. Together, AlphaSense and Tegus will accelerate growth, innovation, and content expansion, with complementary product and content capabilities that enable users to unearth even more comprehensive insights from thousands of content sets. Our platform is trusted by over 6,000 enterprise customers, including a majority of the S&P 500. Founded in 2011, AlphaSense is headquartered in New York City with more than 2,000 employees across the globe and offices in the U.S., U.K., Finland, India, Singapore, Canada, and Ireland. Come join us!

Role Overview:

We are seeking an experienced Cloud Engineer to design, implement, and continuously improve our Business Continuity Planning (BCP) and Disaster Recovery (DR) capabilities across AWS cloud environments.

This is a hands-on technical role requiring deep AWS expertise, strong scripting skills, and a passion for building highly available, fault-tolerant, and resilient cloud architecture using container orchestration with Kubernetes and infrastructure as code with Terraform. You should have a good understanding of CI/CD pipelines that enable rapid, reliable deployments and minimize downtime, and be adept at implementing DR strategies, including multi-region failover, backup and restore automation, and recovery testing aligned with industry BCP/DR standards. You will collaborate closely with security, infrastructure, and application teams to ensure our systems can withstand and rapidly recover from any disruption.

Reports To: Director of Event Response

Level: Senior Individual Contributor

Key Responsibilities:

Cloud Resilience Architecture

Design and implement multi-region, multi-AZ AWS architectures that meet RTO/RPO targets

Engineer active-active and active-passive failover patterns using Route 53, Global Accelerator, and CloudFront

Build automated DR runbooks and playbooks using AWS Systems Manager Automation and Step Functions

Implement chaos engineering practices using AWS Fault Injection Simulator (FIS) to validate resiliency

Architect cross-region replication strategies for S3, DynamoDB Global Tables, RDS, and Aurora Global Database

Review containerized workloads running on Kubernetes, ensuring resilience through self-healing, auto-scaling, and multi-cluster or multi-region deployments

Backup & Recovery Engineering

Administer AWS Backup across all services (EC2, EBS, RDS, EFS, FSx, DynamoDB, Aurora) with policy-based automation

Design immutab