David Sage

SRE | Platform Engineering Leader

Prosper (75078) United States (Texas)
Employed | Open to opportunities
Experienced and data-driven Site Reliability Engineering leader with a track record of building cross-functional, geo-distributed DevOps teams. Skilled at defining and implementing SRE best practices and driving innovation, with a strong focus on uptime and customer experience.

My strong mix of development and operations skills, deep experience with AWS, and understanding of the interplay between software development and operations have enabled me to build and lead high-performing teams that drive business results.
  • Helped lead the SRE team through the path to IPO and oversaw the completion of governance processes relevant to my teams
  • Interviewed M&A targets and helped develop the due diligence processes for potential acquisitions with a focus on system architecture, system availability, technology stacks / depreciation, and operational readiness
  • Helped define divisional OKRs, and assisted in measuring the completion of R&D division goals on a quarterly basis
  • Developed cross-functional SRE mentoring programs to improve internal candidate pipelines, employee engagement, and morale, and to reduce attrition
  • Managed M&A integration from a deployment, system development, security and vulnerability management program standpoint
  • Presented to prospects and contributed to RFPs on the subjects of system availability, security defense in depth, vulnerability management, and cloud native technology adoption
  • Oversaw the execution and deliverables of the network, system, release, and container platform engineering teams
  • Mentored and grew managers, along with individual contributors from a large cross-section of the business
  • Democratized operations and change management. Moved from constant firefighting to standard, repeatable, prompt processes, without downtime, for all patching, release, and administrative tasks. Developed and published OLAs to reduce friction and measure the success of process-driven Ops
  • Built out a Kubernetes-focused team and partnered with development to launch our first microservices in EKS. Worked with leaders in Product and Development to build a comprehensive roadmap to Kubernetes in order to minimize infrastructure spend and release complexity, while maximizing uptime
  • Participated in the company's security and compliance steering committee, focused on providing world-class protection for our systems and customer data
  • Developed the Release and System Engineering teams from the ground up and matured Operations and Network Engineering. Built teams responsible for core system operations, automated infrastructure provisioning, application deployment, and incident response
  • Successfully led the migration of Alkami's privately hosted customer and corporate environments to AWS
  • Oversaw the maintenance of remaining corporate hardware, implementation of enterprise vulnerability management programs, and acted as the product owner for my teams
  • Ran the production certification for PCI/SOC 2 Type 2/SOX assessments, and led the technical response for gap item resolution
  • Created repeatable incident response and retrospective processes and automation used for all severity 1 and 2 incidents. Reduced MTTR and improved customer satisfaction and data quality through the development of a custom Slack chatbot focused on incident response and client communications
  • Implemented a robust monitoring program using New Relic and Elasticsearch. Championed the use of Infrastructure as Code. Transformed the team from point-and-click system builders to developers who produced highly automated, repeatable, and thoroughly tested infrastructure
  • Modernized the release and deploy process by implementing identical infrastructure and deployment automation in all environments. Built self-service tooling to allow developers to deploy code and infrastructure easily, without direct access to systems. Scaled deployments to thousands of application releases per month
  • Championed best practices with the development and product organizations. Helped lead cross-functional incident reviews to drive action items and build strong data around areas of concern. Worked with product to use this data to drive focused technical debt paydown and rearchitecture, with quantifiable ROI
  • Responsible for the availability, scalability, configuration, deployment, and monitoring of our online banking platform, serving millions of contracted users
  • Built and enhanced CI, deployment and maintenance workflows in Jenkins and TeamCity
  • Designed and implemented an ELK Stack cluster for centralized logging, capable of handling the 10-25k events per second emitted from Dev, QA, Staging, and Production
  • Architected and wrote PowerShell modules and the fleet-wide deployment process used for all areas of application operations, deployment, and configuration management
  • Wrote custom web and Windows services for monitoring and operations, and to integrate Jira, New Relic, and DynDNS/Cloudflare data into our applications and HipChat/Slack
  • Created infrastructure automation using Bash, PowerShell, Terraform, and Packer. Reduced new environment spin up time from weeks to hours
  • Created automated testing tools (custom HttpClient, NUnit, and Selenium) to ensure quality releases and effective monitoring, and to inform rollback decisions
  • Worked directly with architecture and product engineering to shape and improve microservices strategy, and codify SRE requirements and NFRs
  • Worked with project management and development to complete backend configuration and ETL processes for online banking conversions
  • Created reusable scripting (PowerShell, T-SQL) and tooling (C#) to streamline implementation processes
  • Responsible for in-depth troubleshooting of new features, releases, and participating in code reviews for the application codebase
  • Directly interfaced with customer development teams to integrate on-premises applications with our SaaS solution
  • Configured servers, web sites, applications, and databases for customers as well as internal teams
  • Created the first build-server-based automated deployments of code and configuration into customer-facing environments
  • Created both internal and external technical documentation, as well as standards for runbook and configuration knowledge base entries
  • AWS
  • Docker / ECS
  • Cloudflare
  • Azure
  • Kubernetes / EKS
  • API Gateway
  • VMware / Hyper-V
  • PowerShell
  • Windows Server
  • Terraform
  • Bash / Python
  • Active Directory
  • IIS / Tomcat / Apache
  • Packer
  • Linux (CentOS / Amazon Linux)
  • Serverless Framework
  • Nginx / HAProxy
  • TeamCity
  • Jenkins
  • CodeDeploy
  • ArgoCD
  • Chocolatey
  • TFS
  • Spinnaker


  • Git
  • C#
  • WPF
  • Java
  • T-SQL
  • WCF
  • Node.js
  • ELK
  • SonarQube
  • Splunk
  • Prometheus
  • New Relic
  • PagerDuty
  • Grafana
  • SolarWinds

AWS Certified Developer Associate

AWS Certified Solutions Architect Associate