Ongoing AWS Platform Management

Let Steamhaus become your full AWS operations team. Our AWS-certified Site Reliability Engineers will become part of your team, without the expense or hassle of building an in-house operations team of your own.

We deliver this service according to SRE (Site Reliability Engineering) principles.

What’s Steamhaus AWS Management?

Our Site Reliability Engineering (SRE) service combines 24×7 monitoring of, and reactive response for, your AWS platform with proactive work.

This ensures that everything’s in place for your infrastructure to be managed as software, with continual improvements to reliability and loading time.

Our AWS-certified Site Reliability Engineers will become a complete extension of your in-house team, looking after all aspects of the operations of your AWS account.

We’ll look after everything that underpins your application

We’ll help you architect, maintain, monitor and automate your AWS infrastructure, and become the first point of contact for any issues, backed by 24×7 monitoring and a 15-minute response to emergencies.

Our team understand the particular nuances of the various AWS services and will ensure your account is correctly configured and optimised for speed, security and availability.

Our partnership with AWS

We’re an AWS Advanced Consulting Partner, which demonstrates not only our technical prowess, but also the intimacy of our relationship with AWS.

We’re accredited on the AWS Well-Architected program, so we can deliver Well-Architected Reviews to AWS customers who want to identify improvements to their current infrastructure against AWS’ five pillars.

Last but by no means least, we have access to internal AWS funding programs that can help with or cover the cost of a rebuild or migration of a platform.

Key features of our SRE Service

In no particular order, here are some of the key features of our SRE Service:

  1. 24×7 monitoring of your platform
  2. 15-minute response to emergency support tickets
  3. Monthly retained time for proactive work
  4. Named principal engineer who performs all proactive work
  5. Dedicated account manager
  6. Access to our team through Slack, support desk and phone
  7. Monthly video catch-ups with your dedicated engineer and account manager
  8. Quarterly roadmapping sessions
  9. Regular cost-saving reviews on your AWS account
  10. Proactive patching for security vulnerabilities
  11. Introduction of relevant new AWS services
  12. Access to SignalFX giving you unrivalled live in-depth metrics
  13. Access to CloudCheckr which reports on over 400 best practice checks
  14. AWS and DevOps workflow advice


How it’s priced

After a discovery session we’ll understand the complexity of your platform and therefore how much time will be required on a monthly basis to manage it.

We then right-size the number of days a month for your retainer so that we can do all the proactive work we need to. In addition, there’s a fixed base price that gives you access to 24×7 incident support, helpdesk access for change requests and general advice, and a shared Slack channel to help us work more closely together.

The onboarding process

If we’re doing a new build or migration to AWS for you, we’ll already know everything we need to about your platform and will be able to commence our SRE service immediately.

After a consultancy project like this, most customers choose to take our SRE service to get our full value, as our engineers will already know your platform and have a close working relationship with your team.

If you’re new to us, we’ll begin by performing a full audit of your platform. This will ensure that everything’s in place for us to be able to support the platform and apply the SLA to it.

Once that’s complete, we can commence the SRE service. During a kick-off meeting before the service starts, we’ll take detailed notes of your business priorities and what’s critical to you.

We’ll need to set up appropriate monitoring checks so that we catch issues before they affect your platform, and can perform reactive repair if something unexpected happens. We’ll need your help with this, as you’ll know your application’s code better than we will.

The key parts of our SRE Service

The underlying goal of the SRE service is to form a partnership with your in-house DevOps/Engineering/Development teams, freeing them from the day-to-day distractions of managing their AWS infrastructure so they can focus on adding value to your business.



24×7 incident response

This gives you round-the-clock access to our SRE team to respond to incidents on your platform (with a 15-minute response to emergency tickets). They will also be there to provide general advice or make routine changes to your platform.

You’ll also be given access to intelligent tooling such as CloudCheckr (cost optimisation, security and compliance), SignalFX (real-time cloud monitoring and observability), and OpsGenie (modern incident management).


Retained time every month for proactive work

Your Principal Engineer will use this time to make proactive, incremental improvements to your platform in a way that’s structured, intelligent and relevant to your business’s needs.

Additional days for larger projects

Because we right-size your retainer for your platform’s normal needs to keep your costs as low as possible, larger projects are handled separately. Once the scope is agreed, each one is run and managed as a project in its own right.

Monthly reports and account review calls

At the end of every month you’ll receive reports detailing how your retainer time has been used, what we have planned for next month, and any other relevant documents.

These are then discussed during a scheduled monthly video call with your Steamhaus account manager and principal engineer, plus the relevant technical and commercial people from your company. The calls last about 30 minutes and are a great way to make sure that we’re adding as much value for you as we can.