Our Architecture
Our architecture is designed from the ground up to be scalable, redundant and secure. We chose the AWS cloud as the foundation for our infrastructure for its built-in security, high availability and flexibility, which allowed us to iterate quickly and try several configurations before settling on our current setup.
01
Scalable
Because we run on the AWS cloud, we can increase the size or number of our nodes at a moment's notice. We deploy our infrastructure using Infrastructure as Code (IaC), so changes can be made rapidly and safely.
02
Redundant
We currently run four relay nodes to maximize network availability and provide redundancy should one node fail. We also run a secondary producer node, which can be rapidly promoted to leader via deployment automation if the primary fails.
03
Secure
Three dedicated subnets are protected by locked-down security groups which allow only the necessary traffic and ports. AWS Secrets Manager provides secure key storage, and cold keys are removed from running nodes when not required for operation.
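As a rough illustration of keeping cold keys off running nodes, the sketch below fetches a key from Secrets Manager, exposes it as a temporary file only for the duration of one operation, and deletes it afterwards. It is a minimal sketch under assumptions, not a description of our actual scripts: the helper name `temporary_key` and the secret name used in the note below are hypothetical, and `client` is expected to be a boto3 Secrets Manager client (or anything with a compatible `get_secret_value` method).

```python
import os
import tempfile
from contextlib import contextmanager


@contextmanager
def temporary_key(client, secret_id):
    """Fetch a key from Secrets Manager, expose it as a file, then delete the file.

    `client` is assumed to behave like boto3.client("secretsmanager").
    """
    secret = client.get_secret_value(SecretId=secret_id)["SecretString"]
    # mkstemp creates the file readable and writable only by the current user.
    fd, path = tempfile.mkstemp(suffix=".skey")
    try:
        with os.fdopen(fd, "w") as handle:
            handle.write(secret)
        yield path
    finally:
        # The key file never outlives the operation that needed it.
        os.remove(path)
```

With boto3 installed, this could be used as `with temporary_key(boto3.client("secretsmanager"), "pool/cold-key") as path: ...`, where `pool/cold-key` is a placeholder secret name.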
Architecture
Six dedicated subnets across three Availability Zones.
Producer nodes run in a private subnet with no internet access.
Relay nodes are in public subnets with internet access, but are protected by a firewall which only allows access to the relay port. Any management access must be done from the management subnet.
The management subnet contains a bastion host for jump-boxing to relay and producer nodes. A node sandbox is spun up when upgrading to new software versions to create a ‘golden image’, which is then rolled out to the rest of the network.
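The layout above can be modelled in a few lines of Python to make the access rules concrete. This is a sketch under stated assumptions rather than our real configuration: the address range, the Availability Zone names, the relay port number, the way the six subnets are split between roles, and the rule that producers accept the relay port only from our own relays are all illustrative choices not taken from this page.

```python
import ipaddress

# Assumed values for illustration; the real address range, AZ names and
# relay port are not stated in this document.
VPC = ipaddress.ip_network("10.0.0.0/16")
AZS = ["az-1", "az-2", "az-3"]
RELAY_PORT = 3001  # hypothetical relay port
SSH_PORT = 22

# Assumed split of the six subnets as (role, AZ index) pairs.
LAYOUT = [("relay", 0), ("relay", 1), ("relay", 2),
          ("producer", 0), ("producer", 1), ("management", 2)]


def plan_subnets():
    """Give each (role, AZ) pair its own /24 block carved from the VPC range."""
    blocks = VPC.subnets(new_prefix=24)
    return [{"role": role, "az": AZS[i], "cidr": next(blocks)} for role, i in LAYOUT]


def ingress_allowed(dest_role, port, source_ip, subnets):
    """Return True if traffic to a node of dest_role passes the sketched rules."""
    source = ipaddress.ip_address(source_ip)
    # Classify the sender by the subnet it sits in; anything else is "internet".
    from_role = next((s["role"] for s in subnets if source in s["cidr"]), "internet")
    if port == SSH_PORT:
        # Management access only from the management subnet (bastion host).
        return dest_role in ("relay", "producer") and from_role == "management"
    if port == RELAY_PORT:
        if dest_role == "relay":
            return True  # relays accept the relay port from anywhere
        if dest_role == "producer":
            return from_role == "relay"  # assumption: producers only hear from our relays
    return False
```

For example, `ingress_allowed("relay", SSH_PORT, "203.0.113.7", plan_subnets())` is rejected because the source address is outside the management subnet, while the same request from an address inside the management subnet is accepted.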
Tools
We use several tools to automate our operations and keep the stake pool running smoothly.
Terraform
Terraform is an ‘Infrastructure as Code’ tool which allows you to provision and manage cloud infrastructure.
Learn more about it at https://www.terraform.io/
Fabric
Fabric is a Python library for executing shell commands remotely over SSH.
We use it to develop and manage our config files and management scripts centrally, and to deploy our node infrastructure across multiple servers securely, reproducibly and rapidly. This means tasks like KES key rotation or leader promotion of standby nodes can be managed remotely with a single command.
Learn more about it at http://www.fabfile.org/
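To give a flavour of what a single-command task might look like, here is a minimal Fabric-style sketch. It is not our actual tooling: the script path, host names and user name are hypothetical placeholders, and the helpers `run_on_all`, `promote_standby` and `connect_via_bastion` are names invented for this example. The functions accept any object with Fabric's `Connection.run` interface, so they can be exercised without a live server.

```python
import shlex

# Hypothetical placeholder; not the pool's real script path.
PROMOTE_SCRIPT = "/opt/pool/scripts/promote_to_leader.sh"


def run_on_all(connections, command):
    """Run one shell command on every connection; return {host: succeeded}."""
    results = {}
    for conn in connections:
        # warn=True stops a non-zero exit from raising, so every host is attempted.
        result = conn.run(command, hide=True, warn=True)
        results[conn.host] = result.ok
    return results


def promote_standby(conn):
    """Promote a standby producer by running the placeholder script on it."""
    command = f"sudo {shlex.quote(PROMOTE_SCRIPT)}"
    return run_on_all([conn], command)[conn.host]


def connect_via_bastion(host, bastion_host="bastion.example.com", user="ops"):
    """Open a Fabric connection that tunnels through a bastion host."""
    from fabric import Connection  # imported lazily; needs `pip install fabric`
    return Connection(host, user=user, gateway=Connection(bastion_host, user=user))
```

With Fabric installed, `promote_standby(connect_via_bastion("10.0.4.10"))` would run the script on a standby node reached through the bastion; the address here is a made-up example.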
Grafana
Grafana is a platform which allows you to query, visualize and alert on metrics from multiple sources and share these via beautiful dashboards.
We use Grafana to monitor our stake pool infrastructure and alert us so we can react rapidly to issues, and to provide transparency to our delegators. Expect to see Grafana dashboards available on the website soon.
Learn more about it at https://grafana.com/
Community
Keep an eye on GitHub, where we will be posting our scripts and libraries for the Cardano community!