gRPC over TLS with Traefik v2 on AWS
In recent years, gRPC has become increasingly popular for microservices because of its high performance. Besides inter-service communication, gRPC is also well suited to communication between mobile apps and backend servers: thanks to its efficient transport, mobile apps can fetch the data they need from backend services more quickly. This post shows how to enable gRPC over TLS for backend services on AWS.
gRPC over TLS is not as simple as HTTPS on AWS
If you have ever set up an HTTPS backend service on AWS, you will have found it pretty simple, because AWS load balancers natively support HTTP and SSL termination. So when you’d like to run a gRPC over TLS backend service on AWS, you probably think that you just need an architecture like the one below, with an AWS load balancer terminating TLS in front of the gRPC servers.
Unfortunately, at the time of writing, none of the AWS CLB, ALB, and NLB supports gRPC over TLS (Update: AWS ALB has supported gRPC since October 29, 2020). To work around this, you can let your gRPC server handle SSL termination itself. However, it’s generally recommended not to push SSL termination down to backend services (see the thread https://security.stackexchange.com/questions/30403/should-ssl-be-terminated-at-a-load-balancer).
gRPC over TLS with Traefik v2
To enable gRPC over TLS and spare the backend servers from SSL termination, I use Traefik, one of the most popular cloud native edge routers, and place it between the AWS CLB and the gRPC backend servers. The system architecture is shown below. Traefik itself can act as the reverse proxy and load balancer for the backend servers, so the AWS CLB is not strictly required here. The main reason for using the AWS CLB is to avoid exposing any EC2 instances in the public subnets. If you have no concerns about placing Traefik in the public subnets, you don’t need the AWS CLB to achieve gRPC over TLS.
In this system architecture, the AWS CLB must work in TCP mode so that it passes traffic through without modifying the header fields. To use gRPC in Traefik, you just need to make Traefik work in HTTP mode and use the h2c protocol (HTTP/2 over cleartext) for communication between Traefik and the backend services. For SSL termination, you place the certificate and private key for your domain where Traefik can read them and let Traefik load them. Below are the sample configurations for Traefik with gRPC over TLS. Because this post is not a tutorial for Traefik, please refer to the Traefik documentation if you find the sample configuration hard to follow.
Traefik static configuration
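The original sample is not reproduced in this text, so here is a minimal sketch of what such a static configuration could look like in Traefik v2's YAML format. The entry point name `grpc`, the port 443, and the path `/etc/traefik/dynamic.yml` are my assumptions for illustration, not values from the original setup.

```yaml
# traefik.yml (static configuration): illustrative sketch, not the original sample
entryPoints:
  grpc:                # hypothetical entry point name
    address: ":443"    # assumed port that the CLB's TCP listener forwards to

providers:
  file:
    # hypothetical path to the dynamic configuration file
    filename: /etc/traefik/dynamic.yml
    watch: true        # reload the dynamic configuration when the file changes
```

The static part only declares where Traefik listens and where it finds the dynamic configuration; the routing, the h2c backends, and the certificates all belong to the dynamic part.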
Traefik dynamic configuration
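Again as a hedged sketch rather than the original sample, a dynamic configuration for gRPC over TLS might look like the following. It assumes an entry point named `grpc` exists in the static configuration; the domain `grpc.example.com`, the router and service names, the backend addresses, and the certificate paths are all placeholders I made up for illustration.

```yaml
# dynamic.yml (dynamic configuration): illustrative sketch, not the original sample
http:
  routers:
    grpc-router:                        # hypothetical router name
      entryPoints:
        - grpc                          # assumed entry point from the static configuration
      rule: "Host(`grpc.example.com`)"  # placeholder domain
      service: grpc-service
      tls: {}                           # terminate TLS in Traefik for this router

  services:
    grpc-service:                       # hypothetical service name
      loadBalancer:
        servers:
          # the h2c:// scheme makes Traefik speak cleartext HTTP/2 to the backends
          - url: "h2c://10.0.1.10:50051"  # placeholder backend address
          - url: "h2c://10.0.1.11:50051"  # placeholder backend address

tls:
  certificates:
    # placeholder paths to the certificate and private key for the domain
    - certFile: /etc/traefik/certs/grpc.example.com.crt
      keyFile: /etc/traefik/certs/grpc.example.com.key
```

The two pieces that matter for gRPC are the `tls` section on the router, which makes Traefik handle SSL termination, and the `h2c://` scheme on the server URLs, which keeps HTTP/2 end to end without requiring certificates on the backends.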
Proxy Protocol for Client IPs
The system architecture and configuration above work for gRPC over TLS. However, the backend servers are unable to get the IP addresses of the clients, because the client IPs in the request headers are replaced with those of the CLB. To address this issue, you need to enable proxy protocol both in your CLB, with the commands below, and in Traefik, with the updated EntryPoint configuration below. Once both are applied, you should be able to see the client IP in the request header field x-forwarded-for.
CLB commands for proxy protocol
Traefik static configuration for proxy protocol
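As the original sample is not reproduced in this text, the following is a minimal sketch of an entry point with proxy protocol enabled in Traefik v2. The entry point name `grpc`, the port, and the CIDR range `10.0.0.0/16` are assumptions; the range should be replaced with the addresses your CLB actually connects from.

```yaml
# traefik.yml (static configuration): illustrative sketch, not the original sample
entryPoints:
  grpc:                  # hypothetical entry point name
    address: ":443"      # assumed listening port
    proxyProtocol:
      # only accept proxy protocol headers from these sources;
      # 10.0.0.0/16 is a placeholder for the subnets the CLB lives in
      trustedIPs:
        - "10.0.0.0/16"
```

Restricting `trustedIPs` to the load balancer's subnets matters because a proxy protocol header from an untrusted peer could otherwise be used to spoof the client IP.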
Summary
This post introduces a feasible implementation of gRPC over TLS for backend services on AWS. The main advantage of this architecture is that the backend code stays independent of SSL termination: when any of the AWS load balancers starts to support gRPC, nothing needs to change in the backend code. If you’d like to verify your implementation, I recommend BloomRPC as a quick way to test it. Feel free to leave a comment if you have one.