The Tech Stack of a One-Man SaaS

Nov 22, 2020 · @anthonynsimon

Being an engineer at heart, each time I see a company write about its tech stack, I brew a fresh cup of coffee, sit back, and enjoy the read.

There’s just something fascinating about getting to know what’s under the hood of other people’s businesses. It’s like gossip, but about software.

In 2020 I started working on a new project alongside my day job: a web traffic and performance monitoring SaaS.

Panelbear has gone through numerous iterations, and I feel lucky that thousands of websites have already integrated with it, even though it's still in the early stages. That's why I recently decided to focus on turning this little side project into a full-time business.

So I thought it was my turn to share the tech stack I use to run a one-person SaaS.

By the way, the format of this post was inspired by Jake Lazaroff's article; you should check it out too!

With that said, let's get started.

Languages

Over the years, I have added many programming languages to my toolbelt. For me, two in particular strike a good balance of productivity and reliability:

  • Python: Most of the backend code is in Python, which has enabled me to focus on shipping features. I also use mypy for optional type hints, which helps keep the codebase manageable (see the small sketch after this list).
  • TypeScript: As a backend developer, I used to avoid working on the frontend. That is, until I discovered TypeScript. It makes the whole experience much better and less error-prone. I now use it for most of my frontend projects, together with React.
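
For illustration, here's a minimal sketch of what those optional type hints look like in practice. The names are made up for this example; mypy flags any call that doesn't match the annotations:

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class PageView:
    site_id: str
    path: str
    country: Optional[str] = None  # not every event has a geo lookup


def count_views(views: List[PageView], path: str) -> int:
    """Count how many page views match the given path."""
    return sum(1 for view in views if view.path == path)
```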

Frameworks and libraries

I rely on a vast amount of open-source code. This enables me to focus on building the actual product and avoid reinventing the wheel.

There are lots of libraries I use, but I'd like to highlight a handful that play a major role in my stack:

  • Django: It's like a superpower for solo developers. The longer you work in the industry, the more you appreciate the conventions it uses. A monolithic framework can get you really, really far. To me, it's about predictable software that's fast in every way that matters. By the way, I talk more about this topic in my other blog post, Choose Boring Technology.
  • React: I built the dashboards and some UI components with React. These get bundled with Webpack and embedded into the Django templates using django-react-templatetags.
  • Next.js: I use this React framework for my landing pages, documentation, and the blog. It enables me to use the tools I know while reaping the benefits of static site generation, which is great for keeping the site fast and SEO-friendly.
  • Celery: I use it for all background and scheduled tasks. It has quite a learning curve for more advanced use cases, but it's reliable once you understand how it works and how to troubleshoot it when things go wrong (a minimal example follows this list).
  • Bootstrap: I built a custom theme on top of Bootstrap. I know there are newer CSS frameworks, but I already knew how to use this one, which saved me a lot of time, and there's plenty of documentation around it. That's why I picked it.
  • SWR: I don't do any fancy state management on the frontend. Instead, I use plain React hooks plus SWR. Its stale-while-revalidate caching strategy helps keep the dashboards snappy, and it supports various goodies out of the box, such as polling, revalidation on focus, retries, and even basic local mutations.
  • Nivo: My product has a lot of charts. I tried many libraries, but ultimately settled on building custom components on top of Nivo. I really like its interactive docs and the wealth of options for customizing the charts to my needs.
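
To make the Celery point concrete, here's a hedged sketch of what a scheduled task could look like. The task, broker URL, and schedule are hypothetical, not Panelbear's actual code:

```python
from celery import Celery
from celery.schedules import crontab

app = Celery("tasks", broker="redis://localhost:6379/0")


@app.task(bind=True, max_retries=3)
def send_weekly_report(self, site_id: str) -> None:
    """Build and email the weekly traffic report for one site."""
    try:
        ...  # gather the stats and send the email
    except Exception as exc:
        # Retry with exponential backoff instead of dropping the task.
        raise self.retry(exc=exc, countdown=2 ** self.request.retries)


# celery beat triggers the task every Monday at 07:00.
app.conf.beat_schedule = {
    "weekly-reports": {
        "task": "tasks.send_weekly_report",
        "schedule": crontab(hour=7, minute=0, day_of_week=1),
        "args": ("example-site",),
    },
}
```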

Databases

In the beginning, I stored all data in a single SQLite database. Doing backups meant copying a single file to object storage like S3.

Back then, it was more than enough for my needs. But over time, my requirements grew, and I reached for different tools.

Here are the databases I currently use:

  • ClickHouse: It's a high-performance columnar database that's great for real-time queries. It can store and query large amounts of data on commodity hardware. Some of my customers have millions of page views and I don't have an unlimited budget, so it's been very handy.
  • PostgreSQL: My favorite database. Sane defaults, battle-tested, and well integrated with Django. I use it for application data (not the analytics data). For the analytics data, I wrote a simple interface for querying ClickHouse instead (a sketch follows this list).
  • Redis: I use it for caching, rate-limiting, as a task queue, and as a key/value store with TTL for various features. Rock-solid, and great documentation.
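
The post doesn't include the actual ClickHouse interface, but a minimal sketch of such a wrapper, using the open-source clickhouse-driver package, might look like this. The table and column names are invented for illustration:

```python
from clickhouse_driver import Client


class AnalyticsStore:
    """Thin wrapper around ClickHouse for analytics queries."""

    def __init__(self, host: str = "localhost") -> None:
        self.client = Client(host=host)

    def pageviews_per_day(self, site_id: str, days: int = 30):
        # Parameters are passed separately so the driver escapes them.
        return self.client.execute(
            """
            SELECT toDate(timestamp) AS day, count() AS views
            FROM pageviews
            WHERE site_id = %(site_id)s
              AND timestamp >= now() - INTERVAL %(days)s DAY
            GROUP BY day
            ORDER BY day
            """,
            {"site_id": site_id, "days": days},
        )
```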

Deployment

I like to treat my infrastructure as cattle instead of pets.

To me, servers and clusters should come and go. If one server gets unhealthy, I should be able to restart it or destroy it without issues.

Also, this process should be automatic, so that my infrastructure heals itself. That way, I reduce the chances of having to intervene at 2AM when things go wrong.

So I use Infrastructure-as-Code to deploy my stuff. I do not change things by SSH'ing into the servers in production and crossing my fingers.

This also helps with disaster recovery: I run a few commands, and some minutes later my stack has been re-created.

This was also useful when I moved from DigitalOcean to Linode, and more recently to AWS.

I describe everything in version-controlled code. That way, it's easy to keep track of what these systems look like, even years later.

Here are the tools I use:

  • Terraform: I manage most of my cloud infrastructure with Terraform. I declare EKS clusters, S3 buckets, roles, and RDS instances in my Terraform manifests. The state is synced to an encrypted S3 bucket, which avoids getting into trouble in case something happens to my development laptop.
  • Docker: I build everything as Docker images, even stateful components like ClickHouse or Redis. It makes my stack very portable, as I can run it anywhere I can run containers, which is almost any cloud provider by now.
  • Kubernetes: Yes, I use Kubernetes as a one-person startup. It allows me to simplify the operational aspects and reuse knowledge, as I already had several years of production experience with it from my day job. So I wouldn't recommend it if you're just getting started; if you must, use a managed platform.
  • GitHub Actions: For this project I chose GitHub Actions instead of CircleCI (which I'd normally use, and which is also great!). The reason is simple: I already use GitHub, and I don't like giving many services access to my repos and secrets (a sketch of a workflow follows this list).
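
The post doesn't show the actual pipeline, but a hedged sketch of a workflow that builds a Docker image and pushes it to ECR (where flux, described further below, picks it up) could look like the following. The file name, branch, region, and image name are placeholders:

```yaml
# .github/workflows/build.yml (hypothetical)
name: build-and-push
on:
  push:
    branches: [main]

jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      - uses: aws-actions/configure-aws-credentials@v1
        with:
          aws-access-key-id: ${{ secrets.AWS_ACCESS_KEY_ID }}
          aws-secret-access-key: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
          aws-region: eu-central-1
      - id: ecr
        uses: aws-actions/amazon-ecr-login@v1
      - name: Build and push image
        run: |
          IMAGE="${{ steps.ecr.outputs.registry }}/my-app:${{ github.sha }}"
          docker build -t "$IMAGE" .
          docker push "$IMAGE"
```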

The great cloud migration

Back when it was a side project, I started on a single $5/mo instance on DigitalOcean. Then I realized that I was reinventing a lot of features that Kubernetes gives me out of the box.

For example:

  • Service discovery.
  • Automatic TLS certificate renewal.
  • Load balancing.
  • Log rotation and aggregation.
  • Zero-downtime rollouts.
  • Autoscaling (my traffic fluctuates a lot).
  • Fault-tolerance (self-healing services).

So I moved to the managed Kubernetes offering in DigitalOcean. It was great, until I hit serious reliability issues, even on larger instances.

In short: the cluster API would often go down and never recover. This disrupted a lot of cluster services, including the load balancer, which became unresponsive.

Each time this happened, I had to create a new cluster and fail over (by changing the DNS records for my service). Not great.

I never had such issues with other managed Kubernetes offerings. I suspect the control plane was underprovisioned, even with larger nodes.

Unfortunately, I was not able to resolve the issue after several weeks, so I decided to move to Linode's managed Kubernetes. From then on, I had exactly zero problems during the honeymoon period that followed.

However, I eventually migrated cloud providers once again, this time to AWS, as I got a lot of credits via YC's Startup School program.

I was also happy to use managed services like RDS to offload operating Postgres. I don't trust myself enough to do a failover at 2AM, so this was a big plus.

You might be wondering: "Wasn't it a lot of work to migrate cloud providers three times during launch?" The short answer is: no.

All my infrastructure was already described via Terraform and Kubernetes manifests. Each migration consisted of:

  1. Schedule a few minutes of downtime to move the database.

  2. Deploy all resources on the new cloud provider.

  3. Update the DNS records.

Luckily, it caused only minimal interruption for my customers. It wasn't zero-downtime, but it was a practical approach at the time.

Nowadays, I'd aim for a zero-downtime strategy, as I have a lot more users. Likely some sort of multi-step migration, buffering the incoming telemetry along the way.

Infrastructure

Here's a list of the infrastructure I use to run my SaaS:

  • AWS: Predictable, and lots of managed services. I use it at my full-time job, so I didn't have to spend too much time figuring things out. The main services I use are EKS, ELB, S3, RDS, IAM and private VPCs.
  • Cloudflare: I use it for DDoS protection, DNS, and caching static assets. It currently shaves 80% off my egress charges from AWS. I'm not sure how a small business could afford to serve a lot of content with those egress fees otherwise.
  • Let’s Encrypt: A free TLS certificate authority. I use cert-manager in my Kubernetes cluster, which issues and renews certificates based on my ingress rules. Simple and automated.
  • Namecheap: My domain name registrar of choice. It allows MFA for login, which is an important security feature. Also, unlike some other registrars, they haven't surprised me with an expensive renewal. I like them.

Monitoring

If there's one thing that should never go down, it's the monitoring and alerting system. Otherwise, how would I know when things break if the tool that's supposed to alert me is down too?

In the beginning, I self-hosted a Prometheus and Grafana installation on my cluster.

However, for peace of mind in case my cluster goes down, I migrated to hosted services instead:

  • New Relic: I use their Prometheus adapter for easy integration with Kubernetes. It automatically forwards all my metrics to their service: things like HTTP request counts, response times, queue sizes, and so on.
  • Sentry: Application exception monitoring and aggregation. Notifies me when unhandled errors happen.

Kubernetes components

Here's a list of the DevOps tools I use to automate things in my Kubernetes cluster:

  • ingress-nginx: Ingress controller for Kubernetes. It provisions NGINX services to load balance traffic to my pods. It also manages a Network Load Balancer (NLB) on AWS which controls ingress to the cluster nodes. It handles unhealthy nodes and traffic shaping for me. Rock-solid and has a huge community.
  • cert-manager: It automatically issues and renews TLS certificates. I simply specify an ingress rule in my Kubernetes YAML files, and it takes care of the rest (a sample manifest follows this list).
  • external-dns: It manages the DNS records for the services I run on Kubernetes. I just add an entry to the Ingress manifest and external-dns synchronizes my DNS records in Cloudflare or Route53.
  • flux: A GitOps way to do continuous delivery in Kubernetes. It pulls and deploys new Docker images when it detects a new push to the image registry (in my case ECR).
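
To give an idea of how these pieces fit together, here's a hedged sketch of an Ingress manifest that ties ingress-nginx, cert-manager, and external-dns together. The hostnames, issuer, and service names are placeholders, not Panelbear's actual config:

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: dashboard
  annotations:
    kubernetes.io/ingress.class: nginx
    # cert-manager sees this and provisions a Let's Encrypt certificate.
    cert-manager.io/cluster-issuer: letsencrypt-prod
spec:
  tls:
    - hosts: [app.example.com]
      secretName: dashboard-tls
  rules:
    # external-dns reads this host and creates the matching DNS record.
    - host: app.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: dashboard
                port:
                  number: 8000
```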

CLI tools

I use plenty of CLI tools for various purposes. Some of my favorites are:

  • kubectl: I use it to interact with the Kubernetes cluster. It can tail logs, inspect pods and services, open a shell inside a running container, and so on.
  • stern: It lets you tail logs in real time across multiple pods or services. Super useful for troubleshooting issues or inspecting production access logs.
  • htop: Interactive system process viewer. Like top, but better.
  • cURL: I use it to make HTTP requests, inspect headers, check SSL/TLS certificates, and whatnot.
  • hey: A great tool for load testing HTTP endpoints. It outputs a nice summary of the latency percentiles.

Email

  • Fastmail: My business email of choice. Simple and reliable.
  • Postmark: I use it for transactional emails (e.g. email verification, weekly reports, login security alerts, password resets). Their email delivery rates are great, and the tooling/mobile app is top-notch.

In the beginning, I used Sendgrid for sending transactional emails. Unfortunately, my delivery rates were not good: many users told me that they were not receiving account verification or password reset emails. The issue persisted for weeks, even after contacting support. Not great for my little business, so I switched to Postmark and haven't had an issue since.

Development

Here's a list of some other tools that help me run my SaaS:

  • GitHub: Source code hosting and versioning.
  • PyCharm: It's probably the best IDE for Python. I can refactor and navigate the entire codebase with ease. Works well even with large, untyped codebases.
  • VS Code: Great for TypeScript/React work. I try to avoid adding too many extensions and keep it as a general-purpose code editor, not a full-blown IDE.
  • Poetry: Python packaging and dependency management with lock files. It makes the packaging story in Python so much easier and helps keep track of changes. I couldn't imagine going back to requirements.txt files after years of issues.
  • Yarn: Fast dependency management for Node.js with local caching.
  • Invoke: I wrap all my codebase tasks in Invoke commands. This way, I can run the same commands locally that I would run on CI, if needed (a small example follows this list).
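
For instance, a minimal tasks.py could look like the following sketch; the actual task bodies are placeholders for illustration:

```python
from invoke import task


@task
def test(c):
    """Run the test suite, exactly as CI would."""
    c.run("pytest")


@task
def lint(c):
    """Type-check and lint the codebase."""
    c.run("mypy .")
    c.run("flake8 .")


@task(pre=[lint, test])
def ci(c):
    """Run the full CI pipeline locally."""
    print("All checks passed.")
```

Running `inv ci` locally then mirrors what the CI pipeline would do.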

Other

  • Panelbear: What better tool to measure Panelbear's traffic than Panelbear itself? The benefits of dogfooding are real: I am my own customer and can put myself in my customers' shoes.
  • Cronitor.io: It notifies me via email/WhatsApp when a scheduled job doesn't run.
  • Trello: I use it to keep track of issues, requests, new ideas and whatnot.
  • Figma: It replaced Sketch as my go-to tool for making quick mockups, banners, and illustrations for the landing pages.
