CommonLit is a nonprofit organization that provides comprehensive literacy programs used by teachers for students in grades 3-12. These include a digital library containing thousands of reading lessons and quizzes in both English and Spanish, an acclaimed English Language Arts curriculum named CommonLit 360, and an Assessment Series that allows educators to track students’ literacy development and set future progress goals. The programs are also highly accessible, featuring text-to-speech, digital note-taking, and translation, so lessons can be tailored to students’ needs. We sat down with Geoff Harcourt, CommonLit’s CTO, to see what prompted the organization to make the move from Heroku to Porter.
Six years on Heroku
CommonLit had been on Heroku for six years, making full use of the capabilities of the Platform-as-a-Service (PaaS). Even before he joined CommonLit, while working as a software development consultant, Geoff consistently recommended Heroku to his clients for its simplicity and convenience; it allowed developers to concentrate on shipping new features with minimal DevOps concerns.
In 2020, for security purposes, CommonLit upgraded to Heroku Private Spaces, an offering that allows users to host applications and databases in a private, isolated network. However, the CommonLit team was ready for a level of flexibility that Heroku’s offerings could no longer provide. The nonprofit was also considering moving off the PaaS to further bolster the security, reliability, and scalability of its infrastructure and to reduce its cloud spend. The final straw that made the migration a priority was an incident in which attackers stole Heroku’s GitHub OAuth tokens for an unknown number of GitHub accounts.
How Porter stacks up against other PaaS options
The CommonLit team looked at other PaaS options such as Fly and Render; while these platforms certainly provide Heroku-like convenience, the nonprofit’s engineering team was ready for a degree of flexibility and customization beyond what these options could provide. Furthermore, neither of these platforms allows users to host on their own private and portable cloud, which raises the possibility of vendor lock-in.
“With the other PaaSs, it felt like we wouldn’t actually be migrating off Heroku, essentially trading one set of problems for a different but largely similar set of problems. Porter offered the balance of convenience and flexibility we were looking for.” - Geoff Harcourt, CTO of CommonLit
They also considered managing their own containers through Elastic Container Service (ECS) or Elastic Kubernetes Service (EKS), but CommonLit had a small engineering team made up of engineers with limited DevOps experience. They wanted the team to focus on delivering the best product possible as efficiently as it could, and DevOps work would have taken up far too much of its bandwidth.
Kubernetes Event-Driven Autoscaling (KEDA)
One of the nonprofit’s main concerns was scalability. Because CommonLit is an academic application used by students and teachers from elementary through high school, its operational load swings heavily. During the week, user traffic starts to pick up when the school day starts on the eastern seaboard; peak traffic occurs from 11 AM to 4 PM EST, then slowly slopes downward as children leave school. Peak traffic can reach as high as 50,000 requests per minute. At night, traffic is relatively quiet, often under 1,000 requests per minute, as CommonLit has far fewer users outside of North America.
CommonLit used scheduled scaling for a while, aiming to scale in tandem with user traffic; a scaling schedule works well when load changes are predictable. However, some quirks in the traffic pattern make need-based autoscaling preferable to a mechanism that is preprogrammed for the school week. Teachers sometimes prep for lessons on Sunday night, resulting in spikes in traffic. On national holidays like Memorial Day, there is no need to run the same number of nodes as on a regular Monday. And the summer months (July through August) see only about ten percent of the user traffic the application receives during the school year. Need-based autoscaling was a better fit because it accounts for these quirks and matches their highly variable traffic in general.
To allow for need-based autoscaling while on Heroku, they used an add-on called Rails Autoscale, which autoscales based on request queue time rather than on response time, Heroku’s default option. On Porter, the equivalent is Kubernetes Event-Driven Autoscaling (KEDA), which allows CommonLit to autoscale on Kubernetes based on Sidekiq queue length, replicating the capability they had with the Heroku add-on. Specifically, KEDA serves as a Kubernetes metrics server that exposes Sidekiq queue length data to the Horizontal Pod Autoscaler, which drives the autoscaling.
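The scaling decision described above can be sketched roughly as follows. This is a simplified illustration of how a queue-length target translates into a replica count, not KEDA’s or Porter’s actual code: the function name, the target of 50 jobs per replica, and the replica bounds are all hypothetical, and the real Horizontal Pod Autoscaler also applies a tolerance band and stabilization windows that are left out here.

```python
import math


def desired_replicas(queue_length: int, target_per_replica: int,
                     min_replicas: int, max_replicas: int) -> int:
    """Return a replica count for a queue-length-driven autoscaler.

    Simplified model: aim for roughly `target_per_replica` queued jobs
    per worker, then clamp the result to the configured bounds.
    All parameter values used with this function are hypothetical.
    """
    if target_per_replica <= 0:
        raise ValueError("target_per_replica must be positive")
    if min_replicas > max_replicas:
        raise ValueError("min_replicas must not exceed max_replicas")
    # Round up so that a partially filled "slot" still gets a worker.
    raw = math.ceil(queue_length / target_per_replica)
    # Never go below the floor or above the ceiling.
    return max(min_replicas, min(max_replicas, raw))


if __name__ == "__main__":
    # Hypothetical queue depths: quiet night, school-day peak, extreme burst.
    for depth in (0, 1000, 5000):
        print(depth, desired_replicas(depth, 50, 2, 30))
```

Under these assumed numbers, an empty queue stays at the floor of 2 workers, while a backlog of 1,000 jobs asks for 20. Because the input is the live queue depth rather than the clock, a Sunday-night spike or a quiet holiday is handled without a schedule.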
As a nonprofit, CommonLit treats cost efficiency as a paramount concern; the organization aims to spend as little as possible on infrastructure so it can devote resources to its users’ needs. The real advantage of KEDA over Rails Autoscale is the granularity in scaling that Porter allows through Kubernetes. On Heroku, it is not possible to size dynos with much granularity; a dyno that needs more RAM than its type provides has to be upgraded to the next dyno type, which can be a jump of as much as 11.5 GB in capacity, leaving the user paying for capacity they do not need. On Porter, CommonLit can configure resources for each application in increments as small as 1 MB of RAM and 0.01 vCPU, thanks to the flexibility and granularity that Kubernetes allows.
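A small sketch makes the cost of coarse sizing concrete. The tier sizes below are illustrative assumptions, chosen so that the last step matches the 11.5 GB gap mentioned above; they are not taken from a current price list, and the 3,000 MB workload is hypothetical.

```python
import bisect

# Illustrative fixed memory tiers in MB (assumed values, sorted ascending).
# The final step, 2560 -> 14336, is the 11.5 GB jump described in the text.
TIER_SIZES_MB = [512, 1024, 2560, 14336]


def smallest_tier_mb(needed_mb: int) -> int:
    """Return the smallest fixed tier that can hold `needed_mb`."""
    index = bisect.bisect_left(TIER_SIZES_MB, needed_mb)
    if index == len(TIER_SIZES_MB):
        raise ValueError("workload does not fit in any tier")
    return TIER_SIZES_MB[index]


def overprovisioned_mb(needed_mb: int) -> int:
    """Memory paid for but unused when sizing is limited to fixed tiers."""
    return smallest_tier_mb(needed_mb) - needed_mb


if __name__ == "__main__":
    needed = 3000  # hypothetical workload, slightly above the 2560 MB tier
    print("fixed tier:", smallest_tier_mb(needed), "MB")
    print("unused:", overprovisioned_mb(needed), "MB")
    # With per-MB requests, the allocation can simply equal the need.
    print("granular request:", needed, "MB")
```

With these assumed tiers, a workload that needs only slightly more than one tier offers is pushed into the next one and leaves most of that memory idle, whereas a per-megabyte request can track the actual need.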
Security of user data
Another main point of concern for CommonLit was security; they have access to sensitive student data which must be shielded from the public internet. There is a bevy of compliance frameworks they must abide by, all concerning the privacy of children’s personal information, such as FERPA (Family Educational Rights and Privacy Act), SOPIPA (Student Online Personal Information Protection Act), and COPPA (Children’s Online Privacy Protection Act). Simply put, the organization takes the privacy and safety of their users (young students) very seriously.
On Heroku Private Spaces, they used virtual private cloud (VPC) peering to connect their private dynos to their Postgres database in a VPC. This made their production database unreachable from the internet, but the arrangement was limiting: they had no direct access to the VPC where their private dynos ran and could reach it only through peering.
Geoff finds it easier to manage network security with Porter; they host on their own private cloud, meaning they have full control. On an AWS VPC, Porter just sits as a middleware layer, allowing them to set up security however they like; the Porter platform does not get in the way.
In an AWS VPC, they have flexibility over Identity and Access Management (IAM) roles, security groups, network access control lists (NACLs), peering arrangements, and how they attach transit gateways. Furthermore, they use the Porter add-on Tailscale, a virtual private network (VPN) service that enables encrypted point-to-point connections using the WireGuard protocol, which means there is no direct access to their production database or data stores from the public internet. In the rare case where the engineering team needs to troubleshoot something, they connect via Tailscale, which itself is gated by Indent. As Indent explains, the integration between Tailscale and Indent allows CommonLit to grant users access to specific Tailscale networks, set up closed-by-default rules for their most sensitive nodes so temporary access can be granted as needed, and open temporary secure shell (SSH) connections between devices in their Tailscale network without having to manage SSH keys.
Being able to answer school district boards’ questions regarding cybersecurity in-depth and with confidence is crucial for CommonLit, so the added security from hosting on their own cloud and utilizing these add-ons is essential.
Preview Environments versus Review Apps
Another feature of Porter that CommonLit makes frequent use of is preview environments. These environments run the code in GitHub pull requests in disposable applications, which have unique and shareable URLs, allowing users to easily initiate, test, and merge changes to their codebase.
CommonLit ships small changes to its application, which means lots of pull requests, up to 120 a week. Every pull request automatically opens a preview environment on Porter (users can configure preview environments to spin up automatically or create them manually).
This feature is similar to Heroku’s Review Apps, which CommonLit had also used frequently, so finding the same capability on Porter was exactly what they were looking for. They found preview environments on Porter to be better than Review Apps, however, because they are far more configurable, allowing developers to define environments with complex dependencies.
All of CommonLit’s preview environments are seeded with mock data; for security purposes, student data is populated only in the production environment. Other than that, they are nearly identical to the production application, with almost exactly the same settings. One of the few differences is that these applications can’t send genuine emails, as outgoing mail is routed to an email trap. As a result, when engineers perform acceptance testing, they have a high level of confidence that everything will behave the same way in production. Even for changes that aren’t changes to application code, such as a dependency bump, using a preview environment to confirm that the application starts successfully helps them know that the change is safe.
Convenience, flexibility, and no vendor lock-in
There are a few other features of the Porter platform that Geoff highlighted as stand-out benefits. Job functionality is simple; it’s easy to set up and maintain jobs in a cluster. CommonLit’s engineering team also appreciates Porter’s command line interface (CLI), as they aren’t forced to go onto the web interface; they can trigger cron tasks or one-off tasks via the Porter CLI, API, or dashboard.
Although CommonLit has no intention of leaving Porter, the fact that doing so would be straightforward and trouble-free is also a big plus. Since users host in their own cloud, all that happens when they eject is that Porter stops managing their cluster and is no longer responsible for its reliability. Users simply lose access to the layer of abstraction that Porter provides. The underlying Kubernetes cluster stays intact, and users would just have to operate it on their own like any other Kubernetes cluster. There are no concerns regarding vendor lock-in. For CommonLit, being able to tell clients that business continuity would never be an issue is a definite resiliency advantage.
Kubernetes made easy
“I describe Porter as ‘Kubernetes for people who want the developer experience of Heroku’. My team knows a little about K8s but not much, and are very happy with that.” - Geoff Harcourt, CTO of CommonLit
To sum up, CommonLit is able to leverage the advantages of Kubernetes without the headaches of learning the best practices needed to run the technology and its ecosystem properly. Despite being hosted on Heroku for over half a decade, they were able to acclimate to Porter easily, since Porter’s UI is very familiar to developers who are used to Heroku, allowing for a seamless transition to AWS. It was a true lift and shift, with minimal overhead. This level of convenience does not impose constraints on CommonLit, as they have full control and flexibility over their infrastructure. In addition, through Porter, CommonLit was able to improve its reliability and security posture in a cost-efficient manner.