Designing a Web App from Scratch, Part 1

Han
5 min read · Mar 27, 2022


Starting a web app from scratch can be overwhelming. Even with the many SaaS and PaaS offerings that help you deploy an app, writing the app itself from scratch is still difficult.

In Part 1, I want to focus on a single mistake that I see in many web applications, and on the rule that avoids it:

Write your business logic in the Core

Let me explain. Imagine you have a pretty standard design for a web application.

Figure 1. Standard web application architecture.

Note that this is not the three-tier design. It fits inside the “Your application” component of a three-tier web app.

Three-Tier architecture. Source: https://www.logianalytics.com/5-benefits-3-tier-architecture/

This design is pretty standard: a presentation layer (GraphQL, REST, command line) calls functions in the core to read or write, and the core, through a ReadService or WriteService, fetches or modifies data in storage and returns the result to the presentation layer. There are many variations of this design, but we won’t get into the details for now.

Most of the time, when you start a web app, you need a much simpler version. For example, you might not need to support a command line, or the data requests are so straightforward that you skip writing a service file altogether, which leads to the design in Figure 2. This is quite common when all you need is a CRUD API over the database: the reads and writes are straightforward and can even be generated. Many web frameworks support this style.
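To make the layering concrete, here is a minimal sketch in Python. All names (`Blog`, `InMemoryStorage`, `rest_get_blog`, `cli_get_blog`) are made up for illustration, and the storage is just a dictionary standing in for a database. The point is only that two presentation entry points call the same core services.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Blog:
    id: int
    title: str


class InMemoryStorage:
    """Storage layer: a dictionary standing in for a real database."""

    def __init__(self) -> None:
        self._blogs: Dict[int, Blog] = {}

    def get(self, blog_id: int) -> Optional[Blog]:
        return self._blogs.get(blog_id)

    def save(self, blog: Blog) -> None:
        self._blogs[blog.id] = blog


class ReadService:
    """Core, read path."""

    def __init__(self, storage: InMemoryStorage) -> None:
        self._storage = storage

    def get_blog(self, blog_id: int) -> Blog:
        blog = self._storage.get(blog_id)
        if blog is None:
            raise LookupError(f"no blog with id {blog_id}")
        return blog


class WriteService:
    """Core, write path."""

    def __init__(self, storage: InMemoryStorage) -> None:
        self._storage = storage

    def create_blog(self, blog_id: int, title: str) -> Blog:
        blog = Blog(id=blog_id, title=title.strip())
        self._storage.save(blog)
        return blog


# Presentation layer: two entry points, one core.
def rest_get_blog(read_service: ReadService, blog_id: int) -> dict:
    blog = read_service.get_blog(blog_id)
    return {"id": blog.id, "title": blog.title}


def cli_get_blog(read_service: ReadService, blog_id: int) -> str:
    blog = read_service.get_blog(blog_id)
    return f"{blog.id}: {blog.title}"
```

Neither presentation function knows how blogs are stored, so swapping the dictionary for a real database would only touch the storage class.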

Figure 2. No service

For example, Ruby on Rails (RoR) allows direct access to the storage layer (Active Record) in controllers (see their demo code in Figure 3).

Figure 3. RoR demo code: model access in controller

The same goes for one-click GraphQL frameworks such as Hasura and PostGraphile.

Don’t put business logic in the controller; put it in the Core.

But sooner or later you will need to extend the simple web app with extra functionality. Lazy developers may choose to write business logic directly in the controller files and end up with bloated controllers. It then becomes much harder to add components, such as 1) a background worker whose tasks need to reuse controller code, or 2) another presentation mechanism (GraphQL, RPC, etc.).

There is a common misunderstanding of the MVC (model-view-controller) pattern: people think the Models are the ORM models that map 1:1 to database tables. With that in mind, the only place left to write business logic is the controller. But Models are not restricted to DAOs (data access objects). In fact, to quote Wikipedia, the model is:
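As a rough illustration of why this matters, the sketch below keeps a made-up business rule (title validation and slug generation) in a core function, so a thin controller and a background worker can both reuse it. The function names, the rule itself, and the dictionary used as storage are all assumptions for the example, not part of any framework.

```python
class ValidationError(Exception):
    """Raised by the core when a business rule is violated."""


# --- Core: business rules, with no knowledge of HTTP or job queues ---
def publish_blog(store: dict, title: str) -> dict:
    title = title.strip()
    if not title:
        raise ValidationError("title must not be empty")
    if len(title) > 80:
        raise ValidationError("title must be at most 80 characters")
    slug = "-".join(title.lower().split())
    if slug in store:
        raise ValidationError(f"slug already in use: {slug}")
    blog = {"slug": slug, "title": title}
    store[slug] = blog
    return blog


# --- Presentation: a thin controller that only translates input and errors ---
def post_blog_controller(store: dict, request_body: dict) -> tuple:
    try:
        blog = publish_blog(store, request_body.get("title", ""))
    except ValidationError as err:
        return 400, {"error": str(err)}
    return 201, blog


# --- Background worker: reuses the same core function, not controller code ---
def publish_queued_worker(store: dict, queued_titles: list) -> list:
    failures = []
    for title in queued_titles:
        try:
            publish_blog(store, title)
        except ValidationError as err:
            failures.append((title, str(err)))
    return failures
```

If the validation lived inside `post_blog_controller`, the worker would have to either fake an HTTP request or duplicate the rules.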

The central component of the pattern. It is the application’s dynamic data structure, independent of the user interface. It directly manages the data, logic and rules of the application.

The controller has a very simple purpose: accept input, convert it into commands for the Models, and render the View. In terms of the diagram above, the Models (in the MVC sense) are the entire Core component. (I will cover how to design the core in later posts.) If you foresee your application growing to medium size and becoming more than a simple CRUD retriever, enforce the rule that controllers never access the storage layer directly. Some linters can even restrict imports so that a developer cannot import the user model in a userController file; Pylint and ESLint, to name a few.

Don’t put business logic in the database either.

This one is less obvious to developers and is easily missed.

Databases like PostgreSQL provide pg_functions, pg_views, and pg_triggers that enable developers to write business logic in SQL. I don’t recommend it, for two reasons:
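To show the idea behind such an import restriction, here is a small standalone check built on Python’s standard `ast` module. It is only a sketch of the concept, not how Pylint or ESLint implement their rules, and the package names `app.storage` and `app.models` are hypothetical.

```python
import ast

# Hypothetical rule: controller files may not import from these packages.
FORBIDDEN_PREFIXES = ("app.storage", "app.models")


def forbidden_imports(source: str, forbidden=FORBIDDEN_PREFIXES) -> list:
    """Return the forbidden module names imported by the given source code."""
    found = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom):
            # node.module is None for relative imports like "from . import x".
            names = [node.module or ""]
        else:
            continue
        for name in names:
            if any(name == p or name.startswith(p + ".") for p in forbidden):
                found.append(name)
    return found


controller_source = """
from app.core.user_service import get_user
from app.models.user import User
import app.storage.db
"""
violations = forbidden_imports(controller_source)
```

Running a check like this over the controller files in CI would fail the build as soon as someone reaches past the core into storage.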

  1. Databases cost more. When your server attracts more traffic, you will inevitably need to scale, either vertically (make the existing machine more powerful) or horizontally (add more machines). Vertical scaling is almost always more expensive than horizontal scaling. For web servers, horizontal scaling is usually simple: put a load balancer in front and increase the number of machines. Many cloud services (Google Cloud, AWS) even provide auto scaling for web apps. But it is much harder to scale a database horizontally. Usually the best option for a cloud-hosted database is to spend more money on a more powerful database server. So if you consistently push unnecessary work into the database, you will be in an awkward spot when the database runs out of capacity.
  2. SQL is hard to read and manage when there is a lot of it. I have seen people write all their business logic in pg_functions, stored in .sql files for easier reading. pg_functions depend on one another, and when you modify a function, every pg_function that depends on it must be dropped and then recreated after the new version is created. This becomes a nightmare once you have a hundred or so functions in the system. On top of that, whenever you modify a table (e.g. remove a column), no check tells you that some function no longer works.

Because of this, for a full-featured web application, I recommend writing the data retrieval logic and data modification triggers in the web server instead of in the database.

For example, we used to have a trigger on every blog creation that added data to a summary statistic. We wrote it as a pg_trigger, thinking this would guarantee correct data, because there is no way for a blog to be created while the summary statistic fails to write. We were right about the guarantee, and it worked for a while, until we built a bulk import feature. Importing a couple of thousand blogs took forever, and we finally figured out that it was because the trigger fired on every insert. Eventually we moved the triggering logic to the server and, for bulk updates, wrote the summary statistic only once at the end. This is much more elegant: it is easier to manage (reading OO code is indeed easier than hunting for a pg_function or pg_trigger in the database), and we control when the logic runs.
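A stripped-down sketch of that change might look like the following. The class and attribute names are invented for the example, the lists and dictionaries stand in for database tables, and `summary_writes` exists only to make the number of summary updates visible.

```python
class BlogService:
    """Core write path; plain Python containers stand in for database tables."""

    def __init__(self) -> None:
        self.blogs = []
        self.summary = {"blog_count": 0}
        self.summary_writes = 0  # how many times the summary was rewritten

    def _refresh_summary(self) -> None:
        self.summary["blog_count"] = len(self.blogs)
        self.summary_writes += 1

    def create_blog(self, title: str) -> None:
        # Single insert: update the summary right away, as the trigger did.
        self.blogs.append(title)
        self._refresh_summary()

    def bulk_import(self, titles: list) -> None:
        # Bulk insert: write all rows first, then update the summary once.
        self.blogs.extend(titles)
        self._refresh_summary()


service = BlogService()
service.create_blog("first post")
service.bulk_import([f"imported post {i}" for i in range(2000)])
```

With a real database, the insert and the summary update would presumably go in one transaction so that the original guarantee (no blog without its statistic) still holds.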

Summary

This post described a standard architecture for a general web application and made one point:

Write your business logic in the core, not in the controller and not in the database.

In the next posts in this series, I will go into detail about each of these components.

Part 2 is here


Han

Google SWE | Newly Dad | Computational Biology PhD | Home Automation Enthusiast