Web Performance Guidelines
Performance is one of the most important topics in web development, yet it's also one of the most overlooked. The term refers to the speed and efficiency of the application. The topic affects everyone including developers, system administrators, the sales department, the users, and all your servers. Poor performance has a negative impact on your brand, your SEO, and your profits. And yet so few people seem to have a clue about the topic and even fewer seem to care.
Here are a few key measures to improve your site's performance at the frontend, backend, and system levels. You will also find some organizational measures to help counter apathy towards the topic.
Frontend Performance Measures
Design With Performance In Mind
Project Manager: I want this new product with these 50 features and I want tracking in these six tracking suites.
Designer: Here's some designs. I sent you all the images as ultra HD jpegs and I also want to include some new custom fonts.
UX Designer: I want to implement four variants of ten of those 50 features to see which the users like best and I want them all active at the same time for multivariate testing.
Developer: Well, I made it, but now the page makes 200 requests and takes 15 seconds to load. I'll need to spend the next few weeks trying to trim it down and we have to reconsider some of these features.
Does this situation sound familiar? Performance shouldn't begin at the point when the project reaches the developers' desks. The project conception and the design phase (both graphic design and UX design) should already have performance in mind. It's important for designers and project managers to have some understanding of the capabilities of your systems and the effect that new features might have on performance. And of course, when in doubt, invite a developer to the planning meetings. For designers reading this, the book Designing For Performance is available online for free and is worth a look.
Reduce the Number of Blocking Requests
Reduce the Number of Requests in General
Expanding upon the previous point, we should try to reduce the number of requests in general, especially requests to multiple domains. Each DNS lookup takes time, and the slower your connection, the longer it takes. On a typical 3G connection a DNS lookup takes around 200ms, so try to use as few domains as possible and keep the overall number of requests low. This includes not only CSS and JS but also images, fonts, JSON files, and any other resources.
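To get a feel for the numbers, here is a minimal Python sketch that counts the requests and distinct domains on a page. The URLs and the function name are made up for illustration, and the DNS figure is a rough worst case that assumes every lookup is uncached and runs one after another, which real browsers often avoid.

```python
from urllib.parse import urlparse

# Assumed per-lookup cost, taken from the 3G figure quoted above.
DNS_LOOKUP_MS = 200


def summarize_requests(urls):
    """Count requests and distinct hostnames for a list of resource URLs."""
    hosts = {urlparse(url).hostname for url in urls}
    return {
        "requests": len(urls),
        "domains": len(hosts),
        # Worst case: uncached lookups performed one after another.
        "dns_ms_worst_case": len(hosts) * DNS_LOOKUP_MS,
    }


# Hypothetical page: three resources spread across two domains.
page = [
    "https://www.example.com/css/main.css",
    "https://www.example.com/js/app.js",
    "https://cdn.example.net/fonts/custom.woff2",
]
summary = summarize_requests(page)
```

Running the same summary over the resource list of a real page makes it easy to see which domains could be dropped or consolidated.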
Here are a few ideas for how to reduce the number of requests on your pages:
- Include often-used images as inline SVGs.
- Use sprites. A sprite is a single larger graphic which contains many smaller graphics. The sprite is then cropped and repositioned using CSS to display the desired individual graphic.
- Lazy-load images. Lazy-loading is a technique in which content that is not needed on the page immediately is loaded later. A typical use is to load images which appear below the fold (i.e. the bottom of the browser window) only when the user has scrolled close to where the image should appear.
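As an illustration of the sprite idea, the following Python sketch generates the CSS rules for a sprite sheet. It assumes all icons are square, have the same size, and sit side by side in a single row; the class names, the helper name, and the sprite path are hypothetical.

```python
def sprite_css(names, icon_size, sprite_url):
    """Build CSS rules for equally sized icons laid out left to right."""
    rules = []
    for index, name in enumerate(names):
        # Shift the sheet left so that icon number `index` lands in the box.
        offset = index * icon_size
        x = f"-{offset}px" if offset else "0"
        rules.append(
            f".icon-{name} {{ width: {icon_size}px; height: {icon_size}px; "
            f"background: url({sprite_url}) {x} 0 no-repeat; }}"
        )
    return rules


css = sprite_css(["home", "search", "cart"], 32, "/img/icons.png")
```

All three icons are then served by a single request for the sprite, and each element only differs in its background offset.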
Reduce the Size of Large Resources
Backend Performance Measures
Most companies have a lot of projects and systems where more and more is continually added on top of them and engineers tend to throw more and more technologies at them. Over time all the extra features and technologies start to add overhead and slow things down a lot (not to mention the decrease in development speed as the projects become more complex and the increase in things breaking due to the projects becoming more fragile).
At some point when a project grows we have to stop adding to it, take a step back, and replan the architecture with all of its current features in mind. This applies to the entire stack: frontend, backend, and system level. It can help to try to define a standard technology stack made up of agreed-upon technologies which everyone understands and knows how to support. Coding styleguides and unit tests also help to keep the projects clean, but one of the most important steps is to remember that you as developers have the ability to say "No" (whether to the feature entirely or to the timeline demanded). It's important to ensure that new features are integrated into the architectural plan of the application, and if the project's deadlines don't allow for it then the feature should be rejected. In other words, do it right or don't do it at all.
Cache Dynamic Content
Nearly all external requests (to APIs or external resources) should be cached using something like Memcached or Redis (or whatever else you prefer), and the results of database queries (even against internal databases) should typically be cached as well, especially when the query is a slow one. This not only makes your pages faster but also decreases the load on other servers, which in turn makes them faster and more stable.
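The pattern itself is small. The sketch below uses an in-process dictionary as a stand-in for Memcached or Redis, so the class and function names are illustrative rather than any real client API; the point is only that the slow call runs once per time-to-live window.

```python
import time


class TTLCache:
    """Tiny in-process stand-in for a store such as Memcached or Redis."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self.entries = {}  # key -> (expires_at, value)

    def get_or_compute(self, key, compute):
        entry = self.entries.get(key)
        now = self.clock()
        if entry is not None and entry[0] > now:
            return entry[1]  # fresh hit: skip the slow call
        value = compute()  # miss or expired: do the slow work once
        self.entries[key] = (now + self.ttl, value)
        return value


calls = []


def slow_query():
    # Hypothetical stand-in for a slow database query or API call.
    calls.append(1)
    return ["row-1", "row-2"]


cache = TTLCache(ttl_seconds=60)
first = cache.get_or_compute("top-products", slow_query)
second = cache.get_or_compute("top-products", slow_query)  # served from cache
```

Choosing the time-to-live is the real design decision: it should be as long as the content can tolerate being stale.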
Make Asynchronous Requests to External Resources
Whenever possible you should attempt to make API calls to external resources (especially for non-critical content like widgets) asynchronously. This can be combined with the previous point regarding caching. A common solution is to create an asynchronous caching proxy which never fetches the resource during the request itself: it returns the cached response when there is something in the cache, and a 202 HTTP code (i.e. "Accepted for processing") when there is not. There is then typically a "worker" running in the background which retrieves the actual resource and populates the cache.
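A rough sketch of the proxy and worker described above might look like this in Python. The cache and the queue are plain in-memory structures here, the function names are invented for the example, and the worker is called by hand instead of running as a separate background process.

```python
from collections import deque

cache = {}          # key -> cached response body
pending = deque()   # keys waiting for the background worker


def handle_request(key):
    """Proxy side: answer from cache or return 202, never fetch inline."""
    if key in cache:
        return 200, cache[key]
    if key not in pending:
        pending.append(key)  # ask the worker to fetch it, once
    return 202, None


def run_worker_once(fetch):
    """Worker side: fetch every pending resource and populate the cache."""
    while pending:
        key = pending.popleft()
        cache[key] = fetch(key)


def fake_fetch(key):
    # Hypothetical stand-in for a slow call to an external API.
    return f"payload for {key}"


first = handle_request("weather-widget")   # cache is cold
run_worker_once(fake_fetch)                # normally runs in the background
second = handle_request("weather-widget")  # cache is warm
```

The caller has to handle the 202 case, for example by rendering the page without the widget and retrying later.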
Keep Logs Clean
Checking our various logs should be a regular task and warnings and notices should not be ignored even though they don't usually lead to broken pages. First of all, a warning is a sign that something isn't working right, and additionally, when the logs are filled with warnings it makes it much harder to notice the critical errors and makes debugging more difficult (and therefore decreases our internal performance as developers). Plus the actual act of writing to the logs takes time (and disk space).
This doesn't mean we should simply suppress warnings and notices, however. Using PHP as an example, the interpreter processes and formats its error messages all the way to completion before it checks whether or not it should actually report the error, so it takes roughly the same amount of time regardless of whether the error is actually displayed. There is a pretty nice explanation with examples in this Stack Overflow post. The proper solution is to quickly identify the issues and to repair them.
Optimize Database Queries
This point goes hand-in-hand with the "simplifying architecture" point from above and is somewhat mitigated by the "cache dynamic content" point, but it's still one of the biggest performance killers on most sites. Most legacy applications have pretty big databases with slow, inefficient queries. To solve this you need to spend some time architecting your databases and designing your queries so that they are fast and efficient. This means cleaning up the architecture so that joins are clean and logical, always using indices (and feeling guilty if you have a situation where you can't), and trying to reduce the cardinality (i.e. the number of rows the optimizer thinks will be needed) of your queries. This of course applies to all database queries (MongoDB, Elasticsearch, etc.), not just SQL.
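The effect of an index is easy to observe by asking the database for its query plan. The sketch below uses SQLite through Python's built-in sqlite3 module purely because it needs no server; the table, column, and index names are made up, and the exact wording of the plan differs between SQLite versions and other database engines.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)"
)
conn.executemany(
    "INSERT INTO orders (customer_id, total) VALUES (?, ?)",
    [(i % 100, i * 1.5) for i in range(1000)],
)

QUERY = "SELECT id, total FROM orders WHERE customer_id = ?"


def query_plan(sql, params):
    """Return the human-readable detail column of EXPLAIN QUERY PLAN."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[-1] for row in rows)


plan_before = query_plan(QUERY, (42,))  # full table scan
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")
plan_after = query_plan(QUERY, (42,))   # lookup through the index
```

Before the index exists the plan reports a scan of the whole table; afterwards it reports a search using idx_orders_customer, which touches only the matching rows.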
Optimize Server Configuration
At the low level, we can also improve our speed by optimizing our various server/service configurations. One simple example would be the AllowOverride option for Apache; when it is set to anything other than None, it forces the Apache server, on every request, to check the file system for the presence of a .htaccess file in every directory along the resource's path. And this is an option which is quite commonly used.
There are many good resources out there on this topic, and most software vendors provide some documentation for performance tuning. Here are a few:
- Apache Performance Tuning
- Nginx Performance Tuning
- MySQL Performance Tuning Resources
- Elasticsearch Performance Considerations
HTTP Cache Headers
Every page serving static content (or cacheable dynamic content) should set cache headers to indicate to the users' browsers that they should cache the result for a certain amount of time. Additionally, the webserver config can specify that certain content types (generally images, CSS, JS) are cached automatically for long periods of time (typically something like a year for CSS and JS and a month for images).
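In application code, the same rule of thumb could be expressed roughly as follows. This is a sketch in Python with an invented helper name; the lifetimes simply mirror the year and month figures above, and a year-long lifetime assumes that file names change whenever the content changes.

```python
YEAR = 365 * 24 * 60 * 60
MONTH = 30 * 24 * 60 * 60

# Lifetimes follow the rule of thumb above; adjust to your own release cycle.
MAX_AGE_BY_TYPE = {
    "text/css": YEAR,
    "application/javascript": YEAR,
    "text/javascript": YEAR,
}


def cache_headers(content_type):
    """Return the caching headers to send for a given Content-Type."""
    mime = content_type.split(";")[0].strip().lower()
    if mime in MAX_AGE_BY_TYPE:
        max_age = MAX_AGE_BY_TYPE[mime]
    elif mime.startswith("image/"):
        max_age = MONTH
    else:
        # No default here: dynamic pages should set their own headers.
        return {}
    return {"Cache-Control": f"public, max-age={max_age}"}


css_headers = cache_headers("text/css; charset=utf-8")
image_headers = cache_headers("image/png")
```

For a stylesheet this yields a Cache-Control header with max-age=31536000 (one year), and for an image max-age=2592000 (thirty days).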
Organizational Performance Measures
Create a "Performance Culture"
To achieve a truly performant site you need to not only improve the technical aspects of your applications but also integrate "performance" into your organization's culture. This is probably the most difficult point to implement, but is one of the most important, and it goes hand-in-hand with the "Design With Performance In Mind" point from above. Performance and speed should not be considered "features" but rather fundamental parts of the project on all levels.
To achieve this you'll need to define clear performance metrics, foster communication, and work towards a single common goal rather than individual team goals. A performant application needs to be something which everyone in the organization takes pride in (regardless of their function in the organization), and as a selling point it should be just as important as figures such as the number of page impressions and the conversion rate. Each team should define a "performance budget" which states how big, heavy, and expensive a page is allowed to be (this is a helpful resource: What does my site cost?).
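A performance budget can be as simple as a handful of agreed limits that are checked automatically. The following Python sketch shows the idea; the metric names, the limits, and the page weight are hypothetical, and only the 200 requests and 15 seconds come from the dialogue at the start of this article.

```python
# Hypothetical limits; each team would agree on its own numbers.
BUDGET = {"requests": 50, "total_kb": 1000, "load_seconds": 3.0}


def budget_violations(page, budget=BUDGET):
    """List every metric on which the page exceeds the budget."""
    violations = []
    for metric, limit in budget.items():
        actual = page[metric]
        if actual > limit:
            violations.append(f"{metric}: {actual} > {limit}")
    return violations


# The page from the opening dialogue: 200 requests, 15 seconds to load.
# The page weight of 800 KB is an assumed value for the example.
bloated = {"requests": 200, "total_kb": 800, "load_seconds": 15.0}
report = budget_violations(bloated)
```

A check like this could run in the build pipeline so that a page which breaks the budget is noticed before it reaches production.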
Implement DevOps The Right Way
Typically management sees the common Venn diagram showing the overlap between "Development", "Quality Assurance", and "Operations", with the middle labeled "DevOps", and says "oh, so I should form a DevOps department and hire a couple new guys". This is not DevOps; DevOps is not a person or a role or a tool.
DevOps is a software development method that emphasizes communication, collaboration, integration, and automation. It is meant to acknowledge the interdependencies between development, QA, and operations and to try to break down the barriers between the various departments to encourage open communication and collaboration.
To really achieve this, we all need to expand our knowledge areas a bit. If you're a developer, learn a bit about sysadmin work; if you're an operator, learn some coding. The strict feelings of "that stuff over there isn't my responsibility" and "my team is by far the best team" should be dissolved. "Hand-over" from developers to operations should not exist, but rather it should be "hand-in-hand".
Create a Quality Assurance Team
Most organizations would additionally benefit from a quality assurance department (or team) because it can be difficult for teams to keep an overview of the whole structure and how everything is connected, and it can be tempting to cut corners when deadlines need to be met. The QA team is often seen as annoying by both developers and project managers since they're always causing delays and finding problems, but in the long run it's much cheaper when problems are discovered before your features make it to production.