
Nginx Reverse Proxy in RAM = Performance + A Billion Percent Improvement

The speed at which we serve pages is important to all of us, right? If it’s not important then you’re a shit web host! Nginx Reverse Proxy in RAM gives you a great performance boost.

My claim of a billion percent improvement in speed is an exaggeration, in fact it is outright bullshit. But in my situation I have seen an improvement of around 200% and that makes me happy, and probably makes visitors happy too.

My Network

My network is not complicated. I have a local network which is shared throughout my house. We use it for streaming Netflix, Foxtel, Fetch TV, Stan, Youtube and various other streaming services. We have them all.

My servers are all CentOS 7. I chose CentOS because it’s stable, reliable & well supported and I didn’t want, or need, the bloat which comes with other operating systems.

I have servers stacked up in one of the back bedrooms – it’s a mess in there and my wife tells me I should clean it up! These boxes run various software including MySQL, DNS, HTTP / HTTPS, Mail (in & out) as well as various others just for good measure. Some are dedicated to specific roles such as MySQL or DNS while others share their resource load running multiple services. One machine runs HTTP, SMTP, POP3 & IMAP. Nginx was running on that same machine.

I have been using Nginx as a reverse proxy with Apache for a few months without any issues and love it for its simplicity and speed.

Nginx was configured to cache data into a folder (/var/cache/nginx) on a reasonably standard hard disk. It performed well for me and I have no real issues worth discussing here. The machine ran at a cool 45 degrees Celsius (during high load it would reach 55 degrees) with CPU usage hanging around 8% (the maximum recorded was about 30%) and about 55% memory usage. I never saw swap usage climb above 1% and have no idea which process uses that 1% or why.

All in all the setup is pretty good in my opinion.

My goal was serving content faster

Like you, I wanted web pages appearing in my browser faster and snappier. You know what I mean: click a link and the next page is loaded before your finger lifts off the button.

Hard disks are amazingly fast these days but when you are serving bucketloads of web pages they will become a bottleneck. Memory is much faster but is often at a premium, due to other processes, so you need lots of it if you want your machine to perform without swapping. If you are sharing disk reads and writes between Nginx & Apache along with other processes you quickly realise (<<– correct spelling in Australia) your bottleneck has its own bottleneck.
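If you want to see that bottleneck for yourself before changing anything, the sysstat package (in the standard CentOS 7 repos) gives you iostat. Keep an eye on the await and %util columns while Nginx and Apache are busy:

$ sudo yum install -y sysstat
$ iostat -dx 5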

I wanted to dedicate a server to doing nothing but Nginx caching. Doing this achieves a couple of things, the first being that the original machine, previously running Apache, Nginx and Mail, had some load taken off it, which in turn improves Apache performance. As a bonus, the original box now has lower CPU usage & more free memory.

The initial setting up of the new box was easy and I did not switch port 80 traffic over to it until it was ready to go. I installed & configured Nginx in much the same way it had been set up previously but increased workers and other settings because it could now use all the resources it needed on the machine, which is an old crappy laptop I had lying around. Nginx was the sole occupant and didn’t have to share resources with anything else.
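I won’t dump my whole config here, but the sort of settings I increased live in the main config and the events block. A rough sketch only (the numbers below are examples, not my exact values):

worker_processes auto;
events {
    worker_connections 4096;
}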

I used the same disk cache (/var/cache/nginx) and was very happy with the resulting performance improvement but thought I would take this opportunity to really make it sing. I wondered if I could move the cache from disk into memory because memory is shit-loads faster than a hard disk, even if you have an SSD, which I don’t.

Ramdisk or tmpfs


Because my background is predominantly Windows my initial thought was that I needed to create a Ramdisk but after doing a bit of research I decided on tmpfs which is available on Linux installations.

With a Ramdisk, once created, the memory used is gone and unusable by other processes. Ramdisks are a fixed size whereas tmpfs grows as needed until the original set size is reached. What I mean by this is, if I create a tmpfs of 4 gig but the cache only needs 1 gigabyte then only 1 gigabyte is allocated. As Nginx needs more, the memory allocation is increased. With a Ramdisk, if you create a 4 gig disk, 4 gig of memory is allocated and cannot be used by other processes. Therefore tmpfs made more sense for me.
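You can see this lazy allocation for yourself once the tmpfs from the next section is mounted. df reports the full size you asked for, while the numbers from free only move as cache files are actually written:

$ df -h /media/nginx-cache
$ free -h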

Create a folder for your tmpfs

The first step is to create a directory where your tmpfs will be mounted.

$ sudo mkdir /media/nginx-cache -p

Passing -p to mkdir tells it to create parent directories if they don’t exist. The folder you create can be anywhere and doesn’t have to be /media/nginx-cache. You could create /nginx-cache if that’s what floats your boat.

Mount your new tmpfs

Now that you have a mount point you can go ahead and mount it using the following command:

$ sudo mount -t tmpfs -o size=4096M tmpfs /media/nginx-cache

The above command creates a 4 gigabyte tmpfs.
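If you find Nginx complaining about permissions on the new path, give the worker user ownership of it. On CentOS 7 the packaged Nginx usually runs its workers as the ‘nginx‘ user (check the user directive in your own config), so the chown below assumes that user. Also remember a tmpfs starts out empty and owned by root every time it is mounted, so you may need to re-apply this after a reboot:

$ findmnt /media/nginx-cache
$ sudo chown nginx:nginx /media/nginx-cache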

Auto Mounting

Obviously you will need to auto mount your tmpfs so you don’t have to mount it by hand after each boot.

Edit your /etc/fstab file and add the following:

none /media/nginx-cache tmpfs nodev,nosuid,noexec,nodiratime,size=4096M 0 0
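You don’t have to reboot just to prove the fstab entry works. Unmount the tmpfs you created by hand and let mount re-create it straight from fstab:

$ sudo umount /media/nginx-cache
$ sudo mount -a
$ mount | grep nginx-cache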

Reboot your machine and you are almost ready to go. It is hard to see at this point if your tmpfs is doing anything because the amount of free memory won’t have changed. Using the ‘htop‘ command you can monitor free memory while you pre-load your new cache. Pre-load? Wtf is pre-load I hear you say. More on that soon.
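Another option, once you get to the pre-load step further down, is to watch the cache directory itself fill up. watch simply re-runs du every couple of seconds:

$ watch -n 2 du -sh /media/nginx-cache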

Modify your Nginx config

In my case I have my caching config in a separate file, ‘/etc/nginx/http.reverseproxy.conf‘, but you might have it in your ‘/etc/nginx/nginx.conf‘. Either way, modify whichever file has your caching config and make a small change.
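If you are wondering how a separate file like that gets picked up at all, it only needs an include line inside the http block of nginx.conf. A minimal sketch with everything else in the block left out (the filename is just my own naming, not anything Nginx expects):

http {
    include /etc/nginx/http.reverseproxy.conf;
}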

Here are my before and after entries:

Before:

proxy_cache_path /var/cache/nginx/the-cache/ levels=1:2 keys_zone=cached-stuff:256m inactive=600m max_size=2g;

After:

proxy_cache_path /media/nginx-cache/the-cache levels=1:2 keys_zone=cached-stuff:256m inactive=600m max_size=2g;

You might have noticed the size of the cache is smaller than the tmpfs created earlier, 2 gig vs. 4 gig. This is so I can monitor cache usage and increase its size without re-configuring my tmpfs. Remember tmpfs auto sizes up to your preset capacity. Using the ‘nginx -s reload‘ command will tell Nginx to reload the configuration without dropping current connections, thereby allowing me to increase my cache size on the fly.
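When I do bump max_size I check the config before reloading, because a typo and a live reload are a bad combination:

$ sudo nginx -t
$ sudo nginx -s reload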

Because this machine is doing nothing but caching data I’m not bothered by how much memory is used. Right now I have another 12 gig in reserve so I can future proof it a little bit.

The downside of caching in memory

RAM is volatile, so your cached data will be lost each time you reboot, there is a power failure, or some other catastrophe strikes.

I have around a dozen sites here; some are WordPress, others Joomla, while a couple are plain old static HTML. Some are production, others are development sites not available from the outside world. Losing the cache after a reboot isn’t a big issue but I would still like them to load up fast and that’s where pre-loading the cache comes in handy.

Using a small bash script I retrieve each of the websites thereby causing Nginx to re-cache my stuff. Nginx is configured to cache content after being accessed only once.
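That cache-after-one-hit behaviour comes from the proxy_cache_min_uses directive, which defaults to 1 anyway; if yours is set higher, the pre-load will need that many passes over each page before anything sticks. For reference:

proxy_cache_min_uses 1;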

Here is the script I use, which I run from my laptop via ssh whilst sitting in my home theater watching TV and drinking beer. The example below shows only one site, but the bash script includes all my sites and takes about 10 minutes to pre-load all web sites on the server.

wget \
--recursive \
--page-requisites \
--html-extension \
--convert-links \
--restrict-file-names=windows \
--domains wozsites.com.au \
--no-parent wozsites.com.au

Using the above ‘wget‘ command will download most of your site recursively, saving the resulting files into appropriate folders. If your cache is set up to cache on a single access then you’re good to go.
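The full script isn’t shown here; it is really just the same wget call repeated for each domain. A rough sketch of the shape of it, with placeholder domains and a throwaway download directory rather than my real ones:

#!/bin/bash
# Placeholder list of sites to warm. Swap in your own domains.
SITES="example1.com.au example2.com.au"

# Download into a throwaway directory so the mirrored files don't clutter anything.
cd "$(mktemp -d)" || exit 1

for site in $SITES; do
    wget --recursive --page-requisites --html-extension \
         --convert-links --restrict-file-names=windows \
         --domains "$site" --no-parent "$site"
done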

Now clear your browser cache and load up your site. Visit a couple of pages and see if you get the significant improvement I did. Obviously I’m visiting the site from within my own network but even when I visit from the outside I see a massive increase in how quickly pages are loaded up. It’s f&$king awesome and I wish I’d done it from the start.

My only other bottleneck is my internet connection speed but that is something I’ll tackle another day. In the meantime I hope this has been helpful for someone.