Increase SEO results using the prerender method

Hello,

More and more often we hear about websites dropping in rankings on popular search engines because of slowness. This happens when we serve uncached content to search engine bots as they visit pages, and we get penalized for it. Recently, I have been testing an amazing service called prerender.io.

Prerender is a node server that uses Headless Chrome to render HTML, screenshots, PDFs, and HAR files out of any web page. The Prerender server listens for an http request, takes the URL and loads it in Headless Chrome, waits for the page to finish loading by waiting for the network to be idle, and then returns your content.


Google's dynamic rendering documentation (https://developers.google.com/search/docs/guides/dynamic-rendering) recommends Prerender.io as one solution, which helps ensure your site is crawled correctly by Google and other search engines.

It is fast, simple to use, and really straightforward. The vendor offers two ways to use the application:

1) Using prerender.io's servers as middleware.
2) Installing your own server and using it as middleware in your own fleet.

Let's start with the first. Visit the https://prerender.io website and register. Once that is done, you will be logged in to their control panel area.


The very detailed documentation pages at https://prerender.io/documentation mention various ways to implement middleware between your application and this service, but since today we are doing an implementation for Magento 2 hosted on an Nginx web server, we will focus on that part.

https://prerender.io/documentation/install-middleware#nginx points to https://gist.github.com/thoop/8165802 as the recommended configuration, but here is the specific Nginx code that I used to integrate this service into an existing (working) Nginx config file for a Magento 2 application:

        set $prerender 0;
        if ($http_user_agent ~* "googlebot|bingbot|yandex|baiduspider|twitterbot|facebookexternalhit|rogerbot|linkedinbot|embedly|quora link preview|showyoubot|outbrain|pinterest\/0\.|pinterestbot|slackbot|vkShare|W3C_Validator|whatsapp") {
            set $prerender 1;
        }
        if ($args ~ "_escaped_fragment_") {
            set $prerender 1;
        }
        if ($http_user_agent ~ "Prerender") {
            set $prerender 0;
        }
        if ($uri ~* "\.(js|css|xml|less|png|jpg|jpeg|gif|pdf|doc|txt|ico|rss|zip|mp3|rar|exe|wmv|avi|ppt|mpg|mpeg|tif|wav|mov|psd|ai|xls|mp4|m4a|swf|dat|dmg|iso|flv|m4v|torrent|ttf|woff|svg|eot)") {
            set $prerender 0;
        }

        # resolve using Google's DNS server to force DNS resolution and prevent caching of IPs
        resolver 8.8.8.8;

        if ($prerender = 1) {
            return 503;
        }

        error_page 500 502 503 504 =200 @prerender;

        location @prerender {
            proxy_set_header X-Prerender-Token <ENTER TOKEN HERE>;
            # setting prerender as a variable forces DNS resolution since nginx caches IPs and doesn't play well with load balancing
            set $prerender "service.prerender.io";
            rewrite .* /$scheme://$host$request_uri? break;
            proxy_pass http://$prerender;
        }

From the point this Nginx config is added, you are redirecting traffic for Googlebot and all the other bots mentioned in the example to the service.prerender.io service; the return 503 plus error_page trick is what hands those requests off to the @prerender location. You can simulate bot traffic by executing the following command from any SSH terminal:

curl -IkL -A "Googlebot" https://magentocommand.ml  

The output should be 200 OK:

> curl -IkL -A "Googlebot" https://magentocommand.ml
HTTP/2 200  
content-type: text/html;charset=UTF-8  
content-length: 178959  
vary: Accept-Encoding  
server: nginx  
date: Sun, 14 Jun 2020 19:38:58 GMT  
vary: Accept-Encoding  
access-control-allow-origin: *  
x-cache: Miss from cloudfront  
via: 1.1 27aa7ec4f54edf4b2fd5fffda84693a0.cloudfront.net (CloudFront)  
x-amz-cf-pop: SOF50-C1  
x-amz-cf-id: fmPB6u2Cqhv06tGTBvgaFDFaW9aJ8eGjGnvlFT2tEGtqgrCHrjdSJw==  

Visit the https://prerender.io/ panel in your browser, click on Crawl Stats in the top menu (https://prerender.io/crawl-stats), and you will see the simulated hit recorded.


Within their panel, you may also view what the raw HTML of a page looks like by clicking on the Cached Pages option in the top menu.
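You can also fetch the rendered HTML straight from the command line, mirroring what the Nginx middleware above does (replace the token placeholder with your own):

curl -H "X-Prerender-Token: <ENTER TOKEN HERE>" "http://service.prerender.io/https://magentocommand.ml"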

To be honest, if you have a small website without many SEF pages, the vendor offers 250 cached pages completely free; the next tier is 20,000 cached pages for $15.00 USD per month.

There are several more small details about the implementation, but I did my best to keep this as short as possible. We will now jump to the 2nd method: hosting your own prerender.io application.

I've installed a very small cloud server in a German data center with Hetzner AG because it is cheap, stable, and close to my location, but you can install this application on virtually any platform.

1) As mentioned before, we will use Hetzner Cloud as our dedicated server. This guide was written using an instance with the Ubuntu image, but with a few small adjustments, it should also work on any popular Linux distribution.

Regarding the instance type, I suggest starting with something small, collecting some metrics, and then scaling if needed. For the test I rented the CX21 type, which comes with 2 vCPUs, 4 GB of RAM, and 40 GB of NVMe disk space; it should be fine to start with. Finally, configure the security group to allow SSH connections and custom TCP inbound on port 3000, as that is where your prerender service will listen (you can choose your own port here). I usually use the /etc/hosts.allow method, allowing only certain IPs to gain access, but you can go with a popular firewall like firewalld or something inbound from your data center (or cloud) provider, as in the sketch below.
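As one hedged example using ufw, Ubuntu's default firewall (the IP addresses below are placeholders for your admin IP and your web server's IP):

# allow SSH only from a trusted admin IP (placeholder address)
ufw allow from 203.0.113.10 to any port 22 proto tcp
# expose the Prerender port only to the web server that will call it (placeholder address)
ufw allow from 198.51.100.20 to any port 3000 proto tcp
ufw enable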

2) Setting up the environment:

Connect to the instance and install the needed tools. To get Node, I recommend using nvm and choosing the latest LTS version.

https://github.com/nvm-sh/nvm#git-install

apt-get update && apt-get install git  
cd ~/ && git clone https://github.com/nvm-sh/nvm.git .nvm  
cd ~/.nvm && git checkout v0.35.3  

Now add these lines to your ~/.bashrc, ~/.profile, or ~/.zshrc file so nvm is automatically sourced upon login (you may have to add them to more than one of the above files):

export NVM_DIR="$HOME/.nvm"  
[ -s "$NVM_DIR/nvm.sh" ] && \. "$NVM_DIR/nvm.sh"  # This loads nvm
[ -s "$NVM_DIR/bash_completion" ] && \. "$NVM_DIR/bash_completion"  # This loads nvm bash_completion

Log out of the SSH session, and log in again.

At the moment of writing this article, the latest LTS Node was 12.18.0 (Erbium), so we will install that one:

nvm install 12.18.0  

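To verify the installation, check that the expected version is now the active one:

node -v   # should print v12.18.0
npm -v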

After nvm and node, we need to get forever, which will allow us to run the prerender server continuously in the background.

https://github.com/foreversd/forever

npm install forever -g  


Next, we need to fetch the prerender source. To do this, clone the GitHub repo and run an npm install to get its dependencies.

git clone https://github.com/prerender/prerender.git  
cd prerender && npm install  

Next, we need to install Google Chrome.

wget https://dl.google.com/linux/direct/google-chrome-stable_current_amd64.deb  

Installing packages on Ubuntu requires root privileges, so make sure you run the following command as root or as a user with sudo privileges.

Install the Google Chrome .deb package by typing:

apt install ./google-chrome-stable_current_amd64.deb  

Time to add the first flags. Modify the /root/prerender/server.js file with the following code:

var server = prerender({  
  chromeFlags: ['--no-sandbox', '--headless', '--disable-gpu', '--remote-debugging-port=9222', '--hide-scrollbars']
});

The default in the file is just:

var server = prerender({});  

A lot of other useful customizations can be added to this block based on the https://github.com/prerender/prerender instructions; for example, see the sketch below.
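A hedged sketch only: pageLoadTimeout and waitAfterLastRequest are tuning options named in the prerender README, and the values here are purely illustrative.

var server = prerender({
  chromeFlags: ['--no-sandbox', '--headless', '--disable-gpu', '--remote-debugging-port=9222', '--hide-scrollbars'],
  pageLoadTimeout: 20 * 1000,   // give up on pages that take longer than 20s to load
  waitAfterLastRequest: 500     // wait 500ms after the last request before capturing HTML
});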

To test that all is fine, run the following from the /root/prerender/ directory:

node server.js  


To keep it running uninterrupted, we can use the forever tool that we installed earlier and run it as follows:

forever start /root/prerender/server.js  

Output:
root@prerender-01:~/prerender# forever start /root/prerender/server.js
warn: --minUptime not set. Defaulting to: 1000ms
warn: --spinSleepTime not set. Your script will exit if it does not stay up for at least 1000ms
info: Forever processing file: /root/prerender/server.js

You can check whether the port is listening by running the lsof tool:

lsof -i :3000  

Output:
root@prerender-01:~/prerender# lsof -i :3000
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
node 9198 root 18u IPv6 38348 0t0 TCP *:3000 (LISTEN)

Now that we have the Prerender server listening on port 3000, we can create a systemd service so it comes back up safely after a reboot or crash.

Navigate to the /etc/systemd/system folder and create a prerender.service file with the following content:

[Unit]
Description=Prerender Service

[Service]
User=root  
WorkingDirectory=/root/prerender  
ExecStart=/root/.nvm/versions/node/v12.18.0/bin/node /root/prerender/server.js

[Install]
WantedBy=multi-user.target  
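Optionally (this is not in the unit above), you can have systemd restart the process automatically if it ever crashes by adding two lines to the [Service] section:

Restart=on-failure
RestartSec=5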

Then execute:
systemctl enable prerender
systemctl start prerender

Check the status now with systemctl status prerender; the service should be reported as active (running).

That's all! Now you can create a DNS record with your external DNS provider and start using the Prerender service on your web server by implementing the middleware method from the start of this article, pointed at your own instance as shown below.
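A minimal sketch of that change, assuming your self-hosted instance is reachable at prerender.example.com (a placeholder hostname): in the @prerender location from the Nginx config above, swap service.prerender.io for your own host and port; the X-Prerender-Token header is only needed for the hosted service.

        location @prerender {
            # point at the self-hosted Prerender instance instead of service.prerender.io
            set $prerender "prerender.example.com:3000";
            rewrite .* /$scheme://$host$request_uri? break;
            proxy_pass http://$prerender;
        }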

To quickly test out of the box whether the Prerender service is working, you may execute the following:

curl "http://localhost:3000/render?url=https://magentocommand.ml"  

One alternative, to track pages better, is to install Nginx on the Prerender server itself and then simply proxy_pass traffic from location / to port 3000 while keeping that port bound only to localhost. That way, you can filter who can and cannot access the Prerender service using port 80, or even better, 443 (SSL).
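A hedged sketch of such a front-end server block (prerender.example.com is again a placeholder hostname):

server {
    listen 80;
    server_name prerender.example.com;

    location / {
        # forward everything to the Prerender service bound to localhost only
        proxy_pass http://127.0.0.1:3000;
    }
}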

The next option would be to configure Prerender itself on port 80 (why not!) and use it directly. To do that, add one line to the /etc/systemd/system/prerender.service file under the [Service] section:

Environment=NODE_ENV=production PORT=80  

Optionally, you may add some cache settings and a number of workers. To be honest, I am testing this on a 2-CPU virtual private server, so I had to limit the number of workers to a single one to prevent OOM failures. To make caching possible in the first place, you need to install the https://github.com/prerender/prerender-memory-cache plugin (installation sketch below).
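Assuming the plugin follows the usual prerender plugin pattern described in its README, install it from the prerender directory:

cd /root/prerender && npm install prerender-memory-cache --save

and enable it in server.js with server.use(require('prerender-memory-cache')); (this line also appears in the finished server.js later in this article). The cache and worker settings then go into the unit file as extra Environment= lines: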

Environment=NODE_ENV=production CACHE_MAXSIZE=100000  
Environment=NODE_ENV=production CACHE_TTL=432000  
Environment=NODE_ENV=production PRERENDER_NUM_WORKERS=1  

The complete file should look like this:

[Unit]
Description=Prerender Service

[Service]
User=root  
WorkingDirectory=/root/prerender  
Environment=NODE_ENV=production PORT=80  
Environment=NODE_ENV=production CACHE_MAXSIZE=100000  
Environment=NODE_ENV=production CACHE_TTL=432000  
Environment=NODE_ENV=production PRERENDER_NUM_WORKERS=1  
ExecStart=/root/.nvm/versions/node/v12.18.0/bin/node /root/prerender/server.js

[Install]
WantedBy=multi-user.target

Extra add-on: generating HTML snapshots is a resource-intensive process, so some sort of caching strategy should be used to improve performance. Prerender.io comes with several different caching plugins, but we will use the one based on Redis, as it offers scalability and a simple cache expiration mechanism. Redis can be built from source, but the pre-built version is sufficient for the task at hand.

sudo apt-get update  
sudo apt-get upgrade  
sudo apt-get -y install redis-server  

Open the /etc/redis/redis.conf file and make sure the configuration is set up properly. As a security measure, I simply uncomment the bind line so that Redis binds only to the 127.0.0.1 IP address, i.e. localhost. The remaining details can be adjusted per your requirements/needs.
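Once uncommented, the relevant line in redis.conf looks like this (depending on your Redis version it may also list ::1 for IPv6 localhost):

bind 127.0.0.1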

Download the Redis cache plugin and install it. By default, it caches pages for one day and then expires them. This can be overridden by specifying the env variable process.env.PAGETTL in seconds; to never expire pages, set the PAGETTL variable to 0.

npm install prerender-redis-cache --save  
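For example, to keep cached pages for seven days (604800 seconds), you could follow the same Environment= pattern used in the unit file above; the variable name is taken from the plugin documentation cited here:

Environment=PAGETTL=604800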

Change the server.js to use the redis cache plugin:

server.use(require('prerender-redis-cache'));  

It needs to be added alongside the other server.use() calls in server.js, before server.start(); see the finished server.js below.

An optional but useful addition is the access log feature; keeping the access logs can be helpful for debugging and other maintenance tasks.

https://github.com/unDemian/prerender-access-log

npm install prerender-access-log --save  

Initialize the plugin in server.js. You also need to configure the access log settings and make sure the /var/log/prerender directory exists and is writable (mkdir -p /var/log/prerender). Here is the finished version of server.js:

#!/usr/bin/env node
var prerender = require('./lib');

var server = prerender({
  accessLog: {
    // Check out the file-stream-rotator docs for parameters
    fileStreamRotator: {
      filename: '/var/log/prerender/access-%DATE%.log',
      frequency: 'daily',
      date_format: 'YYYY-MM-DD',
      verbose: false
    },

    // Check out the morgan docs for the available formats
    morgan: {
      format: 'combined'
    }
  },
  chromeFlags: ['--no-sandbox', '--headless', '--disable-gpu', '--remote-debugging-port=9222', '--hide-scrollbars']
});

server.use(prerender.sendPrerenderHeader());  
// server.use(prerender.blockResources());
server.use(prerender.removeScriptTags());  
server.use(prerender.httpHeaders());  
server.use(require('prerender-memory-cache'));  
server.use(require('prerender-redis-cache'));  
server.use(require('prerender-access-log'));

server.start();  

The best way to see whether Redis has started adding keys is to use the redis-cli tool:

root@prerender-01:~/prerender# redis-cli  
127.0.0.1:6379> keys *  
1) "https://magentocommand.ml/catalog/category/view/s/jackets/id/323/?climate=8057&color=7903&material=8008"  
2) "https://magentocommand.ml/"  
3) "https://magentocommand.ml/catalog/category/view/s/bottoms/id/322/?climate=8064&eco_collection=0&erin_recommends=0&material=8003&new=1&pattern=8049&style_bottom=7960"  

Now load a page a few times and see how super fast it renders once cached!

I hope you will find this post useful for your JavaScript SEO scenarios. As always, please do not hesitate to send me your comments and questions; I will be happy to help. Good luck!