Lighthouse Scanner: Microservice Development with the Hapi Framework

By Sebastian Günther

Posted in Nodejs, Microservices, Tutorial, Lighthouse-series

Lighthouse is a scanner for improving SEO, performance and security of websites. My service delivers lighthouse scans on demand. The service is provided through a webpage and realized by a microservice. You can use it here: https://lighthouse.admantium.com/.

I started to develop this microservice with the Express framework, a standard choice for Node.js web applications. After some time, adding feature after feature, I found myself thinking "The application logic is hidden between all those expressive logging statements and complex validations". It was hard to have a complete overview with one glance!

Aren’t there other frameworks available? Of course, and I picked Hapi for three reasons: its syntax is clear and similar to Express, it has a set of well-integrated modules, and it claims to be very performant and secure because it withstands Walmart’s Black Friday sales. Hapi is enterprise Express!

In this article, I will walk through the microservice development and show the relevant Hapi features.

Note: The lighthouse service has been discontinued since 2024-05-18.

Switch from Express to Hapi

My initial development went smoothly. Within one day, I implemented the basic functions of starting and executing scans. The microservice is a self-contained unit providing a clear HTTP API: accepting scan requests with /scan, communicating job status with /job, and delivering scan results with /report. On the next day, I added detailed validation, error handling and logging. The code base evolved, but it accumulated so many logging statements and complex validations that I could no longer see the main flow of the application.

So it was clear: either I add specific npm packages that encapsulate logging and validation, or I switch to a framework that already integrates these essential aspects. From lists like node frameworks or web api frameworks, I gathered and checked these:

Again, the choice is huge! I narrowed it down by considering my core requirements (validation, error handling, and logging) and by reading source code examples. From all examples, I picked Hapi, and was delighted within one day. I had a much cleaner code base, with integrated validation, error handling and logging. Using Hapi feels like writing enterprise Express.

A basic Hapi Server

A basic Hapi server is started with the following code.

const hapi = require('@hapi/hapi');

async function init() {
  const server = hapi.server({
    port: 8080,
    host: 'localhost',
  });

  server.route({
    method: 'GET',
    path: '/',
    handler: async (request, h) => "Hello World"
  });

  // start() initializes the server and begins listening on the configured port
  await server.start();
}

init();

If you are familiar with Express, I'm sure you can understand this code perfectly.

Query Parsing

In Hapi, you configure the global query parser inside the server declaration. Then, in the routes, you use request.query to access the parsed query parameters. Here is an example that returns the query object as JSON.

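const hapi = require('@hapi/hapi');
const qs = require('qs');

async function init() {
  const server = hapi.server({
    // ...
    // Run the incoming query parameters through qs
    query: { parser: (query) => qs.parse(query) }
  });

  server.route({
    method: 'GET',
    path: '/',
    // Returning the object lets Hapi send it as the JSON response
    handler: async (request, h) => request.query
  });
}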
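
To make the shape of request.query more concrete, here is a minimal sketch that uses only Node's built-in URL class. It does not show how Hapi or qs parse queries internally; the helper name queryToObject and the base URL are assumptions made for illustration.

```javascript
// Hypothetical helper (not part of Hapi): turns the query string of a
// request URL into a plain object, similar in shape to request.query.
function queryToObject(requestUrl) {
  // The base URL is only needed so that relative paths can be parsed.
  const { searchParams } = new URL(requestUrl, 'http://localhost:8080');
  return Object.fromEntries(searchParams.entries());
}

const parsed = queryToObject('/scan?url=http://test');
console.log(JSON.stringify(parsed));
```

A handler that returns such an object would respond with the same key/value pairs as JSON.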

Request Validation

In a microservice, you want to be especially strict about request payloads. Hapi lets you define schema objects. These explain which keys the payload needs to have, and what types or patterns their values need to satisfy.

Take a look at the input validation for /scan requests. It allows one key, url, which needs to be a string and match the given regex.

const joi = require("@hapi/joi");

const schema = {
  scan_req_schema: joi.object({
    url: joi.string().pattern(/http(s?):\/\/[\w.-]+/).required()
  }),
}
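
The regular expression above carries no ^ or $ anchors, so in plain JavaScript it matches anywhere inside a string. The following sketch tests the pattern on its own, without Joi, to show which inputs it accepts. The anchored variant and the sample values are assumptions added for comparison and are not part of the original service.

```javascript
// The pattern from the schema above, copied verbatim.
const urlPattern = /http(s?):\/\/[\w.-]+/;

// Hypothetical stricter variant (assumption): must match from the start.
const anchoredPattern = /^https?:\/\/[\w.-]+/;

const samples = [
  'http://test',
  'https://example.com/path',
  'ftp://example.com',
  'see http://test'
];

// Returns one boolean per sample, in order.
function check(pattern, values) {
  return values.map((value) => pattern.test(value));
}

const unanchored = check(urlPattern, samples);
const anchored = check(anchoredPattern, samples);

console.log(unanchored); // [ true, true, false, true ]
console.log(anchored);   // [ true, true, false, false ]
```

Whether the looser match is acceptable depends on how the scanner consumes the value. If only URLs that start with the scheme should pass, an anchored pattern is the safer choice.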

Schemas are automatically applied by including the following configuration in the route declaration.

server.route({
  // ...
  options: {
    validate: {
      query: schema.scan_req_schema
    },
  },
})

Error Handling

Error handling is a great example of how Hapi provides basic, meaningful functionality without further configuration.

Hapi makes basic assumptions and catches errors for you. In its default configuration, it returns a matching HTTP status code and a JSON object with the error message. For example, requesting a route that does not exist returns a 404.

curl localhost:8080/hello

{"statusCode":404,"error":"Not Found","message":"Not Found"}

When the schema validation rules are violated, you get the following error.

{"statusCode":400,"error":"Bad Request","message":"Invalid request query input"}

If you want to, you can configure errors with custom status codes and messages. For this, you pass a failAction method, which receives the arguments request, h, and err. You then define the error message, status code and other attributes with err.output.payload. Here is an example:

server.route({
  method: 'GET',
  path: '/scan',
  options: {
    validate: {
      query: schema.scan_req_schema,
      failAction: async (request, h, err) => {
        err.reformat();
        err.output.payload = {
          statusCode: 420,
          error: 'Bad Request',
          message: 'error, invalid or missing query param `url`',
          query: request.query
        };
        return err;
      }
    }
  }
  [...]

Now, when calling the URL with invalid parameters, you receive this custom object. Nice!

curl 'localhost:8080/scan?ur=http://test'

{"statusCode":420,"error":"Bad Request","message":"error, invalid or missing query param `url`","query":{}}
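
Note that the statusCode field inside the payload is only part of the JSON body. For Boom-style errors, the HTTP status of the response is taken from err.output.statusCode. The sketch below keeps both in sync. The helper name customizeValidationError and the stand-in error object are assumptions for illustration; in a real route the err argument comes from Hapi.

```javascript
// Hypothetical helper: rewrites a Boom-style validation error so that
// the HTTP status and the JSON body carry the same status code.
function customizeValidationError(err, query, statusCode = 400) {
  err.output.statusCode = statusCode; // status line of the response
  err.output.payload = {              // JSON body of the response
    statusCode,
    error: 'Bad Request',
    message: 'error, invalid or missing query param `url`',
    query
  };
  return err;
}

// Stand-in with the same shape as a Boom error (assumption for this demo).
const fakeErr = {
  output: {
    statusCode: 400,
    payload: {
      statusCode: 400,
      error: 'Bad Request',
      message: 'Invalid request query input'
    },
    headers: {}
  }
};

const customized = customizeValidationError(fakeErr, { ur: 'http://test' });
console.log(JSON.stringify(customized.output.payload));
```

Inside failAction, such a helper could be used as return customizeValidationError(err, request.query), optionally with a different status code as the third argument.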

Logging

Text-based logging is enabled by default: use server.log for generic and request.log for request-specific log entries. Both take one or more tags as the first argument, which lets you follow the best practice of differentiating log levels. As the second argument, you pass the message or object to log.

I'm using the lightweight and fast Pino JSON logger. It comes as the hapi-pino plugin and is configured as follows:

await server.register({
  plugin: require('hapi-pino'),
  options: {
    prettyPrint: true,
    timestamp: true,
    redact: ['req.headers.authorization']
  }
});
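
To illustrate what the redact option is meant to achieve, here is a small stand-alone sketch that masks one path in a log entry. It is not Pino's implementation; the function name redactPath, the placeholder string, and the sample entry are assumptions for illustration.

```javascript
// Illustration only: masks the value at a dotted path such as
// 'req.headers.authorization' and returns a modified copy of the entry.
function redactPath(entry, path, censor = '[Redacted]') {
  const keys = path.split('.');
  const copy = JSON.parse(JSON.stringify(entry)); // leave the input untouched
  let node = copy;
  for (const key of keys.slice(0, -1)) {
    // Stop early when the path does not exist in this entry.
    if (node === null || typeof node !== 'object' || !(key in node)) return copy;
    node = node[key];
  }
  const last = keys[keys.length - 1];
  if (node !== null && typeof node === 'object' && last in node) {
    node[last] = censor;
  }
  return copy;
}

const entry = {
  req: { headers: { host: 'localhost:8080', authorization: 'Bearer secret' } }
};
const redacted = redactPath(entry, 'req.headers.authorization');
console.log(JSON.stringify(redacted));
```

The point of the option is the same as in this sketch: credentials sent in the Authorization header should never end up in log output.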

When called during startup, like server.log('info', { msg: 'BOOTING server' }) log messages look like this:

[1588089561261] INFO  (34757 on midi.local):
  tags: [
    "info"
  ]
  data: {
    "msg": "BOOTING server"
  }

When called for requests, like request.log('info', { msg, url, uuid }), it also prints useful information about the request object.

[1588089765043] INFO  (34757 on midi.local):
  tags: [
    "REQUEST /scan"
  ]
  req: {
    "id": "1588089765042:midi.local:34757:k9k3irup:10005",
    "method": "get",
    "url": "http://localhost:8080/scan?url=http://test",
    "headers": {
      "host": "localhost:8080",
      "user-agent": "curl/7.64.1",
      "accept": "*/*"
    },
    "remoteAddress": "127.0.0.1",
    "remotePort": 56808
  }

Complete Example

Let’s put all of the discussed features together in one example.

const hapi = require('@hapi/hapi');
const qs = require('qs');

const { schema } = require('.//images/blog/light/schema');
const { scanner } = require('.//images/blog/light/scanner');

async function init() {
  const server = hapi.server({
    port: 5135,
    host: 'localhost',
    query: { parser: (query) => qs.parse(query) }
  });

  server.route({
    method: 'GET',
    path: '/scan',
    options: {
      validate: {
        query: schema.scan_req_schema
      },
      response: { schema: schema.scan_res_schema }
    },
    handler: async (request, h) => {
      const { url } = request.query;
      const { msg, err, uuid } = await scanner.run(url, request);

      if (err) {
        request.log('error', { msg, url });
        // Retry-After expects a number of seconds (or an HTTP date), without a unit suffix
        return h.response({ msg }).header('Retry-After', '30').code(429);
      }

      request.log('info', { msg, url, uuid });
      return h.response({ msg, uuid }).code(202);
    }
  });

  await server.register({
    plugin: require('hapi-pino'),
    options: {
      prettyPrint: true,
      timestamp: true,
      redact: ['req.headers.authorization']
    }
  });

  await server.start();
  server.log('info', { msg: 'BOOTING server' });
}

init();

Conclusion

Hapi is enterprise Express. It offers sensible defaults for error handling, validation and logging. Application code is compact and very readable. The carefully curated core modules and plugins enhance this very robust framework. If you would otherwise use plain Express in your next project, consider using Hapi instead. You will be delighted.