Performance

This is the documentation for 4.3.0 version, which is not the latest version. Consider upgrading to 4.4.0.

Performance (i.e. speed of text generation) has historically been an issue for many NLG systems.

  • Grammar based NLG systems make heavy manipulations on trees, which might affect performance.

  • Template based NLG systems, like RosaeNLG, spend some time going back-and-forth to test synonyms and empty texts etc. This (heavily) affects performance compared to classic (non-NLG) templating systems which can be very fast.

RosaeNLG can be pretty fast though. Its speed depends a lot of the complexity of the texts, how intricated they are and the quantity of synonyms; but a rule of thumb is that generating half a page of text should take 10 to 50 ms. Short but complex texts (a few sentences) should take 2 to 5 ms.

Use Pug cache

Pug’s template are compiled into plain JavaScript functions. This process takes a while. But you can use the same compiled template multiple times with different datasets.

Pug has a built in cache mechanism: you just have to activate it.

When using render and renderFile, always put cache: true.

Also, when the templates are large (but not necessarily complex), compilation can take a while. Be sure that you compile only once.

Put your main loop outside of pug

Let’s imagine you have to generate textual descriptions of products. Each product has its data and you have to loop on them to generate your texts, calling the same template again and again.

You can have your main loop directly in your the pug template. Still, looping outside of the template allows an easy reset of RosaeNLG and Pug, which is much better for performance.

Don’t put your main loop inside of pug.

Use choosebest moderately

It is ok to use [choosebest] but it should be used moderately:

  • in terms of scope: only use it on texts that are difficult to optimize without using it

  • keep among parameter reasonably low; most of the time it is senseless to generate the same text hundreds of times just to find the best one

Use node.js clusters

You can easily parallelize text generation on a single server using node cluster features. Compile the template once using compileFileClient, launch workers, and ask each worker to run the compiled template on different data. A sample POC project is provided in the rosaenlg-parallel-poc module.

Benchmark

Node.js Docker version

2020-11-06:

  • 1 AWS EC2 server t2.medium (2 vCPU, 4 GB), no API Gateway

  • Docker node.js version

  • 1 user rendering 1 complex template (bx)

  • results:

    • no errors at 200 requests per second, with mean latency 18 ms

    • no errors up to 300 requests per second, mean latency grows to 445 ms

    • 5% errors when 400 requests per second, mean latency of 2 s

2020-19-08:

  • 1 AWS EC2 server t2.medium (2 vCPU, 4 GB), no API Gateway

  • Docker node.js version

  • 1 user rendering 2 different templates

  • result: no errors up to 600 requests per second

Java Docker version

2020-11-07:

  • 1 AWS EC2 server t2.medium (2 vCPU, 4 GB), no API Gateway

  • Docker Java version, Xmx3072m

  • 1 user rendering 1 complex template (bx)

  • results:

    • no errors at 10 requests per second, mean latency is low at 69 ms

    • no errors up to 80 requests per second, mean latency grows to 14 s

    • 33% errors at 100 requests per second, high mean latency at 21 s