RosaeNLG node.js Server

Server for RosaeNLG.

Install & Run the server

Using npm and node.js

mkdir my-rosaenlg-server
cd my-rosaenlg-server
npm init --yes
npm install rosaenlg-node-server
npx rosaenlg-server

Using Docker

Not saving templates:

docker run -p 5000:5000 -i rosaenlg/server:latest

Saving templates on the disk (here on an AWS EC2 server): you have to define the ROSAENLG_HOMEDIR variable and map volumes:

mkdir templates
docker run -p 80:5000 --env "ROSAENLG_HOMEDIR=/templates" -v /home/ec2-user/templates:/templates -i rosaenlg/server:latest

You can also use S3, see below.

Plain exe file

You can easily generate a plain .exe file for the server using Vercel pkg.

The .exe file is not shipped. To build it:

  • install Vercel pkg: npm install -g pkg

  • in the sources: go to packages/rosaenlg-node-server, and run pkg package.json --targets node12-win-x64 (if you are on Windows)

  • run rosaenlg-node-server.exe

Configuration

Port

Set the port using environment variable ROSAENLG_PORT. Default is 5000.

On Mac, 5000 port is often already used. You should either disable Airplay, that uses 5000, or change the port for RosaeNLG.

Persistence

By default, templates are not saved permanently: the server is empty when it starts, you can upload templates and render them, but they are lost when the server is shut down. To save templates, you must provide a path to the disk or S3 credentials. Templates will be saved when uploaded (as json files), and reloaded when the server restarts.

when using direct rendering, which is rendering a template with both the template and the data in the same request, the template is cached but will never be stored permanently. This means that you can just use the server without configuring any persistence, and just using direct rendering.

Configuration:

  • Persistence on disk: Set the environment variable ROSAENLG_HOMEDIR.

  • Persistence on S3:

    • set the 3 environment variables AWS_S3_BUCKET, AWS_S3_ACCESS_KEY_ID and AWS_S3_SECRET_ACCESS_KEY

    • optionally use AWS_S3_ENDPOINT to indicate an S3-compatible object storage service (for instance https://mys3service.com)

You can either save on disk or on S3, but not both.

You can also push new templates directly on the disk (using CI or whatever) and ask the server to reload them (see reload path), without having to restart the server. Follow these 2 constraints:

  1. filename must be $user#$templateId.json: you can use DEFAULT_USER (see User identification)

  2. a user field in the JSON (with the same value) must identify which user the template belongs to (for instance DEFAULT_USER)

Security

The server can be started without any security. This is relevant in a microservice architecture, when the server is not publicly visible, or for testing purposes. This is the default mode (or put environment variable JWT_USE to false).

For other scenarios JWT is available.

Run the server with the following environment variables set:

When calling the API you first have to request a JWT token, and then put it in the header when calling the API: Authorization: Bearer shhhhhhmysecret.

health (/health) and swagger (/api-docs) routes are not protected even when activating JWT.

Testing was done using auth0, adaptations for other contexts can be required.

Share templates

Shared templates are templates that can be rendered by anyone.

To activate the feature, you must set ROSAENLG_SHARED_USER: all the templates from this user can be rendered by anyone, but only their owner can get their content, update or delete it.

Additionally, when using the disk (with ROSAENLG_HOMEDIR set), you can set ROSAENLG_SHARED_DIR: this folder will contain the shared templates. Tey can be rendered by anyone, but no user can list them or modify them using the API (even the ROSAENLG_SHARED_USER). This is used when creating Docker images containing shared templates.

Log using CloudWatch

You can put logs in CloudWatch (optional). This requires a bunch of environment variables:

  • AWS_CW_LOG_GROUP_NAME: the log group name; must be created before

  • AWS_CW_LOG_STREAM_NAME: the log stream name; must also exist

  • AWS_CW_ACCESS_KEY_ID: access key ID

  • AWS_CW_SECRET_ACCESS_KEY: secret access key

  • AWS_CW_REGION: region

I had trouble creating the proper IAM policy, and finally used this:
{
   "Version":"2012-10-17",
   "Statement":[
      {
      "Action": [
        "logs:CreateLogStream",
        "logs:DescribeLogStreams",
        "logs:PutLogEvents",
        "logs:GetLogEvents"
      ],
      "Effect": "Allow",
      "Resource": "arn:aws:logs:YOUR_REGION:YOUR_AWS_ID:log-group:YOUR_LOG_GROUP:*"
      }
   ]
}

In a cluster

When using the server in a cluster, you have the following issue: the templates are loaded in a specific instance (the one which received the create template request), but not on the other ones. Thus the other nodes must be able to load the template from the storage when they need it.

Recommanded configuration is:

  • use S3, not the disk persistence

  • ROSAENLG_LAZY_STARTUP: usually put true (it defaults to false) so that the templates are not loaded when the server starts; they will get loaded once the servers needs them

  • ROSAENLG_FORGET_TEMPLATES: put true (it defaults to false) so that a server can forget the templates after a while (they will just be reloaded if they are necessary again)

An alternative is to use no persistence backend, and just allow direct render requests.

Documentation, swagger, OpenAPI

Static version is here.

When running the server, the documentation is directly available: http://localhost:5000/api-docs

User identification

Each user has his own separate space: user2 cannot see nor use user1 templates, etc.

  • When using JWT, the user is uniquely identified using sub property in the token.

  • When not using JWT:

    • You put a user ID in a header; indicate the header name using ROSAENLG_USER_ID_HEADER env variable.

    • If you do not identify users (which is a valid choice), user will default to DEFAULT_USER.

The name of the user cannot contain # char.

Output data, and not only text

The main feature is to output text in the renderedText field. Sometimes, data is computed in the templates (in JavaScript files), and you might wish to output this data as well.

  • in your template, use the outputData variable: - outputData.obj = {aaa: 'bbb'};

  • in the API answer, read the outputData field, which will here contain {"obj":{"aaa":"bbb"}

State management

The API is stateless. It do not keep the result of a previous call. When developing for instance a chatbot, you need to keep the state of the conversation somewhere outside the API.

Packaging the templates

RosaeNLG templates are typically developed on a node.js environment, as RosaeNLG is primarly a JavaScript library. Once the templates are developed, you can package them in a JSON package (instead of having multiple .pug files, which is not practical), deploy them on RosaeNLG Java Server and render texts.

To package the templates, use the RosaeNLG Packager.

Use the API - Exemple using cURL

Register a template

curl -X PUT \
  http://localhost:5000/templates \
  -H 'Accept: */*' \
  -H 'Accept-Encoding: gzip, deflate' \
  -H 'Connection: keep-alive' \
  -H 'Content-Type: application/json' \
  -d '{
  "templateId": "chanson",
  "src": {
    "entryTemplate": "chanson.pug",
    "compileInfo": {
      "activate": false,
      "compileDebug": false,
      "language": "fr_FR"
    },
    "templates": {
      "chanson.pug": "p\n  | il #[+verb(getAnonMS(), {verb: '\''chanter'\'', tense:'\''FUTUR'\''} )]\n  | \"#{chanson.nom}\"\n  | de #{chanson.auteur}\n"
    }
  }
}
'

You should get:

{
  "templateId":"chanson",
  "templateSha1":...,
  "ms":...}

Use the result templateSha1 to customize the following request and render the template with some input data:

curl -L -X POST \
  http://localhost:5000/templates/chanson/PUT_TEMPLATESHA1_HERE/render \
  -H 'Accept: */*' \
  -H 'Accept-Encoding: gzip, deflate' \
  -H 'Connection: keep-alive' \
  -H 'Content-Type: application/json' \
  -d '{
  "language": "fr_FR",
  "chanson": {
    "auteur": "Édith Piaf",
    "nom": "Non, je ne regrette rien"
  }
}'

You should get:

{
  "templateId":"chanson",
  "renderedText":"<p>Il chantera \"Non, je ne regrette rien\" d'Édith Piaf</p>",
  "renderOptions":{
    "language":"fr_FR"
  },
  "ms": ...
}
You can also ask curl to follow redirects, as, if the SHA1 is wrong, the server will automatically answer a 308 redirect. Use curl -L -X POST …​.

Misc

Do not use the Pug cache parameter, as:

  • anyway the render function of Pug is not used, so it is useless

  • the server already caches the compiled functions

Versions

rosaenlg-node-server version corresponding RosaeNLG version

ALWAYS THE SAME

ALWAYS THE SAME

1.5.0

1.5.0

1.4.1

1.4.1