RosaeNLG node.js Server
Server for RosaeNLG.
Install & Run the server
Using npm and node.js
mkdir my-rosaenlg-server
cd my-rosaenlg-server
npm init --yes
npm install rosaenlg-node-server
npx rosaenlg-server
Using Docker
Not saving templates:
docker run -p 5000:5000 -i rosaenlg/server:latest
Saving templates on the disk (here on an AWS EC2 server): you have to define the ROSAENLG_HOMEDIR
variable and map volumes:
mkdir templates
docker run -p 80:5000 --env "ROSAENLG_HOMEDIR=/templates" -v /home/ec2-user/templates:/templates -i rosaenlg/server:latest
You can also use S3, see below.
Plain exe file
You can easily generate a plain .exe file for the server using Vercel pkg.
The .exe file is not shipped. To build it:
-
install Vercel pkg:
npm install -g pkg
-
in the sources: go to
packages/rosaenlg-node-server
, and runpkg package.json --targets node12-win-x64
(if you are on Windows) -
run
rosaenlg-node-server.exe
Configuration
Persistence
By default, templates are not saved permanently: the server is empty when it starts, you can upload templates and render them, but they are lost when the server is shut down. To save templates, you must provide a path to the disk or S3 credentials. Templates will be saved when uploaded (as json files), and reloaded when the server restarts.
when using direct rendering, which is rendering a template with both the template and the data in the same request, the template is cached but will never be stored permanently. This means that you can just use the server without configuring any persistence, and just using direct rendering. |
Configuration:
-
Persistence on disk: Set the environment variable
ROSAENLG_HOMEDIR
. -
Persistence on S3:
-
set the 3 environment variables
AWS_S3_BUCKET
,AWS_S3_ACCESS_KEY_ID
andAWS_S3_SECRET_ACCESS_KEY
-
optionally use
AWS_S3_ENDPOINT
to indicate an S3-compatible object storage service (for instancehttps://mys3service.com
)
-
You can either save on disk or on S3, but not both. |
You can also push new templates directly on the disk (using CI or whatever) and ask the server to reload them (see reload
path), without having to restart the server. Follow these 2 constraints:
-
filename must be
$user#$templateId.json
: you can useDEFAULT_USER
(see User identification) -
a
user
field in the JSON (with the same value) must identify which user the template belongs to (for instanceDEFAULT_USER
)
Security
The server can be started without any security. This is relevant in a microservice architecture, when the server is not publicly visible, or for testing purposes. This is the default mode (or put environment variable JWT_USE
to false).
For other scenarios JWT is available.
Run the server with the following environment variables set:
-
JWT_USE
to true -
JWT_JWKS_URI
points to JWKS URI, for instancehttps://xxxxx.auth0.com/.well-known/jwks.json
when using auth0 -
JWT_AUDIENCE
: audience -
JWT_ISSUER
: issuer, for instancehttps://xxxxxx.eu.auth0.com/
when auth0
When calling the API you first have to request a JWT token, and then put it in the header when calling the API: Authorization: Bearer shhhhhhmysecret
.
health (/health ) and swagger (/api-docs ) routes are not protected even when activating JWT.
|
Testing was done using auth0, adaptations for other contexts can be required.
Share templates
Shared templates are templates that can be rendered by anyone.
To activate the feature, you must set ROSAENLG_SHARED_USER
: all the templates from this user can be rendered by anyone, but only their owner can get their content, update or delete it.
Additionally, when using the disk (with ROSAENLG_HOMEDIR
set), you can set ROSAENLG_SHARED_DIR
: this folder will contain the shared templates. Tey can be rendered by anyone, but no user can list them or modify them using the API (even the ROSAENLG_SHARED_USER
). This is used when creating Docker images containing shared templates.
Log using CloudWatch
You can put logs in CloudWatch (optional). This requires a bunch of environment variables:
-
AWS_CW_LOG_GROUP_NAME
: the log group name; must be created before -
AWS_CW_LOG_STREAM_NAME
: the log stream name; must also exist -
AWS_CW_ACCESS_KEY_ID
: access key ID -
AWS_CW_SECRET_ACCESS_KEY
: secret access key -
AWS_CW_REGION
: region
I had trouble creating the proper IAM policy, and finally used this: |
{
"Version":"2012-10-17",
"Statement":[
{
"Action": [
"logs:CreateLogStream",
"logs:DescribeLogStreams",
"logs:PutLogEvents",
"logs:GetLogEvents"
],
"Effect": "Allow",
"Resource": "arn:aws:logs:YOUR_REGION:YOUR_AWS_ID:log-group:YOUR_LOG_GROUP:*"
}
]
}
In a cluster
When using the server in a cluster, you have the following issue: the templates are loaded in a specific instance (the one which received the create template request), but not on the other ones. Thus the other nodes must be able to load the template from the storage when they need it.
Recommanded configuration is:
-
use S3, not the disk persistence
-
ROSAENLG_LAZY_STARTUP
: usually puttrue
(it defaults tofalse
) so that the templates are not loaded when the server starts; they will get loaded once the servers needs them -
ROSAENLG_FORGET_TEMPLATES
: puttrue
(it defaults tofalse
) so that a server can forget the templates after a while (they will just be reloaded if they are necessary again)
An alternative is to use no persistence backend, and just allow direct render
requests.
Documentation, swagger, OpenAPI
Static version is here.
When running the server, the documentation is directly available: http://localhost:5000/api-docs
User identification
Each user has his own separate space: user2
cannot see nor use user1
templates, etc.
-
When using JWT, the user is uniquely identified using
sub
property in the token. -
When not using JWT:
-
You put a user ID in a header; indicate the header name using
ROSAENLG_USER_ID_HEADER
env variable. -
If you do not identify users (which is a valid choice), user will default to
DEFAULT_USER
.
-
The name of the user cannot contain #
char.
Output data, and not only text
The main feature is to output text in the renderedText
field.
Sometimes, data is computed in the templates (in JavaScript files), and you might wish to output this data as well.
-
in your template, use the
outputData
variable:- outputData.obj = {aaa: 'bbb'};
-
in the API answer, read the
outputData
field, which will here contain{"obj":{"aaa":"bbb"}
State management
The API is stateless. It do not keep the result of a previous call. When developing for instance a chatbot, you need to keep the state of the conversation somewhere outside the API.
Packaging the templates
RosaeNLG templates are typically developed on a node.js environment, as RosaeNLG is primarly a JavaScript library. Once the templates are developed, you can package them in a JSON package (instead of having multiple .pug
files, which is not practical), deploy them on RosaeNLG Java Server and render texts.
To package the templates, use the RosaeNLG Packager.
Use the API - Exemple using cURL
Register a template
curl -X PUT \
http://localhost:5000/templates \
-H 'Accept: */*' \
-H 'Accept-Encoding: gzip, deflate' \
-H 'Connection: keep-alive' \
-H 'Content-Type: application/json' \
-d '{
"templateId": "chanson",
"entryTemplate": "chanson.pug",
"compileInfo": {
"activate": false,
"compileDebug": false,
"language": "fr_FR"
},
"templates": {
"chanson.pug": "p\n | il #[+verb(getAnonMS(), {verb: '\''chanter'\'', tense:'\''FUTUR'\''} )]\n | \"#{chanson.nom}\"\n | de #{chanson.auteur}\n"
}
}
'
You should get:
{
"templateId":"chanson",
"templateSha1":...,
"ms":...}
Render the template with some input data:
curl -X POST \
http://localhost:5000/templates/chanson/1bfdbcd203ec8e6f889b068fbb2d7d298b1db903/render \
-H 'Accept: */*' \
-H 'Accept-Encoding: gzip, deflate' \
-H 'Connection: keep-alive' \
-H 'Content-Type: application/json' \
-d '{
"language": "fr_FR",
"chanson": {
"auteur": "Édith Piaf",
"nom": "Non, je ne regrette rien"
}
}'
You should get:
{
"templateId":"chanson",
"renderedText":"<p>Il chantera \"Non, je ne regrette rien\" d'Édith Piaf</p>",
"renderOptions":{
"language":"fr_FR"
},
"ms": ...
}