Synonyms
Using synonyms to vary your output is good practice. For background on the approaches, read Ehud Reiter’s article about synonyms in NLG.
You can output simple synonyms (single words, etc.) with syn, and complex ones (which use mixins or other synonyms) with the synz > syn structure.
The algorithm that chooses which synonym to output works as follows:
- It is random based (nothing fancy, but efficient) or sequence based (one after the other), depending on the mode.
- It eliminates empty alternatives.
- You can ask the algorithm to globally choose the best alternative.
You should not use your own random numbers in your mixins, because it will break RosaeNLG’s ability to predict the next outputs. More about RosaeNLG and random numbers.
Basic synonyms using syn
The syn mixin is perfect for very basic synonyms.
Arguments can be single words, multiple words or anything, but not mixins. Please note that the argument is not an array: the mixin takes a variable number of arguments.
With this mixin the choice is always random. Use the synz > syn structure if you want more options, such as sequential output.
The syn_fct function
The syn_fct is not a mixin but a standard JavaScript function. Its argument is an array.
It is useful when you want random arguments in some other mixins. Remember: do not use your own random function.
Will randomly output the apple or an apple:
Complex synonyms with synz > syn structure
First example
When each synonymic alternative is complex text using mixins, syn doesn’t fit. You have to use the synz > syn structure.
You can put sentences or words or whatever you want in each syn.
Note on empty alternatives
RosaeNLG will always try to find a non-empty alternative in a synz > syn structure. When it triggers an empty one, it will go back and try to find a new one.
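The retry behavior described above can be modeled with a short JavaScript sketch. This is only an illustration of the idea, not RosaeNLG's actual implementation: the function name `pickNonEmpty` and the representation of alternatives as functions returning strings are assumptions made for the example.

```javascript
// Illustrative sketch only, not RosaeNLG's code.
// Each alternative is a function returning text; an empty string models an empty alternative.
function pickNonEmpty(alternatives, random = Math.random) {
  const remaining = alternatives.slice();
  while (remaining.length > 0) {
    // Draw one of the alternatives that has not been tried yet.
    const index = Math.floor(random() * remaining.length);
    const [candidate] = remaining.splice(index, 1);
    const text = candidate();
    if (text.trim() !== '') {
      return text;
    }
    // Empty result: discard this alternative and try another one.
  }
  return ''; // every alternative was empty
}

const hasPhone = false;
const result = pickNonEmpty([
  () => (hasPhone ? 'call us' : ''), // empty when there is no phone
  () => 'write to us',
]);
console.log(result); // "write to us"
```

Whichever alternative is drawn first, the empty one is skipped, so the non-empty text is always produced in this example.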
Choose randomly but try not to repeat
Pure random mode is the default, but it has a drawback: the same alternative can trigger again, which leads to inelegant text.
To avoid that, use mode:'once'. It will trigger each alternative randomly, but will try not to repeat the same alternative. When all alternatives have been triggered, it will reset.
| In general you should favor once instead of the default random. | 
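As a rough mental model of the once mode, here is a small JavaScript sketch. It is an assumption about the general idea (draw without replacement, then reset), not the library's real code, and `makeOncePicker` is a hypothetical name.

```javascript
// Illustrative sketch only, not RosaeNLG's code.
// Draws alternatives at random without repeating, and resets once all were used.
function makeOncePicker(alternatives, random = Math.random) {
  let notYetTriggered = alternatives.slice();
  return function nextOnce() {
    if (notYetTriggered.length === 0) {
      // Every alternative has been triggered: start a new round.
      notYetTriggered = alternatives.slice();
    }
    const index = Math.floor(random() * notYetTriggered.length);
    return notYetTriggered.splice(index, 1)[0];
  };
}

const nextOnce = makeOncePicker(['stone', 'jewel', 'gem']);
const firstRound = [nextOnce(), nextOnce(), nextOnce()];
const secondRound = [nextOnce(), nextOnce(), nextOnce()];
// Each round contains the three words exactly once, in a random order.
console.log(firstRound.join(' '), '/', secondRound.join(' '));
```

Note that in this simplified model a repetition can still occur across a reset (the last word of one round can be the first word of the next).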
Force a specific synonym to trigger
To force a specific synonym to trigger, use synz {force:3} (to trigger the 3rd one):
| This is useful while developing. | 
| If the forced alternative is empty, it will not be triggered (a non-empty one will be triggered instead). | 
Weights of each alternative
If you want to favor one alternative over the others, you can put a higher weight on the one you prefer:
The "the one I prefer" option will be triggered much more often (probability is 3/5).
weight must be a strictly positive integer.
| It is generally bad practice to use weight and mode: 'once' in the same structure: once an alternative has been triggered, it will be avoided whatever its weight. | 
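To see where a probability of 3/5 can come from, here is a minimal JavaScript sketch of a weighted random choice. It assumes three alternatives with weights 3, 1 and 1 (an assumption chosen to match the 3/5 figure); the function `pickWeighted` and the extra alternative texts are hypothetical and do not reflect RosaeNLG's internal code.

```javascript
// Illustrative sketch only, not RosaeNLG's code.
// Picks an alternative with a probability proportional to its weight.
function pickWeighted(alternatives, random = Math.random) {
  const total = alternatives.reduce((sum, alt) => sum + alt.weight, 0);
  let threshold = random() * total;
  for (const alt of alternatives) {
    threshold -= alt.weight;
    if (threshold < 0) {
      return alt.text;
    }
  }
  // Guard against floating point edge cases.
  return alternatives[alternatives.length - 1].text;
}

// Assumed weights: 3 + 1 + 1 = 5, so the first alternative has probability 3/5.
const weighted = [
  { text: 'the one I prefer', weight: 3 },
  { text: 'another one', weight: 1 },
  { text: 'yet another one', weight: 1 },
];

let preferredCount = 0;
const draws = 10000;
for (let i = 0; i < draws; i++) {
  if (pickWeighted(weighted) === 'the one I prefer') {
    preferredCount++;
  }
}
console.log(preferredCount / draws); // a value close to 0.6
```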
Choose each synonym alternative one after the other
Sometimes random is not the right way. You might prefer to trigger the first alternative, then the second one, and so on. Set the mode parameter to sequence to do that.
When called 5 times, will output: first second third fourth first
The weight parameter is meaningless in sequence mode.
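The sequence behavior amounts to a simple round-robin. The following JavaScript sketch models it; `makeSequencePicker` is a hypothetical helper written for illustration, not part of RosaeNLG.

```javascript
// Illustrative sketch only, not RosaeNLG's code.
// Returns the alternatives one after the other, wrapping around at the end.
function makeSequencePicker(alternatives) {
  let position = 0;
  return function nextInSequence() {
    const chosen = alternatives[position];
    position = (position + 1) % alternatives.length;
    return chosen;
  };
}

const nextInSequence = makeSequencePicker(['first', 'second', 'third', 'fourth']);
const outputs = [];
for (let call = 0; call < 5; call++) {
  outputs.push(nextInSequence());
}
console.log(outputs.join(' ')); // first second third fourth first
```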
Global synonym mode
Possible values for mode are:
- random (default)
- sequence
- once
By default, the synonyms are chosen randomly (random), and you can locally change this behavior using the sequence or once mode. You can also change the behavior globally using defaultSynoMode.
When you have changed defaultSynoMode, you can still override the default behavior locally using another mode.
| Using once as defaultSynoMode and setting sequence locally is a popular setting. | 
Choosing the best alternative globally with choosebest
Introduction and first example
| The standard synonym algorithm is and should be good enough for most usages. When there are inelegant repetitions in the generated texts, your first reflex should be a local fix using {mode:'sequence'}. | 
choosebest works as follows:
- it generates dozens of texts for a section, whatever its size or what it contains
- it chooses the textual alternative that contains the fewest close repetitions
For instance, if stone, gem and jewel are synonyms, the ranking from best to worst is: stone gem jewel / stone gem stone / stone stone gem / stone stone stone.
Let’s take a first example:
eachz i in [1,2,3] with {separator: ' '}
  synz
    syn
      | stone
    syn
      | jewel
    syn
      | gem
If you run that, you will randomly get gem jewel jewel, stone gem stone, etc., and sometimes gem jewel stone if you are lucky.
Let’s use choosebest:
It will generate the same text 100 times and keep the best alternative. Unless you are very unlucky, you will get gem jewel stone (still in a random order).
Usage
You can put choosebest anywhere to optimize synonyms in a section of text but you should use it at a paragraph level.
| choosebest has a heavy impact on performance, as the texts are generated multiple times. Use it cautiously and only when required. | 
| You cannot nest choosebest structures. But in the same template you can use multiple choosebest structures one after the other, for instance on each paragraph. | 
Advanced options
How it works
The scoring algorithm works like this:
- single words are extracted using the wink-tokenizer tokenizer, then lowercased
- stopwords are removed (you can customize the list of stopwords)
- when the same word appears multiple times, it raises the score depending on the distance between the two occurrences (if the occurrences are close, it raises the score a lot).
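The three steps above can be sketched in JavaScript. This is a simplified model under stated assumptions, not the library's scoring code: a plain regular expression stands in for wink-tokenizer, the penalty `1 / distance` is an assumed formula (the text only says that close occurrences raise the score a lot), distances are measured after stop word removal, and `repetitionScore` is a hypothetical name. A lower score means fewer close repetitions.

```javascript
// Illustrative sketch only, not RosaeNLG's code.
// Assumption: a simple regex replaces wink-tokenizer for this example.
function tokenize(text) {
  return text.toLowerCase().match(/[\p{L}\p{N}]+/gu) || [];
}

// Assumption: each pair of identical words adds 1 / distance to the score,
// so adjacent repetitions cost more than distant ones.
function repetitionScore(text, stopWords = []) {
  const ignored = new Set(stopWords.map((word) => word.toLowerCase()));
  const words = tokenize(text).filter((word) => !ignored.has(word));
  let score = 0;
  for (let i = 0; i < words.length; i++) {
    for (let j = i + 1; j < words.length; j++) {
      if (words[i] === words[j]) {
        score += 1 / (j - i);
      }
    }
  }
  return score;
}

for (const text of ['stone gem jewel', 'stone gem stone', 'stone stone gem', 'stone stone stone']) {
  console.log(text, '->', repetitionScore(text));
}
// stone gem jewel -> 0
// stone gem stone -> 0.5
// stone stone gem -> 1
// stone stone stone -> 2.5
```

With this assumed formula, the four texts are ranked in the same order as in the introduction, from best (lowest score) to worst.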
Max attempt
To set the maximum number of attempts used to find the best alternative:
- among local parameter: choosebest {among:20}
- defaultAmong global parameter: rosaenlgPug.render(myTemplate, { language: 'en_US', defaultAmong:10 })
- the default is 5
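The role of the number of attempts can be illustrated with a generic "generate several times, keep the best" loop in JavaScript. This is a sketch of the principle only: `chooseBest`, `countDuplicates` and `randomThreeWords` are hypothetical helpers, and the scorer is deliberately simpler than the real one.

```javascript
// Illustrative sketch only, not RosaeNLG's code.
// Generates `among` candidate texts and keeps the one with the lowest score.
function chooseBest(generate, score, among = 5) {
  let bestText = null;
  let bestScore = Infinity;
  for (let attempt = 0; attempt < among; attempt++) {
    const text = generate();
    const currentScore = score(text);
    if (currentScore < bestScore) {
      bestText = text;
      bestScore = currentScore;
    }
  }
  return bestText;
}

// Hypothetical scorer: number of words that duplicate an earlier word.
function countDuplicates(text) {
  const words = text.split(/\s+/).filter(Boolean);
  return words.length - new Set(words).size;
}

// Hypothetical generator: three random picks among three synonyms.
function randomThreeWords() {
  const synonyms = ['stone', 'jewel', 'gem'];
  const pick = () => synonyms[Math.floor(Math.random() * synonyms.length)];
  return [pick(), pick(), pick()].join(' ');
}

// With more attempts, a text without repetition becomes more likely.
console.log(chooseBest(randomThreeWords, countDuplicates, 20));
```

A higher number of attempts improves the odds of finding a repetition-free text, at the cost of generating the section that many times.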
Stop words customization
You can locally customize the list of stop words with:
- stop_words_add (string[]): list of stopwords to add to the standard stopwords list (NB: stop_words_add will be automatically lowercased)
- stop_words_remove (string[]): list of stopwords to remove from the standard stopwords list
- stop_words_override (string[]): replaces the standard stopword list (which is per language)
will output newStopWord newStopWord AAA newStopWord BBB.
choosebest param
  synz
    syn
      | thus thus thus AAA BBB
    syn
      | AAA AAA
will output AAA AAA, because thus is no longer considered a stop word.
| The standard list of stop words per language is here. | 
Force identical elements
Sometimes you want 2 or more words to be treated as identical for synonym purposes even though they are not. This is often the case for plurals (diamonds and diamond), as there is no integrated lemmatizer, or for similar words like phone, cellphone and smartphone.
Use identicals (string[][]) with lists of words that should be considered identical:
will output diamonds and pearl systematically.
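One way to picture the effect of identicals is as a normalization step applied before repetitions are counted. The JavaScript sketch below is an assumption about the general idea, not the library's implementation; `buildCanonicalMap` and `normalizeWords` are hypothetical names.

```javascript
// Illustrative sketch only, not RosaeNLG's code.
// Maps every word of a group to the first word of that group.
function buildCanonicalMap(identicals) {
  const canonical = new Map();
  for (const group of identicals) {
    const representative = group[0].toLowerCase();
    for (const word of group) {
      canonical.set(word.toLowerCase(), representative);
    }
  }
  return canonical;
}

// Replaces each word by its representative, so that grouped words count as repetitions.
function normalizeWords(words, identicals) {
  const canonical = buildCanonicalMap(identicals);
  return words.map((word) => {
    const lowered = word.toLowerCase();
    return canonical.has(lowered) ? canonical.get(lowered) : lowered;
  });
}

const normalized = normalizeWords(
  ['diamonds', 'and', 'diamond'],
  [['diamond', 'diamonds'], ['phone', 'cellphone', 'smartphone']]
);
console.log(normalized.join(' ')); // diamond and diamond
```

After normalization, diamonds and diamond look like the same word, so a text containing both would be scored as a repetition.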
Generate all alternatives
You may want to exhaustively generate all texts to ensure that they are correct.
| Even on a modest project, the combination of all possible texts can be huge. | 
General advice:
- When the output must be completely predictable and exhaustively tested (which can be the case for financial or medical applications), consider avoiding synonyms or using fewer of them.
- Use regression testing and test some of the outputs:
  - some with a fixed random seed (forceRandomSeed): must be an exact match
  - some random, but checking for invariants in the text (typically numbers or facts)
- If you really want to generate all texts, you can write a loop using forceRandomSeed:
  - set forceRandomSeed to 1, then 2, then 3, etc.
  - changing the random seed does not guarantee that a different text is generated each time: keep the generated texts as keys in a Map to see whether a text is new or not
  - stop looping once you no longer discover enough new texts
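The seed loop described in the last item could look like the following JavaScript sketch. The helper `collectTexts`, its `maxSeeds` and `patience` options, and the stand-in renderer are all assumptions made for illustration. In a real project, the renderer passed in would presumably wrap a call such as rosaenlgPug.render(myTemplate, { language: 'en_US', forceRandomSeed: seed }).

```javascript
// Illustrative sketch only: collects distinct texts by iterating over random seeds.
// `renderWithSeed` is a hypothetical function that renders the template for one seed.
function collectTexts(renderWithSeed, { maxSeeds = 1000, patience = 50 } = {}) {
  const seen = new Map(); // generated text -> first seed that produced it
  let seedsSinceLastNewText = 0;
  for (let seed = 1; seed <= maxSeeds && seedsSinceLastNewText < patience; seed++) {
    const text = renderWithSeed(seed);
    if (seen.has(text)) {
      seedsSinceLastNewText++;
    } else {
      seen.set(text, seed);
      seedsSinceLastNewText = 0;
    }
  }
  return seen;
}

// Stand-in renderer used only to make the example runnable without RosaeNLG.
const fakeRender = (seed) => ['stone', 'jewel', 'gem'][seed % 3];

const texts = collectTexts(fakeRender);
console.log(texts.size); // 3
console.log([...texts.keys()].join(', ')); // jewel, gem, stone
```

The `patience` threshold implements the stopping rule: the loop ends once that many consecutive seeds have produced no new text. This heuristic cannot prove that every possible text was found.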