Version: 2026.0

text

Methods

analyze

analyze(pAnalyzeRequest): Map<any, any>

Performs the analysis for a given analyze request, containing the text and the specified pattern.

The score returned for each pattern is a number; its absolute value, however, does not indicate the quality of the search result.

By specifying different patterns, you can compare the results of the analyses.

Parameters

pAnalyzeRequest

Request object for the text analysis, containing the required text, patterns and settings for the analysis. The request can be created via text.createAnalyzeTextRequest().

Returns

Map<any, any>

The result of the analysis as a "score" of the patterns, sorted by score descending.

Throws

May throw an exception.


Example

var data = "My first car was a Vauxhall Corsa 1.4. The next one was a red Corsa.";

// Patterns as a map with keys and values
var patternArr = new Object();
patternArr["plain"] = "Corsa";
patternArr["after"] = "Corsa*";
patternArr["all"] = "*Corsa*";



var analysisRequest = text.createAnalyzeTextRequest();
analysisRequest.setText(data);
analysisRequest.setPatterns(patternArr);
analysisRequest.setUseStopWords(false);
analysisRequest.setAnalyzer(text.ANALYZER_TYPE_TEXT);
analysisRequest.useDefaultOperatorOR();



question.showMessage(text.analyze(analysisRequest));
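
Because the scores are only meaningful relative to one another, a typical use is to pick the best-scoring pattern. The following is a minimal sketch in plain JavaScript. It assumes the result can be read as a plain object that maps each pattern key to a numeric score; the helper name `bestPattern` and the sample scores are hypothetical and were not produced by ADITO.

```javascript
// Hypothetical helper: returns the key with the highest score,
// or null if the score map is empty.
function bestPattern(scores) {
    var bestKey = null;
    var bestScore = -Infinity;
    for (var key in scores) {
        if (Object.prototype.hasOwnProperty.call(scores, key) && scores[key] > bestScore) {
            bestKey = key;
            bestScore = scores[key];
        }
    }
    return bestKey;
}

// Made-up scores for illustration only; real values come from text.analyze().
var sampleScores = { plain: 0.4, after: 0.4, all: 0.7 };
var winner = bestPattern(sampleScores);
```

With ties, the sketch keeps the first key it encounters, which matches a result that is already sorted by score descending.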

analyzeText

analyzeText(pText, pPatterns, pUseStopWords): Map<any, any>

Analyzes a text using the specified pattern.

The score returned for each pattern is a number; its absolute value, however, does not indicate the quality of the search result.

By specifying different patterns, you can compare the results of the analyses. The function uses ANALYZER_TYPE_TEXT_CLASSIC and the default operator 'AND' for the given patterns. The call is equivalent to text.analyze(text.createAnalyzeTextRequest().setText(pText).setPatterns(pPatterns).setUseStopWords(pUseStopWords).setAnalyzer(text.ANALYZER_TYPE_TEXT_CLASSIC).useDefaultOperatorAND()).

Parameters

pText
string | number | boolean

The text to be analyzed.

pPatterns
any

The search pattern (as a map with keys and values).

pUseStopWords
boolean

If 'true', stop words will be recognized. These are search terms in Lucene that will not be taken into account for the analysis.

Returns

Map<any, any>

The result of the analysis as a "score" of the patterns, sorted by score descending.

Throws

May throw an exception.


Example

var data = "My first car was a Vauxhall Corsa 1.4. The next one was a red Corsa.";

// Patterns as a map with keys and values
var patternArr = new Object();
patternArr["plain"] = "Corsa";
patternArr["after"] = "Corsa*";
patternArr["all"] = "*Corsa*";
var useStopWords = false;



question.showMessage(text.analyzeText(data, patternArr, useStopWords));

any2text

any2text(pContent, pDataType): string

Extracts the text parts from a given content, using the Tika parser. For further details, please refer to the documentation of Apache Tika.

Parameters

pContent
string | number | boolean

The content from which to extract the text.

pDataType
number

The expected type of the data. See util.DATA_*.

Returns

string

The text part of the content; or 'null' if no content was passed.

Throws

May throw an exception.


Example

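var any = "<html><body><h1>Heading</h1>" + "<p>Paragraph</p></body></html>";
// Use a separate variable name so that the text module is not overwritten.
var plainText = text.any2text(any, util.DATA_TEXT);
question.showMessage(plainText);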

createAnalyzeTextRequest

createAnalyzeTextRequest(): AnalyzeRequest

Creates a new, empty request for the text analysis. Valid values for the text and the patterns are required; all other settings are optional. Unless specified otherwise, the request recognizes stop words, uses text.ANALYZER_TYPE_TEXT, and applies the default operator AND.

Returns

A new empty request for the text analysis.


Example

var data = "My first car was a Vauxhall Corsa 1.4. The next one was a red Corsa.";



// Create the patterns as a map with keys and values
var patternArr = new Object();
patternArr["plain"] = "Corsa";
patternArr["after"] = "Corsa*";
patternArr["all"] = "*Corsa*";



// Create a new empty request object.
var analysisRequest = text.createAnalyzeTextRequest();
analysisRequest.setText(data); // Set the text to be analyzed.
analysisRequest.setPatterns(patternArr); // Set the search patterns.
analysisRequest.setUseStopWords(true); // Specify whether stop words should be filtered.
analysisRequest.setAnalyzer(text.ANALYZER_TYPE_TEXT); // Specify the analyzer type explicitly.
analysisRequest.useDefaultOperatorOR(); // Specify the default operator used for the patterns.



var res = text.analyze(analysisRequest); // Perform the analysis.

decodeFirst

decodeFirst(pEncoded): string

Retrieves the first value from a multi-string (return value from a table, return value from a list).

The result is written to a variable.

Parameters

pEncoded
string | number | boolean

Multi-string (e.g. "; One; Two;") from which the first value is to be retrieved.

Returns

string

The decoded element. If this element is empty or 'null', an empty string will be returned.

Throws

May throw an exception.


Example

var strg = "; One; Two; Three;";
var ret = text.decodeFirst(strg);
question.showMessage(ret);


// Example 2
var ret = text.decodeFirst(vars.get("$comp.someListComponent"));
question.showMessage(ret);

decodeMS

decodeMS(pEncoded): string[]

Retrieves all values from a multi-string and returns them as an array.

Parameters

pEncoded
string | number | boolean

Multi-string (e.g. "; One; Two;") from which the values are to be retrieved.

Returns

string[]

All elements of the multi-string, as an array (e.g. ["One", "Two"]).

Throws

May throw an exception.


Example

var ret = text.decodeMS(vars.get("$comp.someListComponent"));
question.showMessage(ret[0]);

encodeMS

encodeMS(pDecoded): string

Creates a multi-string from the array you passed.

Parameters

pDecoded
string[]

The array to be converted (e.g. ["One", "Two"]).

Returns

string

The multi-string (e.g. "; One; Two;"). The result can be processed, for example in list components.

Throws

May throw an exception.


Example

var multistrg = text.encodeMS(new Array("One", "Two"));
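
To make the multi-string layout concrete, here is a minimal plain-JavaScript sketch of the round trip, based only on the example format "; One; Two;" shown above. The function names `sketchEncodeMS` and `sketchDecodeMS` are hypothetical, and the sketch assumes that elements contain no semicolons; how the real text.encodeMS() and text.decodeMS() escape special characters is not covered here.

```javascript
function sketchEncodeMS(elements) {
    // ["One", "Two"] -> "; One; Two;"
    var result = ";";
    for (var i = 0; i < elements.length; i++) {
        result += " " + elements[i] + ";";
    }
    return result;
}

function sketchDecodeMS(encoded) {
    // "; One; Two;" -> ["One", "Two"]
    var parts = String(encoded).split(";");
    // Drop the empty pieces before the first and after the last semicolon.
    return parts.slice(1, parts.length - 1).map(function (part) {
        return part.trim();
    });
}

var roundTrip = sketchDecodeMS(sketchEncodeMS(["One", "Two"]));
```

Note that the sketch trims surrounding whitespace from every element, which the real functions may or may not do.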

formatDouble

formatDouble(pDouble, pPattern, pUseClientLocale?): string

Formats a decimal number using the specified formatting pattern.

Parameters

pDouble
string | number | boolean

The "double" value to be converted, present as a string.

pPattern
string | number | boolean

The formatting pattern to be used for formatting the number. See also "Number Formatting Patterns" in the JDito-JavaScript manual.

pUseClientLocale?
boolean

If 'true', the system gets the client's locale (language version) so that the number formatting complies with the language conventions used at the client (e.g. dot or comma used as a thousands separator).

Returns

string

The formatted "double" value.

Throws

May throw an exception.


Example

var formatted = text.formatDouble("10000.30", "#,##0.00", true);
question.showMessage(formatted);
// Here, '0' is a placeholder for a mandatory digit, '#' for an optional digit.
// For the German locale, "formatted" will be "10.000,30"
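
For comparison, the same German-locale result can be reproduced in standard JavaScript with Intl.NumberFormat. This is only an analogy to show what the pattern "#,##0.00" expresses (grouping plus exactly two decimal places); it is not how text.formatDouble() is implemented, and it assumes a JavaScript runtime that ships full locale data.

```javascript
// "#,##0.00": grouping separators, at least one integer digit, exactly two decimals
var germanFormat = new Intl.NumberFormat("de-DE", {
    useGrouping: true,
    minimumFractionDigits: 2,
    maximumFractionDigits: 2
});
var formattedDe = germanFormat.format(10000.30); // "10.000,30"
```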

formatLong

formatLong(pLong, pPattern): string

Formats an integer using the specified formatting pattern.

Parameters

pLong
string | number | boolean

The "long" value to be converted, present as a string.

pPattern
string | number | boolean

The formatting pattern to be used for formatting the number. See also "Number Formatting Patterns" in the JDito-JavaScript manual.

Returns

string

The formatted "long" value.

Throws

May throw an exception.


Example

var formatted = text.formatLong("1000030", "#,##0");
question.showMessage(formatted);
// Here, '0' is a placeholder for a mandatory digit, '#' for an optional digit.
// For the German locale, "formatted" will be "1.000.030"

hash

hash(pValue, pAlgorithm): string

Calculates the hash value of the string you passed.

Parameters

pValue
string | number | boolean

The value for which you want to calculate the hash value.

pAlgorithm
string | number | boolean

The hash algorithm to be used for calculating the hash value. Valid input values are MD5, SHA1, SHA256, JAVA. They are available as the constants text.HASH_*.

Returns

string

The calculated hash value.

Throws

May throw an exception.


Example

question.showMessage(text.hash("Quak", text.HASH_SHA256));

html2text

html2text(pHTML): string

Extracts the text parts from an HTML string.

Parameters

pHTML
string | number | boolean

The HTML text, formatted as HTML3.

Returns

string

The text part in the HTML code; or 'null' if no HTML code was passed.

Throws

May throw an exception.


Example

var html = "<html><body><h1>Heading</h1>" + "<p>Paragraph</p></body></html>";
// Use a separate variable name so that the text module is not overwritten.
var plainText = text.html2text(html);
question.showMessage(plainText);

html2textTika

html2textTika(pHTML): string

Extracts the text parts from an HTML string, using the Tika HTML parser.

Parameters

pHTML
string | number | boolean

The HTML text, formatted as HTML3.

Returns

string

The text part in the HTML code; or 'null' if no HTML code was passed.

Throws

May throw an exception.


Example

var html = "<html><body><h1>Heading</h1>" + "<p>Paragraph</p></body></html>";
// Use a separate variable name so that the text module is not overwritten.
var plainText = text.html2textTika(html);
question.showMessage(plainText);

If the string is not a valid HTML document, the extraction will most likely fail. The html and body tags are therefore mandatory.

parseCSV

parseCSV(pInput, pLineDelim, pFieldDelim, pFieldLimit): string[][]

Reads CSV contents and converts it to a JavaScript array.

Parameters

pInput
string | number | boolean

The contents of the CSV file.

pLineDelim
string | number | boolean

The line separator, e.g. "\r\n" (CRLF).

pFieldDelim
string | number | boolean

The field separator, e.g. ";"

pFieldLimit
string | number | boolean

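The field limiter, i.e. the character that encloses each field (e.g. '"'). Pass an empty string if the fields are not enclosed.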

Returns

string[][]

A two-dimensional JavaScript array.

Throws

May throw an exception.


Example

var data = swing.doClientIntermediate(swing.CLIENTCMD_GETDATA, new Array("C:/test/import.csv", util.DATA_TEXT));



var tab = text.parseCSV( data.replace(/(^\s+)|(\s+$)/g,""), '\r\n', ';', "" );
question.showMessage(tab[0][0] + " // " + tab[1][0]);
question.showMessage(tab[0][1] + " // " + tab[1][1]);
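
The roles of the three separator parameters can be illustrated with a deliberately naive plain-JavaScript sketch. It assumes that pFieldLimit is the character enclosing each field and that an empty string means "no enclosing character"; the function name `sketchParseCSV` is hypothetical. Unlike a real CSV parser, the sketch does not handle separators that occur inside enclosed fields.

```javascript
function sketchParseCSV(input, lineDelim, fieldDelim, fieldLimit) {
    return input.split(lineDelim).map(function (line) {
        return line.split(fieldDelim).map(function (field) {
            // Strip the enclosing limiter, if one is configured and present.
            if (fieldLimit !== "" &&
                field.length >= 2 * fieldLimit.length &&
                field.startsWith(fieldLimit) &&
                field.endsWith(fieldLimit)) {
                return field.slice(fieldLimit.length, field.length - fieldLimit.length);
            }
            return field;
        });
    });
}

var rows = sketchParseCSV('"Name";"City"\r\n"Smith";"London"', "\r\n", ";", '"');
// rows[0][0] is "Name", rows[1][1] is "London"
```

The result has the same shape as the return value described above: one inner array per line, one string per field.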

parseDocument

parseDocument(pBase64): string

Retrieves the text from a Base64-encoded document (PDF, DOC, etc.).

Parameters

pBase64
string | number | boolean

The document as a Base64-encoded string.

Returns

string

The text contained in the document.

Throws

May throw an exception.


Example

var file = question.askQuestion("Which file?", question.QUESTION_FILECHOOSER, null);
var data = swing.doClientIntermediate(swing.CLIENTCMD_GETDATA, [file, util.DATA_BINARY, null]);



var parsed = text.parseDocument(data);
var re = /Corsa/g;
var arr;
var result = [];



while ((arr = re.exec(parsed)) !== null)
{
result.push(arr.index);
}
question.showMessage(result);

parseDouble

parseDouble(pFormatted, pPattern, pUseClientLocale?): number

Parses a formatted string and returns its "double" value.

Parameters

pFormatted
string | number | boolean

The formatted "double" value to be converted, present as a string.

pPattern
string | number | boolean

The formatting pattern that describes the format of the input. See also "Number Formatting Patterns" in the JDito-JavaScript manual.

pUseClientLocale?
boolean

If 'true', the system gets the client's locale (language version) so that the number formatting complies with the language conventions used at the client (e.g. dot or comma used as a thousands separator).

Returns

number

The number as a "double" value.

Throws

May throw an exception.


Example

var backToRoots = text.parseDouble("10.000,30", "#,##0.00", true);
question.showMessage(backToRoots);
// For the German locale, "backToRoots" will be 10000.3

parseLong

parseLong(pFormatted, pPattern): number

Parses a formatted string and returns its "long" value.

Parameters

pFormatted
string | number | boolean

The formatted "long" value to be converted, present as a string.

pPattern
string | number | boolean

The formatting pattern that describes the format of the input. See also "Number Formatting Patterns" in the JDito-JavaScript manual.

Returns

number

The number as a "long" value.

Throws

May throw an exception.


Example

var backToRoots = text.parseLong("10.000,00", "#,##0.00");
question.showMessage(backToRoots);
// For the German locale, "backToRoots" will be 10000

parseText

parseText(pText, pEncoding): string

Reads the document you passed (e.g. HTML) and returns its text contents.

Parameters

pText
string | number | boolean

Data from which to retrieve the text contents.

pEncoding
string | number | boolean

Encoding of the text contents; if you specify 'null', the system uses the default server encoding.

Returns

string

The text contained in the document.

Throws

May throw an exception.


Example

var content = text.parseText(vars.getString("$comp.html"), "UTF-8");

replaceAll

replaceAll(pText, pReplacements): string

Replaces certain text strings in a text with others. This method is about 100 times faster than the default JavaScript .replace method.

Parameters

pText
string | number | boolean

The text where certain text strings are to be replaced.

pReplacements
any

A map whose keys are the strings to search for and whose values are the strings replacing them.

Returns

string

The text with the replaced strings.

Throws

May throw an exception.


Example

var data = "Raider is a chocolate bar, " + "Raider is sold in packs of two.";
var replacements = {}; // a map (plain object), not an array
replacements["Raider"] = "Twix";
var newText = text.replaceAll(data, replacements);
question.showMessage(newText);
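
The effect of the replacement map can be mimicked in plain JavaScript as follows. This sketch assumes that the keys are matched literally (not as regular expressions) and applied one after another; whether text.replaceAll() behaves identically in edge cases, such as overlapping keys, is not stated here. The name `sketchReplaceAll` is hypothetical.

```javascript
function sketchReplaceAll(input, replacements) {
    var result = String(input);
    Object.keys(replacements).forEach(function (search) {
        // split/join replaces every literal occurrence of `search`
        result = result.split(search).join(replacements[search]);
    });
    return result;
}

var replaced = sketchReplaceAll(
    "Raider is a chocolate bar, Raider is sold in packs of two.",
    { "Raider": "Twix" }
);
// replaced is "Twix is a chocolate bar, Twix is sold in packs of two."
```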

rtf2text

rtf2text(pRTF): string

Extracts the text from an RTF file.

Parameters

pRTF
string | number | boolean

The RTF text.

Returns

string

The plain text; 'null' if 'null' or an empty string was passed.

Throws

AditoException


rtf2textTika

rtf2textTika(pRTF): string

Extracts the text parts from an RTF file, using the Tika RTF parser.

Parameters

pRTF
string | number | boolean

The RTF file.

Returns

string

The text, or 'null' if the RTF does not contain any text.

Throws

May throw an exception.


Example

var content = text.rtf2textTika("{\\rtf1 Hello! \\line {\\i This} is \\b{\\i a piece of \\i0 formatted \\b0text}. \\par \\i0 The \\b0ende. }");
logging.show(content);

sanitize

sanitize(pOuterHtml, pWhiteList): string

Sanitizes the given string with a default safelist containing the following tags which are suitable for inline HTML text styling: "b", "u", "i", "sub", "sup", "s", "strong", "em", "ins", "q".

Parameters

pOuterHtml
string | number | boolean

The unsanitized HTML string

pWhiteList
string[]

A list of additional HTML tags to exclude from sanitization (e.g. "div", "p", "span"). Can also be null or an empty array.

Returns

string

A sanitized version of the HTML string
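
The following sketch shows one plausible call to sanitize(). To keep the snippet self-contained, the `text` module is passed in as a parameter instead of being used as a global; the helper name `sanitizeComment` and the choice of additional tags are hypothetical.

```javascript
// textModule: the ADITO text module (the global `text` in a JDito process)
function sanitizeComment(textModule, rawHtml) {
    // Allow "p" and "br" in addition to the default inline-styling tags.
    var extraTags = ["p", "br"];
    return textModule.sanitize(rawHtml, extraTags);
}

// In a JDito process this would be called as:
// var clean = sanitizeComment(text, "<p>Hello <b>world</b><script>alert(1)</script></p>");
```

Passing null or an empty array instead of `extraTags` restricts the output to the default safelist listed above.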


sanitizeRichText

sanitizeRichText(pOuterHtml, pWhiteList): string

Sanitizes the given string with a default safelist containing tags that are suitable for inline HTML text styling and an additional extensive list of allowed tags and attributes, suitable for rich HTML text.

Allowed tags: "address", "article", "aside", "footer", "header", "h1", "h2", "h3", "h4", "h5", "h6", "hgroup", "main", "nav", "section", "blockquote", "dd", "div", "dl", "dt", "figcaption", "figure", "hr", "li", "main", "ol", "p", "pre", "ul", "a", "abbr", "b", "bdi", "bdo", "br", "cite", "code", "data", "dfn", "em", "i", "kbd", "mark", "q", "rb", "rp", "rt", "rtc", "ruby", "s", "samp", "small", "span", "strong", "sub", "sup", "time", "u", "var", "wbr", "caption", "col", "colgroup", "table", "tbody", "td", "tfoot", "th", "thead", "tr", "iframe", "img", "label", "input", "b", "u", "i", "sub", "sup", "s", "strong", "em", "ins", "q"

Allowed attributes: ":all", "class", "id", "style", "name", "type", "disabled", "checked", "data-oembed-url", "td", "rowspan", "colspan", "a", "href", "target", "rel", "type", "data", "download", "iframe", "src", "frameborder", "width", "height", "allow", "allowfullscreen", "img", "src", "alt", "title"

Parameters

pOuterHtml
string | number | boolean

The unsanitized HTML string

pWhiteList
string[]

A list of additional HTML tags to exclude from sanitization (e.g. "div", "p", "span"). Can also be null or an empty array.

Returns

string

A sanitized version of the HTML string


split

split(pText, pRegex): string[]

Splits the string at each match of the specified regular expression. This method works much faster than the JavaScript .split method.

Parameters

pText
string | number | boolean

The text to be split.

pRegex
string | number | boolean

A regular expression that specifies where to split the string.

Returns

string[]

A string[] that contains the split elements.

Throws

May throw an exception.


Example

// Splitting a string
var data = "This is a long sentence! For this reason, it must be split.";
var regex = "!";
var parts = text.split(data, regex);
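
For orientation, the equivalent call with the built-in JavaScript method shows the shape of the result to expect. This is an analogy only: text.split() takes the regular expression as a string, and its regular-expression dialect may differ from JavaScript's.

```javascript
var sentence = "This is a long sentence! For this reason, it must be split.";
var pieces = sentence.split(new RegExp("!"));
// pieces[0] is "This is a long sentence"
// pieces[1] is " For this reason, it must be split."
```

The separator itself is not part of either element, and the leading space of the second element is preserved.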

strToDouble

strToDouble(pString): number

Converts a string to a decimal number.

Parameters

pString
string | number | boolean

The value to be converted, present as a string.

Returns

number

The "double" value converted from the string you passed.

Throws

May throw an exception.


Example

var res = text.strToDouble("25.8");

strToLong

strToLong(pString): number

Converts a string to an integer.

Parameters

pString
string | number | boolean

The value to be converted, present as a string.

Returns

number

The "long" value converted from the string you passed.

Throws

May throw an exception.


Example

var res = text.strToLong("313373");

text2html

text2html(pPlainText, pConvertLinks): string

Converts simple plain text to HTML text by replacing control characters, reserved characters, etc.

Parameters

pPlainText
string | number | boolean

The plain text to be converted to simple HTML text.

pConvertLinks
boolean

Defines whether links should be automatically converted to an HTML anchor (true) or not (false).

Returns

string

The plain text converted to HTML. The HTML is not wrapped in an <html> tag or a <body> tag.

Example

// Use a separate variable name so that the text module is not overwritten.
var plain = "characters like these: < or > or newlines \n will be escaped";
var html = text.text2html(plain, false);
logging.log(html);

toCSV

toCSV(pElements, pLineDelim, pFieldDelim, pFieldLimit): string

Creates a CSV entry from a JavaScript array.

Parameters

pElements
string[][]

The JavaScript array from which to create the CSV entry.

pLineDelim
string | number | boolean

The line separator.

pFieldDelim
string | number | boolean

The field separator.

pFieldLimit
string | number | boolean

The field limiter, i.e. the character that encloses each field (e.g. '"').

Returns

string

The array as a CSV-formatted string.

Throws

May throw an exception.

Example

import("lib_table4report");
import("lib_document");

// Example: Output data from a read-in CSV file to a new CSV file
var data = swing.doClientIntermediate(swing.CLIENTCMD_GETDATA, new Array("C:/temp/import.csv", util.DATA_TEXT));
var tab = text.parseCSV(data.replace(/(^\s+)|(\s+$)/g, ""), "\r\n", ";", "");

// toCSV returns a string, so no array is needed to collect the result.
var table = text.toCSV(tab, "\r\n", ";", '"');

var fname = question.askQuestion("Please select the desired file", question.QUESTION_FILECHOOSER, "");

if (fname != null && fname != "")
{
    fname = fname + ".csv";
    if (!FileIOwithError(swing.CLIENTCMD_STOREDATA, [fname, table, util.DATA_TEXT, false]))
        FileIOwithError(swing.CLIENTCMD_OPENFILE, new Array(fname));
}

Properties

ANALYZER_TYPE_PLAIN

string

Simple analyzer that leaves the original text untouched as much as possible. The search is case-insensitive and special characters are converted to their ASCII counterparts. All occurring punctuation and special characters are preserved and taken into account in the search.


ANALYZER_TYPE_SIMPLE

string

Analyzer that decomposes the text according to Lucene standards. The analyzer does not take English and German language features into account during normalisation. The search is case-insensitive and special characters are converted to their ASCII counterparts. All occurring punctuation and special characters, with a few exceptions, are ignored and cannot be searched.


ANALYZER_TYPE_TEXT

string

Standard analyzer which decomposes the text according to the current Lucene standard. The analyzer performs a simple normalisation for English and German text elements. The search is case-insensitive and special characters are converted to their ASCII counterparts. All occurring punctuation and special characters, with a few exceptions, are ignored and cannot be searched.


ANALYZER_TYPE_TEXT_CLASSIC

string

Classic analyzer that decomposes the text according to the old Lucene 4.0 standard. The analyzer performs a simple normalisation for English and German text elements. The search is case-insensitive and special characters are converted to their ASCII counterparts. All occurring punctuation and special characters, with a few exceptions, are ignored and cannot be searched.


HASH_JAVA

string

HashCode: JAVA


HASH_MD5

string

HashCode: MD5


HASH_SHA1

string

HashCode: SHA1


HASH_SHA256

string

HashCode: SHA256