Authoring Memory Sentence Segmentation
Important:
The sentence segmentation that Congree uses in the Authoring Memory is rule-based: Congree applies rules to determine where a sentence ends and where a new one begins. Sentence segmentation has a major impact on how previously saved, similar sentences are identified for the text entered in the editor.
Each stored sentence rule consists of three parts:
- The first part of the rule (e.g. [!]) indicates which separator the rule covers.
- The second part of the rule determines whether the rule defines the end of a sentence (+) or not (-).
- The third part of the rule is the core of the rule: the character sequence the rule applies to. For example, in the rule [!]+[!^_], the part [!^_] stands for "An exclamation mark followed by a space", which is interpreted as the end of the sentence.
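
The three-part structure described above can be illustrated with a small parser. This is only a sketch, not Congree's implementation: the names `SentenceRule` and `parse_rule` are hypothetical, and the assumption that optional spaces may surround the `+` or `-` sign is inferred from the way the rules are written on this page.

```python
import re
from typing import NamedTuple


class SentenceRule(NamedTuple):
    """Hypothetical container for the three parts of a sentence rule."""
    separator: str       # first part, e.g. "!"
    ends_sentence: bool  # second part: "+" -> True, "-" -> False
    core: str            # third part, e.g. "!^_"


# [separator] sign [core]; spaces around the sign are tolerated (assumption).
_RULE_RE = re.compile(
    r"^\[(?P<sep>[^\]]+)\]\s*(?P<sign>[+-])\s*\[(?P<core>[^\]]+)\]$"
)


def parse_rule(text: str) -> SentenceRule:
    """Split a rule such as '[!]+[!^_]' into its three parts."""
    match = _RULE_RE.match(text.strip())
    if match is None:
        raise ValueError(f"not a sentence rule: {text!r}")
    return SentenceRule(match["sep"], match["sign"] == "+", match["core"])
```

For example, `parse_rule("[!]+[!^_]")` returns `SentenceRule(separator='!', ends_sentence=True, core='!^_')`.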
Note:
The following table contains the most important sentence rules used in Congree by default:
Sentence rule | Meaning |
---|---|
[!]+[!^_] | An exclamation mark followed by a space is interpreted as the end of the sentence. |
[!] - [!^_^a] | An exclamation mark followed by a space and a lowercase letter is not interpreted as the end of the sentence. |
[.]+[.^_] | A period followed by a space is interpreted as the end of the sentence. |
[.] - [.^_^a] | A period followed by a space and a lowercase letter is not interpreted as the end of the sentence. |
[.] - [^_^n.] | A space followed by a one-digit number and a period is not interpreted as the end of the sentence. |
[?]+[?^_] | A question mark followed by a space is interpreted as the end of the sentence. |
[?] - [?^_^a] | A question mark followed by a space and a lowercase letter is not interpreted as the end of the sentence. |
[n]+[.\n] | A period followed by a backslash and the letter n is interpreted as the end of the sentence. |
[n]+[!\n] | An exclamation mark followed by a backslash and the letter n is interpreted as the end of the sentence. |
[n]+[?\n] | A question mark followed by a backslash and the letter n is interpreted as the end of the sentence. |
[t]+[.\t] | A period followed by a backslash and the letter t is interpreted as the end of the sentence. |
[t]+[!\t] | An exclamation mark followed by a backslash and the letter t is interpreted as the end of the sentence. |
[t]+[?\t] | A question mark followed by a backslash and the letter t is interpreted as the end of the sentence. |
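
To show how such rules could interact, the following sketch applies some of the space-based rules from the table to a text. It is an illustration under stated assumptions, not Congree's actual algorithm: the placeholder meanings (`^_` for a space, `^a` for a lowercase letter, `^n` for a single digit) are inferred from the table's descriptions, the precedence (a matching `-` rule overrides a matching `+` rule at the same separator) is an assumption, and the names `core_to_regex` and `split_sentences` are hypothetical.

```python
import re

# Assumed placeholder meanings, inferred from the table's descriptions.
PLACEHOLDERS = {"^_": " ", "^a": "[a-z]", "^n": "[0-9]"}

# (separator, ends_sentence, core) for some of the space-based rules above.
RULES = [
    ("!", True, "!^_"),
    (".", True, ".^_"),
    (".", False, ".^_^a"),
    (".", False, "^_^n."),
    ("?", True, "?^_"),
    ("?", False, "?^_^a"),
]


def core_to_regex(separator, core):
    """Translate a rule core into a regex. The first occurrence of the
    separator is wrapped in a group named 'sep' so its position in the
    text can be recovered (assumes the separator occurs in the core)."""
    parts, i, marked = [], 0, False
    while i < len(core):
        token = core[i:i + 2]
        if token in PLACEHOLDERS:
            parts.append(PLACEHOLDERS[token])
            i += 2
            continue
        literal = re.escape(core[i])
        if core[i] == separator and not marked:
            parts.append(f"(?P<sep>{literal})")
            marked = True
        else:
            parts.append(literal)
        i += 1
    return re.compile("".join(parts))


def split_sentences(text, rules=RULES):
    """Split text at separators matched by a '+' rule and by no '-' rule."""
    ends, blocked = set(), set()
    for separator, ends_sentence, core in rules:
        for match in core_to_regex(separator, core).finditer(text):
            (ends if ends_sentence else blocked).add(match.start("sep"))
    sentences, start = [], 0
    for pos in sorted(ends - blocked):
        sentences.append(text[start:pos + 1].strip())
        start = pos + 1
    tail = text[start:].strip()
    if tail:
        sentences.append(tail)
    return sentences
```

Under these assumptions, `split_sentences("Stop! See item 2. Then wait. and continue. Done? Yes.")` returns `['Stop!', 'See item 2. Then wait. and continue.', 'Done?', 'Yes.']`: the period after "2" is blocked by the digit rule, and the period before "and" is blocked by the lowercase-letter rule.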