Health Monitoring - Concept and Components
Third-party tools can access metrics about the health state of Congree and its server components. You can download, analyze, and visualize the data in your own environment, according to your needs.
Install Congree Telemetry Collector
To access the health data, install the Congree Telemetry Collector module with at least one exporter.
Currently, one exporter is available: the Prometheus Exporter.
During the installation process, add this module to the server configuration. Proceed as described in Configuring the Congree Telemetry Collector.
Metrics provided by Congree
Congree lets you collect availability metrics and measurement metrics.
Metric structure
All metrics provided by Congree services have the following structure:
congree_coreserver_uptime_seconds{instance="439e796d-9a7b-4ec6-9909-a3a4ffcaec02",job="CongreeCoreServer"}
Each metric carries the tag “instance”. It identifies a unique service instance, and its value is renewed after each restart. Each metric also carries the tag “job”, which identifies a specific service.
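To illustrate the structure described above, the following minimal sketch splits one metric line into its name, its tags, and its value (if one is present). It assumes the line follows the usual Prometheus text notation shown in the example; the helper name `parse_sample` is hypothetical and not part of Congree.

```python
import re

# Metric name, then {tags}, then an optional numeric value.
SAMPLE_RE = re.compile(
    r'^(?P<name>[A-Za-z_:][A-Za-z0-9_:]*)\{(?P<labels>.*)\}\s*(?P<value>\S+)?$'
)
# One tag in the form key="value".
LABEL_RE = re.compile(r'([A-Za-z_][A-Za-z0-9_]*)="((?:[^"\\]|\\.)*)"')


def parse_sample(line):
    """Split one metric line into (name, tags, value); value is None if absent."""
    match = SAMPLE_RE.match(line.strip())
    if match is None:
        raise ValueError(f"not a tagged metric line: {line!r}")
    labels = dict(LABEL_RE.findall(match.group("labels")))
    raw_value = match.group("value")
    value = float(raw_value) if raw_value is not None else None
    return match.group("name"), labels, value


name, labels, value = parse_sample(
    'congree_coreserver_uptime_seconds'
    '{instance="439e796d-9a7b-4ec6-9909-a3a4ffcaec02",job="CongreeCoreServer"}'
)
print(name)                # the metric name
print(labels["job"])       # the service that produced the metric
print(labels["instance"])  # changes after each service restart
```

Grouping samples by “job” and watching for a changed “instance” value is one possible way to notice that a service has been restarted.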
List of Congree services and their corresponding “job” tags.
Services with an empty “job” tag do not support health monitoring (yet).
Congree service | “job” tag |
|---|---|
Core server | CongreeCoreServer |
Data Storage wrapper | CongreeDataStorageServer |
Identity server | |
WEB API | |
Linguistic server | CongreeLinguisticServer |
Linguistic Agent service | Congree.LinguisticAgent |
UMMT server | CongreeUmmtServer |
Linguistic Compiler service | Congree.Compilationservice |
Authoring Memory server | CongreeAuthoringMemory |
Term Web back-end | CongreeTermWebConnector |
Quickterm back-end | CongreeQuickTermConnector |
Content Analysis | |
AI Correction service | |
TermSync service | Congree.TermSync |
Linguistic Reporting service | LinguisticReportingService |
Some metrics have additional tags that identify a particular service more precisely. For example, the linguistic connection pool:
congree_linguisticserver_connectionpool_request_count{culture="de-de",instance="640bb02b-4c31-451b-82b2-ab8138cc768b",job="CongreeLinguisticServer",server_name="Default"}
The tag “culture” defines the linguistic culture. The tag “server_name” defines the linguistic server name. The linguistic server name is only needed if multiple linguistic servers are installed.
Availability metrics
Each web service that supports it implements a status endpoint, for example https://<host>/LinguisticServer/status. This endpoint returns the status of the web service in JSON format.
Example response:
{
"version": "7.0.23332.02.20250521.Dev",
"upSince": "2025-05-23T12:57:57.1551312Z",
"uptime": "00:22:21.8325997"
}
List of web services that provide the status endpoint:
Web service | Status is available |
|---|---|
Core server | |
Data Storage wrapper | |
Identity server | |
WEB API | |
Linguistic server | |
UMMT server | |
Authoring Memory server | |
Term Web back end | |
Quickterm back end | |
Content Analysis | |
AI Correction service | |
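As an illustration of how the status response can be consumed, the following minimal sketch reads the example response shown above and converts the "uptime" field into seconds. It assumes that "uptime" uses the form [d.]hh:mm:ss[.fffffff]; the optional day prefix and the helper name `parse_timespan` are assumptions, not documented Congree behavior.

```python
import json
from datetime import timedelta


def parse_timespan(text):
    """Convert a '[d.]hh:mm:ss[.fffffff]' string into a timedelta."""
    days = 0
    head, minutes, seconds = text.split(":")
    if "." in head:
        # Assumed optional day prefix, for example "1.02:00:00".
        day_part, head = head.split(".")
        days = int(day_part)
    return timedelta(
        days=days, hours=int(head), minutes=int(minutes), seconds=float(seconds)
    )


# The example response from this section, embedded as text.
status = json.loads("""
{
  "version": "7.0.23332.02.20250521.Dev",
  "upSince": "2025-05-23T12:57:57.1551312Z",
  "uptime": "00:22:21.8325997"
}
""")

uptime = parse_timespan(status["uptime"])
print(status["version"])
print(int(uptime.total_seconds()))  # whole seconds since the service started
```

In a real setup, the JSON text would come from an HTTP request to the status endpoint instead of an embedded string.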
Uptime metric
Each service measures its uptime and provides a metric with a name in the following form:
<service name>_uptime_seconds.
This metric reports how long, in seconds, the service has been running since its last start.
http_check metric
The http_check metrics come from the HTTP check component of the OpenTelemetry Collector, which checks the health or availability of an HTTP endpoint.
Purpose: Monitors the health of an HTTP endpoint by sending requests and recording metrics like latency, status, and availability.
Typical Use Case: Monitoring microservices, APIs, or web endpoints to ensure they are responsive and returning expected HTTP status codes.
Documentation: https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/main/receiver/httpcheckreceiver/README.md
For Congree, the status endpoint has this structure: https://<host>/<web_service>/status.
This metric works only for services that provide a status (see table above).
Name | Description | Example |
|---|---|---|
| Time taken for the HTTP request | |
| HTTP status code of the last request | |
http_url is the address of the HTTP endpoint.
This receiver makes a request to the specified endpoint using the configured method. This scraper generates a metric with a label for each HTTP response status class with a value of 1 if the status code matches the class. For example, the following metrics will be generated:
If the endpoint returned a 200:
httpcheck_status{http_method="GET",http_status_class="1xx",http_status_code="200",http_url="..."} 0
httpcheck_status{http_method="GET",http_status_class="2xx",http_status_code="200",http_url="..."} 1
httpcheck_status{http_method="GET",http_status_class="3xx",http_status_code="200",http_url="..."} 0
httpcheck_status{http_method="GET",http_status_class="4xx",http_status_code="200",http_url="..."} 0
httpcheck_status{http_method="GET",http_status_class="5xx",http_status_code="200",http_url="..."} 0
If the endpoint returned a 404:
httpcheck_status{http_method="GET",http_status_class="1xx",http_status_code="404",http_url="..."} 0
httpcheck_status{http_method="GET",http_status_class="2xx",http_status_code="404",http_url="..."} 0
httpcheck_status{http_method="GET",http_status_class="3xx",http_status_code="404",http_url="..."} 0
httpcheck_status{http_method="GET",http_status_class="4xx",http_status_code="404",http_url="..."} 1
httpcheck_status{http_method="GET",http_status_class="5xx",http_status_code="404",http_url="..."} 0
Health checks
Congree provides a health check status of components by means of health check metrics. These metrics are based on the health check probe that is implemented with Microsoft Health Check libraries: https://learn.microsoft.com/en-us/aspnet/core/host-and-deploy/health-checks.
There are three health statuses: Healthy, Degraded, and Unhealthy. The health status is obtained from the metric <server name>_healthcheck_status.
Currently, the health status is only available for the linguistic server.
Status name | Corresponding metric value | Description |
|---|---|---|
Unhealthy | 0 | Indicates that the health check determined that the component was unhealthy, or an unhandled exception was thrown while executing the health check. |
Degraded | 0.5 | Indicates that the health check determined that the component was in a degraded state. |
Healthy | 1 | Indicates that the health check determined that the component was healthy. |
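The mapping in the table above can be expressed in a few lines. The following minimal sketch translates a metric value into its status name and derives an overall status from several values by taking the lowest one. Taking the minimum is an assumption made for illustration, and the function names `status_name` and `worst_status` are hypothetical.

```python
# Metric values and status names as listed in the table above.
STATUS_BY_VALUE = {0.0: "Unhealthy", 0.5: "Degraded", 1.0: "Healthy"}


def status_name(value):
    """Translate a healthcheck_status metric value into its status name."""
    try:
        return STATUS_BY_VALUE[float(value)]
    except KeyError:
        raise ValueError(f"unexpected health check value: {value!r}") from None


def worst_status(values):
    """Return the status name of the lowest value (assumption: worst one wins)."""
    return status_name(min(float(v) for v in values))


print(status_name(0.5))
print(worst_status([1, 0]))
```

With this approach, a single component reporting 0 is enough to treat the whole server as Unhealthy, which suits alerting rules that should react to any failing component.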
Example of health check metrics on the linguistic server:
congree_linguisticserver_healthcheck_status{culture="de-de",instance="640bb02b-4c31-451b-82b2-ab8138cc768b",job="CongreeLinguisticServer",name="Linguistic de-de",server_name="Default"} 1
congree_linguisticserver_healthcheck_status{culture="en-us",instance="640bb02b-4c31-451b-82b2-ab8138cc768b",job="CongreeLinguisticServer",name="Linguistic en-us",server_name="Default"} 0
In addition, all services that implement the status endpoint also implement a health endpoint, for example https://<host>/LinguisticServer/health. Example response:
{
"entries": {
"Linguistic de-de": {
"data": {
"culture": "de-de",
"server_name": "Default"
},
"description": null,
"duration": "00:00:00.0000059",
"exception": null,
"status": "Healthy",
"tags": []
},
"Linguistic en-us": {
"data": {
"culture": "en-us",
"server_name": "Default"
},
"description": "Failed to load projects: An error occurred on Congree Linguistic Engine (Englisch) at localhost: Linguistic Engine is not available. Error: Connection refused by server.",
"duration": "00:00:00.0000057",
"exception": null,
"status": "Unhealthy",
"tags": []
}
},
"status": "Unhealthy",
"totalDuration": "00:00:00.0012756"
}
Measurements
Currently, the Linguistic server collects only request-related data. It offers the following metrics:
Name | Unit | Description |
|---|---|---|
| job (count) | Total number of attempts to retrieve jobs / counter |
| ms | Time taken to process jobs in milliseconds / histogram |
| ms | Total time in milliseconds spent in job processing |
| count | Count of measurements of |
| job (count) | Total number of jobs received |
| job (count) | Total number of jobs processed |
| job (count) | Total number of jobs that failed |
| request (count) | Number of requests to the Linguistic Engine that are currently active |
| s | Request time to the Linguistic Engine in seconds. It is a histogram. |
| s | Total time in seconds spent in requests to the Linguistic Engine |
| count | Count of measurements of |
| request (count) | Total number of requests to the Linguistic Engine |
| request (count) | Total number of successful requests to the Linguistic Engine |
| request (count) | Total number of failed requests to the Linguistic Engine |
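The histogram metrics in the table above each come with a total time and a count of measurements, so an average duration can be derived by dividing one by the other. The following minimal sketch shows this for two consecutive readings. All numbers are made-up sample values, and the function names are hypothetical.

```python
def mean_duration(total_time, count):
    """Average duration over the whole lifetime of the service instance."""
    if count == 0:
        return None  # nothing measured yet
    return total_time / count


def mean_over_window(previous, current):
    """Average duration between two readings, each given as (total_time, count)."""
    delta_time = current[0] - previous[0]
    delta_count = current[1] - previous[1]
    if delta_count <= 0:
        # No new measurements, or the service restarted and the totals reset.
        return None
    return delta_time / delta_count


# Made-up readings: total time in milliseconds and number of measurements.
earlier = (1200.0, 10)
later = (1800.0, 14)

print(mean_duration(*later))             # average since service start
print(mean_over_window(earlier, later))  # average for the last interval
```

The interval-based average reacts faster to a recent slowdown than the lifetime average, because old measurements no longer dilute the result.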