Monitoring and Healthcheck
EJBCA contains a health check service that can be used for health monitoring remotely. This should be considered essential if running EJBCA in a clustered installation, as it can be used to determine whether a node is able to remain in the cluster or needs to be taken out.
The purpose of a Healthcheck is to report when something is not as expected, so that a monitoring system can send alarms and cluster nodes can be taken offline.
A common source of misunderstanding is that taking a CA offline (CA Service State) is expected and does not result in Healthcheck warnings. If you have deactivated a CA, you have done so deliberately: everything is as it should be, and the Healthcheck should not warn. In contrast, if the CA is activated but its Crypto Token goes offline, the CA is expected to be online but cannot be, and the Healthcheck should therefore warn.
Note that a configuration as in the following example will thus not result in Healthcheck warnings.
Servlet URL
The servlet is located at the URL: http://localhost:8080/ejbca/publicweb/healthcheck/ejbcahealth
Note that the client (e.g. the load balancer) is responsible for closing the connection to the application server. Failure to do so may result in denial of service, preventing other clients from connecting to EJBCA.
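As an illustration of a well-behaved client, the following minimal Python sketch polls the servlet once and always closes the connection afterwards. The function names `is_healthy` and `check_health` are hypothetical, the URL is the one given above, and the sketch assumes that a healthy node answers with HTTP 200 and a body consisting only of the OK message (ALLOK by default).

```python
import urllib.error
import urllib.request

# URL taken from this page; adjust host and port for your installation.
HEALTH_URL = "http://localhost:8080/ejbca/publicweb/healthcheck/ejbcahealth"


def is_healthy(status: int, body: str, ok_message: str = "ALLOK") -> bool:
    """Hypothetical helper: healthy only if HTTP 200 and the body is the OK message.

    Assumption: the response body contains nothing but the OK message.
    """
    return status == 200 and body.strip() == ok_message


def check_health(url: str = HEALTH_URL, timeout: float = 5.0) -> bool:
    """Hypothetical helper: poll the servlet once and close the connection."""
    try:
        # The with-block closes the response and its socket on exit,
        # even if reading the body raises an exception.
        with urllib.request.urlopen(url, timeout=timeout) as response:
            body = response.read().decode("utf-8", errors="replace")
            return is_healthy(response.status, body)
    except urllib.error.HTTPError as err:
        # Error statuses such as 500 are raised as HTTPError; close it explicitly.
        err.close()
        return False
    except OSError:
        # Covers URLError, timeouts, and refused connections.
        return False
```

A load balancer script could call `check_health()` at a fixed interval and take the node out of rotation when it returns `False`. The timeout keeps a hung node from holding the client's connection open indefinitely.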
Configuration
The CAs that are checked by the health check service can be configured in the Admin Web, on the CA Activation page as well as on the Edit CA page.
Common Criteria Compliance
To be fully Common Criteria compliant, signature tests should use a different key than certificate signing in the CA's HSM token configuration (the "testKey" alias should point to a key with no other uses).
The behavior of the servlet can be modified by configuring the values below in conf/ejbca.properties.
General Configuration
The following configuration parameters may be set to configure authorization and what the service checks:
Key | Default | Description |
healthcheck.amountfreemem | 1 | The amount of memory that must be free on the server, in megabytes. |
healthcheck.dbquery | select 1 | The query string used to perform a minimal check that the database is working. |
healthcheck.authorizedips | 127.0.0.1 | Specifies which remote IPs may call this healthcheck servlet. Multiple IPs may be separated by a semicolon. |
healthcheck.catokensigntest | false | Set to true to perform a test signature on each CA token during the check. Otherwise, the check only verifies that the token status is active. |
healthcheck.publisherconnections | false | Set to true to perform a health test on all active publisher connections. |
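To make the defaults concrete, the sketch below overlays a properties fragment on the documented default values and splits the semicolon-separated IP list. It is only an illustration: `effective_settings` and `authorized_ips` are hypothetical helpers, the parsing handles plain `key=value` lines rather than the full Java properties syntax, and the address 10.0.0.5 is a made-up load balancer address.

```python
# Defaults as listed in the General Configuration table.
DEFAULTS = {
    "healthcheck.amountfreemem": "1",
    "healthcheck.dbquery": "select 1",
    "healthcheck.authorizedips": "127.0.0.1",
    "healthcheck.catokensigntest": "false",
    "healthcheck.publisherconnections": "false",
}

# Example fragment for conf/ejbca.properties (10.0.0.5 is a hypothetical address).
SAMPLE = """
# Allow a load balancer to poll this node and enable test signatures.
healthcheck.authorizedips=127.0.0.1;10.0.0.5
healthcheck.catokensigntest=true
"""


def effective_settings(properties_text: str) -> dict[str, str]:
    """Hypothetical helper: documented defaults overridden by key=value lines."""
    settings = dict(DEFAULTS)
    for line in properties_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue  # skip blank lines, comments, and malformed lines
        key, _, value = line.partition("=")
        if key.strip() in DEFAULTS:
            settings[key.strip()] = value.strip()
    return settings


def authorized_ips(settings: dict[str, str]) -> list[str]:
    """Split the semicolon-separated healthcheck.authorizedips value."""
    raw = settings["healthcheck.authorizedips"]
    return [ip.strip() for ip in raw.split(";") if ip.strip()]
```

With `SAMPLE` as input, only the two listed keys change; all other keys keep the defaults from the table.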
Maintenance File Properties
Key | Default | Description |
healthcheck.maintenancefile | | Location of the file containing maintenance information. |
healthcheck.maintenancepropertyname | DOWN_FOR_MAINTENANCE | The name of the property in the maintenance file. The entry should have the following format: DOWN_FOR_MAINTENANCE=true. |
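A maintenance file in the format above can be toggled from a script before and after planned work on a node. The following sketch assumes the file contains only the single property; `set_maintenance` and `is_down_for_maintenance` are hypothetical helper names, and treating a missing file as "not down" is a choice made by this sketch, not a statement about how EJBCA behaves.

```python
from pathlib import Path


def set_maintenance(path: str, down: bool,
                    property_name: str = "DOWN_FOR_MAINTENANCE") -> None:
    """Hypothetical helper: write the maintenance file with the flag true or false."""
    value = "true" if down else "false"
    Path(path).write_text(f"{property_name}={value}\n", encoding="utf-8")


def is_down_for_maintenance(path: str,
                            property_name: str = "DOWN_FOR_MAINTENANCE") -> bool:
    """Hypothetical helper: read the flag back from the maintenance file.

    Assumption of this sketch: a missing file or missing key means "not down".
    """
    file = Path(path)
    if not file.is_file():
        return False
    for line in file.read_text(encoding="utf-8").splitlines():
        key, sep, value = line.partition("=")
        if sep and key.strip() == property_name:
            return value.strip().lower() == "true"
    return False
```

The path passed to these helpers would be the same location configured in healthcheck.maintenancefile, and `property_name` must match healthcheck.maintenancepropertyname if that default has been changed.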
Servlet Configuration
The following parameters configure what message or HTTP error code the health service returns.
Key | Default | Description |
healthcheck.okmessage | ALLOK | Text string returned to indicate that everything is OK with this node. Any defined property value can be used here by inserting it as a property, e.g: |
healthcheck.sendservererror | true | Set to true if the HTTP error code 500 should be sent in case of error. |
healthcheck.customerrormessage | null | Allows for a custom error message to be configured. |
Error Messages
If an error is detected, one or more of the following error messages are reported. All errors are sent with a response code of 500 (when healthcheck.sendservererror is true, which is the default).
Error | Description |
MEM: Error Virtual Memory is about to run out, currently free memory : number | The JVM is about to run out of memory. |
DB: Error creating connection to database | The JDBC connection to the database failed. This might occur if the database crashes or the network is down. |
CA: Error CA Token is disconnected: CAName | This is a sign of hardware problems with one or several of the hard CA tokens in the node. |
MAINT: DOWN_FOR_MAINTENANCE | Reported when healthcheck.maintenancefile is used and the node is set to be offline. |
Error when testing the connection with publisher: PublisherName | Reported when a test connection to one of the publishers failed. |
Could not perform a test signature on the audit log. | Reported when signing of the audit log failed (if database protection is enabled). |
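A monitoring system may want to route alarms by error type. The sketch below maps a response body to coarse categories by looking for the message texts from the table above. The category names and the `classify` function are hypothetical, and substring matching is used because this page does not specify how multiple messages are separated in the response.

```python
# Marker text from the error table, paired with a hypothetical category name.
ERROR_MARKERS = [
    ("MEM:", "memory"),
    ("DB:", "database"),
    ("CA:", "ca_token"),
    ("MAINT:", "maintenance"),
    ("Error when testing the connection with publisher", "publisher"),
    ("Could not perform a test signature on the audit log", "audit_log"),
]


def classify(body: str) -> list[str]:
    """Hypothetical helper: categories whose marker text occurs in the body.

    Returns an empty list when no known error message is found.
    """
    return [category for marker, category in ERROR_MARKERS if marker in body]
```

For example, a body containing both the MEM and CA messages yields two categories, so a single failed poll can raise more than one alarm. If healthcheck.customerrormessage is configured, the body may no longer contain these texts and this kind of matching would not apply.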