You can configure Acrolinx Analytics to write reporting data to a database of your choice. Acrolinx Analytics comes installed with a default Apache Derby database for testing the reporting functionality.
For an up-to-date list of supported database formats, visit the Standard Stack Compatibility Guide.
The reporting database can take up a lot of disk space. You can set up a purge task to prevent the database from getting too large.
The purge task deletes all records associated with old client sessions, checks, or Content Group states.
The purge task deletes records from the following tables:
- DocumentMetaInfoEjb
- FlagEjb
- MatchEjb
- RepositoryStateEjb
- ReuseFlagEjb
- RuleFlagEjb
- ScorecardAccessEjb
- SentenceTooLongFlagEjb
- SeoFlagEjb
- SuggestionEjb
- TermFieldValueEjb
- TermFlagEjb
- TermHarvestingFlagEjb
- TypeStatisticsEjb
- VoiceCharacteristicEjb
The purge task doesn't change any other tables.
By default, the purge task runs every night at 02:00 UTC. You can also configure a different schedule.
To configure the purge task, follow these steps:
- Open your overlay of the core server properties file.
You can find the overlay in the following location:
<config directory>/server/bin/coreserver.properties
- Add the following properties:
Property
Description
reportDbPurge.use=true
Turn on database purging.
reportDbPurge.purgeBeforeThatManyMonths=<MONTHS>
Specify the minimum age in months that makes a record purgeable.
(For this setting, a "month" always counts as 31 days of 24 hours each.)
The default value is 36 (=3 years).
For example, to remove records that are older than 6 months, enter the following value:
reportDbPurge.purgeBeforeThatManyMonths=6
reportDbPurge.cron.schedule=<CRON_EXPRESSION>
Specify a Quartz cron expression that defines when the purge task runs.
By default, the purge task runs every night at 02:00.
For example, to run the purge task every Saturday, Sunday, and Monday at 01:15, write:
reportDbPurge.cron.schedule=0 15 1 ? * SAT-MON
reportDbPurge.cron.timeZone=<TIME_ZONE>
Specify the time zone that the cron expression in the reportDbPurge.cron.schedule setting relates to. (The supported time zone IDs may depend on the platform JDK.)
The default time zone is UTC.
For example, to schedule the purge task in Icelandic time, enter the following value:
reportDbPurge.cron.timeZone=Atlantic/Reykjavik
reportDbPurge.batchSize=<BATCH_SIZE>
Specify the number of primary records to delete in one transaction.
A primary record is one selected by its age: a check, a client session, or a Content Group (repository) state record. It may have any number of related records that are deleted in the same transaction.
The default value is 50.
Decrease the value only if you’re facing memory issues during a purge task.
Increase the value to speed up the purge task. This isn't recommended if you check large documents that receive many highlights.
For example, to process 1000 primary records and all their related records in one transaction, set:
reportDbPurge.batchSize=1000
- Save your changes and restart the Acrolinx Platform.
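Before you restart, it can help to sanity-check two of the values above: how many days a given month count actually covers, and whether your JDK knows the time zone ID you plan to use. The following is a minimal sketch that uses only the standard java.time API. The class and method names (PurgeSettingsCheck, minimumAge, cutoff, isKnownTimeZone) are hypothetical and aren't part of Acrolinx. The sketch applies the documented rule that one month equals 31 days of 24 hours. It doesn't reproduce how the purge task itself selects records.

```java
import java.time.Duration;
import java.time.Instant;
import java.time.ZoneId;

// Hypothetical helper for checking purge settings; not part of Acrolinx.
public class PurgeSettingsCheck {

    // Minimum record age for a month count, assuming the documented rule
    // that one "month" is 31 days of 24 hours each.
    public static Duration minimumAge(int months) {
        return Duration.ofDays(31L * months);
    }

    // Point in time before which a record would count as old enough.
    // Illustrative only: the exact comparison used by the purge task is an assumption.
    public static Instant cutoff(Instant now, int months) {
        return now.minus(minimumAge(months));
    }

    // True if the JDK running this code recognizes the given time zone ID.
    public static boolean isKnownTimeZone(String zoneId) {
        return ZoneId.getAvailableZoneIds().contains(zoneId);
    }

    public static void main(String[] args) {
        Instant now = Instant.now();
        System.out.println("6 months  = " + minimumAge(6).toDays() + " days");
        System.out.println("36 months = " + minimumAge(36).toDays() + " days");
        System.out.println("Cutoff for 6 months, counted from now: " + cutoff(now, 6));
        System.out.println("Atlantic/Reykjavik known to this JDK: "
                + isKnownTimeZone("Atlantic/Reykjavik"));
    }
}
```

Under the 31-day rule, 6 months cover 186 days. The default of 36 months covers 1116 days, which is about three weeks longer than three calendar years. Run the time zone check on the same JDK that runs the Acrolinx Platform, because the set of available IDs can differ between JDKs.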