[en] You can configure Acrolinx Analytics to write reporting data to a database of your choice. Acrolinx Analytics comes installed with a default Apache Derby database for testing the reporting functionality.
[en] For an up-to-date list of supported database formats, visit the Standard Stack Compatibility Guide.
[en] The reporting database can take up a lot of disk space. You can set up a purge task to prevent the database from getting too large.
[en] The purge task deletes all records associated with old client sessions, checks, or Content Group states.
[en] The purge task deletes records from the following tables:
DocumentMetaInfoEjb
FlagEjb
MatchEjb
RepositoryStateEjb
ReuseFlagEjb
RuleFlagEjb
ScorecardAccessEjb
SentenceTooLongFlagEjb
SeoFlagEjb
SuggestionEjb
TermFieldValueEjb
TermFlagEjb
TermHarvestingFlagEjb
TypeStatisticsEjb
VoiceCharacteristicEjb
[en] The purge task doesn't change any other tables.
[en] By default, the purge task runs every night at 02:00 UTC. You can also configure a different schedule.
[en] To configure the purge task, follow these steps:
-
[en] Open your overlay of the core server properties file.
[en] You can find the overlay for the core server properties file in the following location:
[en]
<config directory>/server/bin/coreserver.properties
-
[en] Add the following properties:
[en] Property
[en] Description
[en]
reportDbPurge.use=true
[en] Turn on database purging.
[en]
reportDbPurge.purgeBeforeThatManyMonths=<MONTHS>
[en] Specify the minimum age in months that makes a record purgeable.
[en] (A "month" simplifies to 31 days where each day has 24 hours.)
[en] The default value is 36 (about 3 years).
[en] For example, to remove records that are older than 6 months, enter the following value:
[en]
reportDbPurge.purgeBeforeThatManyMonths=6
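[en] Because a "month" counts as 31 days of 24 hours, you can convert the setting to days as a quick check. The symbols m (the configured number of months) and d (the resulting minimum record age in days) are introduced here only for illustration and aren't Acrolinx terms.

```latex
d = 31\,m
\qquad\Rightarrow\qquad
m = 6:\ d = 186 \text{ days},
\qquad
m = 36:\ d = 1116 \text{ days}
```

[en] By this calculation, the default of 36 months is roughly 20 days longer than 3 calendar years.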
[en]
reportDbPurge.cron.schedule=<CRON_EXPRESSION>
[en] Specify a Quartz cron expression that defines when the purge task runs.
[en] By default, this is every night at 02:00.
[en] For example, to run the purge task every Saturday, Sunday, and Monday at 01:15, write:
[en]
reportDbPurge.cron.schedule=0 15 1 ? * SAT-MON
[en]
reportDbPurge.cron.timeZone=<TIME_ZONE>
[en] Specify the time zone that the cron expression in the reportDbPurge.cron.schedule setting refers to.
[en] (The supported time zone IDs may depend on the platform JDK.)
[en] The default time zone is UTC.
[en] For example, to schedule the purge task in Icelandic time, enter the following value:
[en]
reportDbPurge.cron.timeZone=Atlantic/Reykjavik
[en]
reportDbPurge.batchSize=<BATCH_SIZE>
[en] The number of primary records to be deleted in one transaction.
[en] A primary record is one selected by its age: a check, a client session, or a Content Group (repository) state record. It may have any number of related records that are deleted in the same transaction.
[en] The default value is 50.
[en] Decrease the value only if you’re facing memory issues during a purge task.
[en] Increase the value to speed up the purge task. (Not recommended when you check large documents that receive many highlights!)
[en] For example, to process 1000 primary records and all their related records in one transaction, set:
[en]
reportDbPurge.batchSize=1000
-
[en] Save your changes and restart the Acrolinx Platform.
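[en] Taken together, the settings described in the steps above might look like the following in your overlay. This is a minimal sketch that only combines the example values from this page. It isn't a recommended configuration, so adjust each value to your own retention and scheduling needs.

```properties
# Sketch of a coreserver.properties overlay that enables the purge task.
# All values are the illustrative examples from this page, not recommendations.

# Turn on database purging.
reportDbPurge.use=true

# Purge records older than 6 "months" (6 x 31 days). The default is 36.
reportDbPurge.purgeBeforeThatManyMonths=6

# Quartz cron expression: Saturday, Sunday, and Monday at 01:15.
reportDbPurge.cron.schedule=0 15 1 ? * SAT-MON

# Time zone that the cron expression refers to. The default is UTC.
reportDbPurge.cron.timeZone=Atlantic/Reykjavik

# Primary records deleted per transaction. The default is 50.
# Large values aren't recommended if you check large documents with many highlights.
reportDbPurge.batchSize=1000
```

[en] If you leave out any property other than reportDbPurge.use, the default value described for that property applies.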