If you harvest terms during a check, Acrolinx creates a term harvesting file in the output directory. You can then use the Acrolinx Term Aggregator to process these term harvesting files into a single XML file. This aggregation is helpful because Acrolinx automatically filters out duplicate terms or variants that you would otherwise have to validate manually. After you convert the XML file into CSV format and import it into Acrolinx, you can validate the new terms in the user interface of the Terminology Manager.
To add harvested terms to your terminology database, follow these steps:
- Run one or more checks so that Acrolinx can generate term harvesting files with term candidates.
- Aggregate the term candidates from the term harvesting files into a single XML file.
  - Open the Acrolinx Term Aggregator. To open it, enter its address into your web browser, for example: http://acrolinx-server:8031/termaggregator/
    Your administrator must enable the Term Aggregator in the core server properties.
  - Select the options for the term aggregation. You can set the following options:

    | Option | Use to |
    | --- | --- |
    | Language | Select the input language. The Term Aggregator only displays languages with term harvesting files in the output directory. |
    | Files | Display the list of term harvesting files that are currently stored in the output directory. |
    | Maximum number of terms | Restrict the number of terms that the Term Aggregator adds to the output XML file. |
    | Sort order | Select how the entries in the output XML file are sorted. |
    | Frequency threshold | Set the minimum number of occurrences required for a term to be included in the output XML file. |
    | Number of example sentences per term | Set the number of example sentences per term. |
  - Click Aggregate. Acrolinx aggregates all term harvesting files from the term harvesting directory into one XML file. You then download and store that XML file on your computer.
    To aggregate the files and also move all term harvesting files from the input directory to a subdirectory called aggregated_<TIMESTAMP>, click Aggregate and Clean up instead.
    To only move the term harvesting files from the input directory to a subdirectory called aggregated_<TIMESTAMP>, click Clean up instead.
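The following Python sketch illustrates how the aggregation options could interact. It is not how Acrolinx implements the Term Aggregator. The input structure (lists of term and example sentence pairs), the case-insensitive duplicate check, the function name `aggregate_candidates`, and the two sort modes are all assumptions made for illustration.

```python
from collections import Counter, defaultdict


def aggregate_candidates(files, frequency_threshold=1, max_terms=None,
                         examples_per_term=1, sort_by="frequency"):
    """Merge term candidates from several harvesting files (hypothetical format).

    files: list of files, each a list of (term, example_sentence) pairs.
    """
    counts = Counter()
    examples = defaultdict(list)
    for candidates in files:
        for term, sentence in candidates:
            # Assumption: terms that differ only in case or surrounding
            # whitespace count as duplicates of each other.
            key = term.strip().lower()
            counts[key] += 1
            if sentence not in examples[key] and len(examples[key]) < examples_per_term:
                examples[key].append(sentence)

    # Frequency threshold: drop terms that occur too rarely.
    kept = [(term, n) for term, n in counts.items() if n >= frequency_threshold]

    # Sort order: most frequent first, or alphabetical (assumed modes).
    if sort_by == "frequency":
        kept.sort(key=lambda item: (-item[1], item[0]))
    else:
        kept.sort(key=lambda item: item[0])

    # Maximum number of terms: truncate after sorting.
    if max_terms is not None:
        kept = kept[:max_terms]

    return [{"term": term, "frequency": n, "examples": examples[term]}
            for term, n in kept]
```

In this sketch, a higher frequency threshold shrinks the list before the maximum number of terms is applied, so the two options together control how many candidates are left to validate.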
- Open the aggregated XML file in Microsoft Excel.
- Save the file as "Unicode Text (*.txt)". You can ignore the warning that the new file may contain features that are not compatible with Unicode text and save the file as intended.
- Import the file into the Terminology Manager. To import the CSV file, follow the standard procedure for importing terminology. In the Format-specific Options, select the Delimiter <tab> and the Encoding UTF-16LE.
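Before importing, it can help to confirm that the saved file really is tab-delimited UTF-16LE text. The Python sketch below is one way to do that, assuming the file may start with a byte order mark. The function names, file name, and column headers are hypothetical and are not part of Acrolinx.

```python
import csv
import io


def write_unicode_text(path, rows):
    """Write rows as tab-delimited UTF-16LE text with a leading byte order mark."""
    with open(path, "w", encoding="utf-16-le", newline="") as handle:
        handle.write("\ufeff")  # byte order mark, encoded as the bytes FF FE
        csv.writer(handle, delimiter="\t").writerows(rows)


def read_unicode_text(path):
    """Decode a file as UTF-16LE and split each line on tabs."""
    with open(path, "rb") as handle:
        raw = handle.read()
    text = raw.decode("utf-16-le")  # raises UnicodeDecodeError on invalid input
    if text.startswith("\ufeff"):
        text = text[1:]  # drop the byte order mark if present
    return list(csv.reader(io.StringIO(text, newline=""), delimiter="\t"))


if __name__ == "__main__":
    # Hypothetical sample data; real column names depend on the aggregated file.
    sample = [["Term", "Frequency"], ["term aggregator", "2"]]
    write_unicode_text("harvested_terms.txt", sample)
    for row in read_unicode_text("harvested_terms.txt"):
        print(row)
```

If `read_unicode_text` returns rows with a single column each, the file is probably not tab-delimited, and the Delimiter setting in the import dialog would not match it.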
- Validate the terms in the Terminology Manager.