Let's imagine that you want Acrolinx to recognize and process specific custom elements or attributes in your XML files. Where do you start? Let’s look at a sample document and create a Content Profile for it:
<!DOCTYPE HTML PUBLIC "//acrolinx example-v1" "_">
<document>
  <h1>Content Profiles Available From Acrolinx Server 5.3 Help Extend How Acrolinx Reads Your Text</h1>
  <paragraph>
    Content Profiles are a cool new feature in Acrolinx Server 5.3. In this tutorial, we'll learn how to configure a Content Profile.
    <comment>Note: Previously, you used CSDs to configure server-side segmentation.</comment>
  </paragraph>
</document>
Note
All the changes that you make in your Content Profile apply immediately.
To name your Content Profile, open the General tab.
A good name includes the type of file and how you want to use the Content Profile. This one's for XML, so you can add that to its name. Or maybe you want to be more specific, for example, "XML - Tutorial."
Add a description that reminds you and others what the Content Profile is used for.
Use the file type and the Public ID to identify the content you want Acrolinx to read with this Content Profile.
To tell Acrolinx when to use this Content Profile, do the following:
- Go to the Criteria tab.
- Select the Type. For example, "XML."
- Enter the Public ID. It's the most reliable way to identify or match XML content. In this example, it's //acrolinx example-v1.
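If you're not sure what the Public ID of a file is, you can read it from the DOCTYPE declaration. The following is a minimal sketch that does this with a regular expression. The helper name find_public_id is hypothetical and isn't part of Acrolinx, and the pattern only handles Public IDs in double quotes.

```python
import re

# Hypothetical helper for illustration; not part of Acrolinx.
# Matches: <!DOCTYPE name PUBLIC "public-id" ...
DOCTYPE_PUBLIC_ID = re.compile(r'<!DOCTYPE\s+\S+\s+PUBLIC\s+"([^"]*)"')


def find_public_id(xml_text):
    """Return the Public ID from the DOCTYPE declaration, or None if absent."""
    match = DOCTYPE_PUBLIC_ID.search(xml_text)
    return match.group(1) if match else None


sample = '<!DOCTYPE HTML PUBLIC "//acrolinx example-v1" "_"> <document></document>'
public_id = find_public_id(sample)
print(public_id)
```

The value that the sketch prints is what you would enter as the Public ID on the Criteria tab.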
To test which Content Profile Acrolinx uses, follow these steps:
- Run a check on your XML file.
- Open the Scorecard and expand the Administrative Information section. You should see Content Profile: XML - Tutorial.
Note
If Acrolinx uses a different Content Profile, check the criteria of that Content Profile to troubleshoot.
Define what you want Acrolinx to read in a check. For example, you might want to exclude elements or specify how text gets broken up.
To exclude an element, do the following:
- Open the Extraction tab.
- Select exclude in Filter Mode.
- Enter the name of the tag in Element Name. For example, comment.
- Press Enter to add it.
- Run a check to test the Content Profile.
If you want to exclude comments, for example, do the following:
- Make sure that the comment element in your file contains an issue.
- Run a check. The issue should no longer appear.
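To picture what excluding an element means, here's a rough sketch in Python. It only illustrates the idea and isn't how Acrolinx implements extraction. The function name extract_text is made up for this example, and the sample is a shortened copy of the document above without its DOCTYPE line.

```python
import xml.etree.ElementTree as ET

SAMPLE = """<document>
  <h1>Content Profiles Available From Acrolinx Server 5.3 Help Extend How Acrolinx Reads Your Text</h1>
  <paragraph>
    Content Profiles are a cool new feature in Acrolinx Server 5.3.
    <comment>Note: Previously, you used CSDs to configure server-side segmentation.</comment>
  </paragraph>
</document>"""


def extract_text(element, excluded):
    """Collect text pieces from an element, skipping excluded elements entirely."""
    if element.tag in excluded:
        return []
    parts = [element.text or ""]
    for child in element:
        parts.extend(extract_text(child, excluded))
        # Text that follows a child element still belongs to the parent.
        parts.append(child.tail or "")
    return parts


root = ET.fromstring(SAMPLE)
# Join the pieces and collapse whitespace so the result is easy to read.
text = " ".join("".join(extract_text(root, {"comment"})).split())
print(text)
```

With comment in the excluded set, the note about CSDs never reaches the extracted text, so an issue inside it can't be reported.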
Next, let’s look at the title. Acrolinx recognizes sentences by punctuation, but titles are a special case since they don’t need punctuation. We can tell Acrolinx this using the Break Level.
To adjust the break level, do the following:
- Select include in Filter Mode.
- Set the Break Level to sentence.
- Enter your Element Name, which is h1.
- Press Enter to add it to your list.
- Test the Content Profile:
  - Run a check. You'll notice that the example has a very long title in its h1 element. You should get a Sentence too long issue.
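The effect of a sentence break level can be pictured with a small Python sketch. It's a simplified model under our own assumptions, not Acrolinx's actual segmentation: text is split after sentence-ending punctuation, and any element listed in break_elements also starts and ends its own segment. The names flatten and split_segments are hypothetical.

```python
import re
import xml.etree.ElementTree as ET

SAMPLE = """<document>
  <h1>Content Profiles Available From Acrolinx Server 5.3 Help Extend How Acrolinx Reads Your Text</h1>
  <paragraph>Content Profiles are a cool new feature in Acrolinx Server 5.3. In this tutorial, we'll learn how to configure a Content Profile.</paragraph>
</document>"""

BREAK = "\x00"  # marker character that can't occur in XML text


def flatten(element, break_elements):
    """Return the element's text, wrapping break elements in BREAK markers."""
    text = element.text or ""
    for child in element:
        text += flatten(child, break_elements) + (child.tail or "")
    if element.tag in break_elements:
        return BREAK + text + BREAK
    return text


def split_segments(root, break_elements):
    """Split text at BREAK markers first, then after sentence-ending punctuation."""
    segments = []
    for chunk in flatten(root, break_elements).split(BREAK):
        for part in re.split(r"(?<=[.!?])\s+", chunk):
            part = " ".join(part.split())
            if part:
                segments.append(part)
    return segments


root = ET.fromstring(SAMPLE)
without_break = split_segments(root, set())
with_break = split_segments(root, {"h1"})
print(without_break[0])
print(with_break[0])
```

Without the break, the title runs on into the first sentence of the paragraph because it has no closing punctuation. With h1 in break_elements, the title becomes a segment of its own.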
Before we move on...
Since we're setting up a new XML Content Profile, we recommend that you select Remove Extra Whitespace. This helps Acrolinx to ignore whitespace that your XML editor might add.
Click Advanced and select Remove Extra Whitespace.
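As a rough illustration of the kind of whitespace this is about, the following Python sketch collapses the indentation and line breaks that an XML editor typically adds when it pretty-prints a file. We're assuming that "extra whitespace" means runs of spaces, tabs, and line breaks. The function name is hypothetical, and the exact behavior of the option in Acrolinx may differ.

```python
def remove_extra_whitespace(text):
    """Collapse runs of spaces, tabs, and line breaks into single spaces."""
    return " ".join(text.split())


# Text as it might look after an editor re-indents a paragraph element.
pretty_printed = "\n    Content Profiles are a cool\n      new feature.\n  "
cleaned = remove_extra_whitespace(pretty_printed)
print(cleaned)
```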
Much of our guidance is context dependent. For example, you want your title guidelines like "Capitalization" to only ever apply to titles.
The following are the four most important contexts:
- LIST
- PARAGRAPH
- TABLE
- TITLE
When you create a new Content Profile, you'll see your contexts, but you'll still need to map them.
To configure your contexts, do the following:
- Open the Context tab.
- Select a context from the list. For example, TITLE.
- Click + to add a new MAPPING.
- Enter h1. This tells Acrolinx that the content in "h1" is a title. If you have additional title contexts such as h2, h3, and h4, simply add additional mappings.
- Test the Content Profile:
  - Run a check. The Sentence too long issue should now be a Title too long issue for your title.
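You can think of the Context tab as a lookup table from contexts to element names. The sketch below models that idea in Python. It's only an illustration, not an Acrolinx file format or API. The TITLE mapping follows this tutorial, while the PARAGRAPH mapping and the name context_for are assumptions made for the example.

```python
# Illustrative only: contexts mapped to the element names that belong to them.
CONTEXT_MAPPINGS = {
    "TITLE": ["h1"],              # from this tutorial; add h2, h3, h4 as needed
    "PARAGRAPH": ["paragraph"],   # assumed mapping for the sample document
}


def context_for(element_name, mappings=CONTEXT_MAPPINGS):
    """Return the context that an element name is mapped to, or None."""
    for context, element_names in mappings.items():
        if element_name in element_names:
            return context
    return None


print(context_for("h1"))
print(context_for("comment"))
```

An element without a mapping, such as comment here, has no context, so context-dependent guidelines like the title ones have nothing to attach to.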
Congratulations! You've now successfully configured a Content Profile.