The names of the elements you want to extract aren't always immediately obvious. We've included the most common scenarios that you might come across.
You'll define how you want Acrolinx to extract elements under Extraction in your Word Content Profile.
- Title
- Bold, Italics, and Underlining
- Text Color
- Text Highlighting
- Superscript and Subscript
You can extract titles and content with a specific style. These include headings, typographical emphasis, text color, colored highlighting, superscript, and custom styles.
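If you aren't sure which element names a document contains, one option is to look at the XML that Word stores inside the .docx file. The following Python sketch lists the formatting elements found in run properties. It assumes that the element names you enter in Acrolinx correspond to the local names of these WordprocessingML elements, which this page doesn't state explicitly. The function name and the sample XML are hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# WordprocessingML main namespace used in word/document.xml.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def local(tag):
    """Strip the '{namespace}' prefix that ElementTree puts on names."""
    return tag.rsplit("}", 1)[-1]


def run_property_names(document_xml):
    """Count the formatting elements (with attributes) found inside w:rPr."""
    root = ET.fromstring(document_xml)
    found = Counter()
    for rpr in root.iter(f"{{{W_NS}}}rPr"):
        for prop in rpr:
            attrs = " ".join(f"{local(k)}={v}" for k, v in prop.attrib.items())
            found[f"{local(prop.tag)} {attrs}".strip()] += 1
    return found


# Hypothetical sample standing in for the word/document.xml part of a .docx.
SAMPLE = f"""<w:document xmlns:w="{W_NS}"><w:body><w:p>
  <w:r><w:rPr><w:b/><w:color w:val="FF0000"/></w:rPr><w:t>Red bold</w:t></w:r>
  <w:r><w:rPr><w:highlight w:val="yellow"/></w:rPr><w:t>Marked</w:t></w:r>
  <w:r><w:rPr><w:vertAlign w:val="superscript"/></w:rPr><w:t>2</w:t></w:r>
</w:p></w:body></w:document>"""

names = run_property_names(SAMPLE)
for name in sorted(names):
    print(name)
```

For a real document, you could read the part with zipfile.ZipFile(path).read("word/document.xml") and pass the result to run_property_names.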
Title
To exclude a title, follow these steps:
- Enter Title in Element Name.
- Select empty in Filter Mode.
To extract a specific title style, use the element name attribute.
To exclude "heading 1," follow these steps:
- Enter * name="heading 1" in Element Name.
- Select empty in Filter Mode.
Bold, Italics, and Underlining
To exclude bold, italics, and underlining, you can use b, i, and u.
Follow these steps:
- Enter b, i, or u in Element Name.
- Select empty in Filter Mode.
Text Color
You can extract any text color by its RGB value, written as a six-digit hexadecimal code.
To exclude text color, follow these steps:
- Enter color val=RGB in Element Name. Red, for example, would be: color val=FF0000
- Select empty in Filter Mode.
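If you know a color only by its red, green, and blue components, a small helper can build the element name for you. This is a minimal Python sketch. The function name is hypothetical, and it assumes uppercase hexadecimal digits, as in the FF0000 example above.

```python
def color_element_name(red: int, green: int, blue: int) -> str:
    """Build a 'color val=RRGGBB' element name from 0-255 channel values."""
    for channel in (red, green, blue):
        if not 0 <= channel <= 255:
            raise ValueError(f"channel out of range: {channel}")
    # Two uppercase hex digits per channel (assumption: matches FF0000 style).
    return f"color val={red:02X}{green:02X}{blue:02X}"


print(color_element_name(255, 0, 0))  # color val=FF0000
```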
Text Highlighting
You can extract the following colors for text highlighting: black, blue, cyan, darkBlue, darkCyan, darkGray, darkGreen, darkMagenta, darkRed, darkYellow, green, lightGray, magenta, red, white, and yellow.
To exclude text highlighting, follow these steps:
- Enter highlight val= followed by the respective color in Element Name. Yellow, for example, would be: highlight val=yellow
- Select empty in Filter Mode.
Superscript and Subscript
To exclude superscript and subscript, follow these steps:
- Enter vertAlign val=superscript or vertAlign val=subscript in Element Name.
- Select empty in Filter Mode.
Warning
Font extraction only works if Acrolinx can find the needed information in the content. Word doesn't provide any information for the default font via the content. That means Acrolinx can't extract information for your set default font. Microsoft sets Calibri as your default font.
To extract specific fonts, use the element name rFonts and the attribute for the particular font group:
- ascii (ASCII Font)
- cs (Complex Script Font)
- hAnsi (High ANSI Font)
- eastAsia (East Asian Font)
To extract a single font, specify it for every font group.
To exclude content with the font "Courier New," for example, follow these steps:
- Enter the following element names and attributes in Element Name:
rFonts ascii="Courier New"
rFonts cs="Courier New"
rFonts hAnsi="Courier New"
rFonts eastAsia="Courier New"
- Select empty in Filter Mode.
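Because each font needs one entry per font group, it can help to generate the entries instead of typing them. This is a minimal Python sketch. The function name is hypothetical, and it assumes that Element Name expects straight double quotes around the font name.

```python
# The four font groups named above.
FONT_GROUPS = ("ascii", "cs", "hAnsi", "eastAsia")


def font_element_names(font: str) -> list:
    """Return one rFonts entry per font group for the given font."""
    # Assumption: values are wrapped in straight double quotes.
    return [f'rFonts {group}="{font}"' for group in FONT_GROUPS]


for entry in font_element_names("Courier New"):
    print(entry)
```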
You can refer to comments by several element names. These include:
- comments
- comment
- noProof (all content that Word won't check for issues)
If you set your extraction to Starting Element: exclude, Acrolinx won't check your comments by default. You need to include them specifically.
To include comments, follow these steps:
- Enter comment in Element Name.
- Select include in Filter Mode.
There isn’t a perfect way to extract a table of contents. But you do have a couple of options.
You can use noProof, but this affects all noProof content, not just your table of contents.
Alternatively, you can extract the table of contents based on its style. The most common styles for a table of contents include "toc 1" and "toc 2."
To exclude a table of contents using its style attribute, follow these steps:
- Enter * name="toc 1" in Element Name.
- Select empty in Filter Mode.
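To find out which table of contents styles a document defines, you can list the style names stored in the word/styles.xml part of the .docx file. This is a minimal Python sketch. It assumes that the name attribute you enter in Acrolinx matches the style name stored there, which this page doesn't confirm. The function name and the sample XML are hypothetical.

```python
import xml.etree.ElementTree as ET

# WordprocessingML main namespace used in word/styles.xml.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def style_names(styles_xml):
    """Return the w:name value of every w:style element, in document order."""
    root = ET.fromstring(styles_xml)
    names = []
    for style in root.iter(f"{{{W_NS}}}style"):
        name = style.find(f"{{{W_NS}}}name")
        value = name.get(f"{{{W_NS}}}val") if name is not None else None
        if value is not None:
            names.append(value)
    return names


# Hypothetical sample standing in for word/styles.xml.
SAMPLE = f"""<w:styles xmlns:w="{W_NS}">
  <w:style w:type="paragraph" w:styleId="Heading1"><w:name w:val="heading 1"/></w:style>
  <w:style w:type="paragraph" w:styleId="TOC1"><w:name w:val="toc 1"/></w:style>
  <w:style w:type="paragraph" w:styleId="TOC2"><w:name w:val="toc 2"/></w:style>
</w:styles>"""

# Keep only the table of contents styles and print one entry per style.
toc_styles = [n for n in style_names(SAMPLE) if n.lower().startswith("toc ")]
for name in toc_styles:
    print(f'* name="{name}"')
```

A table of contents with several levels usually needs one entry per level, so you'd repeat the steps above for each name the sketch prints.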
If you set your extraction to Starting Element: exclude, Acrolinx won't check your headers and footers by default. You need to include them specifically.
These are the element names you need:
- Headers: hdr
- Footers: ftr
To include headers, follow these steps:
- Enter hdr in Element Name.
- Select include in Filter Mode.
To include footers, follow these steps:
- Enter ftr in Element Name.
- Select include in Filter Mode.
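Before you add these entries, you might want to check whether a document has headers or footers at all. This is a minimal Python sketch that looks for parts inside a .docx archive whose root element is hdr or ftr. It assumes that the Acrolinx element names hdr and ftr correspond to these WordprocessingML root elements. The function names are hypothetical, and the sample archive is a stand-in, not a complete Word file.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def header_footer_parts(docx_bytes: bytes) -> dict:
    """Map each XML part under word/ whose root is hdr or ftr to that name."""
    parts = {}
    with zipfile.ZipFile(io.BytesIO(docx_bytes)) as archive:
        for part in archive.namelist():
            if not (part.startswith("word/") and part.endswith(".xml")):
                continue
            root = ET.fromstring(archive.read(part))
            name = root.tag.rsplit("}", 1)[-1]  # drop the namespace prefix
            if name in ("hdr", "ftr"):
                parts[part] = name
    return parts


def build_sample() -> bytes:
    """Build a tiny, hypothetical archive laid out like a .docx file."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("word/document.xml", f'<w:document xmlns:w="{W_NS}"/>')
        archive.writestr("word/header1.xml", f'<w:hdr xmlns:w="{W_NS}"/>')
        archive.writestr("word/footer1.xml", f'<w:ftr xmlns:w="{W_NS}"/>')
    return buffer.getvalue()


found = header_footer_parts(build_sample())
for part, element in found.items():
    print(part, element)
```

For a real document, you could pass the file's bytes, for example pathlib.Path(path).read_bytes().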