MS Windows drivers nowadays typically don't use printer-internal device fonts anymore. Instead, they download the TrueType fonts used in the document to the printer. Text can reach the printer in one of 4 different ways:
1. In very rare cases the driver uses the printer-internal fonts. The data stream is then very compact, and the document remains searchable when converted to PDF. Data streams from non-Windows systems, such as SAP (via DeviceType) or Linux, use this method.
2. The document is printed completely in graphics mode. ELP then cannot find any text in the data stream. The only available information is the metadata provided by the spooler, such as the username, the document name, or the printer queue name.
3. The text is printed using bitmapped soft fonts. ELP usually cannot find any text here either: the characters are sent in the same way as in (4), but each character is downloaded as a bitmap graphic in that font. Windows drivers often use this method for fixed-pitch TrueType fonts such as Courier or Letter Gothic. ELP cannot parse text encoded in this way.
4. In most cases documents use proportional fonts such as Arial, and the Windows driver downloads each character used once as a so-called scalable soft font. These texts can be searched for and can be used as triggers.
ELP can analyze the data stream as explained in the rules generation part of the manual and find the searched text using three fundamentally different methods.
It is pretty easy to find out how your driver handles text printing/encoding.
- Open the rule assistant and enable archive printing and store the outgoing data stream in PDF format.
- Print one or more jobs.
- Open the Archive tab of the W-ELP Control Center, select the archive, search for data streams, and open the PDF file in your viewer.
Try to select some text. If this is not possible, the text is printed in graphics mode, as explained above in 2. and 3.
If you can select the text, copy it to the clipboard and paste it into a text editor such as Notepad. If the text is readable, you are using one of the very rare drivers that print in mode 1 above. Usually you will get something like this:
|You printed:||Hello World|
|You see in the PDF document:||Hello World|
|But the copied text in your text editor shows something like this:||!"##$ %$&#'|
On a closer look we can see that the H is coded as !, the e as ", and the l as #. In other words, Windows assigns the codes in the order in which the characters are first used: it usually prints the first used character as a blank (ASCII 32), the second as ASCII 33 (the !), and so on. This is not very user friendly for archiving systems.
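The effect can be reproduced with a few lines of Python. This is only a minimal sketch of the first-use numbering, not ELP's or the driver's actual code. The function name `first_use_encode` is made up for illustration, and the starting code of 33 with the blank left untouched is an assumption chosen because it reproduces the table row above. Real drivers may start at a different code.

```python
def first_use_encode(text, first_code=33):
    """Replace each character by a code assigned in order of first use.

    Assumption for illustration: numbering starts at first_code (33 = '!')
    and the blank is passed through unchanged, as in the table row above.
    """
    codes = {}   # original character -> substituted character
    out = []
    for ch in text:
        if ch == " ":
            out.append(ch)          # keep the blank as it is
            continue
        if ch not in codes:
            # The next unused code goes to the newly seen character.
            codes[ch] = chr(first_code + len(codes))
        out.append(codes[ch])
    return "".join(out), codes


encoded, table = first_use_encode("Hello World")
print(encoded)   # what a PDF viewer would copy to the clipboard
print(table)     # the per-document mapping, e.g. 'H' -> '!'
```

Because the mapping depends on the order of characters in each document, the same word is encoded differently from job to job, which is why a plain text search on such a stream fails.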
Reprint your document and copy the text "Hello World" from the generated PDF into a text editor. If your result is "Hello World" you will have fully searchable PDF files.
The printed characters must be encoded in an ISO symbol set, so when printing through Windows drivers the ConvertSoftFont2ISO key must be turned on in ELP.
The syntax of the key is: RemoveDownloadFont=<font name>|<Font-ID>;<italic>;<stroke>;<selection sequence>
To get the correct syntax for your printer, we recommend printing a PCL font list. For each device-internal font, that list usually shows the selection sequence you will need:
|Font name||This is the exact name the driver uses in the data stream. Very often the name includes additional attributes. Examples for a German driver:
Arial, Arial Fett, Arial Kursiv, Arial Fett Kurs, Times New Roman, Free Sans Bold, Free Sans Obliq etc.
The full name is easy to determine: create an MS Word document with a short text, format at least one character each as normal, bold, italic, and bold italic, and print it to a file. Open that print file in an editor such as Notepad++ and search for the font name. The maximum length is 16 characters. *)|
|Font-ID||Alternatively, the font can be selected by its PCL Font-ID, which can be provided in the optional PCL section of the TrueType font header. *)|
|Italic||Font attribute, normally 0 for upright and 1 for italic|
|Stroke||Font attribute, normally 0 for normal and 3 for bold|
|Selection sequence||The font selection sequence from the PCL font list, without the symbol set sequence. The leading Escape can be encoded as \x1B.
See example below.|
*) To find out the real font name / number:
- Insert the key log_mode=101 in the rule global.
- Print your stream
- Open the file c:\ProgramData\Welp\debug\<PRINTERQUEUENAME>\log_file_<DATESERIAL>.txt in a text editor. We recommend Notepad++.
- Search for this text: <fontlist>
- The font name is listed in the first row after name="xxx". If the PCL Font-ID is provided in the font, then look for the <typeface> key.
- The stroke and italic parameters are also listed as tag: <style posture="xxxx" appearance-width="xxxx"
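Once the font name and the selection sequence are known, the key line can be put together mechanically. The following Python sketch assembles a line in the syntax given above and, as a cross-check, reads the italic and stroke values back out of the selection sequence. The helper names `build_remove_download_font` and `parse_font_attributes` are hypothetical and not part of ELP. The sketch also assumes the usual PCL 5 meaning of the letters after `(s` (`p` spacing, `s` style, `b` stroke weight, `T` typeface), and the sample values are taken from the Arial example in this section.

```python
import re

ESC = "\x1b"  # the Escape character that starts a PCL sequence


def build_remove_download_font(font, italic, stroke, selection_sequence):
    """Assemble a RemoveDownloadFont line from its four fields (illustrative helper)."""
    if len(font) > 16:
        raise ValueError("font name is longer than 16 characters")
    # Write a literal Escape character in the \x1B notation the key accepts.
    sequence = selection_sequence.replace(ESC, "\\x1B")
    return f"RemoveDownloadFont={font};{italic};{stroke};{sequence}"


def parse_font_attributes(selection_sequence):
    """Split the '(s...' part of a font selection sequence into {letter: value}."""
    match = re.search(r"\(s((?:-?\d+(?:\.\d+)?[A-Za-z])+)", selection_sequence)
    if match is None:
        return {}
    pairs = re.findall(r"(-?\d+(?:\.\d+)?)([A-Za-z])", match.group(1))
    return {letter.lower(): value for value, letter in pairs}


# Sequence as it might be copied from a PCL font list (symbol set part removed).
sequence = ESC + "(s1p1s3b16602T"
attributes = parse_font_attributes(sequence)

# In this sample the style ('s') and stroke ('b') values supply the
# italic and stroke fields of the key.
key_line = build_remove_download_font(
    "Arial Fett Kurs", attributes["s"], attributes["b"], sequence
)
print(key_line)
```

Taking italic and stroke from the sequence itself is a design choice of this sketch: it keeps the two fields from drifting out of step with the selection sequence when keys are written for many fonts.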
; to make sure the font can be translated to device symbol set.
RemoveDownloadFont=Arial Fett Kurs;1;3;\x1B(s1p1s3b16602T
; The Arial samples