MS Windows drivers nowadays typically don't use printer-internal device fonts anymore. Instead, they download the TrueType fonts used in the document to the printer. Text reaches the printer in one of 4 ways:

  1. In very rare cases the driver uses the printer-internal fonts. The data stream is then very compact, and the document remains searchable after conversion to PDF. Data streams from non-Windows systems, such as SAP via DeviceType or Linux systems, use this method.
  2. The document is printed completely in graphics mode. In that case ELP has no chance to find any information in the data stream. The only available information is the metadata provided by the spooler, such as the user name, the document name or the printer queue name.
  3. The text is printed using bitmapped soft fonts. The characters are addressed in the same way as in (4), but each character itself is printed as graphics in that font. Windows drivers often use this method for fixed-pitch TrueType fonts like Courier or Letter Gothic. ELP usually can't parse text encoded in this way.
  4. In most cases documents are based on proportional fonts like Arial, and Windows drivers download each used character once as a so-called scalable soft font. Such text can be searched for and used as a trigger.

    ELP can analyze the data stream as explained in the rules generation part of the manual and find the searched text with 3 fundamentally different methods.

It is pretty easy to find out how your driver handles text printing/encoding.

Try to select some text in the generated PDF. If this is not possible, the text was printed in graphics mode, as explained in 2. and 3. above.

If you can select the text, copy it to the clipboard and paste it into your text editor or Notepad. If the pasted text is readable, you are using one of the very rare drivers that print in mode 1 above. More often you will get something like this:

You printed: Hello World
You see in the PDF document: Hello World
But the copied text in your text editor shows something like this: !"##$ %$&#'

On a closer look we can see that the H is coded as !, the e as " and the l as #. In other words, Windows does not send the ISO code of a character but numbers the characters in the order of their first use, counting upward from the low end of the printable ASCII range (ASCII 32 is the blank, ASCII 33 the !, and so on). This is not very friendly for archiving systems.
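
The renumbering can be reproduced with a few lines of Python. This is only an illustration of the effect seen in the Hello World sample, not a description of how a Windows driver or ELP works internally. The function names, the start code 33 and the rule that a blank stays a blank are assumptions read off that sample.

```python
def first_use_encode(text, first_code=33):
    """Replace each character by a code assigned in order of first use.

    Assumption (taken from the Hello World sample): codes start at 33 ('!')
    and a blank is passed through unchanged.
    """
    codes = {}  # original character -> assigned code
    out = []
    for ch in text:
        if ch == " ":
            out.append(ch)
            continue
        if ch not in codes:
            codes[ch] = first_code + len(codes)
        out.append(chr(codes[ch]))
    return "".join(out), codes


def decode(encoded, codes):
    """Undo the renumbering when the assignment table is known."""
    reverse = {chr(code): ch for ch, code in codes.items()}
    return "".join(reverse.get(ch, ch) for ch in encoded)


encoded, table = first_use_encode("Hello World")
print(encoded)                  # the scrambled text, as pasted from the PDF
print(decode(encoded, table))   # readable again once the table is known
```

The pasted text is unreadable only because the assignment table stays behind in the downloaded font; anything that reads the text without that table sees the assigned codes instead of the characters.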



ConvertSoftFont2ISO
 

If this key (available in the MISCELLANEOUS keys section) is activated in the rule Global or in any other start-up rule, ELP converts all TrueType-based text that is encoded as explained in (4) above from the Windows symbol set coding into ISO encoding.

Reprint your document and copy the text "Hello World" from the generated PDF into a text editor. If your result is "Hello World" you will have fully searchable PDF files.
 
[GLOBAL]
ConvertSoftFont2ISO=ON
; Minimum commands for generating searchable data streams coming from MS Windows drivers.

 

This function converts all characters above 127 into their Unicode notation. Unfortunately, the W-ELP PDF converter may have problems with this way of printing Unicode. If you can ensure that every printed character stays below Unicode character 256, you may add the key ConvertSoftFont2ISO_No_Unicode=ON. This forces the data stream converter to use the Windows single-byte data stream, based on the Windows symbol set.
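
Whether a text stays below Unicode character 256 can be checked before the key is added. A minimal sketch in Python follows; the function names and the sample strings are made up for illustration and are not part of ELP.

```python
def fits_single_byte(text):
    """Return True if every character has a Unicode code point below 256."""
    return all(ord(ch) < 256 for ch in text)


def offending_characters(text):
    """List the distinct characters at or above code point 256, in order of appearance."""
    seen = []
    for ch in text:
        if ord(ch) >= 256 and ch not in seen:
            seen.append(ch)
    return seen


print(fits_single_byte("Größe: 10 mm"))   # umlauts and ß are below 256
print(offending_characters("Łódź"))       # Ł (U+0141) and ź (U+017A) are not
```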



RemoveDownloadFont

Deletes the TrueType-based soft font from the data stream and replaces it with an existing printer-internal font. This function drastically reduces the data stream size for small documents.

The printed characters must be encoded in an ISO symbol set, so when printing through Windows drivers, ConvertSoftFont2ISO must be turned on.

The syntax of the key is: RemoveDownloadFont=<font name>|<Font-ID>;<italic>;<stroke>;<selection sequence>

To get the correct syntax for your printer, we recommend printing a PCL font list. For each device-internal font, that list usually shows the selection sequence you will need. The parameters are:

Font name: The exact name the driver uses in the data stream. Very often the name includes additional attributes. Examples for a German driver:
Arial, Arial Fett, Arial Kursiv, Arial Fett Kurs, Times New Roman, Free Sans Bold, Free Sans Obliq etc.
The full name is easy to determine. Create an MS Word document with a short text, format at least one character each in normal, bold, italic and bold italic, and print it to file. Open that print file, for example in Notepad++, and search for the font name. The maximum length is 16 characters. *)
Font-ID: An alternative way to select the font is the PCL Font ID, which can be provided in the optional PCL section of the TrueType font header. *)
Italic: Font attribute, normally 0 for upright and 1 for italic.
Stroke: Font attribute, normally 0 for normal and 3 for bold.
Selection sequence: The font selection sequence from the PCL font list without the symbol set sequence. The leading Escape can be encoded as \x1B.
See the example below.
 
*) To find out the real font name / number:
  1. Insert the key log_mode=101 in the rule Global.
  2. Print your stream.
  3. Open the file c:\ProgramData\Welp\debug\<PRINTERQUEUENAME>\log_file_<DATESERIAL>.txt in a text editor. We recommend Notepad++.
  4. Search for this text: <fontlist>
  5. The font name is listed in the first row after name="xxx". If the PCL Font-ID is provided in the font, look for the <typeface> key.
  6. The stroke and italic parameters are also listed, in the tag: <style posture="xxxx" appearance-width="xxxx"

[GLOBAL]

; Make sure the font can be translated to the device symbol set.

ConvertSoftFont2ISO=ON

; The Arial samples

RemoveDownloadFont=Arial;0;0;\x1B(s1p0s0b16602T
RemoveDownloadFont=Arial Fett;0;3;\x1B(s1p0s3b16602T
RemoveDownloadFont=Arial Kursiv;1;0;\x1B(s1p1s0b16602T
RemoveDownloadFont=Arial Fett Kurs;1;3;\x1B(s1p1s3b16602T
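
The four Arial lines follow a regular pattern, so they can be generated instead of typed. The sketch below is a Python illustration under assumptions: the helper name remove_download_font_line is hypothetical, and the pattern (s1p<italic>s<stroke>b<typeface>T is simply read off the Arial samples, where 16602 is the typeface number used. Always check the result against the PCL font list of your printer.

```python
MAX_FONT_NAME_LENGTH = 16  # limit stated for the font name in the data stream


def remove_download_font_line(font_name, italic, stroke, typeface):
    """Build one RemoveDownloadFont line following the pattern of the Arial samples.

    Assumption: the replacement font is proportional (1p) and is selected by
    style (italic), stroke weight and typeface number, as in the samples.
    """
    if len(font_name) > MAX_FONT_NAME_LENGTH:
        raise ValueError(
            f"font name longer than {MAX_FONT_NAME_LENGTH} characters: {font_name!r}"
        )
    # The leading Escape is written literally as \x1B, the notation used in the rule file.
    selection = f"\\x1B(s1p{italic}s{stroke}b{typeface}T"
    return f"RemoveDownloadFont={font_name};{italic};{stroke};{selection}"


# (font name as used by the driver, italic, stroke)
ARIAL_VARIANTS = [
    ("Arial", 0, 0),
    ("Arial Fett", 0, 3),
    ("Arial Kursiv", 1, 0),
    ("Arial Fett Kurs", 1, 3),
]

lines = [
    remove_download_font_line(name, italic, stroke, 16602)
    for name, italic, stroke in ARIAL_VARIANTS
]
print("\n".join(lines))
```

The length check mirrors the 16-character limit for font names, so a name that the driver would have truncated is caught before it ends up in a rule that never matches.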