Chapter 10 Install and System Administration for FortiOS 5.0 : Using the CLI : Tips : Language support and regular expressions
  
Language support and regular expressions
Characters such as ñ, é, symbols, and ideographs are sometimes acceptable input. Support varies by the nature of the item being configured. CLI commands, objects, field names, and options must use their exact ASCII characters, but some items with arbitrary names or values may be input using your language of choice. To use other languages in those cases, you must use the correct encoding.
Input is stored using Unicode UTF-8 encoding but is not normalized from other encodings into UTF-8 before it is stored. If your input method encodes some characters differently than in UTF-8, your configured items may not display or operate as expected.
Regular expressions are especially impacted. Matching uses the UTF‑8 character values. If you enter a regular expression using another encoding, or if an HTTP client sends a request in an encoding other than UTF‑8, matches may not be what you expect.
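The effect on matching can be sketched outside the FortiGate unit. The following Python example is illustrative only: the sample word and the `found` helper are hypothetical, and Python's `re` module stands in for whatever matching engine the unit uses. It shows that the same word yields different byte values in UTF-8 and ISO 8859-1, so a pattern built from the UTF-8 bytes does not match the ISO 8859-1 request.

```python
import re

# Hypothetical sample word containing one non-ASCII character.
WORD = "señal"

utf8_bytes = WORD.encode("utf-8")          # ñ becomes two bytes: C3 B1
latin1_bytes = WORD.encode("iso-8859-1")   # ñ becomes one byte: F1

# A byte pattern built from the UTF-8 form of the word.
pattern = re.compile(re.escape(utf8_bytes))


def found(data: bytes) -> bool:
    """Return True if the UTF-8 pattern occurs in the raw request bytes."""
    return pattern.search(data) is not None


if __name__ == "__main__":
    print(found(b"GET /?q=" + utf8_bytes))    # request sent as UTF-8
    print(found(b"GET /?q=" + latin1_bytes))  # request sent as ISO 8859-1
```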
For example, with Shift-JIS, backslashes ( \ ) could be inadvertently interpreted as the symbol for the Japanese yen ( ¥ ) and vice versa. A regular expression intended to match HTTP requests containing money values with a yen symbol therefore may not work if the symbol is entered using the wrong encoding.
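A minimal Python sketch of the yen and backslash confusion follows. It assumes the Shift-JIS convention in which the single byte 0x5C (the ASCII backslash) represents the yen sign; that byte is hard-coded here rather than produced by a codec. The request strings and the `matches` helper are hypothetical, and Python's `re` module is only a stand-in for the unit's matching engine.

```python
import re

# Yen sign in UTF-8 is two bytes (C2 A5).
YEN_UTF8 = "¥".encode("utf-8")
# Assumption: in Shift-JIS the yen sign is the single byte 0x5C,
# the same value ASCII uses for the backslash.
YEN_SHIFT_JIS = b"\x5c"

request_utf8 = b"price=" + YEN_UTF8 + b"100"
request_sjis = b"price=" + YEN_SHIFT_JIS + b"100"

# Pattern typed in UTF-8: yen sign followed by digits.
pattern_utf8 = re.compile(re.escape(YEN_UTF8) + rb"\d+")

# Pattern typed in Shift-JIS without escaping: the yen byte is read as a
# backslash, so the expression becomes "literal backslash, then one or more d".
pattern_sjis_naive = re.compile(YEN_SHIFT_JIS + rb"\d+")

# Pattern typed in Shift-JIS with the yen byte escaped as a literal.
pattern_sjis_escaped = re.compile(re.escape(YEN_SHIFT_JIS) + rb"\d+")


def matches(pattern: "re.Pattern[bytes]", data: bytes) -> bool:
    """Return True if the pattern occurs anywhere in the raw bytes."""
    return pattern.search(data) is not None


if __name__ == "__main__":
    for name, pat in [("utf8", pattern_utf8),
                      ("sjis naive", pattern_sjis_naive),
                      ("sjis escaped", pattern_sjis_escaped)]:
        print(name, matches(pat, request_utf8), matches(pat, request_sjis))
```

Only the pattern whose encoding and escaping agree with the request bytes matches, which is why the encoding used to enter the expression matters.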
For best results, you should:
use UTF-8 encoding, or
use only the characters whose numerically encoded values are the same in UTF‑8, such as the US-ASCII characters that are also encoded using the same values in ISO 8859-1, Windows code page 1252, Shift-JIS and other encodings, or
for regular expressions that must match HTTP requests, use the same encoding as your HTTP clients.
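One way to check the second recommendation before entering a name or value is sketched below in Python. The function name and sample strings are hypothetical, and Python's codec tables are used as a stand-in for the encodings listed above. The function reports whether a string produces identical bytes in all of them.

```python
# Encodings named in the recommendations above, by their Python codec names.
ENCODINGS = ("utf-8", "iso-8859-1", "cp1252", "shift_jis")


def same_bytes_everywhere(text: str) -> bool:
    """Return True if text encodes to identical bytes in every listed encoding."""
    try:
        encoded = {text.encode(enc) for enc in ENCODINGS}
    except UnicodeEncodeError:
        # At least one encoding cannot represent the text at all.
        return False
    return len(encoded) == 1


if __name__ == "__main__":
    for sample in ("admin-policy_01", "señal", "日本語"):
        print(sample, same_bytes_everywhere(sample))
```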
 
HTTP clients may send requests in encodings other than UTF-8. Encodings usually vary by the client’s operating system or input language. If you cannot predict the client’s encoding, you may only be able to match the parts of the request that are in English, because English characters tend to be encoded identically regardless of the encoding. For example, English words may be legible whether a web page is interpreted as ISO 8859-1 or as GB2312, whereas simplified Chinese characters might only be legible if the page is interpreted as GB2312.
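The ISO 8859-1 and GB2312 comparison can be reproduced with a short Python sketch. The sample strings and the `legible_as` helper are hypothetical and only illustrate the point about English text surviving a wrong guess at the encoding.

```python
ENGLISH = "login"
CHINESE = "中文"  # sample simplified Chinese text

# Both samples as a GB2312 page would carry them.
english_bytes = ENGLISH.encode("gb2312")
chinese_bytes = CHINESE.encode("gb2312")


def legible_as(data: bytes, encoding: str, expected: str) -> bool:
    """Return True if decoding data with the given encoding yields the expected text."""
    return data.decode(encoding, errors="replace") == expected


if __name__ == "__main__":
    for enc in ("gb2312", "iso-8859-1"):
        print(enc,
              legible_as(english_bytes, enc, ENGLISH),
              legible_as(chinese_bytes, enc, CHINESE))
```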
If you configure your FortiGate unit using other encodings, you may need to switch language settings on your management computer, including for your web browser or Telnet/SSH client. For instructions on how to configure your management computer’s operating system language, locale, or input method, see its documentation.
If you choose to configure parts of the FortiGate unit using non-ASCII characters, verify that all systems interacting with the FortiGate unit also support the same encodings. If possible, also use the same encoding throughout the configuration so that you do not need to switch the language settings of the web-based manager and your web browser or Telnet/SSH client while you work.
As with input, your web browser or CLI client should usually interpret display output as encoded using UTF-8. If it does not, your configured items may not display correctly in the web-based manager or CLI. Exceptions include items such as regular expressions that you may have configured using other encodings in order to match the encoding of HTTP requests that the FortiGate unit receives.
To enter non-ASCII characters in the CLI Console widget
1. On your management computer, start your web browser and go to the URL for the FortiGate unit’s web-based manager.
2. Configure your web browser to interpret the page as UTF-8 encoded.
3. Log in to the FortiGate unit.
4. Go to System > Dashboard > Status.
5. In the title bar of the CLI Console widget, click Edit (the pencil icon).
6. Enable Use external command input box.
7. Select OK.
8. The Command field appears below the usual input and display area of the CLI Console widget.
9. In Command, type a command.
Figure 226: Entering encoded characters (CLI Console widget)
10. Press Enter.
In the display area, the CLI Console widget displays your previous command converted into its character code equivalent, such as:
edit \743\601\613\743\601\652
and the command’s output.
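The source does not define the character code notation. One possible reading, which is an assumption and not a documented format, is that each backslash-prefixed group is an octal number whose low eight bits are one UTF-8 byte. Under that reading, the following Python sketch (with a hypothetical helper name) recovers the text behind the echoed codes:

```python
import re


def decode_cli_codes(echoed: str) -> str:
    """Decode backslash-prefixed three-digit octal codes into text.

    Assumption (not stated in the source): only the low 8 bits of each
    octal value are significant, and the resulting bytes are UTF-8.
    """
    codes = re.findall(r"\\([0-7]{3})", echoed)
    raw = bytes(int(code, 8) & 0xFF for code in codes)
    return raw.decode("utf-8")


if __name__ == "__main__":
    # Codes copied from the example echoed by the CLI Console widget.
    print(decode_cli_codes(r"\743\601\613\743\601\652"))
```

With the example codes above, this reading yields the two hiragana characters かな, which is consistent with UTF-8 input, but the interpretation should be verified against your own unit's output.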
To enter non-ASCII characters in a Telnet/SSH client
1. On your management computer, start your Telnet or SSH client.
2. Configure your Telnet or SSH client to send and receive characters using UTF-8 encoding.
Support for sending and receiving international characters varies by each Telnet/SSH client. Consult the documentation for your Telnet/SSH client.
3. Log in to the FortiGate unit.
4. At the command prompt, type your command and press Enter.
Figure 227: Entering encoded characters (PuTTY)
You may need to surround words that use encoded characters with single quotes ( ' ).
Depending on your Telnet/SSH client’s support for your language’s input methods and for sending international characters, you may need to convert the characters into character codes before pressing Enter.
For example, you might need to enter:
edit '\743\601\613\743\601\652'
5. The CLI displays your previous command and its output.