85.3 Characters
===============

Characters are strings of length 1.

 -- Function: adjust_external_format ()

     Prints information about the current external format of the Lisp
     reader.  If the external format differs from the encoding of the
     application which runs Maxima, 'adjust_external_format' tries to
     adjust the encoding or prints help or instructions.
     'adjust_external_format' returns 'true' when the external format
     has been changed and 'false' otherwise.

     Functions like 'cint', 'unicode', 'octets_to_string' and
     'string_to_octets' need UTF-8 as the external format of the Lisp
     reader to work properly over the full range of Unicode characters.

     Examples (Maxima on Windows, March 2016): Using
     'adjust_external_format' when the default external format is not
     equal to the encoding provided by the application.

     1.  Command line Maxima

     If a terminal session is preferred, it is recommended to use
     Maxima compiled with SBCL.  Here Unicode support is provided by
     default and calls to 'adjust_external_format' are unnecessary.

     If Maxima is compiled with CLISP or GCL, it is recommended to
     change the terminal encoding from CP850 to CP1252.
     'adjust_external_format' prints some help.

     CCL reads UTF-8 while the terminal input is CP850 by default.
     CP1252 is not supported by CCL.  'adjust_external_format' prints
     instructions for changing both the terminal encoding and the
     external format to iso-8859-1.

     2.  wxMaxima

     In wxMaxima SBCL reads CP1252 by default but the input from the
     application is UTF-8 encoded.  Adjustment is needed.

     Calling 'adjust_external_format' and restarting Maxima permanently
     changes the default external format to UTF-8.

          (%i1) adjust_external_format();
          The line
          (setf sb-impl::*default-external-format* :utf-8)
          has been appended to the init file
          C:/Users/Username/.sbclrc
          Please restart Maxima to set the external format to UTF-8.
          (%o1) false

     Restarting Maxima.

          (%i1) adjust_external_format();
          The external format is currently UTF-8
          and has not been changed.
          (%o1) false

 -- Function: alphacharp (<char>)

     Returns 'true' if <char> is an alphabetic character.

     To identify a non-US-ASCII character as an alphabetic character, the
     underlying Lisp must provide full Unicode support.  E.g. a German
     umlaut is detected as an alphabetic character with SBCL on
     GNU/Linux but not with GCL.  (On Windows, when Maxima is compiled
     with SBCL, the external format must be set to UTF-8.  See
     'adjust_external_format' for more.)

     Example: Examination of non-US-ASCII characters.

     The underlying Lisp (SBCL, GNU/Linux) is able to convert the typed
     character into a Lisp character and to examine it.

          (%i1) alphacharp("ü");
          (%o1)                          true

     In GCL this is not possible.  An error break occurs.

          (%i1) alphacharp("u");
          (%o1)                          true
          (%i2) alphacharp("ü");

          package stringproc: ü cannot be converted into a Lisp character.
           -- an error.

 -- Function: alphanumericp (<char>)

     Returns 'true' if <char> is an alphabetic character or a digit
     (only corresponding US-ASCII characters are regarded as digits).

     See remarks on 'alphacharp'.

 -- Function: ascii (<int>)

     Returns the US-ASCII character corresponding to the integer <int>
     which has to be less than '128'.

     See 'unicode' for converting code points larger than '127'.

     Examples:

          (%i1) for n from 0 thru 127 do (
                  ch: ascii(n),
                  if alphacharp(ch) then sprint(ch),
                  if n = 96 then newline() )$
          A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
          a b c d e f g h i j k l m n o p q r s t u v w x y z

 -- Function: cequal (<char_1>, <char_2>)

     Returns 'true' if <char_1> and <char_2> are the same character.

 -- Function: cequalignore (<char_1>, <char_2>)

     Like 'cequal' but ignores case.  For non-US-ASCII characters this
     is only possible when the underlying Lisp is able to recognize a
     character as an alphabetic character.  See remarks on
     'alphacharp'.
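
     Example: A minimal sketch contrasting 'cequal' and 'cequalignore'.
     Only US-ASCII letters are used, so the result is assumed not to
     depend on the underlying Lisp or the external format.

          (%i1) cequal("a", "A");
          (%o1)                          false
          (%i2) cequalignore("a", "A");
          (%o2)                          true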

 -- Function: cgreaterp (<char_1>, <char_2>)

     Returns 'true' if the code point of <char_1> is greater than the
     code point of <char_2>.

 -- Function: cgreaterpignore (<char_1>, <char_2>)

     Like 'cgreaterp' but ignores case.  For non-US-ASCII characters
     this is only possible when the underlying Lisp is able to
     recognize a character as an alphabetic character.  See remarks on
     'alphacharp'.

 -- Function: charp (<obj>)

     Returns 'true' if <obj> is a Maxima character, i.e. a string of
     length 1.

 -- Function: cint (<char>)

     Returns the Unicode code point of <char> which must be a Maxima
     character, i.e.  a string of length '1'.

     Examples: The hexadecimal code point of some characters (Maxima
     with SBCL on GNU/Linux).

          (%i1) obase: 16.$
          (%i2) map(cint, ["$","£","€"]);
          (%o2)                           [24, 0A3, 20AC]

     Warning: In wxMaxima with SBCL on Windows it is not possible to
     enter characters whose code points do not fit into 16 bits when
     the external format has not been set to UTF-8.  See
     'adjust_external_format'.

     CMUCL doesn't process these characters as one character.  'cint'
     then returns 'false'.  Converting a character to a code point via
     UTF-8 octets may serve as a workaround:

     'utf8_to_unicode(string_to_octets(character));'

     See 'utf8_to_unicode' and 'string_to_octets'.
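
     Example: A sketch of the workaround above, compared with a direct
     call to 'cint'.  The US-ASCII character "A" is chosen on purpose
     so that the result does not depend on the external format; it is
     assumed that 'string_to_octets' is called with its default
     encoding.

          (%i1) utf8_to_unicode(string_to_octets("A"));
          (%o1)                           65
          (%i2) cint("A");
          (%o2)                           65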

 -- Function: clessp (<char_1>, <char_2>)

     Returns 'true' if the code point of <char_1> is less than the code
     point of <char_2>.

 -- Function: clesspignore (<char_1>, <char_2>)

     Like 'clessp' but ignores case.  For non-US-ASCII characters this
     is only possible when the underlying Lisp is able to recognize a
     character as an alphabetic character.  See remarks on
     'alphacharp'.
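
     Example: A minimal sketch of how comparing by code point differs
     from comparing while ignoring case.  Only US-ASCII letters are
     used; "a" has code point 97 and "B" has code point 66.

          (%i1) [cint("a"), cint("B")];
          (%o1)                       [97, 66]
          (%i2) [clessp("a", "B"), cgreaterp("a", "B")];
          (%o2)                     [false, true]
          (%i3) [clesspignore("a", "B"), cgreaterpignore("a", "B")];
          (%o3)                     [true, false]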

 -- Function: constituent (<char>)

     Returns 'true' if <char> is a graphic character but not a space
     character.  A graphic character is a character one can see, plus
     the space character.  ('constituent' is defined by Paul Graham.
     See Paul Graham, ANSI Common Lisp, 1996, page 67.)

          (%i1) for n from 0 thru 127 do (
          tmp: ascii(n), if constituent(tmp) then sprint(tmp) )$
          ! " # $ % & ' ( ) * + , - . / 0 1 2 3 4 5 6 7 8 9 : ; < = > ? @ A B
          C D E F G H I J K L M N O P Q R S T U V W X Y Z [ \ ] ^ _ ` a b c
          d e f g h i j k l m n o p q r s t u v w x y z { | } ~

 -- Function: digitcharp (<char>)

     Returns 'true' if <char> is a digit.  Only the US-ASCII characters
     "0" to "9" are regarded as digits.
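
     Example: A minimal sketch comparing 'digitcharp' with
     'alphanumericp' on three US-ASCII characters.

          (%i1) map(digitcharp, ["7", "a", "_"]);
          (%o1)                  [true, false, false]
          (%i2) map(alphanumericp, ["7", "a", "_"]);
          (%o2)                  [true, true, false]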

 -- Function: lowercasep (<char>)

     Returns 'true' if <char> is a lowercase character.

     See remarks on 'alphacharp'.
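
     Example: A minimal sketch of 'lowercasep' and its counterpart
     'uppercasep' on US-ASCII characters.  A digit is neither lowercase
     nor uppercase.

          (%i1) map(lowercasep, ["a", "A", "1"]);
          (%o1)                  [true, false, false]
          (%i2) map(uppercasep, ["a", "A", "1"]);
          (%o2)                  [false, true, false]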

 -- Variable: newline

     The newline character (ASCII character 10).

 -- Variable: space

     The space character.

 -- Variable: tab

     The tab character.
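
     Example: A short sketch showing the code points of the three
     character variables 'newline', 'space' and 'tab', assuming the
     variables still have their default values.

          (%i1) map(cint, [newline, space, tab]);
          (%o1)                      [10, 32, 9]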

 -- Function: unicode (<arg>)

     Returns the character defined by <arg>, which may be a Unicode
     code point or, if the underlying Lisp provides full Unicode
     support, a name string.

     Example: Characters defined by hexadecimal code points (Maxima with
     SBCL on GNU/Linux).

          (%i1) ibase: 16.$
          (%i2) map(unicode, [24, 0A3, 20AC]);
          (%o2)                            [$, £, €]

     Warning: In wxMaxima with SBCL on Windows it is not possible to
     convert code points that do not fit into 16 bits to characters
     when the external format has not been set to UTF-8.  See
     'adjust_external_format' for more information.

     CMUCL doesn't process code points that do not fit into 16 bits.
     In these cases 'unicode' returns 'false'.  Converting a code point
     to a character via UTF-8 octets may serve as a workaround:

     'octets_to_string(unicode_to_utf8(code_point));'

     See 'octets_to_string' and 'unicode_to_utf8'.

     If the underlying Lisp provides full Unicode support, the
     character may be specified by its name.  This is possible in ECL,
     CLISP and SBCL, where in SBCL on Windows the external format has
     to be set to UTF-8.  'unicode(name)' is supported by CMUCL too but
     again limited to 16 bit characters.

     The string argument to 'unicode' is essentially the string
     returned by 'printf' using the "~@c" specifier, but, as shown
     below, the prefix "#\" must be omitted.  Underscores may be
     replaced by spaces and uppercase letters by lowercase ones.

     Example (continued): Characters defined by names (Maxima with SBCL
     on GNU/Linux).

          (%i3) printf(false, "~@c", unicode(0DF));
          (%o3)                    #\LATIN_SMALL_LETTER_SHARP_S
          (%i4) unicode("LATIN_SMALL_LETTER_SHARP_S");
          (%o4)                                  ß
          (%i5) unicode("Latin small letter sharp s");
          (%o5)                                  ß

 -- Function: unicode_to_utf8 (<code_point>)

     Returns a list containing the UTF-8 code corresponding to the
     Unicode <code_point>.

     Examples: Converting Unicode code points to UTF-8 and vice versa.

          (%i1) ibase: obase: 16.$
          (%i2) map(cint, ["$","£","€"]);
          (%o2)                           [24, 0A3, 20AC]
          (%i3) map(unicode_to_utf8, %);
          (%o3)                 [[24], [0C2, 0A3], [0E2, 82, 0AC]]
          (%i4) map(utf8_to_unicode, %);
          (%o4)                           [24, 0A3, 20AC]

 -- Function: uppercasep (<char>)

     Returns 'true' if <char> is an uppercase character.

     See remarks on 'alphacharp'.

 -- Variable: us_ascii_only

     This option variable affects Maxima when the character encoding
     provided by the application which runs Maxima is UTF-8 but the
     external format of the Lisp reader is not equal to UTF-8.

     On GNU/Linux this is the case when Maxima is built with GCL, and
     on Windows in wxMaxima with GCL and SBCL builds.  With SBCL it is
     recommended to change the external format to UTF-8.  Setting
     'us_ascii_only' is unnecessary then.  See
     'adjust_external_format' for details.

     'us_ascii_only' is 'false' by default.  Maxima itself then (i.e.
     in the above described situation) parses the UTF-8 encoding.

     When 'us_ascii_only' is set to 'true', it is assumed that all
     strings used as arguments to string processing functions do not
     contain non-US-ASCII characters.  Given that promise, Maxima avoids
     parsing UTF-8 and strings can be processed more efficiently.
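
     Example: A minimal sketch of setting the option before processing
     a string that is known to contain only US-ASCII characters.
     'supcase' is used here merely as an arbitrary string processing
     function; its result for such a string is assumed to be the same
     whether 'us_ascii_only' is 'true' or 'false'.

          (%i1) us_ascii_only: true$
          (%i2) supcase("maxima");
          (%o2)                         MAXIMA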

 -- Function: utf8_to_unicode (<list>)

     Returns a Unicode code point corresponding to the <list> which must
     contain the UTF-8 encoding of a single character.

     Examples: See 'unicode_to_utf8'.
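
     For readers who prefer decimal notation, the following sketch
     repeats the round trip with the default 'ibase' and 'obase' of 10.
     The code point 8364 (hexadecimal 20AC, the euro sign) is encoded
     in UTF-8 as three octets.

          (%i1) unicode_to_utf8(8364);
          (%o1)                    [226, 130, 172]
          (%i2) utf8_to_unicode(%);
          (%o2)                         8364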

