85.3 Characters
===============
Characters are strings of length 1.
-- Function: adjust_external_format ()
Prints information about the current external format of the Lisp
reader. If the external format differs from the encoding of the
application that runs Maxima, 'adjust_external_format' tries to
adjust the encoding or prints some help or instructions.
'adjust_external_format' returns 'true' when the external format has
been changed and 'false' otherwise.
Functions like 'cint', 'unicode', 'octets_to_string' and
'string_to_octets' need UTF-8 as the external format of the Lisp
reader to work properly over the full range of Unicode characters.
Examples (Maxima on Windows, March 2016): Using
'adjust_external_format' when the default external format is not
equal to the encoding provided by the application.
1. Command line Maxima
If a terminal session is preferred, it is recommended to use
Maxima compiled with SBCL. Here Unicode support is provided by
default and calls to 'adjust_external_format' are unnecessary.
If Maxima is compiled with CLISP or GCL, it is recommended to change
the terminal encoding from CP850 to CP1252.
'adjust_external_format' prints some help.
CCL reads UTF-8 while the terminal input is CP850 by default.
CP1252 is not supported by CCL. 'adjust_external_format' prints
instructions for changing both the terminal encoding and the
external format to iso-8859-1.
2. wxMaxima
In wxMaxima SBCL reads CP1252 by default but the input from the
application is UTF-8 encoded. Adjustment is needed.
Calling 'adjust_external_format' and restarting Maxima permanently
changes the default external format to UTF-8.
(%i1) adjust_external_format();
The line
(setf sb-impl::*default-external-format* :utf-8)
has been appended to the init file
C:/Users/Username/.sbclrc
Please restart Maxima to set the external format to UTF-8.
(%o1) false
Restarting Maxima.
(%i1) adjust_external_format();
The external format is currently UTF-8
and has not been changed.
(%o1) false
-- Function: alphacharp (<char>)
Returns 'true' if <char> is an alphabetic character.
To identify a non-US-ASCII character as an alphabetic character the
underlying Lisp must provide full Unicode support. E.g. a German
umlaut is detected as an alphabetic character with SBCL in
GNU/Linux but not with GCL. (On Windows, when Maxima is compiled with
SBCL, the external format must be set to UTF-8. See
'adjust_external_format' for more.)
Example: Examination of non-US-ASCII characters.
The underlying Lisp (SBCL, GNU/Linux) is able to convert the typed
character into a Lisp character and to examine it.
(%i1) alphacharp("ü");
(%o1) true
In GCL this is not possible. An error break occurs.
(%i1) alphacharp("u");
(%o1) true
(%i2) alphacharp("ü");
package stringproc: ü cannot be converted into a Lisp character.
-- an error.
-- Function: alphanumericp (<char>)
Returns 'true' if <char> is an alphabetic character or a digit
(only corresponding US-ASCII characters are regarded as digits).
Note: See remarks on 'alphacharp'.
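A brief sketch of the classification, assuming a session in which the
'stringproc' character functions are available and using US-ASCII
input only, so that the result does not depend on the underlying Lisp:

```maxima
/* Letters and US-ASCII digits are alphanumeric; punctuation is not. */
alphanumericp("a");   /* true:  alphabetic character */
alphanumericp("5");   /* true:  US-ASCII digit */
alphanumericp("_");   /* false: neither a letter nor a digit */
```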
-- Function: ascii (<int>)
Returns the US-ASCII character corresponding to the integer <int>
which has to be less than '128'.
See 'unicode' for converting code points larger than '127'.
Examples:
(%i1) for n from 0 thru 127 do (
ch: ascii(n),
if alphacharp(ch) then sprint(ch),
if n = 96 then newline() )$
A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
a b c d e f g h i j k l m n o p q r s t u v w x y z
-- Function: cequal (<char_1>, <char_2>)
Returns 'true' if <char_1> and <char_2> are the same character.
-- Function: cequalignore (<char_1>, <char_2>)
Like 'cequal' but ignores case. For non-US-ASCII characters this is
only possible when the underlying Lisp is able to recognize a
character as an alphabetic character. See remarks on 'alphacharp'.
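The difference between the two comparisons can be seen with US-ASCII
letters. This is only an illustrative sketch, not part of the original
examples:

```maxima
/* Case-sensitive versus case-insensitive equality. */
cequal("a", "a");         /* true */
cequal("a", "A");         /* false: different characters */
cequalignore("a", "A");   /* true:  case is ignored */
```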
-- Function: cgreaterp (<char_1>, <char_2>)
Returns 'true' if the code point of <char_1> is greater than the
code point of <char_2>.
-- Function: cgreaterpignore (<char_1>, <char_2>)
Like 'cgreaterp' but ignores case. For non-US-ASCII characters this
is only possible when the underlying Lisp is able to recognize a
character as an alphabetic character. See remarks on 'alphacharp'.
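As a small sketch with US-ASCII letters, the following calls show how
ignoring case can change the outcome of the comparison:

```maxima
/* Code points: "B" is 66, "a" is 97, "b" is 98. */
cgreaterp("b", "a");         /* true:  98 > 97 */
cgreaterp("a", "B");         /* true:  97 > 66 */
cgreaterpignore("a", "B");   /* false: "a" precedes "b" when case is ignored */
```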
-- Function: charp (<obj>)
Returns 'true' if <obj> is a Maxima character, i.e. a string of
length 1. See the introduction for an example.
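Since characters are strings of length 1, the test can be sketched as
follows (an illustration only, using US-ASCII input):

```maxima
charp("a");    /* true:  a string of length 1 */
charp("ab");   /* false: a string of length 2 */
charp(5);      /* false: not a string */
```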
-- Function: cint (<char>)
Returns the Unicode code point of <char> which must be a Maxima
character, i.e. a string of length '1'.
Examples: The hexadecimal code point of some characters (Maxima
with SBCL on GNU/Linux).
(%i1) obase: 16.$
(%i2) map(cint, ["$","£","€"]);
(%o2) [24, 0A3, 20AC]
Warning: In wxMaxima with SBCL on Windows it is not possible to
enter characters whose code points need more than 16 bits when the
external format has not been set to UTF-8. See
'adjust_external_format'.
CMUCL doesn't process these characters as one character. 'cint'
then returns 'false'. Converting a character to a code point via
UTF-8 octets may serve as a workaround:
'utf8_to_unicode(string_to_octets(character));'
See 'utf8_to_unicode' and 'string_to_octets'.
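The workaround can be wrapped in a small helper. The name
'cint_via_utf8' is hypothetical and only serves this sketch; a US-ASCII
character is used so that the result is the same in every encoding:

```maxima
/* Hypothetical helper: code point of a character via its UTF-8 octets. */
cint_via_utf8(ch) := utf8_to_unicode(string_to_octets(ch))$

cint_via_utf8("A");   /* 65, the same value cint("A") returns */
```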
-- Function: clessp (<char_1>, <char_2>)
Returns 'true' if the code point of <char_1> is less than the code
point of <char_2>.
-- Function: clesspignore (<char_1>, <char_2>)
Like 'clessp' but ignores case. For non-US-ASCII characters this is
only possible when the underlying Lisp is able to recognize a
character as an alphabetic character. See remarks on 'alphacharp'.
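One possible use of these predicates is as the ordering argument of
Maxima's 'sort'. The sketch below assumes that 'sort' accepts them like
any other two-argument predicate; the list 'chars' is made up for the
illustration:

```maxima
chars: ["c", "a", "B"]$

sort(chars, clessp);         /* ["B", "a", "c"]: ordered by code point */
sort(chars, clesspignore);   /* ["a", "B", "c"]: ordered without regard to case */
```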
-- Function: constituent (<char>)
Returns 'true' if <char> is a graphic character but not a space
character. A graphic character is a character one can see, plus
the space character. ('constituent' is defined by Paul Graham.
See Paul Graham, ANSI Common Lisp, 1996, page 67.)
(%i1) for n from 0 thru 127 do (
tmp: ascii(n), if constituent(tmp) then sprint(tmp) )$
! " # % ' ( ) * + , - . / 0 1 2 3 4 5 6 7 8 9 : ; < = > ? @ A B
C D E F G H I J K L M N O P Q R S T U V W X Y Z [ \ ] ^ _ ` a b c
d e f g h i j k l m n o p q r s t u v w x y z { | } ~
-- Function: digitcharp (<char>)
Returns 'true' if <char> is a digit. Only the US-ASCII digits are
regarded as digits.
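As a sketch of how the predicate might be used, the digits of a string
can be picked out with 'sublist'. This assumes 'charlist' from the same
string processing package, which is not described in this section:

```maxima
digitcharp("7");   /* true */
digitcharp("x");   /* false */

/* Keep only the digit characters of a string. */
sublist(charlist("a1b22"), digitcharp);   /* ["1", "2", "2"] */
```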
-- Function: lowercasep (<char>)
Returns 'true' if <char> is a lowercase character.
Note: See remarks on 'alphacharp'.
-- Variable: newline
The newline character (ASCII character 10).
-- Variable: space
The space character.
-- Variable: tab
The tab character.
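The three variables hold ordinary Maxima characters, so they can be
passed to the functions of this section. A short sketch:

```maxima
cint(newline);        /* 10 */
cint(space);          /* 32 */
cint(tab);            /* 9 */
charp(tab);           /* true:  a string of length 1 */
constituent(space);   /* false: the space character is excluded */
```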
-- Function: unicode (<arg>)
Returns the character defined by <arg>. <arg> may be a Unicode code
point or, if the underlying Lisp provides full Unicode support, a
name string.
Example: Characters defined by hexadecimal code points (Maxima with
SBCL on GNU/Linux).
(%i1) ibase: 16.$
(%i2) map(unicode, [24, 0A3, 20AC]);
(%o2) [$, £, €]
Warning: In wxMaxima with SBCL on Windows it is not possible to
convert code points that need more than 16 bits to characters when
the external format has not been set to UTF-8. See
'adjust_external_format' for more information.
CMUCL doesn't process code points that need more than 16 bits. In
these cases 'unicode' returns 'false'. Converting a code point to a
character via UTF-8 octets may serve as a workaround:
'octets_to_string(unicode_to_utf8(code_point));'
See 'octets_to_string' and 'unicode_to_utf8'.
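The workaround can be wrapped in a small helper. The name
'char_via_utf8' is hypothetical and only serves this sketch; a US-ASCII
code point is used so that the result is the same in every encoding:

```maxima
/* Hypothetical helper: character for a code point via its UTF-8 octets. */
char_via_utf8(cp) := octets_to_string(unicode_to_utf8(cp))$

char_via_utf8(65);   /* "A" */
```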
If the underlying Lisp provides full Unicode support, the
character may be specified by its name. This is possible in ECL,
CLISP and SBCL, where in SBCL on Windows the external format has to
be set to UTF-8. 'unicode(name)' is supported by CMUCL too but
again limited to 16 bit characters.
The string argument to 'unicode' is basically the same string
returned by 'printf' using the "~@c" specifier. But as shown below
the prefix "#\" must be omitted. Underscores may be replaced by
spaces and uppercase letters by lowercase ones.
Example (continued): Characters defined by names (Maxima with SBCL
on GNU/Linux).
(%i3) printf(false, "~@c", unicode(0DF));
(%o3) #\LATIN_SMALL_LETTER_SHARP_S
(%i4) unicode("LATIN_SMALL_LETTER_SHARP_S");
(%o4) ß
(%i5) unicode("Latin small letter sharp s");
(%o5) ß
-- Function: unicode_to_utf8 (<code_point>)
Returns a list containing the UTF-8 octets that encode the Unicode
<code_point>.
Examples: Converting Unicode code points to UTF-8 and vice versa.
(%i1) ibase: obase: 16.$
(%i2) map(cint, ["$","£","€"]);
(%o2) [24, 0A3, 20AC]
(%i3) map(unicode_to_utf8, %);
(%o3) [[24], [0C2, 0A3], [0E2, 82, 0AC]]
(%i4) map(utf8_to_unicode, %);
(%o4) [24, 0A3, 20AC]
-- Function: uppercasep (<char>)
Returns 'true' if <char> is an uppercase character.
Note: See remarks on 'alphacharp'.
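A short sketch with US-ASCII input, showing 'uppercasep' next to its
counterpart 'lowercasep':

```maxima
uppercasep("A");   /* true */
uppercasep("a");   /* false */
lowercasep("a");   /* true */
uppercasep("1");   /* false: a digit has no case */
```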
-- Variable: us_ascii_only
This option variable affects Maxima when the character encoding
provided by the application which runs Maxima is UTF-8 but the
external format of the Lisp reader is not UTF-8.
On GNU/Linux this is the case when Maxima is built with GCL, and on
Windows in wxMaxima with GCL and SBCL builds. With SBCL it is
recommended to change the external format to UTF-8. Setting
'us_ascii_only' is unnecessary then. See
'adjust_external_format' for details.
'us_ascii_only' is 'false' by default. In the situation described
above, Maxima itself then parses the UTF-8 encoding.
When 'us_ascii_only' is set to 'true' it is assumed that all
strings used as arguments to string processing functions do not
contain non-US-ASCII characters. Given that promise, Maxima avoids
parsing UTF-8 and strings can be processed more efficiently.
-- Function: utf8_to_unicode (<list>)
Returns a Unicode code point corresponding to the <list> which must
contain the UTF-8 encoding of a single character.
Examples: See 'unicode_to_utf8'.
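For readers who prefer decimal notation, the round trip for the code
point 20AC (hexadecimal) can be sketched as follows, assuming the
default 'ibase' and 'obase' of 10:

```maxima
/* 8364 decimal is 20AC hexadecimal; its UTF-8 octets are E2 82 AC. */
unicode_to_utf8(8364);              /* [226, 130, 172] */
utf8_to_unicode([226, 130, 172]);   /* 8364 */
```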