73.1 Introduction to string processing
73.2 Functions and Variables for input and output
73.3 Functions and Variables for characters
73.4 Functions and Variables for strings

73.1 Introduction to string processing

stringproc.lisp extends Maxima's capabilities for working with strings
and adds some useful functions for file input and output.
For questions and bug reports, please mail volkervannek at gmail dot com.
In Maxima a string is easily constructed by typing "text".
stringp tests for strings.

(%i1) m: "text";
(%o1) text
(%i2) stringp(m);
(%o2) true

Characters are represented as strings of length 1.
These are not Lisp characters.
Tests can be done with charp (or lcharp for Lisp characters),
and conversion from Lisp to Maxima characters with cunlisp.

(%i1) c: "e";
(%o1) e
(%i2) [charp(c),lcharp(c)];
(%o2) [true, false]
(%i3) supcase(c);
(%o3) E
(%i4) charp(%);
(%o4) true

All functions in stringproc.lisp that return characters return
Maxima characters. Because these characters are strings of length 1,
many string functions can also be applied to characters.
As seen above, supcase is one example.
It is important to know that the first character in a Maxima string is at
position 1. This was chosen because the first element in a Maxima list is
at position 1 too. See the definitions of charat and charlist for examples.
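
The 1-based convention can be tried out with a short sketch like the following. It assumes the stringproc functions are available in the session (loaded or autoloaded); the variable name s is chosen only for this illustration.

```maxima
s: "Maxima"$

/* Position 1 is the first character, just like the first list element. */
charat(s, 1);            /* M */
first(charlist(s));      /* M */

/* substring excludes the character at the end position,
   so positions 1, 2 and 3 are returned here. */
substring(s, 1, 4);      /* Max */
```
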
In applications, string functions are often used when working with files.
You will find some useful stream and print functions in stringproc.lisp.
The following example shows some of the functions introduced here at work.

Example:

openw returns an output stream to a file; printf then allows
formatted writing to this file. See printf for details.

(%i1) s: openw("E:/file.txt");
(%o1) #<output stream E:/file.txt>
(%i2) for n:0 thru 10 do printf( s, "~d ", fib(n) );
(%o2) done
(%i3) printf( s, "~%~d ~f ~a ~a ~f ~e ~a~%",
              42,1.234,sqrt(2),%pi,1.0e-2,1.0e-2,1.0b-2 );
(%o3) false
(%i4) close(s);
(%o4) true

After closing the stream you can open it again, this time for input.
readline returns the entire line as one string. The stringproc
package now offers a lot of functions for manipulating strings. Tokenizing can
be done by split or tokens.

(%i5) s: openr("E:/file.txt");
(%o5) #<input stream E:/file.txt>
(%i6) readline(s);
(%o6) 0 1 1 2 3 5 8 13 21 34 55
(%i7) line: readline(s);
(%o7) 42 1.234 sqrt(2) %pi 0.01 1.0E-2 1.0b-2
(%i8) list: tokens(line);
(%o8) [42, 1.234, sqrt(2), %pi, 0.01, 1.0E-2, 1.0b-2]
(%i9) map( parse_string, list );
(%o9) [42, 1.234, sqrt(2), %pi, 0.01, 0.01, 1.0b-2]
(%i10) float(%);
(%o10) [42.0, 1.234, 1.414213562373095, 3.141592653589793, 0.01, 0.01, 0.01]
(%i11) readline(s);
(%o11) false
(%i12) close(s)$

readline returns false when the end of the file is reached.

73.2 Functions and Variables for input and output

Example:

(%i1) s: openw("E:/file.txt");
(%o1) #<output stream E:/file.txt>
(%i2) control:
"~2tAn atom: ~20t~a~%~2tand a list: ~20t~{~r ~}~%~2t\
and an integer: ~20t~d~%"$
(%i3) printf( s,control, 'true,[1,2,3],42 )$
(%o3) false
(%i4) close(s);
(%o4) true
(%i5) s: openr("E:/file.txt");
(%o5) #<input stream E:/file.txt>
(%i6) while stringp( tmp:readline(s) ) do print(tmp)$
  An atom:          true
  and a list:       one two three
  and an integer:   42
(%i7) close(s)$

Function: close (stream)
Closes stream and returns true if stream had been open.

Function: flength (stream)
Returns the number of elements in stream, where stream has to be a stream from or to a file.

Function: fposition (stream)
Function: fposition (stream, pos)
Returns the current position in stream if pos is not used.
If pos is used, fposition sets the position in stream.
stream has to be a stream from or to a file, and
pos has to be a positive number, where the first element in stream
is in position 1.

Function: freshline ()
Function: freshline (stream)
Writes a new line (to stream) if the position is not at the beginning of
a line. See also newline.

Function: get_output_stream_string (stream)
Returns a string containing all the characters currently present in the
open stream stream. The returned characters are removed from the stream.
stream must have been created by make_string_output_stream.
Example: See make_string_output_stream.

Function: make_string_input_stream (string)
Function: make_string_input_stream (string, start)
Function: make_string_input_stream (string, start, end)
Returns an input stream that contains parts of string and an end of file. Without optional arguments the stream contains the entire string and is positioned in front of the first character. The optional arguments start and end select the portion of the string that the stream contains. The first character is at position 1.

(%i1) istream : make_string_input_stream("text", 1, 4);
(%o1) #<string-input stream from "text">
(%i2) (while (c : readchar(istream)) # false do sprint(c), newline())$
t e x
(%i3) close(istream)$

Function: make_string_output_stream ()
Returns an output stream that accepts characters. The characters currently held in the stream can be retrieved with get_output_stream_string.

(%i1) ostream : make_string_output_stream();
(%o1) #<string-output stream 09622ea0>
(%i2) printf(ostream, "foo")$
(%i3) printf(ostream, "bar")$
(%i4) string : get_output_stream_string(ostream);
(%o4) foobar
(%i5) printf(ostream, "baz")$
(%i6) string : get_output_stream_string(ostream);
(%o6) baz
(%i7) close(ostream)$

Function: newline ()
Function: newline (stream)
Writes a new line (to stream). See sprint for an example of how to use
newline(). Note that there are some cases where newline() does
not work as expected.

Function: opena (file)
Returns an output stream to file.
If an existing file is opened, opena appends elements at the end of file.

Function: openr (file)
Returns an input stream to file. openr assumes that file already exists; opening a file for reading does not create it.

Function: openw (file)
Returns an output stream to file.
If file does not exist, it will be created.
If an existing file is opened, openw destructively modifies file.
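
The difference between openw and opena can be seen in a small round trip. This is only a sketch: the path is hypothetical and must be replaced by a writable location, and the variable names path and s are chosen for illustration.

```maxima
/* Hypothetical path; adjust it to a writable location on your system. */
path: "/tmp/stringproc_demo.txt"$

s: openw(path)$            /* creates the file or overwrites its content */
printf(s, "first~%")$
close(s)$

s: opena(path)$            /* appends to the existing file */
printf(s, "second~%")$
close(s)$

s: openr(path)$
readline(s);               /* first */
readline(s);               /* second */
readline(s);               /* false: end of file */
close(s)$
```
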

Function: printf (dest, string)
Function: printf (dest, string, expr_1, ..., expr_n)
Produces formatted output. The destination dest specifies where the
output goes: it can be an output stream or one of the global
variables true and false. true sends the output to the terminal;
the return value of printf is false in this case.
false as destination returns the output as the return value.
The characters of the control string string are output one after another, except that a tilde introduces a directive. Directives generally use the following arguments expr_1, ..., expr_n to produce their output. The character after the tilde specifies which kind of formatting is desired.
printf makes the Common Lisp function format available in Maxima.
The following example shows the basic relationship between these
two functions.

(%i1) printf(true, "R~dD~d~%", 2, 2);
R2D2
(%o1) false
(%i2) :lisp (format t "R~dD~d~%" 2 2)
R2D2
NIL

The following description and the examples are limited to a rough
sketch of what printf can do.
The Lisp function format is described in detail in many reference books.
One helpful source is the freely available online manual
"Common Lisp the Language" by Guy L. Steele; see chapter 22.3.3 there.

   ~%       new line
   ~&       fresh line
   ~t       tab
   ~$       monetary
   ~d       decimal integer
   ~b       binary integer
   ~o       octal integer
   ~x       hexadecimal integer
   ~br      base-b integer
   ~r       spell an integer
   ~p       plural
   ~f       floating point
   ~e       scientific notation
   ~g       ~f or ~e, depending upon magnitude
   ~h       bigfloat
   ~a       uses Maxima function string
   ~s       like ~a, but output enclosed in "double quotes"
   ~~       ~
   ~<       justification, ~> terminates
   ~(       case conversion, ~) terminates
   ~[       selection, ~] terminates
   ~{       iteration, ~} terminates

The directive ~h for bigfloats (arbitrary-precision floating-point numbers) is not part of the Lisp standard and is therefore described in more detail below.
The directive ~* is not supported.
If dest is a stream or true, printf returns false.
Otherwise the return value is a string.

(%i1) printf( false, "~a ~a ~4f ~a ~@r",
              "String",sym,bound,sqrt(12),144), bound = 1.234;
(%o1) String sym 1.23 2*sqrt(3) CXLIV
(%i2) printf( false,"~{~a ~}",["one",2,"THREE"] );
(%o2) one 2 THREE
(%i3) printf( true,"~{~{~9,1f ~}~%~}",mat ),
              mat = args(matrix([1.1,2,3.33],[4,5,6],[7,8.88,9]))$
      1.1       2.0       3.3
      4.0       5.0       6.0
      7.0       8.9       9.0
(%i4) control: "~:(~r~) bird~p ~[is~;are~] singing."$
(%i5) printf( false, control, n,n, if n = 1 then 1 else 2 ), n = 2;
(%o5) Two birds are singing.

The directive ~h was introduced for bigfloats (arbitrary-precision floating-point numbers).

~w,d,e,x,o,p@H
 w : width
 d : decimal digits behind floating point
 e : minimal exponent digits
 x : preferred exponent
 o : overflow character
 p : padding character
 @ : display sign for positive numbers

(%i1) fpprec : 1000$
(%i2) printf(true, "|~h|~%", 2.b0^-64)$
|0.0000000000000000000542101086242752217003726400434970855712890625|
(%i3) fpprec : 26$
(%i4) printf(true, "|~h|~%", sqrt(2))$
|1.4142135623730950488016887|
(%i5) fpprec : 24$
(%i6) printf(true, "|~h|~%", sqrt(2))$
|1.41421356237309504880169|
(%i7) printf(true, "|~28h|~%", sqrt(2))$
|   1.41421356237309504880169|
(%i8) printf(true, "|~28,,,,,'*h|~%", sqrt(2))$
|***1.41421356237309504880169|
(%i9) printf(true, "|~,18h|~%", sqrt(2))$
|1.414213562373095049|
(%i10) printf(true, "|~,,,-3h|~%", sqrt(2))$
|1414.21356237309504880169b-3|
(%i11) printf(true, "|~,,2,-3h|~%", sqrt(2))$
|1414.21356237309504880169b-03|
(%i12) printf(true, "|~20h|~%", sqrt(2))$
|1.41421356237309504880169|
(%i13) printf(true, "|~20,,,,'+h|~%", sqrt(2))$
|++++++++++++++++++++|

Function: readchar (stream)
Removes and returns the first character in stream.
If the end of the stream has been reached, readchar returns false.
Example: See make_string_input_stream.

Function: readline (stream)
Returns a string containing the characters from the current position in
stream up to the end of the line, or false if the end of the file
is encountered.

Function: sprint (expr_1, ..., expr_n)
Evaluates and displays its arguments one after the other "on a line", starting
at the leftmost position. Numbers are printed with the '-' right next to
the number, and line length is disregarded. newline(), which will be
autoloaded from stringproc.lisp, might be useful if you wish to insert
intermediate line breaks.
Examples:

(%i1) for n:0 thru 19 do sprint( fib(n) )$
0 1 1 2 3 5 8 13 21 34 55 89 144 233 377 610 987 1597 2584 4181
(%i2) for n:0 thru 22 do (
         sprint(fib(n)), if mod(n,10)=9 then newline() )$
0 1 1 2 3 5 8 13 21 34
55 89 144 233 377 610 987 1597 2584 4181
6765 10946 17711

73.3 Functions and Variables for characters

Function: alphacharp (char)
Returns true if char is an alphabetic character.

Function: alphanumericp (char)
Returns true if char is an alphabetic character or a digit.

Function: ascii (int)
Returns the character corresponding to the ASCII number int (-1 < int < 256).

Example:

(%i1) for n from 0 thru 255 do (
         tmp: ascii(n), if alphacharp(tmp) then sprint(tmp),
         if n=96 then newline() )$
A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
a b c d e f g h i j k l m n o p q r s t u v w x y z
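
Function: cequal (char_1, char_2)
Returns true if char_1 and char_2 are the same.

Function: cequalignore (char_1, char_2)
Like cequal but ignores case.

Function: cgreaterp (char_1, char_2)
Returns true if the ASCII number of char_1 is greater than the
number of char_2.

Function: cgreaterpignore (char_1, char_2)
Like cgreaterp but ignores case.

Function: charp (obj)
Returns true if obj is a Maxima character.
See the introduction for an example.

Function: cint (char)
Returns the ASCII number of char.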
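
A brief sketch of how cint and ascii relate to each other, using only ASCII letters and assuming the stringproc functions are available in the session:

```maxima
cint("A");            /* 65 */
ascii(65);            /* A */

/* ascii undoes cint for ASCII characters. */
ascii(cint("a"));     /* a */
```
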

Function: clessp (char_1, char_2)
Returns true if the ASCII number of char_1 is less than the number
of char_2.

Function: clesspignore (char_1, char_2)
Like clessp but ignores case.
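
The comparison functions can be contrasted with a few calls. This is a minimal sketch restricted to ASCII letters, where "A" has the number 65 and "a" the number 97:

```maxima
clessp("A", "a");           /* true, since 65 < 97 */
cgreaterp("A", "a");        /* false */
cequal("a", "A");           /* false */

/* The ...ignore variants disregard case. */
cequalignore("a", "A");     /* true */
clesspignore("a", "B");     /* true */
```
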

Function: constituent (char)
Returns true if char is a graphic character and not the space
character. A graphic character is a character one can see, plus the space
character. (constituent is defined by Paul Graham, ANSI Common Lisp,
1996, page 67.)

Example:

(%i1) for n from 0 thru 255 do (
         tmp: ascii(n), if constituent(tmp) then sprint(tmp) )$
! " # % ' ( ) * + , - . / 0 1 2 3 4 5 6 7 8 9 : ; < = > ? @ A B
C D E F G H I J K L M N O P Q R S T U V W X Y Z [ \ ] ^ _ ` a b c
d e f g h i j k l m n o p q r s t u v w x y z { | } ~
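
Function: cunlisp (lisp_char)
Converts a Lisp character into a Maxima character. (You won't need it.)

Function: digitcharp (char)
Returns true if char is a digit.

Function: lcharp (obj)
Returns true if obj is a Lisp character.
(You won't need it.)

Function: lowercasep (char)
Returns true if char is a lowercase character.

Variable: newline
The newline character.

Variable: space
The space character.

Variable: tab
The tab character.

Function: uppercasep (char)
Returns true if char is an uppercase character.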
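
The character predicates can be compared side by side by mapping them over the characters of a short string. This is a sketch; the variable name c is chosen only for this illustration.

```maxima
c: charlist("aZ7")$              /* [a, Z, 7] */

map(alphacharp, c);              /* [true, true, false] */
map(digitcharp, c);              /* [false, false, true] */
map(alphanumericp, c);           /* [true, true, true] */
map(lowercasep, c);              /* [true, false, false] */
map(uppercasep, c);              /* [false, true, false] */
```
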

73.4 Functions and Variables for strings

Function: base64 (string)
Returns a string containing the base64 representation of string.
If string contains umlauts or eszett (ß), the result depends on the platform in use. Applying base64_decode, however, always converts it back to the original string.

Example:

(%i1) base64 : base64("foo bar baz");
(%o1) Zm9vIGJhciBiYXo=
(%i2) string : base64_decode(base64);
(%o2) foo bar baz
(%i3) base64_decode(base64("äöü"));
(%o3) äöü

Function: base64_decode (base64-string)
Decodes the base64-encoded string base64-string back into the original string.
Example: See base64.

Function: charat (string, n)
Returns the n-th character of string. The first character in string is returned with n = 1.

Example:

(%i1) charat("Lisp", 1);
(%o1) L

Function: charlist (string)
Returns the list of all characters in string.

Example:

(%i1) charlist("Lisp");
(%o1) [L, i, s, p]
(%i2) %[1];
(%o2) L

Function: eval_string (str)
Parses the string str as a Maxima expression and evaluates it. The string
str may or may not have a terminator (dollar sign $ or semicolon ;).
Only the first expression is parsed and evaluated if there is more
than one.
An error is signaled if str is not a string.
See also parse_string.

Examples:

(%i1) eval_string ("foo: 42; bar: foo^2 + baz");
(%o1) 42
(%i2) eval_string ("(foo: 42, bar: foo^2 + baz)");
(%o2) baz + 1764

Function: md5sum (string)
Returns the md5 checksum of string as a string. To parse the return value into an integer, set the input base to 16 and prepend a zero to the string.

Example:

(%i1) string : md5sum("foo bar baz");
(%o1) ab07acbb1e496801937adfa772424bf7
(%i2) ibase : obase : 16.$
(%i3) integer : parse_string(sconcat(0, string));
(%o3) 0ab07acbb1e496801937adfa772424bf7

Function: parse_string (str)
Parses the string str as a Maxima expression (without evaluating it). The
string str may or may not have a terminator (dollar sign $ or
semicolon ;). Only the first expression is parsed if there is more
than one.
An error is signaled if str is not a string.
See also eval_string.

Examples:

(%i1) parse_string ("foo: 42; bar: foo^2 + baz");
(%o1) foo : 42
(%i2) parse_string ("(foo: 42, bar: foo^2 + baz)");
                          2
(%o2) (foo : 42, bar : foo  + baz)
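
Function: scopy (string)
Returns a copy of string as a new string.

Function: sdowncase (string)
Function: sdowncase (string, start)
Function: sdowncase (string, start, end)
Like supcase, but uppercase characters are converted to lowercase.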
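
A small sketch of sdowncase, assuming it takes the same optional start and end positions as supcase:

```maxima
/* Without positions the whole string is converted. */
sdowncase("ENGLISH");          /* english */

/* Starting at position 2 leaves the first character untouched. */
sdowncase("ENGLISH", 2);       /* English */
```
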
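
Function: sequal (string_1, string_2)
Returns true if string_1 and string_2 are the same length
and contain the same characters.

Function: sequalignore (string_1, string_2)
Like sequal but ignores case.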
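
The difference between the two comparisons shows up as soon as the strings differ only in case, as in this short sketch:

```maxima
sequal("Lisp", "Lisp");          /* true */
sequal("Lisp", "LISP");          /* false */
sequalignore("Lisp", "LISP");    /* true */
```
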

Function: sexplode (string)
sexplode is an alias for function charlist.

Function: simplode (list)
Function: simplode (list, delim)
simplode takes a list of expressions and concatenates them into a string.
If no delimiter delim is specified, simplode uses no delimiter.
delim can be any string.

Examples:

(%i1) simplode(["xx[",3,"]:",expand((x+y)^3)]);
(%o1) xx[3]:y^3+3*x*y^2+3*x^2*y+x^3
(%i2) simplode( sexplode("stars")," * " );
(%o2) s * t * a * r * s
(%i3) simplode( ["One","more","coffee."]," " );
(%o3) One more coffee.

Function: sinsert (seq, string, pos)
Returns a string that is a concatenation of substring (string, 1, pos)
(the first pos - 1 characters of string), the string seq and
substring (string, pos).
Note that the first character in string is in position 1.

Examples:

(%i1) s: "A submarine."$
(%i2) concat( substring(s,1,3),"yellow ",substring(s,3) );
(%o2) A yellow submarine.
(%i3) sinsert("hollow ",s,3);
(%o3) A hollow submarine.
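
Function: sinvertcase (string)
Function: sinvertcase (string, start)
Function: sinvertcase (string, start, end)
Returns string except that the case of each character from position start to end is inverted. If end is not given, all characters from start to the end of string are inverted.

Example:

(%i1) sinvertcase("sInvertCase");
(%o1) SiNVERTcASE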

Function: slength (string)
Returns the number of characters in string.

Function: smake (num, char)
Returns a new string consisting of num copies of the character char.

Example:

(%i1) smake(3,"w");
(%o1) www
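
slength and smake combine naturally, for instance when building a separator of a given width. A minimal sketch, with the variable name rule chosen only for illustration:

```maxima
slength("Maxima");          /* 6 */

rule: smake(10, "-")$       /* a string of ten hyphens */
slength(rule);              /* 10 */
```
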
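
Function: smismatch (string_1, string_2)
Function: smismatch (string_1, string_2, test)
Returns the position of the first character of string_1 at which
string_1 and string_2 differ, or false. The default test function
for matching is sequal.
If smismatch should ignore case, use sequalignore as test.

Example:

(%i1) smismatch("seven","seventh");
(%o1) 6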

Function: split (string)
Function: split (string, delim)
Function: split (string, delim, multiple)
Returns the list of all tokens in string.
Each token is an unparsed string.
split uses delim as delimiter.
If delim is not given, the space character is the default delimiter.
multiple is a boolean variable that is true by default.
In that case multiple consecutive delimiters are read as one.
This is useful if tabs are saved as multiple space characters.
If multiple is set to false, each delimiter is noted.

Examples:

(%i1) split("1.2 2.3 3.4 4.5");
(%o1) [1.2, 2.3, 3.4, 4.5]
(%i2) split("first;;third;fourth",";",false);
(%o2) [first, , third, fourth]

Function: sposition (char, string)
Returns the position of the first character in string which matches
char. The first character in string is in position 1.
For matching characters ignoring case see ssearch.
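
A short sketch of sposition; note that only the first matching position is reported:

```maxima
sposition("x", "Maxima");      /* 3 */

/* "a" occurs at positions 2 and 6; the first one is returned. */
sposition("a", "Maxima");      /* 2 */
```
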
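
Function: sremove (seq, string)
Function: sremove (seq, string, test)
Function: sremove (seq, string, test, start)
Function: sremove (seq, string, test, start, end)
Returns a string like string but without all substrings matching
seq. The default test function for matching is sequal.
If sremove should ignore case while searching for seq, use
sequalignore as test. Use start and end to limit searching.
Note that the first character in string is in position 1.

Examples:

(%i1) sremove("n't","I don't like coffee.");
(%o1) I do like coffee.
(%i2) sremove ("DO ",%,'sequalignore);
(%o2) I like coffee.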
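
Function: sremovefirst (seq, string)
Function: sremovefirst (seq, string, test)
Function: sremovefirst (seq, string, test, start)
Function: sremovefirst (seq, string, test, start, end)
Like sremove except that only the first substring that matches seq
is removed.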
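
The two removal functions can be compared on a string in which seq occurs twice. A minimal sketch, with the variable name s chosen only for this illustration:

```maxima
s: "I don't like coffee. I don't like tea."$

/* sremove deletes every occurrence of the sequence ... */
sremove("n't", s);         /* I do like coffee. I do like tea. */

/* ... while sremovefirst deletes only the first one. */
sremovefirst("n't", s);    /* I do like coffee. I don't like tea. */
```
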

Function: sreverse (string)
Returns a string with all the characters of string in reverse order.
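
As a sketch of one possible use, sreverse together with sequal gives a simple palindrome check:

```maxima
sreverse("Lisp");                       /* psiL */

/* A palindrome equals its own reversal. */
sequal(sreverse("level"), "level");     /* true */
```
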
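
Function: ssearch (seq, string)
Function: ssearch (seq, string, test)
Function: ssearch (seq, string, test, start)
Function: ssearch (seq, string, test, start, end)
Returns the position of the first substring of string that matches the
string seq. The default test function for matching is sequal.
If ssearch should ignore case, use sequalignore as test. Use
start and end to limit searching. Note that the first character in
string is in position 1.

Example:

(%i1) ssearch("~s","~{~S ~}~%",'sequalignore);
(%o1) 4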

Function: ssort (string)
Function: ssort (string, test)
Returns a string that contains all characters from string in an order such
that there are no two successive characters c and d for which
test (c, d) is false and test (d, c) is true.
The default test function for sorting is clessp.
The set of test functions is
{clessp, clesspignore, cgreaterp, cgreaterpignore, cequal, cequalignore}.

Example:

(%i1) ssort("I don't like Mondays.");
(%o1) '.IMaddeiklnnoosty
(%i2) ssort("I don't like Mondays.",'cgreaterpignore);
(%o2) ytsoonnMlkIiedda.'

Function: ssubst (new, old, string)
Function: ssubst (new, old, string, test)
Function: ssubst (new, old, string, test, start)
Function: ssubst (new, old, string, test, start, end)
Returns a string like string except that all substrings matching old
are replaced by new. old and new need not be of the same
length. The default test function for matching is sequal.
If ssubst should ignore case while searching for old, use
sequalignore as test. Use start and end to limit searching.
Note that the first character in string is in position 1.

Examples:

(%i1) ssubst("like","hate","I hate Thai food. I hate green tea.");
(%o1) I like Thai food. I like green tea.
(%i2) ssubst("Indian","thai",%,'sequalignore,8,12);
(%o2) I like Indian food. I like green tea.
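
Function: ssubstfirst (new, old, string)
Function: ssubstfirst (new, old, string, test)
Function: ssubstfirst (new, old, string, test, start)
Function: ssubstfirst (new, old, string, test, start, end)
Like ssubst except that only the first substring that matches old
is replaced.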
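
Applied to the same sentence, the two substitution functions differ as follows. This is a sketch; the variable name s is chosen only for this illustration.

```maxima
s: "I hate Thai food. I hate green tea."$

/* ssubst replaces every match ... */
ssubst("like", "hate", s);        /* I like Thai food. I like green tea. */

/* ... while ssubstfirst stops after the first one. */
ssubstfirst("like", "hate", s);   /* I like Thai food. I hate green tea. */
```
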

Function: strim (seq, string)
Returns a string like string, but with all characters that appear in seq removed from both ends.

Examples:

(%i1) "/* comment */"$
(%i2) strim(" /*",%);
(%o2) comment
(%i3) slength(%);
(%o3) 7
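
Function: striml (seq, string)
Like strim except that only the left end of string is trimmed.

Function: strimr (seq, string)
Like strim except that only the right end of string is trimmed.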
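
The three trimming functions can be compared on one input. A minimal sketch, in which the sample string and the variable name s are chosen only for illustration:

```maxima
s: "--- title ---"$

/* Characters listed in seq (here: space and hyphen) are trimmed. */
strim(" -", s);       /* title */
striml(" -", s);      /* title --- */
strimr(" -", s);      /* --- title */
```
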

Function: stringp (obj)
Returns true if obj is a string.
See the introduction for an example.

Function: substring (string, start)
Function: substring (string, start, end)
Returns the substring of string beginning at position start and ending at position end. The character at position end is not included. If end is not given, the substring contains the rest of the string. Note that the first character in string is in position 1.

Examples:

(%i1) substring("substring",4);
(%o1) string
(%i2) substring(%,4,6);
(%o2) in

Function: supcase (string)
Function: supcase (string, start)
Function: supcase (string, start, end)
Returns string except that lowercase characters from position start to end are replaced by the corresponding uppercase ones. If end is not given, all lowercase characters from start to the end of string are replaced.

Example:

(%i1) supcase("english",1,2);
(%o1) English

Function: tokens (string)
Function: tokens (string, test)
Returns a list of tokens, which have been extracted from the argument
string. The tokens are substrings whose characters satisfy a certain
test function. If the argument test is not given, constituent
is used as the default test. The set of test functions is
{constituent, alphacharp, digitcharp, lowercasep, uppercasep, charp,
alphanumericp}.
(The Lisp version of tokens was written by Paul Graham,
ANSI Common Lisp, 1996, page 67.)
Examples:
(%i1) tokens("24 October 2005");
(%o1) [24, October, 2005]
(%i2) tokens("05-10-24",'digitcharp);
(%o2) [05, 10, 24]
(%i3) map(parse_string,%);
(%o3) [5, 10, 24]
This document was generated by Robert Dodier on October 11, 2013 using texi2html 1.76.