13 Text Manipulation

The basic BETA environment defines a Text pattern for manipulating texts. Text constants have been used extensively in the previous examples; this chapter explores more of the facilities of the text concept. Constant texts can be assigned to text variables, and texts can be appended or prepended to other texts:

(# t: @text; (* declare t as a static ref. to a text object *)
   r: ^text; (* declare r as a dynamic ref. to a text object *)
   i: @integer;
do 'foo' -> t;       (* assign a constant to t = 'foo'   *)
   ' ' -> t.append;  (* append  one blank to t = 'foo '  *)
   ' ' -> t.prepend; (* prepend one blank to t = ' foo ' *)
   t.length -> i;    (* assign the length of t to i (5)  *)
   (2,4) -> t.sub -> r[]; (* get substring 'foo' from t *)
#)

Users need not worry about extending a text when adding to it or manipulating it: the length of the text object is adjusted automatically. Many operations on texts use a current position in the text (t.pos). For example:

(# t: @text;
do 'foo'->t; (* sets pos to t.length *)
   'bar'->t.puttext; (* adds 'bar' after current pos: t='foobar'*)
   0->t.pos; (* move to the start of t *)
   'bar'->t.puttext; (* t = 'barbar' *)
#)

Substrings of a text can be fetched and assigned to a text reference, and texts can be inserted at a specified position:

(# t: @text;   (* declare t as a static ref.  to a text object *)
   r: ^text;   (* declare r as a dynamic ref. to a text object *)
do ' foo ' -> t;          (* assign a constant to t = ' foo ' *)
   (2,4) -> t.sub -> r[]; (* get substring 'foo' from t *)
   ('bar',5) -> t.insert; (* insert substring 'bar' in t = ' foobar ' *)
#)

Texts can be compared using the equal function, or equalNCS when case should be ignored:

(# t: @text;
   b: @boolean;
do ...
   'foo' -> t.equal -> b;    (* case sensitive comparison *)
   'foo' -> t.equalNCS -> b; (* not case sensitive comparison *)
#)
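
As a concrete illustration of the difference, the sketch below compares a mixed-case text with a lower-case constant. It uses only equal and equalNCS as introduced above; the example string is made up, and the values in the comments follow from the case sensitivity described there:

```beta
(# t: @text;
   b: @boolean;
do 'Foo' -> t;
   'foo' -> t.equal -> b;    (* b = false: 'F' and 'f' differ *)
   'foo' -> t.equalNCS -> b; (* b = true: case is ignored *)
#)
```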

The following example program is an extended version of the character counting programs constructed earlier. The program can count either the characters or the lines in the input file. In addition to text comparison, the program uses two new features:

getline: reads a line from the input, i.e. what the user types. It waits until the user has typed a newline.

ascii.newline: ascii is an object defined in betaenv containing attributes for manipulating and comparing ASCII characters. newline is a generic definition of the newline character. ascii also contains conversion functions, e.g. toLower, and definitions of character classes such as white space, e.g. isWhiteSpace.

Program 16: FileCount.bet

ORIGIN '~beta/basiclib/file'
---- program: descriptor ----

(# (* ----------------------------------------------
    *   FileCount.bet: Simple file handling program
    *                  -Counting lines/characters-
    * ----------------------------------------------*)
   
   inFile: @file;
   Ch: @char;
   nc: @integer;
   answer: ^text;
   lines, chars: @Boolean;
do
   (if NoofArguments 
    // 2 then 
       2->Arguments->inFile.name ;
       inFile.openRead;  (* OPENING *)
       'Count what in \''->Puttext;   inFile.name->PutText;
       '\' (lines/chars)? '->PutText;
       GetLine->answer[]; (* read from keyboard: what the user types *)
       (if true
        //('lines'->answer.equal) then true->lines;
        //('chars'->answer.equal) then true->chars;
        else 
           'Unknown input'->PutLine;
           Stop; (* end execution *)
       if);
       Loop: 
         (if inFile.eos//false then
             inFile.Get->Ch; 
             (if true
              //lines then (if Ch//ascii.newline then nc + 1->nc if);
              //chars then nc + 1->nc;
             if);
             restart Loop
         if);
       NewLine;
       nc->PutInt; 
       (if true
        //lines then ' lines '->PutText;
        //chars then ' characters '->PutText;
       if);
       'in file \''->Puttext;
       inFile.name->PutText;
       '\'\n\n'->PutText;
       inFile.close;
    else 
       'Missing Argument'->putline;
   if)
#)

Running FileCount on itself gives the following output:

nil% FileCount FileCount.bet
Count what in 'FileCount.bet' (lines/chars)? lines

46 lines in file 'FileCount.bet'

nil% FileCount FileCount.bet
Count what in 'FileCount.bet' (lines/chars)? chars 

1238 characters in file 'FileCount.bet'

Finally, the table below lists some of the useful attributes of texts:

Attribute                   Description
t.length                    Returns the number of characters in t
t.pos                       Returns the current position
t.empty -> b                Returns true if t is empty
t.clear                     Sets the length of t to zero
c -> t.put                  Appends the character c to t
t.get -> c                  Returns the character at the current position, and increments the position by 1
t.peek -> c                 Returns the character at the current position, without updating the position
r[] -> t.puttext            Adds r to t starting at the current position
r[] -> t.prepend            Prepends the text r to t
r[] -> t.append             Appends the text r to t
i -> t.putint               Inserts the integer i into t starting at the current position
t.getint -> i               Reads the next integer from t starting at the current position
t.getAtom -> r[]            Reads characters until the next white-space and returns the text
t.getLine -> r[]            Reads characters from t until the next newline and returns that text
i -> t.inxget -> c          Returns the character at position i
(c,i) -> t.inxput           Replaces the character at position i with c
t.copy -> r[]               Returns a copy of t
r[]->(t.copy).append->s[]   Returns s[] where s = t cat r [3]
r[]->(t.copy).prepend->s[]  Returns s[] where s = r cat t
t.scanAtom(# do ... #)      Scans from the current position until the next white-space and calls INNER for each char
t.scanAll(# do ... #)       Scans all the elements in t and calls INNER for each char
(i,j) -> t.sub -> r[]       Returns the text from position i to position j of t
(i,j) -> t.delete           Deletes the characters in the range i:j
r[] -> t.less               Tests whether r is less than t. Lexicographic ordering is used
r[] -> t.greater            Tests whether r is greater than t. Lexicographic ordering is used
t.makeLC                    Converts all characters to lower case
t.makeUC                    Converts all characters to upper case
c -> t.findAll(# do ... #)  Calls INNER for each occurrence of c in t
t.EOSerror                  Called when reading past the length of the text

Please see the basic libraries manual [MIA 90-8] for more details about the text concept.
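
A minimal sketch combining a few of the attributes from the table: it counts the blanks in a text with inxget and then converts the text to upper case with makeUC. The example string and the variable names are made up for this illustration. It is assumed here that positions passed to inxget run from 1 to t.length, and that putline accepts a text reference in the same way as the text constants passed to it in the program above:

```beta
(# t: @text;
   n: @integer;
do 'the quick brown fox' -> t;
   (* visit every position of t and count the blanks *)
   (for i: t.length repeat
        (if (i -> t.inxget) // ' ' then n + 1 -> n if)
   for);
   t.makeUC;        (* t = 'THE QUICK BROWN FOX' *)
   t[] -> putline;  (* assumption: putline takes a text reference *)
   n -> putint;     (* n = 3 under the indexing assumption above *)
   newline;
#)
```

For long texts, scanAll from the table is probably the more natural choice than an explicit index loop; the loop is used here only because it relies on nothing beyond the attributes described in this chapter.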


[3] Actually this is an example of how to combine patterns that exit references. Append is called on the reference returned by copy. This facility is called computed remote.
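
A small sketch of the computed remote described in footnote [3], using only copy and append as listed in the table. The variable names are illustrative, and the values in the comments assume the semantics stated in the table (s = t cat r), with t left unchanged because append is applied to the copy rather than to t itself:

```beta
(# t, r: @text;
   s: ^text;
do 'foo' -> t;
   'bar' -> r;
   (* t.copy is evaluated first; append is then called on its result *)
   r[] -> (t.copy).append -> s[]; (* s = 'foobar' *)
   (* t is still 'foo': only the copy was extended *)
#)
```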


Libraries Tutorial
© 1994-2004 Mjølner Informatics
[Modified: Thursday January 16th 2003 at 10:23]