I need to write a simple scanner for ASCII files. The scanner should recognize (find) three kinds of tokens in the input file: strings of spaces, single new lines, and words (a word is defined as any sequence of characters not including spaces or new lines).
Everything in the input file can be broken down into sequences/tokens of one of these three types.
I need to write a driver (a main() function) that repeatedly calls a function called getToken(). getToken() reads the next token in the input file and returns to main() the token type and the actual string (lexeme) of the token.
For each call to getToken() (that is, for each token returned), main() outputs the token type (a unique integer is OK to represent the type) and the string (except for new lines, since printing them causes problems). main() also outputs the position in the input file for each token: the line number and the character position on that line where the token begins.
The logic overwhelms me and I can't think it through. Help!
2006-10-03 10:38:04 · 1 answer · asked by Anonymous in Computers & Internet ➔ Programming & Design
You will need to use a concept called look ahead, because for space strings or words you will not know where the end of the token is until you read the first character past its last character. So this look ahead character must be saved and used as the first input character the next time that getToken() is called.
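Use: ifstreamName.unget(); (ifstreamName stands for whatever your std::ifstream variable is called). This puts the most recently read character back on the stream, so the next read returns it again.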
Output looks like this:
tokenType=3 str=Hello on line #1 at charpos=1
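One possible way to produce lines in that format is to have main() keep a running line number and character position, and advance them after each token is reported. The sketch below assumes 1-based positions (as in the sample line above) and uses a hypothetical Position struct and report() helper; neither name comes from the assignment.

```cpp
#include <string>

// Hypothetical running position of the next token, 1-based.
struct Position {
    int line = 1;
    int col = 1;
};

// Builds the output line for a token that starts at `pos`, then advances
// `pos` past the token. The lexeme of a new-line token is not printed.
std::string report(int tokenType, const std::string& lexeme, Position& pos) {
    const bool isNewline = (lexeme == "\n");

    std::string out = "tokenType=" + std::to_string(tokenType) +
                      " str=" + (isNewline ? std::string() : lexeme) +
                      " on line #" + std::to_string(pos.line) +
                      " at charpos=" + std::to_string(pos.col);

    if (isNewline) {
        ++pos.line;                      // next token starts on the following line
        pos.col = 1;
    } else {
        pos.col += static_cast<int>(lexeme.size());
    }
    return out;
}
```

main() would create one Position before its loop and pass each token's type and lexeme to report(), printing the returned string.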
2006-10-03 10:39:41 · update #1