Fill This Form To Receive Instant Help
Homework answers / question archive / LAB 9 - Parser COMP-232 Programming Languages Read everything before doing anything
LAB 9 - Parser
COMP-232 Programming Languages
Read everything before doing anything. In particular, read the submission checklist before you start editing any of the provided code.
Consider the following grammar for a simple programming language:
<program> ::= <statement> <EOF> | <statement> <program>
<statement> ::= <assignStmt> | <repeatStmt> | <printStmt>
<assignStmt> ::= <id> = <expr> ;
<repeatStmt> ::= repeat ( <expr> ) <statement>
<printStmt> ::= print ( <expr> ) ;
<expr> ::= <term> | <term> <addop> <expr>
<term> ::= <factor> | <factor> <multop> <term>
<factor> ::= <id> | <number> | <addop> <factor> | ( <expr> )
<number> ::= <int> | <float>
<int> ::= <digit> | <int> <digit>
<float> ::= <digit>. | <digit> <float> | <float> <digit>
<id> ::= <letter> | <id> <letter> | <id> <digit>
<addop> ::= + | -
<multop> ::= * | / | %
<digit> ::= 0-9
<letter> ::= a-z | A-Z | _ | $
Refer input.txt in the lab download at /home/LAB9 on comp232.com for a sample of statements from the grammar above.
Next, look through scan.h; the scanner implemented in scan.h and scan.c is complete, but you will need to use the tokens it returns. Read the TOKEN struct and its referenced TOKEN_VALUE union and TOKEN_TYPE enum; make sure you understand which TOKEN_TYPEs will populate which TOKEN_VALUE members; if you're uncertain don't hesitate to ask!
Note the addition of an ungottenToken to the scanner, and corresponding ungetToken function. These serve the same purpose in parsing that ungetc served in lexing: when a token is lexed by the scanner and it is determined to not be part of the production being performed it should be "put back" with ungetToken, so it can be used as part of the next production.
For instance, consider the production:
<expr> ::= <term> | <term> <addop> <expr>
When an expression is parsed, it starts with a term. That term might be followed by an addition-tier operation and then a (smaller) expression, but it might not be! The way to find out is to first parse the leading term, and then check: is the next token is an addition operation? If it is, then the second option in the production above should be carried out, but if it is not an addition operation then the expression being produced is comprised entirely of the leading term, and the token read after that term is not part of the expression. In this case, ungetToken should be called on the extra token (which is likely closing parenthesis, a multiplication operation, or a semicolon).
Once you are sure you understand how the statements in input.txt fit into the grammar above, how the TOKEN struct is set up, and what purpose ungetToken serves, you're ready to move on.
Before starting task 1, note that there are multiple enums with similarly-named elements. Be careful not to confuse TOKEN_TYPEs and NODE_TYPEs. You will be making nodes in an abstract syntax tree and assigning them types from the TOKEN_TYPE enum which can lead to issues. For example, the ASSIGNMENT_TOKEN and the ASSIGN_STMT_NODE elements are very different. Assigning a node which should be an ASSIGN_STMT_NODE the similarly named ASSIGNMENT_TOKEN type will lead to errors.
Next, open parse.h. This header file contains definitions for a NUMBER struct and a NODE struct.
The NUMBER struct is to store numerical values, which in the grammar above are either ints or floats (stored in long and double form respectively in the NUMBER struct). The NODE struct defines an individual node in an abstract syntax tree. The goal of parsing is to use the token's output by the scanner to build an abstract syntax tree which adheres to the grammar.
You will be implementing a recursive descent parser. The syntax tree set up by this particular recursive descent parser will be a binary tree; each node will have at most 2 children (named leftChild and rightChild). For example, if a node is of type ASSIGN_STMT_NODE, its leftChild will be the IDENT_NODE representing the variable being assigned value and its rightChild will be the EXPR_NODE representing the expression whose value is to be stored in the variable.
parse.h also includes these function declarations:
NODE *program();
NODE *statement();
NODE *assignStmt(TOKEN **currToken);
NODE *repeatStmt(TOKEN **currToken);
NODE *printStmt(TOKEN **currToken);
NODE *expr(TOKEN **currToken);
NODE *term(TOKEN **currToken);
NODE *factor(TOKEN **currToken);
NODE *ident(TOKEN **currToken);
NODE *number(TOKEN **currToken);
The names of these functions coincide with the productions (above the token level) in the grammar. This is not a coincidence. Each function above will perform the corresponding production. Your first task is to complete the definitions of these functions in parse.c. The program and statement functions have already been completed as examples.
Before you start, note the functions freeToken and getNextToken. As you'd expect freeToken frees the space allocated by the input token. getNextToken takes as an argument the previous token (which is freed) and then returns the next token. Every token needs to be freed at some point; if a token is ever passed into a getNextToken call, that will free it, so freeToken does not need to be called on it again.
Similarly, if a token is passed into one of the production functions above, that production will "consume" (i.e. parse and free) the token. If, however, a token is never passed into a getNextToken call or a production function, then it should be freed manually with freeToken. This will happen with trailing tokens, such as semicolons, right parenthesis, etc.
It is often necessary to use the leading token to decide which productions to call. See the completed statement function; if the first token is an IDENT_TOKEN then the statement being produced must be an assignment statement, as this is the only type of statement which starts with an IDENT_TOKEN. Similarly, if the first token is a REPEAT_TOKEN then the statement must be a repeat statement, and so on.
There are tokens which do not need to be stored in the syntax tree, but they are still necessary for valid parsing. For instance, the ASSIGNMENT_TOKEN, which is the second token in an assignment statement, is not stored in the syntax tree.
Your parser must still get the token and ensure that it is an assignment token, and throw an error (see the error function in parse.c) if it is not the correct type of token. This will also often be the case for parenthesis and semicolons. To be clear, these tokens are not redundant or useless; their purpose is syntactic, and without them parsing could not be done without ambiguity, but once they've been used for parsing their purpose has been fulfilled and their usefulness exhausted.
Productions (other than program and statement, which serve as "entry points") should have the first token in the production passed in as an argument, so they should not have to call getNextToken to get the first token in the production. Similarly, when a production is complete, the last token that it used should be freed unless it is passed into another production, as discussed above. In other words, if the token gotten from the last call to getNextToken in a given production is not passed into another production, it should be freed using freeToken.
You may use the gdb debugger to check your parse.c before continuing. However, you may find it easier to instead complete the next task (which prints the parse tree) and then debug the two together.
Keep in mind that we're building a recursive setup here, so these production functions will be calling each other constantly, but none of the functions should be very complex for this reason. Consider parsing an expression; the expression starts with a term, for sure, so our first step is to parse a term. How do we parse a term? Who cares, we have another function for that, so we can just call that other function to parse the leading term (and trust our past/future selves to fill out that function correctly). If you're creating a node, and you find yourself accessing at the data contained in its children, you're overcomplicating things.
Check out print.h and print.c. These declare and define functions to recursively print the parse tree constructed in the previous task.
They should be somewhat self-explanatory, the structure is nearly identical to that of parsing.
The main difference is the extra indent argument. It exists solely for aesthetic purposes; it determines how far the printed node will be indented. To adhere to the sample output below, sub-productions should be indented 1 further than their parent productions, so you'll often find yourself passing indent+1 into your print functions.
Like the previous task, printProgram and printStatement have been completed for you. You should follow the example in printStatement to decide what to print. Every node other than the program node should print start and end messages like the statement node does.
The contents of input.txt are:
firstVar = 100; print(firstVar); second2var = 0.15; print(second2var); repeat (2)
firstVar = firstVar + 1; print(+firstVar); repeat(10) print(-101.725);
When the parse_print_tree executable is run with the above input.txt as an argument, it should produce the result below. You may find it useful to go through the input above, statement by statement, and compare to the output below before starting.
Done parsing...
START STATEMENT
| START ASSIGN STATEMENT
| | START IDENT
| | | firstVar | END IDENT
100 END ASSIGN STATEMENT END STATEMENT
START STATEMENT
| START PRINT STATEMENT
| | START EXPRESSION
| | | START TERM
| | | | START FACTOR
| | | | | START IDENT
| | | | | | firstVar
| | | | | END IDENT
| | | | END FACTOR
| | | END TERM
| | END EXPRESSION
| END PRINT STATEMENT
END STATEMENT
START STATEMENT
| START ASSIGN STATEMENT
| | START IDENT
| | | second2var
| | END IDENT
| | START EXPRESSION
| | | START TERM
| | | | START FACTOR
| | | | | START NUMBER
| | | | | | FLOAT : 0.150000
| | | | | END NUMBER
| | | | END FACTOR
| | | END TERM
| | END EXPRESSION
| END ASSIGN STATEMENT
END STATEMENT
START STATEMENT
| START PRINT STATEMENT
| | START EXPRESSION
| | | START TERM
| | | | START FACTOR
| | | | | START IDENT
| | | | | | second2var
| | | | | END IDENT
| | | | END FACTOR
| | | END TERM
| | END EXPRESSION
| END PRINT STATEMENT END STATEMENT
START STATEMENT
START REPEAT STATEMENT
2
| START STATEMENT
| | | START ASSIGN STATEMENT
| | | | START IDENT
| | | | | firstVar
| | | | END IDENT
| | | | START EXPRESSION
| | | | | START TERM
| | | | | | START FACTOR
| | | | | | | START IDENT
| | | | | | | | firstVar
| | | | | | | END IDENT
| | | | | | END FACTOR
| | | | | END TERM
| | | | <ADDOP +>
| | | | | START EXPRESSION
| | | | | | START TERM
| | | | | | | START FACTOR
| | | | | | | | START NUMBER
| | | | | | | | | INT : 1
| | | | | | | | END NUMBER
| | | | | | | END FACTOR
| | | | | | END TERM
| | | | | END EXPRESSION
| | | | END EXPRESSION
| | | END ASSIGN STATEMENT
| | END STATEMENT
| END REPEAT STATEMENT
END STATEMENT
START STATEMENT
| START PRINT STATEMENT
| | START EXPRESSION
| | | START TERM
| | | | START FACTOR
| | | | | <ADDOP +>
| | | | | START FACTOR
| | | | | | START IDENT
| | | | | | | firstVar
| | | | | | END IDENT
| | | | | END FACTOR
| | | | END FACTOR
| | | END TERM
| | END EXPRESSION
| END PRINT STATEMENT END STATEMENT
START STATEMENT
START REPEAT STATEMENT
10
| START STATEMENT
| | | START PRINT STATEMENT
| | | | START EXPRESSION
| | | | | START TERM
| | | | | | START FACTOR
| | | | | | | <ADDOP ->
| | | | | | | START FACTOR
| | | | | | | | START NUMBER
| | | | | | | | | FLOAT : 101.725000
| | | | | | | | END NUMBER
| | | | | | | END FACTOR
| | | | | | END FACTOR
| | | | | END TERM
| | | | END EXPRESSION
| | | END PRINT STATEMENT
| | END STATEMENT
| END REPEAT STATEMENT
END STATEMENT
Process finished with exit code 0
The vertical bars marking indentation levels are handled by the printfIndent function defined at the bottom of print.c. If you feel like going down a rabbit hole completely unrelated to the lab, explore how the printfIndented function works to use varargs in a format string like printf does.
Much like the previous task, we are building a recursive set of functions, and that should keep the individual functions somewhat simple; if you find yourself accessing a node's childrens' data in a print function, you should probably be calling one of the other print functions instead.
When you've defined every function in print.c, edit the run configurations for the parse_print_tree executable to use a path from within the cmake-build-debug folder to input.txt as the first argument, and then run it!
You should fully debug tasks 1 and 2 before moving on to task 3.
Finally, we can evaluate the completed parse tree to perform the actions specified in input.txt.
Open eval.h and eval.c. In eval.h, see the following function declarations:
void evalProgram(NODE *node); void evalStatement(NODE *node); void evalAssignStmt(NODE *node); void evalRepeatStmt(NODE *node); void evalPrintStmt(NODE *node); NUMBER evalExpr(NODE *node);
NUMBER evalTerm(NODE *node);
NUMBER evalFactor(NODE *node);
NUMBER evalNumber(NODE *node);
NUMBER evalId(NODE *node);
NUMBER evalOperation(NUMBER operand1, NUMBER operand2, char op);
These functions will coincide with those implemented in parse.c and print.c; they will traverse the parse tree recursively. This time, the goal is neither to build the tree nor to print it, but to perform the actions specified by the parsed statements. That is, in this task we are actually executing the commands in input.txt. When this task is complete we will have a fully functional interpreter! Just don't try to give it any tokens with more than 128 characters ?.
The intended functionality of assignment, print, and repeat statements is likely intuitive, but we will define them anyway:
The remaining functions are for the evaluation of expressions and their contained terms, factors, numbers, and variables. Pay attention to the types specified for the functions; evalExpr is of type NUMBER because the result of an evaluated expression is a NUMBER. evalAssignStmt, on the other hand, is of type void because the function of an assignment statement is to do something (specifically, assign the value of an expression to a variable) and not to return something.
Note that evalId is for accessing the value of a variable when an ident node is the child of a factor. evalId should not be used when an ident node is the child of an assignment node, as in this case the goal is to assign a value to the variable, not find a previously assigned value. If evalId is called and there is not variable with the specified identifier, return the NAN_NUMBER. NAN_NUMBER is defined at the top of eval.c, and is an instance of the NUMBER struct of type FLOAT_TYPE with value NAN (short for "not a number").
There is also a set of functions which doesn't have an equivalent in the previous tasks: evalOperation and its referenced helper functions. This function takes in two NUMBERs and a character + - * / % representing and operation, and simply calls the corresponding evaluation function evalAdd, evalSub, evalMult, evalDiv, or evalMod. You will need to fill out these evaluation functions.
If both of an operation's inputs are numbers of type INT_TYPE, then the result should be an INT_TYPE. If, on the other hand, either of the operands is FLOAT_TYPE then the output should be a FLOAT_TYPE (as is the convention in most languages). When the modulus % operation is used with floats, it should return the result of the fmod function from math.h (check out the manual entry). When division is done with integers, it should perform floor division.
Once again: recursive setup --> somewhat simple functions. If you're looking at childrens' data instead of just passing the children into a different function, you're working too hard.
When you're ready to run, edit the configurations for the parse_eval executable to include a path to input.txt as an argument, and run!
With the provided contents of input.txt, the output should be:
Done parsing...
INT : 100
FLOAT : 0.150000
INT : 102
FLOAT : -101.725000
FLOAT : -101.725000
FLOAT : -101.725000
FLOAT : -101.725000
FLOAT : -101.725000
FLOAT : -101.725000
FLOAT : -101.725000
FLOAT : -101.725000
FLOAT : -101.725000
FLOAT : -101.725000
Process finished with exit code 0
In your submission:
(Leak_PossiblyLost is ok).
Disclaimer on "matching the sample runs": I change these labs every semester; if I've made an obvious mistake like, say, changing an identifier or a numeric value in the input without updating the sample run, then please inform me and don't worry about matching my errors. I believe I've found them all, but I always believe that and it's never true.
firstvar = 1; secondvar = 2; repeat (10) thirdvar = 2 * (firstvar % secondvar) / (firstvar + 2);; repeat (firstvar + 2 * secondvar) repeat (thirdvar) print -firstvar;;;
Done parsing...
=> START program
=> START statement
=> START assignment
=> START identifier
<id> firstvar
=> END identifier
=> START expression
=> START term
=> START factor
=> START number
<number> 1.000000
=> END number
=> END factor
=> END term
=> END expression
=> END assignment
=> END statement => START program
=> START statement
=> START assignment
=> START identifier
<id> secondvar
=> END identifier
=> START expression
=> START term
=> START factor
=> START number
<number> 2.000000
=> END number
=> END factor
=> END term
=> END expression
=> END assignment
=> END statement => START program
=> START statement
=> START repeat
=> START expression
=> START term
=> START factor
=> START number <number> 10.000000
=> END number
=> END factor
=> END term
=> END expression
=> START statement
=> START assignment
=> START identifier
<id> thirdvar
=> END identifier
=> START expression
=> START term
=> START factor
=> START number
<number> 2.000000
=> END number
=> END factor
<times>
=> START expression
=> START term
=> START factor
=> START expression
=> START term
=> START factor
=> START identifier
<id> firstvar
=> END identifier
=> END factor
<modulus>
=> START expression
=> START term
=> START factor
=> START identifier
<id> secondvar
=> END identifier
=> END factor
=> END term
=> END expression
=> END term
=> END expression
=> END factor
<divide>
=> START expression
=> START term
=> START factor
=> START expression
=> START term
=> START factor
=> START identifier
<id> firstvar
=> END identifier
=> END factor
=> END term
<plus>
=> START expression
=> START term
=> START factor
=> START number
<number> 2.000000
=> END number
=> END factor
=> END term
=> END expression
=> END expression
=> END factor
=> END term
=> END expression
=> END term
=> END expression
=> END term
=> END expression
=> END assignment
=> END statement
=> END repeat
=> END statement => START program
=> START statement
=> START repeat
=> START expression
=> START term
=> START factor
=> START identifier
<id> firstvar
=> END identifier
=> END factor
=> END term
<plus>
=> START expression => START term
=> START factor
=> START number
<number> 2.000000
=> END number
=> END factor
<times>
=> START expression
=> START term
=> START factor
=> START identifier
<id> secondvar
=> END identifier
=> END factor
=> END term
=> END expression
=> END term
=> END expression
=> END expression
=> START statement
=> START repeat
=> START expression
=> START term
=> START factor
=> START identifier
<id> thirdvar
=> END identifier
=> END factor
=> END term
=> END expression
=> START statement
=> START print
=> START expression
=> START term
=> START factor
<minus>
=> START factor
=> START identifier
<id> firstvar
=> END identifier
=> END factor
=> END factor
=> END term
=> END expression => END print
=> END statement
=> END repeat
=> END statement
=> END repeat
=> END statement
=> START program
=> END program
=> END program
=> END program
=> END program
=> END program
The following files can be downloaded at comp232.com in the /home/LAB9 directory.
#ifndef __EVAL_H
#define __EVAL_H
#include "parser.h"
void printProgram(NODE *node); void printStatement(NODE *node); void printAssignStmt(NODE *node); void printRepeatStmt(NODE *node); void printPrintStmt(NODE *node); void printExpr(NODE *node); void printTerm(NODE *node); void printFactor(NODE *node); void printId(NODE *node);
void printNumber(NODE *node);
#ifndef __PARSER_H
#define __PARSER_H
#include <stdio.h>
#include <stdlib.h>
#include "scanner.h"
typedef enum
{
IDENTIFIER_NODE = 1,
NUMBER_NODE,
EXPR_NODE,
TERM_NODE,
ASSIGN_STMT_NODE,
REPEAT_STMT_NODE,
PRINT_STMT_NODE,
FACTOR_NODE,
PROGRAM_NODE,
STATEMENT_NODE
} NODE_TYPE;
typedef struct node
{
NODE_TYPE type;
union
{
char identifier[128]; double number; char op;
//char addOp;
//char multOp;
} data; struct node *leftNode; struct node *rightNode;
} NODE;
NODE *program();
NODE *statement();
NODE *assignStmt(TOKEN **currToken);
NODE *repeatStmt(TOKEN **currToken);
NODE *printStmt(TOKEN **currToken);
NODE *expr(TOKEN **currToken);
NODE *term(TOKEN **currToken);
NODE *factor(TOKEN **currToken);
NODE *id(TOKEN **currToken);
NODE *number(TOKEN **currToken);
TOKEN *getNextToken(TOKEN **currToken);
void error(char *errorMsg);
#endif
#endif
#ifndef __SCANNER_H
#define __SCANNER_H
#include <stdio.h>
#include <stdlib.h> #include <string.h>
#include <stdbool.h>
typedef enum
{
INVALID_TOKEN = 0,
NUMBER_TOKEN,
IDENT_TOKEN,
ASSIGNMENT_TOKEN,
SEMICOLON_TOKEN,
LPAREN_TOKEN,
RPAREN_TOKEN,
PLUS_TOKEN,
MINUS_TOKEN,
MULT_TOKEN,
DIV_TOKEN,
MOD_TOKEN,
REPEAT_TOKEN,
PRINT_TOKEN,
END_OF_INPUT_TOKEN
} TOKEN_TYPE;
typedef struct token
{
TOKEN_TYPE type; char *strVal;
} TOKEN;
TOKEN *scannerAdHoc();
void ungetToken(TOKEN **);
void freeToken(TOKEN **);
#define BUF_SIZE 128
#define MAX_LINE_LENGTH 256
#endif
#include "eval.h"
void printProgram(NODE *node)
{
printf("=> START program\n"); if (node->leftNode != NULL) printStatement(node->leftNode); if (node->rightNode != NULL) printProgram(node->rightNode);
printf("=> END program\n");
}
void printStatement(NODE *node)
{
// TODO: implement
}
void printAssignStmt(NODE *node)
{
// TODO: implement
}
void printRepeatStmt(NODE *node)
{
// TODO: implement
}
void printPrintStmt(NODE *node)
{
// TODO: implement
}
void printExpr(NODE *node)
{
// TODO: implement
}
void printTerm(NODE *node)
{
// TODO: implement
}
void printFactor(NODE *node)
{
// TODO: implement
}
void printId(NODE *node)
{
// TODO: implement
}
void printNumber(NODE *node)
{
// TODO: implement
}
#include "eval.h"
int main(void)
{
NODE *fullProgram = program(); printf("Done parsing...\n");
printProgram(fullProgram);
exit(0);
}
#include "parser.h"
TOKEN *getNextToken(TOKEN **token)
{
freeToken(token);
return scannerAdHoc();
}
NODE *program()
{
NODE *node = malloc(sizeof(NODE)); node->type = PROGRAM_NODE; node->leftNode = statement();
if (node->leftNode != NULL) node->rightNode = program(); return node;
}
NODE *statement()
{
// TODO: implement
}
NODE *assignStmt(TOKEN **currToken)
{
// TODO: implement
}
NODE *repeatStmt(TOKEN **currToken)
{
// TODO: implement
}
NODE *printStmt(TOKEN **currToken)
{
// TODO: implement
}
NODE *expr(TOKEN **currToken)
{
// TODO: implement
}
NODE *term(TOKEN **currToken)
{
// TODO: implement
}
NODE *factor(TOKEN **currToken)
{
// TODO: implement
}
NODE *id(TOKEN **currToken)
{
// TODO: implement
}
NODE *number(TOKEN **currToken)
{
// TODO: implement
}
void error(char *errorMessage)
{
printf("PARSING ERROR: %s\nExiting...\n", errorMessage); exit(1);
}
#include "scanner.h"
TOKEN *ungottenToken = NULL; // used by parser
//
// return token to the input, so it can be analyzed again
//
void ungetToken(TOKEN **token)
{
ungottenToken = *token;
*token = NULL;
}
//
// clean up the token structure
//
void freeToken(TOKEN **token)
{
if (token == NULL)
return;
if (*token == NULL) return;
if ((*token)->strVal != NULL)
free((*token)->strVal);
free(*token);
*token = NULL;
}
//
// check if a collected sequence of characters is a keyword
//
bool isKeyword(TOKEN *token, char *str)
{
if (strcmp(str, "repeat") == 0) token->type = REPEAT_TOKEN; else if (strcmp(str, "print") == 0)
token->type = PRINT_TOKEN;
else
return false;
return true;
}
//
// scan input for a token using the ad hoc method
//
TOKEN *scannerAdHoc()
{
// reuse any token returned to the input if (ungottenToken != NULL)
{
TOKEN *tok = ungottenToken;
ungottenToken = NULL;
return tok;
}
// temporary buffer for collecting characters char tempString[BUF_SIZE];
// acquire space for the new token
TOKEN *token = (TOKEN *) malloc(sizeof(TOKEN));
token->type = INVALID_TOKEN; // assume that the token is invalid to start with token->strVal = NULL;
// analyze the input character by character figuring out what kind of token they constitute char c; int i;
while ((token->type == INVALID_TOKEN) && ((c = getchar()) != EOF))
{
switch (c)
{
case ' ': // skip whitespace case '\t': case '\n': break;
case 'a'...'z': // c is a letter
i = 0; do
{
tempString[i++] = c; c = getchar();
} while (c >= 'a' && c <= 'z'); tempString[i] = '\0';
ungetc(c, stdin); // the last character read is not a letter, so return it to the input stream if (!isKeyword(token, tempString))
{
token->type = IDENT_TOKEN;
token->strVal = (char *) malloc(strlen(tempString) + 1); strcpy(token->strVal, tempString);
}
return token;
case '0'...'9': // if c is a number i = 0; do
{
tempString[i++] = c; c = getchar();
} while (c >= '0' && c <= '9'); tempString[i] = '\0';
ungetc(c, stdin);
token->type = NUMBER_TOKEN;
// this scanner saves each token as a string and // assumes that the parser will interpret it as needed token->strVal = (char *) malloc(strlen(tempString) + 1); strcpy(token->strVal, tempString);
return token; default:
// save the single character so the parser may distinguish operators if needed token->strVal = malloc(sizeof(char) + 1);
token->strVal[0] = c; token->strVal[1] = '\0';
switch (c)
{ case '=': token->type = ASSIGNMENT_TOKEN;
break; case ';':
token->type = SEMICOLON_TOKEN;
break; case '(':
token->type = LPAREN_TOKEN;
break; case ')':
token->type = RPAREN_TOKEN;
break; case '+':
token->type = PLUS_TOKEN;
break; case '-':
token->type = MINUS_TOKEN;
break; case '*':
token->type = MULT_TOKEN;
break; case '/':
token->type = DIV_TOKEN;
break; case '%':
token->type = MOD_TOKEN;
break;
}
return token;
}
}
if (c == EOF)
{
free(token);
return NULL;
}
return token;
}