The topic of this task is the implementation of a (simplified) Earley parser


The topic of this task is the implementation of a (simplified) Earley parser. The Earley parser is a chart parser which differs from the left-corner parser primarily in the predict function. The complete functions are the same. 

You can recycle the sample code for the exercise “Statistischer Parser” on the webpage of this course. Your parser doesn’t have to compute the best parse tree. The parse probability is sufficient. 

The predict function of the Earley parser is called whenever a dotted rule A → α · Bβ is entered into the chart whose dot has not yet reached the end of the rule. The predict function gets the non-terminal B and the end-position pos of the span as arguments, and looks up all grammar rules with B on the left-hand side of the rule. Each rule is entered into the chart with dot position 0 and start/end position pos (i.e. the span of the rule is still empty). 
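The prediction step described above can be sketched as a standalone function. This is only an illustration: the helper name predicted_items, the dictionary layout of ruleprobs, and the use of tuples for right-hand sides are assumptions, and the function returns the new dotted rules instead of entering them into a chart.

```python
def predicted_items(ruleprobs, symbol, pos):
    """Return one dotted rule per grammar rule with `symbol` on the left-hand side.

    ruleprobs is assumed to map a non-terminal to a list of (rhs, prob) pairs.
    Each result has dot position 0 and the empty span (pos, pos).
    """
    return [((symbol, rhs, 0, pos, pos), prob)
            for rhs, prob in ruleprobs.get(symbol, [])]


# Hypothetical grammar fragment: VP -> VP PP and VP -> v NP, each with probability 0.5.
ruleprobs = {"VP": [(("VP", "PP"), 0.5), (("v", "NP"), 0.5)]}
items = predicted_items(ruleprobs, "VP", 2)
for tup, prob in items:
    print(tup, prob)
```

A symbol without grammar rules (for example a part of speech) simply yields no predictions.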

The parser stores each dotted rule as a tuple tup = (lhs, rhs, dotpos, startpos, endpos) in the chart via the add function. The chart is a list of dictionaries, indexed by end position, which is used as follows: 

self.vitprob[endpos][tup] = prob

Meaning of the variables: 

lhs = left-hand side of the rule
rhs = list of elements on the right-hand side of the rule
dotpos = position of the dot on the right-hand side of the rule
startpos = start position of the rule’s span (index of the first covered word)
endpos = end position of the rule’s span (index of the last covered word + 1)
prob = probability of the dotted rule 
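As a small illustration of this layout, the snippet below builds an empty chart and enters a single dotted rule. The sentence, the rule, and the probability are made up for the example. One assumption to note: the right-hand side is stored as a tuple rather than a list, because a dictionary key has to be hashable.

```python
# Hypothetical three-word sentence, so the chart needs the positions 0..3.
sentence = ["the", "man", "sleeps"]
vitprob = [dict() for _ in range(len(sentence) + 1)]

# Dotted rule NP -> d · N1 covering word 0: the dot sits after one symbol.
# The rhs is a tuple (assumption) so that tup can serve as a dictionary key.
tup = ("NP", ("d", "N1"), 1, 0, 1)
lhs, rhs, dotpos, startpos, endpos = tup
vitprob[endpos][tup] = 0.04  # illustrative probability, not from a real grammar

# The symbol after the dot is the one that predict would be called with.
next_symbol = rhs[dotpos] if dotpos < len(rhs) else None
print(next_symbol)
```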

Please write a Python class Parser with the following methods:

 

Task 1) The constructor method __init__ receives 2 filenames, grammarfile and lexfile, as arguments and calls read_grammar and read_lexicon to read the 2 files.  

Task 2) The read_grammar method receives a grammar filename as argument, reads the rules, and stores them in the data structure self.ruleprobs in such a way that it is easy to look up all grammar rules A → α and their probabilities for a given non-terminal A. The symbol on the left-hand side of the first grammar rule is stored in self.start_symbol. 

Each line of the grammar file contains (i) a probability, (ii) the left-hand side of a grammar rule and (iii) the symbols on the right-hand side of the rule. 

1.0 S NP VP

0.5 VP VP PP

0.5 VP v NP

0.4 NP d N1

...
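One possible way to organise the rule table is a dictionary from the left-hand side to a list of (rhs, prob) pairs. The sketch below is a free function that takes an iterable of lines (an open file works as well) rather than the method with a filename that the task asks for; the layout and the skipping of lines with fewer than three fields are assumptions.

```python
from collections import defaultdict


def read_grammar(lines):
    """Parse lines of the form '<prob> <lhs> <rhs symbols...>'."""
    ruleprobs = defaultdict(list)   # lhs -> [(rhs tuple, prob), ...]
    start_symbol = None
    for line in lines:
        fields = line.split()
        if len(fields) < 3:         # skip blank or incomplete lines (assumption)
            continue
        prob, lhs, rhs = float(fields[0]), fields[1], tuple(fields[2:])
        if start_symbol is None:    # lhs of the first rule is the start symbol
            start_symbol = lhs
        ruleprobs[lhs].append((rhs, prob))
    return ruleprobs, start_symbol


ruleprobs, start_symbol = read_grammar(
    ["1.0 S NP VP", "0.5 VP VP PP", "0.5 VP v NP", "0.4 NP d N1"]
)
print(start_symbol, ruleprobs["VP"])
```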

Task 3) The read_lexicon method reads the lexicon file and stores the rules A → w in the data structure self.lexprobs. The data is stored in such a way that it is easy to retrieve the parts of speech A and the corresponding probabilities for a given word w. 

Each line of the lexicon file contains a probability, a part of speech, and a word: 

0.1 DT the

0.002 N man

...
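Since the lookup goes from a word to its parts of speech, one natural layout is a dictionary keyed by the word. The following sketch again works on an iterable of lines instead of a filename, and the (tag, prob) pair layout is an assumption.

```python
from collections import defaultdict


def read_lexicon(lines):
    """Parse lines of the form '<prob> <part of speech> <word>'."""
    lexprobs = defaultdict(list)    # word -> [(part of speech, prob), ...]
    for line in lines:
        fields = line.split()
        if len(fields) != 3:        # skip blank or malformed lines (assumption)
            continue
        prob, tag, word = float(fields[0]), fields[1], fields[2]
        lexprobs[word].append((tag, prob))
    return lexprobs


lexprobs = read_lexicon(["0.1 DT the", "0.002 N man"])
print(lexprobs["the"], lexprobs["man"])
```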

Task 4) The scan method receives 2 arguments: a word and its position. It looks up each part of speech A of the word in lexprobs and adds the dotted rule A → w · and its probability to the chart. 

Task 5) The predict method receives 2 arguments: a non-terminal A and a position pos. It looks up all rules with left-hand side A and adds them to the chart as described above. 

Task 6) The complete method receives 2 arguments: a dotted-rule tuple and its probability. It performs the complete operation. 
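The task does not spell out the complete operation, so the following is a sketch of the usual formulation for probabilistic chart parsers: a finished constituent B spanning (startpos, endpos) is combined with every dotted rule that ends at startpos and has B directly after the dot, and the two probabilities are multiplied. The function name completions is hypothetical, and it returns the new items rather than passing them on to add.

```python
def completions(chart, tup, prob):
    """Return the (tuple, prob) pairs produced by completing the finished rule `tup`."""
    lhs, rhs, dotpos, startpos, endpos = tup
    results = []
    # Candidate parents are the dotted rules whose span ends where `tup` starts.
    for (l2, r2, d2, s2, e2), p2 in chart[startpos].items():
        if d2 < len(r2) and r2[d2] == lhs:
            # Move the dot over `lhs` and extend the parent's span to endpos.
            results.append(((l2, r2, d2 + 1, s2, endpos), p2 * prob))
    return results


# Hypothetical chart with one waiting rule NP -> d · N1 over the span (0, 1).
chart = [dict() for _ in range(3)]
chart[1][("NP", ("d", "N1"), 1, 0, 1)] = 0.04
finished = ("N1", ("man",), 1, 1, 2)
new_items = completions(chart, finished, 0.5)
print(new_items)
```

In this example the only result is NP → d N1 · over the span (0, 2), with the product of 0.04 and 0.5 as its probability.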

Task 7) The add method receives 2 arguments: a dotted-rule tuple and its probability. It adds the dotted rule to the chart if either the rule is new, or the previously entered rule has a lower probability. Then it calls the complete method if the dot has reached the end of the rule, and the predict method, otherwise. 
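The storage condition of add can be isolated as follows. This is a minimal sketch: store_if_better is a hypothetical helper that only covers the "new or more probable" check and reports whether the chart changed; the follow-up calls to complete or predict are left out.

```python
def store_if_better(chart, tup, prob):
    """Store `tup` if it is new or more probable than before; return True if stored."""
    endpos = tup[4]
    old = chart[endpos].get(tup)
    if old is not None and old >= prob:
        return False            # nothing changed, so nothing needs to be re-processed
    chart[endpos][tup] = prob
    return True


chart = [dict() for _ in range(2)]
tup = ("NP", ("d", "N1"), 1, 0, 1)
first = store_if_better(chart, tup, 0.02)    # new rule
worse = store_if_better(chart, tup, 0.01)    # lower probability
better = store_if_better(chart, tup, 0.04)   # higher probability
print(first, worse, better, chart[1][tup])
```

Returning early when nothing changed is also what keeps left-recursive rules such as VP → VP PP from triggering endless predictions, since the second attempt to enter the same predicted rule is rejected.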

Task 8) Finally, write a method parse which gets a sentence (a list of words) as argument. It calls the method predict with the symbol self.start_symbol and start position 0 as arguments. Then it calls the method scan with each word of the sequence and its position as arguments. Finally, the parse method checks whether a complete analysis of the sentence has been found and outputs its probability.  

Please follow the above instructions exactly and check whether the probabilities are computed correctly. 
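Putting the eight tasks together, one possible shape of the class is sketched below. It is not a reference solution. Assumptions: right-hand sides are stored as tuples, the grammar contains no empty rules, parse returns the probability (0.0 if there is no analysis) instead of printing it, and the toy grammar and lexicon written to temporary files are invented for the demonstration.

```python
import os
import tempfile
from collections import defaultdict


class Parser:
    def __init__(self, grammarfile, lexfile):
        self.read_grammar(grammarfile)
        self.read_lexicon(lexfile)

    def read_grammar(self, filename):
        self.ruleprobs = defaultdict(list)   # lhs -> [(rhs tuple, prob)]
        self.start_symbol = None
        with open(filename, encoding="utf-8") as f:
            for line in f:
                fields = line.split()
                if len(fields) < 3:
                    continue
                prob, lhs, rhs = float(fields[0]), fields[1], tuple(fields[2:])
                if self.start_symbol is None:
                    self.start_symbol = lhs
                self.ruleprobs[lhs].append((rhs, prob))

    def read_lexicon(self, filename):
        self.lexprobs = defaultdict(list)    # word -> [(part of speech, prob)]
        with open(filename, encoding="utf-8") as f:
            for line in f:
                fields = line.split()
                if len(fields) == 3:
                    self.lexprobs[fields[2]].append((fields[1], float(fields[0])))

    def scan(self, word, pos):
        for tag, prob in self.lexprobs.get(word, []):
            self.add((tag, (word,), 1, pos, pos + 1), prob)

    def predict(self, symbol, pos):
        for rhs, prob in self.ruleprobs.get(symbol, []):
            self.add((symbol, rhs, 0, pos, pos), prob)

    def complete(self, tup, prob):
        lhs, _, _, startpos, endpos = tup
        # Iterate over a copy in case the dictionary changes during the recursion.
        for (l2, r2, d2, s2, _), p2 in list(self.vitprob[startpos].items()):
            if d2 < len(r2) and r2[d2] == lhs:
                self.add((l2, r2, d2 + 1, s2, endpos), p2 * prob)

    def add(self, tup, prob):
        _, rhs, dotpos, _, endpos = tup
        cell = self.vitprob[endpos]
        if tup in cell and cell[tup] >= prob:
            return                            # neither new nor more probable
        cell[tup] = prob
        if dotpos == len(rhs):
            self.complete(tup, prob)
        else:
            self.predict(rhs[dotpos], endpos)

    def parse(self, sentence):
        n = len(sentence)
        self.vitprob = [dict() for _ in range(n + 1)]
        self.predict(self.start_symbol, 0)
        for pos, word in enumerate(sentence):
            self.scan(word, pos)
        probs = [p for (lhs, rhs, dot, start, _), p in self.vitprob[n].items()
                 if lhs == self.start_symbol and dot == len(rhs) and start == 0]
        return max(probs) if probs else 0.0


# Toy grammar and lexicon (made up for this example), written to temporary files.
with tempfile.TemporaryDirectory() as tmp:
    grammarfile = os.path.join(tmp, "grammar.txt")
    lexfile = os.path.join(tmp, "lexicon.txt")
    with open(grammarfile, "w", encoding="utf-8") as f:
        f.write("1.0 S NP VP\n0.6 NP DT N\n0.4 NP NP PP\n1.0 VP V NP\n1.0 PP P NP\n")
    with open(lexfile, "w", encoding="utf-8") as f:
        f.write("1.0 DT the\n0.5 N man\n0.5 N dog\n1.0 V sees\n")
    parser = Parser(grammarfile, lexfile)

prob = parser.parse("the man sees the dog".split())
print(prob)
```

For this toy input each noun phrase has probability 0.6 · 0.5 = 0.3, so the sentence probability should come out at about 0.09. Checking a few such hand-computed values is a simple way to verify that the probabilities are combined correctly.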
