FIRST and FOLLOW sets

FIRST(X) for all grammar symbols X

Apply following rules:

  1. If X is terminal, FIRST(X) = {X}.
  2. If X → ε is a production, then add ε to FIRST(X).
  3. If X is a non-terminal, and X → Y1 Y2 … Yk is a production, and ε is in all of FIRST(Y1), …, FIRST(Yk), then add ε to FIRST(X).
  4. If X is a non-terminal, and X → Y1 Y2 … Yk is a production, then add a to FIRST(X) if for some i, a is in FIRST(Yi), and ε is in all of FIRST(Y1), …, FIRST(Yi-1).

Applying rules 1 and 2 is obvious. Applying rules 3 and 4 for FIRST(Y1 Y2 … Yk) can be done as follows:

Add all the non-ε symbols of FIRST(Y1) to FIRST(Y1 Y2 … Yk). If ε ∈ FIRST(Y1), add all the non-ε symbols of FIRST(Y2). If ε ∈ FIRST(Y1) and ε ∈ FIRST(Y2), add all the non-ε symbols of FIRST(Y3), and so on. Finally, add ε to FIRST(Y1 Y2 … Yk) if ε ∈ FIRST(Yi), for all 1 ≤ i ≤ k.

Example:

Consider the following grammar.

E → E + T | T
T → T * F | F
F → (E) | id

Grammar after removing left recursion:

E → TX
X → +TX | ε
T → FY
Y → *FY | ε
F → (E) | id

For the above grammar, following the above rules, the FIRST sets could be computed as follows:

FIRST(E) = FIRST(T) = FIRST(F) = {(, id}

FIRST(X) = {+, ε}

FIRST(Y) = {*, ε}

FOLLOW(A) for all non-terminals A

Apply the following rules:

  1. If $ is the input end-marker, and S is the start symbol, $ ∈ FOLLOW(S).
  2. If there is a production, A → αBβ, then (FIRST(β) – ε) ⊆ FOLLOW(B).
  3. If there is a production, A → αB, or a production A → αBβ, where  ε ∈ FIRST(β), then FOLLOW(A) ⊆ FOLLOW(B).

Note that unlike the computation of FIRST sets for non-terminals, where the focus is on what a non-terminal generates, the computation of FOLLOW sets depends upon where the non-terminal appears on the RHS of a production.

Example:

For the above grammar, the FOLLOW sets can be computed by applying the above rules as follows.

FOLLOW(E) = {$, )}

FOLLOW(E) ⊆ FOLLOW(X) [in other words, FOLLOW(X) contains FOLLOW(E)]

Since there is no other rule applicable to FOLLOW(X),

FOLLOW(X) = {$, )}

FOLLOW(T) ⊆ FOLLOW(Y)      …. (1)

(FIRST(X) – ε) ⊆ FOLLOW(T) i.e., {+} ⊆ FOLLOW(T)      …. (2)

Also, since ε ∈ FIRST(X), FOLLOW(E) ⊆ FOLLOW(T)

i.e., {$, )} ⊆ FOLLOW(T)     …. (3)

Putting (2) and (3) together, we get:

FOLLOW(T) = {$, ), +}

Since, there is no other rule applying to FOLLOW(Y), from (1), we get:

FOLLOW(Y) = {$, ), +}

Since ε ∈ FIRST(Y), FOLLOW(T) ⊆ FOLLOW(F) and FOLLOW(Y) ⊆ FOLLOW(F). Also, (FIRST(Y) – ε) ⊆ FOLLOW(F). Putting all these together:

FOLLOW(F) = FOLLOW(T) ∪ FOLLOW(Y) ∪ (FIRST(Y) – ε) = {$, ), +, *}

This entry was posted in Compiler, Parsing. Bookmark the permalink.

Comments are closed.