Study notes: Phases of Compiler for GATE CS Examination

By BYJU'S Exam Prep

Updated on: September 25th, 2023


The compiler is nothing but, merely program that translates a program written in one language (the source language) to the same program in another language (the target language).

Download Formula Notes PDF for Computer Networks



Want to Ace in GATE 2023? BYJU’S Exam Prep offers a comprehensive study plan to aid your GATE exam preparation journey.

We have launched a GATE Starter Series on our App platform FREE for all users. Check the links below:

GATE Starter Series Link – Civil Engineering Link
GATE Starter Series Link – Mechanical Engineering Link
GATE Starter Series Link – Electronics Link
GATE Starter Series Link – Electrical Link
GATE Starter Series Link – Computer Science Link


  • Translates from one representation of the program to another
  • Typically from high-level source code to low-level machine code or object code
  • Source code is normally optimized for human readability
    • Expressive: matches our notion of languages (and application?!)
    • Redundant to help avoid programming errors
  • Machine code is optimized for hardware
    • Redundancy is reduced
    • Information about the intent is lost


A compiler is often characterized by three languages: the language (S), the target language (T), and therefore the implementation language (I). The three languages S, I, and T are often quite different. Such a compiler is called a cross-compiler


Compilers are of two types -: native and cross.

Native compilers are written within the same language because of the target language. For example, SMM maybe a compiler for the language S that’s during a language that runs on machine M and generates output code that runs on machine M.

Cross compilers are written in several languages because of the target language. For example, SNM maybe a compiler for the language S that’s during a language that runs on machine N and generates output code that runs on machine M. 

Major Parts of a Compiler: There are two major parts of a compiler: Analysis and Synthesis.

  • Analysis phase: An intermediate representation is created from the given source program. A lexical analyzer, syntax analyzer and semantic analyzer are part of the analysis phase.
  • Synthesis phase: An equivalent target program is created from the intermediate representation. Intermediate code generator, code generator and code optimizer are part of the synthesis phase.

Phases of a Compiler

Lexical analyzer reads the source program character by character and returns the tokens of the source program, Thus updating information about identifiers into the symbol table simultaneously.


Each phase does the following.

  • Transforms the source program from one representation into another representation.
  • They communicate with error handlers.
  • They communicate with the symbol table.

Lexical Analyzer:

It reads the program character by character and returns the tokens of the program

  • A token describes a pattern of characters having the same meaning in the source program. (Such as identifiers, operators, keywords, numbers, delimiters and so on)
  • Puts information about identifiers into the symbol table.
  • Regular expressions are used to describe tokens (lexical constructs).
  • A (Deterministic) Finite State Automaton can be used in the implementation of a lexical analyzer.

Syntax analyzer:

  • A syntax analyzer creates the syntactic structure (generally a parse tree) of the given program.
  • A syntax analyzer is also called a parser.
  • A parse tree describes a syntactic structure.


  • In a parse tree, all terminals are at leaves.
  • All inner nodes are non-terminals in context-free grammar.
  • The syntax of a language is specified by context-free grammar (CFG).
  • The rules in a CFG are mostly recursive.
  • A syntax analyzer checks whether a given program satisfies the rules implied by a CFG or not.
  • If it satisfies, the syntax analyzer creates a parse tree for the given program.
    • Ex: We use BNF (Backus Normal Form) to specify a CFG assignment -> identifier := expression expression -> identifier expression -> number expression -> expression + expression

Semantic analyzer:

  • A semantic analyzer checks the source program for semantic errors and collects the type information for the code generation.
  • Type-checking is an important part of the semantic analyzer.
  • Normally semantic information cannot be represented by a context-free language used in syntax analyzers.
  • Context-free grammars used in the syntax analysis are integrated with attributes (semantic rules)
    • The result is a syntax-directed translation,
    • Attribute grammars
  • Ex: newval := oldval + 12
  • The type of the identifier newval must match with the type of the expression (oldval+12)
  • The following can be done during semantic analysis:
    • Check semantics
    • Error reporting
    • Disambiguate overloaded operators
    • Type coercion
    • Static checking
    • Type checking
    • Control flow checking
    • Uniqueness checking
    • Name checks

Intermediate code generation:

  • A compiler may produce an explicit intermediate code representing the source program.
  • These intermediate codes are generally machine codes (architecture-independent). But the level of intermediate codes is close to the level of machine codes.
  • Example: newval := oldval * fact + 1 id1 := id2 * id3 + 1 MULT id2,id3,temp1 Intermediates Codes (Quadruples) ADD temp1,#1,temp2 MOV temp2,,id1

Code optimizer:

  • The code optimizer optimizes the code produced by the intermediate code generator in the terms of time and space.
  • Ex: MULT id2, id3, temp1 ADD temp1, #1, id1
  • No strong counterpart with English, but is similar to editing/précis writing
  • Automatically modify programs so that they run faster
  • Use fewer resources (memory, registers, space, fewer fetches etc.)
  • Some common optimizations:
    • Common sub-expression elimination
    • Copy propagation
    • Dead code elimination
    • Code motion
    • Strength reduction
    • Constant folding

Code generator

  • Produces the target language in a specific architecture.
  • The target program is normally is a relocatable object file containing the machine codes.
  • Ex: (assume that we have an architecture with instructions whose at least one of its operands is a machine register) MOVEid2,R1 MULT id3,R1 ADD #1,R1 MOVER1,id1

Usually a two-step process

  • Generate intermediate code from the semantic representation of the program
  • Generate machine code from the intermediate code

The advantage is that each phase is simple

  1. Requires design of the intermediate language
  2. Most compilers perform translation between successive intermediate representations
  3. Intermediate languages are generally ordered in decreasing level of abstraction from highest (source) to lowest (machine)
  4. However, typically the one after the intermediate code generation is the most important

So this is about introduction which will help you to get the basics of the subject.

You can also follow the detailed champion study plan for GATE CS 2021 from the given link below:

Detailed GATE CSE 2021 Champion Study Plan

Candidates can also practice 110+ Mock tests for exams like GATE, ISRO, NIELiT with BYJU’S Exam Prep Test Series check the  link given below:

Click Here to Avail GATE CSE Test Series! (100+ Mock Tests)

Get unlimited access to 21+ structured Live Courses all 112+ mock tests with Online Classroom Program for GATE CS, ISRO BARC & PSU Exams:

Click here to avail Online Classroom Program for Computer Science Engineering



Our Apps Playstore
SSC and Bank
Other Exams
GradeStack Learning Pvt. Ltd.Windsor IT Park, Tower - A, 2nd Floor, Sector 125, Noida, Uttar Pradesh 201303
Home Practice Test Series Premium