Introduction to SAX parser in Java with example

An XML parser reads an XML document and uses a DOM- or SAX-based API to provide programmatic access to its content and structure. DOM is an in-memory tree representation of the structure of an XML document, and SAX is a standard for event-based XML parsing. XML parsers can also be used to modify an existing XML document or create a new one. This post is an introduction to the SAX parser in Java with an example program.

SAX Parser

SAX (Simple API for XML) is an API for reading XML documents sequentially. SAX can only read XML documents; it cannot modify or create them. The javax.xml.parsers.SAXParser class provides methods to parse an XML document using event handlers. It wraps an org.xml.sax.XMLReader implementation (available through getXMLReader()) and provides overloaded parse() methods that read an XML document from a File, an InputStream, a SAX InputSource or a String URI.

SAX is a push API: you register listeners in the form of handlers with callback methods, and the SAX parser sends (pushes) notifications about the XML document as it processes it, one element (with its attributes) or piece of text at a time, in sequential order, starting at the top of the document and ending with the closing tag of the root element.
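To make the push model concrete, here is a minimal sketch of a handler that simply records each callback in the order the parser fires it. The class name EventLogHandler and the helper method log() are made up for this illustration; only the DefaultHandler callbacks and the JAXP classes come from the JDK.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler (not from the original post) that records every callback.
class EventLogHandler extends DefaultHandler {
    final List<String> events = new ArrayList<>();

    @Override
    public void startDocument() {
        events.add("startDocument");
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attrs) {
        // Attributes arrive together with the start tag, not as separate events.
        events.add("start:" + qName + " (" + attrs.getLength() + " attribute(s))");
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // Skip whitespace-only text such as indentation between elements.
        String text = new String(ch, start, length).trim();
        if (!text.isEmpty()) {
            events.add("text:" + text);
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        events.add("end:" + qName);
    }

    @Override
    public void endDocument() {
        events.add("endDocument");
    }

    // Parses an XML string and returns the callbacks in the order they fired.
    static List<String> log(String xml) throws Exception {
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        EventLogHandler handler = new EventLogHandler();
        parser.parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)), handler);
        return handler.events;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<Employees><Employee id=\"1\"><age>29</age></Employee></Employees>";
        for (String event : log(xml)) {
            System.out.println(event);
        }
    }
}
```

Running it prints the events from startDocument to endDocument, with every start event matched by an end event in reverse order of nesting. The handler never asks the parser for anything; it only reacts to what it is told.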

Below is the architecture of a typical SAX parsing application.

[Figure: SAX Parser in Java]

When to Use SAX

  • You want to process the XML document in a sequential manner from top to bottom.
  • You want to process a very large XML document whose DOM tree would consume too much memory. SAX requires much less memory than DOM because it does not create an in-memory tree of the XML data, as DOM does.
  • The XML document is not deeply nested.
  • You need fast, efficient, state-independent filtering. The SAX parser calls one method whenever an element tag is encountered and a different method when character data is found.
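As a small illustration of state-independent filtering, the sketch below counts the elements with a given name. The decision made in each callback depends only on the element being reported, not on which elements came before it. ElementCounter and count() are hypothetical names chosen for this example.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper, not part of the original post.
class ElementCounter extends DefaultHandler {
    private final String target;
    private int count;

    ElementCounter(String target) {
        this.target = target;
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attributes) {
        // The check looks only at the current element, never at earlier ones.
        if (target.equals(qName)) {
            count++;
        }
    }

    // Counts how many elements named elementName occur in the XML string.
    static int count(String xml, String elementName) throws Exception {
        ElementCounter counter = new ElementCounter(elementName);
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), counter);
        return counter.count;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<Employees><Employee id=\"1\"/><Employee id=\"2\"/></Employees>";
        System.out.println(count(xml, "Employee")); // prints 2
    }
}
```

Nothing is kept in memory apart from the counter, which is why this style of processing scales to very large documents.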

The disadvantage of SAX is that it does not provide random access to an XML document.

Let us see an example program using SAX parser in Java.

employees.xml

This is the XML file that we are going to read using the SAX parser. The XML file contains a list of employees, and each Employee element has an id attribute and the fields age, firstname, lastname and role. We will parse this XML and create a list of Employee objects.

Here is the Employee class that represents an Employee element from the XML.
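A plain Java bean is enough for this. The version below is a sketch reconstructed from the description above (an id attribute plus age, firstname, lastname and role); the exact field names, types and the toString() format are assumptions.

```java
// Hypothetical reconstruction of the Employee class; names and types are assumptions.
class Employee {
    private int id;
    private int age;
    private String firstName;
    private String lastName;
    private String role;

    public int getId() { return id; }
    public void setId(int id) { this.id = id; }

    public int getAge() { return age; }
    public void setAge(int age) { this.age = age; }

    public String getFirstName() { return firstName; }
    public void setFirstName(String firstName) { this.firstName = firstName; }

    public String getLastName() { return lastName; }
    public void setLastName(String lastName) { this.lastName = lastName; }

    public String getRole() { return role; }
    public void setRole(String role) { this.role = role; }

    @Override
    public String toString() {
        return "Employee{id=" + id + ", age=" + age + ", firstName=" + firstName
                + ", lastName=" + lastName + ", role=" + role + "}";
    }
}
```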

The next step is to create our own handler class to parse the XML document. The DefaultHandler class provides a default implementation of the ContentHandler interface, so we can extend it and override only the callbacks we need.

SAXParserHandler.java

When the SAXParser parses the document, the startElement() method is called whenever a start tag is found. We override this method to set boolean variables that identify the current element in the characters() method. We also use this method to create a new Employee object every time an Employee start tag is found. Note how the id attribute is read and set in the id field of the Employee object.

The characters() method is called whenever the SAXParser finds character data inside an element. Based on the boolean field set in the startElement() method, the value is set in the correct field of the Employee object.

The endElement() method is called whenever the SAX parser finds an end tag. In this method we add the Employee object to the list whenever the parser finds the Employee end tag.
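The three callbacks described above can be sketched as follows. This is an illustration under stated assumptions, not the original listing: the element names (Employees, Employee, age, firstname, lastname, role) are taken from the description of employees.xml, the Employee class here is a trimmed stand-in with plain fields so that the block compiles on its own, and parse() is a hypothetical helper. One deliberate difference: the text is collected in a buffer and assigned in endElement() rather than directly in characters(), because a SAX parser is allowed to deliver the text of one element in several characters() calls.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Trimmed, hypothetical Employee so that this block compiles on its own.
class Employee {
    int id;
    int age;
    String firstName;
    String lastName;
    String role;

    @Override
    public String toString() {
        return id + ": " + firstName + " " + lastName + " (" + role + ", " + age + ")";
    }
}

class SAXParserHandler extends DefaultHandler {
    private final List<Employee> employees = new ArrayList<>();
    private final StringBuilder text = new StringBuilder();
    private Employee employee;
    private boolean inAge, inFirstName, inLastName, inRole;

    List<Employee> getEmployees() {
        return employees;
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attributes) {
        text.setLength(0); // start collecting text for this element
        switch (qName) {
            case "Employee":
                employee = new Employee();
                employee.id = Integer.parseInt(attributes.getValue("id"));
                break;
            case "age": inAge = true; break;
            case "firstname": inFirstName = true; break;
            case "lastname": inLastName = true; break;
            case "role": inRole = true; break;
            default: break;
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // Buffer text only while inside one of the fields we care about.
        if (inAge || inFirstName || inLastName || inRole) {
            text.append(ch, start, length);
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        String value = text.toString().trim();
        if (inAge) { employee.age = Integer.parseInt(value); inAge = false; }
        else if (inFirstName) { employee.firstName = value; inFirstName = false; }
        else if (inLastName) { employee.lastName = value; inLastName = false; }
        else if (inRole) { employee.role = value; inRole = false; }
        else if ("Employee".equals(qName)) { employees.add(employee); }
    }

    // Hypothetical convenience method: parse an XML string into Employee objects.
    static List<Employee> parse(String xml) throws Exception {
        SAXParserHandler handler = new SAXParserHandler();
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return handler.getEmployees();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<Employees>"
                + "<Employee id=\"1\"><age>29</age><firstname>Ann</firstname>"
                + "<lastname>Lee</lastname><role>Developer</role></Employee>"
                + "</Employees>";
        parse(xml).forEach(System.out::println);
    }
}
```

The element names are compared against qName because the default SAXParserFactory is not namespace-aware, in which case localName may be empty.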

Below is the Java program that uses SAXParserHandler to parse the XML into a list of Employee objects.

Below is the output of the above program.

SAXParserFactory provides factory methods to get a SAXParser instance. We pass a File object and a SAXParserHandler instance as parameters to the SAXParser’s parse() method.
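The factory and the File overload of parse() can be sketched like this. To keep the block self-contained it uses a small anonymous handler as a stand-in for SAXParserHandler and writes a temporary file in place of employees.xml; ParseFromFile and employeeIds() are hypothetical names, and the sample content is made up.

```java
import java.io.File;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical example class, not part of the original post.
class ParseFromFile {

    // Returns the id attribute of every Employee element in the file, comma-separated.
    static String employeeIds(File xmlFile) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        SAXParser parser = factory.newSAXParser();
        StringBuilder ids = new StringBuilder();

        // parse(File, DefaultHandler) reads the file and drives the handler callbacks.
        parser.parse(xmlFile, new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName, Attributes attrs) {
                if ("Employee".equals(qName)) {
                    if (ids.length() > 0) {
                        ids.append(',');
                    }
                    ids.append(attrs.getValue("id"));
                }
            }
        });
        return ids.toString();
    }

    public static void main(String[] args) throws Exception {
        // A temporary file stands in for employees.xml.
        File file = File.createTempFile("employees", ".xml");
        file.deleteOnExit();
        String xml = "<Employees><Employee id=\"1\"/><Employee id=\"2\"/></Employees>";
        Files.write(file.toPath(), xml.getBytes(StandardCharsets.UTF_8));

        System.out.println(employeeIds(file)); // prints 1,2
    }
}
```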

The SAX parser can generate three kinds of errors:

  • a fatal error
  • an error
  • a warning

When a fatal error occurs (for example, the document is not well-formed), the parser cannot continue normal processing.

When an error or warning occurs, the default error handler does not throw an exception and no message is displayed. We can define our own error handler by implementing the standard ErrorHandler interface and its methods warning(), error() and fatalError().
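Here is a minimal sketch of such an error handler. It collects every reported problem in a list and rethrows fatal errors so that parsing stops. The class name CollectingErrorHandler, the check() helper and the message format are assumptions made for this example; the handler is registered through the XMLReader that the SAXParser wraps.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.XMLReader;

// Hypothetical error handler, not part of the original post.
class CollectingErrorHandler implements ErrorHandler {
    final List<String> problems = new ArrayList<>();

    @Override
    public void warning(SAXParseException e) {
        problems.add(describe("warning", e));
    }

    @Override
    public void error(SAXParseException e) {
        problems.add(describe("error", e));
    }

    @Override
    public void fatalError(SAXParseException e) throws SAXException {
        problems.add(describe("fatal", e));
        throw e; // stop parsing: the document cannot be trusted after a fatal error
    }

    private static String describe(String level, SAXParseException e) {
        return level + " at line " + e.getLineNumber() + ": " + e.getMessage();
    }

    // Parses the XML string and returns every problem the parser reported.
    static List<String> check(String xml) throws Exception {
        CollectingErrorHandler handler = new CollectingErrorHandler();
        XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        reader.setErrorHandler(handler);
        try {
            reader.parse(new InputSource(new StringReader(xml)));
        } catch (SAXException e) {
            // Rethrown from fatalError(); the problem is already recorded.
        }
        return handler.problems;
    }

    public static void main(String[] args) throws Exception {
        // The <Employee> tag is never closed, so the document is not well-formed.
        check("<Employees><Employee></Employees>").forEach(System.out::println);
    }
}
```

With the default, non-validating parser used here, a malformed document is reported through fatalError(). The error() method is mainly invoked for validity problems, which only show up when validation is switched on.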

Though SAX is an older API, an understanding of it will be helpful in exploring other efficient APIs like StAX (Streaming API for XML).

Hope you find this post on Introduction to SAX parser in Java useful. If you have any comments, post them in the comments section.

