An Introduction to DOM parser in Java

The DOM (Document Object Model) defines a standard for accessing and manipulating documents. A DOM parser builds an in-memory tree representation of the XML document, in which each node represents one component of the XML structure. The two most common node types in an XML document are element nodes and text nodes. The Java DOM parser API lets us create nodes, remove nodes, change their contents, and traverse the node hierarchy. This post gives an introduction to the DOM parser in Java.
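To make the tree idea concrete, here is a minimal sketch that parses a tiny document and walks the resulting node hierarchy, printing element and text nodes with indentation. The sample XML and the names DomTreeWalkSketch, walk and describeTree are made up for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class DomTreeWalkSketch {

    // Hypothetical sample document, used only for illustration.
    static final String SAMPLE_XML = "<note><to>Ann</to><body>Hi</body></note>";

    // Recursively append one line per element or text node, indented by depth.
    static void walk(Node node, int depth, StringBuilder out) {
        short type = node.getNodeType();
        if (type == Node.ELEMENT_NODE || type == Node.TEXT_NODE) {
            for (int i = 0; i < depth; i++) {
                out.append("  ");
            }
            if (type == Node.ELEMENT_NODE) {
                out.append("element: ").append(node.getNodeName()).append('\n');
            } else {
                out.append("text: ").append(node.getNodeValue()).append('\n');
            }
        }
        for (Node child = node.getFirstChild(); child != null; child = child.getNextSibling()) {
            walk(child, depth + 1, out);
        }
    }

    // Parse the XML string and return a textual outline of its DOM tree.
    static String describeTree(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        StringBuilder out = new StringBuilder();
        walk(doc.getDocumentElement(), 0, out);
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.print(describeTree(SAMPLE_XML));
    }
}
```

Each element (note, to, body) becomes an element node, and the character data inside to and body becomes a child text node of that element.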

Points to note about DOM

  • DOM was designed to be language-neutral.
  • DOM does not take advantage of Java’s object-oriented features.
  • DOM provides a lot of flexibility for handling fully-fledged documents and complex applications. If your program only handles simple data structures, JDOM or dom4j may be a more convenient choice.
  • Since DOM builds the whole tree in memory, it can consume too much memory for very large XML documents. For those, choose SAX or StAX instead.

DOM Parser API

The javax.xml.parsers.DocumentBuilder class defines the API for obtaining DOM Document instances from an XML document.

An instance of this class can be obtained from the DocumentBuilderFactory.newDocumentBuilder() method. Once you have an instance, it can parse XML from a variety of input sources: InputStreams, Files, URLs, and SAX InputSources.
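As a minimal sketch of these steps, the following obtains a DocumentBuilder from the factory and parses the same XML once from an InputStream and once from a SAX InputSource. The sample XML and the class and method names are assumptions made for illustration. DocumentBuilder also offers parse(File) and parse(String uri) for the file and URL cases.

```java
import java.io.ByteArrayInputStream;
import java.io.StringReader;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class DocumentBuilderSketch {

    // Hypothetical sample document, used only for illustration.
    static final String SAMPLE_XML = "<greeting lang=\"en\">Hello</greeting>";

    // Obtain a DocumentBuilder through the factory.
    static DocumentBuilder newBuilder() throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        return factory.newDocumentBuilder();
    }

    // Parse XML supplied as an InputStream.
    static Document parseFromStream(String xml) throws Exception {
        byte[] bytes = xml.getBytes(StandardCharsets.UTF_8);
        return newBuilder().parse(new ByteArrayInputStream(bytes));
    }

    // Parse XML supplied as a SAX InputSource wrapping a Reader.
    static Document parseFromInputSource(String xml) throws Exception {
        return newBuilder().parse(new InputSource(new StringReader(xml)));
    }

    public static void main(String[] args) throws Exception {
        Document fromStream = parseFromStream(SAMPLE_XML);
        Document fromSource = parseFromInputSource(SAMPLE_XML);
        System.out.println(fromStream.getDocumentElement().getNodeName());
        System.out.println(fromSource.getDocumentElement().getTextContent());
    }
}
```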

Below is the architecture of a typical DOM parser application.

[Figure: DOM parser in Java]


DOM interfaces

The DOM defines several Java interfaces. The ones used most often in a program are:

  • Node – Primary datatype for the entire Document Object Model.
  • Element – The Element interface represents an element in an HTML or XML document. Elements may have attributes associated with them. Most of your code will deal with this interface only.
  • Attr – Represents an attribute of an element object.
  • Text – The actual textual content of an Element or Attr.
  • Document – Represents the entire HTML or XML document. A Document object is the root of the DOM tree.
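The sketch below touches each of these interfaces once, assuming a small made-up book document. The class name DomInterfacesSketch and its helper methods are hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Attr;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.Text;
import org.xml.sax.InputSource;

class DomInterfacesSketch {

    // Hypothetical sample document, used only for illustration.
    static final String SAMPLE_XML = "<book id=\"42\"><title>DOM Basics</title></book>";

    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    // Build a one-line summary using Document, Element, Attr, Node and Text.
    static String summarize(Document doc) {
        Element root = doc.getDocumentElement();        // Element: <book>
        Attr id = root.getAttributeNode("id");          // Attr: id="42"
        Element title = (Element) root.getElementsByTagName("title").item(0);
        Node firstChild = title.getFirstChild();        // Node: child of <title>
        if (firstChild.getNodeType() != Node.TEXT_NODE) {
            throw new IllegalStateException("expected a text node inside <title>");
        }
        Text text = (Text) firstChild;                  // Text: "DOM Basics"
        return root.getTagName() + " id=" + id.getValue() + " title=" + text.getData();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(summarize(parse(SAMPLE_XML)));
    }
}
```

Every one of these types extends Node, which is why the generic Node methods (getFirstChild, getNodeType and so on) work on all of them.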

Parse XML document using DOM parser in Java

Here is the input XML file we will parse using the DOM parser.

Below is the output when you run the above program.
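As a rough sketch of what such a parsing program can look like, the example below reads a small employees document and collects one summary line per employee. The XML content, the element and attribute names, and the class DomParseSketch are all assumptions made for illustration, not the original input file or program.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class DomParseSketch {

    // Hypothetical input, inlined as a string so the sketch is self-contained.
    static final String SAMPLE_XML =
            "<employees>"
          + "<employee id=\"1\"><name>Alice</name><role>Developer</role></employee>"
          + "<employee id=\"2\"><name>Bob</name><role>Tester</role></employee>"
          + "</employees>";

    // Return the text of the first child element with the given tag name.
    static String childText(Element parent, String tagName) {
        return parent.getElementsByTagName(tagName).item(0).getTextContent();
    }

    // Parse the XML and return one line per <employee> element.
    static List<String> readEmployees(String xml) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(new InputSource(new StringReader(xml)));
        doc.getDocumentElement().normalize();   // merge adjacent text nodes

        NodeList employees = doc.getElementsByTagName("employee");
        List<String> lines = new ArrayList<>();
        for (int i = 0; i < employees.getLength(); i++) {
            Element employee = (Element) employees.item(i);
            String id = employee.getAttribute("id");
            lines.add(id + ": " + childText(employee, "name")
                    + " (" + childText(employee, "role") + ")");
        }
        return lines;
    }

    public static void main(String[] args) throws Exception {
        for (String line : readEmployees(SAMPLE_XML)) {
            System.out.println(line);
        }
    }
}
```

To read from disk instead, the same builder can be handed a java.io.File through builder.parse(File).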

Create XML document using DOM parser in Java

Here is the XML file we will create using the DOM parser.

Below is the output printed to the console after running the above program.

Below is a screenshot of the file created on the D drive.

[Figure: DOM parser output]
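A minimal sketch of the creation step could look like the following. It builds a document with createElement and appendChild, then serializes it with a Transformer. The element names and the class DomCreateSketch are hypothetical, and the sketch writes to a String rather than to a file. Passing new StreamResult(new File(...)) instead would write the XML to disk.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

class DomCreateSketch {

    // Build <employees><employee id="1"><name>Alice</name></employee></employees>.
    static Document buildDocument() throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .newDocument();

        Element root = doc.createElement("employees");
        doc.appendChild(root);

        Element employee = doc.createElement("employee");
        employee.setAttribute("id", "1");
        root.appendChild(employee);

        Element name = doc.createElement("name");
        name.appendChild(doc.createTextNode("Alice"));
        employee.appendChild(name);

        return doc;
    }

    // Serialize the DOM tree to a String, omitting the XML declaration.
    static String toXml(Document doc) throws Exception {
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter writer = new StringWriter();
        transformer.transform(new DOMSource(doc), new StreamResult(writer));
        return writer.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(toXml(buildDocument()));
    }
}
```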

Modify XML document using DOM parser in Java

We will modify the XML file below using the DOM parser.

Below is the output of running the above program.
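The kinds of modification described in this post (changing contents, creating nodes, removing nodes) can be sketched as follows. The sample document, its element names, and the class DomModifySketch are made up for illustration and are not the original file or program.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class DomModifySketch {

    // Hypothetical input document, used only for illustration.
    static final String SAMPLE_XML =
            "<employee id=\"1\"><name>Alice</name><role>Developer</role>"
          + "<temp>remove me</temp></employee>";

    // Parse the XML, apply four typical modifications, and return the tree.
    static Document modify(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        Element employee = doc.getDocumentElement();

        // 1. Change an attribute value.
        employee.setAttribute("id", "100");

        // 2. Change the text content of an existing element.
        Node role = employee.getElementsByTagName("role").item(0);
        role.setTextContent("Lead Developer");

        // 3. Add a new child element.
        Element salary = doc.createElement("salary");
        salary.setTextContent("5000");
        employee.appendChild(salary);

        // 4. Remove an existing child element.
        Node temp = employee.getElementsByTagName("temp").item(0);
        employee.removeChild(temp);

        return doc;
    }

    public static void main(String[] args) throws Exception {
        Element employee = modify(SAMPLE_XML).getDocumentElement();
        System.out.println("id = " + employee.getAttribute("id"));
        System.out.println("role = "
                + employee.getElementsByTagName("role").item(0).getTextContent());
    }
}
```

The changes live only in the in-memory tree. To persist them, the document would still need to be written out, for example with a javax.xml.transform.Transformer.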

We have learnt how to parse (read), create, and modify XML documents using the DOM parser. I hope this post serves as a useful introduction to the DOM parser in Java.

If you have any comments, post them in the comments section.

Working as a Java developer since 2010. Passionate about programming in Java. I am a part time blogger.
