My Technical Notes & Others...: Parser

Friday, 3 August 2012

Parse & Create JSON Documents in Java

The Jackson JSON library can process and create JSON documents in the style of DOM (a tree model) and StAX (token streaming). It can also map JSON to and from plain old Java objects (POJOs). The code samples described here are available at GitHub.

DOM

The following code loads a JSON into a tree structure, then extracts contents by retrieving elements manually:

  String json = "{\"Id\":123456," +
    "\"Title\":\"My book title\"," +
    "\"References\":[\"Reference A\",\"Reference B\"]}";

  System.out.println("Source: " + json);

  // Parsing JSON into tree structure
  ObjectMapper om = new ObjectMapper();
  JsonNode retr = om.readTree(json);

  // Retrieving items from the structure
  JsonNode id_node = retr.get("Id");
  System.out.println("Id " + id_node.asInt());

  JsonNode id_title = retr.get("Title");
  System.out.println("Title " + id_title.asText());

  JsonNode id_refs = retr.get("References");
  System.out.print("References");

  // Retrieving sub-elements
  Iterator<JsonNode> it = id_refs.elements();

  while ( it.hasNext() ) {
    System.out.print(" " + it.next().asText());
  }

  System.out.println(" ");

The above produces the following:

Source: {"Id":123456,"Title":"My book title","References":["Reference A","Reference B"]}
Id 123456
Title My book title
References Reference A Reference B

StAX

The code below performs a round trip: it creates a JSON document manually with a generator, then parses it back token by token:

JsonFactory jf = new JsonFactory();

// Creating in memory representation
ByteArrayOutputStream baos = new ByteArrayOutputStream();
JsonGenerator jg = jf.createJsonGenerator(
  baos, JsonEncoding.UTF8);

jg.writeStartObject();

jg.writeNumberField("Id", 123456);
jg.writeStringField("Title", "My book title");
jg.writeFieldName("References");

jg.writeStartArray();
jg.writeString("Reference A");
jg.writeString("Reference B");
jg.writeEndArray();

jg.writeEndObject();
jg.close();

// Printing JSON
String result = baos.toString("UTF8");
System.out.println(result);

// Parsing JSON
JsonParser jp = jf.createJsonParser(result);

while (jp.nextToken() != JsonToken.END_OBJECT) {
  String token = jp.getCurrentName();
  if ( "Id".equals(token) || "Title".equals(token) ) {
    System.out.print(token + " ");
    jp.nextToken();
    System.out.println(jp.getText());
  } else if ( "References".equals(token) ) {
    System.out.print(token + " ");
    jp.nextToken(); // JsonToken.START_ARRAY
    while (jp.nextToken() != JsonToken.END_ARRAY) {
      System.out.print(jp.getText() + " ");
    }
    System.out.println("");
  }
}

jp.close();

The produced output is:

{"Id":123456,"Title":"My book title","References":["Reference A","Reference B"]}
Id 123456
Title My book title
References Reference A Reference B

POJO

This example uses a plain old Java object (POJO):

public class PojoItem {

  private int id = 123456;
  private String title;
  private List<String> references
    = new ArrayList<String>() {{
        add("Reference A"); add("Reference B");
      }};

  public int getId() { 
    return id;
  }

  public void setId(int id) {
    this.id = id;
  }

  public String getTitle() {
    return title;
  }

  public void setTitle(String title) {
    this.title = title;
  }

  public List<String> getReferences() {
    return references;
  }

  public void setReferences(List<String> references) {
    this.references = references;
  }

}

The following transforms a POJO into JSON, and then back into a POJO instance:

PojoItem pi = new PojoItem();
pi.setId(123466);
pi.setTitle("My Title");
pi.setReferences(new ArrayList<String>() { {
  add("Reference A"); add("Reference B");
} });

// Creating and printing JSON
ObjectMapper om = new ObjectMapper();
String result = om.writeValueAsString(pi);
System.out.println(result);

// Parsing JSON
PojoItem user = om.readValue(result, PojoItem.class);

System.out.println("id " + user.getId());
System.out.println("title " + user.getTitle());
System.out.print("references ");
for (String s : user.getReferences()) {
  System.out.print(s + " ");
}

System.out.println(" ");

The above code serializes the POJO to a JSON string, then deserializes that string into a new PojoItem instance. The output is:

{"id":123466,"title":"My Title","references":["Reference A","Reference B"]}
id 123466
title My Title
references Reference A Reference B

Pojo to XML • Pojo to JSON • JAXB to XML • JAXB to JSON • JAXB Crash Course

Parse & Create XML Documents in Java

There are three main types of XML parsers: DOM, SAX and StAX. StAX is an improvement on SAX and much easier to use, so we are not going to cover SAX here. DOM and StAX offer enough functionality to work with XML documents.

DOM is a parser that builds a complete tree of nodes in memory. This can be an issue when parsing large documents. However, it is the only (and easiest) means to manipulate documents via CRUD (Create, Read, Update, Delete) operations.

StAX is a pull parser. It parses documents step by step and lets the user pull node elements one by one. It is much more efficient regarding memory consumption, but it cannot be used for CRUD operations.

We will use a Maven code sample available here. Its resources directory contains a rates.xml example file.

DOM

DocumentBuilderFactory factory =
    DocumentBuilderFactory.newInstance();
DocumentBuilder builder = factory.newDocumentBuilder();
InputStream IS = DOM.class.getResourceAsStream("/rates.xml");
Document doc = builder.parse(IS);

// Retrieving cube XML nodes
NodeList list = doc.getElementsByTagName("Cube");

for (int i = 0; i < list.getLength(); i++) {

  Element element = (Element) list.item(i);

  // Retrieving attributes
  NamedNodeMap attr = element.getAttributes();

  for (int j=0;j<attr.getLength();j++) {
    System.out.print(attr.item(j).getTextContent() + " ");
  }

  System.out.println("");

}

The above code loads the rates.xml file, extracts Cube nodes, and prints their attributes. The output is:

2012-08-02
USD 1.2346
JPY 96.64 
BGN 1.9558 
CZK 25.260
...
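
A NamedNodeMap is not maintained in any particular order, so printing attributes by position can produce a different column order than another parser would. As a minimal sketch, attributes can instead be read by name. The inline XML below is a hypothetical stand-in for rates.xml, and the attribute names time, currency and rate are assumptions, since the file itself is not shown here:

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

class DomByNameSketch {

  // Hypothetical stand-in for rates.xml; element and attribute names are assumed.
  static final String XML =
      "<Rates>"
    + "<Cube time=\"2012-08-02\">"
    + "<Cube currency=\"USD\" rate=\"1.2346\"/>"
    + "<Cube currency=\"JPY\" rate=\"96.64\"/>"
    + "</Cube>"
    + "</Rates>";

  // Returns one "currency rate" line per Cube element that carries a currency.
  static List<String> readRates(String xml) {
    try {
      DocumentBuilder builder =
          DocumentBuilderFactory.newInstance().newDocumentBuilder();
      Document doc = builder.parse(
          new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));

      NodeList cubes = doc.getElementsByTagName("Cube");
      List<String> lines = new ArrayList<>();
      for (int i = 0; i < cubes.getLength(); i++) {
        Element cube = (Element) cubes.item(i);
        // Skip the enclosing Cube elements that hold no rate
        if (cube.hasAttribute("currency")) {
          lines.add(cube.getAttribute("currency") + " "
              + cube.getAttribute("rate"));
        }
      }
      return lines;
    } catch (Exception e) {
      throw new IllegalStateException(e);
    }
  }

  public static void main(String[] args) {
    for (String line : readRates(XML)) {
      System.out.println(line);
    }
  }
}
```

Reading by name fixes the column order regardless of how the parser stores attributes, at the cost of hard-coding the attribute names.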

StAX

XMLInputFactory inputFactory = XMLInputFactory.newInstance();
InputStream IS = StAX.class.getResourceAsStream("/rates.xml");
XMLEventReader eventReader
  = inputFactory.createXMLEventReader(IS);

// Pulling XML elements
while (eventReader.hasNext()) {

  XMLEvent event = eventReader.nextEvent();

  if (event.isStartElement()) {
    StartElement se = event.asStartElement();

    // Filtering on Cube elements
    if (se.getName().getLocalPart().equals("Cube")) {

      Iterator it = se.getAttributes();
      while (it.hasNext()) {
        Attribute a = (Attribute) it.next();
        System.out.print(a.getValue() + " ");
      }

      System.out.println("");

    }

  }

}

The above code pulls XML elements one by one, filters for Cube ones, and prints corresponding attributes. The output is:

2012-08-02
1.2346 USD
96.64 JPY
1.9558 BGN
25.260 CZK
...
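
StAX also offers a cursor-style API, XMLStreamReader, next to the event API used above. The following is a minimal sketch of the same filtering with the cursor API. The inline XML is a hypothetical stand-in for rates.xml, and the attribute names currency and rate are assumptions:

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

class StaxCursorSketch {

  // Hypothetical stand-in for rates.xml; element and attribute names are assumed.
  static final String XML =
      "<Rates>"
    + "<Cube time=\"2012-08-02\">"
    + "<Cube currency=\"USD\" rate=\"1.2346\"/>"
    + "<Cube currency=\"JPY\" rate=\"96.64\"/>"
    + "</Cube>"
    + "</Rates>";

  // Returns one "rate currency" line per Cube element that carries a currency.
  static List<String> readRates(String xml) {
    List<String> lines = new ArrayList<>();
    try {
      XMLStreamReader reader = XMLInputFactory.newInstance()
          .createXMLStreamReader(new StringReader(xml));

      // Pull events one by one; only start elements are of interest
      while (reader.hasNext()) {
        int event = reader.next();
        if (event == XMLStreamConstants.START_ELEMENT
            && "Cube".equals(reader.getLocalName())) {
          // A null namespace means the namespace is not checked
          String currency = reader.getAttributeValue(null, "currency");
          if (currency != null) {
            lines.add(reader.getAttributeValue(null, "rate") + " " + currency);
          }
        }
      }
      reader.close();
    } catch (XMLStreamException e) {
      throw new IllegalStateException(e);
    }
    return lines;
  }

  public static void main(String[] args) {
    for (String line : readRates(XML)) {
      System.out.println(line);
    }
  }
}
```

The cursor API avoids allocating an event object per step, while the event API is easier to pass around and filter. Either way, only the current position in the document is held in memory.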

CRUD and Print

For creation:

//We need a Document
DocumentBuilderFactory dbfac
    = DocumentBuilderFactory.newInstance();
DocumentBuilder docBuilder = dbfac.newDocumentBuilder();
Document doc = docBuilder.newDocument();

Element root = doc.createElement("MyXML");
doc.appendChild(root);

Element sub = doc.createElement("MyNode");
sub.setAttribute("MyAttribute", "33");
root.appendChild(sub);

Text text = doc.createTextNode("Some text for my node");
sub.appendChild(text);

Element sub2 = doc.createElement("MyNode2");
sub2.setAttribute("MyAttribute", "45");
root.appendChild(sub2);

Element subnode = doc.createElement("MySubNode");
sub2.appendChild(subnode);

printXML(doc);

The above creates a document with a root node (MyXML), adds two child nodes (MyNode and MyNode2), and adds a sub-node (MySubNode) to MyNode2. It also sets an attribute on each child node and attaches a text node to MyNode.
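
The creation example covers the C of CRUD. As a rough sketch of the remaining operations, the code below rebuilds the same document, then reads, updates and deletes nodes. The class and method names are illustrative only and are not part of the original code sample:

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

class DomUpdateDeleteSketch {

  // Rebuilds the document from the creation example.
  static Document build() {
    try {
      Document doc = DocumentBuilderFactory.newInstance()
          .newDocumentBuilder().newDocument();

      Element root = doc.createElement("MyXML");
      doc.appendChild(root);

      Element sub = doc.createElement("MyNode");
      sub.setAttribute("MyAttribute", "33");
      sub.appendChild(doc.createTextNode("Some text for my node"));
      root.appendChild(sub);

      Element sub2 = doc.createElement("MyNode2");
      sub2.setAttribute("MyAttribute", "45");
      sub2.appendChild(doc.createElement("MySubNode"));
      root.appendChild(sub2);

      return doc;
    } catch (ParserConfigurationException e) {
      throw new IllegalStateException(e);
    }
  }

  static void updateAndDelete(Document doc) {
    // Read: look up the first MyNode element
    Element node = (Element) doc.getElementsByTagName("MyNode").item(0);

    // Update: change an attribute value and replace the text content
    node.setAttribute("MyAttribute", "34");
    node.setTextContent("Updated text");

    // Delete: detach MyNode2 (and its MySubNode child) from its parent
    Element node2 = (Element) doc.getElementsByTagName("MyNode2").item(0);
    node2.getParentNode().removeChild(node2);
  }

  public static void main(String[] args) {
    Document doc = build();
    updateAndDelete(doc);

    Element node = (Element) doc.getElementsByTagName("MyNode").item(0);
    System.out.println(node.getAttribute("MyAttribute")
        + " " + node.getTextContent());
    System.out.println("MyNode2 count: "
        + doc.getElementsByTagName("MyNode2").getLength());
  }
}
```

All changes are applied to the in-memory tree, which is why this kind of manipulation needs DOM rather than a streaming parser.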

For printing:

TransformerFactory transfac
    = TransformerFactory.newInstance();
Transformer trans = transfac.newTransformer();
trans.setOutputProperty(OutputKeys.INDENT, "yes");

StringWriter sw = new StringWriter();
StreamResult sr = new StreamResult(sw);
DOMSource source = new DOMSource(doc);

trans.transform(source, sr);

String result = sw.toString();
System.out.println(result);

The INDENT output property adds a newline after each node. The output is:

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<MyXML>
<MyNode MyAttribute="33">Some text for my node</MyNode>
<MyNode2 MyAttribute="45">
<MySubNode/>
</MyNode2>
</MyXML>
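
The output above has line breaks but no leading spaces. As a sketch under the assumption that the default JDK transformer (derived from Xalan) is in use, the implementation-specific property {http://xml.apache.org/xslt}indent-amount can request an indentation width. Other transformer implementations may ignore it. The helper names below are illustrative only:

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

class PrintXmlSketch {

  // Serializes a document with line breaks and the requested indentation.
  static String toIndentedString(Document doc, int indent) {
    try {
      Transformer trans = TransformerFactory.newInstance().newTransformer();
      trans.setOutputProperty(OutputKeys.INDENT, "yes");
      // Assumption: honored by the JDK's built-in, Xalan-derived transformer
      trans.setOutputProperty(
          "{http://xml.apache.org/xslt}indent-amount", String.valueOf(indent));

      StringWriter sw = new StringWriter();
      trans.transform(new DOMSource(doc), new StreamResult(sw));
      return sw.toString();
    } catch (Exception e) {
      throw new IllegalStateException(e);
    }
  }

  // Small document with one nested element, for demonstration only.
  static Document sample() {
    try {
      Document doc = DocumentBuilderFactory.newInstance()
          .newDocumentBuilder().newDocument();
      Element root = doc.createElement("MyXML");
      doc.appendChild(root);
      Element sub2 = doc.createElement("MyNode2");
      sub2.setAttribute("MyAttribute", "45");
      root.appendChild(sub2);
      sub2.appendChild(doc.createElement("MySubNode"));
      return doc;
    } catch (Exception e) {
      throw new IllegalStateException(e);
    }
  }

  public static void main(String[] args) {
    System.out.println(toIndentedString(sample(), 2));
  }
}
```

The exact whitespace produced depends on the JDK version and transformer implementation, so the result is best treated as display output rather than compared byte for byte.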
