How to parse XML using NodeJS, ExpressJS and xml2js

How to parse XML using NodeJS

XML (eXtensible Markup Language) is widely used to store and exchange data on the internet. It is derived from SGML, is readable by both humans and machines, and lets you define your own tags. A previous article discussed how to generate XML using PHP; this article explores how to parse XML using NodeJS.


Suppose you are working on an online bookstore application that shares information about all of its available books. You can generate an XML document containing details such as each book's name, category, price and author. Other websites can then parse this document and display the information to their users.

In this tutorial we are going to:

1. Create a NodeJS, ExpressJS application

2. Parse a book information XML document using NodeJS

3. Display the book information to the user

Create a NodeJS, Express application

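Open a command prompt and type the following command to generate a NodeJS, Express application. The express command comes from the express-generator package, so this assumes it is already installed globally (for example with npm install -g express-generator).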

express --view=pug nodejs-parse-xml

This command generates an Express application named nodejs-parse-xml, with pug as the template engine. Type cd nodejs-parse-xml to move into the new directory, then type npm install to install the dependencies. You can then inspect the generated directory structure.


Books information XML document

The XML file below stores information about the books. Create a directory named xmlfiles inside the public directory, then create an XML file named booksxml.xml in it with the following content.

<?xml version="1.0" encoding="UTF-8"?>
<bookstore>

<book>
  <title lang="en">The C++ Programming Language</title>
  <author>Bjarne Stroustrup</author>
  <year>2003</year>
  <price>32.77</price>
  <category>Programming</category>
</book>

<book>
  <title lang="en">Harry Potter</title>
  <author>J K. Rowling</author>
  <year>2005</year>
  <price>29.99</price>
  <category>Children</category>
</book>

<book>
  <title lang="en">Ivor Horton's Beginning Java</title>
  <author>Ivor Horton</author>
  <year>2011</year>
  <price>37</price>
  <category>Programming</category>
</book>

<book>
  <title lang="en">The Pragmatic Programmer</title>
  <author>Andrew Hunt</author>
  <year>2005</year>
  <price>30.00</price>
  <category>Programming</category>
</book>

<book>
  <title lang="en">XQuery Kick Start</title>
  <author>James McGovern</author>
  <year>2003</year>
  <price>49.99</price>
  <category>Web</category>
</book>

<book>
  <title lang="en">Learning XML</title>
  <author>Erik T. Ray</author>
  <year>2003</year>
  <price>39.95</price>
  <category>Web</category>
</book>

</bookstore>

In this document, bookstore is the root element and contains the book elements. Each book element holds information about one book: title, author, year, price and category. The title element also carries a lang attribute.


Parse XML using NodeJS

To parse XML using NodeJS, you need to install a Node package called xml2js using npm, the Node package manager. At the command prompt, type:

npm install xml2js

Open the index.js file in the routes folder and add the code below.

var express = require('express');
var router  = express.Router();
var fs      = require('fs');
var xml2js  = require('xml2js');

var parser  = new xml2js.Parser();

/* GET home page. */
router.get('/', function (req, res, next) {
    // Path to the XML file created in the previous step.
    var xmlfile = __dirname + "/../public/xmlfiles/booksxml.xml";

    fs.readFile(xmlfile, "utf-8", function (error, text) {
        if (error) {
            // Pass the error to Express instead of throwing inside the callback,
            // which would crash the server.
            return next(error);
        }

        parser.parseString(text, function (err, result) {
            if (err) {
                return next(err);
            }

            // Array of <book> elements under the <bookstore> root.
            var books = result['bookstore']['book'];
            res.render('index', { books: books });
        });
    });
});

module.exports = router;

As you can see, the fs and xml2js modules are included first. Next, a parser is created to parse the XML document. Inside the '/' route, the path of the file to be parsed is defined first.

    var xmlfile = __dirname + "/../public/xmlfiles/booksxml.xml";

The fs.readFile method takes three parameters: the path of the file to read, the encoding, and a callback function. The callback receives an error object as its first argument and the contents of the file, in the text variable, as its second.
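
As a variation on the string concatenation above, the path can also be built with Node's path module, which normalizes the ../ segment. This is only a sketch: xmlPath is a hypothetical helper, and the app/routes directory is a placeholder standing in for __dirname.

```javascript
var fs = require('fs');
var path = require('path');

// Hypothetical helper: resolves a file in public/xmlfiles relative to the routes folder.
function xmlPath(routesDir, fileName) {
  return path.join(routesDir, '..', 'public', 'xmlfiles', fileName);
}

// 'app/routes' stands in for __dirname inside routes/index.js.
var xmlfile = xmlPath(path.join('app', 'routes'), 'booksxml.xml');

// Error-first callback: the first argument is set only when the read fails.
fs.readFile(xmlfile, 'utf-8', function (error, text) {
  if (error) {
    console.log('Could not read ' + xmlfile + ': ' + error.code);
    return;
  }
  console.log('Read ' + text.length + ' characters');
});
```

Inside a route handler, the failure branch would normally hand the error to Express with next(error) rather than log it.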

The callback parses the XML in the text variable using the parser's parseString method. The books variable then holds the array of book elements found under the bookstore root element.

parser.parseString(text, function (err, result) {
    if (err) {
        return next(err);
    }

    var books = result['bookstore']['book'];
    res.render('index', { books: books });
});

Finally, the index view template is rendered and the books variable is passed to it.
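
With its default options, xml2js wraps every child element in an array, even when the element occurs only once, which is why the lookups need [0]. The sketch below works on a hand-written object in that shape rather than calling the parser itself; toPlainBook is a hypothetical helper, not part of xml2js.

```javascript
// Hand-written sample in the shape xml2js produces with default options
// for the first <book> element of booksxml.xml.
var result = {
  bookstore: {
    book: [
      {
        title: [{ _: 'The C++ Programming Language', $: { lang: 'en' } }],
        author: ['Bjarne Stroustrup'],
        year: ['2003'],
        price: ['32.77'],
        category: ['Programming']
      }
    ]
  }
};

// Hypothetical helper: unwraps the single-item arrays into plain fields.
// Parsed values are strings, so year and price are converted explicitly.
function toPlainBook(book) {
  return {
    title: book.title[0]._,
    author: book.author[0],
    year: Number(book.year[0]),
    price: Number(book.price[0]),
    category: book.category[0]
  };
}

var books = result.bookstore.book.map(toPlainBook);
console.log(books[0].author); // Bjarne Stroustrup
```

Flattening the data in the route like this is a design choice: it keeps the [0] lookups out of the view, at the cost of one extra pass over the array.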

Display XML information to users

Open the style.css file in the public/stylesheets folder and add the following rules for the table, th and td elements.

table, td, th {
  border: 1px solid #ddd;
  text-align: left;
}

table {
  border-collapse: collapse;
  width: 100%;
}

th, td {
  padding: 15px;
}

Open index.pug in the views folder and add the following code to display the information from the XML file.

extends layout

block content
  h1 Parse XML using NodeJS
  p Books Information
  table
    tr
      th Title
      th Category
      th Author
      th Year
      th Price
      th Language
    each book, index in books
      tr
        td #{book['title'][0]['_']}
        td #{book['category'][0]}
        td #{book['author'][0]}
        td #{book['year'][0]}
        td #{book['price'][0]}
        td #{book['title'][0]['$']['lang']}

In the view file, a table is created with the headers Title, Category, Author, Year, Price and Language. Next, we loop through the books, displaying each book's title, category, author, year, price and language. Because the title element has both a text value and a lang attribute, it is parsed differently from the other elements.

<title lang="en">The C++ Programming Language</title>

To display the text of the title, the “_” key is used.

#{book['title'][0]['_']}

To display the lang attribute, the $ key is used together with the attribute name.

#{book['title'][0]['$']['lang']}
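
By default, xml2js stores an element's attributes under the $ key and its text under the _ key, while an element without attributes is stored as a plain string. Here is a small sketch of helpers that cope with both forms; textOf and attrOf are hypothetical names, and the sample values are hand-written in the default xml2js shape rather than produced by the parser.

```javascript
// Hypothetical helper: returns the text of a parsed element, whether it is
// a plain string (no attributes) or an object with "_" and "$" keys.
function textOf(node) {
  return typeof node === 'string' ? node : node._;
}

// Hypothetical helper: returns an attribute value, or undefined if absent.
function attrOf(node, name) {
  return node && node.$ ? node.$[name] : undefined;
}

var title = { _: 'Learning XML', $: { lang: 'en' } }; // <title lang="en">Learning XML</title>
var author = 'Erik T. Ray';                           // <author>Erik T. Ray</author>

console.log(textOf(title));          // Learning XML
console.log(attrOf(title, 'lang'));  // en
console.log(textOf(author));         // Erik T. Ray
console.log(attrOf(author, 'lang')); // undefined
```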

You can see the parsed XML result in the image below.

parse xml using nodejs

Summary

In this tutorial you learned how to parse XML using NodeJS and display the information to the user. You can download the source code for this tutorial.

Please leave your feedback and comments. Follow us on Twitter or subscribe to our newsletter to stay informed about upcoming tutorials and articles.

 



  • Robertson

    This is a very helpful tutorial; thanks for putting it together. I’m pretty new to Node.js, Express and xml2js but I was able to get your example app up and running with no problems. It’s only when I try to adapt it to consuming xml from the OCLC Classify api that I’m running into trouble. Do you have any advice on how to adapt your code to consume the xml below? Specifically I’m interested in the ‘sfa’ attribute of the ‘mostPopular’ field.

    185775347

    Dibdin, Michael

    thold desc
    0679442723

    • Hi Robertson,
      Thanks for your comment. We would definitely help. Will update you regarding this code to be parsed using xml2js later today.

      Regards,

    • Hi Robertson,

      You have to update your routes/index.js and views/index.pug code.

      Routes/index.js
      ==========================

      var xmlfile = __dirname + "/../public/xmlfiles/oclc.xml";
      fs.readFile(xmlfile, "utf-8", function (error, text) {
        if (error) {
          throw error;
        } else {
          parser.parseString(text, function (err, result) {
            var books = result['classify']['recommendations'];
            // Attributes of the first ddc/mostpopular element.
            var sfa = books[0]['ddc'][0]['mostpopular'][0]['$']['sfa'];
            var nsfa = books[0]['ddc'][0]['mostpopular'][0]['$']['nsfa'];
            var holdings = books[0]['ddc'][0]['mostpopular'][0]['$']['holdings'];
            console.log("SFA: " + sfa);
            console.log("NSFA: " + nsfa);
            console.log("Holdings: " + holdings);
            res.render('index', { title: 'Parse XML using NodeJS', books: books });
          });
        }
      });

      =============
      1. Path to file is changed to oclc.xml (Your provided XML copied to this file)
      2. Used Root element as result['classify']['recommendations']
      3. Further you can see the individual attributes for ddc/mostpopular tag.
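
      Note that xml2js keys are case-sensitive by default, so mostPopular and mostpopular are different keys, and a chained lookup throws a TypeError as soon as one key along the path is missing. Below is a minimal sketch of a defensive lookup, assuming the default xml2js output shape. dig is a hypothetical helper, and the sample object and its values are made up for illustration, not taken from the OCLC response.

      ```javascript
      // Hypothetical helper: walks a list of keys and returns undefined
      // as soon as one of them is missing, instead of throwing a TypeError.
      function dig(obj, keys) {
        return keys.reduce(function (node, key) {
          return node == null ? undefined : node[key];
        }, obj);
      }

      // Made-up object in the shape xml2js produces by default.
      var recommendations = [
        { ddc: [{ mostPopular: [{ $: { sfa: 'example-sfa' } }] }] }
      ];

      console.log(dig(recommendations, [0, 'ddc', 0, 'mostPopular', 0, '$', 'sfa'])); // example-sfa
      console.log(dig(recommendations, [0, 'ddc', 0, 'mostpopular', 0, '$', 'sfa'])); // undefined
      ```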

      ========== Views/index.pug========
      Now open index.pug file, update code like this.

      table
        tr
          th SFA
          th NSFA
          th Holdings
        each book, index in books
          tr
            td #{book['ddc'][0]['mostpopular'][0]['$']['sfa']}
            td #{book['ddc'][0]['mostpopular'][0]['$']['nsfa']}
            td #{book['ddc'][0]['mostpopular'][0]['$']['holdings']}

      This is self-explanatory: we are looping through the books array that was passed from the route.
      ==============================
      You can see the screenshot below, desired output is shown. You can modify code according to your requirement.
      https://uploads.disquscdn.com/images/9899922e90125eef30bc7e0e6e57d240ebdbf5f9084499f956e1fbf9d9cf2804.png

      Hope this helps. If you still face any issue, don’t hesitate to contact.

      Regards,

      • Robertson

        Wow thanks, that clears up some questions I had. Unfortunately, I’m still getting a very similar error as before. That is:


        events.js:183
        throw er; // Unhandled 'error' event
        ^

        TypeError: Cannot read property '0' of undefined
        at [...]/nodejs-parse-xml/routes/index.js:20:52

        Pointing to line 20–>

        var sfa = books[0]['ddc'][0]['mostpopular'][0]['$']['sfa'];

        For some reason xml2js isn't passing or registering the values from the xml file.

        • Hi, Thanks for your comment.
          Can you please paste your full code with XML and View file.

          Regards,

          • Robertson

            Absolutely. Check it out…

            index.js
            var express = require('express');
            var router = express.Router();
            var fs = require('fs');
            var xml2js = require('xml2js');
            var parser = new xml2js.Parser();

            /* GET home page. */
            router.get('/', function(req, res, next) {
              var xmlfile = __dirname + "/../public/xmlfiles/oclc.xml";
              fs.readFile(xmlfile, "utf-8", function(error, text) {
                if (error) {
                  throw error;
                } else {
                  parser.parseString(text, function(err, result) {
                    var books = result['classify']['recommendations'];
                    var sfa = books[0]['ddc'][0]['mostpopular'][0]['$']['sfa'];
                    var nsfa = books[0]['ddc'][0]['mostpopular'][0]['$']['nsfa'];
                    var holdings = books[0]['ddc'][0]['mostpopular'][0]['$']['holdings'];
                    console.log("SfA: " + sfa);
                    console.log("NSFA: " + nsfa);
                    console.log("Holdings: " + holdings);
                    res.render('index', {
                      title: 'Parse XML using NodeJS',
                      books: books
                    });
                  });
                }
              });
            });
            module.exports = router;

            oclc.xml


            185775347

            Dibdin, Michael

            thold desc
            0679442723

            index.pug
            extends layout

            block content
              h1= title
              p Welcome to #{title}
              table
                tr
                  th SFA
                  th NSFA
                  th Holdings
                each book, index in books
                  tr
                    td #{book['ddc'][0]['mostpopular'][0]['$']['sfa']}
                    td #{book['ddc'][0]['mostpopular'][0]['$']['nsfa']}
                    td #{book['ddc'][0]['mostpopular'][0]['$']['holdings']}

          • Thanks for posting the code. Let us have a look and will be back.

            Regards

          • Hi,

            Executed your code. It is working perfectly fine. XML file is parsed correctly. Please check if all dependencies are installed.

            See attached screen shot.
            If you still have issues, don’t hesitate to contact.

            Regards,

            https://uploads.disquscdn.com/images/d9941ebc15f1896be7b40cbd804a050dc00ccce412856577db83aac898b0788f.png