Embracing Open Government with Node.js and Microsoft SQL Server

On May 9, 2013, President Obama signed an executive order that made open and machine-readable data the new default for government information. Making information about government operations more readily available and useful is also core to the promise of a more efficient and transparent government.

Since then, a great deal of agency data has been made available on http://www.data.gov. Unfortunately, very little of it is well-suited for web developers who want to immediately start building apps. File formats vary considerably, with much of the data published in difficult-to-process formats such as HTML, CSV, PDF, and Microsoft Excel. Only a very small percentage is exposed through a dynamic, queryable, standards-based web-services interface.

Recently my company, Fig Leaf Software, was tasked by one of our government agency clients with publishing a searchable dataset, currently stored in Microsoft SQL Server, as a series of REST-based web services. The dataset is fairly large and diverse, consisting of 50+ tables. While the customer preferred an Adobe ColdFusion-based solution, we were given the opportunity to select the best technology to deliver the most scalable and cost-efficient solution possible. We also wanted to engineer a solution that could easily be rolled out to other government agencies facing the same challenges. Judging by what's going on at www.data.gov, there are a whole lot of them.

And for creating RESTful web-services APIs, that technology is Node.js.

Node.js is a JavaScript runtime built on Chrome's V8 JavaScript engine. Node.js uses an event-driven, non-blocking I/O model that makes it lightweight and efficient. Unlike Java, .NET, or PHP solutions, database requests in Node.js do not block thread execution, so it is much more efficient at handling a high volume of concurrent requests. The Express framework for Node makes it relatively easy to implement RESTful APIs.
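As a tiny illustration (ours, not from the original project), the sketch below shows the non-blocking model at work: the synchronous statements finish before the deferred callback runs, so nothing waits on the deferred work.

```javascript
// Minimal sketch of Node's event loop: synchronous code completes first,
// then the deferred callback runs on a later turn of the loop.
var order = [];
order.push('before');
setImmediate(function () {
    order.push('deferred'); // runs after the current call stack unwinds
});
order.push('after');
console.log(order.join(',')); // → before,after
```

A blocking database driver would stall at the I/O call; here, the main thread keeps going and the callback fires when the work completes.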

Of course, the real problem at hand was how to expose large, diverse SQL Server datasets without spending countless hours hand-coding and documenting table- and view-specific APIs.

Designing the Request Pattern

We decided to implement an API where each managed table or view could be accessed through the following URL pattern:

/api/entityname?filter=[filters]&sort=[sorters]&start=n&limit=n

  • entityname is the name of the table, view, or stored procedure to access in the database.
  • filter is an array of objects that have the following properties:
    • property : The field to query
    • value: The comparison value
    • operator: The comparison operator
  • sort is an array of objects that have the following properties:
    • property: The field to sort by
    • direction: the sort direction (either ASC or DESC)
  • start is the starting row to return. Defaults to 0.
  • limit is the total number of rows to return (limited to <= 500). Defaults to 20.

Only entityname and sort are required.

The following example url illustrates using multiple filters and sorters to retrieve the first 100 employee records whose last names start with the letter B, placed in sort order by state and city.

/api/employees?filter=[{"property":"lastname","value":"B","operator":"ge"},{"property":"lastname","value":"C","operator":"lt"}]&sort=[{"property":"state_code","direction":"asc"},{"property":"city","direction":"asc"}]&start=0&limit=100

This pattern is similar to the one used by the Sencha Ext JS JavaScript framework's REST Proxy.
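Client developers need to JSON-encode and URL-escape the filter and sort arrays. The following sketch (ours; the entity name comes from the example above) builds the sample request URL programmatically:

```javascript
// Build the example request URL so the JSON filter and sort arrays
// are properly URL-encoded.
var filter = [
    { property: 'lastname', value: 'B', operator: 'ge' },
    { property: 'lastname', value: 'C', operator: 'lt' }
];
var sort = [
    { property: 'state_code', direction: 'asc' },
    { property: 'city', direction: 'asc' }
];
var qs = new URLSearchParams({
    filter: JSON.stringify(filter),
    sort: JSON.stringify(sort),
    start: '0',
    limit: '100'
});
var url = '/api/employees?' + qs.toString();
console.log(url);
```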

The pattern also had to support an authorization code transmitted in the HTTP request header so that requests could be traced back to a specific developer, logged, and throttled to a maximum number of requests.

Designing the Response Pattern

Output from each of our REST services is a JSON object, as illustrated by the following snippet:

{
  data: [],
  success: true,
  total: n
}
  • data is an array of JavaScript objects that represent matching rows from the query result
  • success is a boolean value that indicates whether the query succeeded
  • total is an integer representing the total number of available records that match the filter conditions
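A consumer can sanity-check that envelope with a few lines of JavaScript. This is a sketch; the helper name is ours:

```javascript
// Minimal validator for the response envelope described above.
function isValidResponse(body) {
    return Array.isArray(body.data) &&
           typeof body.success === 'boolean' &&
           Number.isInteger(body.total);
}

var sample = { data: [{ id: 1 }], success: true, total: 42 };
console.log(isValidResponse(sample)); // → true
```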

Defining an Application Config File

The Node.js app has to connect to a SQL database and mail server, both of which require credentials that we wanted to keep separate from our application source code. In addition, we also needed to indicate to our app the specific tables and views that would be queryable, since not every data table or view in our database was going to have a public API.

To support these requirements, we used the Node.js config module to read application configuration properties from the static JSON file illustrated below.

{
  "clientName" : "My .gov client",

 "database" : {
    "server": "127.0.0.1",
    "database": "sampledatabase",
    "user": "sdrucker",
    "password": "mypassword",
    "port": 1433,
    "options": {
        "instanceName": "SQLEXPRESS"
    }
 },

  "smtp" : {
    "port": 465,
    "host": "email-smtp.us-east-1.amazonaws.com",
    "auth" : {
      "user" : "[insert credentials here]",
      "pass" : "[insert credentials here]"
    },
    "secure": true,
    "maxConnections": 5,
    "maxMessages": 10,
    "rateLimit": 5
  },

  "entities" : {
    "campgrounds": {
      "type": "view",
      "obj": "campgrounds",
      "label": "Campgrounds",
      "description": "Information about campgrounds"
    },
    "visitor_center": {
      "type": "table",
      "obj": "visitor_center",
      "label": "Visitor Centers",
      "description": "All visitor centers"
    }
  }
}

Note: Storing unencrypted credentials in a text file is definitely not a best practice. I’ll discuss deployment/clustering/security issues with Node.js in my next blog post.
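Because the entities block drives everything else, it is worth validating at startup. The sketch below is ours: the inline object stands in for the parsed config file, and the set of supported types is an assumption based on the entries shown above.

```javascript
// Validate the entities block at startup so a typo in the config file
// fails fast rather than at request time.
var appConfig = {
    entities: {
        campgrounds: { type: 'view', obj: 'campgrounds' },
        visitor_center: { type: 'table', obj: 'visitor_center' }
    }
};

var supported = ['table', 'view'];
Object.keys(appConfig.entities).forEach(function (name) {
    var type = appConfig.entities[name].type;
    if (supported.indexOf(type) === -1) {
        throw new Error('Entity "' + name + '" has unsupported type "' + type + '"');
    }
});
console.log('config OK');
```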

Implementing the Request Pattern in a Node.js/Express.js Route

Express.js routers make it easy to handle dynamic routes and parse arguments from the URL. Our main web service route performs the following tasks:

  • Verifies that the requested table/view is in a list of supported entities.
  • Parses the table/view name from the URL
  • Verifies that a valid authorization code was transmitted over the http header
  • Logs the request to a database table
  • Validates and parses the start, limit, filter, and sort URL parameters

var express = require('express');
var router = express.Router();
var config = require('config');
var db = require('./../database');

var isValidEntity = function (entityName) {
    var entities = config.get('entities');
    if (entities[entityName]) {
        return true;
    } else {
        return false;
    }
}

//
// set tablename parameter and check authorization
//
router.param('entityName', function (req, res, next, id) {
    var isValid = true;

    if (!isValidEntity(req.params.entityName)) {
        isValid = false;
        var err = new Error('Not Found - Invalid Entity Name');
        res.status(404);
        res.render('error', {
            message: 'You did not transmit a valid entity name',
            error: '404-Not Found'
        });
    }

    // check authorization code
    var authorization = req.headers.authorization;
    if (!authorization) {
        isValid = false;
        var err = new Error('Invalid Authorization Key');
        res.status(401);
        res.render('error', {
            message: 'You did not transmit a valid authorization key',
            error: '401-Unauthorized'
        });
    }

    if (isValid) {
        // lookup authorization id
        db.query('api_users', {userid: authorization}, 'userid', 0, 1, function (err, results) {
            if (!err && results && results.data.length == 1) {
                db.executeStoredProcedure(
                    'api_log_insert',
                    {
                        "userid" : authorization,
                        "request" : decodeURI(req.originalUrl),
                        "ipaddress" : req._remoteAddress
                    } ,
                    function(err,data) {
                        // reserved for future use
                    }
                );
                next();
            } else {
                res.render('error', {
                    message: 'The authorization key that you transmitted is invalid',
                    error: '401-Unauthorized'
                });
            }
        })
    }

});

//
// Parse/validate URL parameters, build the dynamic SQL, and transmit results
// in Javascript Object Notation (JSON) format
//
router.get('/:entityName', function (req, res, next) {

    var entityName = req.params.entityName;
    var params = req.query;
    var startrow, maxrows, orderby;
    var entityDetails = config.get('entities')[entityName];

    // set defaults for pagination
    if (!params.start) {
        startrow = 0;
    } else {
        startrow = parseInt(params.start);
        delete params.start;
    }
    if (!params.limit) {
        maxrows = 20;
    } else {
        maxrows = parseInt(params.limit);
        delete params.limit;
    }

    // set max requested data to 500 rows
    if (maxrows > 500) {
        maxrows = 500;
    }

    if (params.orderby) {
        orderby = params.orderby;
        delete params.orderby;
    }

    if (entityDetails.type == 'table' || entityDetails.type == 'view') {

        if (params.filter) {

            var filter = JSON.parse(params.filter);
            if (params.sort) {
                var sorters = JSON.parse(params.sort);
            } else {
                var sorters = null;
            }

            db.advancedQuery(entityName, filter, sorters, startrow, maxrows, function (err, data) {
                res.json(data);
            });

        } else { // no filter, use "simple" mode which accepts colName=value params for searching

            if (!orderby) {
                res.status(500);
                res.render('error', {
                    message: 'You did not transmit an orderby property.',
                    error: '500-"Simple Mode" requires you to specify an <b>orderby</b> property'
                });
            } else {
                db.query(entityName, params, orderby, startrow, maxrows, function (err, data) {
                    res.json(data);
                })
            }

        }
    } else {
        // future support for stored procedures
    }

});

module.exports = router;

Creating an Abstraction Layer for Querying Microsoft SQL Server

The next step in building a one-size-fits-all dynamic API was to create a database abstraction class that would simplify using the Node.js mssql package. This class performs the following tasks:

  • Connects to the SQL Server database using the credentials stored in the config file
  • Handles stored procedure calls, which are used by the auto-documentation routine
  • Dynamically generates a SQL prepared statement based on the view/table name, filters, sort, start, and limit input parameters
  • Dynamically generates a second query to get a count of all available records, given the specified filter conditions
  • Executes multiple SQL Server prepared statements as a single database transaction


Abstracting out data requests into a separate class file also provides us with the important benefit of being able to hot-swap database connectivity if our clients need us to support other DBMS like ORACLE or MySQL.
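To make such a hot-swap safe, any replacement adapter would need to expose the same surface. The sketch below is ours: the function names mirror this module's exports, the assertion helper is hypothetical, and a MySQL adapter is just an example of a swap-in.

```javascript
// Sketch: the minimal contract a swap-in database adapter must satisfy.
function assertAdapter(adapter) {
    ['query', 'advancedQuery', 'executeStoredProcedure'].forEach(function (fn) {
        if (typeof adapter[fn] !== 'function') {
            throw new Error('adapter is missing ' + fn + '()');
        }
    });
    return true;
}

// A stub standing in for require('./database')
var mssqlAdapter = {
    query: function () {},
    advancedQuery: function () {},
    executeStoredProcedure: function () {}
};
console.log(assertAdapter(mssqlAdapter)); // → true
```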

var sql = require('mssql');
var config = require('config');
var dbConfig = config.get('database');

// create connection pool
var dbConn = new sql.Connection(dbConfig);
dbConn.connect(function (err) {
    if (err) {
        console.log('Error connecting to db', err);
    }
});

/**
 * Execute a prepared statement
 * @param ps - the prepared statement
 * @param sqlCmd - the sql command
 * @param values - object - bound values
 * @param callback - callback function
 */

var executePreparedStatement = function (ps, sqlCmd, values, callback) {

    ps.prepare(sqlCmd, function (err) {
        if (err) {
            console.log('prepare stmt error', err);
        } else {
            ps.execute(values, function (err, data) {
                if (err) {
                    console.log('sql execution error', err);
                } else {
                    if (callback) {
                        callback(err, data);
                    }
                    ps.unprepare(function (err) {
                        if (err) {
                            console.log('ps unprepare error', err);
                        }
                    });
                }
            });
        }
    });

}

/**
 * Execute a stored procedure
 *
 * @param procedureName
 * @param values - array of name/value pairs to send as arguments to the stored proc
 * @param callback - callback function to receive results from run
 *
*/

var executeStoredProcedure = function (procedureName, values, callback) {

    var request = new sql.Request(dbConn);
    for (var colName in values) {
        var value = values[colName];
        switch (typeof value) {
            case 'string':
                request.input(colName, sql.VarChar(), value);
                break;
            case 'number' :
                request.input(colName, sql.Int, value);
                break;
        }
    }
    request.execute(procedureName, function (err, recordsets, returnValue) {
        callback(err, recordsets, returnValue);
    });

}

/**
 * Returns javascript object containing paginated results of query, total # of matching records
 *
 * @param tablename
 * @param values
 * @param orderby
 * @param startrow
 * @param maxrows
 * @param callback
 */

var query = function (tablename, values, orderby, startrow, maxrows, callback) {

    var ps = new sql.PreparedStatement(dbConn);
    var filterSql = "";
    var tableName = tablename.split(' ')[0];

    // paginate data
    var sqlCmd = "".concat(
        "select * from ",
        "( select ROW_NUMBER() OVER ( ORDER BY " + orderby.split(' ')[0] + ") as RowNum, *",
        " from " + tableName,
        " where 0=0 "
    );

    for (var colName in values) {
        var value = values[colName];
        if (parseInt(value) == value) {
            value = parseInt(value);
        }
        switch (typeof value) {
            case 'string':
                ps.input(colName, sql.VarChar());
                values[colName] += '%';
                filterSql += ' and ' + colName + ' LIKE @' + colName;
                break;
            case 'number' :
                ps.input(colName, sql.Int);
                filterSql += ' and ' + colName + ' = @' + colName;
                break;
        }
    }
    sqlCmd += filterSql + " ) AS RowConstrainedResult ";
    sqlCmd += " WHERE RowNum >= @startrow and RowNum < @endrow";
    sqlCmd += " ORDER by RowNum; ";

    sqlCmd+= ''.concat(
        "select count(*) as thecount ",
        "from " + tableName + " ",
        "where 0=0 " + filterSql
    );

    ps.input('startrow', sql.Int);
    ps.input('endrow', sql.Int);
    values.startrow = startrow + 1;
    values.endrow = values.startrow + maxrows;

    ps.multiple = true; // enable multiple recordsets to be returned

    executePreparedStatement(ps, sqlCmd, values, function(err,results) {

        callback(err, {
            data : results[0],
            success: true,
            total: results[1][0].thecount
        })
    });

}

/**
 * Create a dynamic query against a table or view
 * @param tablename
 * @param filters - array of objects. Each object has the keys property/value/operator
 * @param sorters - array of objects. Each object has the keys property/direction
 * @param startrow
 * @param maxrows
 * @param callback
 */

var advancedQuery = function (tablename, filters, sorters, startrow, maxrows, callback) {

    var ps = new sql.PreparedStatement(dbConn);
    var thisFilter = {};
    var values = {}, colName = '';
    var op = "";
    var isNumber = false;
    var filterSql = "";
    var tableName = tablename.split(' ')[0]; // no sql injection

    // paginate data
    var sqlCmd = "select * from ";
    sqlCmd += "( select ROW_NUMBER() OVER ( ORDER BY " ;

    // handle sorters array
    // sorters are encoded as an array of javascript objects
    // [{property: 'lastname', direction: 'ASC'}, {property: 'firstname', direction: 'ASC'}]

    // guard against a missing sort parameter (ROW_NUMBER() requires an ORDER BY)
    if (!sorters || sorters.length === 0) {
        sqlCmd += "(select null)";
    } else {
        for (var i = 0; i < sorters.length; i++) {
            if (i > 0) {
                sqlCmd += ", ";
            }
            sqlCmd += sorters[i].property.split(' ')[0] + " ";
            if (!sorters[i].direction) {
                sqlCmd += " ASC ";
            } else {
                sorters[i].direction = sorters[i].direction.toUpperCase();
                if (sorters[i].direction != 'ASC' && sorters[i].direction != 'DESC') {
                    sqlCmd += " ASC ";
                } else {
                    sqlCmd += " " + sorters[i].direction;
                }
            }
        }
    }

    sqlCmd+= ") as RowNum, *";
    sqlCmd += " from " + tableName + " where 0=0 ";

    // append additional WHERE conditions
    for (var i = 0; i < filters.length; i++) {
        thisFilter = filters[i];
        isNumber = (thisFilter.value.indexOf(' ') == -1 && !isNaN(parseInt(thisFilter.value)));
        colName = thisFilter.property;
        switch (thisFilter.operator) {
            case 'lt' :
                op = ' < ';
                break;
            case 'gt' :
                op = ' > ';
                break;
            case 'eq' :
                op = ' = ';
                break;
            case 'le' :
                op = ' <= ';
                break;
            case 'ge' :
                op = ' >= ';
                break;
            default :
                if (isNumber) {
                    op = ' = ';
                } else {
                    op = ' like ';
                    thisFilter.value += '%';
                }
                break;
        }
        filterSql += ' and ' + colName + op + '@' + colName;
        if (isNumber) {
            ps.input(colName, sql.Int);
            values[thisFilter.property] = parseInt(thisFilter.value);
        } else {
            ps.input(colName, sql.VarChar());
            values[thisFilter.property] = thisFilter.value;
        }
    }

    sqlCmd += filterSql + " ) AS RowConstrainedResult ";
    sqlCmd += " WHERE RowNum >= @startrow and RowNum < @endrow";
    sqlCmd += " ORDER by RowNum; ";

    // handle pagination
    ps.input('startrow', sql.Int);
    ps.input('endrow', sql.Int);
    values.startrow = startrow + 1;
    values.endrow = values.startrow + maxrows;

    // get count of all available records
    sqlCmd+= ''.concat(
        "select count(*) as thecount ",
        "from " + tableName + " ",
        "where 0=0 " + filterSql
    );

    // execute the prepared statement

    ps.multiple = true; // enable multiple recordsets to be returned

    executePreparedStatement(ps, sqlCmd, values, function(err,results) {

        callback(err, {
            data : results[0],
            success: true,
            total: results[1][0].thecount
        })

    });

}

module.exports = {
    advancedQuery: advancedQuery,
    query: query,
    executeStoredProcedure: executeStoredProcedure
};
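The pagination arithmetic in query() and advancedQuery() deserves a note: the API's 0-based start/limit parameters map onto the 1-based, half-open ROW_NUMBER() window. The helper below is ours, written only to make that mapping explicit:

```javascript
// How the 0-based start/limit URL parameters map onto the 1-based
// ROW_NUMBER() window used by the prepared statements above.
function rowWindow(start, limit) {
    var startrow = start + 1;       // ROW_NUMBER() numbering begins at 1
    var endrow = startrow + limit;  // exclusive bound: RowNum < @endrow
    return { startrow: startrow, endrow: endrow };
}

console.log(rowWindow(0, 25));  // → { startrow: 1, endrow: 26 }
console.log(rowWindow(50, 25)); // → { startrow: 51, endrow: 76 }
```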

Automatically Generating Developer Documentation

Perhaps the most important feature of any API is the documentation that accompanies it. Without adequate docs, developers will struggle to use your system and will generate help-desk requests that add support expense. Using Microsoft SQL Server's system stored procedures, we were able to automatically generate developer docs for each entity available through our REST API. Our /docs route uses SQL Server's sp_columns procedure to retrieve information about the columns available in the exposed table/view. It also makes a sample call to our API, sending both blocks of information to a Jade template for output:


var express = require('express');
var router = express.Router();
var config = require('config');
var db = require('./../database');

var isValidEntity = function (entityName) {
    var entities = config.get('entities');
    if (entities[entityName]) {
        return true;
    } else {
        return false;
    }
}

// set tablename parameter and check authorization

router.param('entityName', function (req, res, next, id) {

    var isValid = true;

    if (!isValidEntity(req.params.entityName)) {
        isValid = false;
        var err = new Error('Not Found - Invalid Entity Name');
        res.status(404);
        res.render('error', {
            message: 'You did not transmit a valid entity name',
            error: '404-Not Found'
        });
    } else {
        next();
    }

});

//
// url: /docs/entityname
//
router.get('/:entityName', function (req, res, next) {

    var entityName = req.params.entityName;
    var entities = config.get('entities');

    if (entities[entityName].type == 'table' || entities[entityName].type=='view') {

        db.executeStoredProcedure('sp_columns', {table_name: entityName}, function (err, data) {

            var sampleCall = "/api/" + entityName + "?orderby=" + data[0][0].COLUMN_NAME + '&start=0&limit=1';

            // run a sample SELECT query, retrieving a single row
            db.query(entityName, {}, data[0][0].COLUMN_NAME, 0, 1, function (err, sampleData) {

                // send the query results and metadata to a Jade template for output.

                res.render('entitydocs', {
                    title: config.get('clientName') + ' API: ' + entityName,
                    sampleCall: sampleCall,
                    columns: data[0],
                    sampleOutput: JSON.stringify(sampleData, null, 4)
                });

            });

        });
    } else {
        // todo: expose stored procedures
    }
});

module.exports = router;

The Jade template gives developers a sample API call and displays sample results from running a query:


Automatically Generating Unit Tests with POSTMAN

Postman is a free Google Chrome application that makes it easy to test your APIs. Postman saves its URL collection data in an unencrypted .json file. This means that you can use Node.js not only to dynamically create your RESTful APIs in minutes, but also to automatically generate unit tests and additional developer documentation!
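Since the collection file is plain JSON, a short Node.js script can emit one request entry per configured entity. The sketch below is ours, and the fields shown are simplified assumptions; verify the exact shape against a collection exported from your own copy of Postman before relying on it:

```javascript
// Generate one Postman-style request entry per entity in the config.
var entities = {
    campgrounds: { label: 'Campgrounds' },
    visitor_center: { label: 'Visitor Centers' }
};

var requests = Object.keys(entities).map(function (name) {
    return {
        name: entities[name].label,
        method: 'GET',
        url: '/api/' + name + '?start=0&limit=1'
    };
});

console.log(JSON.stringify(requests, null, 2));
```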

Let Fig Leaf Software Help you Quickly and Painlessly Fulfill the Promise of Open Government
