<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:dc="http://purl.org/dc/elements/1.1/" version="2.0"><channel><atom:link rel="hub" href="http://tumblr.superfeedr.com/" xmlns:atom="http://www.w3.org/2005/Atom"/><description>A blog about software projects and technologies that I’m involved with or interest me.</description><title>sean mcdaniel</title><generator>Tumblr (3.0; @seanmcdaniel)</generator><link>http://seanmcdaniel.tumblr.com/</link><item><title>Drying Out Your Crack</title><description>&lt;h2&gt;A Little Powder Goes a Long Way&lt;/h2&gt;

&lt;p&gt;Look, I spend a lot of time commuting a couple of times a week into downtown DC via the &amp;#8216;fine&amp;#8217; public transportation system here.  There is no doubt that during these hot summer months I end up standing &amp;#8216;nuts to butt&amp;#8217; with my fellow commuters in a metro car packed like sardines, sweating buckets.  It&amp;#8217;s an uncomfortable feeling, to put it mildly.  Which brings me to my point, and that is: sometimes code imitates life.  To be honest, I spend the majority of my commute reviewing code, which is oftentimes my own.&lt;/p&gt;

&lt;p&gt;&lt;img src="http://sentuhanbayu.biz/product_images/s/bb_powder__13639.jpg" alt=""/&gt;&lt;/p&gt;

&lt;h2&gt;On to the Real Show&lt;/h2&gt;

&lt;p&gt;I know, I know, lame blog bait.  Ok, fair enough.  But I promise the title really makes sense within the context of what I want to talk about.  So I&amp;#8217;m on the train today reviewing some code I&amp;#8217;ve been working on here and there over the last week.  It&amp;#8217;s a Ruby gem that provides an abstraction over a REST API provided by my current client as part of the data.gov initiative.  This particular API uses an XMsmeLs, also known as XML, protocol for interacting with its resources.  Needing a way to parse and deal with XML, I took a look around at the available gems in the wild.  Nokogiri looked awesome, but it was more than what I needed.  I ended up using the &lt;a href="https://github.com/jnunemaker/crack" target="_blank"&gt;Crack&lt;/a&gt; gem which, from what I gathered, had a lot of fans in the Ruby community.&lt;/p&gt;

&lt;h2&gt;Crack&lt;/h2&gt;

&lt;p&gt;Crack is an XML and JSON parsing gem.  Actually, it is a simple XML to Hash marshaling framework that is shamelessly a copy and paste job from source found in both Merb and Rails.  The gist is: feed Crack an XML document and it provides you with a Hash representation of that document.  This is great, because a Hash can easily and quite naturally represent hierarchical data.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;require 'crack'

xml = Crack::XML.parse("&amp;lt;foo&amp;gt;&amp;lt;bar&amp;gt;&amp;lt;baz&amp;gt;contents&amp;lt;/baz&amp;gt;&amp;lt;/bar&amp;gt;&amp;lt;/foo&amp;gt;")
 =&amp;gt; {"foo"=&amp;gt;{"bar"=&amp;gt;{"baz"=&amp;gt;"contents"}}}

xml['foo']['bar']['baz']
 =&amp;gt; "contents"
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Awesome, this is a straightforward and easy way to deal with (dreadful) XML.  So I moved forward with it and wrote a lot of code using this nested Hash key style even though it always made me feel a little &amp;#8216;dirty&amp;#8217;.  This morning at 6:30am, while on the train with 300 of my closest friends, I revisited this code and had the same &amp;#8216;dirty&amp;#8217; feeling.  So I decided to do something about it.&lt;/p&gt;

&lt;p&gt;My first inclination was that I needed a nice way to navigate the document using attribute-like &amp;#8216;.&amp;#8217; notation.  Attribute access just feels better since there is less syntactic noise and it is just as expressive as doing Hash navigation. So I knew what I wanted about halfway through my 1&amp;#160;1/2 hour-long commute.  Unfortunately, the data service was pretty spotty because the train snakes its way underground into the nation&amp;#8217;s capital.  Luckily, as the train pulled into Metro Center Station, I got a signal and hit the Googs with a search for &amp;#8220;recursive openstruct&amp;#8221;.  Bam! Once again Google provided me with exactly what I was looking for to &amp;#8220;dry&amp;#8221; out this Crack.&lt;/p&gt;

&lt;h2&gt;&amp;#8220;Drying Out Your Crack&amp;#8221; with Recursive Open Struct&lt;/h2&gt;

&lt;p&gt;Ruby&amp;#8217;s stdlib OpenStruct class gives a nice abstraction (syntactic sweetener) over Hashes.  Its weakness is that it only wraps the top level of a Hash; nested Hashes stay plain Hashes.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;require 'ostruct'

better = OpenStruct.new({"foo"=&amp;gt;{"bar"=&amp;gt;{"baz"=&amp;gt;"contents"}}})
 =&amp;gt; #&amp;lt;OpenStruct foo={"bar"=&amp;gt;{"baz"=&amp;gt;"contents"}}&amp;gt;

better.foo
 =&amp;gt; {"bar"=&amp;gt;{"baz"=&amp;gt;"contents"}}
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Better, but foo is a Hash, and therefore I obviously can&amp;#8217;t navigate to &lt;code&gt;better.foo.bar&lt;/code&gt;, which is the goal.  Enter the &lt;a href="https://github.com/aetherknight/recursive-open-struct" target="_blank"&gt;recursive open struct&lt;/a&gt; gem/class, which does exactly what the name suggests.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;require 'recursive-open-struct'

mo_better = RecursiveOpenStruct.new({"foo"=&amp;gt;{"bar"=&amp;gt;{"baz"=&amp;gt;"contents"}}})
 =&amp;gt; #&amp;lt;RecursiveOpenStruct foo={"bar"=&amp;gt;{"baz"=&amp;gt;"contents"}}&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;It looks similar, but thanks to Ruby&amp;#8217;s ability to dynamically define methods, this now works:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;mo_better.foo.bar.baz
 =&amp;gt; "contents"
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Now that is &amp;#8220;Mo Better&amp;#8221;.  Mission accomplished.&lt;/p&gt;

&lt;h2&gt;Summary&lt;/h2&gt;

&lt;p&gt;I wouldn&amp;#8217;t use Crack if the XML documents that I was dealing with were large and required XPath to navigate effectively.  Nokogiri would be a much more appropriate tool to use.  But for my particular case, the documents that I had to deal with made Crack a nice fit.  I&amp;#8217;m not sure if there is a rule about Hash depth, but to me Hash navigation deeper than two levels starts to make the code smell.  RecursiveOpenStruct gave me the syntactic style I was looking for and dried out my Crack quite well.  Now I need to find some talcum powder to dry out my other problem.&lt;/p&gt;</description><link>http://seanmcdaniel.tumblr.com/post/7418616282</link><guid>http://seanmcdaniel.tumblr.com/post/7418616282</guid><pubDate>Sat, 09 Jul 2011 10:41:57 -0400</pubDate><category>xml</category><category>rest</category><category>ruby</category><category>crack</category></item><item><title>NodeJS - From "Yea Right" to "Hell Ya"</title><description>&lt;p&gt;&lt;img src="http://news.cnet.com/i/tim/2011/04/19/node.js-logo_270x71.png" alt=" "/&gt;&lt;/p&gt;
&lt;h2&gt;Yea Right&lt;/h2&gt;
&lt;p&gt;If someone had told me 2 years ago that I would be implementing server-side logic in Javascript, I would have laughed him out of the room.  It was about a year ago that I came across an article on a framework called node.js.  At the time, the idea of building networked servers/services in Javascript seemed very foreign to me, but I remember the sense of excitement I felt at the thought of it.  When I tried to explain to my colleagues what node was and its potential to solve some of our specific problems, they gave me a dumbass look followed by a resounding ‘Yea right’.&lt;/p&gt;
&lt;p&gt;&lt;img src="http://farm6.static.flickr.com/5221/5647923966_0cf0428c6b.jpg" alt=" "/&gt;&lt;/p&gt;
&lt;p&gt;What follows is a very high level view of a few of the pieces in the node puzzle, but this should be enough to get you inspired to craft some server-side Javascript.&lt;/p&gt;
&lt;h2&gt;Non-blocking I/O&lt;/h2&gt;
&lt;p&gt;Node is a set of asynchronous libraries that are built on top of Google’s high performance &lt;a target="_blank" href="http://code.google.com/p/v8/"&gt;V8&lt;/a&gt; Javascript engine. The asynchronous nature comes from the fact that it takes a non-blocking approach to all I/O.  In traditional runtime environments, threads are used as a means to support concurrency and scale.  In a threaded model I/O is a wait operation.  This simply means that operations required to access an external resource, a database request for instance, will result in the thread waiting until the I/O completes.  Oftentimes, from my experience at least, the majority of the time spent processing client requests is spent waiting on I/O.&lt;/p&gt;
&lt;p&gt;Node works quite differently using a single-threaded/event loop design.  “In node everything runs in parallel except your code”. Yep, that’s right.  The best way to explain this is to use an analogy.  Let’s say your code is the manager for a courier service and node is the manager’s fleet of couriers.  When the day starts, the manager assigns jobs to the couriers and the couriers go to work delivering packages throughout the city.  Once all the work has been assigned, the manager decides it’s time for a little nap.  After a short snooze, the manager is awakened by a knock at the door. His couriers have just begun to return after completing their tasks and are lining up at the door.  The manager lets each courier in one at a time and gathers the relevant paperwork associated with the job.  As each courier leaves the manager’s office, he closes the door and the manager tries to get back to his nap.  Unfortunately for the manager, as long as there are couriers lined up outside of his door, the knocking will continue.  This flow continues throughout the day until there is no more work left and all the couriers have reported back with the necessary paperwork from their jobs.&lt;/p&gt;
&lt;p&gt;While this is a contrived example, it should give you a general sense of how things work in node. Your code issues async function calls to the node API, and node calls back (knocks) as results become available.  There are a couple of interesting things to note about the node runtime environment.  First, when the application code is running, it is the only thing running in the event loop (it’s single threaded, remember), and it will run until there is no more logic to execute.  Only then does the event loop check for the presence of async call results and call back to the application code if one is available.  One interesting behavior to be aware of is that the order in which the callbacks are received by the application code may or may not be the order in which the async operations completed.  This detail will/should not have any observable side effects, as your application code ‘should’ run blazingly fast. (Stay away from CPU intensive operations in your script.)&lt;/p&gt;
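&lt;p&gt;Here is a rough sketch of the courier story in code.  It is only an illustration: the &lt;code&gt;events&lt;/code&gt; array and the choice of listing the current directory are my own stand-ins, not anything node requires.  The point is just that the callback cannot run until the script itself has run out of things to do.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;var fs = require('fs');
var events = [];

// Assign a job to a courier: node lists the current directory in the background.
fs.readdir(process.cwd(), function (err, files) {
  if (err) throw err;
  // The knock at the door: this runs only after the code below has finished.
  events.push('callback');
  console.log(events.join(', '));
});

// The manager (your code) carries on with the rest of the script.
events.push('assigned');
events.push('napping');
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Run as a script, this should print &lt;code&gt;assigned, napping, callback&lt;/code&gt;, even though the call to &lt;code&gt;fs.readdir&lt;/code&gt; appears first in the file.&lt;/p&gt;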
&lt;p&gt;&lt;a target="_blank" href="http://roflscale.com/"&gt;Intermission break =&amp;gt; roflscale&lt;/a&gt;&lt;/p&gt;
&lt;h2&gt;CommonJS &amp;amp; Node Package Manager (NPM)&lt;/h2&gt;
&lt;p&gt;The node environment includes a module system based on &lt;a target="_blank" href="http://www.commonjs.org/"&gt;CommonJS&lt;/a&gt;.  The CommonJS initiative was started in order to address many of the needs that come with using Javascript as a language outside of the browser: platform-like things such as a module system, file system access, binary I/O, process management, sockets, etc. The important thing for this context is the module system, which allows us to load other modules via the require() function that node makes available to every module.  It will become more apparent in the code examples later.&lt;/p&gt;
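&lt;p&gt;As a minimal sketch of what that looks like, assuming a made-up helper called &lt;code&gt;extension&lt;/code&gt; (the name is hypothetical, the &lt;code&gt;path&lt;/code&gt; module is part of node core):&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;// Core modules and 3rd party modules are both loaded with require().
var path = require('path');

// A hypothetical helper this module wants to share.
function extension(file) {
  return path.extname(file);
}

// Whatever is assigned to module.exports is what require() hands to callers.
module.exports = { extension: extension };

console.log(extension('report.pdf'));
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Running the file prints &lt;code&gt;.pdf&lt;/code&gt;.  Another script could pull the helper in with require() and the relative path to this file.&lt;/p&gt;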
&lt;p&gt;&lt;a target="_blank" href="http://npmjs.org/"&gt;NPM&lt;/a&gt; is to node what RubyGems is to Ruby or Maven is to Java (yuck btw).  You can use it to publish your node modules for others to consume but you will primarily use it to fetch 3rd party packages for use in your scripts.&lt;/p&gt;
&lt;h2&gt;Conventions &amp;amp; Patterns&lt;/h2&gt;
&lt;p&gt;The programming model can be summed up as ‘callback driven’. This shouldn’t be a big surprise, it’s just Javascript.  A large majority of your code will exist in callback functions.&lt;/p&gt;
&lt;h3&gt;Comparison between sync and async programming paradigms.&lt;/h3&gt;
&lt;h4&gt;Sync&lt;/h4&gt;
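&lt;pre&gt;&lt;code&gt;// 'do' is a reserved word in Javascript, so the function is named doWork.
function doWork(something) {
  if (!something) throw new Error('nothing to do!');
  return 'done';
}

try {
  var result = doWork('work');
  console.log(result);
} catch (err) {
  // handle the error here, or rethrow it
  throw err;
}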
&lt;/code&gt;&lt;/pre&gt;
&lt;h4&gt;Async&lt;/h4&gt;
&lt;pre&gt;&lt;code&gt;function doWork(something, callback) {
  // return here so the callback is never invoked twice
  if (!something) return callback(new Error('nothing to do!'));
  return callback(null, 'done');
}

doWork('work', function(err, result) {
  if (err) throw err;
  console.log(result);
});
&lt;/code&gt;&lt;/pre&gt;
&lt;h3&gt;Patterns&lt;/h3&gt;
&lt;h4&gt;Callback parameter&lt;/h4&gt;
&lt;p&gt;The convention in node is to pass the callback function as the last parameter to an async function.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;function doAsync(something, callback) {
  callback(null, 'done');
}

doAsync('work', function(err, result) {
  if (err) throw err;
  console.log(result);
});
&lt;/code&gt;&lt;/pre&gt;
&lt;h4&gt;Callback result&lt;/h4&gt;
&lt;p&gt;The convention in node is to pass the result to the callback as the 2nd argument.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;function doAsync(something, callback) {
  ...
  callback(null, result);
}
&lt;/code&gt;&lt;/pre&gt;
&lt;h4&gt;Error Propagation aka Errorback&lt;/h4&gt;
&lt;p&gt;The convention in node is to pass the callback function an error as the first argument or null if an error wasn’t encountered.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;function doAsync(something, callback) {
  if (!something) return callback(new Error('nothing to do!'));
  ...
}
&lt;/code&gt;&lt;/pre&gt;
&lt;h2&gt;Code using node - taken from the node docs&lt;/h2&gt;
&lt;h4&gt;Filesystem&lt;/h4&gt;
&lt;pre&gt;&lt;code&gt;var fs = require('fs');

fs.readFile('/etc/passwd', function (err, data) {
  if (err) throw err;
  console.log(data);
});
&lt;/code&gt;&lt;/pre&gt;
&lt;h4&gt;Http&lt;/h4&gt;
&lt;pre&gt;&lt;code&gt;var http = require('http');

http.createServer(function (req, res) {
  res.writeHead(200, {'Content-Type': 'text/plain'});
  res.end('Hello World\n');
}).listen(1337, "127.0.0.1");

console.log('Server running at http://127.0.0.1:1337/');
&lt;/code&gt;&lt;/pre&gt;
&lt;h2&gt;Sweet Spot&lt;/h2&gt;
&lt;p&gt;The ecosystem around node is pretty incredible and there are an amazing number of 3rd party modules in the NPM repo. It seems that just about everything under the moon is being developed or ported for the node platform.  I cut my teeth writing a &lt;a target="_blank" href="https://github.com/steelThread/connect-jsonp"&gt;JSONP&lt;/a&gt; middleware module for the &lt;a target="_blank" href="http://senchalabs.github.com/connect/"&gt;connect&lt;/a&gt; framework which is to node what &lt;a target="_blank" href="http://rack.rubyforge.org/"&gt;rack&lt;/a&gt; is to Ruby.  Connect is the foundation for the popular &lt;a target="_blank" href="http://expressjs.com/"&gt;Express&lt;/a&gt; web application framework.  While Express is pretty solid, I’m not convinced that I would write web applications on top of node.  I don’t see the advantage over mature web frameworks.&lt;/p&gt;
&lt;p&gt;Another area where I wouldn’t use node is for CPU intensive logic.  This actually goes against what node is trying to achieve: efficiency through offloading latency intensive operations outside of the process / event loop.  If each iteration of the event loop spends a considerable amount of time performing application logic, the benefit of non-blocking I/O gets negated.&lt;/p&gt;
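&lt;p&gt;To see why, here is a small sketch.  The &lt;code&gt;crunch&lt;/code&gt; function is a made-up stand-in for any CPU hungry logic, and the iteration count is an arbitrary number picked to keep the processor busy for a moment.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;// Hypothetical CPU bound work: nothing else can run while this loop spins.
function crunch(iterations) {
  var total = 0;
  for (var i = 0; i !== iterations; i++) {
    total += i % 7;
  }
  return total;
}

var fired = false;

// Ask node to call back as soon as possible.
setTimeout(function () {
  fired = true;
  console.log('timer fired');
}, 0);

crunch(50000000);

// The timer is due, but it has to wait for the script to give up the event loop.
console.log('crunch finished, timer fired yet? ' + fired);
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The last line reports &lt;code&gt;false&lt;/code&gt;: the timer callback only gets its turn once the number crunching is over, and the same would be true of every pending I/O callback.&lt;/p&gt;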
&lt;p&gt;I have found node to be very well suited for I/O intensive batch processing.  In particular, I’m using node as a lightweight, scalable and highly performant means of shelling out to CPU hungry Unix command line tools, with node managing the intensive I/O.  I built &lt;a target="_blank" href="https://github.com/steelThread/mimeograph"&gt;Mimeograph&lt;/a&gt; as a simple OCR/text extraction batching solution to orchestrate tools such as Ghostscript, ImageMagick and Tesseract.  Mimeograph uses a simple utility called &lt;a target="_blank" href="https://github.com/steelThread/redisfs"&gt;RedisFS&lt;/a&gt; to move files to be processed in and out of &lt;a target="_blank" href="http://redis.io/"&gt;Redis&lt;/a&gt; and to and from a Unix filesystem.  Mimeograph uses &lt;a target="_blank" href="https://github.com/technoweenie/coffee-resque"&gt;coffee-resque&lt;/a&gt;, a node port of the popular &lt;a target="_blank" href="https://github.com/defunkt/resque"&gt;Resque&lt;/a&gt; framework, as the batching framework.  With this architecture, I can massively scale out mimeograph processes across virtual machines to support the very impressive amount of OCR/text extraction processing required by the business to support upcoming legal proceedings, as well as process multiple years’ worth of indexing backlog.  I’ll post a focused discussion of Mimeograph in the near future and how to best utilize coffee-resque to distribute CPU intensive parallel problems.&lt;/p&gt;
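&lt;p&gt;The shelling out part can be sketched roughly like this.  To be clear, this is not Mimeograph’s actual code: &lt;code&gt;runTool&lt;/code&gt; is a hypothetical helper, and the node binary itself stands in for a real tool like tesseract so the example runs anywhere.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;var execFile = require('child_process').execFile;

// Run an external program and hand its output to an errorback-style callback.
function runTool(command, args, callback) {
  execFile(command, args, function (err, stdout) {
    if (err) return callback(err);
    callback(null, stdout.trim());
  });
}

// Stand-in job: ask the node binary itself to do the 'heavy' work.
runTool(process.execPath, ['-e', 'console.log(6 * 7)'], function (err, output) {
  if (err) throw err;
  console.log('tool said: ' + output);
});
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The heavy lifting happens in a separate process, so the event loop stays free to take on other work until the tool reports back.&lt;/p&gt;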
&lt;p&gt;I recently watched a vid in which Isaac Z. Schlueter of NPM fame explains why node is well suited for &lt;a target="_blank" href="http://video.nextconf.eu/video/1914374/nodejs-digs-dirt-about"&gt;Data-Intensive Real-Time (DIRT)&lt;/a&gt; applications.  Pretty good stuff, take a look.&lt;/p&gt;
&lt;h2&gt;Hell Ya&lt;/h2&gt;
&lt;p&gt;Javascript is definitely not just for the browser anymore, thanks to node.  I’ve been working with it for the last year and have had great success with it in production.  Mimeograph impressed my client so much that we are targeting more opportunities within the organization where it is a good fit.  Things have changed quite a bit over the last year: the dumbass looks are gone, and the ‘yea right’ responses are now a resounding ‘hell ya!’.&lt;/p&gt;
&lt;p&gt;Happy Noding.&lt;/p&gt;
&lt;p&gt;&lt;a target="_blank" href="http://www.youtube.com/watch?v=jo_B4LTHi3I"&gt;Introduction to node.js&lt;/a&gt;&lt;/p&gt;</description><link>http://seanmcdaniel.tumblr.com/post/6360548055</link><guid>http://seanmcdaniel.tumblr.com/post/6360548055</guid><pubDate>Mon, 13 Jun 2011 22:27:00 -0400</pubDate><category>coffeescript</category><category>javascript</category><category>nodejs</category><category>mimeograph</category><category>tesseract</category></item><item><title>Shiny New Toys</title><description>&lt;p&gt;Holy shit, I have a blog (again)!  I previously maintained a blog ~ 3 years ago that focused on &lt;a href="http://www.sencha.com/products/extjs/" target="_blank"&gt;ExtJS&lt;/a&gt; and integration solutions for various Java based MVC frameworks.  Since that material completely is wack and out of date I simply flushed it from the tubes.  Out with the old, in with the new.&lt;/p&gt;

&lt;p&gt;In any case for those of you who don&amp;#8217;t know me my name is Sean McDaniel.  I&amp;#8217;m a freelance developer working out of the greater Washington DC area.  I like to consider myself a generalist in my field and have had the privilege of working with a lot of interesting technologies, clients and problem domains.  I’ve spent a lot of my free time over the last 6 months building tools using a slew of relatively new technologies.  Implementation wise these tools/utils are based on &lt;a href="http://nodejs.org/" target="_blank"&gt;node.js&lt;/a&gt; and other frameworks surrounding &lt;a href="http://nodejs.org/" target="_blank"&gt;node.js&lt;/a&gt;.  In addition I&amp;#8217;ve done quite a bit with the &lt;a href="http://redis.io" target="_blank"&gt;Redis&lt;/a&gt; key/value store.&lt;/p&gt;

&lt;p&gt;To help me better understand these new technologies I&amp;#8217;ve been building open source tools, as well as contributing to existing ones.  For me as a developer, open source serves a couple of purposes besides providing great frameworks upon which I build solutions.  First, it gives me a means by which I can dig into preexisting source code to understand concepts in a concrete way, and second, it provides a way for me to scratch an itch.  Therefore, many of these &lt;a href="http://nodejs.org/" target="_blank"&gt;node.js&lt;/a&gt; based projects I work on are inspired by real world problems I’ve been trying to help solve for my current client.  In fact, the tools that I’m going to be talking about over the next few posts have been in production for over a month now and are purring like a kitty.&lt;/p&gt;

&lt;p&gt;So here are my shiny new toys that I will be covering shortly:&lt;/p&gt;

&lt;ul&gt;&lt;li&gt;&lt;a href="https://github.com/steelThread/connect-jsonp" target="_blank"&gt;connect-jsonp&lt;/a&gt; - &lt;a href="http://en.wikipedia.org/wiki/JSONP" target="_blank"&gt;JSONP&lt;/a&gt; middleware for sencha labs popular &lt;a href="https://github.com/senchalabs/connect" target="_blank"&gt;connect&lt;/a&gt; framework.  Think ruby &lt;a href="http://rack.rubyforge.org/" target="_blank"&gt;rack&lt;/a&gt; for &lt;a href="http://nodejs.org/" target="_blank"&gt;node.js&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/steelThread/redisfs" target="_blank"&gt;redisfs&lt;/a&gt; - Utility to pump files in &amp;amp; out of redis. &lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/technoweenie/coffee-resque" target="_blank"&gt;coffee-resque&lt;/a&gt; - a &lt;a href="http://coffeescript.org" target="_blank"&gt;CoffeeScript&lt;/a&gt; version of the popular ruby &lt;a href="https://github.com/defunkt/resque" target="_blank"&gt;resque&lt;/a&gt; project.&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/steelThread/mimeograph" target="_blank"&gt;mimeograph&lt;/a&gt; - &lt;a href="http://coffeescript.org" target="_blank"&gt;CoffeeScript&lt;/a&gt; batching library, based on coffee-resque, to extract text and create searchable pdf files, OCRing where necessary.&lt;/li&gt;
&lt;/ul&gt;</description><link>http://seanmcdaniel.tumblr.com/post/5853826660</link><guid>http://seanmcdaniel.tumblr.com/post/5853826660</guid><pubDate>Wed, 25 May 2011 22:33:00 -0400</pubDate><category>redis</category><category>CoffeeScript</category><category>Node.js</category><category>Connect</category><category>Mimeograph</category><category>ExtJS</category></item><item><title>Me</title><description>&lt;img src="http://25.media.tumblr.com/tumblr_llevo37uKK1qkqp8zo1_r1_500.png"/&gt;&lt;br/&gt;&lt;br/&gt;&lt;p&gt;Me&lt;/p&gt;</description><link>http://seanmcdaniel.tumblr.com/post/5617116591</link><guid>http://seanmcdaniel.tumblr.com/post/5617116591</guid><pubDate>Wed, 25 May 2011 21:30:48 -0400</pubDate></item><item><title>Nice Virginia evening.</title><description>&lt;img src="http://24.media.tumblr.com/tumblr_llgvdkdXiq1qkqp8zo1_500.jpg"/&gt;&lt;br/&gt;&lt;br/&gt;&lt;p&gt;Nice Virginia evening.&lt;/p&gt;</description><link>http://seanmcdaniel.tumblr.com/post/5652019504</link><guid>http://seanmcdaniel.tumblr.com/post/5652019504</guid><pubDate>Thu, 19 May 2011 19:46:25 -0400</pubDate></item></channel></rss>
