Wednesday, June 20, 2012

Modules in JavaScript

Working with JavaScript (JS) (well, CoffeeScript) was a bit of a shock. After working with it in the late 90's I always felt that JS was an inferior, hard to learn, and badly standardized language that utterly sucked. Over the years I regularly read up on it because JS was used in the place where I had such great hopes for Java. In '98 I wrote an application for Ericsson Telecom completely based on applets on Netscape's IFC. Many aspects were a nightmare but in the end it all worked very nicely; the application ran for many years. Then Sun screwed everything up by developing Swing (way too heavy at the time, while IFC was doing well). However, most of all they killed Java in the browser with their security warnings before you could run anything. While a native code ActiveX control was downloading in the background, executing a JAR file felt like playing in The Deer Hunter's most famous scene. Goodbye applets.

Welcome JavaScript. If you look today, JavaScript is quite a mature language and the browser incompatibilities have mostly disappeared (if you ignore previous Internet Explorer versions). Languages like CoffeeScript (that translate 1:1 to JavaScript) have even made the syntax pleasant by removing boilerplate and standardizing the way you do classes in JavaScript (which is extremely flexible but easy to tangle yourself up in). However, the most interesting thing for me is how conventions have evolved over time to let different JS programs collaborate in the browser. This is the software collaboration problem I've been struggling with over the past decade and it is interesting to see how JS modularity is converging on a service model that is remarkably close to OSGi's service registry.

Modularity is crucial in the browser because the JavaScript can be loaded from the HTML, from different frames, through XMLHttpRequests, etc. All of this ends up in a single namespace. Obviously one then has problems with global variables, especially since the default in JS is to create a global variable. In Java this problem was addressed by using really long names (class names) that always have a unique prefix. In JS they like really short names ($ comes to mind, used in jQuery and Prototype) but they quickly found out that mixing libraries caused fatal clashes.

Scoping in JS can only be done with functions. Coming from Java, JS functions take some getting used to. A function is actually a plain JS object (Map comes closest) with properties but it can be executed as well by using it with parentheses:

function f() { ... }   // defines a function
f()                    // call function
f.a = 3                // set a property in the function object
x = f['a']             // get a property, with alt. syntax
Since the function is the only scoping mechanism, a new module was born. Fortunately, JS is completely recursive so you can nest variables and functions in functions ad nauseam.

If the module is a function, the semantics of the bundle activator can be seen as calling the function. Since JS is executed as it goes along it is easy to call a function:
  function foo() {
     var moduleLocal="x";
     Foo = { bar: function() {return moduleLocal; } }
  }
  foo();
  // Foo.bar() == "x"
For a Smalltalker it is quite natural that a local variable remains in existence long after the function is called. For the common Java programmer it is unsettling as this contradicts the stack model of the language (and it has been the subject of fierce debates around Java closures). Bloch et al wanted to stick to final variables, albeit with a better default, while Gafter et al wanted real closures. After all, the name closure derives from the fact that the function closes over its local scope. With an initialization function, we can give global variables local names by passing them as parameters:
  function foo($) {
     var moduleLocal=$("x");
     Foo = { bar: function() {return moduleLocal; } }
  }
  foo(jQuery);
  // Foo.bar() returns the result of jQuery("x")
Since it is superfluous to define and then call a function, a clever script kiddie came up with the immediately invoked anonymous function: declare a function without a name and call it on the spot. This limits the pollution of the global namespace (no initialization function name):
(function ($) {...})(jQuery)
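As a slightly fuller sketch of this pattern, the snippet below keeps a counter private and hands the outside world only two functions. The names fakeLib, Counter, increment, and current are made up for the illustration, and a plain object stands in for jQuery so the code also runs outside a browser. Unlike the earlier foo() example, the function returns its public object instead of assigning a global from the inside; both variations are common.

```javascript
// A stand-in for a library such as jQuery; hypothetical, for illustration only.
var fakeLib = { describe: function (n) { return "count=" + n; } };

// The anonymous function runs exactly once; only its return value escapes.
var Counter = (function (lib) {
  var count = 0; // module-local, invisible outside this function

  return {
    increment: function () { count += 1; return count; },
    current: function () { return lib.describe(count); }
  };
})(fakeLib);

Counter.increment();
Counter.increment();
console.log(Counter.current()); // count=2
```

The only name added to the surrounding scope is Counter; the variable count can no longer be reached except through the two functions that closed over it.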
The missing piece of the puzzle is dependencies. In the popular CommonJS approach a module specifies its dependencies via a require(name) function. This function takes the name of a dependency (usually related to a file name searched on a module path), loads it if it is not yet loaded, and executes it. This model uses lazy initialization: each module makes sure its dependencies are loaded before it runs its activation code by calling require() for each of these dependencies. A problem with lazy initialization is that all modules must be loaded sequentially, because you only find out about the need for the next module after you've loaded the previous one. Therefore AMD (Asynchronous Module Definition) was born. Instead of calling the activator function directly, the module function is registered together with its dependencies.
    define( 'Foo', ['Bar', 'Lib'], function(bar,lib) { ... } );
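A toy version of such a registry might look as follows. This is only a sketch of the idea, not the code of any real AMD loader: everything is synchronous and in memory, there is no cycle detection, and definitions, instances, and resolve are names made up for the example.

```javascript
var definitions = {}; // name -> { deps, factory }, registered but not yet run
var instances = {};   // name -> module value, created on demand

// Registers a module; nothing is executed at this point.
function define(name, deps, factory) {
  definitions[name] = { deps: deps, factory: factory };
}

// Instantiates a module after all of its dependencies have been instantiated.
function resolve(name) {
  if (instances.hasOwnProperty(name)) return instances[name];
  var def = definitions[name];
  if (!def) throw new Error("Module not defined: " + name);
  var args = def.deps.map(function (dep) { return resolve(dep); });
  instances[name] = def.factory.apply(null, args);
  return instances[name];
}

// Registration order does not matter: Foo is defined before its dependencies.
define("Foo", ["Bar", "Lib"], function (bar, lib) {
  return { hello: function () { return bar.name + "+" + lib.name; } };
});
define("Bar", [], function () { return { name: "bar" }; });
define("Lib", [], function () { return { name: "lib" }; });

console.log(resolve("Foo").hello()); // bar+lib
```

Because define() only records the factory, the registry is free to decide when, and in which order, the module functions actually run.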
Registering the module function instead of calling it inverses the control flow in a very interesting way. The module system can now load dependencies at its leisure because it decides when to initialize the module. This is a bit like Declarative Services in OSGi, which also waits until the dependencies are met. In browsers this is not overly useful because loading usually happens in the order of the JS files in the HTML, but the model opens up new possibilities.

The most interesting approach I've seen so far is actually part of a JS framework that I had never heard of before: Angular.js. It completely changes your view on how to develop JS applications in the browser. All wiring management (clicking here changes something over there) becomes declarative and the JS code only has to worry about the model. Backed by an extensive set of services, it has the potential to change the way we build JS applications. It is built by Google and is used internally, though the authors are not allowed to say what for.

Back to modules. From a modular perspective Angular found an intriguing way to inject services. First, dependencies are specified on instances and not on modules. A basic concept in Angular is the model controller, which is a function that is called to set up the model. This function can be declared as follows:
function MyModelController($scope, $http, other) {
  ...
}
Amazingly, this sets up a dependency on objects in a service registry that have the names '$scope', '$http', and 'other'. This puzzled me to no end when I first saw it; I kept wondering where the heck it got the names of the dependencies until I read the developer's manual completely and found out they use a very clever hack. In (most!) JS implementations you can call the toString() method on a function object, which gives you the source code of the function. This source code is then parsed to extract the parameter names, which are in turn used to establish the dependencies. Though all known browsers support this feature, it is not standardized. It is therefore also possible to provide the dependencies by specifying an $inject property in the function object (really handy, those properties!).
function MyModelController($scope, $http, other) {...}
MyModelController.$inject=['$scope', '$http', 'other'];
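The trick can be sketched in a few lines. This is not Angular's code: annotate and invoke are names chosen for the example, the services object is a made-up registry, and the parsing is deliberately naive (it assumes a classic function declaration without comments or default values in the parameter list).

```javascript
// Returns the dependency names of fn: the explicit $inject list if present,
// otherwise the parameter names parsed out of the function's source text.
function annotate(fn) {
  if (fn.$inject) return fn.$inject;
  var source = fn.toString();
  var params = source.slice(source.indexOf("(") + 1, source.indexOf(")"));
  return params.split(",")
    .map(function (p) { return p.trim(); })
    .filter(function (p) { return p.length > 0; });
}

// Calls fn with arguments looked up by name in a registry object.
function invoke(fn, registry) {
  var args = annotate(fn).map(function (name) {
    if (!registry.hasOwnProperty(name)) throw new Error("Unknown service: " + name);
    return registry[name];
  });
  return fn.apply(null, args);
}

// Hypothetical services; the real $scope and $http are far richer objects.
var services = {
  $scope: { title: "" },
  $http: { get: function (url) { return "GET " + url; } }
};

function MyModelController($scope, $http) {
  $scope.title = $http.get("/title");
  return $scope;
}

console.log(annotate(MyModelController).join(","));     // $scope,$http
console.log(invoke(MyModelController, services).title); // GET /title
```

Setting $inject short-circuits the parsing entirely. That also matters when the source is run through a minifier that renames parameters, since the names recovered from toString() would then no longer match any registered service.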
Declaring a module is done by including a JS file and in the code of that file registering an object or factory with angular:
    angular.
       module('myModule', ['ngResource']).
       factory('MyService', function($resource){ ... })
The Angular approach is awfully close to the OSGi service registry minus the dynamics. It provides a central point to share and find instances, completely decoupled from the modules they originate from. For me that has always been the greatest benefit of OSGi since this model significantly reduces dependencies between the code of the modules. This is as far as I know the best practice to create reusable components.

Even closer to OSGi is of course Orion from Eclipse. They've implemented a full blown OSGi service registry, including dynamics and isolation. Each module is now a separate HTML file that runs in a headless iframe. Communication is asynchronous, through promises (like Futures). This model is identical to the OSGi Service Registry and even uses most of the same method names. Though I am very inclined to like it, it feels like they need to learn from what Angular does to make the registry less in your face. This was the same problem OSGi had before we had DS and annotations.

It should be clear that a lot of exciting things are happening in the script kiddies' world; these kids surely have grown up. It is refreshing to see that they've come up with ways of working that resemble the OSGi service registry.

 Peter Kriens

Friday, June 8, 2012

bnd week

Next week Beaulieu will be made unsafe with the bnd(tools) crew. Neil Bartlett (Paremus), PK Sörelde (ComActivity), Bert Bakker (Luminis), Ferry Huberts, Marcel Offermans (Luminis), and Marian Grigoras (Siemens) are coming over to prepare for the next release. Unfortunately, Stuart McCulloch (Sonatype) won't be able to come this time. However, he helped us with a very fresh snapshot release of the maven bundle plugin. It would be highly appreciated if people tested this plugin against their code base. You can find the maven bundle plugin here. Please report any errors or inconsistencies you find on GitHub.
It will be a heavy week, as usual, because a lot of new functions have been added. For bnd, this actually means I will move bndlib to version 2.0.0. Apart from the significant new functionality, the API has also changed. When bndlib was small, Map<String,Map<String,String>> worked quite well to maintain the manifest information and package attributes. However, in the current code base it was becoming painful, especially since Java has a naming fetish. org.example.Foo.X, org/example/Foo$X, org/example/Foo$X.class, Lorg/example/Foo .X and Lorg/example/Foo$X; all identify the same class in different contexts. Just imagine how easy it is to confuse these strings. So now bndlib has Parameters, Instructions, and Packages with lots of convenience methods. bndlib is used in ant, maven, sbt, osmorc, bndtools, and other products. Though the number of indirect users is quite large, the number of developers that program against its API is quite small. However, it is a fun library to use if you need to work with JAR files and/or bundles. Some examples:

 File asm = new File("asm.jar");
 Jar jar = new Jar(asm);
 jar.getManifest().write(System.out);
This will output the following manifest:

 Manifest-Version: 1.0
 Implementation-Vendor: France Telecom R&D
 Ant-Version: Apache Ant 1.6.2
 Implementation-Title: ASM
 Implementation-Version: 2.2.2
 Created-By: 1.5.0_04-b05 (Sun Microsystems Inc.)

As manifests go, this is actually quite good. Most people have a significantly more lonely manifest. However, since this is no bundle, we need to add OSGi headers. The following code sets the bundle version and the version of the org.objectweb.asm packages to 2.2.2. In this case we can use a macro. We could create a special macro, version, for this and use it in the Bundle-Version and Export-Package headers. However, we can also reuse the Bundle-Version header since any header is also a macro. Notice that we put a time stamp in the version so we can find out the build date.

 Analyzer analyzer = new Analyzer();
 analyzer.setJar(jar);
 analyzer.setProperty("Bundle-Version", "2.2.2.${tstamp}");
 analyzer.setExportPackage("org.objectweb.asm.*;version=${Bundle-Version}");
 Manifest manifest = analyzer.calcManifest();
 jar.setManifest(manifest);
 jar.getManifest().write(System.out);

This provides the following manifest:

Manifest-Version: 1.0
Export-Package: org.objectweb.asm;version="2.2.2.201206081457",org.obj
 ectweb.asm.signature;version="2.2.2.201206081457"
Implementation-Title: ASM
Implementation-Version: 2.2.2
Tool: Bnd-1.52.2
Bundle-Name: showcase
Created-By: 1.6.0_27 (Apple Inc.)
Implementation-Vendor: France Telecom R&D
Ant-Version: Apache Ant 1.6.2
Bundle-Version: 2.2.2.201206081457
Bnd-LastModified: 1339160222619
Bundle-ManifestVersion: 2
Bundle-SymbolicName: showcase
Originally-Created-By: 1.5.0_04-b05 (Sun Microsystems Inc.)
bnd added defaults for crucial OSGi information that was missing: the name, symbolic name, version, etc. It also copied all the headers from the old jar so that no information is lost. Most importantly, however, it calculated the Export-Package header.

Bundle-Version: 2.2.2.201206081457
Export-Package: 
 org.objectweb.asm;version="2.2.2.201206081457",
 org.objectweb.asm.signature;version="2.2.2.201206081457"
So let's save the jar on disk, including the digests so the bundle can be verified:

  jar.calcChecksums(new String[] {"SHA", "MD5"});
  jar.write("asm-2.2.2.jar");
So what more can we do? Let's take some of our own code and create a JAR out of it. The following example takes code from the bin directory, packages it, and links it to the asm jar on disk.

 Builder b = new Builder();
 b.setPrivatePackage("simple");
 b.addClasspath(asm);
 b.addClasspath(new File("bin"));
 Jar simple = b.build();
 simple.getManifest().write(System.out);
The Private-Package will copy any package it specifies from the class path to the Jar. Since the asm on disk has no OSGi headers, we do not get import ranges.

Import-Package: org.objectweb.asm

So let's use the Jar we've just created instead, and let's also export the simple package.

 Builder b = new Builder();
 b.setExportPackage("simple");
 b.addClasspath(jar);
 b.addClasspath(new File("bin"));
 Jar simple = b.build();

Since the asm jar we built earlier has versions, the imports now have version ranges. bnd also calculates the uses constraints on the exported packages:

Export-Package: simple;uses:="org.objectweb.asm";version="1.0.0"
Import-Package: org.objectweb.asm;version="[2.2,3)"
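The range in the Import-Package header appears to follow from the exported version 2.2.2.<timestamp>: major and minor form the inclusive floor, and the next major version is the exclusive ceiling. A small JavaScript sketch of that arithmetic, assuming this policy; it is not bnd's code, and importRange is a made-up name:

```javascript
// Derives an import range from an exported version: from major.minor up to
// (but excluding) the next major version. Micro and qualifier are dropped.
function importRange(exportedVersion) {
  var parts = exportedVersion.split(".");
  var major = parseInt(parts[0], 10);
  var minor = parts.length > 1 ? parseInt(parts[1], 10) : 0;
  if (isNaN(major) || isNaN(minor)) throw new Error("Bad version: " + exportedVersion);
  return "[" + major + "." + minor + "," + (major + 1) + ")";
}

console.log(importRange("2.2.2.201206081457")); // [2.2,3)
```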

Last, let's say you want to know all the references from a JAR:

 Analyzer analyzer = new Analyzer();
 analyzer.setJar(simple); // the jar built in the previous example
 analyzer.analyze();
 System.out.println("Referred    " + analyzer.getReferred());
 System.out.println("Contains    " + analyzer.getContained());
 System.out.println("Imports     " + analyzer.getImports().keySet());
 System.out.println("Exports     " + analyzer.getExports().keySet());
 System.out.println("Uses" );
 for ( Entry<PackageRef, List<PackageRef>> from : analyzer.getUses().entrySet())
   System.out.printf("  %-40s %s\n", from.getKey(), new TreeSet<PackageRef>(from.getValue()) );

Which gives the following output:

Referred    java.lang,org.objectweb.asm
Contains    simple
Imports     [org.objectweb.asm]
Exports     [simple]
Uses
  simple                                   [java.lang, org.objectweb.asm]

This blog could go on forever (ok, for quite a long time); there is quite a lot of useful functionality in the API. For normal usage, though, bnd(tools) works best since it has a nicer user interface, integrates with continuous integration, and is tremendously nice to develop with. However, if you find yourself processing jars or OSGi bundles, consider working with the API.

Peter Kriens