Programming: July 2011

Thursday, July 28, 2011

Event loop approach to concurrency

The event loop approach to concurrency is an alternative to threading: everything is non-blocking and executed via callbacks. The event loop executes a queue of callbacks forever. This works well as long as the callbacks complete quickly! If a callback is going to take a long time, it should hand the work off to another process. The primary application is networking, i.e. non-blocking I/O.
Douglas Crockford's presentation on event loop approach to concurrency
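
The queue-of-callbacks idea can be sketched in a few lines. This is a toy, assuming a single in-memory queue; EventLoop, enqueue and run are names made up for illustration, not the API of node.js or any other library.

```javascript
// Toy event loop: a FIFO queue of callbacks drained one at a time.
// EventLoop, enqueue and run are illustrative names, not a real library API.
function EventLoop() {
  this.queue = [];
}

// Schedule a callback to run on a later turn of the loop.
EventLoop.prototype.enqueue = function (callback) {
  this.queue.push(callback);
};

// Drain the queue. A real loop would block waiting for new events
// instead of returning when the queue is empty.
EventLoop.prototype.run = function () {
  while (this.queue.length > 0) {
    var callback = this.queue.shift();
    callback(); // must return quickly, or everything behind it waits
  }
};

var loop = new EventLoop();
var log = [];

loop.enqueue(function () {
  log.push("first");
  // Callbacks may schedule further work; it runs after what is already queued.
  loop.enqueue(function () { log.push("third"); });
});
loop.enqueue(function () { log.push("second"); });

loop.run();
console.log(log.join(", ")); // first, second, third
```

Because only one callback runs at a time, there is no locking, but a slow callback stalls everything queued behind it.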

Libraries that use this approach include node.js, Ruby's EventMachine, Python's Twisted, and Java's new JDK 7 asynchronous I/O and the older NIO library. The main difference between the two Java APIs: with NIO you are notified when a read can start (data is available), while with asynchronous I/O you are notified only when the read has completed (the data is already in your buffer).

The design pattern employed in all of these is the Reactor pattern, which activates handlers when events occur (e.g. it activates a handler to read data from a socket when data is available). Strictly speaking, the completion-style notification of JDK 7 asynchronous I/O is closer to the related Proactor pattern.
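
A rough sketch of the Reactor idea, assuming events are handed to the dispatcher by hand rather than coming from a real select/epoll call. Reactor, register and dispatch are names invented for this example.

```javascript
// Minimal Reactor sketch: handlers are registered per event type and the
// dispatcher invokes them when a matching event arrives.
// Reactor, register and dispatch are illustrative names.
function Reactor() {
  this.handlers = {}; // event type -> array of handler functions
}

Reactor.prototype.register = function (eventType, handler) {
  if (!this.handlers.hasOwnProperty(eventType)) {
    this.handlers[eventType] = [];
  }
  this.handlers[eventType].push(handler);
};

// In a real reactor the events come from select/epoll-style readiness
// notification; here they are passed in by hand.
Reactor.prototype.dispatch = function (event) {
  var handlers = this.handlers.hasOwnProperty(event.type)
    ? this.handlers[event.type]
    : [];
  for (var i = 0; i < handlers.length; i++) {
    handlers[i](event);
  }
  return handlers.length; // number of handlers that were activated
};

var reactor = new Reactor();
var received = [];
reactor.register("readable", function (event) {
  received.push("read " + event.data + " from socket " + event.socketId);
});

var handled = reactor.dispatch({ type: "readable", socketId: 7, data: "hello" });
var ignored = reactor.dispatch({ type: "writable", socketId: 7 });
console.log(received[0]); // read hello from socket 7
```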

Wednesday, July 27, 2011

Five minute rule

The new five-minute rule
Compares the cost of holding data in memory with the cost of the disk I/O needed to fetch it again. With flash memory becoming cheaper, it fits in as a tier between RAM and disk, and you can also pool memory from different machines to provide an ocean of low-latency RAM.
RAM -> Flash memory -> Disk
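
The comparison boils down to a break-even interval: a page that is accessed more often than once per interval is cheaper to keep in RAM than to re-read from disk. Below is a small sketch of that calculation using the formula from the original five-minute rule papers. The input figures are illustrative assumptions in the style of late-1990s hardware, not numbers from this post, and breakEvenSeconds is a made-up helper name.

```javascript
// Break-even interval from the five-minute rule:
//   interval = (pagesPerMBofRAM / accessesPerSecondPerDisk)
//            * (pricePerDiskDrive / pricePerMBofRAM)
// breakEvenSeconds is an illustrative helper name.
function breakEvenSeconds(cfg) {
  return (cfg.pagesPerMBofRAM / cfg.accessesPerSecondPerDisk) *
         (cfg.pricePerDiskDrive / cfg.pricePerMBofRAM);
}

// Illustrative inputs only (assumed, not measured).
var example = {
  pagesPerMBofRAM: 128,          // 8 KB pages
  accessesPerSecondPerDisk: 64,  // random I/Os per second for one drive
  pricePerDiskDrive: 2000,       // dollars
  pricePerMBofRAM: 15            // dollars
};

var seconds = breakEvenSeconds(example);
console.log(Math.round(seconds) + " seconds"); // 267 seconds
```

With these inputs the break-even point lands close to five minutes, which is where the rule gets its name. Plugging in flash prices and access rates gives the corresponding interval for the RAM/flash boundary.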

Tuesday, July 26, 2011

CAP theorem

Eric Brewer presented the CAP theorem in his keynote address at PODC (Principles of Distributed Computing): of Consistency, Availability and tolerance to network Partitions, a shared-data system can have at most two.
Consistency + Availability: Single-site databases (2-phase commit)
Consistency + Partitions: Distributed databases (pessimistic locking)
Availability + Partitions: DNS (conflict resolution)
Formally proven in a 2002 paper by Seth Gilbert and Nancy Lynch.
BASE (Basically Available, Soft-state, Eventually consistent) is the opposite of ACID.
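
The "Availability + Partitions" row depends on conflict resolution once diverged replicas can talk again. Here is a minimal sketch, assuming last-write-wins by timestamp as the resolution rule. That rule is just one possible policy chosen for illustration, and mergeReplicas and the sample records are hypothetical.

```javascript
// Each replica maps key -> { value, timestamp }. mergeReplicas is a
// hypothetical helper; last-write-wins is an assumed resolution policy.
function mergeReplicas(a, b) {
  var merged = {};
  [a, b].forEach(function (replica) {
    Object.keys(replica).forEach(function (key) {
      var entry = replica[key];
      if (!merged.hasOwnProperty(key) || entry.timestamp > merged[key].timestamp) {
        merged[key] = entry;
      }
    });
  });
  return merged;
}

// During a partition both replicas kept accepting writes (available),
// so their contents diverged.
var replicaA = { "www": { value: "10.0.0.1", timestamp: 100 } };
var replicaB = {
  "www": { value: "10.0.0.2", timestamp: 105 },
  "mail": { value: "10.0.0.9", timestamp: 90 }
};

// Once the partition heals, merging brings them back to one agreed state.
var healed = mergeReplicas(replicaA, replicaB);
console.log(healed.www.value);  // 10.0.0.2
console.log(healed.mail.value); // 10.0.0.9
```

Between the partition and the merge, readers of replica A see stale data; that window is the "eventually" in eventually consistent.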

Werner Vogels' article on Eventual Consistency
Great write-up on CAP here

Tuesday, July 19, 2011

Javascript and OOP

OOP is defined by three things: encapsulation, polymorphism and inheritance. Douglas Crockford's article claims that JavaScript supports all three, so it is an object-oriented language.

var AnimalClass = function Animal(name) {
    this.name = name;

    // private method: visible only inside the constructor's closure
    function sayPrivate() {
        return "sayPrivate";
    }

    // privileged method: public, but can reach the private function
    this.sayPrivileged = function() {
        return sayPrivate();
    };
};

// public method is added to the prototype
AnimalClass.prototype.say = function (something) {
    return this.name + something;
};

var anAnimal = new AnimalClass("foo");
alert(anAnimal.name);
alert(anAnimal.say("ha"));
alert(anAnimal.sayPrivileged("ha"));
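
The example above covers encapsulation. For the other two properties, here is a small sketch of inheritance and polymorphism through the prototype chain. Animal and Dog are illustrative constructors, not from Crockford's article.

```javascript
// Animal and Dog are illustrative constructors, not from the article.
function Animal(name) {
  this.name = name;
}
Animal.prototype.speak = function () {
  return this.name + " makes a sound";
};

function Dog(name) {
  Animal.call(this, name); // run the parent constructor on the new object
}
// Inheritance: Dog.prototype delegates to Animal.prototype.
Dog.prototype = Object.create(Animal.prototype);
Dog.prototype.constructor = Dog;

// Polymorphism: Dog overrides speak; callers use the same method name.
Dog.prototype.speak = function () {
  return this.name + " barks";
};

var animals = [new Animal("generic"), new Dog("rex")];
var sounds = animals.map(function (a) { return a.speak(); });
console.log(sounds.join("; "));            // generic makes a sound; rex barks
console.log(animals[1] instanceof Animal); // true
```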

A typical way to implement inheritance in JavaScript is via object augmentation. For example, the underscore library defines the following function to extend any given object with the properties of the passed-in object(s).
// Extend a given object with all the properties in passed-in object(s).
_.extend = function(obj) {
    each(slice.call(arguments, 1), function(source) {
        for (var prop in source) {
            if (source[prop] !== void 0) obj[prop] = source[prop];
        }
    });
    return obj;
};

_.extend(anAnimal, { "foo" : "bar" });
alert(anAnimal.foo);
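
The quoted function relies on underscore's internal each and slice helpers, so it does not run on its own. Below is a standalone sketch of the same augmentation idea. The extend name and the sample objects are mine; this is not underscore's source.

```javascript
// Standalone version of the extend idea quoted above. The name `extend`
// and the sample objects are illustrative; this is not underscore's source.
function extend(target) {
  var sources = Array.prototype.slice.call(arguments, 1);
  sources.forEach(function (source) {
    for (var prop in source) {
      // Skip properties whose value is undefined, as the quoted code does.
      if (source[prop] !== void 0) target[prop] = source[prop];
    }
  });
  return target;
}

var canWalk = { walk: function () { return this.name + " walks"; } };
var canSwim = { swim: function () { return this.name + " swims"; } };

var duck = extend({ name: "duck" }, canWalk, canSwim, { wings: undefined });
console.log(duck.walk());     // duck walks
console.log(duck.swim());     // duck swims
console.log("wings" in duck); // false
```

Note that this copies properties onto the object itself rather than linking prototypes, so later changes to canWalk or canSwim do not show up on duck.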

Monday, July 18, 2011

Monitoring app performance

Front-end performance: Speed Tracer can tell you how much time was spent in the browser on DOM processing and garbage collection.
New Relic tracks both front-end and back-end performance by injecting JavaScript into the browser:
http://blog.newrelic.com/2011/05/17/how-rum-works/
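
To make the front-end/back-end split concrete, here is a rough sketch of how a script running in the page could divide a page load into the two parts. It is not New Relic's actual script. splitPageLoad is a hypothetical helper, the timestamps are made up, and the field names follow the browser's Navigation Timing interface (performance.timing) where that is available.

```javascript
// splitPageLoad is a hypothetical helper. The field names follow the
// browser's Navigation Timing interface (performance.timing).
function splitPageLoad(timing) {
  return {
    // Time until the first byte of the response: network plus server work.
    backendMs: timing.responseStart - timing.navigationStart,
    // Everything after that: download, DOM processing, rendering, scripts.
    frontendMs: timing.loadEventEnd - timing.responseStart,
    totalMs: timing.loadEventEnd - timing.navigationStart
  };
}

// Made-up timestamps (milliseconds) for illustration.
var sample = { navigationStart: 1000, responseStart: 1250, loadEventEnd: 2400 };
var result = splitPageLoad(sample);
console.log(result.backendMs + " ms back end, " + result.frontendMs + " ms front end");
// 250 ms back end, 1150 ms front end
```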

Friday, July 15, 2011

Spring Roo 1.1.5 with GWT & GAE

mkdir rooapp
cd rooapp
start roo shell
roo> project --topLevelPackage com.xxx.rooapp --java 6
roo> persistence setup --provider DATANUCLEUS --database GOOGLE_APP_ENGINE
roo> entity --class ~.model.Product --testAutomatically
roo> field string --fieldName name --notNull
roo> field string --fieldName id --notNull
roo> field date --fieldName dateIntroduced --type java.util.Date --notNull
roo> field number --type java.lang.Float --fieldName unitPrice --notNull
roo> field string --fieldName description --notNull
roo> web gwt setup
I encountered a number of problems:
The generated POM file set gae:home to a location in the Maven repo. I had to change it for things to work.
Compile failures with gwt:compile:
[INFO] [ERROR] Line 3: The import com.xxx.rooapp.server.gae.UserServiceLocator cannot be resolved
[INFO] [ERROR] Line 4: The import com.xxx.rooapp.server.gae.UserServiceWrapper cannot be resolved
This is because GWT needs access to these sources, and these files were not in the GWT source path!
I also ran into lots of other errors around the generated classes for GAE and the GWT DesktopInjector, for example:
[ERROR] Generator 'com.google.gwt.inject.rebind.GinjectorGenerator' threw an exception while rebinding 'com.xxx.rooapp.client.scaffold.ioc.DesktopInjector'

Wednesday, July 13, 2011

Deploying apps with Puppet

Use Puppet master/agent

I want to deploy a simple application that installs a file in /tmp on a single box.

Here is the puppet module definition:
/etc/puppetlabs/puppet/modules/myapp/manifests/init.pp:
class myapp {
    file { 'testfile': path => '/tmp/testfile-local', ensure => 'present', content => 'Test', mode => 0640 }
}

Typically, the puppet master uses a site.pp file for node definitions:
/etc/puppet/manifests/site.pp:

node development {
    include "myapp"
}

node staging {
    include "myapp"
}

Start the puppet master:
puppet master --no-daemonize --verbose
notice: Starting puppet master version 2.6.4

The node definitions in site.pp tell the puppet master that myapp needs to be installed on the development box. If a puppet agent is running on the client (which in this case happens to be the same box), it will automatically apply the latest configuration from the server the next time it polls.

I can also run the agent manually, one time, by ssh'ing into the box and invoking it:
[root@learn ~]# puppet agent --no-daemonize --onetime --server puppet --verbose
info: Retrieving plugin
info: Caching catalog for puppet
info: Applying configuration version '1310559260'
notice: /Stage[main]/Myapp/File[testfile-local]/ensure: created
notice: Finished catalog run in 0.02 seconds

CONS:
The main problem here is that I am not sure how to trigger this from the puppet master itself. I have not checked the UI, but surely there must be a way to update specific nodes only (e.g. just the dev nodes)?

You can always automate this using a capistrano script:
set :user, "root"
role :app, "development"

task :deploy do
    run 'puppet agent --no-daemonize --onetime --server puppet --verbose'
end

The problem with using a script like this is that the environment information has to be maintained in two places – in the script and the site.pp file. One way to avoid this is to generate this script from the site.pp file.
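
A rough sketch of that generation step, written in JavaScript only to match the other examples in these notes (any scripting language would do). It assumes site.pp contains nothing but simple node blocks like the ones shown earlier; nodeNames and capfileFor are made-up helper names.

```javascript
// Extract node names from the text of a site.pp file. The regular
// expression only handles simple `node name {` and `node 'name' {` forms;
// it is an assumption that the manifest is that simple.
function nodeNames(sitePp) {
  var names = [];
  var pattern = /^\s*node\s+['"]?([\w.-]+)['"]?\s*\{/gm;
  var match;
  while ((match = pattern.exec(sitePp)) !== null) {
    names.push(match[1]);
  }
  return names;
}

// Emit one Capistrano task per node, each selecting that node as the :app role.
function capfileFor(names) {
  return names.map(function (name) {
    var taskName = name.replace(/\W/g, "_"); // keep the Ruby symbol valid
    return "task :" + taskName + " do\n  role :app, \"" + name + "\"\nend";
  }).join("\n\n");
}

var sitePp = [
  "node development {",
  "  include \"myapp\"",
  "}",
  "",
  "node staging {",
  "  include \"myapp\"",
  "}"
].join("\n");

var names = nodeNames(sitePp);
var capfile = capfileFor(names);
console.log(names.join(", ")); // development, staging
console.log(capfile);
```

The generated text is only the per-environment task definitions; the deploy task itself would still be written by hand.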

PROS:
You are using Puppet to install the application, just as a sysadmin would use it to ensure that machines are set up correctly. Easy to sell to ops.

Server-less puppet

Use capistrano to run puppet “apply” on the relevant nodes. Here is a Capfile for the app. It assumes that you have manifests checked out on the clients.

set :user, "root"

task :development do
    role :app, "development"
end

task :staging do
    role :app, "staging"
end

task :deploy do
    # TODO: checkout manifests to module path
    run "puppet apply -e \"include myapp\""
end

To deploy to the development environment, you would run cap for that environment:

rg6977:puppet Thoughtworks$ cap development deploy
* executing `development'
* executing `deploy'
* executing "puppet apply -e \"include myapp\""
servers: ["192.168.56.101"]
Password:
[192.168.56.101] executing command
command finished

PROS:
Node definitions are now in Capistrano and not in Puppet. Puppet is used only to install the application in a given environment, which suits it better, because Puppet does a poor job of managing environment-specific information. See http://docs.puppetlabs.com/guides/environment.html.

CONS:
Node definitions now live in Capistrano rather than in Puppet, so the list of machines is maintained outside Puppet.
This is more code than the previous approach. Puppet “apply” only applies manifests to the local machine, so if your application must be installed on a web server and a db server, your Capistrano script needs to invoke puppet apply with the appropriate manifest (db vs web) on each.

Tuesday, July 12, 2011

Amazon ec2

Command line api:
ec2-describe-images -o amazon
ec2-run-instances -k
ec2-describe-instances
ec2-authorize default -p 22 (opens port 22 for ssh)
ssh into the box:
ssh -i root@xxx.amazonaws.com
ec2-terminate-instances
Good tutorial here

Puppet with Nagios

Puppet: resources; aggregate resources using "defines" and "classes"; organize using "modules" (see Puppet Language Guide)
Run stages are used to control the order of resource management. The "sigils" (magical operators) @ and <| |>, along with their doubled-up forms @@ and <<| |>>, are very powerful: they declare and collect virtual and exported resources.
Good article on configuring nagios with puppet. Also see the original Puppet example it references.
