Sunday, August 14, 2011

Building C++/.NET apps with MSBuild 4.0

In .NET 4.0/VS 2010, Microsoft replaced vcbuild.exe with msbuild.exe.
To build both managed .NET and native C++ apps, you only need .NET 4 along with the Windows 7 SDK:
http://www.microsoft.com/download/en/details.aspx?displayLang=en&id=8279
There is no need to install VS 2010.
Here is a walkthrough for a simple hello world C++ app:
http://msdn.microsoft.com/en-us/library/dd293607.aspx
With VS 2010, C++ projects (now .vcxproj files) are built by MSBuild directly; vcbuild.exe is no longer used.
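For example, from an SDK command prompt a whole solution can be built with something like this (the solution name and property values here are placeholders, not from the walkthrough):

msbuild MySolution.sln /p:Configuration=Release /p:Platform=Win32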

For VS 2008 solution files, you will need Microsoft Windows 7 SDK and .NET 3.5:
http://www.microsoft.com/download/en/details.aspx?displaylang=en&id=3138
After installation, use the CMD shell (Programs->Windows 7 SDK->Cmd) to invoke msbuild on solution files.
This version of MSBuild (3.5) uses vcbuild.exe to build C++ projects.

There is no need to install VS 2008.

Thursday, July 28, 2011

Event loop approach to concurrency

The event loop approach to concurrency is an alternative to threading: everything is non-blocking and executed via callbacks, and the event loop runs a queue of callbacks forever. This works well as long as the callbacks complete quickly! A callback that is going to take a long time should hand its work off to another process. The primary application is networking - non-blocking I/O.
Douglas Crockford's presentation on the event loop approach to concurrency

Libraries that use this approach include node.js, Ruby's EventMachine, Python's Twisted, and Java's new JDK 7 Asynchronous I/O along with the older NIO library. The main difference between the new Asynchronous I/O and the older NIO: with NIO you are notified when a read operation is ready to start (data is available), while with Asynchronous I/O you are notified only when the read has completed (all the data has been read).

The design pattern employed in most of these is the Reactor pattern, which activates handlers when events occur (e.g. it invokes a handler to read data from a socket when data is available). The completion-based model of JDK 7's Asynchronous I/O is usually described as the closely related Proactor pattern.
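Here is a minimal sketch of the idea using Ruby's EventMachine - the classic echo server (the port number is arbitrary; the handler methods are the callbacks the reactor invokes):

require 'eventmachine'

# Handler module - EventMachine calls these methods when events occur,
# which is the Reactor pattern in action: the loop detects readiness
# and dispatches to the registered handler.
module EchoServer
  def post_init
    puts "client connected"
  end

  def receive_data(data)   # invoked when data is available to read
    send_data(data)        # non-blocking write back to the client
  end

  def unbind
    puts "client disconnected"
  end
end

# EM.run starts the event loop; it blocks, running callbacks forever.
EventMachine.run do
  EventMachine.start_server("127.0.0.1", 8081, EchoServer)
end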

Wednesday, July 27, 2011

Five minute rule

The new five-minute rule
Compares the cost of holding data in memory vs. the cost of disk I/O. With flash memory prices dropping, and the ability to pool memory from different machines into an ocean of low-latency RAM, the storage hierarchy now looks like:
RAM -> Flash memory -> Disk
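For reference, the break-even formula from Gray and Putzolu's original five-minute rule paper is roughly:

BreakEvenIntervalInSeconds = (PagesPerMBofRAM / AccessesPerSecondPerDisk) x (PricePerDiskDrive / PricePerMBofRAM)

i.e. a page is worth caching in RAM if it will be re-referenced within that interval; the "new" rule re-runs the same calculation with flash sitting between RAM and disk.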

Tuesday, July 26, 2011

CAP theorem

Eric Brewer's presentation of the CAP theorem in his PODC (Principles of Distributed Computing) keynote address: Consistency, Availability, and tolerance to network Partitions - a shared-data system can have at most two of these three properties.
Consistency + Availability: Single-site databases (2-phase commit)
Consistency + Partitions: Distributed databases (pessimistic locking)
Availability + Partitions: DNS (conflict resolution)
Formally proven in a 2002 paper by Seth Gilbert and Nancy Lynch.
BASE (Basically Available, Soft-state, Eventually consistent) sits at the opposite end of the spectrum from ACID.

Werner Vogels' article on Eventual Consistency
Great write-up on CAP here

Tuesday, July 19, 2011

Javascript and OOP

OOP is defined by three things: encapsulation, polymorphism, and inheritance. Douglas Crockford's article claims that JavaScript supports all three, so it is an object-oriented language.

var AnimalClass = function Animal(name) {
    this.name = name;

    // private method - visible only inside the constructor
    function sayPrivate() {
        return "sayPrivate";
    }

    // privileged method - public, but has access to the private method
    this.sayPrivileged = function () {
        return sayPrivate();
    };
};

// public method is added to the prototype
AnimalClass.prototype.say = function (something) {
    return this.name + something;
};

var anAnimal = new AnimalClass("foo");
alert(anAnimal.name);
alert(anAnimal.say("ha"));
alert(anAnimal.sayPrivileged());

The typical way to implement inheritance in JavaScript is via object augmentation. For example, the underscore library defines the following function to extend a given object with the properties of the passed-in object(s).
// Extend a given object with all the properties in passed-in object(s).
_.extend = function(obj) {
    each(slice.call(arguments, 1), function(source) {
        for (var prop in source) {
            if (source[prop] !== void 0) obj[prop] = source[prop];
        }
    });
    return obj;
};

_.extend(anAnimal, { "foo" : "bar" });
alert(anAnimal.foo);

Monday, July 18, 2011

Monitoring app performance

Front-end performance: Speed Tracer can tell you how much time was spent on DOM processing, garbage collection, etc. in the browser.
New Relic tracks both front-end and back-end performance by injecting JavaScript into the browser:
http://blog.newrelic.com/2011/05/17/how-rum-works/

Friday, July 15, 2011

Spring Roo 1.1.5 with GWT & GAE

mkdir rooapp
cd rooapp
start roo shell
roo> project --topLevelPackage com.xxx.rooapp --java 6
roo> persistence setup --provider DATANUCLEUS --database GOOGLE_APP_ENGINE
roo> entity --class ~.model.Product --testAutomatically
roo> field string --fieldName name --notNull
roo> field string --fieldName id --notNull
roo> field date --fieldName dateIntroduced --type java.util.Date --notNull
roo> field number --type java.lang.Float --fieldName unitPrice --notNull
roo> field string --fieldName description --notNull
roo> web gwt setup
I encountered a number of problems ...
The POM file it generated set gae:home to point into the Maven repo. Had to change it for things to work.
Compile failures with gwt:compile:
[INFO] [ERROR] Line 3: The import com.xxx.rooapp.server.gae.UserServiceLocator cannot be resolved
[INFO] [ERROR] Line 4: The import com.xxx.rooapp.server.gae.UserServiceWrapper cannot be resolved
This is because GWT needs access to these sources - these files were not on the GWT source path!
Ran into lots of other errors around the generated classes for GAE and the GWT DesktopInjector ...
[ERROR] Generator 'com.google.gwt.inject.rebind.GinjectorGenerator' threw an exception while rebinding 'com.xxx.rooapp.client.scaffold.ioc.DesktopInjector'

Wednesday, July 13, 2011

Deploying apps with Puppet

Use Puppet master/agent

I want to deploy a simple application that installs a file in /tmp on a single box.

Here is the puppet module definition:
/etc/puppetlabs/puppet/modules/myapp/manifests/init.pp:
class myapp {
    file { 'testfile-local':
        path    => '/tmp/testfile-local',
        ensure  => 'present',
        content => 'Test',
        mode    => '0640',
    }
}

Typically, the puppet master uses a site.pp file for node definitions:
/etc/puppet/manifests/site.pp:

node development {
include "myapp"
}

node staging {
include "myapp"
}

Start the puppet master:
puppet master --no-daemonize --verbose
notice: Starting puppet master version 2.6.4

The site.pp above tells the puppet master that myapp needs to be installed on the development box. If I have a puppet agent running on the client (which in this case happens to be the same box), it will apply the latest configuration from the server automatically the next time it polls.

I could also run the agent manually, one time, by ssh'ing into the box and invoking it:
[root@learn ~]# puppet agent --no-daemonize --onetime --server puppet --verbose
info: Retrieving plugin
info: Caching catalog for puppet
info: Applying configuration version '1310559260'
notice: /Stage[main]/Myapp/File[testfile-local]/ensure: created
notice: Finished catalog run in 0.02 seconds

CONS:
The main problem here is that I am not sure how to trigger a deployment like this from the puppet master itself. I have not checked the UI, but surely there must be a way to update specific nodes - i.e. dev nodes only?

You can always automate this using a capistrano script:
set :user, "root"
task :deploy
role :app, ‘development’
run 'puppet agent --no-daemonize --onetime --server puppet --verbose'
end

The problem with using a script like this is that the environment information has to be maintained in two places – in the script and the site.pp file. One way to avoid this is to generate this script from the site.pp file.
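A rough sketch of what that generation could look like, assuming (as in the script above) that the node names in site.pp double as the host names Capistrano should target:

# generate_roles.rb - scrape node names out of site.pp and emit
# Capistrano tasks that set up the matching :app role.
# (Illustrative only; a real version would also need to handle
# regex nodes, node inheritance and comma-separated node lists.)
site_pp = File.read('/etc/puppet/manifests/site.pp')

node_names = site_pp.scan(/^\s*node\s+(\w+)\s*\{/).flatten

node_names.each do |name|
  puts <<-TASK
task :#{name} do
  role :app, "#{name}"
end
  TASK
end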

PROS:
You are using Puppet to install the application, just like a sysadmin would to ensure that machines were setup correctly. Easy to sell to ops.

Server-less puppet

Use capistrano to run puppet “apply” on the relevant nodes. Here is a Capfile for the app. It assumes that you have manifests checked out on the clients.

set :user, "root"

task :development do
role :app, “development”
end

task :staging do
role :app, "staging"
end

task :deploy do
# TODO: checkout manifests to module path
run "puppet apply -e \"include myapp\""
end

To deploy to development environment, you would run cap for that environment:

rg6977:puppet Thoughtworks$ cap development deploy
* executing `development'
* executing `deploy'
* executing "puppet apply -e \"include myapp\""
servers: ["192.168.56.101"]
Password:
[192.168.56.101] executing command
command finished

PROS:
Node definitions are now in Capistrano and not in puppet. Puppet is used only to install the application in a given environment. Puppet does a poor job of managing environment specific information. See http://docs.puppetlabs.com/guides/environment.html.

CONS:

Node definitions are now in Capistrano and not in Puppet (the flip side of the pro above), and this is more code than the previous approach. Also, puppet "apply" only applies manifests to the local machine, so if your application must be installed on both a web server and a db server, your Capistrano script needs to invoke puppet apply with the appropriate manifest on each (db vs web) - see the sketch below.
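For example, something along these lines (the host names and the myapp::web / myapp::db classes are made up for illustration):

set :user, "root"

# Hypothetical hosts for each tier
role :web, "web1.example.com"
role :db,  "db1.example.com"

task :deploy do
  # Run the tier-specific manifest on the matching machines only
  run "puppet apply -e \"include myapp::web\"", :roles => :web
  run "puppet apply -e \"include myapp::db\"",  :roles => :db
end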

Tuesday, July 12, 2011

Amazon ec2

Command line api:
ec2-describe-images -o amazon   (list Amazon-owned AMIs)
ec2-run-instances <ami-id> -k <keypair-name>
ec2-describe-instances
ec2-authorize default -p 22   (open port 22 for ssh)
- ssh into the box:
ssh -i <keypair-file> root@xxx.amazonaws.com
ec2-terminate-instances <instance-id>
Good tutorial here

Puppet with Nagios

Puppet : Resources, aggregate resources using "defines" and "classes", organize using "modules" (see Puppet Language Guide)
Run stages are used to control the order of resource management. The "sigils" - magical operators - are very powerful when doubled up: @ declares a virtual resource and <| |> collects virtual resources, while @@ declares an exported resource and <<| |>> collects exported resources from other nodes (the usual way Nagios host/service definitions get spread across nodes with Puppet).
Good article on configuring nagios with puppet. Also see the original Puppet example it references.

Thursday, May 26, 2011

SNA Projects Blog at LinkedIn

Building a terabyte-scale data cycle at LinkedIn with Hadoop and Project Voldemort:
http://project-voldemort.com/blog/2009/06/building-a-1-tb-data-cycle-at-linkedin-with-hadoop-and-project-voldemort/
There are many other interesting articles at http://sna-projects.com/blog/

Amazon Dynamo
http://www.allthingsdistributed.com/2007/10/amazons_dynamo.html

Running JBehave tests with xvfb

- Install xvfb
- Start xvfb: Xvfb :1 -screen 0 1024x768x24
- export DISPLAY=:1
- firefox (should start without errors)
- Install a VNC server and use a VNC client like Chicken of the VNC to connect and see your tests running.

Friday, January 28, 2011

Rails Architecture/Performance issues

Typical Rails architecture:
Nginx/Apache --> HAProxy --> Mongrel cluster

HAProxy is an HTTP proxy that forwards requests from the web server to an available Mongrel instance, queuing requests if all Mongrels are busy.

App server options: Passenger, Mongrel, Thin, Unicorn

Monitoring tools:
New Relic, top/iostat, God, Monit

Great blog on reasons for memory bloat in Rails:
http://www.engineyard.com/blog/2009/thats-not-a-memory-leak-its-bloat/
Github architecture: https://github.com/blog/530-how-we-made-github-fast
Rack::Bug: https://github.com/brynary/rack-bug/

Thursday, January 27, 2011

Mailing list managers (Listserv, Mailman) vs MTAs (Sendmail, Postfix)

Some terminology:
Mail Transfer Agent (MTA): implements both the client (sending) and server (receiving) portions of the Simple Mail Transfer Protocol (SMTP).
An MTA receives a message from another MTA, MSA or MUA. If the recipient mailbox is not hosted locally, then it routes it to another MTA. The Domain Name System (DNS) associates a mail server to a domain with mail exchanger (MX) resource records containing the domain name of a host providing MTA services.
MX: A mail exchanger record (MX record) is a type of resource record in the Domain Name System that specifies a mail server responsible for accepting email messages on behalf of a recipient's domain and a preference value used to prioritize mail delivery.
MUA: Mail user agent - an email client like Gmail, Outlook, etc. The MUA uses POP3 (Post Office Protocol) or IMAP (Internet Message Access Protocol) to retrieve messages from the mail server.
MSA: Mail submission agent - sits between the MUA and the MTA. Functionally the same as an MTA.

Sendmail/Postfix use information from the Domain Name System (DNS) to figure out which IP addresses go with which mailboxes.
1. Setup a domain name -e.g. companyA.com
2. Configure name servers for your domain (primary and secondary)
3. Configure MX records for your domain.
4. After the name servers are setup, register your domain using one of the registries.
5. Configure sendmail to listen for mail/route outgoing mail.
6. Mailing lists are configured by setting up aliases. In Postfix, edit the /etc/aliases file. It has the format:
alias: address1,address2. After updating this file, run newaliases (or postalias /etc/aliases) to rebuild the internal db file used by Postfix/Sendmail.

Mailing list managers like Mailman integrate with an MTA like Sendmail or Postfix, so that when new lists are created, or lists are removed, Postfix's alias database will be automatically updated.

Wednesday, January 26, 2011

Nokogiri & JRuby - native library not found!

Nokogiri - the FFI-based version running under JRuby could not find a native library function:

Function 'xmlSchemaValidateFile' not found in [libexslt.dylib] (FFI::NotFoundError)
file:/Users/Admin/Work/googleapps/ruby/teamsapp/WEB-INF/lib/jruby-stdlib-1.5.6.jar!/META-INF/jruby.home/lib/ruby/site_ruby/shared/ffi/ffi.rb:112:in `create_invoker'

Cucumber, Capybara & Webrat, Selenium

Cucumber is a BDD tool that provides a language - the Given/When/Then syntax - for describing "behavior".
The steps in this behavior description may be implemented as pure Ruby code or in a variety of DSLs like Capybara or Webrat for acceptance-testing web applications. Capybara and Webrat are similar - the former has a cleaner/more flexible architecture. They both provide a DSL, e.g.
visit('/projects')
fill_in('First Name', :with => 'John')
click_link('id-of-link')
Here are the reasons why Capybara is better than Webrat (from the author himself):
"Webrat is fantastic, and it has done wonders for testing Ruby webapps,
it provides a very slick and elegant DSL for interacting with webapps,
but it also has a couple of problems:
* It is strongly tied to Rails and the Integration Testing built into
Rails
* It doesn't have (comprehensive) support for testing JavaScript
* It is difficult to extend
* There is no driver agnostic test suite to make sure that Selenium
mode for example behaves the same as Rails mode.
* It cannot run different drivers in the same process, so it can't run
one feature under selenium and another in simulation.
All of these pain points led me to tinker around with building a
driver agnostic solution with the following goals:
* Make it dead simple to switch between different drivers
* Support multiple drivers out of the Box
* Provide a comprehensive test suite which can run against any driver
* Make it work with any Rack based framework
* Make it as compatible as possible with the Webrat API
The result of this work is called Capybara and can be found at GitHub
here: http://github.com/jnicklas/capybara
It uses rack-test to simulate a browser instead of Rails' Integration
Testing, which means that interacting with the controller is out (it's
bad practice anyway, imho. It's an integration test after all). I also
intentionally didn't make have_tag and have_text work, since those by
virtue of how they work will never be useful with Selenium, Culerity
or any other browser simulator. Instead there's have_content,
have_xpath and have_css. Other than that it's very similar to Webrat. "

Monday, January 17, 2011

Ruby/Rails on Ubuntu

For apt-get to install packages on a new Ubuntu install, first run update: apt-get update

sudo apt-get install curl
sudo apt-get install build-essential (installs libraries required to compile C programs/make)
sudo apt-get install zlib1g-dev libreadline5-dev libssl-dev libxml2-dev
sudo apt-get install ruby1.8
ruby1.8 -ropenssl -rzlib -rreadline -e "puts :Hello"   (verifies that the openssl, zlib and readline extensions load)
sudo apt-get install rubygems
sudo gem install rails (rails script is found in /var/lib/gems/1.8)
export PATH=/var/lib/gems/1.8/bin/:$PATH

sudo apt-get install ruby1.8-dev << this package is essential for building ruby native extensions.

MYSQL:
sudo apt-get install mysql-server mysql-client
libmysql-ruby: API module that allows Ruby programs to access MySQL databases.
libmysqlclient-dev: includes development libraries and header files for MySQL.
sudo apt-get install libmysql-ruby libmysqlclient-dev
gem install mysql

APACHE & PASSENGER:
sudo apt-get install apache2
sudo gem install passenger
sudo passenger-install-apache2-module