CFMX - SOAP vs REST benchmarks

One of the applications that we have developed relies quite heavily on web services, and I've been doing some benchmarking to figure out how we can improve its performance.

Currently the application consists of an AIR/Flex frontend and a CF backend, which communicate using SOAP web services. I wanted to understand how much overhead the SOAP layer adds to each ColdFusion request, and whether it would be beneficial to move away from SOAP to a REST-style direct XML format.

Test 1: SOAP CFC vs REST CFM

Initially I created two files, each a very simple hello world example, to test each scenario.

soapservice.cfc:

<cfcomponent>
<cffunction name="hello" access="remote" returntype="string">
   <cfreturn 'hello world'>
</cffunction>
</cfcomponent>

and the CFM/REST version

xmlservice.cfm:

<cfsetting enablecfoutputonly="true">
<cfcontent reset="true" type="text/xml">
<cfparam name="url.method" default="">
<cfswitch expression="#url.method#">
   <cfcase value="hello">
      <cfoutput><?xml version="1.0" encoding="utf-8"?><str>Hello World</str></cfoutput>
   </cfcase>
   <cfdefaultcase>
      <cfoutput><?xml version="1.0" encoding="utf-8"?><str>Unknown Method</str></cfoutput>
   </cfdefaultcase>
</cfswitch>

The results of the load tests were striking.

The REST version of the code was 4 to 5 times faster, which meant roughly 1000 requests per second instead of 200 requests per second.

However, looking at the code I thought it was an unfair comparison as the SOAP one was using OO-style code which was more maintainable. I'm very much an OO person so I didn't want to have to live with horribly unmaintainable code just to get performance.

Test 2: SOAP CFC vs REST CFM with CFC

I created a new version of the CFM file which this time leveraged soapservice.cfc, which meant I could use the same code but bypass the seemingly very slow SOAP layer in ColdFusion.

xmlservicewithcfc.cfm:

<cfsetting enablecfoutputonly="true">
<cfcontent reset="true" type="text/xml">
<cfparam name="url.method" default="">
<cfswitch expression="#url.method#">
   <cfcase value="hello">
      <cfset oSS = createObject("component","soapservice")>
      <cfoutput><?xml version="1.0" encoding="utf-8"?><str>#oSS.hello()#</str></cfoutput>
   </cfcase>
   <cfdefaultcase>
      <cfoutput><?xml version="1.0" encoding="utf-8"?><str>Unknown Method</str></cfoutput>
   </cfdefaultcase>
</cfswitch>

So this version now uses a REST type request but leverages the CFC code for the functionality of "hello world".

Performance was not quite as good as the pure CFM code, at only 3 to 4 times the performance of the SOAP version, but still not bad at all.

Test 3: SOAP CFC vs REST CFM with CFC with caching

One thing that is known to be reasonably costly in CF is the instantiation of CFC objects. A typical way to avoid this is to instantiate the objects once and then store them in the persistent application scope. This technique will work for lots of the code I am looking to optimise so I wanted to see how much impact it would have.

xmlservicewithcfccached.cfm:

<cfsetting enablecfoutputonly="true">
<cfcontent reset="true" type="text/xml">

<cfapplication name="xmlservicewithcfcached" sessionmanagement="false" clientmanagement="false">

<cfparam name="url.method" default="">
<cfswitch expression="#url.method#">
   <cfcase value="hello">   
      <cfif NOT structKeyExists(application,'oSS')>
         <cfset application.oSS = createObject("component","soapservice")>
      </cfif>
      <cfoutput><?xml version="1.0" encoding="utf-8"?><str>#application.oSS.hello()#</str></cfoutput>
   </cfcase>
   <cfdefaultcase>
      <cfoutput><?xml version="1.0" encoding="utf-8"?><str>Unknown Method</str></cfoutput>
   </cfdefaultcase>
</cfswitch>

The performance of this version of the code was almost identical to the first version, i.e. 4 to 5 times faster than the SOAP version, but it is more easily maintainable.
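One caveat with the caching approach: if two requests arrive before application.oSS exists, both may create the component. That is harmless for a stateless component like soapservice.cfc, but for components with expensive setup a lock around the check is a common ColdFusion pattern. The following is a minimal sketch and was not part of the benchmarked code; the lock timeout of 10 seconds is an arbitrary assumption.

```cfm
<cfapplication name="xmlservicewithcfcached" sessionmanagement="false" clientmanagement="false">

<!--- Cheap check first, so that most requests never take the lock --->
<cfif NOT structKeyExists(application, "oSS")>
   <cflock scope="application" type="exclusive" timeout="10">
      <!--- Check again inside the lock: another request may have created it while we waited --->
      <cfif NOT structKeyExists(application, "oSS")>
         <cfset application.oSS = createObject("component", "soapservice")>
      </cfif>
   </cflock>
</cfif>

<cfoutput>#application.oSS.hello()#</cfoutput>
```

Once the component is cached, requests skip the lock entirely, so the extra check should add very little to the steady-state cost.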

Is this a valid benchmark?

In a word no - in the strict sense of a reproducible benchmark. It was done on my development laptop as I don't have lots of spare machines lying around.

The specs of the machine are as follows:

  • Intel Core 2 Duo T7200 @ 2GHz
  • 3GB RAM
  • Ubuntu 7.10 (gutsy) with standard 2.6.22-14-generic kernel
  • ColdFusion 7.0.2 - Multiserver install - default config
  • Apache JMeter 2.3.1 for load testing

Each thread in the tests ran 10,000 loops - so for example the tests where I ran 50 threads would be 10,000 x 50 = 500,000 requests.

The entire test was conducted over the course of one day and there were no reboots during the tests. Mostly the same programs were running but it was connected to the internet so minor variations in load and traffic could have affected the results.

What about CF8?

After I finished the testing on CF7 (which is what we currently have on our production servers) I tried it out on the CF8 install to see how it compared. The results are shown below.

Show me the stats

So here are the pretty stats and charts that I created to understand all the data:

Graph 1:

Table 1 - figures for Graph 1:

Graph 2:

Table 2 - figures for Graph 2:

Need more stats?

I have also uploaded the spreadsheet that I created to generate these stats in the original OpenDocument format and also as an Excel file.

Conclusion

For me, I'll definitely be looking at using the REST methods further on any high-traffic sites where I might previously have used SOAP.

What are we gaining

  • Lots of performance - a 5 fold performance increase can't be ignored.
  • Smaller data being transferred as no SOAP envelope overhead
  • More control over data transferred
  • Ability to use conventional debugging and caching techniques

What are we giving up

  • Automatic handling of complex objects
  • Lots of overhead

I'd love to hear your thoughts on this - what else am I giving up?

Cheers, Mark

Pretty network diagrams and flow charts with Open Source software.

I had been using Dia for all my charts and diagrams, and it does a very good job. The only downside is that it doesn't produce charts as pretty as those from Omnigraffle.

However, inspired by this article, I'm starting to use a new solution, Inkscape, which fulfills my criteria and works on Linux, Mac and Windows.

Feature | Dia | Omnigraffle | Inkscape
Can create pretty pictures | No | Yes | Yes
Cross Platform (Linux, Mac, Windows) | Yes | No - Mac only | Yes
Open Source | Yes | No | Yes
Open file format (to work with version control) | No - Binary | No - Binary | Yes - SVG
Export to PNG | Yes | Yes | Yes

Now you may argue that Inkscape is not a diagramming tool but a generic drawing tool. This is true, but with the addition of a few templates of pre-drawn items it becomes a very good diagramming tool.

I've created a few simple flow chart elements which I use and also scoured the net for some SVG icons for network based diagrams which were sourced from Gnome (via Ubuntu) and from Quantum Bits.

Templates Previews

Flow chart diagram preview

Computing devices preview

Download

Download the svg templates (right click - Save As):

Give it a try - it's very easy.

JMeter load testing - Regex problem solved

I've been having fun playing with JMeter to do some load testing on some of our applications recently and I came across this problem which had me stumped. Eventually my friendly sysadmins put me out of my misery. So to save other people from the same issue, here is the problem and the solution.

Jmeter screenshot - not working

JMeter is a great load testing application but it can be a bit tricky when you are learning it.

For the particular testing I needed to get it to do the following:

  • Log in as a user
  • Retrieve the valid userid
  • Create a new test for the user
  • Get the testid back
  • Load the test details
  • Submit the test

I pretty quickly got everything working, with some regexes to extract the userid and testid as required, and my test case looked like the picture above.

Everything was working except the final submission of the test, which was failing because the regex variable was always using the default value, despite the fact that it was working for the previous step.

Jmeter screenshot - working!!

It turns out that the regex needed to be nested inside the request it was referring to. Otherwise it was applied after every request, which in most cases reset the variable to its default value because the regex didn't match.

Once I put the regexes inside the requests they were no longer constantly being reset and everything worked a treat.

Hope it helps.

Mark

Cairngorm for CF developer series by Joe Rinehart

I just came across this series of posts, which is very helpful for CF people getting started with Flex and Cairngorm.

The other really good reference for people starting with Flex is the series of articles from Steven Webster. These articles refer to earlier versions of Cairngorm but all the core concepts are very well explained.

Cheers, Mark

MemCached Client for CFMX - alpha version

I've been looking at using memcached in order to speed up a few sites that I've been working on.

What does it do?

Memcached is basically a large RAM cache that is distributed across a network and can be shared between many machines. Many CF applications use the application scope to cache information - including the wonderful CF_Accelerate tag. This works great for single servers (and instances) but once you need to scale beyond this you end up with each server having duplicate copies of the cached data.

Memcached gives you a central place to store this info and allows you to move it out of the ColdFusion memory space, so the cache can use as much memory as you have available on the network. Plenty more info on the memcached site.

How do I use it?

Install memcached on a machine (on Ubuntu, using apt):
sudo apt-get install memcached
Then run it (the -vv gives extra debugging info):
memcached -vv

Download the Java API for memcached (for CF7 you need the 1.3.2 version unless you have upgraded your JVM).

Copy it into the lib folder for CF and restart CF, e.g.:

sudo cp java_memcached-release_1.3.2.jar /opt/jrun/lib

Get the CFC:

Then try the following test code:

<cfscript>
   oMemCached = createObject("component","MemCached").getPoolInstance();
   oMemCached.shutdown();
   
   //List of server:ports to use
   serverList = '127.0.0.1:11211';
   
   //Create a new memcached object - this could be stored in application    
   oMemCached = createObject("component","MemCached").init(serverList=serverList);
      
   //Create some data to store    
   aTemp = ArrayNew(1);
   aTemp[1] = '34343';
   aTemp[2] = '134343';
   aTemp[3] = '234343';
   aTemp[4] = '334343';
   aTemp[5] = '434343';

   //Save to keys (one add and one set)
   oMemCached.add('key',aTemp);
   oMemCached.set('key2',aTemp);      
</cfscript>

<!--- Output the data from cache --->
<cfdump var="#oMemCached.get('key')#">
<cfdump var="#oMemCached.get('key2')#">

The cfc is not production ready - but is working in some basic tests. One area that is not working (as far as I can tell) is the failover of servers.

Have a play, and all comments welcome.

OO Part 1: From code libraries to instances

Code libraries

When people first start using CFCs they often use them as a handy way to group related functions together. For example, you could put all the functions that deal with users into the same CFC, and you may have something like the following pseudocode:
User.cfc
function getUserName(id)
function getFirstName(id)
function getLastName(id)
function authenticate(username,password)
function getAllUsers()
function getUserCount()
function createUser(username,password,firstname,lastname)
function updateUser(id,username,password,firstname,lastname)

While this is a useful and valid way to use CFCs, it isn't using them to their full potential.

Dealing with a single user

When working in an object-oriented way we try to discern what the different objects in a system would be. In this case User would be an object, and the user would know certain things about itself (name, username, password, date created) and, just as importantly, it would not know certain things (e.g. the total number of users in the system).

User.cfc
var id //We store the user id in the cfc
function init(id)
function getUserName() //id is not required as it is already in the instance
function getFirstName()
function getLastName()
function updateUser(username,password,firstname,lastname) //also no id required here

This "instance" of a user gives us access to all the properties that a user knows about but shields us from many of the functions that are not relevant.

We now have a user that we can easily conceptualise and work with, without having all of the other methods such as getAllUsers() or getUserCount() getting in our way. It also gives a team of programmers a set of guidelines as to where code should live: if it deals with a single user, it will be in the User CFC.
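To make the pseudocode a little more concrete, here is a minimal sketch of what an instance-style User.cfc might look like in tag-based CFML. The hard-coded values in init() are placeholders (an assumption for illustration); a real version would load them from wherever users are stored.

```cfm
<!--- User.cfc: one instance represents one user (sketch only) --->
<cfcomponent displayname="User" output="false">

   <!--- Private instance data lives in the variables scope --->
   <cfset variables.id = 0>
   <cfset variables.username = "">
   <cfset variables.firstName = "">
   <cfset variables.lastName = "">

   <cffunction name="init" access="public" returntype="any" output="false">
      <cfargument name="id" type="numeric" required="true">
      <cfset variables.id = arguments.id>
      <!--- Placeholder data: a real version would look these up by id --->
      <cfset variables.username = "user" & arguments.id>
      <cfset variables.firstName = "Test">
      <cfset variables.lastName = "User">
      <cfreturn this>
   </cffunction>

   <!--- No id argument needed: the instance already knows who it is --->
   <cffunction name="getUserName" access="public" returntype="string" output="false">
      <cfreturn variables.username>
   </cffunction>

   <cffunction name="getFirstName" access="public" returntype="string" output="false">
      <cfreturn variables.firstName>
   </cffunction>

   <cffunction name="getLastName" access="public" returntype="string" output="false">
      <cfreturn variables.lastName>
   </cffunction>

</cfcomponent>
```

Calling code would then create an instance with createObject("component","User").init(42) and ask it for getUserName() without passing the id around again.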

But how do we create a new user

You'll notice that the above code has no way of creating a user. This is because it doesn't make sense for a user to be able to create another user (we're not talking about giving birth here). So how do we create a user?

We need a place to put the create user method that is outside of the User.cfc - and I have typically used something like UserManager.cfc.

UserManager is also responsible for all functions that deal with more than one user.

UserManager.cfc
function authenticate(username,password) // We don't have a valid user at this point
function getAllUsers()
function getUserCount()
function createUser(username,password,firstname,lastname)

This works really well in that there is now a logical place to put all of the code relating to users, and all the people working on the application know where things should go.
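As a rough illustration of the manager idea, the sketch below implements two of the UserManager methods. It keeps users in an in-memory struct purely so the example stands alone; that storage, the lock name and the choice to return the new id are all assumptions, and a real version would talk to a database.

```cfm
<!--- UserManager.cfc: operations that are not about one specific user (sketch only) --->
<cfcomponent displayname="UserManager" output="false">

   <!--- Assumption: an in-memory struct stands in for the users table --->
   <cfset variables.users = structNew()>
   <cfset variables.lastId = 0>

   <cffunction name="createUser" access="public" returntype="numeric" output="false">
      <cfargument name="username" type="string" required="true">
      <cfargument name="password" type="string" required="true">
      <cfargument name="firstname" type="string" required="true">
      <cfargument name="lastname" type="string" required="true">
      <cfset var newUser = structNew()>
      <cfset var newId = 0>
      <!--- Serialise id allocation in case the manager is shared between requests --->
      <cflock name="UserManager.createUser" type="exclusive" timeout="5">
         <cfset variables.lastId = variables.lastId + 1>
         <cfset newId = variables.lastId>
         <cfset newUser.username = arguments.username>
         <!--- Illustrative only: never store the plain password --->
         <cfset newUser.passwordHash = hash(arguments.password)>
         <cfset newUser.firstname = arguments.firstname>
         <cfset newUser.lastname = arguments.lastname>
         <cfset variables.users[newId] = newUser>
      </cflock>
      <cfreturn newId>
   </cffunction>

   <cffunction name="getUserCount" access="public" returntype="numeric" output="false">
      <cfreturn structCount(variables.users)>
   </cffunction>

</cfcomponent>
```

The id returned by createUser() is what the calling code would pass to User.init(id) when it needs to work with that one user.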

Two CFC's for each object type?

However, having two CFCs for each object type can get really messy, especially when you are drawing class diagrams. The Java solution to this lies in the use of static methods.

Static methods (or class methods) are functions that are available on the class as opposed to the instance, i.e. when we are dealing with the concept of users as opposed to a specific user such as Steve Bennett, who works on level 5.

So now we are back to having the following:

User.cfc
var id
// static methods
function authenticate(username,password)
function getAllUsers()
function getUserCount()
function createUser(username,password,firstname,lastname)
function init(id)

//Instance methods
function getUserName() //id is not required as it is already in the instance
function getFirstName()
function getLastName()
function updateUser(username,password,firstname,lastname) //also no id required here

But aren't we back where we started with everything jumbled together? Not really: we now have an obvious set of methods that deal with a single instance and another set that deal with multiple users.

In this example you only have half the number of methods to look through when you are dealing with a specific user (i.e. all the instance methods) and the same when you are dealing with multiple users.

Enforcing it in ColdFusion

ColdFusion doesn't naturally differentiate static methods from instance methods. However, with a bit of ingenuity (aka hackery) we can enforce it in our CFCs. I've got a working prototype of this and I'll post some more on it shortly.
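Until that follow-up, here is one possible way the split could be enforced. This is a guess at the general idea and not the prototype mentioned above: instance methods call a private guard that throws unless init() has been run, while "static" methods skip the guard. The names isInstance, requireInstance and the exception type are hypothetical.

```cfm
<!--- User.cfc with a simple static/instance guard (sketch only) --->
<cfcomponent displayname="User" output="false">

   <cfset variables.id = 0>
   <!--- Stays false until init() is called, i.e. until we have a real instance --->
   <cfset variables.isInstance = false>

   <cffunction name="init" access="public" returntype="any" output="false">
      <cfargument name="id" type="numeric" required="true">
      <cfset variables.id = arguments.id>
      <cfset variables.isInstance = true>
      <cfreturn this>
   </cffunction>

   <!--- "Static" method: usable without calling init() --->
   <cffunction name="getUserCount" access="public" returntype="numeric" output="false">
      <!--- Placeholder value: a real version would count the stored users --->
      <cfreturn 0>
   </cffunction>

   <!--- Instance method: refuses to run on an uninitialised component --->
   <cffunction name="getId" access="public" returntype="numeric" output="false">
      <cfset requireInstance("getId")>
      <cfreturn variables.id>
   </cffunction>

   <cffunction name="requireInstance" access="private" returntype="void" output="false">
      <cfargument name="methodName" type="string" required="true">
      <cfif NOT variables.isInstance>
         <cfthrow type="User.NotAnInstance" message="#arguments.methodName#() needs an instance; call init(id) first.">
      </cfif>
   </cffunction>

</cfcomponent>
```

With this in place, createObject("component","User").getUserCount() works, whereas calling getId() on the same uninitialised object throws.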

Further reading

If you are interested in getting more acquainted with object-oriented design, I've found the following books very helpful:

CFC Component Instantiation overhead and dependencies

I've been working on a ColdFusion application that makes really heavy use of application-cached CFC components, which are all initialised in the application.cfc. This makes for a very high performance application, and the components currently have inter-relations based on the fact that each component knows that any dependencies will be available in the application scope.

This works well for the main application where performance is the critical measure. However, it is not very flexible when it comes to creating an admin application that uses the same core libraries but where caching each component in the application scope is a waste of resources.

I'm currently exploring to see if it's possible to get the best of both worlds:

  • Runtime dependency handling (guaranteeing that components dependencies are available)
  • Caching components to ensure we don't have the overhead of instantiating them multiple times.

To discover what the overhead of instantiating a component is, I ran the following tests.

10000 iterations

Large component (45 functions) | T1 | T2 | T3 | Average
Create component each time | 17438 | 11766 | 12010 | 13738
Re-use component | 6952 | 7240 | 8440 | 7544

Small component (1 function) | T1 | T2 | T3 | Average
Create component each time | 9059 | 9557 | 8644 | 9087
Re-use component | 5950 | 7238 | 5873 | 6354
What do these numbers mean?

For the large component, about 45% of the time was spent creating instances: (13738 - 7544) / 13738 ≈ 45% on the averages.

For the small component, about 30% of the time was spent creating instances: (9087 - 6354) / 9087 ≈ 30%.

What this shows is that it is definitely worth minimising component instantiation in performance-critical applications, as between 30% and 45% of the time could be spent instantiating the component (if it is not reused).
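For anyone wanting to repeat this kind of measurement, a test along these lines could be sketched as follows. This is not the original test script. It assumes the soapservice.cfc from the SOAP vs REST post is in the same directory as a stand-in component, and it uses getTickCount(), which returns milliseconds.

```cfm
<!--- Allow plenty of time: 10000 createObject calls can be slow --->
<cfsetting requesttimeout="600">
<cfset iterations = 10000>

<!--- Case 1: create the component on every iteration --->
<cfset startTick = getTickCount()>
<cfloop from="1" to="#iterations#" index="i">
   <cfset oTemp = createObject("component", "soapservice")>
   <cfset result = oTemp.hello()>
</cfloop>
<cfset createEachTime = getTickCount() - startTick>

<!--- Case 2: create the component once and re-use it --->
<cfset oShared = createObject("component", "soapservice")>
<cfset startTick = getTickCount()>
<cfloop from="1" to="#iterations#" index="i">
   <cfset result = oShared.hello()>
</cfloop>
<cfset reuseTime = getTickCount() - startTick>

<!--- Share of case 1 that went on instantiation; guard against a zero timing --->
<cfset overhead = 0>
<cfif createEachTime GT 0>
   <cfset overhead = (createEachTime - reuseTime) / createEachTime * 100>
</cfif>

<cfoutput>
Create each time: #createEachTime# ms<br>
Re-use: #reuseTime# ms<br>
Instantiation share: #numberFormat(overhead, "0.0")#%
</cfoutput>
```

Running each case several times and averaging, as in the table above, helps smooth out the variation between runs.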

Stay tuned for a solution to instantiating and caching components.

Who what when of cfc's, custom tags and cfincludes

There are a lot of ways to reuse code in ColdFusion, to the point that for new people it is becoming more difficult to understand where each technique is appropriate. There are CFCs, custom tags, cfimported custom tags, includes and UDFs.

I've developed these rough guidelines over the past few years:

[More]

Object Orientation for CF Developers

Why?

Why is Object Oriented code worth learning:

  • reliability - possible to test each component
  • maintenance - it is easy to know where things should be
  • encapsulation - knowing where the hard bits are
  • reuse - all code is in libraries not pages of your site

[More]

BlogCFC was created by Raymond Camden. This blog is running version 5.1.004.