What's The Point Of Using WCF In A Web App?

A very common approach to building web applications in .NET is to put most of the non-UI related code behind an internal WCF service layer. I used to be a fan of this approach as well, but these days I just don't see the benefit of that internal service layer anymore. The overhead that an internal WCF service layer adds to development, deployment and runtime performance just doesn't stack up favorably against the supposed benefits IMO. To be clear: I'm talking about WCF services that will only be used by the front end of your web application.

Let's talk about the overhead on development first. If you're using WCF services in your web app, you need proxies to access those services. Some people prefer to generate the proxies based on the WSDL of the services that will be used. In the worst case, this leads to regenerating proxies and all of the types that are defined in the WSDL every time you change a service contract or one of the types that are used by the services. If multiple people need to make changes to any of these concurrently, this easily leads to merging problems when people need to commit their changes. Another way is to share the same types on both sides (client & server), and implement your service proxies by inheriting from ClientBase and manually keeping the implementation of the proxies up to date with the definitions of their service contracts. This is better than regenerating a bunch of code all the time, but you're still writing a lot of redirection code for the purpose of, well, what exactly? Another possibility is to use dynamic proxies which automatically implement the service contracts but this increases the amount of infrastructure code you need to put in place and it's not always clear to everyone how exactly communication with the services happens. There's also a lot of WCF configuration for each service that you need to maintain, and it can quickly grow unwieldy.

Then there's the overhead on performance. I hope we can all agree that any operation that goes out of process is at least an order of magnitude slower than a similar operation that can be executed in process. First of all, there's the networking overhead (even if your services are hosted on the same machine as the web app) that you have to take into account. Secondly, there is the cost of serializing and deserializing everything that is transferred between the client and the server. Even with the most efficient bindings and serializers, the cost of all of this quickly adds up on high-traffic web apps. That's not to say that WCF services are inherently slow. They can be very fast and efficient, but they'll never be as fast and efficient as executing that logic in process within the web app.

Finally, there's the extra overhead it introduces to the deployment phase:

  • more endpoints to set up and transfer artifacts to
  • more configuration
  • more monitoring of endpoints
  • more servers if you're not hosting the services and the web app on the same machine

Of course, people will argue that there are plenty of benefits to using a WCF service layer in a web app. The ones I hear about most often are the forced separation of business logic and UI logic, and improved scalability and reliability. I really disagree that you need a physical separation of business and UI logic. I much prefer approaches where the separation is based on abstractions. A good example was recently posted by Ayende (here and here).

And when it comes to scalability/reliability, a web app that isn't dependent on a WCF service layer is at least as easy to scale (or even easier, depending on your setup) as one that is entirely dependent on WCF services. First of all, if you care about scalability/reliability your web app should already be prepared to run behind a load balancer. If you already have a load balancer in place, you can just add more web servers to your setup when needed. If you host the WCF services on the same machines that are hosting the web front end, you get less total throughput from one server than you would if that one server could just host a web app that fully executes in process (not including the database obviously). If you host the WCF services on separate machines, you end up needing more servers to handle the load and to achieve the reliability you need than you would if you could just add more web servers to your setup. That also increases your licensing costs. And of course, it also means increased networking overhead on every service call, which also implies that the threads on your web servers will be blocked for longer periods while they wait for those service calls to return. Unless you're calling those services asynchronously, but most people simply don't. Also, if you have serious scalability and reliability requirements you're probably better off with asynchronous messaging solutions than with SOAP services.

WCF has its benefits (though I prefer Web APIs or asynchronous messaging over SOAP services these days) and it has its use cases. I just don't think internal service layers for web apps are one of them.

What are the benefits that you think an internal WCF service layer brings to your web app? And what's your opinion on how they stack up versus the downsides?

Written by Davy Brion, published on 3/18/2012 5:28:28 PM
Categories: architecture , code-quality , performance , wcf


Displaying Feed Items On A Web Page: My Solution

A couple of days ago I asked you how you'd implement showing links from an RSS feed on a web page (in this case: my new company web site). These are my requirements for this:

  • It needs to be fast
  • The fewer requests that are impacted by retrieving the feed data, the better
  • If I publish a post, the links on the company website should contain the new link within 30 minutes
  • The simpler the solution, the better

I came up with a very simple solution, which satisfies these requirements better than any other solution I could think of, or heard of from other people. It is extremely fast, doesn't delay any requests, and doesn't require me to deploy anything but the company website. I'm building the site with Express on Node.js, which means I can take full advantage of the asynchronous nature of Node.js to implement this.

Let's go over the code... in the script that starts the express server, I have the following code:

I'll discuss the code in just a moment, but first I want to show the view code that renders the links:

And that's all. This is the solution in its entirety!

If you're new to Node, this code probably requires some explanation. Let's start with this part:

Here I'm adding a dynamic helper to the Express application. It basically means that my views have access to the getRecentFeedItems function, which returns the value of the recentFeedItems variable. It's important to know that the getRecentFeedItems function creates a closure on the recentFeedItems variable created above it. That means that if the value of the recentFeedItems variable changes at any point in time, the getRecentFeedItems function will return that new value.

This just creates a function that we can use later on. It retrieves the feed asynchronously, and when the result is retrieved, we parse the feed using the NodePie library and we get the 5 most recent items which we store in the recentFeedItems variable. Again, this creates a closure on the recentFeedItems variable which means that every time we assign a value to this variable, any subsequent call to the getRecentFeedItems function will return the value we just assigned to it because both functions point to the same memory thanks to the magic of closures. Finally, if a callback is provided as a parameter, the callback will be invoked.

The call to setInterval makes sure that the processFeed function is called every 30 minutes. After that, we call the processFeed function manually, and we pass in a callback where we start the Express server. This guarantees that the feed items will be in memory before the server starts processing requests.
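The three pieces described above can be sketched roughly as follows. This is a minimal illustration, not the code from the site: `createProcessFeed`, `startRefreshing`, `fetchFeed` and `parseFeed` are hypothetical names, with `fetchFeed` standing in for the HTTP request and `parseFeed` for the NodePie parsing step. Keeping the previous items when a fetch fails is also an assumption.

```javascript
// Shared state: every function below closes over this one variable.
let recentFeedItems = [];

// What the views would call through the helper registered on the Express app.
function getRecentFeedItems() {
  return recentFeedItems;
}

// fetchFeed(cb) and parseFeed(xml) are hypothetical stand-ins for the
// HTTP request and the feed parser; they are injected to keep the sketch runnable.
function createProcessFeed(fetchFeed, parseFeed) {
  return function processFeed(callback) {
    fetchFeed(function (err, xml) {
      if (!err) {
        // Reassigning the variable is enough: the next call to
        // getRecentFeedItems() returns the new array.
        recentFeedItems = parseFeed(xml).slice(0, 5);
      }
      // Assumption: on error we simply keep serving the previous items.
      if (callback) callback();
    });
  };
}

// Schedule the periodic refresh, then load the feed once and only start
// the server from the callback of that first load.
function startRefreshing(processFeed, startServer, intervalMs) {
  const timer = setInterval(function () { processFeed(); }, intervalMs);
  processFeed(startServer);
  return timer;
}
```

With Express, `startServer` would typically be something like `function () { app.listen(3000); }` and `intervalMs` would be `30 * 60 * 1000`.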

What makes this solution so great is that we take full advantage of some of Node's benefits. Whenever we retrieve the RSS feed, Node.js will retrieve that data asynchronously. As soon as it has fired the request to get the RSS feed, it just goes to the next event in its event loop so no request is kept waiting while we wait for the data to be downloaded. Until the data from the RSS feed is returned, each request will just use the items that are stored in the recentFeedItems variable. Once the data has been returned, our callback is executed which overwrites the value of the recentFeedItems variable. We don't need to do any locking here because the Node.js event loop is single-threaded: while our callback is running, no other code that has access to the recentFeedItems variable can be executed anyway. And the actual parsing of the RSS feed is done by NodePie, which uses expat behind the scenes, which is supposedly the fastest C XML parser available.

Looking back on my initial requirements, I think this solution matches very well.

Written by Davy Brion, published on 12/20/2011 8:00:55 AM
Categories: express-js , javascript , node-js , performance


Challenge: Displaying Feed Items On A Web Page

I'm finally getting around to implementing the website for my company, and there's one small part of it that's quite interesting from an implementation point of view. The website will have a footer on each page which displays links to my 5 most recent blog posts:

Of course, I don't want to update those links manually whenever I publish a new post, so they need to be retrieved from my blog's RSS feed, which is published by Feedburner. I was hoping to be able to retrieve only the metadata from the posts (date, title and URL is all I need) because my feed always contains the last 20 posts and its total size is usually above 100KB. I haven't found a way to do that, so the information I need has to be extracted from the full feed. Sure, 100KB isn't much, but keep in mind that you need to retrieve it and parse it, that I absolutely want to minimize the time each request takes, and that I'd rather not see any visual delays on the page either.

I'm interested in hearing how you would implement this. You have total freedom to pick the technologies you'd like to use and no limits on how you'd use them. My only requirements are these:

  • It needs to be fast
  • The fewer requests that are impacted by retrieving the feed data, the better
  • If I publish a post, the links on the company website should contain the new link within 30 minutes
  • The simpler the solution, the better

My solution can be found here.

Written by Davy Brion, published on 12/17/2011 4:17:41 PM
Categories: performance


Repeated Failed Log-Ins: What's Your Strategy?

I've only been using the server that's hosting this blog for a week or two, so I'm still keeping a close eye on it. I check usage graphs (CPU, disk I/O and network) a couple of times a day to verify whether things are still running smoothly. This morning, I saw a noticeable increase in CPU usage and network activity that lasted for about 11 hours. I logged into the machine, checked some logs and found out that someone had conducted a brute-force SSH attack that lasted 11 hours. It doesn't make much sense to try that on my server since my SSH daemon doesn't allow password authentication, and indeed there was no successful login during the attack so no harm done, right?

Even if such an attack is not successful, it does consume resources on the targeted server(s). And wasteful, unnecessary resource usage has always been a bit of a pet peeve of mine so I wanted to prevent this from happening again. For this particular scenario, it's pretty easy. I installed DenyHosts which routinely checks for repeated (configured at 5) failed log-in attempts, and adds the offending IP addresses to /etc/hosts.deny so every other attempted SSH connection from those IP addresses will be denied immediately. Each offending IP address will be purged from /etc/hosts.deny after 1 week. Then I added a firewall rule that prevents you from connecting through SSH more than 5 times in 60 seconds. If you go over 5 connections, it just starts dropping packets, and by the time the drop behavior for your IP address expires, you'll have been added to /etc/hosts.deny already. As I said, pretty easy in this scenario because there are great tools I can rely on.

But what would you do if you had to implement a strategy to deal with this yourself? The most interesting approach I've heard of is to add an incremental delay on each failed authentication attempt. If the user fails the authentication check, delay the response by 1 second. If the user fails the second time, delay the response by 2 seconds. Third failure means a delay of 3 seconds, and so on. This pretty much makes a brute-force or dictionary attack impossible. The key, though, is that you can't block any of your request-handling threads because then you open yourself up to an easy DoS attack.
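To get a feel for why this is so effective, here is a small back-of-the-envelope calculation. It assumes the attempts are made one after another against the same counter, with a delay of n seconds on the nth failure; `totalDelaySeconds` is a hypothetical helper name used only for this illustration.

```javascript
// Total time spent waiting after `attempts` consecutive failures,
// when the nth failure is delayed by n seconds: 1 + 2 + ... + n = n(n + 1) / 2.
function totalDelaySeconds(attempts) {
  return (attempts * (attempts + 1)) / 2;
}

// 1000 failed guesses already add up to 500500 seconds of waiting,
// which is roughly 5.8 days.
const secondsFor1000 = totalDelaySeconds(1000);
const daysFor1000 = secondsFor1000 / (60 * 60 * 24);
```

The cost grows quadratically with the number of guesses, while a legitimate user who mistypes a password two or three times only loses a few seconds.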

Implementing this for a web application built on Node.js and Express.js is incredibly easy (there's an ASP.NET MVC example later in this post btw). I took the authorization example of Express.js and made just a few minor changes. First of all, I added the delayAuthenticationResponse function:

This is the most important part of the implementation. Every time we get here, we increment the number of attempts for this user by one and store the number in the user's session. Side note: this is one of the few things you'd actually want to use a session for: session-related data. Then we schedule the callback to be executed after the number of attempts * 1000 milliseconds have passed. The important part to remember here is that Node's event loop is not blocked by this, so our ability to handle other requests is not impaired in any way. The only one who suffers here is the attacker. Note that in a real world implementation, you'd probably only want to start increasing the delay after 5 attempts or so, in order to not piss off users who're just having problems remembering their password.
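A function along these lines could look like the following. This is a minimal sketch based on the description above rather than the original listing; it assumes the session is a plain object on which an `attempts` counter (a name chosen here for illustration) can be stored.

```javascript
// Count the failed attempt in the session, then schedule the callback
// after (number of attempts * 1000) milliseconds. setTimeout only registers
// a timer, so the event loop stays free to handle other requests meanwhile.
function delayAuthenticationResponse(session, callback) {
  session.attempts = (session.attempts || 0) + 1;
  setTimeout(callback, session.attempts * 1000);
}
```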

Then I changed the authenticate function so that it receives a session as the first parameter, and uses our delayAuthenticationResponse function whenever something goes wrong:

After that, it's just a matter of changing the function that is assigned to the login route:
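Put together, the changed authenticate function and the login route handler might look roughly like this. It is a sketch under several assumptions, not the original code: the in-memory `users` map, the fixed salt, the error messages and the redirect targets are all made up for illustration, and the handler assumes that body-parsing and session middleware are already configured so that `req.body` and `req.session` exist.

```javascript
const crypto = require("crypto");

// Hypothetical in-memory user store with a scrypt-hashed password.
const salt = "demo-salt"; // assumption: a real app would use a per-user random salt
const users = new Map([
  ["tj", { name: "tj", hash: crypto.scryptSync("foobar", salt, 32) }],
]);

// Same idea as described earlier: delay the response by 1 second per failed attempt.
function delayAuthenticationResponse(session, callback) {
  session.attempts = (session.attempts || 0) + 1;
  setTimeout(callback, session.attempts * 1000);
}

// The session is the first parameter, so every failure path can be delayed.
function authenticate(session, name, pass, fn) {
  const fail = function (message) {
    delayAuthenticationResponse(session, function () { fn(new Error(message)); });
  };
  const user = users.get(name);
  if (!user) return fail("cannot find user");
  const hash = crypto.scryptSync(String(pass), salt, 32);
  if (!crypto.timingSafeEqual(hash, user.hash)) return fail("invalid password");
  fn(null, user); // success is answered immediately
}

// Handler for the login route, e.g. app.post("/login", login).
function login(req, res) {
  authenticate(req.session, req.body.username, req.body.password, function (err, user) {
    if (err) return res.redirect("/login");
    req.session.user = user;
    res.redirect("/restricted");
  });
}
```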

And there we go. This effectively makes it impossible to brute-force your way into this web application, and I'm sure you can agree it was rather easy to do. Of course, this is only because Node.js is inherently non-blocking. In an environment where non-blocking is the exception rather than the rule, you have to take a few more things into account when trying to implement this strategy.

For instance, ASP.NET MVC is a typical blocking web framework. There's a certain number of threads that are waiting to handle requests, and once they receive a request, they process that request in its entirety. That means that if your code has to wait on something, the request handling thread is blocked and can't handle any other requests. So obviously, if you'd like to implement this strategy for dealing with repeated failed log-ins, you really want to avoid doing something like this:

(note: this is a slightly modified LogOn method from the default AccountController when selecting 'internet application' in the MVC project wizard)

While this looks like it does the same as the Node/Express example, it certainly doesn't. The experience for the attacker is the same, because each failed attempt increases the response time by an extra second. But on your server, the thread handling the request is blocked the whole time and is thus incapable of handling other requests while you're making the attacker wait.

Luckily, you can use ASP.NET MVC's asynchronous controllers to provide an asynchronous implementation of an action without blocking the request handling thread:

Your controller has to inherit from AsyncController instead of Controller to make this work. Of course, it's much more complicated and requires more ceremony compared to the Node/Express approach, but then again, ASP.NET MVC isn't optimized for this kind of usage whereas Node/Express definitely is.

Either way, no matter what web framework you use, if you can add an incremental delay to the response of each failed log-in attempt without blocking a request-handling-thread, you've added a very effective and low-cost protection against brute-force and dictionary attacks.

Written by Davy Brion, published on 9/10/2011 7:38:45 PM
Categories: asp-net-mvc , express-js , node-js , patterns , performance , security


Performance Of NHibernate With Ruby Objects Compared To Traditional C# Objects

I recently showed how you can use NHibernate to persist and query Ruby objects through IronRuby. We've continued the experiment (though we've already done some big optimizations in the code based on the first results of these tests) and we recently had to decide whether or not the performance difference between using NHibernate with regular static C# code and using it with dynamic Ruby objects was acceptable. So we ran a set of tests, and compared all of the numbers. Note that we don't claim that these benchmarks are scientifically correct in any way, but we do think they give us a good idea on what we can reasonably expect. I want to share the results with you, and would appreciate any feedback you guys have on this... particularly on whether or not we missed something obvious in our tests or whether or not we should trust these numbers. After all, we're not professional benchmarkers so our approach might very well just suck :)

We have a scenario which consists of 15 'actions'. For these actions, we use some tables from the Chinook database, basically just Artist/Album/Track/Genre/MediaType. The actions are the following:

  1. Retrieve single track without joins, and access each non-reference property
  2. Retrieve single track with joins, and access all properties, including references
  3. Retrieve single track without joins, and access all properties, including references (triggers lazy-loading)
  4. Create and persist object graph: one artist with two albums with 13 tracks each
  5. Retrieve created artist from nr 4, add a new album with another 13 tracks, change the title of the first album from nr 4, and remove the second album from nr 4 including its tracks
  6. Retrieve created artist from nr 4 and delete its entire graph
  7. Create a single track
  8. Retrieve single track from step 7 and update it
  9. Retrieve single track from step 7 and update the name of one of its referenced properties
  10. Retrieve single track from step 7 and change one of the reference properties so it references a different instance
  11. Delete the track from step 7
  12. Retrieve 100 tracks and access each non-reference property
  13. Retrieve 200 tracks and access each non-reference property
  14. Retrieve 100 tracks without joins and access all properties, including references (triggers lazy-loading)
  15. Retrieve 100 tracks with joins and access all properties, including references

Note: when I say we access reference properties to trigger lazy loading, I mean that we access a non-id property of the referenced property to make sure it indeed hits the database.

The scenario is run 500 times with regular C# objects, and 500 times with Ruby objects. We keep track of the average time of each action in the scenario, as well as the total duration of the scenario. Also, keep in mind that we ran these tests on a local database.
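The measuring approach itself is simple enough to sketch. The snippet below is only an illustration in JavaScript, not the .NET/IronRuby harness we actually used; `timeScenario`, `averagePerAction` and `penaltyPercent` are hypothetical names, and the actions are assumed to be plain synchronous functions.

```javascript
const { performance } = require("perf_hooks");

// Run every action of the scenario once and return each duration in milliseconds.
function timeScenario(actions) {
  return actions.map(function (action) {
    const start = performance.now();
    action();
    return performance.now() - start;
  });
}

// Given the durations of several runs (one array per run),
// compute the average duration of each action across all runs.
function averagePerAction(runs) {
  return runs[0].map(function (_, i) {
    const total = runs.reduce(function (sum, run) { return sum + run[i]; }, 0);
    return total / runs.length;
  });
}

// Relative slowdown of the dynamic version compared to the static one, in percent.
function penaltyPercent(staticMs, dynamicMs) {
  return ((dynamicMs - staticMs) / staticMs) * 100;
}
```

Calling `timeScenario` 500 times per variant and feeding the results to `averagePerAction` gives the per-action averages; `penaltyPercent` then turns a pair of averages into a percentage difference.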

The following graph shows the average duration of each action in milliseconds on the Y axis, and the number of the action on the X axis:


Before I discuss these results, I'd also like to show the following graph which shows the average difference in milliseconds between the static and the dynamic execution of each action:

Two actions immediately stand out: the last two which both deal with fetching a set of items and accessing all of their properties. They're both about 6ms slower than their static counterparts, which is a performance penalty of 71% for action 14, and 87% for action 15. That deals with a part of code that we can't really optimize any more. Well, it probably is possible but we've already done a lot of work on that, and this is the best we can come up with so far.

Now, those 2 actions are things we avoid as much as possible in real code anyway, so maybe they aren't that big of an issue. The other 2 actions where there is a noticeable difference (though it actually means an increase in average execution time of 1.1ms using a local database) are the creation and persistence of an object graph (step 4), and the retrieval/modification/persistence of that same graph (step 5). Most other actions don't have a noticeable difference, and in some cases the dynamic version is actually faster than the static one, no doubt because NHibernate has in some cases less work to do when using the Map EntityMode (which we rely on for the dynamic stuff) compared to the Poco EntityMode.

We also wanted to see whether the performance difference would get worse when spreading the workload evenly over a set of threads, or even a 'pool' of IronRuby engines. I was pretty happy to see that it didn't really lead to a noticeable difference.

The following graph shows the average duration of the entire scenario in a couple of different situations:

I do have to mention that the numbers shown in this graph aren't averages, but the result from running the scenario once in each situation. We did however run the scenarios in each situation more than once, and while we didn't list the averages, the numbers are representative of each test run... we didn't see any really noticeable differences over multiple runs. The percentage difference for each situation is shown in this graph:

As you can see, the performance penalty of the entire scenario in each situation varies between 15% and 26%.

Now, considering the fact that we prefer to avoid loading 'large' sets of data through NHibernate into entities (we prefer to use projections instead for that) we wanted to see what the difference would be for the entire duration of the scenario in each situation, without the final 4 actions. Basically, just the typical CRUD scenarios:

Now the difference varies between 6% and 15%.

Now, suppose that we have a compelling reason to actually go ahead with using this approach (we do actually, but I'm not gonna get into that here), do you think we can trust these numbers? Is there anything else we're missing? Are we complete idiots for testing the performance difference like this? Do you have any feedback whatsoever? Then please leave a comment :)

Written by Davy Brion, published on 10/19/2010 4:22:38 PM
Categories: ironruby , nhibernate , performance , ruby

