Author Topic: The potential time/cost benefits of improved code efficiency. (Read 4962 times)

IainB · « **on:** November 26, 2014, 06:32 AM »

With the increasingly higher speed processors and faster disk access times that we may be accustomed to nowadays, code efficiency (including, for example, execution efficiency and the utilisation of CPU secs. and I/O operations) is not necessarily such a pressing matter of concern for developers as it was in times past. So I was quite interested in reading the case study below about how relatively marginal efficiency improvements in a relatively large-scale computing platform could lead to significant time/cost savings.
(Copied below sans embedded hyperlinks/images.)

How shaving 0.001s from a function saved $400/mo on Amazon EC2 | Ben Milleare

If premature optimisation is the root of all evil, then timely optimisation is clearly the sum of all things good.

Over at ExtractBot, my HTML utility API, things have been hotting up gradually over several months; to the extent that, at peak, it’s now running across 18 c1.medium instances on Amazon EC2. Each of those weighs in at 1.7 GB of memory and 5 compute units (2 cores x 2.5 units).

At standard EC2 rates that would work out at around $2.52/hr (almost $2000/mo).

Amazon states that one EC2 compute unit is the “equivalent CPU capacity of a 1.0-1.2 GHz 2007 Opteron or 2007 Xeon processor”. So that’s like having 90 of them churning through HTML; and it takes a lot of it to keep them busy.
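As a rough check of the fleet figures quoted above, here is a minimal Python sketch. The instance count, compute units and hourly cost come from the article; the 730-hour month is an assumption (an average month), and the variable names are illustrative only.

```python
# Figures quoted in the article
INSTANCES = 18
COMPUTE_UNITS_PER_INSTANCE = 5   # 2 cores x 2.5 units
FLEET_HOURLY_COST = 2.52         # USD per hour for all 18 instances

# Assumption (not from the article): an average month of 8760 / 12 = 730 hours
HOURS_PER_MONTH = 730

fleet_units = INSTANCES * COMPUTE_UNITS_PER_INSTANCE
per_instance_hourly = FLEET_HOURLY_COST / INSTANCES
fleet_monthly = FLEET_HOURLY_COST * HOURS_PER_MONTH

print(f"{fleet_units} compute units in total")
print(f"${per_instance_hourly:.2f}/hr per instance, ${fleet_monthly:,.0f}/mo for the fleet")
```

Under that assumption the fleet comes to roughly $1,840/mo, which is consistent with the article's "almost $2000/mo".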

It’s not so much the number of requests that dictates CPU load with ExtractBot, but more what the assemblies look like (think of an assembly as a factory conveyor belt of robots passing HTML snippets to each other). Now, most of our beta testers are fairly low volume right now, but one of them is a little different; over ~18 hours of each day they pump around 2.2M HTML pages into the system. In their specific assembly, each page runs through a single CSS robot and the results (~10 per page) then get fed into a further 11 separate CSS robots along with a couple of Regex robots.

If we look at just the CSS robots for now, that’s around 244 million over the course of the 18 hour run. Or to put it in a way that’s easier to visualise – over 3,700 per second.
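The 244 million figure can be reproduced with a short Python sketch. It assumes that each of the ~10 results per page is passed to every one of the 11 second-stage CSS robots; the article does not spell this out, but that reading matches its numbers. The function name is hypothetical.

```python
# Figures quoted in the article
PAGES_PER_RUN = 2_200_000        # ~2.2M HTML pages per daily run
RUN_HOURS = 18
RESULTS_PER_PAGE = 10            # ~10 results from the first CSS robot
SECOND_STAGE_CSS_ROBOTS = 11


def css_robot_runs(pages, results_per_page, second_stage_robots):
    """Count CSS robot runs, assuming every result goes to every second-stage robot."""
    # One first-stage robot per page, plus results x second-stage robots
    runs_per_page = 1 + results_per_page * second_stage_robots
    return pages * runs_per_page


total = css_robot_runs(PAGES_PER_RUN, RESULTS_PER_PAGE, SECOND_STAGE_CSS_ROBOTS)
per_second = total / (RUN_HOURS * 3600)

print(f"{total:,} CSS robot runs, about {per_second:,.0f} per second")
```

That works out to 244,200,000 runs, or about 3,769 per second, in line with the "over 3,700 per second" quoted above.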

Normally, shaving 0.001s from a function would not exactly be top of my optimisation hit list, but after looking at where requests were spending most of their time it was obvious it would make considerable difference. 0.001s on 3.7k loops means we could save a whopping 3.7 seconds of CPU time in every second of real time. To put that another way, we could effectively drop about four of our c1.medium instances, a saving on standard EC2 pricing of over $400/mo.
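The saving can be sketched the same way in Python. The 0.001s and 3.7k figures are from the article; the $0.14/hr per-instance rate is implied by $2.52/hr across 18 instances, and the 730-hour month is an assumption. The "about four instances" is taken as the author's own estimate and passed in as an input rather than derived, since the article does not say how CPU seconds map onto instances. Function names are hypothetical.

```python
SAVED_SECONDS_PER_CALL = 0.001   # time shaved from the function (article)
CALLS_PER_SECOND = 3_700         # CSS robot runs per second (article, rounded)
INSTANCE_HOURLY_RATE = 0.14      # USD/hr, implied by $2.52/hr across 18 instances
HOURS_PER_MONTH = 730            # assumption: average month
INSTANCES_DROPPED = 4            # the author's estimate, not derived here


def cpu_seconds_saved_per_second(saved_per_call, calls_per_second):
    """CPU time saved in each second of wall-clock time."""
    return saved_per_call * calls_per_second


def monthly_saving(instances, hourly_rate, hours_per_month):
    """Monthly cost of the instances that are no longer needed."""
    return instances * hourly_rate * hours_per_month


cpu_saved = cpu_seconds_saved_per_second(SAVED_SECONDS_PER_CALL, CALLS_PER_SECOND)
saving = monthly_saving(INSTANCES_DROPPED, INSTANCE_HOURLY_RATE, HOURS_PER_MONTH)

print(f"{cpu_saved:.1f} CPU seconds saved per second of real time")
print(f"${saving:,.2f}/mo saved by dropping {INSTANCES_DROPPED} instances")
```

With these inputs the saving is about $408.80/mo, matching the "over $400/mo" in the article.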

So, what does shaving 0.001s from a single function look like?

[Image: cpudrop_500px. The graph shows a 17% step drop in CPU utilisation.]

This entry was posted in Crawler.io on September 25, 2013.

Renegade · « **Reply #1 on:** November 26, 2014, 07:43 AM »

Or, they could hire crappier programmers for cheaper, fire the expensive ones, save $8,000 a month, and not care about the $400.

Ok, uh, yeah... I'm gonna be right over there by the... <RUNS! />

That was pretty interesting though.

40hz · « **Reply #2 on:** November 26, 2014, 11:27 AM »

In and of itself, it may not be that important to some developers. But to their clients, who are increasingly buying CPU cycles from cloud providers like Amazon, it will inevitably become a major concern. No different than identifying the most fuel efficient vehicles for their fleet purchases.

Faster processors and disk speeds only benefit you if you own those processors and disks. Since so many enterprise customers are looking to get out of owning their own hardware for a variety of reasons (some valid, some not so) I'm guessing "code efficiency" will become a significant selling point in enterprise software not too long from now.

@IainB - +1 w/Ren. It was an interesting read. Thx for sharing.

TaoPhoenix · « **Reply #3 on:** November 26, 2014, 02:18 PM »

It's just a bit sad.

Part of my quasi-infamous Ludum Dare craze was because it harkened back to times when programming was about wresting every ounce of coolness out of ailing hardware. (Really, Racing the Beam, and The Soul of a New Machine!?)

Now we are going all biz-y about stuff. "Ho hum, let's just hire cheaper devs."

IainB · « **Reply #4 on:** November 26, 2014, 07:08 PM »

Well, it's food for thought, isn't it?
Operational code efficiency was not only a salient point when I was learning assembler on mainframes, but also later when I was developing/supporting analysis and reporting programs written in FORTRAN (mostly for cross-tabulation, mathematical programming and financial modelling).
The advent of the conventional 3-tier client-server model tended to somewhat obscure the relevance/need for code efficiency, but it was still relevant to mainframe operations which were being used on some kind of shared service (or time-sharing) basis - which is arguably what the current cloud-based models are.

So what @40hz says is likely to be true:

In and of itself, it may not be that important to some developers. But to their clients, who are increasingly buying CPU cycles from cloud providers like Amazon, it will inevitably become a major concern. ...
-40hz (November 26, 2014, 11:27 AM)

- i.e., it's a business issue.
For much the same reason, the TCO (Total Cost of Ownership) of an IT operation will tend to remain a business issue.

@Renegade's joke about:

Or, they could hire crappier programmers for cheaper, fire the expensive ones, save $8,000 a month, and not care about the $400. ...
-Renegade (November 26, 2014, 07:43 AM)

- would tend to be useful only in a very short-term view. In the longer term, it would frustrate/defeat the theoretical benefits of improving the processes of software development per Humphrey's CMM, and of software operation per Deming's 14-point philosophy; i.e., improving software development process efficiency (the former) and improving operational software efficiency (the latter) would be synergistic business objectives.

Thus "producing the optimum cost-effective capital cost and optimum cost-effective design for fuel-efficiency for fleet vehicles" - to use @40hz's analogy.

@40hz and @Renegade - I thought you might find it interesting!
