Tuesday, September 9, 2014

Callback Hell Awaiting ES6 Heaven - Promises (Hebrew)

A lecture I gave on callbacks, promises, and async/await-style programming in Node.js.



Wednesday, April 23, 2014

Finally - a concise post explaining what I feel about TDD

http://david.heinemeierhansson.com/2014/tdd-is-dead-long-live-testing.html

Superb quote:

The current fanatical TDD experience leads to a primary focus on the unit tests, because those are the tests capable of driving the code design (the original justification for test-first).

I don't think that's healthy. Test-first units leads to an overly complex web of intermediary objects and indirection in order to avoid doing anything that's "slow". Like hitting the database. Or file IO. Or going through the browser to test the whole system. It's given birth to some truly horrendous monstrosities of architecture. A dense jungle of service objects, command patterns, and worse.
Yes!
And what is the answer?

I rarely unit test in the traditional sense of the word, where all dependencies are mocked out, and thousands of tests can close in seconds. It just hasn't been a useful way of dealing with the testing of Rails applications. I test active record models directly, letting them hit the database, and through the use of fixtures. Then layered on top is currently a set of controller tests, but I'd much rather replace those with even higher level system tests through Capybara or similar.

I think that's the direction we're heading. Less emphasis on unit tests, because we're no longer doing test-first as a design practice, and more emphasis on, yes, slow, system tests. (Which btw do not need to be so slow any more, thanks to advances in parallelization and cloud runner infrastructure).
Yes!

And again, yes!

 

Sunday, March 30, 2014

Towards a better programming

The other day, I came to the conclusion that the act of writing software is actually antagonistic all on its own. Arcane languages, cryptic errors, mostly missing (or at best, scattered) documentation - it's like someone is deliberately trying to screw with you, sitting in some Truman Show-like control room pointing and laughing behind the scenes. At some level, it's masochistic, but we do it because it gives us an incredible opportunity to shape our world.

-- Chris Granger, Towards a better Programming

While it seems that Chris is trying to do what others have already tried and failed to do (intentional programming, anybody?), his description of the programming universe is spot on!

But I wish him luck, and hopefully he'll succeed!

Friday, April 20, 2012

Elements of the CloudShare Universe

(cross-posted from http://blog.cloudshare.com/2012/04/18/elements-of-the-cloudshare-universe/)


After discussing CloudShare's development process, I wanted to delve into the subject of testing and try to provide a glimpse of how testing is done on the CloudShare development team.
As I started writing the post, I stopped almost immediately - how could I explain how we're testing, without first explaining what we're testing? Do I really want to put the cart before the horse?
Nope. I want to explain how CloudShare works before explaining how we test things. So, without further ado...

How CloudShare Does its Thing

There are four main components that make CloudShare tick:
  • The Frontend (we call it “UIBL” - weebill - for “UI and Business Logic”). This is the web app everybody uses to set up their environments and use them. It also handles all payment, licensing, and permission stuff.
  • The Backend. This is the code that does everything behind the scenes – creates the environments, runs them, suspends them, publishes, and other stuff like configuring the networking.
  • The VM Infrastructure. This is the software that runs the virtual machines themselves. This was not written by CloudShare – we use VMware for this purpose.
  • The Hardware. These are servers that run the VMware software. Each server runs several virtual machines.
Let’s discuss each of them, in a bottom-up fashion – I’ll start from the hardware and work my way up.

The Hardware

Since this is primarily a post about software, a simplistic view of the hardware is good enough: we have lots and lots of servers, with lots and lots of CPUs, and these servers have lots and lots of memory and are connected to lots and lots of disks with lots and lots of terabytes of space. And what do we run on this hardware? VMware – our Virtual Machine infrastructure.

The VM Infrastructure

VMware is one of the leading VM software vendors. Its software enables us to run multiple virtual machines on a server – with each virtual machine unaware that it is running alongside the others. This allows us to share our servers amongst multiple users (you didn’t really think you get a real machine each time you create a machine in an environment, right?). VMware allows us to create such a machine and configure its CPU, RAM, disk, and networking layer. Finally, it allows us to run it, suspend it, reboot it, and connect to it via SSH, console, or RDP. So if we have this wonderful software that can create and manage the virtual machines, why do we need the Backend?
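Before we answer that, here is a rough sketch of the kind of VM-lifecycle interface this layer gives the Backend. It is purely illustrative – every name here is hypothetical, not CloudShare's code and not VMware's actual API:

    from abc import ABC, abstractmethod

    class VmInfrastructure(ABC):
        """The operations the Backend needs from the VM layer (hypothetical names)."""

        @abstractmethod
        def create_vm(self, name: str, cpus: int, ram_mb: int,
                      disk_gb: int, network: str) -> str:
            """Create a virtual machine and return its ID."""

        @abstractmethod
        def power_on(self, vm_id: str) -> None:
            """Run the machine."""

        @abstractmethod
        def suspend(self, vm_id: str) -> None:
            """Suspend the machine, keeping its memory state."""

        @abstractmethod
        def reboot(self, vm_id: str) -> None:
            """Reboot the machine."""

        @abstractmethod
        def console_address(self, vm_id: str) -> str:
            """Return an address for SSH/console/RDP access."""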

The Backend

We need the Backend because somebody has to tell VMware what to do. Somebody has to say – “oh, this user requested a Windows 7 Virtual Machine connected to a SharePoint Server? And they want to run it from now for two hours?”. “Oh,” says the Backend, “OK. I’ll just copy our Windows 7 template hard drive and memory, and the SharePoint Server hard drive and memory. And after that is done, I’ll tell VMware about the copied machines, and ask it to run them.” “Oh,” says the Backend, “I totally forgot! I need to figure out which server to run the copied machines on, and to configure the network so that they won’t see any machine except each other.”

The Backend’s job is not easy. Because we live in an imperfect world, all these operations sometimes fail. Just like that. So it needs to retry the operations. And it needs to check that a machine that should be (for example) running is in fact still running, and if not – rerun it. We call this capability “self healing”. That’s an incredibly interesting (and challenging!) topic, so I’ll leave it to another blog post (though there's a tiny taste just below).

The Backend is a very dynamic piece of code, so we use a very dynamic piece of technology – the Python programming language. But who tells the Backend what the user wants? Well, that’s the job of the user, through the Frontend.
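Before we move on to the Frontend, that promised taste of retries and self healing – a hand-wavy Python sketch, where every name is made up for illustration and the real Backend is far more involved:

    import logging
    import time

    log = logging.getLogger("backend")

    def with_retries(operation, attempts=3, delay_seconds=5):
        """Run an infrastructure operation, retrying on transient failures."""
        for attempt in range(1, attempts + 1):
            try:
                return operation()
            except Exception as exc:  # real code would catch narrower errors
                log.warning("attempt %d/%d failed: %s", attempt, attempts, exc)
                if attempt == attempts:
                    raise
                time.sleep(delay_seconds)

    def heal(vm, infrastructure):
        """If a machine that should be running isn't, rerun it."""
        if vm.desired_state == "running" and infrastructure.state(vm.id) != "running":
            with_retries(lambda: infrastructure.power_on(vm.id))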

The Frontend

The Frontend is the GUI. It’s a web app, written in C# and using ASP.NET MVC as its web framework. It shows the GUI you all know – the one that allows you to create and run the virtual machines. And if you’re a CloudShare ProPlus user, it also handles your payments. In the end, it communicates your wants and desires to the Backend. How does it do that? Well, I’ll discuss it in a future post, but suffice it to say that it uses a database.
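As a small teaser for that future post, the general shape is a table of commands that the Frontend writes and the Backend picks up. The sketch below is purely illustrative – the table name and columns are invented, and sqlite3 merely stands in for the real database:

    import sqlite3  # stand-in for the real database

    # assumed schema: commands(id INTEGER PRIMARY KEY, user_id, command, payload, status)

    def enqueue_command(db: sqlite3.Connection, user_id, command, payload):
        """Frontend side: record what the user wants done."""
        db.execute(
            "INSERT INTO commands (user_id, command, payload, status) "
            "VALUES (?, ?, ?, 'pending')",
            (user_id, command, payload),
        )
        db.commit()

    def next_pending_command(db: sqlite3.Connection):
        """Backend side: pick up the next thing to do."""
        return db.execute(
            "SELECT id, user_id, command, payload FROM commands "
            "WHERE status = 'pending' ORDER BY id LIMIT 1"
        ).fetchone()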

That’s It

That’s it. This is how CloudShare does its thing. Now I can rest easy, and in the next post I’ll discuss how we test all those lines of code in the Backend and in the Frontend. (image by ralphbijker, under Creative Commons License)

Monday, March 12, 2012

The CloudShare Agile Development Process

Agile, agile, agile. As a software-as-a-service company, our customers expect us to be agile. Agile in responding to support requests, but also agile in responding to feature requests, and agile in adding functionality to the base product.

Customers today expect monthly enhancements to their services on the web. They expect that they will be getting more for their money than they did the month before. And they expect that the service today will be faster than it was a month ago.

This blog post (and others that will follow) explains how CloudShare delivers the incremental functionality that customers expect to get, on a month by month basis, without impacting the core stability of the service.

The development process in CloudShare rests on three legs:
  • Staggered version releases every two weeks
  • Automated branch merging
  • Aggressive unit testing
  • Final Testing
(well, OK, four legs.)

Staggered Version Releases

So how does the development team release a version every two weeks? By doing staggered versions. And what are staggered versions? Simple – versions that the development team works on in parallel: the team always works on three versions at once.
(I can hear you saying – “and that’s simple?”. Well, bear with me.)

Let’s say we’re a new company, where the service was never deployed to the Internet. No customers whatsoever. The development team is developing the first version, which we’ll call A.
After two weeks of development, the developers pass it on to final testing, and start developing version B.

So now we have version A in final testing and version B in development. Invariably, final testing finds bugs (not many, mind you, but we’ll get to that when we discuss aggressive unit testing). So while the developers are developing version B, they are also fixing bugs in version A.

Finally, after two weeks of testing and development, version A is ready to be deployed on the Internet, version B is ready to be tested, and a new version, C, is starting to be developed.

So in the next two weeks, development has to contend with three versions – C, being developed, B, being tested, and A, in production. All three of these may need coding. Version C, obviously, is being coded and developed. Version B is being changed to fix the bugs found by testing, and version A is patched if some horrible bug was found in production (don’t worry – it’s rare that this happens).

So there you have it – three versions that are being worked on in parallel – the dev version, the test version, and the prod (production) version.

Now comes the interesting part – when the developers start working on version D. This is also when we start testing version C, and when version B is being deployed on the Internet.

You can work it out by yourself (answers at the bottom of the post). After two weeks, the developers start working on version ___1, we start testing version ___2, and version ___3 is being deployed on the Internet.

To summarize – we are always working on three versions – most of the work is on the dev version, a bit of work is on the test version, and in rare circumstances, we patch the prod version.
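For the programmatically inclined, the rotation is easy to simulate. This is a toy sketch, not CloudShare code:

    import string

    def staggered_releases(cycles):
        """Print which version is in dev, test, and prod in each two-week cycle."""
        versions = string.ascii_uppercase
        for cycle in range(cycles):
            dev = versions[cycle]
            test = versions[cycle - 1] if cycle >= 1 else "-"
            prod = versions[cycle - 2] if cycle >= 2 else "-"
            print(f"weeks {2 * cycle:2}-{2 * cycle + 2:2}: dev={dev}  test={test}  prod={prod}")

    staggered_releases(4)
    # weeks  0- 2: dev=A  test=-  prod=-
    # weeks  2- 4: dev=B  test=A  prod=-
    # weeks  4- 6: dev=C  test=B  prod=A
    # weeks  6- 8: dev=D  test=C  prod=B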

Automated branch merging

“Ha ha!”, you say, “I’ve got you now: this procedure looks really good in theory, but in practice, each change you make for the test version, you also need to make for the dev version!” And it gets worse! If we patch the prod version, we need to make the same patch in the test version and the dev version.
As they say: “Inconceivable!”

Yes, inconceivable – unless you have automated branch merging! With this technique, our source control tools automatically merge the changes made in one branch (for example, a fix we made in the qa branch for a bug that QA found) into another branch (the dev branch). All this happens automatically, without the developer needing to do anything.

The beauty of this approach is that if the versions were months apart, it would never work; but since the versions are at most two weeks of changes apart, most of the time the merge succeeds without any problem. And in the rare case where the automatic merge fails, the developers get an email asking them to do the merge manually, which they do.
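A minimal version of such automation might look like the sketch below. The branch names, the scheduler, and the notification helper are all assumptions, and this post doesn't say which source control tool we actually use – git here is just for illustration:

    import subprocess

    def merge_branch(source, target, notify):
        """Try to merge `source` into `target`; on conflict, ask a human to do it."""
        subprocess.run(["git", "checkout", target], check=True)
        result = subprocess.run(["git", "merge", "--no-edit", source])
        if result.returncode == 0:
            subprocess.run(["git", "push", "origin", target], check=True)
        else:
            subprocess.run(["git", "merge", "--abort"], check=True)
            notify(f"automatic merge {source} -> {target} failed; please merge manually")

    # e.g. run by a scheduler after every commit to the qa branch:
    # merge_branch("qa", "dev", notify=email_the_committers)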

But how do we know that the merge actually works, especially when it is automatic? This brings us to the third leg of our agile development process…

Aggressive unit testing

How do developers know that the code they wrote is working? The obvious answer – test it – works. But how would we know that the code we wrote did not break other stuff? The same answer – test everything – won’t work. There is just too much to test, and too little time. And throwing it at the testing team to check is also not going to work. Any bug found by final testing costs a lot of time for the team. Time better spent on coding.

So what should a developer do? Well, we do what any professional developer does – automate. This is a recurring theme in the CloudShare development team (and in any agile team) – if you want to be agile, you need to automate.

How can we automate testing? The answer today is simple – automated unit testing. I could discuss CloudShare unit testing for hours, and I will dedicate a blog post to this incredibly important subject, but suffice it to say for now that we have tests for every conceivable customer scenario (and lots of inconceivable customer scenarios, just in case the inconceivable becomes conceivable).
Unit tests, in CloudShare, are a safety net – they make sure that even if a developer does something wrong, the net will catch the mistake. And it is not only the developer’s responsibility to run all the tests – the tests run automatically every day on the three branches (dev, test, prod). We have tens of machines just waiting to run these tests in parallel, so that feedback about a change reaches the developer as quickly as possible.

These unit tests are fast – a developer can get feedback about a change in about 15 minutes. But they are fast for a reason – they test things programmatically, not through the browser UI. They also run most tests without touching the file system or the cloud infrastructure.
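For illustration, such a fast, programmatic test might look roughly like this. It's a pytest-style sketch; the functions and names are invented, not CloudShare's actual code:

    # test_environments.py – run with pytest (all names invented for illustration)

    def start_environment(machine_names, infra):
        """Toy stand-in for a Backend entry point: power on every machine."""
        for name in machine_names:
            infra.power_on(name)

    class FakeInfrastructure:
        """Stands in for VMware, so the test never touches real servers or disks."""

        def __init__(self):
            self.running = set()

        def power_on(self, vm_id):
            self.running.add(vm_id)

    def test_start_environment_powers_on_all_machines():
        infra = FakeInfrastructure()
        start_environment(["win7", "sharepoint"], infra)
        assert infra.running == {"win7", "sharepoint"}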
So what happens when a bug is introduced in the UI, or in the way we use the cloud infrastructure or the file system? This is where we reach our last leg – and the most important one:

Final Testing

Our Quality Assurance team – the testing team – gets a version that is already thoroughly debugged. It has passed all the unit tests, so they can be fairly confident that most of the functionality works. But we still test – for the reason mentioned above, and also to verify that nothing has slipped through the cracks.


Our Quality Assurance engineers test each of the new features that were implemented. But they also test all the other functionality. Given that we have two weeks to test a version – new functionality and old functionality – we also have to resort to automation here. The testing team has a comprehensive system of tests that run the system just like the customer does – through the browser, and using the same infrastructure as in production.
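A browser-driven system test could look something like the sketch below. Selenium is my choice for illustration only – the post doesn't say which browser-automation tool the team uses – and the URL, element IDs, and credentials are placeholders:

    from selenium import webdriver
    from selenium.webdriver.common.by import By

    def test_login_reaches_environments_page():
        driver = webdriver.Chrome()
        try:
            driver.get("https://example.invalid/login")  # placeholder URL
            driver.find_element(By.ID, "email").send_keys("qa@example.com")
            driver.find_element(By.ID, "password").send_keys("not-a-real-password")
            driver.find_element(By.ID, "login-button").click()
            # the customer-visible check: did we land on the environments page?
            assert "Environments" in driver.title
        finally:
            driver.quit()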

Final Words

Agility is not just getting new functionality out the door quickly. It is building a software development process – a company culture – that supports agility. A culture that understands that you have to work on multiple versions at once, build tools to enable this, and above all – aggressively test, test, test.




Answers: 1 – E, 2 – D, 3 – C