Tuesday, January 19, 2010

Cloud Computing--No Single Point of Failure

A UC Berkeley technical report on cloud computing observes:

Just as large Internet service providers use multiple network providers so that failure by a single company will not take them off the air, we believe the only plausible solution to very high availability is multiple Cloud Computing providers. The high-availability computing community has long followed the mantra “no single source of failure,” yet the management of a Cloud Computing service by a single company is in fact a single point of failure. Even if the company has multiple datacenters in different geographic regions using different network providers, it may have common software infrastructure and accounting systems, or the company may even go out of business. Large customers will be reluctant to migrate to Cloud Computing without a business-continuity strategy for such situations. We believe the best chance for independent software stacks is for them to be provided by different companies, as it has been difficult for one company to justify creating and maintain two stacks in the name of software dependability.

Read more of the report for other astute observations.
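
To make the multi-provider idea concrete, here is a minimal sketch in Python of reading through several independent providers in turn. Everything named here (fetch_with_failover, provider_a, provider_b) is hypothetical and stands in for real provider clients; it is an illustration of the idea, not anything the report prescribes.

```python
from typing import Callable, Sequence


class AllProvidersFailed(Exception):
    """Raised when every configured provider has failed."""


def fetch_with_failover(key: str,
                        providers: Sequence[Callable[[str], bytes]]) -> bytes:
    """Try each provider in order and return the first successful result."""
    errors = []
    for fetch in providers:
        try:
            return fetch(key)
        except Exception as exc:  # any error counts as "this provider is down"
            errors.append(exc)
    raise AllProvidersFailed(f"{len(errors)} provider(s) failed for {key!r}")


# Hypothetical stand-ins for clients of two unrelated cloud providers.
def provider_a(key: str) -> bytes:
    raise ConnectionError("provider A is unavailable")


def provider_b(key: str) -> bytes:
    return f"value-of-{key}".encode()


if __name__ == "__main__":
    # Provider A fails, so the value comes from provider B.
    print(fetch_with_failover("report", [provider_a, provider_b]))
```

The sketch only covers reads. Keeping writes in sync across providers is the harder part and is not addressed here.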

Monday, January 18, 2010

Design Principles for Modern Distributed Systems

The design principles used for Amazon's S3 are generally applicable to modern distributed systems design. Quoting the S3 design principles:

Amazon S3 Design Principles

The following principles of distributed system design were used to meet Amazon S3 requirements:

Decentralization: Use fully decentralized techniques to remove scaling bottlenecks and single points of failure.

Asynchrony: The system makes progress under all circumstances.

Autonomy: The system is designed such that individual components can make decisions based on local information.

Local responsibility: Each individual component is responsible for achieving its consistency; this is never the burden of its peers.

Controlled concurrency: Operations are designed such that no or limited concurrency control is required.

Failure tolerant: The system considers the failure of components to be a normal mode of operation, and continues operation with no or minimal interruption.

Controlled parallelism: Abstractions used in the system are of such granularity that parallelism can be used to improve performance and robustness of recovery or the introduction of new nodes.

Decompose into small well-understood building blocks: Do not try to provide a single service that does everything for everyone, but instead build small components that can be used as building blocks for other services.

Symmetry: Nodes in the system are identical in terms of functionality, and require no or minimal node-specific configuration to function.

Simplicity: The system should be made as simple as possible (but no simpler).
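
As a rough illustration of the "local responsibility" and "controlled concurrency" principles, here is a small Python sketch of a counter where each node updates only its own slot and replicas are merged by taking per-node maximums. This is a generic technique chosen for illustration, with a hypothetical NodeCounter class; nothing in the quoted principles says S3 works this way.

```python
from typing import Dict


class NodeCounter:
    """A counter replica where each node writes only to its own slot."""

    def __init__(self, node_id: str) -> None:
        self.node_id = node_id
        self.counts: Dict[str, int] = {node_id: 0}

    def increment(self, amount: int = 1) -> None:
        # Local responsibility: a node only ever changes its own entry.
        self.counts[self.node_id] = self.counts.get(self.node_id, 0) + amount

    def merge(self, other: "NodeCounter") -> None:
        # Taking the per-node maximum is commutative and idempotent, so
        # merges can arrive in any order, or repeatedly, without locking.
        for node_id, count in other.counts.items():
            self.counts[node_id] = max(self.counts.get(node_id, 0), count)

    def value(self) -> int:
        return sum(self.counts.values())


if __name__ == "__main__":
    a, b = NodeCounter("a"), NodeCounter("b")
    a.increment(2)
    b.increment(3)
    a.merge(b)
    b.merge(a)
    print(a.value(), b.value())  # prints "5 5"
```

Both nodes run the same code with only a node id as configuration, which also loosely matches the "symmetry" principle.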

Sunday, January 17, 2010

Elasticity in Cloud Computing

As Berkeley computer scientists have noted in a recent technical report, providers of cloud computing as a utility bring an economic model where "using 1000 servers for one hour costs no more than using one server for 1000 hours." Economically, this translates into an elasticity of resources "without paying a premium for large scale ... unprecedented in the history of IT."
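
The arithmetic behind that quote is just servers times hours times a flat rate. A small Python sketch, assuming a made-up price of 10 cents per server-hour (the rate and the function name cost_cents are illustrative, not taken from the report or from any provider's price list):

```python
# Hypothetical flat price; real providers publish their own rates.
RATE_CENTS_PER_SERVER_HOUR = 10


def cost_cents(servers: int, hours: int,
               rate_cents: int = RATE_CENTS_PER_SERVER_HOUR) -> int:
    """Cost under pure pay-per-use pricing: only total server-hours matter."""
    return servers * hours * rate_cents


if __name__ == "__main__":
    print(cost_cents(1000, 1))   # 10000 cents: 1000 servers for one hour
    print(cost_cents(1, 1000))   # 10000 cents: one server for 1000 hours
```

The bill is the same either way, but the first option finishes the job in one hour instead of roughly six weeks.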

Interactive GAE / AWS application

Out of pure curiosity, it would be interesting to build applications whose parts interact with both GAE and AWS at the same time.

Monday, January 04, 2010

Building PostgreSQL on Windows

If you want to fetch and build PostgreSQL on Windows, try using the "git" repositories here. The repository served over anonymous CVS is apparently the root repository and the "git" ones are mirrors, but it turns out that the root CVS repository contains some Windows end-of-line characters that cause the build to fail. The repository you can access through "git" does not seem to have them, and the build goes through. You also need to make sure your Cygwin installation includes readline, bison, and flex.
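
If you want to check a checkout for the offending line endings before building, something like the following Python sketch can help. The function name find_crlf_files and the "postgresql" directory are placeholders of mine, not part of the PostgreSQL build tooling.

```python
import os
from typing import List


def find_crlf_files(root: str) -> List[str]:
    """Return paths under root whose contents include a CRLF sequence."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                # Read as bytes so Python does not translate line endings.
                with open(path, "rb") as handle:
                    if b"\r\n" in handle.read():
                        hits.append(path)
            except OSError:
                continue  # skip files that cannot be read
    return sorted(hits)


if __name__ == "__main__":
    # "postgresql" is a placeholder for wherever the source tree lives.
    for path in find_crlf_files("postgresql"):
        print(path)
```

Binary files can contain the same byte pair by coincidence, so treat the output as a list of candidates rather than a definitive answer.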

Code Signs