Cloud computing

From Citizendium
Revision as of 18:16, 28 February 2010 by imported>Howard C. Berkowitz
This article is developed but not approved.

Cloud computing refers to accessing computing resources that are typically owned and operated by a third-party provider on a consolidated basis in data center locations. It is aimed at delivering cost-effective computing power over the Internet, including virtual private networks (VPN) or even virtual private line networks (i.e., Layer 2 VPN) mapped onto facility providers. Consumers of cloud computing services purchase computing capacity on-demand and are not generally concerned with the underlying technologies used to achieve the increase in server capability.

In terms of the problem it solves, it is less new technology and more "a new deployment model." [1] Many commercial cloud offerings, however, are really no more than traditional managed hosting being marketed as clouds. While much of the trade press is legitimately concerned with security in the cloud, the issue of disaster recovery may be even more important. Several major cloud providers have had significant outages, which proved to be localized to a single physical data center — they had not exploited the inherent failover capabilities of some of the technologies that enable clouds.

It has similarities to a number of network-enabled computing methods, but some unique properties of its own. The core point is that users, whether end users or programmers, request resources without knowing the location of those resources, and are not obliged to maintain the resources. The resource may be anything from an application programming interface to a virtual machine, on which the customer writes an application, to Software as a Service (SaaS), where the application is predefined and the customer can parameterize but not program. Services such as Google, Yahoo, and Hotmail are free SaaS, while some well-defined business applications, such as customer relationship management as provided by Salesforce.com, are among the most successful paid SaaS applications. PayPal and eBay arguably are SaaS models, paid, at the low end, on a transaction basis.

"What goes on in the cloud manages multiple infrastructures across multiple organizations and consists of one or more frameworks overlaid on top of the infrastructures tying them together. Frameworks provide mechanisms for:

  • self-healing
  • self monitoring
  • resource registration and discovery
  • service level agreement definitions
  • automatic reconfiguration

"The cloud is a virtualization of resources that maintains and manages itself. There are of course people resources to keep hardware, operation systems and networking in proper order. But from the perspective of a user or application developer only the cloud is referenced."[2]

It is, by no means, a new concept in computing. Bruce Schneier reminds us that it has distinct similarities in the processing model, although not the communications model, with the timesharing services of the 1960s, which were made obsolete by personal computers. "Any IT outsourcing -- network infrastructure, security monitoring, remote hosting -- is a form of cloud computing."

The old timesharing model arose because computers were expensive and hard to maintain. Modern computers and networks are drastically cheaper, but they're still hard to maintain. As networks have become faster, it is again easier to have someone else do the hard work. Computing has become more of a utility; users are more concerned with results than technical details, so the tech fades into the background.[3]

Offerings

There is no single industry-accepted definition.[4] Some services broker extra capacity available on enterprise servers, as well as resources in pools of managed virtual servers. Others sell capacity on virtual servers. Yet others include any external computing resource, even to outsourced backup services, within the definition.

While the details of the service vary, some common features of sizing apply:

  • Separation of application code from physical resources.
  • Ability to use external assets to handle peak loads (not having to engineer for highest possible load levels).
  • Not having to purchase assets for one-time or infrequent intensive computing tasks.
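
As a rough illustration of the second point, the following Python sketch splits a demand series between a fixed in-house capacity and capacity rented from an external provider. The function name, the capacity figure, and the demand numbers are invented for illustration and do not describe any particular offering.

```python
def split_load(demand, in_house_capacity):
    """Split each period's demand into an in-house part and an external overflow part."""
    in_house = [min(d, in_house_capacity) for d in demand]
    overflow = [max(d - in_house_capacity, 0) for d in demand]
    return in_house, overflow


# Hypothetical demand per hour, in arbitrary "server units".
demand = [40, 55, 120, 300, 80]
in_house, overflow = split_load(demand, in_house_capacity=100)

print("served in-house: ", in_house)
print("sent to the cloud:", overflow)
```

Under these assumptions the enterprise owns enough hardware for its ordinary load and pays the provider only for the hours in which demand exceeds it.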

These definitions are converging, however, on three or four major models:

  • Software as a Service (SaaS): the user interface is a human interface, usually a GUI, or a structured data exchange using XML or industry-specific file formats
  • Platform as a Service (PaaS): customers program the cloud, but at a relatively high level, such as Web Services
  • Infrastructure as a Service (IaaS): customers program the cloud, at a low level, either by a guest copy of an operating system on which they program, or, in some cases, at the low level of emulated hardware (e.g., block structured disk drivers)
  • Data as a Service (DaaS): The cloud is more a repository than an active programming environment; the interface is to files, databases, or backup formats

Software as a Service

Free email and search engines are SaaS, with limited customization, and sometimes premium offerings with more functionality. Salesforce is a prominent SaaS application, which merges into Platform as a Service by hosting third-party applications.

There are two different types of SaaS customers. The first pays at most a nominal fee for these services, or uses them for free in exchange for ads: e.g., Gmail and Facebook. These customers have no leverage with their cloud providers; there is very little recourse in the event of failures. The second type of customer pays considerably for these services: to Salesforce.com, MessageLabs, managed network companies, and so on. These customers have more leverage, provided they write their service contracts correctly. Still, nothing is guaranteed.

Quite a few business services are really SaaS, such as PayPal and eBay; they are services to facilitate transactions between users. Their various credit card and check payment features are examples of how SaaS can be customized without programming.

Since the workload per user is highly predictable, billing for SaaS tends to be on a "per-seat" basis of a charge per registered user per month.

Platform as a Service

PaaS is similar to SaaS in that the user interface is web-based, but differs in that programming is necessary, at a higher level of abstraction than in utility computing. Offerings range from a business-function-specific set of APIs (e.g., StrikeIron and Xignite) to a wider range of APIs, such as those of Google Maps, the U.S. Postal Service, Bloomberg, and even conventional credit card processing services. Billing for PaaS is most commonly on a per-transaction model.

Salesforce.com now offers an application-building system of this kind, called Force.com, as well as its original SaaS products.[5] Google App Engine also falls into this category. Mashup-specific platforms include Yahoo Pipes and Dapper.net.

StrikeIron, for example, might be considered a computer-to-computer mashup, integrating external databases with enterprise data and combining them within common business functions such as call centers, customer relationship management and eCommerce. It refers to its offering as "Data as a Service", and accepts some of the data sources described above, such as the U.S. Postal Service.[6] Xignite is more specific to the financial industry, retrieving data such as stock quotes, financial reference data, and currency exchange rates.

Web service clouds

See also: Web services

The APIs come from the framework of Web services.

Java clouds

Some may consider this Infrastructure as a Service, but Java services also are offered in clouds. There is differentiation among the offerings. Nikita Ivanov describes two basic approaches, which are not mutually exclusive, but different products tend to have one or the other dominate. [7] The first is much like the way a traditional data center is organized, where the developers have little control over infrastructure. "The second approach is something new and evolving as we speak. It aims to dissolve the boundaries between a local workstation and the cloud (internal or external) by providing relative location transparency so that developers write their code, build and run it in exact the same way whether it is done on a local workstation or on the cloud thousands miles away or on both."

Cloud integration

Cloud integration is inter-cloud linkage, perhaps to allow business-to-business rather than user-to-service functionality; it has also been called an Enterprise Service Bus. Vendors in this space, such as Rearden Commerce and Ariba, are brokers between customers and service providers.

Cloud integration comes, at least in part, from virtualization vendors. A Gartner Group analyst, Cameron Haight, says it is really several years away, with issues such as how "one cloud provider can consume the metadata associated with a virtual machine from another vendor," the metadata describing the service requirements of that virtual machine. There is a controversy over the business approach taken by VMware, whose management tools will support only its own hypervisor, as opposed to the more general approach of Citrix and Microsoft.[8] Red Hat is also moving into the virtualization market, with cloud integration, through the open source project called Deltacloud, to facilitate private-public cloud integration.[9]

There are over 1,000 vendors of VMware's vSphere, including AT&T, Savvis and Verizon Business. VMware is offering its vCloud API to the Distributed Management Task Force (DMTF), a step it describes as responsive to open standards. An industry analyst, Chris Wolf of the Burton Group, said that making the API available without the infrastructure is marketing, not interoperability.[8]

Infrastructure as a Service

Amazon.com, Sun, IBM, and others now offer storage and virtual servers that IT can access on demand, while others build virtual datacenters with multiple servers (e.g., 3Tera's AppLogic and Cohesive Flexible Technologies' Elastic Server on Demand). Liquid Computing's LiquidQ offers similar capabilities, enabling IT to stitch together memory, I/O, storage, and computational capacity as a virtualized resource pool available over the network. The charging model is based on the use of resources, such as processing time, memory, and mass storage.
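
The charging models mentioned in this article (per seat for SaaS, per transaction for PaaS, and per resource for IaaS) can be compared in a short Python sketch. All function names, rates, and usage figures below are invented for illustration; amounts are kept in whole cents so the arithmetic stays exact.

```python
def per_seat_bill(users, cents_per_user_month):
    """SaaS-style charge: a flat monthly fee per registered user."""
    return users * cents_per_user_month


def per_transaction_bill(transactions, cents_per_transaction):
    """PaaS-style charge: a fee for each API call or transaction."""
    return transactions * cents_per_transaction


def resource_bill(usage, rates):
    """IaaS-style charge: metered usage multiplied by a rate for each resource."""
    return sum(usage[name] * rates[name] for name in usage)


# Hypothetical monthly figures.
saas = per_seat_bill(users=25, cents_per_user_month=6500)
paas = per_transaction_bill(transactions=10_000, cents_per_transaction=2)
iaas = resource_bill(
    usage={"cpu_hours": 720, "ram_gb_hours": 1440, "storage_gb_months": 500},
    rates={"cpu_hours": 10, "ram_gb_hours": 2, "storage_gb_months": 15},
)

for label, cents in [("SaaS", saas), ("PaaS", paas), ("IaaS", iaas)]:
    print(f"{label}: ${cents / 100:.2f}")
```

Only the resource-based bill depends on how heavily the customer's own software loads the provider's hardware, which is why it needs metering of several quantities rather than a single count.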

Amazon Elastic Compute Cloud (Amazon EC2) is a cloud offering on which customer developers write applications on a wide assortment of virtual machines, which the customer builds from choices among operating systems, databases, web servers, etc.[10]

The first category tends to be for highly interactive applications, so the computational load is less on-demand and less dynamic. The cloud extensions, therefore, tend to be things such as consoles, plugins, and management consoles that either run or do not run, rather than require resource tuning. Examples: RightScale, GigaSpaces, ElasticGrid

The second does not require human interaction, so can be more on-demand and need more sophisticated resource management. Applications here might be MSP infrastructure outsourcing, data mining, etc. Examples: Google App Engine (for Python), GridGain

Some cloud computing applications do try to replicate compute-intensive supercomputer applications using highly distributed parallel processing.[11]

Physical infrastructure as a service

A few data center vendors are targeting cloud providers, allowing them to dynamically configure systems of physical servers, as required by their customer demand, but not requiring they construct a data center.

Software infrastructure as a service

While not necessarily marketed as clouds, certain infrastructure services, such as Domain Name System, are available as on-demand services. Messaging services that may be outsourced are broader than email, especially when there are regulatory requirements for archiving, audit or security.

Messaging

Messaging services are inherently distributed, and now extend to services beyond email. When sent in a corporate context, for example, instant messages may seem less formal to the user than email, but still need to be archived for possible litigation or law enforcement discovery. It is useful to separate the problems that outsourcing can help with from the packaging of outsourced services.[12]

Messaging service providers list some of the reasons for outsourcing, although it must be remembered that this is marketing. For example, some say they will not have configuration errors, but no in-house or outsourced service is perfect. What is true is that a provider that has full-time configuration specialists is less likely to have an error than a small business with a part-time administrator, but the comparison is more difficult for a large enterprise that maintains a skilled in-house staff. The MSP decision is most often based on economy of scale, although it also may consider CAPEX and OPEX, especially when having to meet new regulatory requirements.

  • In-house unexpected failures: these are unplanned outages due to errors and disasters
  • Data loss windows: depending on the system, there may be a shutdown while a backup is taken, or a backup may not capture the most recent messages; MSPs say they are protected against this because they have multiple levels of offsite service. An enterprise, however, can have multilevel storage, perhaps with a mixture of archives and caches. Multilevel backup may be too complex for a small business.
  • Staffing costs
  • Need to rearchitect at various levels of scaling or component life
  • Optimal copies (i.e., deduplication)
  • Validating that archives are tamperproof
  • Rearchitecting due to operational or legal changes in data retention requirements
  • Unpredictable deployments
  • Too much local customization with support knowledge in a few minds
  • Limited capabilities of Exchange Server
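
Two of the items above, deduplication and checking that an archive has not been altered, can be illustrated with content hashing. The Python sketch below is an assumption-laden toy and not any vendor's archive design: the class and method names are invented, storage is an in-memory dictionary, and a digest check of this kind only detects changes to stored content; it does not by itself make an archive tamperproof.

```python
import hashlib


class DedupArchive:
    """Toy message archive that stores each distinct content only once."""

    def __init__(self):
        self._blobs = {}   # SHA-256 digest -> content, one copy per distinct content
        self._index = []   # (mailbox, digest) entry for every archived message

    def archive(self, mailbox, content):
        """Record a message for a mailbox and return the digest of its content."""
        digest = hashlib.sha256(content).hexdigest()
        self._blobs.setdefault(digest, content)  # keep the existing copy if present
        self._index.append((mailbox, digest))
        return digest

    def verify(self, digest):
        """Return True if the stored content still matches its digest."""
        return hashlib.sha256(self._blobs[digest]).hexdigest() == digest

    @property
    def stored_copies(self):
        return len(self._blobs)

    @property
    def archived_messages(self):
        return len(self._index)


archive = DedupArchive()
first = archive.archive("alice", b"Quarterly report attached.")
second = archive.archive("bob", b"Quarterly report attached.")

print(archive.archived_messages, "messages,", archive.stored_copies, "stored copy")
print("content intact:", archive.verify(first))
```

Because identical content always yields the same digest, the same attachment sent to many mailboxes occupies storage once while every mailbox keeps its own index entry.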

There are a variety of good reasons to outsource messaging security functions such as screening for malware and spam, although such outsourcing must fit the enterprise trust and accountability model. Often, different appliances or services are involved with the malware and spam functions, although they may still come from the same MSP. Examples include:

  • Barracuda Networks, a mixture of onsite appliances and outsourced services[13]
    • Barracuda Spam and Virus Firewall [14]
  • Dell EMS Email Security, an outsourced service[12]
  • Google Postini, an outsourced service[15]
    • Malware services[16]
    • Spam and message security services[17]

Data as a Service

Cloud storage is a model of networked data storage where data is stored on multiple virtual servers, generally hosted by third parties, rather than being hosted on dedicated servers.[18] There are cloud storage services for the small and home office market, such as Carbonite (backup)[19] and MozyHome (backup).

Even the consumer services differentiate, from being a "flash drive in the cloud" to offering encryption, incremental backups, data sharing, etc.

Cleversafe points out that for digital content, such as video, one uses an application to access it, not the file system. "If your application is housing your metadata, and only object storage is required, the Simple Object interface can be accessed with either a Java SDK, or HTTP/REST API... the resulting object ID is stored directly within the application." Filesystem interfaces are available if needed, as with loading content.[20] There are high-end massive data backups, as well as federated databases.
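
The division of labor described in the quotation, in which the store holds opaque objects and the application holds the metadata and the object ID, can be sketched in Python. This is not the vendor's Java SDK or HTTP/REST API: the `ObjectStore` class, its `put` and `get` methods, and the catalog layout are hypothetical stand-ins, and the store is an in-memory dictionary.

```python
import uuid


class ObjectStore:
    """Stand-in for a remote object store: a flat namespace of opaque IDs, no file paths."""

    def __init__(self):
        self._objects = {}

    def put(self, data):
        """Store the bytes and return a newly generated object ID."""
        object_id = uuid.uuid4().hex
        self._objects[object_id] = data
        return object_id

    def get(self, object_id):
        """Return the bytes stored under the given object ID."""
        return self._objects[object_id]


store = ObjectStore()

# The application, not the store, keeps the descriptive metadata.
catalog = {}
video_bytes = b"\x00\x01 placeholder video content"
catalog["launch.mp4"] = {
    "title": "Product launch",
    "object_id": store.put(video_bytes),
}

# Retrieval goes through the application's catalog, then to the store by ID.
entry = catalog["launch.mp4"]
print(entry["title"], len(store.get(entry["object_id"])), "bytes")
```

The store never sees a file name or directory; if the application loses its catalog, the objects remain but can no longer be identified, which is the trade-off of keeping metadata on the application side.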

Messaging archives are another form of DaaS. A basic archive simply provides for backup. A more complex service, which includes not only infrastructure services but also specialized end user services for legal purposes (e.g., compliance and discovery), can be delivered with clouds.

  • Barracuda Message Archiver [21]
  • Dell, for example, offers a "Rapid Archive", and an "Enterprise Archive" meant for large companies
  • Postini [22]

Especially with a distributed workforce, it may be wise to have a messaging service that will continue to operate if a customer-owned server fails, or even if the data center is put out of service by a disaster. Various commercial services provide backup email services with standard protocols such as Simple Mail Transfer Protocol, Post Office Protocol, and Internet Message Access Protocol; proprietary servers with mixed proprietary and standard protocols such as Microsoft Exchange; and end user access such as webmail, BlackBerry or other personal devices, and text-to-speech. This may be done purely for disaster recovery, but also can be for legal reasons of compliance or discovery, and for cheaper storage of old data.

  • Barracuda Backup Service[23]
  • Dell EMS Continuity Service

Underlying architecture

Internally, cloud computing almost always uses several kinds of virtualization. The application software will run on virtual machines, which can migrate among colocated or networked physical processors.

For full generality, Citrix CTO Simon Crosby said customers shouldn't have "to ask the cloud vendor whose virtual infrastructure platform was used to build the cloud." Citrix supports the Xen open source hypervisor. Amazon and Rackspace use Xen. Network World does suggest that Citrix, with its smaller market share, must support other hypervisors including VMware and Microsoft, while the reverse is not true.[8]

The customer interface to the VM may be deliberately machine- and OS-independent (e.g., Java virtual machine), or virtualized operating system instances.

From the provider perspective, some of the advantages of offering cloud, rather than more conventional services, include:

  • Sharing of peak-load capacity among a large pool of users, improving overall utilization.
  • Separation of infrastructure maintenance duties from domain-specific application development.
  • Ability to scale to meet changing user demands quickly, usually within minutes
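
The first advantage is an arithmetic one: when customers peak at different times, the capacity needed for the combined load is smaller than the sum of the capacities each customer would need alone. The Python sketch below works through this with invented demand figures for three hypothetical tenants over four periods.

```python
def capacity_if_separate(demands):
    """Each tenant provisions for its own peak; total is the sum of those peaks."""
    return sum(max(series) for series in demands)


def capacity_if_pooled(demands):
    """A shared pool provisions for the peak of the combined load."""
    return max(sum(period) for period in zip(*demands))


def utilization(demands, capacity):
    """Average fraction of provisioned capacity that is actually used."""
    periods = len(demands[0])
    total_demand = sum(sum(series) for series in demands)
    return total_demand / (capacity * periods)


# Hypothetical demand per period for three tenants whose peaks do not coincide.
demands = [
    [10, 80, 20, 10],
    [70, 10, 15, 20],
    [15, 20, 90, 10],
]

separate = capacity_if_separate(demands)
pooled = capacity_if_pooled(demands)

print("separate:", separate, f"utilization {utilization(demands, separate):.0%}")
print("pooled:  ", pooled, f"utilization {utilization(demands, pooled):.0%}")
```

The benefit disappears if all tenants peak in the same period, since the pooled peak then equals the sum of the individual peaks.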

For reasons of commercial reliability, however, the resources will rarely be consumer-grade PCs, either from a machine resource or form factor viewpoint. Disks, for example, are apt to use Redundant Array of Inexpensive Disks (RAID) technology for fault protection. Blade servers, or at least rack-mounted server chassis, will be used to reduce data center floor space and, often, the complexity of cooling and power distribution. These details are hidden from the application user.
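
One common way disk arrays provide fault protection is single parity: an extra block holds the byte-wise XOR of the data blocks, so any one lost block can be rebuilt from the survivors. The Python sketch below illustrates only that principle, with invented block contents; real arrays add striping, rotation of parity across disks, and other levels of redundancy not shown here.

```python
from functools import reduce


def xor_blocks(blocks):
    """Return the byte-wise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Three hypothetical data blocks, one per disk, plus a parity block on a fourth disk.
data_blocks = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(data_blocks)

# Simulate the loss of the second disk and rebuild its block from the rest.
survivors = [data_blocks[0], data_blocks[2], parity]
rebuilt = xor_blocks(survivors)

print("rebuilt block matches original:", rebuilt == data_blocks[1])
```

The reconstruction works because XOR-ing a value with itself cancels it out, leaving only the missing block; it cannot recover from two simultaneous losses.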

The cloud provider can place infrastructure in geographic areas that have reduced costs of land, electricity, and cooling. While Google's developers may be in Silicon Valley, the data centers are in rural areas further north, in cooler climates.

Security and trust

See also: Federal Information Security Management Act of 2002

Security, which includes availability, confidentiality and integrity, may well be the greatest obstacle to deploying cloud technology. It simply may not be possible for the user to audit and control certain security mechanisms in the cloud. There is a spectrum of risk-benefit: few would worry about the read-only webcam that shows a view of the nearby harbor being on any cloud; few would accept a military system that controls the use of nuclear weapons being on other than a highly isolated network.

In other words, it is probably realistic to say clouds, as of early 2010, are secure enough for some missions but not others. The security implementations for some purposes simply are not sufficiently mature. Trust and audit, however, is a broader issue than security controls alone.

Trust, however, is not only a cloud issue. Alan Murphy points out that to get the benefits of clouds, one has to trust the providers for certain things, but doing so continues a trend in information technology. "I have to modify my level of trust, and apply new and stronger safeguards to the rest of my workflow processes (personal and professional) to make sure I’m able to recover if/when there is a massive breach that’s beyond my control. My recovery is something I can control, and I definitely trust myself." In his early work, he did detailed on-site audits of traditional physical data centers.

What I took from that multi-year experience: It’s extremely expensive to conduct these types of audits, and at some point the liability baton is passed to the people actually implementing the technology, away from those who designed it. I could interview people all day, and spend weeks walking through their network, but once I left the premises and filed my report, it was up to them to stick to those procedures. We had to trust (in our case legally) that what I saw remained in place...in the cloud model, we have to trust so many new components in the stack. Of course we can have safeguards (SSL) and checks and balances (pen-tests, people who responsibly disclose security flaws) but at a minimum, those require near unfettered access to systems that are no longer in our control and require knowledgeable people to address them. In my auditing days I had unfettered access, during a specific window of time. Once I was done my access went away.[24]

Various sectors and industries have compliance requirements, such as the Payment Card Industry Data Security Standard (PCI DSS), HIPAA in health care, and FISMA in the U.S. Government. There is no general answer as to whether cloud computing can be trusted for compliance, but analysis may show some customer-cloud combinations where it can, and some where it cannot. Several vendors have said they either are PCI DSS compliant, or, like Amazon, “in the process of, and will continue our efforts to obtain the strictest of industry certifications in order to verify our commitment to provide a secure, world-class cloud computing environment.”[25] Savvis describes PCI compliance in some detail;[26] Terremark states it is compliant but does not go into detail.

References

  1. V. Bertocci (April 2008), Cloud Computing and Identity, MSDN
  2. Kevin Hartig (15 April 2009), "What is Cloud Computing?", Cloud Computing Journal
  3. "Cloud Computing", Schneier on Security, June 4, 2009
  4. Eric Knorr, Galen Gruman (7 April 2008), What cloud computing really means
  5. Force.com Platform, Salesforce.com
  6. Solutions, StrikeIron Data as a Service
  7. Nikita Ivanov, Java Cloud Computing - Two Approaches, GridGain Computing Platform
  8. Jon Brodkin (31 August 2009), "VMware cloud initiative raises vendor lock-in issue", Network World, p. 1, 19
  9. John Fontana (31 August 2009), "Red Hat targets heavyweights in virtualization, cloud computing", Network World, p. 1, 24
  10. Amazon Elastic Compute Cloud (Amazon EC2), Amazon.com
  11. Aaron Ricadela (16 November 2007), "Computing Heads for Clouds", Business Week
  12. White paper: Solving On-Premise Email Management Services with On-Demand Services, Dell Modular Services, 2009
  13. Barracuda Products Overview, Barracuda Networks
  14. Barracuda Spam and Virus Firewall, Barracuda Networks
  15. Hosted security and archiving services for your business, Google Postini
  16. Protect your business with web virus and spyware protection, Google Postini
  17. Protect and secure your existing email system, Google Postini
  18. Lucas Mearian (13 July 2009), "Consumers find rich array of cloud storage options: Which online service is right for you?", Computerworld
  19. About Carbonite, Carbonite Computer Company
  20. Object Storage, Cleversafe
  21. Barracuda Message Archiver, Barracuda Networks
  22. Simplify email retention with a central archive, Google Postini
  23. Barracuda Backup Service, Barracuda Networks
  24. Alan Murphy, "Cloud Computing: a New Level of Trust", Virtual Data Center
  25. Rich Miller (2 January 2009), "Can Cloud Computing Handle Compliance?", Data Center Knowledge
  26. Payment Card Industry (PCI) Data Assessment Solutions, Savvis