The next step in the evolution of cloud computing: Data as a Service

by Mike Sweetman
05 November 2019

As cloud solutions developed rapidly, three main types of cloud systems emerged first: IaaS, PaaS, and SaaS. Then other types of cloud systems began to appear, each offering "something" as a service.

These include DBaaS, CaaS, Desktop as a Service, and so on. One of the directions in which cloud systems have developed is the delivery of data itself: Data as a Service.

Let's look at why this kind of cloud system is needed and why it is in demand across different application areas.

Evolution of cloud computing services

The first generation of cloud computing services

Cloud systems are one of the fastest-evolving branches of computer technology.

Cloud systems began with the introduction of Infrastructure as a Service (IaaS) and further developed in the direction of Platform as a Service (PaaS) and into the independent branch, Software as a Service (SaaS).

The second generation of cloud computing services

The second generation of cloud systems brought very interesting solutions for data storage and processing. Examples include cloud file storage services and cloud database services (both SQL-oriented and NoSQL-oriented). Alongside these, services for streaming audio and video, Desktop as a Service, and others began to develop rapidly.

NoSQL databases are another type of information storage system that can run in the cloud. NoSQL databases are built to service heavy read/write loads and can scale up and down easily, and therefore they are natively suited to running in the cloud. However, most contemporary applications are built around an SQL data model, so working with NoSQL databases often requires a complete rewrite of application code. Some SQL databases have developed NoSQL capabilities including JSON, binary JSON (e.g. BSON or similar variants), and key-value store data types.

A multi-model database with relational and non-relational capabilities provides a standard SQL interface to users and applications and thus facilitates the use of such databases for contemporary applications built around an SQL data model. Native multi-model databases support multiple data models with one core and a unified query language to access all data models.
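As a rough illustration of keeping document-style data behind a standard SQL interface, the sketch below stores a JSON document next to a relational table and reads both with one SQL query. It uses Python's built-in sqlite3 module only because it needs no setup. SQLite is not a native multi-model database, and the table names, column names, and sample data are assumptions made for this example.

```python
import json
import sqlite3

# Hypothetical schema: a relational table and a document table side by side.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute("CREATE TABLE profiles (customer_id INTEGER PRIMARY KEY, doc TEXT NOT NULL)")

conn.execute("INSERT INTO customers (id, name) VALUES (?, ?)", (1, "Alice"))

# The document is stored as JSON text, so its shape is not fixed by the schema.
profile = {"interests": ["cloud", "data"], "newsletter": True}
conn.execute(
    "INSERT INTO profiles (customer_id, doc) VALUES (?, ?)",
    (1, json.dumps(profile)),
)

# One standard SQL query returns the relational row and its document together.
row = conn.execute(
    "SELECT c.name, p.doc FROM customers c "
    "JOIN profiles p ON p.customer_id = c.id WHERE c.id = ?",
    (1,),
).fetchone()

name = row[0]
doc = json.loads(row[1])  # decode the JSON document back into a dict
print(name, doc["interests"])
```

An application built around SQL can keep its existing queries for the relational part and only decode the document column where it needs the flexible fields.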

DBaaS is a type of solution that provides database functionality to multiple consumers. It differs from traditional solutions in that a specific DBMS instance is deployed and subsequently managed at the user's request on a self-service basis, while the provider guarantees a given level of service and charges for resources as they are used. To make these resources available, DBaaS offerings come with service catalogs containing all the necessary attributes; the form and content of these catalogs are determined by the provider.
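The self-service and pay-per-use idea can be sketched in a few lines. This is a toy model under stated assumptions, not any provider's real API: the catalog entries, plan names, prices, and the provision function are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CatalogEntry:
    """One offering in the provider-defined service catalog (hypothetical)."""
    engine: str
    storage_gb: int
    price_cents_per_hour: int


@dataclass
class Instance:
    """A DBMS instance created on request and billed by usage."""
    entry: CatalogEntry
    hours_used: int = 0

    def cost_cents(self) -> int:
        # Pay-per-use: the charge grows only with the hours actually consumed.
        return self.hours_used * self.entry.price_cents_per_hour


# The provider decides the form and content of the catalog.
CATALOG = {
    "small-mysql": CatalogEntry("MySQL", 20, 5),
    "large-mysql": CatalogEntry("MySQL", 500, 40),
}


def provision(plan: str) -> Instance:
    """Self-service request: the user picks a plan, the provider creates it."""
    return Instance(CATALOG[plan])


db = provision("small-mysql")
db.hours_used += 72  # three days of usage
print(db.cost_cents())
```

Prices are kept as integer cents so the billing arithmetic stays exact.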

There is no strict classification of the many ways to implement DBaaS yet, but there are poles around which the different approaches cluster. One involves using a standard database (MySQL, DB2, etc.) that resides not on a specific server but in a private or global cloud. This group of approaches is also called Hosted Database Service, and its main difference from the classical solution is the location of the DBMS server.

The third generation of cloud computing services

The third generation of cloud systems is already focused on more advanced and sophisticated methods of data storage and processing. These include, for example, Data as a Service, cloud systems for artificial intelligence, IoT cloud, machine learning, word processing, speech recognition, speech synthesis, lambda cloud computing, Container as a Service, computer vision, and machine translation between human languages.

The term DaaS (Data as a Service) appeared later than other cloud abbreviations. To better understand the place of DaaS against the general background, you need to build a single system of cloud concepts.

For data that is placed in the cloud and provided as a service, there are several closely related technological solutions, including Database as a Service (DBaaS), Cloud Storage, and Hosted Database Service.

To introduce DaaS, a four-level model can be proposed. Its base level is storage: either well-known solutions used in private clouds or global cloud storage systems such as Amazon S3, AT&T Synaptic Storage, EMC Atmos Online, Mezeo, Nirvanix, Rackspace CloudFiles, etc. The next two levels, which separate data from applications, are much more interesting.

The main step that preceded Data as a Service was Database as a Service: some cloud platforms offer a database service that can be used without a virtual machine. In this case, the user does not need to install and maintain the database on their own.

Instead, the service provider takes responsibility for installing and maintaining the database. For example, Amazon Web Services provides three databases in its cloud service: Amazon SimpleDB (NoSQL, where data is stored in key-value pairs), Amazon Relational Database Service (an SQL-oriented database with a MySQL interface), and DynamoDB.

Data as a Service

A simple and very descriptive definition of Data as a Service can be found on Wikipedia: “DaaS builds on the concept that the product (data in this case) can be provided on demand to the user regardless of the geographic or organizational separation of provider and consumer”.


DaaS is supported by data providers who find data, clean it, sort it, and provide it to users. Factual, established in 2007 by Gil Elbaz, was the first to take this path. He was guided by the simple idea of doing something like Flickr, but for data.

Elbaz decided to create an open repository to which a wide variety of data can be added, with the possibility of changing and supplementing it. Today, the data in Factual ranges from games and entertainment to information from government agencies. This data can be accessed manually or through an open API.

InfoChimps has followed the Factual path, with the difference that it provides not an open repository but a trading platform for data, offering both paid and open data sets. InfoChimps specializes in several areas. The first is the collection and analysis of data from social networks, where its strength is analytics. The second is geographically referenced data. The third is market analysis.

The new approach combines three types of data that are uniquely customized for each individual company:

  1. Primary data. First-party data combined with third-party data. The latter are specialized, hard-to-find data sets assembled from hundreds of Big Data sources, going far beyond typical third-party data. Examples include highly specialized data sources or user interests in specific application areas.
  2. Embedded data. This is offline data that has been converted to addressable online data. This type of data provides new opportunities for attracting customers and opens up prospects in a constantly changing digital universe. Targeted ads can be shown to specific customers or audience segments.
  3. Fast data. Real-time data on user behavior that determines the intention to make a purchase. An example is a status in social networks.

SafeGraph’s President and co-founder Brent Perez likes to remind us that the three pillars of data businesses are Acquisition, Transformation, and Delivery.
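The three pillars of Acquisition, Transformation, and Delivery can be pictured as a small pipeline. The sketch below is only an illustration: the sample records, the cleaning rules, and the function names are assumptions, and a real provider would acquire data from external sources rather than a hard-coded list.

```python
import json


def acquire():
    """Acquisition: collect raw records (hard-coded here instead of a real source)."""
    return [
        {"name": " Cafe Uno ", "city": "berlin"},
        {"name": "Cafe Uno", "city": "Berlin"},
        {"name": "Book Nook", "city": "Paris"},
    ]


def transform(records):
    """Transformation: normalize fields, drop duplicates, and sort."""
    cleaned = {(r["name"].strip(), r["city"].strip().title()) for r in records}
    return [{"name": name, "city": city} for name, city in sorted(cleaned)]


def deliver(records):
    """Delivery: serialize the cleaned data for consumers."""
    return json.dumps(records)


payload = deliver(transform(acquire()))
print(payload)
```

Keeping the stages as separate functions means each one can be replaced independently, for example by swapping the hard-coded list for a crawler or the JSON string for an API response.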

Specific categories of information from Big Data

In cloud-based Data as a Service systems, data storage plays an important role. The information that flows from Big Data into DaaS can be divided into six specific categories:

  • Web mining. This is data that is freely available on the Internet. This category of data collection includes automated processes for detecting and extracting information from web documents and servers, including the extraction of unstructured data: information extracted from server logs, information about user activity from browsers, information about the site structure and links, or data obtained from content and documents.
  • Search data. Information obtained as a result of user search activity in the browser. This data identifies digital audiences using online identifiers assigned to each user.
  • Social networks. The average Internet user spends about two and a half hours a day on social networks. As a result, companies get access to a huge array of data based on consumers' personal preferences, likes, registrations, comments, and reposts.
  • Crowdsourcing. Data that is collected from various sources, including large communities, forums, polls, studies, etc.
  • Transaction data. Data generated in the course of business activity - purchases, requests, insurance claims, deposits, cash withdrawals, airline reservations, purchases with a credit card, etc.
  • Mobile. Mobile data drives the biggest data growth. This is not only information about the use of smartphones and user preferences regarding certain models, but also information obtained using mobile applications or other services running in the background.

Advantages of Data as a Service

The following advantages are very clear for almost all types of users:

  • Agility – Customers can move quickly due to the simplicity of the data access and the fact that they don’t need extensive knowledge of the underlying data. If customers require a slightly different data structure or have location-specific requirements, the implementation is easy because the changes are minimal.
  • Cost-effectiveness – Providers can build the base with the data experts and outsource the presentation layer, which makes for very cost-effective user interfaces and makes requested changes at the presentation layer much more feasible.
  • Data quality – Access to the data is controlled through the data services, which tends to improve data quality, as there is a single point for updates. Once those services are thoroughly tested, they only need regression testing as long as they remain unchanged for the next deployment.
  • Separation of costs – DaaS is attractive to data consumers because it separates data cost and data usage from the cost of a specific software environment or platform.
  • API-based delivery – More recently, Data as a Service solutions offered by leading vendors (MuleSoft, Oracle, Microsoft) help organizations ingest large volumes of data more rapidly, integrate and analyze that data, and publish it to business users in real time through web service APIs that adhere to the REST architectural constraints (also known as RESTful APIs).
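To make the idea of publishing data through a RESTful web service API more concrete, here is a minimal sketch written as a WSGI application in Python. It does not reflect any vendor's product: the /datasets/<name> URL scheme, the sample data, and the call helper are assumptions made for this example.

```python
import json

# Hypothetical data sets published by the service.
DATASETS = {"cities": [{"name": "Berlin"}, {"name": "Paris"}]}


def app(environ, start_response):
    """WSGI application: GET /datasets/<name> returns that data set as JSON."""
    parts = environ.get("PATH_INFO", "").strip("/").split("/")
    is_get = environ.get("REQUEST_METHOD") == "GET"
    if is_get and len(parts) == 2 and parts[0] == "datasets" and parts[1] in DATASETS:
        status, body = "200 OK", DATASETS[parts[1]]
    else:
        status, body = "404 Not Found", {"error": "not found"}
    payload = json.dumps(body).encode("utf-8")
    headers = [
        ("Content-Type", "application/json"),
        ("Content-Length", str(len(payload))),
    ]
    start_response(status, headers)
    return [payload]


def call(path):
    """Invoke the app directly, without a network, and return (status, data)."""
    captured = {}

    def start_response(status, headers):
        captured["status"] = status

    environ = {"REQUEST_METHOD": "GET", "PATH_INFO": path}
    body = b"".join(app(environ, start_response))
    return captured["status"], json.loads(body)


print(call("/datasets/cities"))
```

The same app object could be served over HTTP with the standard library, for instance with wsgiref.simple_server.make_server("", 8000, app).serve_forever(). Calling it directly, as the call helper does, is enough to see how a resource path maps to a JSON representation.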

A huge amount of data often complicates the task of finding really useful information. One solution is Oracle Data Cloud, which provides powerful built-in tools. Whatever type of organization you have, Oracle Data Cloud is designed to help you collect and analyze data, making it easier to understand your processes and track the parameters that matter.

Oracle Data Cloud aggregates and analyzes consumer data powered by Oracle ID Graph across channels and devices to create cross-channel consumer understanding.

In the video “Introducing Enterprise Data As A Service”, George Crump (Storage Switzerland) and David Chang (Senior VP, Co-Founder, Actifio) explain the importance of using Data as a Service in enterprises: “One of the missing elements of the "IT as a Service" model is the crown jewel of IT, data. Data is the information lifeblood of most organizations and it needs to be delivered as a service. Watch as we ChalkTalk through why organization's need to develop a "Data as a Service" strategy and how to actually set that service up”.

Conclusion

In modern conditions, cloud services are increasingly replacing the usual methods of obtaining and processing information. This applies to both databases and data as a service.

Cloud systems for data solve the user's main problem: supplying the necessary information in the right form. The more infrastructure is transferred from the client side to the cloud, the more convenient it becomes to use.

Users immediately felt the benefits of cloud-based data systems as a service, and this class of cloud systems has already become entrenched in real systems to solve a variety of problems.
