What's Data?
If you were asked "What's data?", what would your answer be? It's a somewhat confusing question, but I would describe data as a way of representing knowledge, experience, observations, statistics, facts, concepts, and so on, which can then be formatted or presented in an orderly manner for use in decision making, pattern identification, etc. By that definition, data is one of the most commonplace things we deal with in our day-to-day lives, and it follows that every moment we spend can be converted into some form of data.
Though the term "data" has such a generic meaning, we often use it to refer to a small subset of data constrained by some context. For example, an organization engaged in some form of business is mostly interested in "data" related to that business. Not only organizations but also individuals, applications and various other entities around the world amass data in their own contexts to analyse it and find patterns, keep track of past activities, or carry out whatever tasks they are interested in. Hence the need for a mechanism that stores this data in a meaningful manner and makes it easy to access. The myriad data storage mechanisms available today, ranging from Relational Database Management Systems to NoSQL databases, data warehouses, legacy systems, document repositories, Google and Excel spreadsheets, CSV files, etc., originated to cater to that requirement. Almost every enterprise uses one or a combination of these data storage mechanisms to fulfill its various data manipulation needs.
What's Data Federation?
As the term itself implies, Data Federation usually refers to integrating data scattered across numerous types of data sources into a form that is easy to access. Before the concept of Data Federation was introduced, the most common practice was to first copy the relevant data into additional storage space and then carry out the integration on that copy. The bottlenecks encountered in doing so, such as copyright infringement when copying data and the need for additional storage space, prompted a search for alternatives that avoid them. Among those alternatives, Data Federation could be considered the most advanced and efficient solution, as it lets organizations collect and process data scattered across their various data sources efficiently.
How does the WSO2 DSS fit in?
Among the enterprise data solutions currently on the market that offer Data Federation functionality, WSO2 Data Services Server comes in handy because it supports a wide range of data source types to be federated, from Relational Database Management Systems (RDBMS) such as MySQL, Oracle, MSSQL, PostgreSQL, H2 and Derby to tabular data sources such as Google Spreadsheets, Excel spreadsheets, CSV files, etc. With WSO2 Data Services Server, users can manipulate data stored in multiple types of data sources and present it in a unified format.
In WSO2 DSS, this is implemented using three main functionalities, namely,
1. Multiple data source support.
2. Nested query support.
3. Export parameter support.
1. Multiple data source support
Multiple data source support is an enticing feature of WSO2 Data Services Server that lets users define multiple database configurations within the same data service descriptor. The following diagram depicts how it's done using a sample descriptor. Each database configuration is given an id that uniquely identifies the data source, and this id is used later when integrating the data extracted from the various data sources.
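As a rough illustration of the idea (this is not the DSS descriptor format), the Python sketch below keeps several connections in one place, each under its own id, and lets every query name the id it runs against. The ids `mysql_ds` and `oracle_ds`, the query names, and the table columns are all assumptions made up for this example, and in-memory SQLite databases stand in for the real MySQL and Oracle servers.

```python
import sqlite3


def open_data_sources():
    """Open every configured data source once, keyed by a unique id."""
    # Hypothetical ids; in-memory SQLite stands in for MySQL and Oracle.
    sources = {
        "mysql_ds": sqlite3.connect(":memory:"),
        "oracle_ds": sqlite3.connect(":memory:"),
    }
    # Seed a little sample data so the sketch can run on its own.
    sources["mysql_ds"].execute(
        "CREATE TABLE Employee (id INTEGER, name TEXT, office_id INTEGER)"
    )
    sources["mysql_ds"].execute("INSERT INTO Employee VALUES (1, 'Alice', 10)")
    sources["oracle_ds"].execute("CREATE TABLE Office (id INTEGER, city TEXT)")
    sources["oracle_ds"].execute("INSERT INTO Office VALUES (10, 'Colombo')")
    return sources


# Each query records which data source id it should be executed against.
QUERIES = {
    "all_employees": ("mysql_ds", "SELECT id, name, office_id FROM Employee"),
    "all_offices": ("oracle_ds", "SELECT id, city FROM Office"),
}


def run_query(sources, query_id, params=()):
    """Look up the query, pick its data source by id, and run it there."""
    config_id, sql = QUERIES[query_id]
    return sources[config_id].execute(sql, params).fetchall()


sources = open_data_sources()
print(run_query(sources, "all_employees"))
print(run_query(sources, "all_offices"))
```

The point of the id is only that a query never needs to know connection details; it refers to a data source by name, which is what makes it possible to mix sources within one service.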
2. Nested query support
Nested queries are another vital feature in the process of data federation, and they carry out the real integration of the data queried from different types of data sources. In other words, a data service query can feed the result of its execution as an input to some other query, and both results are eventually integrated into a unified format before being presented to the user. The following diagram depicts the configurations of such sample data service queries and how they are integrated to form a nested query that could be used in the process of data federation.
3. Export parameter support
With this particular feature, the user is given the ability to export values of the output parameters of a particular query to be used in another query.
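To make the idea concrete, here is a minimal Python sketch of one query exporting an output value that a later query picks up as its input. It only mimics the concept and is not how DSS implements it: the `exported` dictionary, the export name `new_office_id`, and the table layout are assumptions for illustration, with in-memory SQLite standing in for a real database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Office (id INTEGER PRIMARY KEY, city TEXT)")
conn.execute(
    "CREATE TABLE Employee (id INTEGER PRIMARY KEY, name TEXT, office_id INTEGER)"
)

# Values exported by earlier queries, keyed by a hypothetical export name.
exported = {}


def add_office(city):
    """First query: insert an office and export its generated id."""
    cur = conn.execute("INSERT INTO Office (city) VALUES (?)", (city,))
    exported["new_office_id"] = cur.lastrowid


def add_employee(name):
    """Second query: consume the exported id as an input parameter."""
    conn.execute(
        "INSERT INTO Employee (name, office_id) VALUES (?, ?)",
        (name, exported["new_office_id"]),
    )


add_office("Colombo")
add_employee("Alice")
row = conn.execute("SELECT name, office_id FROM Employee").fetchone()
print(row)
```

The caller never passes the office id to `add_employee` directly; the value travels from the first query's output to the second query's input through the exported name.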
Having discussed the bits and pieces of the Data Federation implementation in WSO2 DSS, let's delve into some practical use cases where you can actually make use of this feature in a real production environment.
Sample use cases:
Usecase 1: Let's consider a hypothetical use case where an organization keeps the data related to its employees and offices in two RDBMSs, one MySQL and one Oracle. Further imagine that the MySQL database contains a table named "Employee" and the Oracle database contains a table named "Office" to store the relevant data. Here, the user needs to present the two data sets, queried from different databases, merged into a list of offices with each office's employees nested under it.
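The shape of the result in this use case can be sketched in a few lines of Python. This is only an illustration of the nesting, not a DSS configuration: two in-memory SQLite databases stand in for MySQL and Oracle, and the column names (`office_id`, `city`, and so on) are assumptions, since only the table names "Employee" and "Office" are given above.

```python
import sqlite3

employee_db = sqlite3.connect(":memory:")  # stands in for the MySQL database
office_db = sqlite3.connect(":memory:")    # stands in for the Oracle database

employee_db.execute(
    "CREATE TABLE Employee (id INTEGER, name TEXT, office_id INTEGER)"
)
employee_db.executemany(
    "INSERT INTO Employee VALUES (?, ?, ?)",
    [(1, "Alice", 10), (2, "Bob", 10), (3, "Carol", 20)],
)
office_db.execute("CREATE TABLE Office (id INTEGER, city TEXT)")
office_db.executemany(
    "INSERT INTO Office VALUES (?, ?)", [(10, "Colombo"), (20, "London")]
)


def employees_in_office(office_id):
    """Inner query: runs against the employee database for one office."""
    rows = employee_db.execute(
        "SELECT id, name FROM Employee WHERE office_id = ? ORDER BY id",
        (office_id,),
    )
    return [{"id": r[0], "name": r[1]} for r in rows]


def offices_with_employees():
    """Outer query: each office row feeds its id into the inner query."""
    offices = []
    for office_id, city in office_db.execute(
        "SELECT id, city FROM Office ORDER BY id"
    ):
        offices.append(
            {
                "id": office_id,
                "city": city,
                "employees": employees_in_office(office_id),
            }
        )
    return offices


result = offices_with_employees()
for office in result:
    print(office)
```

Note that no data is copied from one database into the other; the two result sets are only combined at the moment the response is built.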
Usecase 2: Assume a user has some data stored in the form of CSV files and needs to get that data exported into a MySQL database.
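In plain Python, the same flow (read rows from a CSV source, write them into a database table) might look like the sketch below. It is an illustration of the use case rather than the DSS way of doing it: the CSV columns, the `Employee` table layout, and the `import_csv` helper are assumptions, the CSV content is held in a string instead of a file, and in-memory SQLite stands in for MySQL.

```python
import csv
import io
import sqlite3

# Sample CSV content; in practice this would come from a file on disk.
CSV_TEXT = "id,name,office_id\n1,Alice,10\n2,Bob,20\n"


def import_csv(csv_file, conn):
    """Read every CSV row and insert it into the Employee table."""
    reader = csv.DictReader(csv_file)  # uses the first line as the header
    rows = [(int(r["id"]), r["name"], int(r["office_id"])) for r in reader]
    with conn:  # commit on success, roll back if an insert fails
        conn.executemany(
            "INSERT INTO Employee (id, name, office_id) VALUES (?, ?, ?)",
            rows,
        )
    return len(rows)


conn = sqlite3.connect(":memory:")  # stands in for the MySQL database
conn.execute("CREATE TABLE Employee (id INTEGER, name TEXT, office_id INTEGER)")
imported = import_csv(io.StringIO(CSV_TEXT), conn)
print(imported, "rows imported")
```

Wrapping the inserts in a single transaction means a malformed row leaves the target table untouched rather than half-populated.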
Download the WSO2 Data Services Server and try those samples yourself!
Hi Prabath,
I have been implementing a POC similar to the above mentioned UseCase 2 using BPEL. Currently, in my role, I am evaluating WSO2 products. It will be very kind if you can throw some light on the issue that currently I am encountering while implementing this usecase. Below are the details:
I am writing a WSO2 Carbon Application and the goal is to read CSV file and insert data in database. I am using Eclipse Helios SR2 BPEL Designer to write a BPEL that would orchestrate the mentioned goal.
Firstly, I created a data service called CSVReader that reads a CSV file and deployed it on WSO2 DSS. On testing the service using tool TryIt it produced desired results.
On invoking CSVReader using an invoke activity; I repeatedly get error messages as below:
[2013-05-10 03:07:47,228] ERROR {org.apache.ode.bpel.runtime.INVOKE} - org.apache.ode.bpel.common.FaultException: {http://docs.oasis-open.org/wsbpel/2.0/process/executable}uninitializedVariable: The variable CSVReaderPLRequest isn't properly initialized.
[2013-05-10 03:07:47,234] WARN {org.apache.ode.bpel.engine.BpelProcess} - Instance 1649 of {http://wso2.org/bps/sample}CSVReaderProcess-38 has completed with
fault: FaultData: [faultName={http://docs.oasis-open.org/wsbpel/2.0/process/executable}uninitializedVariable, faulType=null] @70
I have initialized CSVReaderPLRequest Variable using Fixed Value option but still getting the same error.
I somehow believe that there exists a bug in BPS as BPEL is unable to invoke request operation of CSV/Excel service. The bug suspected is as below:
On extracting WSDL from CSVReader service; the request parameter looks like below:
I think complex-type request parameters with empty sequence is causing the invoke issue.
Kindly help on what can be causing this issue.
Thank you.
Hi Prabath,
The request parameter couldn't appear as it was in XML format. I have converted it to HTML and below is the excerpt of WSDL file.
<xs:element name="CSVReader">
<xs:complexType>
<xs:sequence/>
</xs:complexType>
</xs:element>
<wsdl:message name="CSVReaderRequest">
<wsdl:part name="parameters"
element="ns0:CSVReader"/>
</wsdl:message>
Thanks for sharing this info..