Exploitation Database Approach for Right On-line Analytical Processing

This study provides a framework for building an Exploitation Database Approach (EDA). Such an EDA requires tasks such as detecting changes in the data warehouse, rebuilding EDA queries and delivering query results to all users across the organization's sites. For this purpose, we introduce an approach for creating a new, simple design with high benefits that manages and exploits On-Line Analytical Processing (OLAP) queries and the reporting information of OLAP applications across all of the organization's sites, without regard to latency limitations and barriers. The latency requirements for delivering information span a wide range depending on the specific business process. Data replication using EDA appears to be a robust solution for eliminating latency in answering OLAP queries.


INTRODUCTION
As the business environment has become increasingly competitive, the need to use corporate data as a strategic resource has intensified. However, most organizations in today's technology-based businesses are data rich and information poor. Much of the essential information needed to anticipate changing market conditions and customer preferences, forecast future demand for products and services and develop profitable business plans is locked in various transactional systems, spreadsheets and Web log files. Without the ability to deliver the right information easily and quickly to the right people at the right time, companies cannot stay competitive in today's fast-changing economy.
Situated in the domain of Business Intelligence (BI), this study describes the wide variety of techniques and designs used to implement and run strong On-Line Analytical Processing (OLAP) in BI. It also proposes an approach for reducing, or eliminating if possible, the major barrier to implementing a powerful OLAP solution in BI. This barrier is the time latency in responding to queries issued by end users (staff) from different enterprise branches.
These branches or sites are considered data sources in the construction and implementation of a large data warehouse (DW). The large DW approach (many terabytes of data) has been severely criticized due to the sheer enormity of pulling it all together while still maintaining existing systems, also called historic data.
The business processes and complexity of calculations need to be a part of the entire BI infrastructure. Database Administrators (DBAs) need to clearly understand the data analysis requirements and what may be necessary to turn the proposed new data into business results (Al-Debei, 2011). They ask many questions: "What do I need to run my business and how much of all this information delivers any portion of my solution?", "How much data do you need at varying levels of analysis?", "What are the aggregations that most end users look at within specific data areas?", "Do you know how many end users actually take advantage of the drill-through to detail data?". There is one answer (Biere, 2003): "All your queries in one basket... one way to look at OLAP". The real cost of any of this end-user computing "stuff" lies in the many hours of latency that no one will quantify or track. When you set out to engage in the typical query and reporting activities, you will hit the same brick walls every time, regardless of the tool you are using.
So how much do you make per hour? How many hours are you willing to spend, or can you spend, learning a tool? What if your requirements are heavy in calculations outside the scope of your data as delivered, but you have minimal time (Biere, 2003) to spend producing the results? Hourly rate × number of hours playing at BI = €?
Every moment that you spend working with a tool without obtaining the results you need costs you and your company money. If you never obtain the correct results, then every moment that you spent is a waste and is pure cost with no return to the business. That goes for every user who invests time in "playing" BI without getting to the end of the job. If you have not identified the specific users and expected usage in advance, then it is guaranteed that you will end up with a select few who can do the math and many hangers-on and significant dropouts. Fast OLAP reporting and analysis features make your business applications more intelligent, more robust, more usable and ultimately more valuable to you and your end users. In addition to typical reporting capabilities, fast OLAP reporting brings more power and overall impact to your business applications, turning them into a pure BI solution (LogiXML, 2010).
The Exploitation Database Approach (EDA), illustrated in Fig. 1, consists of a medium-size database implemented near the main DW and replicated at each of the source databases that participate in building the DW. The EDA model is fed periodically from the multidimensional OLAP cube. The EDA therefore contains all possible queries that can be generated by OLAP users and provides their results to all end-user staff across the organization.
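As a rough illustration of this idea, the sketch below models an EDA as a small table of precomputed answers keyed by question and refreshed periodically from results exported by the cube. It is a minimal sketch under stated assumptions, not the implementation used in this study: the table name `olap_answer`, the function names and the sample question are hypothetical, and SQLite merely stands in for whatever medium-size database is actually deployed.

```python
import sqlite3


def create_eda(conn):
    """Create the (hypothetical) table that holds precomputed OLAP answers."""
    conn.execute(
        """CREATE TABLE IF NOT EXISTS olap_answer (
               question     TEXT PRIMARY KEY,
               answer       TEXT NOT NULL,
               refreshed_on TEXT NOT NULL
           )"""
    )


def refresh_eda(conn, cube_results, refreshed_on):
    """Periodic feed: overwrite stored answers with results exported from the cube.

    cube_results maps each question to its already computed answer.
    """
    conn.executemany(
        "INSERT OR REPLACE INTO olap_answer (question, answer, refreshed_on) "
        "VALUES (?, ?, ?)",
        [(q, a, refreshed_on) for q, a in cube_results.items()],
    )
    conn.commit()


def view_answer(conn, question):
    """End-user lookup: a key read with no aggregation at query time."""
    row = conn.execute(
        "SELECT answer FROM olap_answer WHERE question = ?", (question,)
    ).fetchone()
    return row[0] if row else None


if __name__ == "__main__":
    eda = sqlite3.connect(":memory:")
    create_eda(eda)
    # Illustrative data only; real answers would come from the OLAP cube.
    refresh_eda(eda, {"Total car sales by region": "north=120; south=95"}, "period-1")
    print(view_answer(eda, "Total car sales by region"))
```

The design choice illustrated here is that the end user never triggers a calculation: the cost of aggregation is paid once, at refresh time, near the DW.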

EDA OBJECTIVES
The principal benefits, or specific objectives, of our approach for the short-term requirements are to:
• Meet the growing demand for real-time access to real-time information across the sites of an organization.
• Eliminate the fatal latencies in returning query answers from the OLAP system to the organization's source databases.
• Reduce the update frequency at the sources: most organizations update their DW nightly, but the EDA at the sources could be updated over a longer period, such as monthly, depending on the type of application.
• Remove the need for calculation at query time. In the current OLAP approach (Fig. 2), queries can be slower because values must be calculated on the fly instead of being read from pre-calculated storage.
• Remove the need to buy a special hardware server at the source level, because EDA can easily store the entire data volume of many multidimensional OLAP (MOLAP) cubes on hard disk. Because current OLAP analysis stores the entire cube in RAM, it does not scale to data volumes larger than the RAM size.
• Tolerate disconnection: if the connection between source and destination is lost, there is no need to repair it immediately, because an up-to-date version of the historic query answers already exists.
This is the area where BI at the enterprise level will make the greatest sense within all levels of the enterprise. If we look back at the benefits that we assigned to the EDA approach for OLAP in a BI solution, we should be able to clearly state what it will provide the business.
As part of this EDA approach, you have an opportunity to look at how a BI solution spans the enterprise to improve the project's business value for the long-term goals, or general objectives. Let us look at a simple sales and marketing example. Our theoretical current need is to provide rapid feedback and analysis for a campaign that we are going to run in a new geography. As a part of this effort, we will have a special promotion for new customers. At a minimum, we should have a logical tie among the sales, marketing and advertising departments.
We should be able to quantify current business in similar geographies and demographics. If we have run campaigns in the past, can we provide any predictive information about how useful we believe the results will be? Suppose that we cannot provide any of these items. Why would we not be able to provide them? Are we better served by putting the proper database in place before we begin, so that we are not always launching efforts and campaigns in the dark?
When choosing the EDA architecture approach for your BI solution, you must focus on both the short-term requirements and the long-term goals (Brian, 2008) and benefits of the organization. With this in mind, what are the key considerations for choosing the EDA architecture and approach?
• Flexibility for future growth. It is common for organizations to build data mart solutions where the acquisition and integration of data as well as the data mart are built in a single database. The staging areas are typically temporary in nature and used to support the current reporting requirements set forth before the project starts. As requirements change and new requirements are added, the data marts are extended. Nevertheless, these types of solutions often cannot extend to an enterprise effort without considerable rework. If new requirements require data that is only available in an existing data mart, then that data mart may not support the new business requirements easily. When historical data is required that is not available in the required formats, then the data mart may be useless. Data marts are good at supporting known business requirements but may not be flexible enough when business requirements change.
• Projects will only be successful when staffed with the correct resources and skills. The organization should ensure it has the best skills available when undertaking an enterprise effort. Departmental solutions can be developed with a smaller set of skills if that is the focus.
• Budget. Enterprise projects can be expensive. These projects typically have more components and can therefore take longer to build than one-off departmental data mart solutions. Development times are therefore typically longer and require more resources to build and maintain. However, these projects can be delivered successfully when designed and developed correctly. Delivering them in an iterative fashion, based on a structured architecture and methodology, allows enterprise architectures to be built quickly while also providing business value iteratively. Enterprise projects are best developed using best-of-breed extract, transform and load (ETL) and reporting tools. ETL tools should be platform and database independent and be able to operate in distributed environments. Querying tools should provide an array of functionality, including batch, ad hoc and Online Analytical Processing (OLAP) functionalities. The expense of these tools is surpassed by the functionality they provide.
• Has the scope of the effort been clearly defined and thought out? That is, are you building a solution to support the requirements of a department, or the foundation for an enterprise effort that must extend to support future requirements for other parts of the organization? If your focus is on a departmental solution, then a simpler architecture and approach may often suffice. But if you are striving toward a longer-term enterprise solution, then the architecture should be designed to accommodate this. The latter choice should consider both architectural options and alignment with the organization's current IT strategies and future vision.
• Scope and complexity. Enterprise solutions typically have multiple database layers that separate the integration layer (DW) of the data from the analytical layer (cube). The former layers are often called DW or staging areas and may be temporary or persistent in nature. However, the implementation approach of EDA is most important here.
• When developing enterprise solutions, i.e., solutions that will be leveraged to support both long- and short-term organizational objectives, it is important to separate the data integration components from the analytical components. Why? Source data can come from within and outside of the organization. Data may exist today, could be sourced tomorrow or may have to be manufactured. Data integration rules may be complex. Data quality issues may not be clearly understood. The integration layer of the data should be designed to support the acquisition, integration and maintenance of the data. A normalized modeling approach such as EDA is best suited for these requirements because it makes no assumptions regarding the underlying data and its quality. When data quality is important from an enterprise perspective, a normalized approach is the best option. The reporting requirements for the organization may not be fully understood when starting to build an enterprise solution, so it is not advisable to model data structures based on unknown or vague reporting requirements. Separating the integration requirements from the reporting requirements enables each component of the data architecture to be modeled and maintained based upon its own unique set of requirements without jeopardizing the future flexibility of the overall solution. To support reporting requirements, data is modeled to support the organization's analytical requirements. These requirements are best met by a non-normalized modeling technique and combinations of star/snowflake designs. This type of modeling technique does not support data maintenance operations well, makes no assumptions regarding the underlying data quality (i.e., it may not provide visibility into data quality issues if designed without consideration for data quality) and, most importantly, may not be flexible enough or contain the appropriate information to support future reporting requirements without changes to the data models themselves. Separating the data layers for integration and reporting allows each layer to be designed appropriately based on its usage and provides more flexibility as business requirements are added or change over time.
• Organizational focus on data quality. When data quality is not an important consideration and the reporting requirements are departmental rather than enterprise-focused, a simple querying solution may be the best choice for your organization. If the focus is on developing an enterprise solution together with improving data quality within the organization, then the architecture outlined is the best option. By developing an architectural layer focused on data acquisition and integration, the focus can also be extended to support an enterprise-wide data quality effort. This type of architecture allows a robust data quality solution to be designed to identify data quality issues, improve data quality upstream and better serve future reporting and analytical needs.

MATERIALS AND METHODS
If we are talking about the Business Justification Approach, there are several things (Brian, 2008) we need to know:
• What is the scope of the project and what is the quantifiable business value that we know it will deliver?
• If there is no specific quantifiable value, then what will happen if we do not complete the project?
• Do we have historical data and information with which to measure our success?
• If not, do we intend to provide the structure and database to support future efforts so that we can add to our information pool?
• If we do not intend to create this infrastructure, do we have adequate information to justify and quantify to management?
• Do we have information that spans several business areas such that our effort provides value in more than one business area?
Globalization has put pressure on businesses to be available 24/7/365 and it is up to their IT departments to figure out how to supply the necessary data and applications in support of this "never closed" situation.
Time and time again, managing time in relational databases, DW and OLAP applications comes down to reducing, or eliminating if possible, the major barrier facing this technology: the fatal latencies in answering remote queries issued by end-user staff at the many sites of an organization (Goil and Choudhary, 1997).
For these reasons, we set out to find an innovative way of creating a new, simple design approach, with high benefits, for managing and exploiting the querying and reporting information of OLAP applications across all of the organization's sites without regard to latency limitations and barriers. In addition to the short-term benefits and long-term goals presented under EDA Objectives, this design approach (Fig. 1), called the Exploitation Database Approach (EDA), has the following principal characteristics:
• It manages simple software for a medium-size database.
• It is situated near the DW (as source) and near each site application across the organization (as destination).
• In contrast to the classical DW architecture for OLAP, the EDA source is implemented near the DW destination, while the EDA destination is implemented near each source application participating in DW creation and at each other selected site within the organization.
Database replication has been around for many years and is a mature technology that is enjoying a resurgence of interest in many enterprises. IT shops are finding new and innovative ways to use this proven technology for operations, BI and even master data management (MDM). Over the years, replication technology has been enhanced and improved to support these new activities.
Database replication is defined as enterprise software that enables companies to copy and move data bi-directionally from one database to another at the transaction level in real time (Holenstein et al., 2011). This is accomplished by delivering the data to distributed database targets without regard to distance limitations.
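To make the source-to-destination flow concrete, the following sketch copies the answers held in an EDA source to several EDA destinations, applying each copy inside a single transaction per target. It is a deliberately simplified, one-way illustration and not a replication product: commercial replication is bi-directional, continuous and log-based, none of which is modeled here. The table name `olap_answer`, the `replicate` function and the use of SQLite are assumptions made for the example.

```python
import sqlite3

# Hypothetical schema shared by the EDA source and every EDA destination.
SCHEMA = (
    "CREATE TABLE IF NOT EXISTS olap_answer ("
    "question TEXT PRIMARY KEY, answer TEXT NOT NULL)"
)


def replicate(source, targets):
    """Copy all answers from the EDA source to each EDA destination.

    Each target is updated inside one transaction: the connection used as a
    context manager commits on success and rolls back if an error is raised,
    so a destination never ends up holding a partially applied copy.
    Returns the number of rows sent to each target.
    """
    rows = source.execute("SELECT question, answer FROM olap_answer").fetchall()
    for target in targets:
        with target:
            target.execute(SCHEMA)
            target.executemany(
                "INSERT OR REPLACE INTO olap_answer (question, answer) VALUES (?, ?)",
                rows,
            )
    return len(rows)


if __name__ == "__main__":
    eda_source = sqlite3.connect(":memory:")  # stands in for the EDA near the DW
    eda_source.execute(SCHEMA)
    eda_source.execute(
        "INSERT INTO olap_answer VALUES (?, ?)", ("Sales by branch", "illustrative")
    )
    # Stand-ins for the EDA copies at two organization sites.
    sites = [sqlite3.connect(":memory:"), sqlite3.connect(":memory:")]
    print(replicate(eda_source, sites), "row(s) sent to each site")
```

Because the destinations receive answers that are already computed, the copy involves no transformation of the data, which is the situation in which replication is usually said to work best.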
As one surveys the various database replication offerings available in the market today, there are ten key characteristics (Elmasri and Navathe, 2002; Claudia, 2008) that can be identified as the characteristics of a premier solution:
• Ability to work in a heterogeneous environment.
• Simultaneous replication from multiple sources.
• Simultaneous replication to multiple targets.
• Support for local application independence.
• Data integrity.
• Efficient use of network resources.
• Real-time continuous replication.
• Selective replication.
• Ease of administration.
There are a number of guidelines or conditions to think about when determining if database replication technology is right for your project (Claudia, 2008; Hainaut, 2012; Todman, 2001). These include the amount of data transformation required, the state of the data's quality and whether real-time data is the driving requirement:
• The Amount of Data Transformation Required. Database replication technology does what it was designed to do: it replicates data very effectively and efficiently. It was not designed to perform complex transformations of the data; that is, it was not designed to perform heavy-duty data integration of massively disparate data. Another technology, ETL, is better suited for this process. Therefore, the first guideline is to perform a thorough analysis of the source data being replicated and the ultimate target schema. Is the replication process relatively straightforward, requiring a minimal amount of data transformation? We have described many scenarios where the target data was identical or very similar in format or construction to the source data. Database replication technology works best when the data integration is simple, involving light data model transformations between the sources and targets, connected through point-to-point interfaces, which are the principal characteristics of the EDA approach.
• The State of the Data's Quality. Database replication technology is not data quality or data cleansing technology. If analysis of the source data determines that data quality processing is required before the data is suitable for the target's usage, then you will need to perform these actions on the source data before invoking replication. It does the enterprise more harm than good to replicate bad data into more systems and applications. It may be possible to replicate the unprocessed data into a "staging area" where the data quality processes can work on the data unhindered by other activities. Once the data is merged, purged and its quality certified, it can then be replicated with assurance that it is of the proper quality level. It should be communicated that these processes do not occur in real time and that a certain amount of data latency must be acceptable.
• The Real-time Data Movement Requirement. Real-time data denotes information that is available immediately upon collection and reflects the most recent changes or updates made to it. The data latency is negligible in terms of timeliness (Marius et al., 2009). As more and more of the enterprise demands real-time access to real-time data, more pressure is applied to its IT infrastructure. Database replication can certainly relieve a great deal of this pressure, but it is important that the project implementers ascertain precisely how timely the data needs to be. Even a few seconds of latency can give the technology time to perform its checks and balances to ensure data integrity, security, quality, etc. If these other characteristics outweigh the demand for real-time data, the infrastructure must accommodate their needs while reducing the data latency as much as is feasible.

RESULTS AND DISCUSSION
Defining the principal actors and scenarios is the basis of a clear UML study (Elmasri and Navathe, 2002; Hainaut, 2012; Soutou, 2002) for EDA.
The two principal actors are the DW Administrator, also called IT or OLAP Builder, and the End-User Staff, also called OLAP User, as shown in Fig. 3.
The essential roles of the DW Administrator when using EDA are: Add Questions, Add OLAP Answer, Add Exploitation Type, Lock/Unlock OLAP Answer and Send Copies to the sources and to other selected sites. At his own site, the DW Administrator can also play the roles of End-User Staff.
The End-User Staff (OLAP User) plays the roles of: Browse Questions, View OLAP Answer and Print OLAP Answer.
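The division of roles between the two actors can be sketched in code as follows. This is only an illustration of the use cases named above, under assumptions that the text does not state: in particular, the sketch assumes that a locked OLAP answer is hidden from End-User Staff, and every class and method name is hypothetical. Printing and sending copies are omitted.

```python
from dataclasses import dataclass, field


@dataclass
class Entry:
    """One question with its stored OLAP answer and lock state."""
    answer: str = ""
    locked: bool = False


@dataclass
class EDA:
    """In-memory stand-in for the exploitation database: question -> Entry."""
    entries: dict = field(default_factory=dict)


class EndUserStaff:
    """OLAP User: may only browse questions and view answers."""

    def __init__(self, eda):
        self.eda = eda

    def browse_questions(self):
        # Assumption: questions whose answer is locked are not listed.
        return sorted(q for q, e in self.eda.entries.items() if not e.locked)

    def view_olap_answer(self, question):
        entry = self.eda.entries.get(question)
        if entry is None or entry.locked:
            return None
        return entry.answer


class DWAdministrator(EndUserStaff):
    """IT / OLAP Builder: inherits the end-user roles and adds maintenance roles."""

    def add_question(self, question):
        self.eda.entries.setdefault(question, Entry())

    def add_olap_answer(self, question, answer):
        self.eda.entries[question].answer = answer

    def set_locked(self, question, locked):
        self.eda.entries[question].locked = locked


if __name__ == "__main__":
    eda = EDA()
    admin, user = DWAdministrator(eda), EndUserStaff(eda)
    admin.add_question("Revenue per item")
    admin.add_olap_answer("Revenue per item", "illustrative answer")
    print(user.browse_questions())
    admin.set_locked("Revenue per item", True)
    print(user.view_olap_answer("Revenue per item"))
```

Making `DWAdministrator` a subclass of `EndUserStaff` mirrors the statement that the administrator can also play the end-user roles at his own site.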

EDA general use case diagram:
EDA sequence diagrams:

CONCLUSION
Take a moment to think about your own organization. What function or activity wouldn't benefit from having accurate, current data and analytics available to it? In fact, five key trends drive the demand for near real-time data (Claudia, 2008) in just about every industry and organizational function:
• Low-latency data movement to support key business processes has become the new enterprise standard.
• Application availability requirements are more stringent due to industry regulation and online access.
• Proliferation of heterogeneous database environments has increased the need to quickly integrate disparate systems.
• Very Large Data Base (VLDB) implementations demand more robust replication performance.
• Globalization and distributed operations heighten the urgency of data distribution, sharing and synchronization.
Given these economic, regulatory and operational pressures on your enterprise, you can easily understand the need for EDA database replication as one of the key technologies required to support your enterprise's real-time data movement needs. The question then becomes: "What are the primary capabilities I should look for in an EDA database replication solution, and what are the optimum scenarios for its usage?"
Applying these five recommendations should enable any organization to implement predictive analytics with a good measure of success.While many people seem intimidated by predictive analytics because of its use of advanced mathematics and statistics, the technology and tools available today make it feasible for most organizations to reap value from predictive analytics.
Database replication technology, while enjoying a long history, has come into its own again as new and expanded uses for it have come to light.With increasing data volumes, demand for real-time data, implementation of operational BI capabilities and the need to make IT's infrastructures as stable and consistent as possible, EDA with replication technology is playing a significant and critical role in the real-time enterprise.

Fig. 6: EDA Class Diagram
• OLAP Answer: an answer issued from an OLAP application cube of the organization. It is first saved as an object and then linked to Answer inside OLAP Answer as an OLE type.
• Aggregation Type: an aggregation may be a single aggregation such as COUNT(), SUM(), MAX() or AVG(), or a combination of multiple aggregations such as AVG(SUM()), MAX(COUNT()), ...
• Exploitation Type: an exploitation is an attribute containing a numeric field value such as CarSale, InsurancePolicySale, FoodMarketSale, ItemRevenue, ..., Income, ..., Discount, ...
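The difference between a single aggregation and a combined one can be illustrated with a few lines of code. The sketch below is not taken from the class diagram: it assumes that a combination such as AVG(SUM()) means "apply the inner aggregation within each group, then the outer aggregation across the group results", and the function names and sample values are hypothetical.

```python
from statistics import mean

# The four single aggregation types named in the text.
AGGREGATES = {"COUNT": len, "SUM": sum, "MAX": max, "AVG": mean}


def aggregate(name, values):
    """Apply one aggregation type, e.g. SUM, to a list of numeric values."""
    return AGGREGATES[name](values)


def nested_aggregate(outer, inner, groups):
    """Apply `inner` within each group, then `outer` across the group results.

    groups maps a group label (e.g. a branch) to the numeric values of one
    exploitation type (e.g. CarSale) recorded for that group.
    """
    return aggregate(outer, [aggregate(inner, vals) for vals in groups.values()])


if __name__ == "__main__":
    # Illustrative values only.
    car_sale = {"north": [10, 20], "south": [30]}
    print(nested_aggregate("AVG", "SUM", car_sale))    # average of per-branch totals
    print(nested_aggregate("MAX", "COUNT", car_sale))  # largest number of records in a branch
```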

• All On-Line Transaction Processing (OLTP) users can obtain the benefits of OLAP users faster and more easily.
• All end-user staff can share the same information at different organization locations at the same time.
• Fast query answers and results: from many hours of waiting to a fraction of a second.
• All end-user staff think of themselves as owners of the distant and large DW.
• End-user staff can get OLAP results from simple ...
• End-user staff do not need to think about DW analysis concepts such as dimensions, level positions, measures, filtering and many other complex tools.
• Wide-reaching availability for end users. OLAP reporting extends the benefits of OLAP throughout the organization to any user at any branch around the world.
DW Administrator (IT or OLAP Builder) sequence diagram
• Nominal Scenario (5) for Lock/Unlock OLAP Answer Use Case:
o Press the Lock/Unlock OLAP Answer button
o Choose the question whose answer you want to lock or unlock
o Press the check box to lock or unlock the viewed OLAP answer
• Add Aggregation Type:
o Press the Add Aggregation Type button
o Enter the new aggregation type
Actor 2: End-user Staff (OLAP user) sequence diagram (Fig. 5)
• Nominal Scenario for End-user Staff (OLAP user) Use Cases: