Info Area: - An info area is a folder used to organize the objects of a project.
Source system: - The system from which we extract data into the BIW system is called the source system.
Info source: - An info source is a communication structure: a group of logically related info objects that defines the format in which data is loaded into the BIW system. Ex: water tank – pipe – storage area.
Data source: - A data source defines the transfer structure: a group of logically related fields that defines the format in which data is extracted from the source system.
o A data source is specific to a source system.
o For flat file extraction, the source system itself is the data source.
Transfer Rules: - Transfer rules define how data is transformed on its way from the data source to the info source.
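As a rough illustration of the flow described above, the sketch below treats a data source record as a dictionary of source fields and applies transfer rules to build an info source record. The field names (KUNNR, NETWR), the info object names and the rules themselves are assumptions invented for this example; real transfer rules are maintained in the BIW system, not in Python.

```python
# Toy model: transfer rules map data source fields to info source info objects.
# All field names, info object names and rules are illustrative assumptions.

def apply_transfer_rules(source_record, rules):
    """Build a communication-structure record from a transfer-structure record."""
    return {info_object: rule(source_record) for info_object, rule in rules.items()}

transfer_rules = {
    "0CUSTOMER": lambda rec: rec["KUNNR"].zfill(10),  # pad the key to 10 characters
    "0AMOUNT": lambda rec: float(rec["NETWR"]),       # convert text to a number
    "0CURRENCY": lambda rec: "USD",                   # constant assignment
}

source_record = {"KUNNR": "4711", "NETWR": "250.50"}
target_record = apply_transfer_rules(source_record, transfer_rules)
print(target_record)
```

Each rule is just a function of the incoming record, which mirrors the idea that a transfer rule can be a direct mapping, a conversion or a constant.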
PSA (Persistent Staging Area): - Data coming from the source system is staged in BIW in the PSA, which is a two-dimensional (flat) table.
o If an error occurs, the update stops at the PSA table; rectify the data in the PSA and then update it into the target.
Info Package: - An info package is a scheduler that defines when a load is triggered.
Info Cube: - An info cube is an info provider as well as a data target. Objects that supply data for reporting are called info providers.
Types of Cube: -
o Info cubes fall into 2 categories.
o Standard Info Cube
o Real Time Info Cube
Standard Info Cube: - In a standard info cube we cannot load data manually. (Planning is not possible.)
Real Time Info Cube: - In a real time info cube we can load data manually. (Planning is allowed.)
DSO (Data Store Object): - If the data source does not deliver record changes in the form of images, load the data to a DSO first. The DSO maintains the images, and from the DSO we load the data to the info cube.
o New Image “ N ”
o Before Image “ X ”
o After Image “ _ ” (Blank, No symbol)
o New Image: - Any record that enters the DSO for the 1st time is marked as “N” (new image).
o Before Image: - Whenever we load a modified record to the DSO, it is stored as 2 records: a before image “X” holding the old values and an after image holding the new values.
When we create a DSO, the system generates 3 tables in the DB:
• New Data Table
• Active Data Table
• Change Log Table
o Whenever we load data to a DSO, the data enters the New Data Table first.
o When we activate the data in the DSO, the data moves from the New Data Table to the Active Data Table and the Change Log Table.
o When we load data to an info cube from the DSO with a full update, the data comes from the Active Data Table of the DSO.
o With a delta update, the data comes from the Change Log Table of the DSO.
o The Change Log Table maintains the images.
o Data in the Active Data Table is overwritten.
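The table movement described above can be sketched with a small in-memory model. This is only a toy under stated assumptions: the class name StandardDSO, the single numeric key figure and the tuple layout of the change log are invented here, and the before image is written with a reversed sign so that the change log adds up to the current value.

```python
# Toy model of a standard DSO: three tables and record images.
# Names and layouts are simplified assumptions, not SAP's internal structures.

class StandardDSO:
    def __init__(self):
        self.new_data = []    # New Data Table: loaded but not yet activated
        self.active = {}      # Active Data Table: one row per key, overwritten
        self.change_log = []  # Change Log Table: (image, key, value) history

    def load(self, key, value):
        """Loading always writes to the New Data Table first."""
        self.new_data.append((key, value))

    def activate(self):
        """Move loaded records to the Active Data and Change Log tables."""
        for key, value in self.new_data:
            if key not in self.active:
                self.change_log.append(("N", key, value))  # new image
            else:
                old_value = self.active[key]
                self.change_log.append(("X", key, -old_value))  # before image
                self.change_log.append((" ", key, value))       # after image
            self.active[key] = value  # overwrite in the active table
        self.new_data.clear()

dso = StandardDSO()
dso.load("ORDER-1", 100)
dso.activate()
dso.load("ORDER-1", 120)  # modified record
dso.activate()

full_update = dict(dso.active)  # what a full update would read
delta_total = sum(value for _, _, value in dso.change_log)  # what deltas add up to
print(full_update, dso.change_log, delta_total)
```

Reading the active table gives the latest state, while summing the change log images reaches the same figure step by step, which is why a delta load into an additive info cube can rely on the change log.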
Advantages of DSO
o To maintain Images
o To utilize overwrite functionality
o Acts as backup
o Detailed level of analysis
o Reconciliation (to check whether the data matches what came from the source)
Types of DSO
o Standard DSO – has 3 tables: 1. New Data Table (NDT), 2. Active Data Table (ADT), 3. Change Log Table (CLT).
o Direct update DSO – we can update data manually, and it has only one table, the Active Data Table.
o Write Optimized DSO – used whenever the update has to stop at DSO level. A write optimized DSO has only one table, the Active Data Table.
Multiprovider: - A multiprovider is an info provider, but not a data target.
o In DB terminology a multiprovider is nothing but a view.
o A multiprovider does not hold data physically.
o Whenever we want a query that combines data from 2 or more info providers, we can use a multiprovider.
o Using multiproviders we can combine
• Info objects (P Table)
• Info cubes (F Table)
• DSO (Active Data Table)
• Info Set
o To create a multiprovider, the info providers being combined must have at least one info object in common, and that info object should be part of the primary key.
o Once we close the query, no physical data remains in the multiprovider.
o When a multiprovider is created, it takes by default a technical info object called 0INFOPROV (info provider).
o Using 0INFOPROV we can restrict the output of the query to a particular info provider.
o Using a multiprovider improves DB performance and degrades query performance.
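A minimal sketch of the idea, assuming two made-up providers held as Python lists: the multiprovider stores nothing itself, it only combines the rows of its providers when a query runs. The tag column is named 0INFOPROV here after the technical info provider object mentioned above; the provider names, fields and the restrict_to parameter are hypothetical.

```python
# Toy multiprovider: rows are combined only at query time, nothing is stored.
# Provider names, fields and the restrict_to parameter are invented for this example.

sales_cube = [{"material": "M1", "revenue": 100}, {"material": "M2", "revenue": 50}]
plan_cube = [{"material": "M1", "plan": 90}]

def multiprovider_query(providers, restrict_to=None):
    """Combine the rows of all providers, tagging each row with its source."""
    rows = []
    for name, data in providers.items():
        if restrict_to is not None and name != restrict_to:
            continue  # restriction on the info provider tag
        for row in data:
            rows.append({"0INFOPROV": name, **row})
    return rows

providers = {"SALES": sales_cube, "PLAN": plan_cube}
all_rows = multiprovider_query(providers)
plan_only = multiprovider_query(providers, restrict_to="PLAN")
print(all_rows)
print(plan_only)
```

Because the result is rebuilt on every call, nothing is left behind once the query is closed.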
Infoset: - An infoset is an info provider but not a data target.
o An infoset does not hold data physically.
o Whenever we want a query that combines data from 2 or more info providers, we can use an infoset.
o By using Infoset we can combine
• Info objects
• DSO
• Info cubes (only in version 7)
o In DB terminology infosets are nothing but joins.
2 types of joins in Infoset
o Inner Join
o Left Outer Join
o Advantages of Infoset
• Slow moving analysis
• Slow moving analysis is possible only with a left outer join, and only when the info object is on the left side
• Improves DB performance and degrades query performance
• A maximum of 2 cubes can be used in an infoset, and a cube cannot be on the left side of a left outer join
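To see why slow moving analysis needs a left outer join with the info object on the left, the sketch below joins a list of materials against some sales figures in both ways. The materials and quantities are invented for illustration, and plain Python lists and dictionaries stand in for the info providers.

```python
# Toy infoset: inner join vs left outer join between master data and transactions.
# Material numbers and quantities are invented for illustration.

materials = ["M1", "M2", "M3"]  # info object on the left side
sales = {"M1": 100, "M3": 40}   # transactional data on the right side

def inner_join(left, right):
    """Keep only the left entries that have a match on the right."""
    return [(m, right[m]) for m in left if m in right]

def left_outer_join(left, right):
    """Keep every left entry; unmatched ones get None."""
    return [(m, right.get(m)) for m in left]

inner = inner_join(materials, sales)
outer = left_outer_join(materials, sales)
slow_moving = [m for m, quantity in outer if quantity is None]
print(inner, outer, slow_moving)
```

The inner join drops M2 entirely, so only the left outer join can reveal that M2 has no sales at all.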
Aggregates: - Aggregates are smaller cubes built on the main cube in order to improve query performance.
o Aggregates are specific to a characteristic; by default an aggregate takes all the key figures of the main cube.
o Whenever we execute a query, the processor searches for a suitable aggregate. If one is found, it fetches data from the aggregate; if not, it fetches data from the main cube.
o Initial Fill: - After creating an aggregate, the data that we load from the main cube to the aggregate for the 1st time is called the initial fill.
o Using aggregates, the retrieval rate will be faster.
o OLAP time and DB time are reduced.
o Rollup: - After the initial fill, any data that we load to the main cube has to be rolled up to the aggregates.
o If we don’t do the rollup, the new request will not be available in the aggregates.
o Using aggregates increases query performance and degrades DB performance.
o Aggregates are worth building if DB time is greater than 30% of the total query time and the aggregation ratio is greater than 10.
o Aggregation ratio = number of records selected / number of records transferred
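The initial fill, the rollup and the ratio above can be sketched on a tiny in-memory cube. Everything here is an assumption made for the example: the cube is a list of dictionaries, the aggregate is built on a single characteristic (region), and the ratio is computed from row counts rather than from real query statistics.

```python
# Toy aggregate: the main cube summarized on one characteristic, plus rollup and ratio.
# The data and helper names are invented for illustration.

from collections import defaultdict

def build_aggregate(rows, characteristic):
    """Summarize the key figure 'revenue' per value of one characteristic."""
    totals = defaultdict(int)
    for row in rows:
        totals[row[characteristic]] += row["revenue"]
    return dict(totals)

main_cube = [
    {"region": "EU", "customer": "C1", "revenue": 10},
    {"region": "EU", "customer": "C2", "revenue": 20},
    {"region": "US", "customer": "C3", "revenue": 5},
]
aggregate = build_aggregate(main_cube, "region")  # initial fill

new_request = [{"region": "US", "customer": "C4", "revenue": 7}]
main_cube.extend(new_request)
for row in new_request:  # rollup: bring the new request into the aggregate
    aggregate[row["region"]] = aggregate.get(row["region"], 0) + row["revenue"]

def aggregation_ratio(records_selected, records_transferred):
    """Records read on the DB divided by records handed to the query."""
    return records_selected / records_transferred

ratio = aggregation_ratio(len(main_cube), len(aggregate))
print(aggregate, ratio)
```

A query grouped by region reads 2 rows from the aggregate instead of 4 from the cube; with realistic volumes that gap is what makes an aggregate pay off.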
Indexes: - An index arranges the records in the DB with respect to some pointers.
o 2 types of Indexes: -
o 1. Primary Indexes, 2. Secondary Indexes
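o Indexes can also be Bitmap or B-Tree indexes:
• Bitmap: pointers are stored as 0’s & 1’s.
• B-Tree (Binary Tree): parent and child relation (parent P with left child LC and right child RC); every search step rules out 50% of the remaining possibilities.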
o Using indexes, the retrieval rate will be faster.
o Indexes are created on the fact table.
o The number of indexes created on the fact table depends on the number of dimensions in the cube.
o Before loading data into a cube, delete the indexes, and create them again after loading.
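As a small illustration of pointers stored as 0's and 1's, the sketch below builds a bitmap index over one column of a made-up fact table. The column values and function names are assumptions for the example; a real database stores and compresses its bitmaps quite differently.

```python
# Toy bitmap index: one list of 0/1 flags per distinct value of a column.
# Column contents and function names are invented for illustration.

def build_bitmap_index(values):
    """Map each distinct value to a bitmap marking the rows where it occurs."""
    index = {}
    for position, value in enumerate(values):
        index.setdefault(value, [0] * len(values))[position] = 1
    return index

def lookup(index, value):
    """Return the row positions whose bit is set for the given value."""
    return [position for position, bit in enumerate(index.get(value, [])) if bit]

region_column = ["EU", "US", "EU", "ASIA", "US"]
region_index = build_bitmap_index(region_column)
print(region_index["EU"], lookup(region_index, "US"))
```

Every new row changes the bitmaps, which gives some intuition for why indexes are dropped before a large load and rebuilt once afterwards.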
Line Item Dimension: - When we assign only one characteristic to a dimension, we can define it as a line item dimension. For a line item dimension no extra dimension table is created; the link goes straight from the SID table to the fact table. Using a line item dimension (LID) improves load performance, DB performance & query performance.
Process Chain: - A group of processes linked in the form of a chain in order to automate them is called a process chain.
o Process: - A process defines what exactly we are performing.
o Variant: - A variant defines the object on which the process has to be performed; every process is associated with a variant.
o Error Handling in Process chain: - 1. Load Error, 2. Other than Load Error
o If we get a load error, it should be rectified manually.
o Error handling should be done in the log view only.
o Start the chain manually when it has stopped.
o An error other than a load error is due to a lock issue; for this, wait for some time and reload.
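The process, variant and manual restart ideas above can be sketched as a list of (process, variant) pairs run in order. This is a toy under stated assumptions: the process names, the LoadError exception, the bad_records set and the start_at parameter are all invented here, and a real process chain is maintained in the BIW system rather than in code.

```python
# Toy process chain: ordered processes, each bound to a variant.
# All names here (LoadError, bad_records, start_at) are illustrative assumptions.

class LoadError(Exception):
    pass

bad_records = {"SALES_CUBE"}  # pretend the first load hits a bad record

def delete_index(variant):
    pass  # placeholder for dropping the indexes of the target

def load_data(variant):
    if variant in bad_records:
        raise LoadError(f"bad record while loading {variant}")

def create_index(variant):
    pass  # placeholder for rebuilding the indexes

def run_chain(chain, start_at=0):
    """Run processes in order; stop at the first load error and report where."""
    log = []
    for position in range(start_at, len(chain)):
        process, variant = chain[position]
        try:
            process(variant)
            log.append((process.__name__, variant, "OK"))
        except LoadError as error:
            log.append((process.__name__, variant, f"FAILED: {error}"))
            return position, log  # the chain stops here
    return None, log

chain = [
    (delete_index, "SALES_CUBE"),
    (load_data, "SALES_CUBE"),
    (create_index, "SALES_CUBE"),
]

stopped_at, first_log = run_chain(chain)
bad_records.clear()  # the load error is rectified manually
finished, second_log = run_chain(chain, start_at=stopped_at)
print(first_log)
print(second_log)
```

The log plays the role of the log view: it shows which process failed, and the restart picks up from that step instead of repeating the whole chain.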