Data Warehousing Concepts
Data warehousing concepts are the core ideas you need to understand in order to work with data warehouses. Two of the most basic are the definition of a data warehouse and the notion of data warehouse architecture. A data warehouse is a central store where data from various sources is collected, the necessary information is extracted and filtered, and the result is transformed into a form that reporting and analysis applications can present to users. Data warehouse architecture refers to the storage structure a business or company chooses, ranging from simple file storage (what you see is what you get) to complex architecture (several networks of data, one for each significant process in a corporation).
Other data warehousing concepts include:
- Dimensional Data Model
- Slowly Changing Dimensions
- Conceptual Data Model
- Logical Data Model
- Physical Data Model
- Data Integrity
- Factless Fact Table
- Junk Dimension
- Conformed Dimension
Data warehousing tutorials are guides designed to help you understand how the data storage process works, how it has evolved, and the other issues connected to it. Topics they cover include star and snowflake schemas, ETL, loading the time dimension, and data mining. Data warehousing interview questions cover the same issues and are similar to the following:
- How to define a star schema in relation to databases
- Distinction between physical and logical models
- Definition of active data warehousing and its advantages
- Where does the data go when it leaves the warehouse?
- How to load the time dimension
- Why data mining is used more than older forms of analysis
- What are OLTP and OLAP systems?
- What is the ETL process?
- Snowflake schema versus star schema
- What is XMLA?
- Discrete and continuous data used in data mining
- The difference between data warehousing (DW) and business intelligence (BI)
- Cube vs. linked cube
- Definition of virtual data warehousing
- Junk dimension
- What is a data mart?
- Kimball and Inmon approaches
- Real-time load processing vs. batch
Data mart concepts are the key terms related to data marts: data storage units that contain information intended for a particular group of users. Billing information is one example of a data mart. It would hold data only for customers who subscribe to a magazine, buy a certain number of products from a corporation, or request certain services from a business. Other data mart concepts include:
- Definition of data mart
- Data mart vs. data warehouse
- Dependent vs. independent data marts
- The process of data mart implementation (five steps)
- Data mart design
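To make the billing example above concrete, here is a minimal Python sketch of a data mart carved out of a larger warehouse. It assumes the warehouse is just an in-memory list of rows; the names `WarehouseRow`, `build_data_mart`, and the `subject` field are hypothetical and chosen only for illustration, not taken from any particular product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WarehouseRow:
    """One row in a toy warehouse (hypothetical layout)."""
    customer_id: int
    subject: str   # subject area, e.g. "billing" or "shipping"
    amount: float


def build_data_mart(warehouse, subject):
    """Return only the rows for one subject area.

    This mimics a dependent data mart: it is derived from the
    warehouse rather than loaded from the source systems directly.
    """
    return [row for row in warehouse if row.subject == subject]


# Illustrative data only.
warehouse = [
    WarehouseRow(1, "billing", 19.99),
    WarehouseRow(1, "shipping", 4.50),
    WarehouseRow(2, "billing", 5.00),
]

billing_mart = build_data_mart(warehouse, "billing")
for row in billing_mart:
    print(row.customer_id, row.amount)
```

In a real system the same idea is usually expressed as a filtered table or view inside the database, but the principle is the same: the data mart exposes only the slice of data that one group of users needs.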
Many of these concept questions are answered in data warehousing concepts PDF files. Rutgers University has published such a PDF online and it does a good job of tackling the material. Some of the topics the Rutgers PDF addresses are:
- Definitions of data and information
- Data warehouse definitions (subject-oriented, integrated, non-volatile, time-variant, accessible, process oriented, data warehouse, multidimensional analysis, hypercube, star and snowflake schemas, etc.)
- Comparison of data warehouse and operational data
- Data warehousing process
- Data warehouse tools/software
Data warehouse architecture is a description of the elements of the data warehouse. With a house, the architecture consists of the elements used to build it: wood, nails, brick, glass, cement, roofing, etc. In a data warehouse, the architecture consists of all the different types of data that constitute the warehouse. The fewer the elements used to build the data warehouse, the simpler the architecture; the more complex the data, the more complex the architecture.
There are several components that make up a data warehouse:
- Data sources
- Data transformation
- Reporting
- Metadata
- Operations
- Optional components
Data sources are the systems from which data (information) is passed into the warehouse, whether manually or automatically. Data transformation occurs when information is received, cleaned, converted into the right format, and sent to the storage unit. Reporting occurs when the data in the storage unit is presented through applications to users who have access to it via a password, security entry code, etc. Metadata, literally “data about data,” refers to a library catalog-like database that breaks down the various data in the storage unit and tells you what each type of data is about. Operations are the processes by which data is stored, cleaned, converted, and transferred to user applications. They include loading, cleaning, and extracting information, user management, security, capacity management, and other related actions.
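The flow described above (data source, transformation, storage, metadata) can be sketched in a few lines of Python. This is only an illustration under simple assumptions: the source is an in-memory list of records, the storage unit is a plain list, and the function names, field names, and sample records are all hypothetical.

```python
def extract(source):
    """Read raw records from a data source (here, an in-memory list)."""
    return list(source)


def clean(records):
    """Drop records with a missing amount and trim whitespace from names."""
    return [
        {"customer": r["customer"].strip(), "amount": r["amount"]}
        for r in records
        if r.get("amount") not in (None, "")
    ]


def transform(records):
    """Convert each field into the format the storage unit expects."""
    return [
        {"customer": r["customer"].title(), "amount": float(r["amount"])}
        for r in records
    ]


def load(records, storage):
    """Append the prepared records to the storage unit; return the count."""
    storage.extend(records)
    return len(records)


# Metadata: a small catalog describing what each stored field means.
METADATA = {
    "customer": "Customer name, title-cased",
    "amount": "Purchase amount as a float",
}

# Illustrative source data only.
raw_source = [
    {"customer": "  alice smith ", "amount": "12.50"},
    {"customer": "bob jones", "amount": ""},   # missing amount, dropped by clean()
    {"customer": "carol wu", "amount": "7"},
]

warehouse_storage = []
loaded = load(transform(clean(extract(raw_source))), warehouse_storage)
print(f"Loaded {loaded} records")
```

Reporting would then read from `warehouse_storage`, and operations such as security or capacity management would wrap around these steps; both are left out here to keep the sketch short.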
What is the purpose of these concepts? They help businesses and companies make sound financial decisions. By knowing how data warehouses work and the issues associated with them, businesses can develop business intelligence applications that show the direction of the company and give it a chance to reverse adverse trends when the situation calls for it. Data warehouses can also support user applications, such as questionnaires and surveys, whose responses help businesses cater to a wider range of customers. Businesses that learn what makes their customers happy are better placed to keep them and to profit from both their existing customer base and the consumer base at large. Simply put, data warehousing concepts are the bread and butter of businesses in the current economic market.