Data Warehouse Interview Questions
Data warehouse interview questions are an important element of a good interview. In order to choose a well-qualified applicant, interviewers must make sure that the questions they ask make sense, require some thought on the part of the interviewee, and also cover basic knowledge and skills that will be required for the job.
The first step in creating a good set of interview questions is to start by covering basic data warehouse concepts. Making sure that an interviewee is comfortable with these concepts is essential. Any candidate should be able to correctly and easily answer questions about some basic data warehouse concepts from the components and functions of a data warehouse to the types of software and hardware that can be used in conjunction with the data warehouse. An employee who will be able to see the directions that the company could take in terms of hardware and software to make certain tasks more efficient is invaluable.
It is also important to include ETL interview questions as the ETL process in data warehousing is also essential to the efficiency and usefulness of a data warehouse. Without proper inputs, the data warehouse cannot usefully store data. Database interview questions are also useful as there may be databases connected to the data warehouse or that need to use information from the data warehouse. A versatile employee should be able to understand topics about databases as well.
So what are the some of the top data warehouse or business intelligence interview questions? Each interview will certainly use common questions such as asking why the interviewee wants to work at that company, what they hope to bring to the team, and where they see themselves going in their own lives with this job. However, these data warehouse interviews can include many questions that are specific to data warehousing. These include basic questions like:
- What is a data warehouse?
- What is ODS?
- What is a lookup table?
- Why should you put your data warehouse on a separate system from the OLTP system?
- What are the differences that can be seen between OLTP and OLAP?
- What are SCD1, SCD2, and SCD3?
- Why would one avoid using an OLTP database design for a data warehouse?
- What are some of the most common reporting tools on the market currently?
- What are the methods of loading dimension tables?
10. How are dimension tables designed?
These questions cover a wide variety of data warehouse topics. Most of them are fairly basic and someone with proper training in data warehousing should be able to answer them correctly. While there is a huge array of possible questions and one should carefully choose questions that pertain specifically for the job the candidate will have to perform, these questions can pertain to any individual seeking work with data warehouses.
Data warehouse questions answers for these questions are as follows, more or less:
1. A Data Warehouse is a repository for integrated information that is capable of handling queries and performing analysis.
2. ODS is the acronym for Operational Data Store. This is used for data mining of operational data.
3. A lookup table is the table used to update a data warehouse. It allows the table to be updated and defines the information to be updated.
4. OLTP systems deal with data and not subject material. They cannot handle a dimensional model. This is why OLAP systems are designed for data warehouses. A data warehouse is not designed for specific queries so it would overload the system if users could input queries directly into the warehouse.
5. Some of the main differences between OLTP and OLAP are that OLTP is customer oriented (used by clerks, IT professionals, clients, and general data analysis) while OLAP is market oriented (used by executives and managers for analysis). OLTP is detail oriented and manages current data while OLAP manages lots of historical data, allows for summaries and aggregation, and it stores information in various levels of granularity. Also, OLTP focuses on current data for a business or department while OLAP can view multiple versions of the databases and thus can see the evolution of a business.
6. SCD is the acronym for Slowly Changing Dimensions. So, SCD1 only maintains the updated values. This is for situations where information like addresses or phone numbers are updated. SCD2 maintains historical or current information with a) the effective date, b) the version, c) flags, or any combination thereof. SCD3 adds new columns to the target table so that historical and current information are both kept.
7. OLTP database designs generally have the tables normalised, so the query response will be slowed down for the end user. Also, OLTP does not contain years of data over time so they are difficult or impossible to analyze.
8. Some of the top reporting tools are Proclarity, Cognos, Microstrategy, Informatica Power Analyzer, Actuate, Oracle Express OLAP, Hyperion, MS-Excel, Business Objects, and MS reporting services. There are also others.
9. The two methods of loading dimension tables are Conventional and Direct. Conventional offers a check before the data is loaded while Direct does not, but Direct is faster as a result.
- Dimension tables are usually designed using normalization principles up to 2NF, and in some cases 3NF. The data for this dimension must be located, there must be a way to extract it, and then it is important to make sure to maintain changes to this dimension.
Many of these data warehouse interview questions have longer answers, and there are a myriad of other possible questions to ask.