Data Mining and Business Intelligence (DMBI) Important Questions
These important questions (IMPs) were contributed by Darshan and Mangesh. Make sure to follow them on their social handles:
- Mangesh Pangam:
  - LinkedIn: Mangesh Pangam
  - Instagram: @Mangesh_2704
- Darshan:
  - Instagram: darshil2599
Module 1 – Data Warehouse (DWH) Fundamentals with Introduction to Data Mining (Weightage: 5 – 10 marks)
- Explain the KDD Process with a Diagram.
- What is data mining? Explain the KDD Process with a Diagram.
- What are the major issues in data mining?
- Compare and contrast OLTP and OLAP.
- Compare star schema, snowflake schema, and star constellation.
Module 2 – Data Exploration and Data Preprocessing (Weightage: 10 – 15 marks)
- What are various types of attributes?
- Explain Data Cleaning, Data Integration, Data Reduction.
- Write a short note on Data Transformation.
- Define and explain the statistical description of data.
- State the Apriori Algorithm. Any numerical on the Apriori Algorithm.
- Explain the concepts: Normalization, Binning, Histogram Analysis.
- What is noisy data? How to handle noisy data?
Module 3 – Classification (Weightage: 15 – 25 marks)
- Explain Regression. Explain Linear Regression with an example.
- Explain Cross-Validation.
- Explain the concept of Decision Tree Induction.
- Numerical on the Naïve Bayes Algorithm: Using the given training dataset, classify the given tuple using the Naïve Bayes Algorithm.
Module 4 – Clustering and Outlier Detection (Weightage: 15 – 20 marks)
- What is an outlier? Describe the methods that are used for outlier analysis.
- Explain the K-means algorithm in detail. Apply the K-means algorithm to divide the given set of values {2,3,6,8,9,12,15,18,22} into 3 clusters.
- Explain DBSCAN algorithm with an example.
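
For the K-means numerical above, the following is a minimal sketch of how the iterations can be checked in Python. The question does not give initial centroids, so the starting values 2, 9, and 18 are an assumption made purely for illustration, and the function name `kmeans_1d` is hypothetical. A different initial choice may lead to different final clusters.

```python
def kmeans_1d(values, initial_centroids, max_iter=100):
    """Plain K-means on one-dimensional data using absolute distance."""
    centroids = list(initial_centroids)
    clusters = []
    for _ in range(max_iter):
        # Assignment step: put each value in the cluster of its nearest centroid.
        clusters = [[] for _ in centroids]
        for v in values:
            nearest = min(range(len(centroids)), key=lambda i: abs(v - centroids[i]))
            clusters[nearest].append(v)
        # Update step: each centroid becomes the mean of its cluster
        # (an empty cluster keeps its previous centroid).
        new_centroids = [
            sum(c) / len(c) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
        # Stop once the centroids no longer change.
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return clusters, centroids


values = [2, 3, 6, 8, 9, 12, 15, 18, 22]
# Assumed initial centroids; the question does not specify them.
clusters, centroids = kmeans_1d(values, [2, 9, 18])
print(clusters)   # [[2, 3], [6, 8, 9, 12], [15, 18, 22]]
print(centroids)
```

With these assumed starting centroids the assignment stabilises after the first update, giving the clusters {2, 3}, {6, 8, 9, 12}, and {15, 18, 22} with means 2.5, 8.75, and about 18.33.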
Module 5 – Frequent Pattern Mining (Weightage: 15 – 20 marks)
- What are Multilevel and Multidimensional Association Rules? Explain each with a suitable example.
- Explain the Market Basket Analysis.
- Explain frequent itemset mining using candidate generation.
- Write a short note on Constraint-Based Association Mining.
- Explain the concept of mining frequent itemsets using the vertical data format.
- Discuss the generation of association rules from frequent itemsets.
- Explain briefly Frequent Itemset, Closed Itemset & Association Rules.
Module 6 – Business Intelligence (Weightage: 5 – 10 marks)
- Explain the Business Intelligence issues.