Semester Offering: InterSem
 

Data mining has emerged as an important discipline with the growth of massive digital data archives. Its goal is to process a data archive automatically to find patterns that represent knowledge, that is, information of interest to the user. It is a multidisciplinary field that draws on techniques from artificial intelligence, statistics, pattern analysis, and other areas.

 

The aim of this course is to introduce data mining techniques with a view to practical application. Topics include association rule generation, classification and prediction (including Bayesian and rule-based methods), cluster analysis (including partitioning, hierarchical, and grid-based methods, and outlier analysis), data stream mining, and social network analysis. Practical case studies will be drawn from the financial services and retail sectors. The course uses the Weka software package.

 

None

 

I Data Warehouse and Online Analytical Processing (OLAP) Technology

1. Data warehouse architecture

2. Multidimensional data model

3. Data warehouse implementation

4. From data warehousing to data mining


II Data cube computation and data generalization

1. Efficient cube computation methods

2. Data cube and OLAP technology

3. Attribute-oriented induction


III Mining frequent patterns, associations, and correlations

1. Efficient and scalable frequent itemset mining methods

2. Mining association rules

3. From association mining to correlation analysis

4. Constraint-based association mining


IV Classification and prediction

1. Classification and prediction methods

2. Accuracy and error measures

3. Evaluation techniques

4. Model selection


V Cluster analysis

1. Clustering methods

2. High-dimensional data

3. Constraint-based cluster analysis

4. Outlier analysis


VI Special applications

1. Mining data streams

2. Mining time series data

3. Graph mining

4. Social network analysis


VII Case studies

 

None

 

J. Han and M. Kamber (2006), Data Mining: Concepts and Techniques, 2nd edition, Morgan Kaufmann.

 

M. J. A. Berry and G. Linoff (1997), Data Mining Techniques: For Marketing, Sales, and Customer Relationship Management, Wiley.

I. H. Witten and E. Frank (2001), Data Mining: Practical Machine Learning Tools and Techniques, Morgan Kaufmann.

T. Soukup and I. Davidson (2002), Visual Data Mining: Techniques and Tools for Data Visualization and Mining, Wiley.

P. Tan, M. Steinbach and V. Kumar (2005), Introduction to Data Mining, Addison-Wesley.

D. T. Larose (2006), Data Mining Methods and Models, Wiley.

 

- Assignment 30%
- Midterm exam 30%
- Final exam 40%