Session 01
08.28.2007 |
Introduction - class description Class web page Textbooks Goals & topics of the course Intro on data mining Data warehousing Data cubes Data warehousing schemas
|
Session 02
08.30.2007 |
Ch.2-3 textbook
What is Data Mining?
Data mining steps, DM in business intelligence, DM as a confluence of disciplines
DM Query Language (DMQL)
Integration and coupling of DM and DW DM Architecture DM Functionalities and task primitives OLTP vs. OLAP Data warehouse and data cubes Schemas: star, snowflake, and fact constellation
|
Session 03
09.04.2007 |
Ch.3 textbook
Multidimensional Data Model (MDDM) Measures
Concept Hierarchies (CHs) OLAP in MDDM Starnet Model Data Warehouse Architecture Views, approaches and steps 3-tier architecture Back-end tools Metadata OLAP servers (ROLAP, MOLAP, and HOLAP)
|
Session 04
09.06.2007 |
Ch.3-4 textbook Data Warehouse implementation
Data cube computations
Data cube construction (number, access, query) Materialization (no, partial, full)
Partial materialization (iceberg cube, shell cube) Indexing OLAP data (Bitmap indexing, join indexing) |
Session 05
09.11.2007 |
Ch.3-4 textbook Data Cube Computation & Materialization
Ancestor and descendant MultiWay Array Aggregation Bottom-Up Computation (BUC) |
Session 06
09.14.2007 |
Ch.3-4 textbook Data Cube Computation & Materialization
(Star-cubing) Star-cubing computation (shared dimensions, cuboid trees, star-tree construction, star-tree in computing iceberg cube) |
Session 07
09.18.2007 |
Ch.6 textbook & Berry's book on DM for marketing Extracting models for data classification & prediction Two phases of data classification Classification (supervised learning, unsupervised learning) Classification Accuracy (training and testing, accuracy, boosting the accuracy) Classification & Prediction (data preparation, data transformation, algorithms) Decision Trees (example, ID3 & CART, induction, splitting attributes, selection methods, scenarios, stopping criterion, complexity, diversity and purity, purity measures - Gini, Entropy)
|
Session 08
09.20.2007 |
Practical Data Mining Real-world issues concerning you and your data |
Session 09
09.25.2007 |
Data Mining
Guest Lecture: Have fun.
Look ma, it's a network. |
Session 10
09.27.2007 |
Practical issues with decision trees Building, browsing a data cube in SQL MDX |
Session 11
10.02.2007 |
Project paper presentations and discussions |
Session 12
10.04.2007 |
Learning from Neighbors Eager vs. lazy learners Lazy learners K-nearest-neighbor classifier Case based reasoning (CBR) Distance measures Euclidean space coding theory fuzzy space |
Session 13
10.09.2007 |
Distance measures in FL & NNS Fuzzy logic (Intro, Fuzzy systems, Distance in fuzzy logic) Neural Networks (Intro, Hamming distance and net value, CPN networks) Data Mining Based on Experience Memory (Case) Based Reasoning
(Examples & steps, Case study from Berry's book)
|
Session 14
10.11.2007 |
Data Mining Based on Experience: Collaborative (Information) Filtering Steps Examples Applications Papers Resources
|
Session 15
10.16.2007 |
Crisp (Hard) vs. Fuzzy Clustering Clustering Based on ED Clustering using Neural Networks (Kohonen Self-Organizing Map,
forming clusters as you go)
Hard clustering (k-Means clustering) Fuzzy clustering (c-Means clustering) |
Session 16
10.18.2007 |
Mining the Web Page Layout Structure HITS (Hyperlink-Induced Topic Search) Steps Examples Problems
|
Session 17
10.23.2007 |
Review - Analytic Geometry in Euclidean Space with Cartesian Coordinates (slope intercept, line intercept, scalar form of a line, 2 & 3 point form, one point and vector form) Neuron as linear classifier (Neuron, definition, threshold, bias) Neuron as linear classifier (Linear classifier with more than two classes, Linear classifier in multidimensional space) Support Vector Machines History, books Algorithm Derivation for linearly separable patterns Derivation for linearly inseparable patterns XOR example A few more examples of neural networks (Radial Basis Function Networks (RBF), Perceptron adjustable rule)
|
Session 18
10.25.2007 |
Discussion of previous session material Brief review of ANNs for classification and learning (RBF Radial Basis Function Networks, CPN Counter Propagation Networks, LVQ Learning Vector Quantization, Functional Link Networks, Polynomial Networks, Perceptron adjustable rule) |
Session 19
10.30.2007 |
Classification and Prediction - Regression Analysis (Types, history, Least square method, Measure of goodness-of-fit, Multiple linear, nonlinear, fuzzy regression) Accuracy & error measures (Accuracy, misclassification rate, confusion matrix, Other measures - TP, FP, TN, FN, P, Accuracy vs. threshold, Predictor error measures)
|
Session 20
11.01.2007 |
Exam review |
Session 21
11.06.2007 |
Project paper presentations |
Session 22
11.08.2007 |
Project paper presentations Reading material presentations
|
Session 23
11.13.2007 |
Techniques for accuracy estimation Holdout and random subsampling
k-fold cross-validation Bootstrap Techniques for accuracy improvement Bagging Boosting |
Session 24
11.15.2007 |
Fuzzy Preference Relation Definitions (fuzzy number, fuzzy value, av, gamma resolution) Fuzzy preference relations and properties (Orlovsky, Lee) Fuzzy satisfaction degree and properties Fuzzy preference relation in continuous domain and visualization Applications (decision support systems, attack signatures, fault tolerance voters) |
Session 25
11.20.2007 |
Revisiting covered material Discussion |
No class
11.22.2007 |
Thanksgiving, no class |
Session 26
11.27.2007 |
Post exam review Weka data mining tool |
Session 27
11.29.2007 |
Mining Frequent Patterns Definitions (frequent itemset, frequent sequential pattern, frequent structured pattern, market basket analysis) Association Rules (support & confidence) Strong rules (occurrence frequency of an itemset; relative support; absolute support; frequent itemset; confidence) Challenges in association rule mining |
Session 28
12.04.2007 |
Revisiting covered material Discussion |
Session 29
12.06.2007 |
Paper discussions and class project paper discussions |
Session 30
12.11.2007 |
No examination week Final class paper presentations |
Session 31
12.13.2007 |
No examination week Remaining class paper presentations Course summary (Topics, assignments, presentations)
|
Week of
12.17.2007 |
Final examination week |