M O N A T A R

InfoTech Unit Avatar

CSE5320 Statistics of Data and Data Mining

Chief Examiner

This field records the Chief Examiner for unit approval purposes. It is not published, and can only be edited by Faculty Office staff.

To update the published Chief Examiner, you will need to update the Faculty Information/Contact Person field below.

NB: This view restricted to entries modified on or after 19990401000000

Unit Code, Name, Abbreviation

CSE5320 Statistics of Data and Data Mining [DataMining]

Reasons for Introduction

Obsolete Reasons for Introduction

Recent Federal Government initiatives in high performance computing led to the establishment of the Australian Partnership for Advanced Computing (APAC) and its Victorian arm (VPAC). Monash University is a founding member of, and a substantial contributor to, VPAC. APAC/VPAC aims to provide not only hardware support but also education and training to the scientific community, and to raise the profile of high performance computing within the Australian business community.

Monash now proposes to position itself as a major provider of education and training for the APAC project and the wider scientific and technical community in Victoria. This proposed unit forms an integral part of a Graduate Certificate in Computational Science to be offered jointly by the Faculty of Science and the Faculty of Computing and Information Technology.

The increasing prevalence of high-speed computers and large data sets has seen the fields of machine learning, statistics and econometrics gradually create a common field of study in the computational sciences. This common area is popularly known as data mining.

This unit builds on CSE5310, which introduces the student to advanced computing, parallel programming paradigms and the associated programming tools. Building on those tools, this unit provides an introduction to statistical and probabilistic methods for mining information from very large data sets and databases.

The unit covers the following major areas of data mining and associated statistical methods:
  • Bayesian Nets and Causal Nets,
  • Clustering Methods (using for example Snob),
  • Decision Trees (using for example C5 and DtreeProg),
  • Support Vector Machines (using for example SVM-light), and
  • Neural Networks (using for example Matlab)
Evaluation will be based on:
  • Artificial and real-world data
  • Training and test data
  • "Right"/"wrong" prediction and probabilistic prediction
  • Kullback-Leibler distance
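
To illustrate the difference between "right"/"wrong" prediction and probabilistic prediction, the following is a minimal sketch only, not taken from the unit materials. The function names, the two-class example data and the choice of base-2 logarithms (bits) are all assumptions made for illustration. It shows two hypothetical models that pick the same most-probable class, and so score identically on "right"/"wrong" prediction, yet differ in Kullback-Leibler distance from the true distribution.

```python
import math


def kl_divergence(p, q):
    """Kullback-Leibler distance D(p || q) in bits for discrete distributions.

    Terms with p_i == 0 contribute nothing; q_i is assumed non-zero wherever p_i > 0.
    """
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def accuracy(true_labels, predicted_probs):
    """Fraction of cases where the most probable predicted class is the true class."""
    hits = 0
    for label, probs in zip(true_labels, predicted_probs):
        best_class = max(range(len(probs)), key=probs.__getitem__)
        if best_class == label:
            hits += 1
    return hits / len(true_labels)


# Hypothetical data (an assumption for illustration only).
true_dist = [0.8, 0.2]         # true class probabilities
overconfident = [0.99, 0.01]   # model A: too sure of class 0
calibrated = [0.75, 0.25]      # model B: close to the truth

labels = [0, 0, 1]             # three observed cases
acc_a = accuracy(labels, [overconfident] * len(labels))
acc_b = accuracy(labels, [calibrated] * len(labels))

kl_a = kl_divergence(true_dist, overconfident)
kl_b = kl_divergence(true_dist, calibrated)

print(f"accuracy: A={acc_a:.3f}  B={acc_b:.3f}")
print(f"KL distance (bits): A={kl_a:.4f}  B={kl_b:.4f}")
```

In this sketch both models always predict class 0, so their accuracies are equal, while the Kullback-Leibler distance penalises the overconfident model more heavily.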

Objectives

Unit Content

Teaching Methods

Assessment

Workloads

Resource Requirements

Software Requirements (21 Oct 2005, 1:04pm)

Prerequisites

Faculty Information

Proposer

David Dowe

Approvals

School:
Faculty Education Committee: 25 Jun 2002 (Judith Beart)
Faculty Board: 25 Jun 2002 (Judith Beart)
ADT:
Faculty Manager:
Dean's Advisory Council:
Other:

Version History

17 Oct 2005 David Sole Added Software requirements template
21 Oct 2005 David Sole Updated requirements template to new format

This version: