Programme
September 11, 2024 2024-11-25 17:18
The programme’s interdisciplinary design promotes the application of computer technology, operational research, statistical modelling, and simulation to problem-solving and decision-making in organisations and enterprises in the private and public sectors.
Admission Rate
Employment
QS World University Rankings by Subject 2024 (Data Science & Artificial Intelligence)
Curriculum
The curriculum of the MDASC programme adopts a well-balanced and comprehensive pedagogy of both statistical and computational concepts and methodologies, underpinning applications that are not limited to business or a single field alone. The programme is ideal for those interested in acquiring analytical skills in areas ranging from statistics to computational analytics, and those who wish to pursue further studies in data science after having completed undergraduate studies in areas such as science, engineering, medical sciences, social sciences, information systems, computing and data analytics.
72 Credits
Part Time - 2.5 Years
Interdisciplinary Courses
Capstone Requirement
Programme Highlights
Interdisciplinary and comprehensive curriculum
Solid foundation in statistical and computational analyses
Hands-on applications of methodologies with software
Electives cover a broad range of contemporary topics in computer science and statistics
Capstone project with real-life applications
Course Highlights
Statistical modelling
Computational intelligence
Blockchain data analytics
Multimedia technologies
Programme Structure
The curriculum is the same for both full-time and part-time study modes. Please refer to the 2024 syllabuses and regulations for details. The curriculum is extracted below.
Compulsory Courses
24 credits
COMP7404 - Computational intelligence and machine learning (6 credits)
This course will teach a broad set of principles and tools that will provide the mathematical, algorithmic and philosophical framework for tackling problems using Artificial Intelligence (AI) and Machine Learning (ML). AI and ML are highly interdisciplinary fields with impact in different applications, such as biology, robotics, language, economics, and computer science. AI is the science and engineering of making intelligent machines, especially intelligent computer programs, while ML refers to the changes in systems that perform tasks associated with AI. Ethical issues in advanced AI and how to prevent learning algorithms from acquiring morally undesirable biases will be covered. Topics may include a subset of the following: problem solving by search, heuristic (informed) search, constraint satisfaction, games, knowledge-based agents, supervised learning (e.g., regression and support vector machine), unsupervised learning (e.g., clustering), dimension reduction, learning theory, reinforcement learning, transfer learning, adaptive control, and ethical challenges of AI and ML.
Pre-requisites: Nil, but knowledge of data structures and algorithms, probability, linear algebra, and programming would be an advantage.
Assessment: coursework (50%) and examination (50%)
DASC7011 - Statistical inference for data science (6 credits)
Computing power has revolutionized the theory and practice of statistical inference. Reciprocally, novel statistical inference procedures are becoming an integral part of data science. By focusing on the interplay between statistical inference and methodologies for data science, this course reviews the main concepts underpinning classical statistical inference, studies computer-intensive methods for conducting statistical inference, and examines important issues concerning statistical inference drawn upon modern learning technologies. Contents include method of moments estimation, least squares estimation, maximum likelihood estimation, principal component analysis, singular value decomposition, simple Monte Carlo simulation, resampling methods, and LASSO and RIDGE regressions.
Assessment: coursework (40%) and examination (60%)
DASC7104 - Advanced database systems (6 credits)
The course will study some advanced topics and techniques in database systems, with a focus on the aspects of database systems design & algorithms and big data processing for structured data. Traditional topics include: query optimization, physical database design, transaction management, crash recovery, parallel databases. This course will also survey some recent developments in selected areas such as NoSQL databases and SQL-based big data management systems for relational (structured) data.
Prerequisites: An introductory course on databases and basic programming skills.
Assessment: coursework (55%) and examination (45%)
STAT7102 - Advanced statistical modelling (6 credits)
This course introduces modern methods for constructing and evaluating statistical models and their implementation using popular computing software, such as R or Python. It will cover both the underlying principles of each modelling approach and the model estimation procedures. Topics from: (i) Linear regression models; (ii) Generalized linear models; (iii) Model selection and regularization; (iv) Kernel and local polynomial regression; selection of smoothing parameters; (v) Generalized additive models; (vi) Hidden Markov models and Bayesian networks.
Assessment: coursework (50%) and examination (50%)
Disciplinary Electives
36 credits*
List A
At least 12 credits
COMP7107 - Management of complex data types (6 credits)
The course studies the management and analysis of data types which are not simple scalars. Such complex data types include spatial data, multidimensional data, time-series data, temporal and spatio-temporal data, sparse multidimensional vectors, set-valued data, strings and sequences, homogeneous and heterogeneous graphs, knowledge-base graphs, geo-textual and geo-social data. For each of these data types, we will learn popular queries and analysis tasks, as well as storage and indexing methods for main memory and the disk.
Assessment: coursework (50%) and examination (50%)
COMP7305 - Cluster and cloud computing (6 credits)
This course offers an overview of current cloud technologies, and discusses various issues in the design and implementation of cloud systems. Topics include cloud delivery models (SaaS, PaaS, and IaaS) with motivating examples from Google, Amazon, and Microsoft; virtualization techniques implemented in Xen, KVM, VMWare, and Docker; distributed file systems, such as Hadoop file system; MapReduce and Spark programming models for large-scale data analysis; and networking techniques in hyper-scale data centers. The students will learn the use of Amazon EC2 to deploy applications on cloud, and implement a Spark application on a Xen-enabled PC cluster as part of their term project.
Prerequisites: Students are expected to install various open-source cloud software in their Linux cluster, and exercise the system configuration and administration. Basic understanding of the Linux operating system and some programming experience (C/C++, Java or Python) in a Linux environment are required.
Assessment: coursework (50%) and examination (50%)
COMP7409 - Machine learning in trading and finance (6 credits)
The course introduces students to the field of Machine Learning, and helps them develop skills in applying Machine Learning, or more precisely, applying supervised learning, unsupervised learning and reinforcement learning to solve problems in Trading and Finance.
This course will cover the following topics: (1) Overview of Machine Learning and Artificial Intelligence, (2) Supervised Learning, Unsupervised Learning and Reinforcement Learning, (3) Major algorithms for Supervised Learning and Unsupervised Learning with applications to Trading and Finance, (4) Basic algorithms for Reinforcement Learning with applications to optimal trading, asset management, and portfolio optimization, (5) Advanced methods of Reinforcement Learning with applications to high-frequency trading, cryptocurrency trading and peer-to-peer lending.
Assessment: coursework (65%) and examination (35%)
COMP7503 - Multimedia technologies (6 credits)
This course presents fundamental concepts and emerging technologies for multimedia computing. Students are expected to learn how to develop various kinds of media communication, presentation, and manipulation techniques. At the end of the course, students should have acquired the skills to utilize, integrate and synchronize different information and data from media sources for building specific multimedia applications. Topics include media data acquisition methods and techniques; nature of perceptually encoded information; processing and manipulation of media data; multimedia content organization and analysis; trending technologies for future multimedia computing.
Assessment: coursework (50%) and examination (50%)
COMP7506 - Smart phone apps development (6 credits)
Smart phones have become an essential part of our everyday lives. The number of smart phone users worldwide today surpasses six billion and is forecast to further grow by more than one billion in the next few years.
Smart phones play an important role in mobile communication and applications. Smart phones are powerful as they support a wide range of applications (called apps). Most of the time, smart phone users just download their favorite apps remotely from the app stores. There is great potential for software developers to reach users worldwide.
This course aims at introducing the design and technical issues of smart phone apps. For example, smart phone screens are usually smaller than computer monitors while smart phones usually possess more hardware sensors than conventional computers. We have to pay special attention to these aspects in order to develop attractive and successful apps. Various modern smart phone app development environments and programming techniques (such as Java for Android phones and Swift for iPhones) will also be introduced to help students develop their own apps.
Students should have basic programming knowledge.
Mutually exclusive with: COMP3330 Interactive mobile application design and programming
Assessment: coursework (60%) and examination (40%)
COMP7507 - Visualization and visual analytics (6 credits)
This course introduces the basic principles and techniques in visualization and visual analytics, and their applications. Topics include human visual perception; color; visualization techniques for spatial, geospatial and multivariate data, graphs and networks; text and document visualization; scientific visualization; interaction and visual analysis.
Assessment: coursework (50%) and examination (50%)
COMP7906 - Introduction to cyber security (6 credits)
The aim of the course is to introduce different methods of protecting information and data in the cyber world, including the privacy issue. Topics include introduction to security; cyber attacks and threats; cryptographic algorithms and applications; network security and infrastructure.
Mutually exclusive with: ICOM6045 Fundamentals of e-commerce security.
Assessment: coursework (50%) and examination (50%)
DASC7606 - Deep learning (6 credits)
Machine learning is a fast-growing field in computer science and deep learning is the cutting-edge technology that enables machines to learn from large-scale and complex datasets. Ethical implications of deep learning and its applications will be covered and the course will focus on how deep neural networks are applied to solve a wide range of problems in areas such as natural language processing and image processing. Other applications such as financial predictions, game playing and robotics may also be covered. Topics covered include linear and logistic regression, artificial neural networks and how to train them, recurrent neural networks, convolutional neural networks, generative models, deep reinforcement learning and unsupervised feature learning.
Prerequisites: Basic programming skills (e.g., Python) are required.
Assessment: coursework (50%) and examination (50%)
FITE7410 - Financial fraud analytics (6 credits)
This course aims at introducing various analytics techniques to fight against financial fraud. These analytics techniques include descriptive analytics, predictive analytics, and social network learning. Various data sets will also be introduced, including labeled or unlabeled data sets, and social network data sets. Students learn the fraud patterns through applying the analytics techniques to financial frauds, such as insurance fraud, credit card fraud, etc.
Key topics include: Handling of raw data sets for fraud detection; Applications of descriptive analytics, predictive analytics and social network analytics to construct fraud detection models; Financial Fraud Analytics challenges and issues when applied in business context.
Students are required to have basic knowledge of statistical concepts.
Assessment: coursework (50%) and examination (50%)
ICOM6044 - Data science for business (6 credits)
The emerging discipline of data science combines statistical methods with computer science to solve problems in applied areas. In this case we focus on how data science can be used to solve business problems especially those enabled by electronic commerce. By its very nature e-commerce is able to generate large amounts of data and data mining methods are quite helpful for managers in turning this data into knowledge which in turn can be used to make better decisions. These data sets and their accompanying quantitative methods have the potential to dramatically change decision making in many areas of business. For example, ideas like interactive marketing, customer relationship management, and database marketing are pushing companies to utilize the information they collect about their customers in order to make better marketing decisions.
This course focuses on how data science methods can be applied to solve managerial problems in business. Our emphasis is developing a core set of principles that embody data science: empirical reasoning, exploratory and visual analysis, and predictive modeling. We use these core principles to understand many methods used in data mining and machine learning. Our strategy in this course is to survey several popular techniques and understand how they map into these core principles. These techniques are illustrated with case studies that involve decisions about targeting, product recommendation, customer retention and financial lending. The class takes a learning-by-doing approach to analyse data and make decisions from these analyses. However, the emphasis is not on the software for implementing these techniques but on understanding the inputs and outputs of these techniques and how they are used to solve business problems, and effectively communicate them to managers.
Assessment: coursework (65%) and examination (35%)
List B
At least 12 credits
STAT6008 - Advanced statistical inference (6 credits)
This course covers the advanced theory of point estimation, interval estimation and hypothesis testing. Using a mathematically-oriented approach, the course provides a formal treatment of inferential problems, statistical methodologies and their underlying theory. It is suitable in particular for students intending to further their studies or to develop a career in statistical research. Contents include: (1) Decision problem – frequentist approach: loss function; risk; decision rule; admissibility; minimaxity; unbiasedness; Bayes’ rule; (2) Decision problem – Bayesian approach: prior and posterior distributions, Bayesian inference; (3) Estimation theory: exponential families; likelihood; sufficiency; minimal sufficiency; completeness; UMVU estimators; information inequality; large-sample theory of maximum likelihood estimation; (4) Hypothesis testing: uniformly most powerful (UMP) test; monotone likelihood ratio; UMP unbiased test; conditional test; large-sample theory of likelihood ratio; confidence set; (5) Nonparametric inference; bootstrap methods.
Assessment: coursework (40%) and examination (60%)
STAT6013 - Financial data analysis (6 credits)
This course aims at introducing statistical methodologies in analyzing financial data. Financial applications and statistical methodologies are intertwined in all lectures. Contents include: classical portfolio theory, portfolio selection in practice, single index market model, robust parameter estimation, copula and high frequency data analysis.
Assessment: coursework (40%) and examination (60%)
STAT6015 - Advanced quantitative risk management (6 credits)
This course covers statistical methods and models of risk management, especially of Value-at-Risk (VaR). Contents include: Value-at-Risk (VaR) and Expected Shortfall (ES); univariate models (normal model, log-normal model and stochastic process model) for VaR and ES; models for portfolio VaR; time series models for VaR; extreme value approach to VaR; back-testing and stress testing.
Assessment: coursework (50%) and examination (50%)
STAT6016 - Spatial data analysis (6 credits)
This course aims at introducing statistical methodologies in analyzing financial data. Financial applications and statistical methodologies are intertwined in all lectures. Contents include: classical portfolio theory, portfolio selection in practice, single index market model, robust parameter estimation, copula and high frequency data analysis.
Assessment: coursework (40%) and examination (60%)
STAT6019 - Current topics in statistics (6 credits)
This course may include modules such as:
Causal inference: an introduction to key concepts and methods for causal inference. Contents include 1) The counterfactual outcome, randomized experiment, observational study; 2) Effect modification, mediation and interaction; 3) Causal graphs; 4) Confounding, selection bias, measurement error and random variability; 5) Inverse probability weighting and the marginal structural models; 6) Outcome regression and the propensity score; 7) The standardization and the parametric g-formula; 8) G-estimation and the structural nested model; 9) Instrumental variable method; 10) Machine learning methods for causal inference; 11) Other topics as determined by the instructor.
Functional data analysis: covers topics from 1) Basis functions; 2) Least squares estimation; 3) Constrained functions; 4) Functional PCA; 5) Regularized PCA; 6) Functional linear model; 7) Other topics as determined by the instructor.
Assessment: coursework (100%)
STAT7008 - Programming for data science (6 credits)
Capturing and utilising essential information from big datasets poses both statistical and programming challenges. This course is designed to equip students with the fundamental computing skills required to use Python for addressing these challenges. The course will cover a range of topics, including programming syntax, file I/O, object-oriented programming, scientific data processing and analysis, data visualization, data mining and web scraping, and programming techniques for machine learning, deep learning, computer vision, and natural language processing.
Assessment: coursework (100%)
STAT8003 - Time series forecasting (6 credits)
Discrete time series are integer-indexed sequences of random variables. Such series arise naturally in climatology, economics, finance, environmental research and many other disciplines. The course covers statistical modelling and forecasting of time series. Topics may include stationary and nonstationary time series, ARMA models, identification based on autocorrelation and partial autocorrelation, GARCH models, goodness-of-fit, forecasting, and nonlinear time series modelling.
Assessment: coursework (50%) and examination (50%)
STAT8017 - Data mining techniques (6 credits)
With the rapid developments in computer and data storage technologies, the fundamental paradigms of classical data analysis are ripe for change. Data mining techniques aim at helping people to work smarter by revealing underlying structure and relationships in large amounts of data. This course takes a practical approach to introduce the new generation of data mining techniques and show how to use them to make better decisions. Topics include data preparation, feature selection, association rules, decision trees, bagging, random forests and gradient boosting, cluster analysis, neural networks, and an introduction to text mining.
Assessment: coursework (100%)
STAT8019 - Marketing analytics (6 credits)
This course aims to introduce various statistical models and methodology used in marketing research. Special emphasis will be put on marketing analytics and statistical techniques for marketing decision making including market segmentation, market response models, consumer preference analysis, conjoint analysis and extracting insights from text data. Contents include statistical methods for segmentation, targeting and positioning, statistical methods for new product design, text mining techniques and market response models.
Assessment: coursework (50%) and examination (50%)
STAT8300 - Career development and communication workshop (Non-credit-bearing)
The course is specially designed for students who wish to sharpen their communication and career preparation skills through a variety of activities including lectures, skill-based workshops, small group discussions and role plays. These activities aim to help students make informed career choices, provide practical training in communication, presentation, time management and advanced interview skills, and enhance students’ overall competitiveness in the employment market.
Assessment: coursework (100%); this course is assessed on a Pass, Fail or Distinction basis
STAT8306 - Statistical methods for network data (3 credits)
The theory of six degrees of separation suggests that human interactions can be readily represented in the form of a network. Examples of networks include router networks, the World Wide Web, social networks (e.g. Facebook or Twitter), genetic interaction networks and various collaboration networks (e.g. movie actor collaboration networks and scientific paper collaboration networks). Despite the diversity of their sources, these networks exhibit some common properties. For example, both the spread of disease in a population and the spread of rumors in a social network occur in sub-logarithmic time. This course aims at discussing the common properties of real networks and the recent development of statistical network models. Topics may include common network measures, community detection in graphs, preferential attachment random network models, exponential random graph models, models based on random point processes and hidden network discovery on a set of dependent random variables.
Assessment: coursework (100%)
STAT8307 - Natural language processing and text analytics (6 credits)
Natural language processing (NLP) is a subfield of artificial intelligence, focusing on understanding human language. This course aims to provide students with knowledge and skills in natural language processing and text analytics, including basic information retrieval, text classification, word embedding, neural networks, sequence models, encoder-decoder, transformer, contextualized word representation, and language models. Students are required to be familiar with Python programming.
Assessment: coursework (100%)
STAT8308 - Blockchain data analytics (3 credits)
In this course, we start by studying the basic architecture of a blockchain. Then we move on to several major applications including (but not limited to) cryptocurrencies, fintech and smart contracts. We conclude by examining the cybersecurity issues facing the blockchain ecosystems.
Assessment: coursework (100%)
Capstone Requirement
12 credits
DASC7600 - Data science project (12 credits)
Candidates will be required to carry out independent work on a major project under the supervision of an individual staff member. A written report is required.
Note: Students should not be taking or have taken DASC8088 Data science practicum
Assessment: written report (70%) and oral presentation (30%)
DASC8088 - Data science practicum (6 credits) + a 6-credit course (from List A or List B)
This course provides students with first-hand experience in applying academic knowledge in a real-life work environment. To be eligible, students should undertake a data-science-related practicum of no less than 160 hours over at least 20 working days in a paid or unpaid position. Part-time students may complete their practicum at their current place of employment. The practicum will normally take place in the summer semester for full-time students or during the summer semester of the second year for part-time students.
Note: Students should not be taking or have taken DASC7600 Data science project
Assessment: Upon completion of the practicum, each student is required to submit a written report (60%) and to give an oral presentation (40%) on his/her practicum experience. Supervisors will assess the students based on their performance during the practicum period. Assessment of this course is on a Pass, Fail or Distinction basis with three criteria: (1) supervisor’s evaluation, (2) written report, (3) oral presentation. Failure to fulfil any of the three criteria satisfactorily leads to a “Fail” grade in the course.
- The programme structure will be reviewed from time to time and is subject to change.
- *Students who have completed the same courses in their previous studies in HKU, e.g. Master of Statistics or Master of Science in Computer Science may, on production of relevant transcripts, be permitted to select up to 36 credits of disciplinary electives from either List A or List B above if they are not able to find any untaken options from either of the lists of disciplinary electives.
DU Shenghui (Class 2024, FT)
Data Scientist, Deloitte HK
MOK Ho Fung Arthur (Class 2024, PT)
Manager, Ernst & Young Advisory Services Limited
CHEN Zeyu (Class 2024, FT)
PhD in Teledentistry and AI, HKU
Congregation
From 2021-22, two Congregations are held annually (in July and December) to align with international practice and for the convenience of students.
Graduands who are eligible for graduation will be assigned to the Congregation nearest to the completion date of their studies for conferral of degree. The Graduation Certificate can normally be collected on the day of the Congregation.
*Expected graduation time for normative study period
Full Time (1.5 years): Summer (July 2027)
Part Time (2.5 years): Summer (July 2028)
Tuition Fee, Scholarship and Fellowships
Programme Fees (subject to approval)
This is a self-funded programme. The full tuition fee is HK$324,000 (for FT & PT students admitted in September 2025). The full fee would normally be paid by full-time students in 3 installments, and part-time students in 5 installments.
Targeted Taught Postgraduate Programmes Fellowships Scheme
The Master of Data Science (MDASC) is one of the programmes sponsored by the University Grants Committee (UGC) under the Targeted Taught Postgraduate Programmes Fellowships Scheme. Local full-time or part-time offer recipients who will be students of MDASC in the academic year 2024-25 are eligible to apply, and applicants are required to prepare a proposal on how they can contribute to the priority areas (i.e. Business and STEM) of Hong Kong after completing MDASC. More application details will be released to the eligible candidates by email in due course.
Successful applicants will each receive an award of HK$120,000. Please note that if the awardees cannot complete MDASC for any reason or are not able to obtain satisfactory results, they will be required to refund the full amount of the award.
Continuing Education Fund (CEF)
All CEF applicants are required to attend at least 70% of the courses before they are eligible for fee reimbursement under the CEF.
*The mother programme (Master of Data Science) of these courses is recognised under the Qualifications Framework (QF Level 6).