CSE 291-F: Graph Mining and Network Analysis

Course Overview

General Info: Graduate-level course, CSE Dept., UC San Diego, Spring quarter 2017

Lecture hours: Tuesday and Thursday, 18:30 - 19:50
Lecture room: PCYNH 106

Instructor: Fragkiskos Malliaros
Email: fmalliaros [at] eng.ucsd.edu
Office hours: TBA

TA: Mohammad Motiei
Email: mmotiei [at] eng.ucsd.edu
Office hours: TBA

Piazza: piazza.com/ucsd/spring2017/cse291f/home


Networks (or graphs) have become ubiquitous as data from diverse disciplines can naturally be mapped to graph structures. Social networks, such as academic collaboration networks and interaction networks over online social networking applications are used to represent and model the social ties among individuals. Information networks, including the hyperlink structure of the Web and blog networks, have become crucial mediums for information dissemination, offering an effective way to represent content and navigate through it. A plethora of technological networks, including the Internet, power grids, telephone networks and road networks are an important part of everyday life. The problem of extracting meaningful information from large scale graph data in an efficient and effective way has become crucial and challenging with several important applications and towards this end, graph mining and analysis methods constitute prominent tools. The goal of this course is to present recent and state-of-the-art methods and algorithms for exploring, analyzing and mining large-scale networks, as well as their practical applications in various domains (e.g., social science, the web, biology).





Schedule and Lectures

The topics of the lectures are subject to change (the following schedule outlines the topics that will be covered in the course). The slides for each lecture will be posted on piazza just before the start of the class.


Week Date Topic Material Assignments/Project
1April 4IntroductionLecture 1
April 6Graph theory, linear algebra and ML basics recapLecture 2
2April 11Network properties and the random graph model
April 13Network generative models
3April 18Random walks and centrality criteriaAssignment 1 out
April 20Link analysis algorithms (PageRank and HITS)
4April 25Project proposal short presentations (all teams)Project proposal slides due on April 24
April 27No class. Traveling to SDM 2017Project proposal due
5May 2Graph clustering and community detection (Part I)Assignment 1 due
May 4 Graph clustering and community detection (Part II)
6May 9 Node similarity and link prediction Assignment 2 out
May 11Graph similarity, graph kernels and graph classification
7May 16 Representation learning in graphs
May 18Influential spreaders and influence maximization
8May 23Anomaly detection Project progress report due
May 25Graph sampling and summarization
9May 30Dense subgraph detection and applications Assignment 2 due
June 1Rich network structures: signed, uncertain, multilayer graphs, geo-social and textual networks
10June 6Core decomposition in graphs
June 8Project presentations or poster session. The date is subject to change based on student's availabilityProject final report due on June 10



Lecture 1: Introduction

Introduction to graph mining and network analysis, administrivia, course structure and overview of the topics that will be covered in the course.

Reading:

Lecture 2: Graph theory, linear algebra and ML basics recap

Presentation of basic concepts in graph theory, linear algebra and machine learning that will be used throughout the course.

Reading:



Course Structure

Learning objectives

The course aims to introduce students to the field of graph mining and network analysis by:
  • Covering a wide range of topics, methodologies and related applications.
  • Giving the students the opportunity to obtain hands-on experience on dealing with graph data and graph mining tasks.
We expect that by the end of the course, the students will have a thorough understanding of various graph mining and learning tasks, will be able to analyze large-scale graph data as well as to formulate and solve problems that involve graph structures.


Prerequisites

There is no official prerequisite for this course. However, the students are expected to:

  • Have basic knowledge of graph theory and linear algebra.
  • Be familiar with fundamental data mining and machine learning tasks (e.g., CSE 258).
  • Be familiar with at least one programming language (e.g., Python or any language of their preference).
In the second lecture, we will review basic concepts in graph theory, linear algebra and machine learning.


Reading material

Most of the material of the course is based on research articles. Some of the topics are also covered by the following books:



Evaluation

The evaluation of the course will be based on the following:

  1. Two hands-on practical assignments: the hands-on assignments will familiarize the students with basic graph mining and analysis tasks.
  2. Project: this will be the main component for the evaluation of the course. The students are expected to form groups of 3-4 people, propose a topic for their project, and submit a final project report (it would also be interesting to organize a poster session at the end of the quarter). Please, read the project section for more details.

The grading will be as follows:

Assignment 1 (individual): 10%
Assignment 2 (groups of 3-4 students): 30%
Project (groups of 3-4 students): 60%


Academic integrity

UCSD and CSE's policies on academic integrity will be strictly enforced (please see here and here). In particular, all of your work must be your own. Some relevant excerpts from UCSD's policies are:

  • Don't copy another student's assignment, in part or in total, and submit it as your own work.
  • Acknowledge and cite source material in your papers or assignments.




Project

Details about the project will be posted here soon.




Resources

Datasets


Software tools

  • NetworkX: Python software package for graph analytics
  • igraph: collection of software packages for graph theory and network analysis (Python, C++ and R)
  • SNAP: high performance system for the analysis of large network (C++ and Python)
  • Gephi: graph visualization and exploration software