The growing interest in network data analysis and inference parallels the importance of these data in the modern digital world. Together with recent developments in modeling and data processing, this makes many aspects of network data, including modeling, inference, and visualization, of interest from a statistical point of view.
The goal of these courses is to present the main concepts and ideas of network analysis, sparse inference, high-dimensional data and graphical models. The theoretical developments of the presented methods are motivated and illustrated by real-world applications in various fields.
One of the main recent developments in network analysis gave rise to the web platform linkage.fr. It is the very first approach capable of analyzing both the topology of the network (who talks to whom?) and the documents exchanged between the individuals (about what?). The methodology and the platform itself will be presented in detail.
Course 1 (M-L. Martin-Magniette): Networks: what? what for? and how?
This lecture is an introduction to networks and is composed of three parts. In the first part, the goal is to understand that networks can be seen and understood in different ways. The second part will be dedicated to the definition of global and local indicators that characterize a network, and to their interpretation in biological networks. The third part will cover how networks are built and will serve as a transition to the second lecture.
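To make the distinction between global and local indicators concrete, here is a minimal Python sketch. It is not taken from the lecture: the toy graph, the function names, and the choice of indicators (density as a global indicator, degree and local clustering coefficient as local ones) are illustrative assumptions.

```python
# Toy undirected network stored as an adjacency dictionary.
# The data are hypothetical and serve only as an illustration.
graph = {
    "A": {"B", "C"},
    "B": {"A", "C"},
    "C": {"A", "B", "D"},
    "D": {"C"},
}


def degree(g, node):
    """Local indicator: number of neighbors of `node`."""
    return len(g[node])


def local_clustering(g, node):
    """Local indicator: fraction of pairs of neighbors of `node` that are linked."""
    neighbors = list(g[node])
    k = len(neighbors)
    if k < 2:
        return 0.0  # convention used here when fewer than two neighbors
    links = sum(
        1
        for i in range(k)
        for j in range(i + 1, k)
        if neighbors[j] in g[neighbors[i]]
    )
    return 2.0 * links / (k * (k - 1))


def density(g):
    """Global indicator: observed edges divided by the number of possible edges."""
    n = len(g)
    m = sum(len(nb) for nb in g.values()) // 2  # each edge is counted twice
    return 2.0 * m / (n * (n - 1))


for node in sorted(graph):
    print(node, degree(graph, node), round(local_clustering(graph, node), 3))
print("density:", round(density(graph), 3))
```

In this toy example node C has the highest degree but a low clustering coefficient, because its neighbor D is not connected to A or B. This is the kind of contrast that local indicators are meant to reveal.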
Course 2 (S. Robin): Network inference and graphical models
Networks are mostly used to depict interactions (links) between entities (nodes). In many situations, the network is not directly observed and needs to be inferred from observations collected on each node. Network inference can then be rephrased in terms of graphical models. In this framework, the joint distribution of the data collected on all nodes is assumed to factorize along the cliques of a certain graph G, which encodes the dependency structure between the nodes. Inferring G is sometimes called ‘structure inference’. Several graphical models, from Gaussian graphical models (GGM) to more complex ones, will be introduced, as well as different strategies to infer G. This problem turns out to raise both theoretical and computational issues.
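As a hedged illustration of the factorization mentioned above, the standard form for an undirected graphical model can be written as follows. The notation (potentials \psi_C, normalizing constant Z, precision matrix \Omega, edge set E(G)) is an assumption made here for illustration and may differ from the notation used in the lecture.

```latex
% Factorization of the joint distribution along the cliques of G
p(x_1, \dots, x_p) \;=\; \frac{1}{Z} \prod_{C \in \mathcal{C}(G)} \psi_C(x_C)

% Gaussian graphical model: X \sim \mathcal{N}(\mu, \Sigma), with \Omega = \Sigma^{-1}
(j, k) \notin E(G)
\;\Longleftrightarrow\;
\Omega_{jk} = 0
\;\Longleftrightarrow\;
X_j \perp X_k \mid X_{\setminus \{j,k\}}
```

In the Gaussian case, inferring G therefore amounts to recovering the zero pattern of the precision matrix \Omega, which is why sparse estimation appears naturally in this setting.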
Course 3 (C. Bouveyron): Clustering of networks and applications
One of the most important uses of statistical network modeling is the clustering of network nodes. This is of great interest, for instance, in Marketing to form homogeneous groups of customers, in Biology to understand the roles of genes in regulation networks, or in Defense to identify groups of persons with abnormal behaviors. In this course, we will introduce the main techniques (modularity, SBM, LSCM, ...) to cluster the elements of a network. We will also practice these techniques on real-world data sets using the R software.
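Of the techniques listed above, modularity is the simplest to illustrate. The following Python sketch (the course itself uses R) computes the usual Newman-Girvan modularity of a given partition of an undirected, unweighted network. The edge list, the partitions, and the function name are hypothetical; the sketch only scores a partition and does not search for the best one.

```python
# Hypothetical toy network: two triangles joined by a single bridge edge (2, 3).
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]

# Candidate partitions: node -> cluster label.
two_clusters = {0: 0, 1: 0, 2: 0, 3: 1, 4: 1, 5: 1}
one_cluster = {node: 0 for node in range(6)}


def modularity(edge_list, membership):
    """Modularity Q = sum over clusters c of (L_c / m) - (d_c / (2 m))^2,
    where m is the total number of edges, L_c the number of edges inside
    cluster c, and d_c the sum of the degrees of the nodes in cluster c."""
    m = len(edge_list)
    internal = {}    # L_c
    degree_sum = {}  # d_c
    for u, v in edge_list:
        cu, cv = membership[u], membership[v]
        degree_sum[cu] = degree_sum.get(cu, 0) + 1
        degree_sum[cv] = degree_sum.get(cv, 0) + 1
        if cu == cv:
            internal[cu] = internal.get(cu, 0) + 1
    return sum(
        internal.get(c, 0) / m - (d / (2 * m)) ** 2
        for c, d in degree_sum.items()
    )


print(modularity(edges, two_clusters))  # 5/14, about 0.357
print(modularity(edges, one_cluster))   # 0.0
```

The partition that separates the two triangles scores higher than the trivial single-cluster partition, which matches the intuition that modularity rewards groups with more internal edges than expected by chance. Model-based approaches such as the SBM instead define clusters through a probabilistic model of the edges.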
Course 4 (P. Latouche): Recent advances in network analysis
In this course, we will consider recent advances in the analysis of networks. In particular, we will consider statistical models designed for analyzing dynamic networks, networks with overlapping clusters, and networks with textual edges. The R packages associated with the introduced methodologies will be used. The course will be illustrated with examples from real-world problems in social networks, biology, and historical sciences. Finally, the platform linkage.fr will be presented along with the algorithm at the core of the approach. This platform is the very first capable of analyzing communication networks where documents are exchanged between nodes. We will illustrate the relevance of the approach through two applications. The first analyzes the activity of French Twitter accounts before the last French presidential election. In the second, linkage is used to uncover the roots of the Enron scandal.