PROGRAMME
Symposium:
The mathematics of ranking
Wednesday October 15, 2008
- 09h30 Welcome and coffee
- 10h00 Hugo Zaragoza: Godeaux lecture: Ranking text in search of knowledge and wealth
- 11h00 Bettina Berendt: Ranking - Use and Usability
- 12h00 Lunch
- 13h30 Paul Van Dooren: Some graph optimization problems in data mining
- 14h30 Leo Egghe: Lotkaian informetrics and applications to social networks
- 15h30 Coffee
- 16h00 Gianna M. Del Corso: Evaluating scientific products by means of citation-based models
- 17h00 End
Bettina Berendt
Ranking - Use and Usability
The ultimate goal of ranking in search engines is to help people find what they are looking for and, depending on the application, to suggest things that they did not know they were looking for but might still find interesting. Any evaluation of ranking methods should therefore consider such deployment scenarios. This talk gives an overview of i) where and how ranking is used by the operators of a Web site or similar service, ii) how ranking is used by the end users of that site or service, and iii) how, and according to which criteria, this usage and the success and quality of ranking are measured.
The slides
Gianna M. Del Corso
Evaluating scientific products by means of citation-based models
Some integrated models for ranking scientific publications together with authors and journals are presented and analyzed. The models rely on certain adjacency matrices obtained from the relations of citation, authorship and publication, which combine to form a suitable irreducible stochastic matrix whose dominant eigenvector provides the ranking. Some perturbation theorems concerning the dominant eigenvector of nonnegative irreducible matrices are proved. These theoretical results provide a validation of the consistency and effectiveness of our models. Several paradigmatic examples are reported together with some results obtained on a real set of data.
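As a rough illustration of ranking by a dominant eigenvector, the sketch below runs power iteration on a small row-stochastic matrix. The 3-by-3 matrix, the function name and the tolerance are assumptions made for this example only; they are not the speaker's model or data. The matrix is chosen to be irreducible and aperiodic, so the iteration converges.

```python
def dominant_left_eigenvector(P, tol=1e-12, max_iter=10_000):
    """Power iteration for the left eigenvector x with x P = x.

    P is a row-stochastic matrix given as a list of rows.
    Assumes P is irreducible and aperiodic; otherwise the
    iteration may fail to converge.
    """
    n = len(P)
    x = [1.0 / n] * n  # start from the uniform distribution
    for _ in range(max_iter):
        # One step of x <- x P, then renormalise to sum to 1.
        y = [sum(x[i] * P[i][j] for i in range(n)) for j in range(n)]
        total = sum(y)
        y = [v / total for v in y]
        if sum(abs(a - b) for a, b in zip(x, y)) < tol:
            return y
        x = y
    raise RuntimeError("power iteration did not converge")


# Hypothetical transition matrix between three items (rows sum to 1).
P = [
    [0.0, 0.5, 0.5],
    [1.0, 0.0, 0.0],
    [0.5, 0.5, 0.0],
]

scores = dominant_left_eigenvector(P)
# Items sorted from highest to lowest score.
ranking = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
```

For this toy matrix the stationary vector can be checked by hand to be (4/9, 3/9, 2/9), so item 0 ranks first.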
The slides
Leo Egghe
Lotkaian informetrics and applications to social networks
Growth of the internet is illustrated as an example of 1-dimensional informetrics. Then we define 2-dimensional informetrics with sources and items, leading to an Information Production Process (IPP), and examples are given. We introduce the notions of size-frequency function and rank-frequency function. When the size-frequency function is a power law, we say that the system satisfies the law of Lotka. Lotka's law is equivalent to the law of Zipf. Examples in websites are given. The scale-free property of Lotkaian systems is highlighted, together with its consequence that Lotkaian IPPs can be interpreted as self-similar fractals, thereby illustrating the important role of the Lotkaian exponent.
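For orientation, the two laws can be written in the form commonly used in the informetrics literature. The symbols C, B, α and β are notation assumed here, not taken from the abstract: f(j) is the density of sources with j items, and g(r) is the number of items in the source at rank r.

```latex
% Lotka's law (size-frequency function), with exponent \alpha > 1:
f(j) = \frac{C}{j^{\alpha}}, \qquad j \ge 1

% Zipf's law (rank-frequency function):
g(r) = \frac{B}{r^{\beta}}, \qquad \beta = \frac{1}{\alpha - 1}

% Scale-free property: rescaling j by a factor \lambda > 0
% only multiplies f by a constant.
f(\lambda j) = \lambda^{-\alpha} f(j)
```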
Dynamic aspects of Lotkaian IPPs are given via the study of the effect of transformations on the sources and on the items. Transformation formulae are given for the size- and rank-frequency functions with applications. With the law of Lotka we can also give a model for the cumulative first-citation distribution and show practical applications.
We introduce the h-index and the g-index, give examples of their calculation on a list of papers ranked by number of citations, and show their use in the evaluation of a scientist's career and of a meta-author. Advantages of the g-index over the h-index are given. When Lotka's law holds, we are able to present formulae for the h- and g-index. Using these formulae and the transformations described above, we are able to predict the effect of "publicitis" on the value of the h-index. A website is shown where one can, in the context of Google Scholar, calculate the h-index, the g-index and other indices of an author.
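The calculation on a ranked citation list can be sketched as follows, using the usual definitions: h is the largest rank such that the paper at that rank has at least h citations, and g is the largest rank such that the top g papers together have at least g squared citations. The function names and the sample citation counts are made up for illustration, and this sketch assumes g is capped at the number of papers.

```python
def h_index(citations):
    """Largest h such that h papers have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank
        else:
            break
    return h


def g_index(citations):
    """Largest g such that the top g papers total at least g*g citations.

    Assumption: g cannot exceed the number of papers in the list.
    """
    ranked = sorted(citations, reverse=True)
    g = 0
    cumulative = 0
    for rank, count in enumerate(ranked, start=1):
        cumulative += count
        if cumulative >= rank * rank:
            g = rank
    return g


# Hypothetical citation counts for one author's papers.
sample = [25, 8, 5, 3, 3, 1, 0]
h = h_index(sample)  # 3: the fourth paper has only 3 citations
g = g_index(sample)  # 6: the top six papers total 45 >= 36 citations
```

In this sample the single highly cited paper raises g but not h, which is one way the two indices differ.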
The slides
Paul Van Dooren
Some graph optimization problems in data mining
Graph-theoretic ideas have become very useful in understanding modern large-scale data mining techniques. We show in this talk that ideas from optimization are also quite useful to better understand the numerical behavior of the corresponding algorithms. We illustrate this claim by looking at two specific graph-theoretic problems and their application in data mining.
The first problem is that of reputation systems, where the reputations of objects and voters on the web are estimated; the second problem is that of estimating the similarity of nodes of large graphs. These two problems are also illustrated using concrete applications in data mining.
The slides
Hugo Zaragoza
Godeaux lecture: Ranking text in search of knowledge and wealth
Search engines play a major role in the success and growth of the Web. In doing so they shape the web in all kinds of ways: creating new business models, modifying content creation practices, supporting new forms of user interaction, and magnifying ethical and social issues. But, at the core, search engines are incredibly fast ranking machines using relatively simple ranking algorithms. In the first part of my talk I will give an overview of the different "ranking problems" that search engines need to solve today, and the methods used to tackle them. One of the most crucial elements of search engines today is their ability to process the text in web pages to match future queries. In the second part of the talk I will concentrate on "text ranking" problems, from simple keyword-based query retrieval problems to more sophisticated problems where text is parsed and labelled semantically in the hope of constructing better ranking algorithms.
The slides
The Godeaux lecture is organized at least once every two years during a BMS event. These lectures honoring the memory of Lucien Godeaux are organized with the assets of the Belgian Center for Mathematical Studies which were transferred to the BMS after the dissolution of this Center. Lucien Godeaux (1887-1975) was one of the world’s most prolific mathematicians (with 644 papers published) and took many initiatives to encourage young mathematicians to communicate their research. He was the founder of the Belgian Center for Mathematical Studies in 1949.

