Statistical Mechanics for Modeling and Prediction of Human Behavior

Micro-dynamics, model selection and lab-in-the-field experiments

Description

StatMech2Pred is an ambitious project that aims to understand human behavior at the individual and societal scales by developing new prediction and observation techniques based on statistical mechanics. During the past 15 years we have witnessed a remarkable increase in both the scale and scope of the social and behavioral data available. This wealth of data has not only opened up the possibility of understanding social systems in an unprecedented manner but has also emphasized the need for new ways to observe, model and predict human behavior at micro and macro scales. From this perspective, tools built upon the principles of statistical mechanics are natural candidates for capturing human-related phenomena.

Importantly, most of the recent advances in this area have been descriptive, and although many models have been proposed, the tools and data behind these models lack the accuracy to be predictive and prescriptive. Current models can anticipate neither the behavior of individuals nor the dynamics of the system as a whole. Hypothesizing about different scenarios therefore remains an arduous task. For example, information diffusion is usually modelled by mechanistic models borrowed from biology, which do not describe the complex and context-dependent dynamics of information sharing. For models to become predictive we need: (i) better understanding of the micro-dynamics of human behavior and its non-trivial connection to contextual factors, (ii) better model selection and statistical inference tools, and (iii) better design of experiments and data collection.

StatMech2Pred aims precisely at developing the complex systems tools necessary to infer predictive models of human behavior from empirical data in different contexts, and to carry out experiments to investigate and model aspects of human behavior that are not covered by existing datasets. Our cross-disciplinary approach will not only yield more accurate models of human behavior, but also help address important macro-level problems such as financial market stability, economic growth or social inequalities.

On the methodological side we will develop tools combining network and non-network inference and model-selection approaches, the theory of critical phenomena, and stochastic processes. Specifically, we will focus on the use of statistical mechanics to: (i) develop models of human micro-dynamics, and (ii) create better tools for selecting models from empirical data. On the experimental side, we want to combine big data from online sources with data from experiments, for instance experiments based on social dilemmas. Through such experiments we will gather controlled data either to validate findings in data from online sources, or to answer specific fundamental questions related to human actions.

Finally, we will analyze empirical data and develop predictive and grounded models considering the relationship between actions and contextual factors. Specifically, we will address: (i) the human decision-making process in controlled settings, (ii) the impact of human guesses on market price changes, and (iii) the relationship between human behavior shifts and socio-economic indicators.

StatMech2Pred will draw on our previous experience in the development of mathematical tools, data analysis and the setup of controlled experiments to go beyond the current understanding of human actions, while creating tools and experimental frameworks that can serve as references in future studies of human behavior.

Highlights

Analyzing gender inequality through large-scale Facebook advertising data

Garcia, D, Kassa, YM, Cuevas, A, Cebrian, M, Moro, E, Rahwan, I, Cuevas, R

Online social media are information resources that can have a transformative power in society. While the Web was envisioned as an equalizing force that allows everyone to access information, the digital divide prevents large amounts of people from being present online. Online social media, in par...

Journal

Consistencies and inconsistencies between model selection and link prediction in networks

Valles-Catala, T, Peixoto, TP, Sales-Pardo, M, Guimera, R

A principled approach to understand network structures is to formulate generative models. Given a collection of models, however, an outstanding key task is to determine which one provides a more accurate description of the network at hand, discounting statistical fluctuations. This problem can be...

Journal

Weather impacts expressed sentiment

Baylis, P, Obradovich, N, Kryvasheyeu, Y, Chen, HH, Coviello, L, Moro, E, Cebrian, M, Fowler, JH

We conduct the largest ever investigation into the relationship between meteorological conditions and the sentiment of human expressions. To do this, we employ over three and a half billion social media posts from tens of millions of individuals from both Facebook and Twitter between 2009 and 201...

Journal

People

Roger Guimerà

Universitat Rovira i Virgili - ICREA Research Professor

Contact

roger.guimera@urv.cat

@sees_lab

Site

Marta Sales-Pardo

Universitat Rovira i Virgili - Associate Professor

Contact

marta.sales@urv.cat

@sees_lab

Site

Esteban Moro

Universidad Carlos III de Madrid - Associate Professor

Contact

emoro@math.uc3m.es

@estebanmoro

Site

Josep Perelló

Universitat de Barcelona - Associate Professor

Contact

josep.perello@ub.edu

@josperello

Site

Jordi Duch

Universitat Rovira i Virgili - Associate Professor

Contact

jordi.duch@urv.cat

@tanisjones

Miquel Montero

Universitat de Barcelona - Associate Professor

Contact

miquel.montero@ub.edu

Site

Jaume Masoliver

Universitat de Barcelona - Professor

Contact

jaume.masoliver@ub.edu

Site

Young-Ho Eom

Universidad Carlos III de Madrid - Experienced Fellow

Contact

yeom@math.uc3m.es

Site

Javier Villarroel

Universidad de Salamanca - Professor

Contact

javier@usal.es

Site

Publications

Online social media are information resources that can have a transformative power in society. While the Web was envisioned as an equalizing force that allows everyone to access information, the digital divide prevents large amounts of people from being present online. Online social media, in particular, are prone to gender inequality, an important issue given the link between social media use and employment. Understanding gender inequality in social media is a challenging task due to the necessity of data sources that can provide large-scale measurements across multiple countries. Here, we show how the Facebook Gender Divide (FGD), a metric based on aggregated statistics of more than 1.4 billion users in 217 countries, explains various aspects of worldwide gender inequality. Our analysis shows that the FGD encodes gender equality indices in education, health, and economic opportunity. We find gender differences in network externalities that suggest that using social media has an added value for women. Furthermore, we find that low values of the FGD are associated with increases in economic gender equality. Our results suggest that online social networks, while suffering evident gender imbalance, may lower the barriers that women have to access to informational resources and help to narrow the economic gender gap.
[Visit journal]
A principled approach to understand network structures is to formulate generative models. Given a collection of models, however, an outstanding key task is to determine which one provides a more accurate description of the network at hand, discounting statistical fluctuations. This problem can be approached using two principled criteria that at first may seem equivalent: selecting the most plausible model in terms of its posterior probability; or selecting the model with the highest predictive performance in terms of identifying missing links. Here we show that while these two approaches yield consistent results in most cases, there are also notable instances where they do not, that is, where the most plausible model is not the most predictive. We show that in the latter case the improvement of predictive performance can in fact lead to overfitting both in artificial and empirical settings. Furthermore, we show that, in general, the predictive performance is higher when we average over collections of models that are individually less plausible than when we consider only the single most plausible model.
[Visit journal]
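As a rough illustration of the two criteria contrasted in the abstract above, the following sketch writes them in generic Bayesian notation. The symbols are our own assumptions and are not taken from the paper: A^O denotes the observed network, M a candidate model, and A_{ij} the presence of a possibly missing link between nodes i and j.

```latex
% Criterion 1: pick the single most plausible model and predict with it alone.
M^{*} = \arg\max_{M} \, p(M \mid A^{O}),
\qquad
p(A_{ij} = 1 \mid A^{O}, M^{*})

% Criterion 2: average the link predictions over the collection of models,
% weighting each model by its posterior probability.
p(A_{ij} = 1 \mid A^{O})
  = \sum_{M} p(A_{ij} = 1 \mid A^{O}, M) \, p(M \mid A^{O})
```

The two expressions coincide only when the posterior is concentrated on a single model, which is one way to read why averaging can outperform the single most plausible model.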
We conduct the largest ever investigation into the relationship between meteorological conditions and the sentiment of human expressions. To do this, we employ over three and a half billion social media posts from tens of millions of individuals from both Facebook and Twitter between 2009 and 2016. We find that cold temperatures, hot temperatures, precipitation, narrower daily temperature ranges, humidity, and cloud cover are all associated with worsened expressions of sentiment, even when excluding weather-related posts. We compare the magnitude of our estimates with the effect sizes associated with notable historical events occurring within our data.
[Visit journal]
Despite the huge interest in network resilience to stress, most of the studies have concentrated on internal stress damaging network structure (e.g., node removals). Here we study how networks respond to environmental stress deteriorating their external conditions. We show that, when regular networks gradually disintegrate as environmental stress increases, disordered networks can suddenly collapse at critical stress with hysteresis and vulnerability to perturbations. We demonstrate that this difference results from a trade-off between node resilience and network resilience to environmental stress. The nodes in the disordered networks can suppress their collapses due to the small-world topology of the networks but eventually collapse all together in return. Our findings indicate that some real networks can be highly resilient against environmental stress to a threshold yet extremely vulnerable to the stress above the threshold because of their small-world topology.
[Visit journal]
Mental disorders have an enormous impact on our society, both in personal terms and in the economic costs associated with their treatment. In order to scale up services and bring down costs, administrations are starting to promote social interactions as key to care provision. We analyze quantitatively the importance of communities for effective mental health care, considering all community members involved. By means of citizen science practices, we have designed a suite of games that allow us to probe into different behavioral traits of the role groups of the ecosystem. The evidence reinforces the idea of community social capital, with caregivers and professionals playing a leading role. Yet, the cost of collective action is mainly supported by individuals with a mental condition - which unveils their vulnerability. The results are in general agreement with previous findings but, since we broaden the perspective of previous studies, we are also able to find marked differences in the social behavior of certain groups of mental disorders. We finally point to the conditions under which cooperation among members of the ecosystem is better sustained, suggesting how virtuous cycles of inclusion and participation can be promoted in a 'care in the community' framework.
[Visit journal]
This article describes and analyzes the collaborative design of a citizen science research project through co-creation. Three groups of secondary school students and a team of scientists conceived three experiments on human behavior and social capital in urban and public spaces. The study goal is to address how interdisciplinary work and attention to social concerns and needs, as well as the collective construction of research questions, can be integrated into scientific research. The 95 students participating in the project answered a survey to evaluate their perception about the dynamics and tools used in the co-creation process of each experiment, and the five scientists responded to a semi-structured interview. The results from the survey and interviews demonstrate how citizen science can achieve a "co-created" modality beyond the usual "contributory" paradigm, which usually only involves the public or amateurs in data collection stages. This type of more collaborative science was made possible by the adaptation of materials and facilitation mechanisms, as well as the promotion of key aspects in research such as trust, creativity and transparency. The results also point to the possibility of adopting similar co-design strategies in other contexts of scientific collaboration and collaborative knowledge generation.
[Visit journal]
Social networks are made out of strong and weak ties having very different structural and dynamical properties. But what features of human interaction build a strong tie? Here we approach this question from a practical way by finding what are the properties of social interactions that make ties more persistent and thus stronger to maintain social interactions in the future. Using a large longitudinal mobile phone database we build a predictive model of tie persistence based on intensity, intimacy, structural and temporal patterns of social interaction. While our results confirm that structural (embeddedness) and intensity (number of calls) features are correlated with tie persistence, temporal features of communication events are better and more efficient predictors for tie persistence. Specifically, although communication within ties is always bursty we find that ties that are more bursty than the average are more likely to decay, signaling that tie strength is not only reflected in the intensity or topology of the network, but also on how individuals distribute time or attention across their relationships. We also found that stable relationships have and require a constant rhythm and if communication is halted for more than 8 times the previous communication frequency, most likely the tie will decay. Our results not only are important to understand the strength of social relationships but also to unveil the entanglement between the different temporal scales in networks, from microscopic tie burstiness and rhythm to macroscopic network evolution.
[Visit journal]
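To make the notion of a "bursty" tie in the abstract above more concrete, here is a minimal Python sketch of one common burstiness measure, the coefficient B = (sigma - mu) / (sigma + mu) of the inter-event times. This is an assumption for illustration: the abstract does not say which burstiness measure the study uses, and the function name `burstiness` is hypothetical.

```python
import statistics


def burstiness(event_times):
    """Burstiness coefficient B = (sigma - mu) / (sigma + mu) of inter-event times.

    B is -1 for perfectly regular events, close to 0 for Poisson-like
    events, and approaches +1 for very bursty sequences.
    Illustrative choice of measure, not necessarily the one used in the paper.
    """
    times = sorted(event_times)
    # Gaps between consecutive communication events (e.g. calls within a tie).
    gaps = [later - earlier for earlier, later in zip(times, times[1:])]
    if len(gaps) < 2:
        raise ValueError("need at least three events")
    mu = statistics.fmean(gaps)
    sigma = statistics.pstdev(gaps)
    if mu + sigma == 0:
        raise ValueError("all events are simultaneous")
    return (sigma - mu) / (sigma + mu)


if __name__ == "__main__":
    regular = [0, 1, 2, 3, 4]                         # one call per time unit
    bursty = [0, 0.1, 0.2, 0.3, 10, 10.1, 10.2, 20]   # clusters and long silences
    print(burstiness(regular), burstiness(bursty))
```

Under this measure, a tie with evenly spaced calls scores lower than one whose calls arrive in clusters separated by long silences.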
We study financial distributions from the perspective of Continuous Time Random Walks with memory. We review some of our previous developments and apply them to financial problems. We also present some new models with memory that can be useful in characterizing tendency effects which are inherent in most markets. We also briefly study the effect on return distributions of fractional behaviors in the distribution of pausing times between successive transactions.
[Visit journal]
Leadership positions are still stereotyped as masculine, especially in male-dominated fields (e.g., engineering). So how do gender stereotypes affect the evaluation of leaders and team cohesiveness in the process of team development? In our study participants worked in 45 small teams (4–5 members). Each team was headed by either a female or male leader, so that 45 leaders (33% women) supervised 258 team members (39% women). Over a period of nine months, the teams developed specific engineering projects as part of their professional undergraduate training. We examined leaders’ self-evaluation, their evaluation by team members, and team cohesiveness at two points of time (month three and month nine, the final month of the collaboration). While we did not find any gender differences in leaders’ self-evaluation at the beginning, female leaders evaluated themselves more favorably than men at the end of the projects. Moreover, female leaders were evaluated more favorably than male leaders at the beginning of the project, but the evaluation by team members did not differ at the end of the projects. Finally, we found a tendency for female leaders to build more cohesive teams than male leaders.
[Visit journal]
In this paper, we consider a stochastic process that may experience random reset events which relocate the system to its starting position. We focus our attention on a one-dimensional, monotonic continuous-time random walk with a constant drift: the process moves in a fixed direction between the reset events, either by the effect of the random jumps, or by the action of a deterministic bias. However, the orientation of its motion is randomly determined after each restart. As a result of these alternating dynamics, interesting properties do emerge. General formulas for the propagator as well as for two extreme statistics, the survival probability and the mean first-passage time, are also derived. The rigor of these analytical results is verified by numerical estimations, for particular but illuminating examples.
[Visit journal]
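The reset dynamics described in the abstract above can be sketched with a short simulation. This is a toy version under assumptions of our own: motion between resets is pure deterministic drift (no random jumps), resets arrive as a Poisson process, and each orientation is chosen with probability 1/2 after every restart. The function names are hypothetical.

```python
import math
import random


def simulate_position(t_end, drift=1.0, reset_rate=1.0, rng=None):
    """Position at time t_end of a drifting walker with Poisson resets to the origin.

    Simplifying assumptions (ours, not the paper's): motion is purely
    deterministic drift, resets are Poissonian, and each direction is
    chosen with probability 1/2 after every restart.
    """
    rng = rng or random.Random()
    last_reset = 0.0
    direction = rng.choice((-1, 1))  # orientation of the first excursion
    while True:
        next_reset = last_reset + rng.expovariate(reset_rate)
        if next_reset >= t_end:
            break
        last_reset = next_reset
        direction = rng.choice((-1, 1))  # new orientation after the restart
    # The walker has drifted monotonically since its last restart.
    return direction * drift * (t_end - last_reset)


def mean_abs_position(t_end, drift=1.0, reset_rate=1.0):
    """Exact E|x(t_end)| for this toy model: drift * (1 - exp(-r * t)) / r."""
    return drift * (1.0 - math.exp(-reset_rate * t_end)) / reset_rate


if __name__ == "__main__":
    rng = random.Random(0)
    samples = [simulate_position(5.0, rng=rng) for _ in range(20000)]
    estimate = sum(abs(x) for x in samples) / len(samples)
    print(f"simulated E|x| = {estimate:.3f}, exact = {mean_abs_position(5.0):.3f}")
```

The exact expression follows because, for Poisson resets, the time elapsed since the last restart is distributed as an exponential variable truncated at t_end. Comparing it with the Monte Carlo estimate mirrors, in a very reduced form, the numerical verification mentioned in the abstract.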
We derive the three-dimensional telegrapher's equation out of a random walk model. The model is a three-dimensional version of the multistate random walk where the number of different states form a continuum representing the spatial directions that the walker can take. We set the general equations and solve them for isotropic and uniform walks which finally allows us to obtain the telegrapher's equation in three dimensions. We generalize the isotropic model and the telegrapher's equation to include fractional anomalous transport in three dimensions.
[Visit journal]
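For orientation, the generic form of the telegrapher's equation referred to in the abstract above can be written as follows. The symbols are assumptions for illustration (p is the probability density, tau a relaxation time, c a propagation speed); the exact coefficients derived in the paper for the three-dimensional walk are not reproduced here.

```latex
\frac{\partial^{2} p}{\partial t^{2}}
  + \frac{1}{\tau} \frac{\partial p}{\partial t}
  = c^{2} \, \nabla^{2} p

% Limiting cases:
%   tau -> infinity:                      wave equation,      p_tt = c^2 \nabla^2 p
%   tau -> 0 with D = c^2 tau held fixed: diffusion equation, p_t  = D \nabla^2 p
```

The equation thus interpolates between ballistic, wave-like propagation at short times and ordinary diffusion at long times.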
Quantum walks and random walks bear similarities and divergences. One of the most remarkable disparities affects the probability of finding the particle at a given location: typically, almost a flat function in the first case and a bell-shaped one in the second case. Here I show how one can impose any desired stochastic behavior (compatible with the continuity equation for the probability function) on both systems by the appropriate choice of time- and site-dependent coins. This implies, in particular, that one can devise quantum walks that show diffusive spreading without losing coherence as well as random walks that exhibit the characteristic fast propagation of a quantum particle driven by a Hadamard coin.
[Visit journal]
We review some extensions of the continuous time random walk first introduced by Elliott Montroll and George Weiss more than 50 years ago [E.W. Montroll, G.H. Weiss, J. Math. Phys. 6, 167 (1965)], extensions that embrace multistate walks and, in particular, the persistent random walk. We generalize these extensions to include fractional random walks and derive the associated master equation, namely, the fractional telegrapher's equation. We dedicate this review to our joint work with George H. Weiss (1930-2017). It saddens us greatly to report the recent death of George Weiss, a scientific giant and at the same time a lovely and humble man.
[Visit journal]
Craniosynostosis, the premature fusion of cranial bones, affects the correct development of the skull producing morphological malformations in newborns. To assess the susceptibility of each craniofacial articulation to close prematurely, we used a network model of the skull to quantify the link reliability (an index based on stochastic block models and Bayesian inference) of each articulation. We show that, of the 93 human skull articulations at birth, the few articulations that are associated with non-syndromic craniosynostosis conditions have statistically significant lower reliability scores than the others. In a similar way, articulations that close during the normal postnatal development of the skull have also lower reliability scores than those articulations that persist through adult life. These results indicate a relationship between the architecture of the skull and the specific articulations that close during normal development as well as in pathological conditions. Our findings suggest that the topological arrangement of skull bones might act as a structural constraint, predisposing some articulations to closure, both in normal and pathological development, also affecting the long-term evolution of the skull.
[Visit journal]