[BDCSG2008] Algorithmic Perspectives on Large-Scale Social Network Data (Jon Kleinberg)

The talk asked two questions: how can we help social scientists do their science, and how can we build systems from the lessons learned? These topics also include the security and sensitivity of the data. He reviewed the literature, from the Karate club papers to the latest papers on social networks. Scale changes the way you approach the data. In the original studies you knew what each link meant, but large-scale networks lose this property. He is nonetheless looking for a language to express some of the analysis of social networks and their processes. Other open questions are how we bind information to each user, how we can model users, and what the security policies should be.

Diffusion in social networks is about how things propagate (even locally), but it is hard to measure how people change their minds during the diffusion process. One example is a chain-letter study in which a petition and its trace were collected. Copies can also be forwarded to mailing lists, but some of the traces can still be recovered from the lists. The paths were corrupted by mutations (typos), amputations, etc. They developed algorithms for maximum-likelihood assembly of the tree. The output was unexpected: contrary to what six degrees of separation would suggest, they found narrow, deep trees. Why would a chain letter run like a depth-first search? Time played a role: groups of friends amount to small searches, and replicated copies were basically discarded. A model that follows this time dimension was able to reproduce the trees.

Another element thrown into the mix is the diffusion threshold. A message comes in, but how many repeated inputs do you require before you validate it and pass it along? Results show that the second input is the one that provides the boost. Viral marketing is another area that wants to understand diffusion. All of this leads to multiple models and the question of how to integrate them.

Privacy in social networks is another key element. How does it play out? Is anonymization the way to go?
Even when a social network graph is anonymized, hints in its structure can lead to the deanonymization of the picture. Before the network is released, you can add actions to it, which gives you something to roll back from afterwards. The idea is to create a unique pattern and then link it to other people. You can compromise a graph with a number of nodes on the order of the square root of the log of the number of nodes in it.

Kleinberg's final reflections: toward a model of you. Models of human behavior are possible (for instance, a model of the time it takes to reply to email). Computers now track more and more information about your behavior, opening the door to new kinds of modeling (something the DISCUS project has also been postulating for the last 5 years).
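
As an aside on the diffusion threshold mentioned earlier in these notes, here is a minimal Python sketch of the idea that a person passes a message along only after receiving it from enough distinct contacts. It is not the model from the talk: the function name `threshold_cascade`, the toy `graph`, and the hard, deterministic threshold are all assumptions made for illustration, whereas the results described in the talk concern how much each additional input raises the chance of forwarding.

```python
from collections import deque


def threshold_cascade(neighbors, seeds, threshold=2):
    """Return the set of nodes that end up forwarding the message.

    neighbors: dict mapping node -> list of adjacent nodes (undirected graph)
    seeds:     nodes that forward the message unconditionally
    threshold: distinct forwarding neighbors a node needs before it forwards
               (hypothetical hard cutoff, an assumption of this sketch)
    """
    adopted = set(seeds)
    exposures = {node: 0 for node in neighbors}
    queue = deque(adopted)  # each forwarding node is processed exactly once
    while queue:
        sender = queue.popleft()
        for receiver in neighbors[sender]:
            if receiver in adopted:
                continue  # already forwarding, extra copies are ignored
            exposures[receiver] += 1
            if exposures[receiver] >= threshold:
                adopted.add(receiver)
                queue.append(receiver)
    return adopted


# Toy friendship graph (made up for this example).
graph = {
    "a": ["b", "c"],
    "b": ["a", "c", "d"],
    "c": ["a", "b", "d"],
    "d": ["b", "c", "e"],
    "e": ["d"],
}

print(sorted(threshold_cascade(graph, ["a", "b"], threshold=2)))
print(sorted(threshold_cascade(graph, ["a"], threshold=2)))
```

With a threshold of two, a single seed goes nowhere in this toy graph, while two adjacent seeds reach every node that has at least two forwarding friends. Node "e", with only one friend, never forwards the message.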