Computational Social Science blends computer science methods (namely ‘Data Science’), the social sciences (the data source), and complexity science (e.g. networks). Some of my favorite papers:
- Bose-Einstein condensation in complex networks – this work connects Bose-Einstein condensates to complex networks, a connection that can then be used to learn about the topology of networks. This application is discussed in the next paper.
- The physics of the Web – Barabási explores the various intricacies of networks, such as the basics (e.g. random vs scale-free) and how these basics relate to power outages and the internet. My favorite concept in this paper is the application of ‘fitness’. In short, the fitter a network node, the easier it is for that node to make new connections. The concept is a rich-get-richer scenario, one seen across many domains.
- Prestige drives epistemic inequality in the diffusion of scientific ideas – Clauset argues that the spread of scientific ideas is like a competition. Great ideas are more fit, so they spread more easily. Considering that ideas originate from people, Clauset et al. investigate the role of institutions. They find that “research from prestigious institutions spreads more quickly and completely than work of similar quality originating from less prestigious institutions.”
- Good Fences: The Importance of Setting Boundaries for Peaceful Coexistence – Co-authored by the brilliant Yaneer Bar-Yam, this paper accurately predicts ethnic violence based on social structures. The team found that intermediate-sized ‘patches’ of ethnicity (e.g. religion, language) were the key factor in violence. When patches are small or large, violence is minimized; intermediate sizes lead to Us-vs-Them behaviors. Bar-Yam et al. also found that distinct boundaries (e.g. rivers, mountains) help to mitigate violence. They propose three ways to resolve ethnic violence: (1) accelerate mixing, (2) accelerate separation, (3) a good fence.
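The ‘fitness’ idea from The physics of the Web can be illustrated with a small growth simulation. This is only a minimal sketch, assuming that each new node attaches to an existing node with probability proportional to that node's fitness times its degree; the function name `grow_fitness_network`, the uniform random fitness values, and the parameters are illustrative choices, not taken from the papers above.

```python
import random


def grow_fitness_network(n_nodes, m=2, seed=0):
    """Grow a network where attachment probability is proportional to fitness * degree.

    Assumptions (illustrative only): fitness is drawn uniformly from [0, 1),
    and each new node adds m links to distinct existing nodes.
    """
    rng = random.Random(seed)
    # Start from a fully connected core of m + 1 nodes (each has degree m).
    fitness = [rng.random() for _ in range(m + 1)]
    degree = [m] * (m + 1)
    edges = [(i, j) for i in range(m + 1) for j in range(i)]

    for new in range(m + 1, n_nodes):
        # Fitter and better-connected nodes are more likely to be chosen.
        weights = [f * k for f, k in zip(fitness, degree)]
        targets = set()
        while len(targets) < m:
            targets.add(rng.choices(range(new), weights=weights)[0])

        fitness.append(rng.random())
        degree.append(0)
        for target in targets:
            edges.append((new, target))
            degree[target] += 1
            degree[new] += 1
    return fitness, degree, edges


if __name__ == "__main__":
    fitness, degree, edges = grow_fitness_network(500)
    hub = max(range(len(degree)), key=degree.__getitem__)
    print(f"largest hub: node {hub}, degree {degree[hub]}, fitness {fitness[hub]:.2f}")
```

Running it with different seeds gives a feel for how a late-arriving but very fit node can still overtake older nodes, which plain degree-based attachment would not allow.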
To better familiarize myself with the techniques used in this field, I took the Coursera Computational Social Science Specialization. The specialization consists of five courses: Computational Social Science Methods; Big Data, Artificial Intelligence, and Ethics; Social Network Analysis; Computer Simulations; and Capstone Project. The capstone project comprises web scraping, social network analysis, natural language processing, and agent-based modeling. To truly test my skills, I decided to create my own projects for each challenge within the capstone project.
For the web scraping portion I created two projects. Both are centered around my passion for health and fitness.
- Cataclysm Sentence for Jiu Jitsu – No black belt on the planet can easily list the fundamentals of jiu jitsu, yet all preach the importance of the fundamentals. I used natural language processing to analyze a jiu jitsu expert’s posts in hopes of extracting the fundamentals.
- How to Measure Fitness? – CrossFit is in the business of fitness. They aim to crown the fittest men, women, and teams on the planet with the annual CrossFit Games. This project gathered data from their public leaderboard to examine that claim: are they truly testing for the fittest? I was able to show inherent bias in their methodology. To correct for this bias, I provided a new method using a distance metric.
- Testing Team Fitness with CrossFit – In this project I aimed to answer the question: is a team of fitter people necessarily a fitter team? To do this I analyzed the CrossFit Games team results and compared the outcomes with the individual scores from the Open. The Open is the annual worldwide competition, in which every individual is compared against every other. As one might expect, a team of fitter people is not necessarily a fitter team. However, the results are confounded by seasonal effects due to the nature of the sport, so further investigation is needed to decouple these effects. Results not published.
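The distance-metric ranking mentioned under How to Measure Fitness? is not spelled out here, so the following is only one possible reading of the idea, not the project's actual method. It assumes each athlete has one numeric result per event (higher is better), rescales every event to the range 0 to 1, and ranks athletes by their Euclidean distance from a hypothetical ideal athlete who wins every event. The function name `rank_by_distance` and the input layout are illustrative.

```python
import math


def rank_by_distance(scores):
    """Rank athletes by Euclidean distance to an ideal (best-in-every-event) point.

    scores: dict mapping athlete name -> list of per-event results,
            where a higher number means a better result (an assumption).
    Returns the names ordered from closest to the ideal to farthest.
    """
    names = list(scores)
    n_events = len(next(iter(scores.values())))

    # Min-max normalize each event so that no single event dominates.
    normalized = {name: [] for name in names}
    for event in range(n_events):
        column = [scores[name][event] for name in names]
        low, high = min(column), max(column)
        span = (high - low) or 1.0  # avoid dividing by zero when all tie
        for name in names:
            normalized[name].append((scores[name][event] - low) / span)

    def distance_to_ideal(name):
        return math.sqrt(sum((1.0 - x) ** 2 for x in normalized[name]))

    return sorted(names, key=distance_to_ideal)


if __name__ == "__main__":
    toy_scores = {"a": [10, 10], "b": [10, 0], "c": [0, 0]}
    print(rank_by_distance(toy_scores))
```

Unlike a purely placement-based points table, a distance like this keeps the size of the gap between athletes within an event, which is one way such a ranking could differ from a leaderboard.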
Social Network Analysis
- Performed (1) multi-class node classification, (2) link prediction, (3) community detection, and (4) network visualization on the Facebook Large Page-Page Network data set.
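As a small illustration of the link prediction task, the sketch below scores every unconnected pair of nodes by the Jaccard coefficient of their neighborhoods (shared neighbors divided by all neighbors). This is a common baseline and is only assumed here for illustration; it is not necessarily the approach used in the project, and the toy edge list is made up rather than taken from the Facebook data set.

```python
from itertools import combinations


def jaccard_link_scores(edges):
    """Score each non-adjacent node pair by neighborhood overlap.

    edges: iterable of (u, v) pairs describing an undirected graph.
    Returns a list of (score, u, v) tuples, highest score first.
    """
    neighbors = {}
    for u, v in edges:
        neighbors.setdefault(u, set()).add(v)
        neighbors.setdefault(v, set()).add(u)

    scores = []
    for u, v in combinations(sorted(neighbors), 2):
        if v in neighbors[u]:
            continue  # already linked, nothing to predict
        shared = neighbors[u] & neighbors[v]
        union = neighbors[u] | neighbors[v]
        scores.append((len(shared) / len(union), u, v))
    return sorted(scores, reverse=True)


if __name__ == "__main__":
    toy_edges = [("a", "b"), ("a", "c"), ("b", "c"), ("c", "d"), ("d", "e")]
    for score, u, v in jaccard_link_scores(toy_edges):
        print(f"{u}-{v}: {score:.2f}")
```

Pairs with the highest scores are the candidate links; on a real data set one would hide a fraction of the known edges and check how many of them the scores recover.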
Natural Language Processing
- Cataclysm Sentence for Jiu Jitsu – No black belt on the planet can easily list the fundamentals of jiu jitsu, yet all preach the importance of the fundamentals. I used natural language processing to analyze a jiu jitsu expert’s posts in hopes of extracting the fundamentals. To do this I first needed to scrape their social media posts.
- Danaher’s Jiu Jitsu Roadmap – To extract the fundamentals of jiu jitsu I used John Danaher’s social media posts. He is widely regarded as the sport’s most knowledgeable coach of the modern era. During my analysis in the above project I discovered a post where he lists 15 skills that he considers foundational to the sport. I used NLP to find relevant posts to attach to this roadmap.
- Ask Danaher (GitHub, WebApp) – Input keywords (e.g. ‘offense’, ‘defense’, ‘guard retention’, ‘ashi garami’) and the app returns the relevant posts.
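The keyword-to-posts lookup behind a tool like Ask Danaher could be sketched roughly as follows. This is a minimal illustration, assuming a simple TF-IDF style score over lowercase word tokens; the actual app may work quite differently, and the function names and the sample posts are placeholders written for this example, not real posts.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def search_posts(posts, query, top_k=3):
    """Return up to top_k posts ranked by a TF-IDF style match with the query."""
    docs = [Counter(tokenize(post)) for post in posts]
    n_docs = len(posts)

    def idf(term):
        # Smoothed inverse document frequency: rarer terms weigh more.
        doc_freq = sum(1 for doc in docs if term in doc)
        return math.log((1 + n_docs) / (1 + doc_freq)) + 1.0

    terms = tokenize(query)
    scored = []
    for index, doc in enumerate(docs):
        length = sum(doc.values()) or 1
        score = sum((doc[term] / length) * idf(term) for term in terms)
        if score > 0:
            scored.append((score, index))

    # Highest score first; ties keep the original post order.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [posts[index] for _, index in scored[:top_k]]


if __name__ == "__main__":
    sample_posts = [
        "Guard retention starts with hip movement and frames.",
        "Ashi garami is the entry point to leg locks.",
        "Good defense makes confident offense possible.",
    ]
    print(search_posts(sample_posts, "guard retention"))
```

A production version would likely add stemming, phrase handling, or embeddings, but the ranking idea stays the same: score each post against the keywords and return the best matches.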
- Minimum Conflict – In the Nautilus article ‘Is Tribalism a Natural Malfunction?’, the author Simon DeDeo discusses simulations that show generational fluctuations in cooperation. This led me to the question, ‘what is the minimum amount of conflict we can expect given a population size and a random initialization of player strategies?’. In progress.
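Since the Minimum Conflict project is still in progress, the following is only a hedged sketch of how the question could be set up, not the project's actual design. It assumes an iterated prisoner's dilemma played round-robin, three hypothetical strategies (always cooperate, always defect, tit-for-tat), and defines ‘conflict’ as the fraction of rounds in which at least one player defects. All names and parameters are illustrative.

```python
import random
from itertools import combinations

# Hypothetical strategy set; the real project may use a different one.
STRATEGIES = ("cooperate", "defect", "tit_for_tat")


def move(strategy, opponent_last):
    """Return 'C' (cooperate) or 'D' (defect) for one round."""
    if strategy == "cooperate":
        return "C"
    if strategy == "defect":
        return "D"
    # tit_for_tat: cooperate first, then copy the opponent's last move.
    return opponent_last or "C"


def conflict_rate(population, rounds=10):
    """Fraction of rounds, over all pairings, with at least one defection.

    population must contain at least two players.
    """
    conflicts = total = 0
    for a, b in combinations(population, 2):
        last_a = last_b = None
        for _ in range(rounds):
            move_a, move_b = move(a, last_b), move(b, last_a)
            conflicts += "D" in (move_a, move_b)
            total += 1
            last_a, last_b = move_a, move_b
    return conflicts / total


def sample_conflict(pop_size, trials=200, seed=0):
    """Return (minimum, mean) conflict rate over random strategy assignments."""
    rng = random.Random(seed)
    rates = [
        conflict_rate([rng.choice(STRATEGIES) for _ in range(pop_size)])
        for _ in range(trials)
    ]
    return min(rates), sum(rates) / len(rates)


if __name__ == "__main__":
    for size in (2, 5, 10, 20):
        lowest, mean = sample_conflict(size)
        print(f"population {size}: min {lowest:.2f}, mean {mean:.2f}")
```

Sweeping the population size this way gives an empirical view of how the lowest observed conflict changes as random assignments become less likely to avoid defectors entirely.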