My current research focuses on hypergraph-based learning models for multimodal datasets (e.g., text, image, and video), with applications in online news reporting and social media analysis. By extracting multimodal features with natural language processing (NLP) and computer vision (CV) techniques, I analyze the media’s responsibility for biased representations of social minorities, focusing on gender, race, migrants, and refugees. I have also worked on Bayesian deep learning for time series forecasting and on convolutional neural networks (CNNs) for studying the neural mechanisms of political ideology using fMRI brain data.
BEYOND PAIRWISE RELATIONSHIP: HYPERGRAPH
AS A NEW GRAPH-BASED PARADIGM IN POLITICAL NETWORKS
Multimodal datasets contain a huge amount of information from diverse modalities, such as text, audio, and video. The big data revolution and advances in artificial intelligence have created unprecedented opportunities for political scientists to study large collections of multimedia objects. A growing number of scholars recognize the need for data-driven methods that combine diverse unstructured data such as text, audio, and video in a unified model and encode the complex relationships among them. To address this challenge, this project introduces the hypergraph as a new graph-based paradigm for handling multimodal datasets and modeling rich patterns of complex relationships among multimedia objects. Specifically, I introduce three hypergraph-based learning models and applications ranging from political communication to legislative politics.
A PICTURE IS WORTH A THOUSAND WORDS.
MACHINE-LEARNING VISUAL FRAMING ANALYSIS
This paper presents an automated, unsupervised machine learning method that jointly explores word phrases and visual features of photographs to measure media bias in contemporary media sources. I develop a scalable hypergraph-regularized tensor decomposition that maps multimedia items stored in a third-order tensor into a low-dimensional semantic space to uncover hidden topic structures in media coverage. Analyzing 173,204 articles with news photographs from 145 online newspapers for political bias in news reporting about abortion and immigration, my method examines the patterns of news reporting on the visual and verbal levels and identifies politically charged phrases and visual characteristics.
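For readers unfamiliar with this family of models, one common way to write a hypergraph-regularized CP (CANDECOMP/PARAFAC) decomposition is sketched below. This is a generic textbook-style form under stated assumptions, not necessarily the exact objective used in the paper: the tensor modes (articles, phrases, visual features), the rank R, the weight λ, and the choice to regularize only the article factor A are all illustrative assumptions.

```latex
% Generic sketch (assumed form, not the paper's exact objective).
% \mathcal{X} \in \mathbb{R}^{I \times J \times K}: articles x phrases x visual features (assumed modes)
% A \in \mathbb{R}^{I \times R},\; B \in \mathbb{R}^{J \times R},\; C \in \mathbb{R}^{K \times R}: factor matrices
% L \in \mathbb{R}^{I \times I}: hypergraph Laplacian over articles; \lambda \ge 0: regularization weight
\min_{A,B,C}\;
\Bigl\| \mathcal{X} - \sum_{r=1}^{R} a_r \circ b_r \circ c_r \Bigr\|_F^2
\;+\; \lambda \,\operatorname{tr}\!\bigl(A^{\top} L A\bigr)
```

Here a_r, b_r, and c_r denote the r-th columns of A, B, and C, and the trace term encourages articles that share hyperedges to receive similar low-dimensional representations.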
IMAGE WITH TEXT:
MULTIMODAL FRAMING ANALYSIS OF ONLINE NEWS COVERAGE
ON THE EUROPEAN REFUGEE CRISIS
In news reporting about conflict and crisis, photographs convey stories and generate emotions that words cannot always deliver. While the rapid growth of online news photographs has created unprecedented research opportunities, quantitative approaches that deal with the volume, variety, and complexity of both images and texts have lagged behind in social science. To address this gap, this paper introduces a new method for quantitative framing research that examines the patterns of news reporting on the visual and verbal levels and explores image-text relations in news stories. Specifically, I introduce the hypergraph as a new graph-based method to integrate various types of data and model their complex relationships in a network. Unlike conventional graph structures, which capture exclusively dyadic relationships, a hypergraph represents relationships among more than two objects at once and thus better conveys the underlying geometric structure of various data types. Using a hypergraph, I develop a hypergraph-regularized topic model that fuses visual, textual, and other multimedia features simultaneously to find the latent topic representation in media coverage of the European refugee crisis in 2015. Using this new topic model, I confirm that the visual and textual representation of the refugee crisis varies between right-biased and left-biased news sources. This paper aims to provide scholars with a new quantitative tool to investigate not only which news stories get mentioned but also how they are described visually and verbally.
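To make the contrast with dyadic graphs concrete, the following minimal sketch builds a vertex-by-hyperedge incidence matrix and the normalized hypergraph Laplacian in its widely used form, L = I - Dv^{-1/2} H W De^{-1} H^T Dv^{-1/2}. The toy items and hyperedges are hypothetical, and the paper's own construction of hyperedges from images and text may differ.

```python
import numpy as np

def hypergraph_laplacian(n_vertices, hyperedges, weights=None):
    """Return (L, H): normalized hypergraph Laplacian and incidence matrix.

    Assumes every vertex belongs to at least one hyperedge, so that all
    vertex degrees are positive.
    """
    n_edges = len(hyperedges)
    H = np.zeros((n_vertices, n_edges))
    for j, members in enumerate(hyperedges):
        H[list(members), j] = 1.0  # vertex i belongs to hyperedge j

    w = np.ones(n_edges) if weights is None else np.asarray(weights, dtype=float)
    d_v = H @ w            # weighted vertex degrees
    d_e = H.sum(axis=0)    # hyperedge sizes
    dv_inv_sqrt = 1.0 / np.sqrt(d_v)

    # Theta = Dv^{-1/2} H W De^{-1} H^T Dv^{-1/2}
    theta = (dv_inv_sqrt[:, None] * H) @ np.diag(w / d_e) @ (H.T * dv_inv_sqrt[None, :])
    return np.eye(n_vertices) - theta, H

# Hypothetical toy example: five multimedia items, three hyperedges.
# A single hyperedge can join three items at once, which an ordinary
# graph edge cannot do.
toy_edges = [{0, 1, 2}, {2, 3}, {1, 3, 4}]
L, H = hypergraph_laplacian(5, toy_edges)
```

The resulting L is symmetric and positive semidefinite, which is what allows it to act as a smoothness penalty in regularized topic or factorization models.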
FUNCTIONAL CONNECTIVITY SIGNATURES OF POLITICAL IDEOLOGY
Seo-Eun Yang, James Wilson, Zhong-Lin Lu, and Skyler Cranmer
Published in the Proceedings of the National Academy of Sciences of the United States of America (PNAS) Nexus
Paper link: https://doi.org/10.1093/pnasnexus/pgac066
Emerging research has begun investigating the neural underpinnings of the biological and psychological differences that drive political ideology, attitudes, and actions. Here we explore the neurological roots of politics by conducting a large-sample, whole-brain analysis of functional connectivity (FC) across common fMRI tasks. Using convolutional neural networks, we develop predictive models of ideology using FC from fMRI scans for nine standard task-based settings in a novel cohort of healthy adults (n = 174, age range: 18-40, mean = 21.43) from the Ohio State University Wellbeing Project. Our analyses suggest that liberals and conservatives have noticeable and discriminative differences in functional connectivity that can be identified with high accuracy using contemporary artificial intelligence methods, and that such analyses complement contemporary models relying on socio-economic and survey-based responses. Functional connectivity signatures from retrieval, empathy, and monetary reward tasks are identified as important and powerful predictors of conservatism, and activations of the amygdala, inferior frontal gyrus, and hippocampus are most strongly associated with political affiliation. Although the direction of causality is unclear, this study suggests that the biological and neurological roots of political behavior run much deeper than previously thought.
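As a rough illustration of the input to such models, functional connectivity is often summarized as the matrix of Pearson correlations between regional time series. The sketch below uses simulated signals and a hypothetical four-region layout; it is not the preprocessing pipeline of the paper, which may use different parcellations and connectivity measures.

```python
import numpy as np

def functional_connectivity(ts):
    """Pearson correlation between regions.

    ts has shape (n_timepoints, n_regions); the result has shape
    (n_regions, n_regions).
    """
    return np.corrcoef(ts, rowvar=False)

# Simulated data (an assumption for illustration): regions 0 and 1 share
# a common signal, regions 2 and 3 are independent noise.
rng = np.random.default_rng(0)
shared = rng.standard_normal(200)
ts = rng.standard_normal((200, 4))
ts[:, 0] += 2.0 * shared
ts[:, 1] += 2.0 * shared

fc = functional_connectivity(ts)
```

A matrix like fc can be treated as a single-channel image, which is one reason convolutional architectures are a natural fit for this kind of data.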
BAYESIAN DEEP LEARNING FOR IDENTIFYING GRANGER CAUSAL GRAPHS AND FORECASTING POLITICAL DYNAMICS
Co-authored with Skyler Cranmer and Caleb Pomeroy
Time series modeling and conflict forecasting in political science have traditionally relied on regression models of various types with parametric assumptions. These pre-specified regression models still face limitations in many empirical applications. First, classical Bayesian time series models do not scale, even though the advent of big data now offers many alternative ways to forecast conflicts by extracting insights from massive, high-quality data. Second, in many cases it is not possible to identify a suitable forecasting model for a particular time series beforehand, because our domain knowledge is lacking or incomplete. To address these challenges, we propose Bayesian scalable causal graph learning (BSCGL). BSCGL models learn the form of the mapping function between input and output directly from data and capture nonlinearities that traditional linear and nonlinear statistical models cannot fully represent. Thus, more complex relationships between time series can be discovered without relying on domain knowledge or distributional assumptions, resulting in better in-sample and out-of-sample predictive performance. Our proposed model also discovers nonlinearities in the underlying Granger causal mechanisms in time series.
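For context, the classical linear notion of Granger causality that BSCGL generalizes can be sketched as a comparison of two lagged regressions: one predicting a series from its own past, and one that also includes the past of a candidate cause. The code below is that traditional baseline on simulated data, not the BSCGL model; the function names, lag order, and simulation parameters are illustrative assumptions.

```python
import numpy as np

def lagged_design(series_list, p):
    """Design matrix with an intercept and lags 1..p of each series."""
    n = len(series_list[0])
    cols = [np.ones(n - p)]
    for s in series_list:
        for lag in range(1, p + 1):
            cols.append(s[p - lag:n - lag])
    return np.column_stack(cols)

def rss(X, target):
    """Residual sum of squares of an ordinary least squares fit."""
    beta = np.linalg.lstsq(X, target, rcond=None)[0]
    resid = target - X @ beta
    return float(resid @ resid)

def granger_f_stat(cause, effect, p=2):
    """F statistic for H0: lags of `cause` add nothing beyond lags of `effect`."""
    target = effect[p:]
    X_restricted = lagged_design([effect], p)
    X_full = lagged_design([effect, cause], p)
    rss_r, rss_f = rss(X_restricted, target), rss(X_full, target)
    df_den = len(target) - X_full.shape[1]
    return ((rss_r - rss_f) / p) / (rss_f / df_den)

# Simulated example: x drives y with a one-step lag, but not the reverse.
rng = np.random.default_rng(1)
n = 500
x = rng.standard_normal(n)
y = np.zeros(n)
for t in range(1, n):
    y[t] = 0.5 * y[t - 1] + 0.8 * x[t - 1] + 0.5 * rng.standard_normal()

f_x_to_y = granger_f_stat(x, y)
f_y_to_x = granger_f_stat(y, x)
```

In this simulation the statistic for x to y should be far larger than the one for y to x. A linear test of this kind misses nonlinear dependence, which is the limitation the proposed model is meant to address.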
MEASURING STATES' POLICY ATTENTION AND AGENDA-SETTING POWER IN THE UNITED NATIONS GENERAL ASSEMBLY (UNGA)
The United Nations General Assembly (UNGA) provides a great deal of information and insight about the political implications of change in the UN Security Council. Every year, the UN generates thousands of publications and documents such as draft resolutions, annual reports, meeting records, agendas, vote records, and lists of participants. Currently, the UN offers one million digital links that contain bibliographic metadata records and text-heavy data. Most of this rapid growth comes from unstructured data in the wild, such as text, rather than from structured information stored in databases. A core research challenge is how to turn such massive unstructured data into structured knowledge and integrate structured and unstructured data in a unified model. While the explosion of UN documents has created unprecedented research opportunities for international relations (IR) scholars, quantitative approaches that deal with the volume, variety, and complexity of such data remain scarce. To address this, this paper proposes a scalable data-driven framework and a feasible machine learning technique applied to a total of 427,253 draft resolutions for 193 countries collected over 70 years. The goal of this project is to (1) construct a text-rich information network connecting heterogeneous types of document-associated entities, such as the textual contents of draft resolutions, the member states listed as sponsors or co-sponsors, and the issue dates, and (2) measure states’ underlying policy attention across a large number of issues over 70 years. My method advances the measurement of member states’ issue attention and agenda-setting power in the UNGA by mapping text-rich information networks into latent policy spaces.
Using this new technique, we can identify the issues to which member states pay attention over time and their agenda-setting power in various policy dimensions such as national security and human rights.
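The text-rich information network described above can be pictured as a typed network in which each draft resolution is linked to its sponsors, its issue year, and its terms. The following is a minimal sketch under stated assumptions: the records, field names, and placeholder state labels are hypothetical toy data, not real UN documents, and the actual framework maps such a network into latent policy spaces rather than stopping at the adjacency structure.

```python
from collections import defaultdict

# Hypothetical toy records (not real UN data); field names are assumptions.
drafts = [
    {"id": "D1", "year": 1990, "sponsors": ["A", "B"], "text": "nuclear disarmament treaty"},
    {"id": "D2", "year": 1990, "sponsors": ["B", "C"], "text": "human rights council"},
    {"id": "D3", "year": 1991, "sponsors": ["A"], "text": "nuclear test ban"},
]

def build_network(records):
    """Link each draft to its sponsors, year, and terms.

    Nodes are (type, name) tuples, so entities of different types never
    collide; the result maps each node to the set of its neighbors.
    """
    adj = defaultdict(set)
    for rec in records:
        doc = ("draft", rec["id"])
        neighbors = [("state", s) for s in rec["sponsors"]]
        neighbors.append(("year", rec["year"]))
        neighbors += [("term", w) for w in set(rec["text"].split())]
        for nb in neighbors:
            adj[doc].add(nb)
            adj[nb].add(doc)
    return adj

network = build_network(drafts)
drafts_by_state_a = network[("state", "A")]
drafts_about_nuclear = network[("term", "nuclear")]
```

Looking up a state node returns the drafts it sponsored, and looking up a term node returns the drafts that mention it, which gives a simple starting point for relating sponsorship to issue content.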