Siddarth Srinivasan

Postdoctoral Fellow

Harvard University

Hello!

I'm a Postdoctoral Fellow at Harvard University advised by Prof. Yiling Chen. My primary research interests are in machine learning, game theory, mechanism design, and public policy. Currently, I'm thinking about incentives for natural language-aided reasoning and forecasting in multi-agent settings. Some relevant keywords for topics of interest: reasoning, forecasting, persuasion, debate, voting, democratic deliberation, crowdsourcing, and of course, large language models.

Previously, I completed my Ph.D. in Computer Science at the University of Washington, where I was advised by Prof. Byron Boots. During my Ph.D., I interned at Microsoft Research Montreal, Apple, and Disney Research. I also obtained an M.S. in Mathematics from Georgia Tech and a B.S. in Physics from Harvey Mudd College. Prior to my current position, I was a postdoctoral fellow at the Brookings Institution, where I spent some time thinking about AI policy.

Feel free to reach out if you're interested in anything you see here, or if you just want to chat about research -- I enjoy (kindly) peppering people with questions to learn more about their research!

Research

Year
Title
Authors
Venue
2023
Self-Resolving Prediction Markets
Abstract PDF

Prediction markets elicit and aggregate beliefs by paying agents based on how close their predictions are to a verifiable future outcome. However, outcomes of many important questions are difficult to verify or unverifiable, in that the ground truth may be hard or impossible to access. Examples include questions about causal effects where it is infeasible or unethical to run randomized trials; crowdsourcing and content moderation tasks where it is prohibitively expensive to verify ground truth; and questions asked over long time horizons, where the delay until the realization of the outcome skews agents' incentives to report their true beliefs. We present a novel and counterintuitive result showing that it is possible to run an incentive-compatible prediction market to elicit and efficiently aggregate information from a pool of agents without observing the outcome, by paying agents the negative cross-entropy between their prediction and that of a carefully chosen reference agent. Our key insight is that a reference agent with access to more information can serve as a reasonable proxy for the ground truth. We use this insight to propose self-resolving prediction markets that terminate with some probability after every report and pay all but a few agents based on the final prediction. We show that it is a Perfect Bayesian Equilibrium for all agents to report truthfully in our mechanism and to believe that all other agents report truthfully. Although primarily of interest for unverifiable outcomes, this design is also applicable to verifiable outcomes.

Srinivasan S., Karger E., Chen Y.
Under Review
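The payment rule above can be illustrated with a small simulation. The sketch below is hypothetical and heavily simplified: the outcome is binary, agents report probabilities in sequence, the market stops with some probability after each report, and every accepted agent is paid the negative cross-entropy between their report and the final report, which stands in for the unobserved ground truth. (In the paper, the last few agents are treated separately; that detail is omitted here, and all names are illustrative.)

```python
import math
import random

def self_resolving_market(reports, stop_prob, seed=0):
    """Toy self-resolving market for a binary outcome.

    Agents arrive in order with probabilistic reports; after each
    report the market terminates with probability `stop_prob`.  Each
    accepted agent is paid the negative cross-entropy between their
    report and the final report, which serves as a proxy for the
    unobserved ground truth."""
    rng = random.Random(seed)
    accepted = []
    for p in reports:
        accepted.append(p)
        if rng.random() < stop_prob:
            break
    q = accepted[-1]  # final report acts as the reference prediction
    # Negative cross-entropy payment for each accepted report
    payments = [q * math.log(p) + (1 - q) * math.log(1 - p)
                for p in accepted]
    return accepted, payments
```

Because negative cross-entropy against the reference is maximized by matching the reference, later (better-informed) reports earn higher payments in this toy version, which is the intuition behind truthful reporting in the mechanism.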
2024
Scalable Measurement Error Mitigation via Iterative Bayesian Unfolding
Abstract PDF

Measurement error mitigation (MEM) techniques are postprocessing strategies to counteract systematic read-out errors on quantum computers (QC). Currently used MEM strategies face a tradeoff: methods that scale well with the number of qubits return negative probabilities, while those that guarantee a valid probability distribution are not scalable. Here, we present a scheme that addresses both of these issues. In particular, we present a scalable implementation of iterative Bayesian unfolding, a standard mitigation technique used in high-energy physics experiments. We demonstrate our method by mitigating QC data from experimental preparation of Greenberger-Horne-Zeilinger (GHZ) states on up to 127 qubits and from implementations of the Bernstein-Vazirani algorithm on up to 26 qubits.

Srinivasan S.*, Pokharel B.*, Quiroz G., Boots B.
(Forthcoming) Physical Review Research
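The core of iterative Bayesian unfolding is compact to state. The plain-Python sketch below (illustrative names; the paper's contribution is making this scale to many qubits, which is not reproduced here) repeatedly applies Bayes' rule with the current estimate as the prior, so the output is always a valid probability vector, unlike matrix-inversion approaches that can return negative probabilities:

```python
def iterative_bayesian_unfolding(response, measured, n_iters=50):
    """Unfold a measured distribution into an estimate of the true one.

    `response[i][j]` is P(measure outcome i | true outcome j), and
    `measured[i]` is the observed frequency of outcome i.  Each sweep
    is one application of Bayes' rule with the current estimate as
    prior, so the iterate stays a valid probability distribution."""
    n = len(measured)
    t = [1.0 / n] * n  # uniform initial guess for the true distribution
    for _ in range(n_iters):
        t_new = [0.0] * n
        for i in range(n):
            # Probability of measuring i under the current estimate
            norm = sum(response[i][l] * t[l] for l in range(n))
            if norm == 0.0:
                continue
            for j in range(n):
                # Posterior mass reallocated from measured i to true j
                t_new[j] += measured[i] * response[i][j] * t[j] / norm
        t = t_new
    return t
```

For a two-outcome read-out with a 10% symmetric flip rate, a measured distribution of [0.66, 0.34] unfolds back to approximately the true [0.7, 0.3], with every iterate nonnegative and summing to one.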
2021
Towards a Trace-Preserving Tensor Network Representation of Quantum Channels
Abstract PDF

The problem of characterizing quantum channels arises in a number of contexts such as quantum process tomography and quantum error correction. However, direct approaches to parameterizing and optimizing the Choi matrix representation of quantum channels face a curse of dimensionality: the number of parameters scales exponentially in the number of qubits. Recently, Torlai et al. [2020] proposed using locally purified density operators (LPDOs), a tensor network representation of Choi matrices, to overcome the unfavourable scaling in parameters. While the LPDO structure allows it to satisfy a ‘complete positivity’ (CP) constraint required of physically valid quantum channels, it makes no guarantees about a similarly required ‘trace preservation’ (TP) constraint. In practice, the TP constraint is violated, and the learned quantum channel may even be trace-increasing, which is non-physical. In this work, we present the problem of optimizing over TP LPDOs, discuss two approaches to characterizing the TP constraints on LPDOs, and outline the next steps for developing an optimization scheme.

Srinivasan S.*, Adhikary S.*, Miller J., Pokharel B., Rabusseau G., Boots B.
Second Workshop on Quantum Tensor Networks in Machine Learning @ NeurIPS
2021
Auctions and Peer Prediction for Academic Peer Review
Abstract PDF

Peer reviewed publications are considered the gold standard in certifying and disseminating ideas that a research community considers valuable. However, we identify two major drawbacks of the current system: (1) the overwhelming demand for reviewers due to a large volume of submissions, and (2) the lack of incentives for reviewers to participate and expend the necessary effort to provide high-quality reviews. In this work, we adopt a mechanism-design approach to propose improvements to the peer review process, tying together the paper submission and review processes and simultaneously incentivizing high-quality submissions and reviews. In the submission stage, authors participate in a VCG auction for review slots by submitting their papers along with a bid that represents their expected value for having their paper reviewed. For the reviewing stage, we propose a novel peer prediction mechanism (H-DIPP) building on recent work in the information elicitation literature, which incentivizes participating reviewers to provide honest and effortful reviews. The revenue raised in the submission stage auction is used to pay reviewers based on the quality of their reviews in the reviewing stage.

Srinivasan S., Morgenstern J.
arXiv preprint
2021
Learning Deep Features in Instrumental Variable Regression
Abstract PDF

Instrumental variable (IV) regression is a standard strategy for learning causal relationships between confounded treatment and outcome variables from observational data by utilizing an instrumental variable, which affects the outcome only through the treatment. In classical IV regression, learning proceeds in two stages: stage 1 performs linear regression from the instrument to the treatment; and stage 2 performs linear regression from the treatment to the outcome, conditioned on the instrument. We propose a novel method, deep feature instrumental variable regression (DFIV), to address the case where relations between instruments, treatments, and outcomes may be nonlinear. In this case, deep neural nets are trained to define informative nonlinear features on the instruments and treatments. We propose an alternating training regime for these features to ensure good end-to-end performance when composing stages 1 and 2, thus obtaining highly flexible feature maps in a computationally efficient manner. DFIV outperforms recent state-of-the-art methods on challenging IV benchmarks, including settings involving high dimensional image data. DFIV also exhibits competitive performance in off-policy policy evaluation for reinforcement learning, which can be understood as an IV regression task.

Xu L., Chen Y., Srinivasan S., de Freitas N., Doucet A., Gretton A.
ICLR
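The two-stage structure that DFIV builds on is easiest to see in the classical linear case. The sketch below is textbook two-stage least squares with synthetic data, not the DFIV algorithm itself (which replaces the linear maps with learned neural features); all names and numbers are illustrative:

```python
import random

def ols(u, v):
    """Slope and intercept of the least-squares line v ~ u."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    slope = (sum((a - mu) * (b - mv) for a, b in zip(u, v))
             / sum((a - mu) ** 2 for a in u))
    return slope, mv - slope * mu

def two_stage_least_squares(z, x, y):
    """Classical linear IV regression (2SLS): stage 1 regresses the
    treatment x on the instrument z; stage 2 regresses the outcome y
    on the stage-1 fitted values, stripping out confounded variation."""
    s1, c1 = ols(z, x)
    x_hat = [s1 * zi + c1 for zi in z]  # stage-1 predictions
    return ols(x_hat, y)

# Synthetic confounded data: u drives both x and y, while the
# instrument z affects y only through x.  True causal effect = 2.0.
rng = random.Random(0)
z = [rng.gauss(0, 1) for _ in range(20000)]
u = [rng.gauss(0, 1) for _ in range(20000)]
x = [zi + ui for zi, ui in zip(z, u)]
y = [2.0 * xi + 3.0 * ui for xi, ui in zip(x, u)]
iv_slope, _ = two_stage_least_squares(z, x, y)  # close to 2.0
naive_slope, _ = ols(x, y)                      # biased upward
```

Naive regression of y on x absorbs the confounder and overestimates the effect; routing through the instrument recovers a slope near the true 2.0.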
2021
Quantum Tensor Networks, Stochastic Processes, and Weighted Automata
Abstract PDF

Modeling joint probability distributions over sequences has been studied from many perspectives. The physics community developed matrix product states, a tensor-train decomposition for probabilistic modeling, motivated by the need to tractably model many-body systems. But similar models have also been studied in the stochastic processes and weighted automata literature, with little work on how these bodies of work relate to each other. We address this gap by showing how stationary or uniform versions of popular quantum tensor network models have equivalent representations in the stochastic processes and weighted automata literature, in the limit of infinitely long sequences. We demonstrate several equivalence results between models used in these three communities: (i) uniform variants of matrix product states, Born machines and locally purified states from the quantum tensor networks literature, (ii) predictive state representations, hidden Markov models, norm-observable operator models and hidden quantum Markov models from the stochastic process literature, and (iii) stochastic weighted automata, probabilistic automata and quadratic automata from the formal languages literature. Such connections may open the door for results and methods developed in one area to be applied in another.

Adhikary S.*, Srinivasan S.*, Miller J., Rabusseau G., Boots B.
AISTATS
2020
Expressiveness and Learning of Hidden Quantum Markov Models
Abstract PDF

Extending classical probabilistic reasoning using the quantum mechanical view of probability has been of recent interest, particularly in the development of hidden quantum Markov models (HQMMs) to model stochastic processes. However, there has been little progress in characterizing the expressiveness of such models and learning them from data. We tackle these problems by showing that HQMMs are a special subclass of the general class of observable operator models (OOMs) that do not suffer from the 'negative probability problem' by design. We also provide a feasible retraction-based learning algorithm for HQMMs using constrained gradient descent on the Stiefel manifold of model parameters. We demonstrate that this approach is faster and scales to larger models than previous learning algorithms.

Adhikary S.*, Srinivasan S.*, Gordon G., Boots B.
AISTATS
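The retraction step at the heart of such a constrained-gradient method is simple to sketch. The toy code below is illustrative, not the paper's implementation: it takes a Euclidean gradient step and retracts back to the Stiefel manifold (matrices with orthonormal columns) by taking the Q factor of a QR decomposition via Gram-Schmidt:

```python
def qr_retraction(w, grad, lr):
    """One retraction-based gradient step on the Stiefel manifold.

    Takes a Euclidean step `w - lr * grad`, then orthonormalises the
    columns (the Q factor of a QR decomposition) so the result lies on
    the manifold again.  Matrices are row-major lists of lists."""
    rows, cols = len(w), len(w[0])
    step = [[w[i][j] - lr * grad[i][j] for j in range(cols)]
            for i in range(rows)]
    # Gram-Schmidt over the columns of the stepped matrix
    q_cols = []
    for j in range(cols):
        v = [step[i][j] for i in range(rows)]
        for u in q_cols:
            dot = sum(ui * vi for ui, vi in zip(u, v))
            v = [vi - dot * ui for ui, vi in zip(u, v)]
        norm = sum(vi * vi for vi in v) ** 0.5
        q_cols.append([vi / norm for vi in v])
    # Back to row-major; columns are now orthonormal
    return [[q_cols[j][i] for j in range(cols)] for i in range(rows)]
```

After every update the parameter matrix has exactly orthonormal columns, which is the constraint that keeps the learned HQMM parameters physically valid.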
2018
Learning and Inference in Hilbert Space with Quantum Graphical Models
Abstract PDF

Quantum Graphical Models (QGMs) generalize classical graphical models by adopting the formalism for reasoning about uncertainty from quantum mechanics. Unlike classical graphical models, QGMs represent uncertainty with density matrices in complex Hilbert spaces. Hilbert space embeddings (HSEs) also generalize Bayesian inference in Hilbert spaces. We investigate the link between QGMs and HSEs and show that the sum rule and Bayes rule for QGMs are equivalent to the kernel sum rule in HSEs and a special case of Nadaraya-Watson kernel regression, respectively. We show that these operations can be kernelized, and use these insights to propose a Hilbert Space Embedding of Hidden Quantum Markov Models (HSE-HQMM) to model dynamics. We present experimental results showing that HSE-HQMMs are competitive with state-of-the-art models like LSTMs and PSRNNs on several datasets, while also providing a nonparametric method for maintaining a probability distribution over continuous-valued features.

Srinivasan S., Downey C., Boots B.
NeurIPS
2018
Expressing Coherent Personality with Incremental Acquisition of Multimodal Behaviors
Abstract PDF

As social robots increasingly enter people's lives, coherence of personality is an important challenge for long-term human-robot interactions. We extend an architecture that acquires dialog through crowdsourcing to author both verbal and non-verbal indicators of personality. We demonstrate the efficacy of the approach through a four-day study in which teams of participants interacted with a social robot expressing one of two personalities as the host of a competitive game. Results indicate that the system is able to elicit personality-driven language behaviors from the crowd in an incremental and ongoing way and produce a coherent expression of that personality during face-to-face interactions over time.

Mota P., Paetzel M., Fox A., Amini A., Srinivasan S., Kennedy J., Lehman J.
RO-MAN
2018
A Simple and Effective Approach to the Story Cloze Test
Abstract PDF

In the Story Cloze Test, a system is presented with a 4-sentence prompt to a story, and must determine which one of two potential endings is the 'right' ending to the story. Previous work has shown that ignoring the training set and training a model on the validation set can achieve high accuracy on this task due to stylistic differences between the story endings in the training set and validation and test sets. Following this approach, we present a simpler fully-neural approach to the Story Cloze Test using skip-thought embeddings of the stories in a feed-forward network that achieves close to state-of-the-art performance on this task without any feature engineering. We also find that considering just the last sentence of the prompt instead of the whole prompt yields higher accuracy with our approach.

Srinivasan S., Arora A., Riedl M.
NAACL
2018
Learning Hidden Quantum Markov Models
Abstract PDF

Srinivasan S., Gordon G., Boots B.
AISTATS
2015
Compressed Sensing Environmental Mapping by an Autonomous Robot
Abstract PDF

This paper introduces the use of compressed sensing for autonomous robots performing environmental mapping in order to reduce data collection, storage, and transmission requirements. A prototype robot sends data collected over adaptively updated straight-line paths to a server, which reconstructs an image of the environment variable using Split-Bregman iteration. The amount of data collected is only 10% of the amount of data in the final map, yet the relative error is only 20%.

Horning M., Lin M., Srinivasan S., Zou S., Haberland M., Yin K., Bertozzi A.
Second International Workshop on Robotic Sensor Networks

Policy Work

Year
Title
Authors
Venue
2024
Detecting AI fingerprints: A guide to watermarking and beyond
Summary Publication

  • Sophisticated digital “watermarking” embeds subtle patterns in AI-generated content that only computers can detect.
  • Relative to other approaches to identifying AI-generated content, watermarks are accurate and more robust to erasure and forgery, but they are not foolproof; a motivated actor can degrade watermarks in AI-generated content.
  • An AI model developer can only build detectors for their own watermark, so coordination will be necessary for efficient identification of all watermarks. Other practical considerations include the necessity of AI developer cooperation, complications with open-source models, privacy implications, and ensuring trusted and accessible watermark detection services.
  • Given these challenges, a realistic objective is to raise the barrier to evading watermarks so that the majority of AI-generated content can be identified. In practice, this means watermarking will primarily be geared towards managing AI-generated content from popular models, while being of limited use in high-stakes settings.

Srinivasan S.
Brookings Institution
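The detection idea behind such watermarks can be made concrete with a toy sketch, modeled loosely on published 'green list' watermarking schemes rather than any specific deployed system (all names are hypothetical): each previous token pseudo-randomly marks half of the vocabulary as green, a watermarking sampler prefers green tokens, and the detector simply measures the fraction of green continuations, which hovers near one half for ordinary text:

```python
import hashlib

def is_green(prev_tok, tok):
    """Pseudo-randomly assign half the vocabulary to the 'green list',
    keyed on the previous token (a stand-in for a secret watermark key)."""
    digest = hashlib.sha256(f"{prev_tok}:{tok}".encode()).digest()
    return digest[0] % 2 == 0

def green_fraction(tokens):
    """Detector statistic: fraction of adjacent token pairs whose
    second token is green.  Near 0.5 for unwatermarked text and near
    1.0 for text sampled to favor green tokens."""
    pairs = list(zip(tokens, tokens[1:]))
    return sum(is_green(p, t) for p, t in pairs) / max(1, len(pairs))
```

A sampler that always picks a green continuation drives the statistic to 1.0, while paraphrasing or token-level edits by a motivated actor push it back toward 0.5, which is exactly the erasure risk the piece describes.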
2022
Enforcing New York City’s AI Hiring Law
Summary Publication

NYC’s AI Hiring Law (2021/144) addresses the rapid adoption of automated employment decision tools (AEDTs) for hiring. These tools use artificial intelligence to process data such as education credentials to determine whether a candidate is qualified for a job. As job candidates are rarely aware of AEDT usage, the law mandates that employers notify candidates of AEDT use. However, the law goes into effect in 2023 and lacks detailed guidance about how to provide notice, leaving employers unsure of how to comply. This project recommends that the NYC Department of Consumer and Worker Protection (DCWP) use its rulemaking authority to ensure compliance with the notification portion of the NYC AI Hiring Law. See below for more information on this proposal, including a Playbook for DCWP that provides rulemaking guidelines and sample notification language that employers can use to notify applicants of the use of AEDTs; a template Frequently Asked Questions sheet for employers seeking to understand these recommendations; and an informational website for job candidates.

Carlton C., Davies J., Einstein L., Srinivasan S., Yang M.
Aspen Tech Policy Hub