Microsoft Azure Cloud Computing Grants

Microsoft Corporation has partnered with the Center for Data Science for Enterprise and Society and is providing a generous gift of Azure Cloud Computing credits to Cornell faculty. The grants can support computational and data-intensive research across a variety of domain areas. We particularly aim to support areas that align with the initial priorities of the Center, which focuses on questions grounded in data generated by human activity. These include computational social science (e.g., sociology and government); public health (especially issues related to the COVID-19 pandemic); the economics/computer science interface; aspects of digital agriculture related to production, resource and waste management, breeding and genetics, food manufacturing, distribution and consumption; digital platforms supporting urban infrastructure (e.g., the sharing economy); and, as a theme that cuts across many of these areas, the corresponding issues of privacy, security, and fairness. We invite proposals based on a diversity of intellectual approaches and personal backgrounds.

Call for Proposals

Microsoft Azure Cloud Computing Grant Recipients:

Ken Birman
Ann S. Bowers College of Computing and Information Science, Professor of Computer Science

Our effort is motivated by the goal of improving infrastructure support for delay-sensitive, data-centric AI/ML. Cascade hosts AI and ML logic and can reduce end-to-end latency to milliseconds while achieving throughput as high as today’s very best options. The performance gains trace to several innovations: collocating data and user-provided computation in the same address space, deep RDMA support, and minimized locking on a zero-copy critical path.

 

Sara Bronin
Cornell College of Architecture, Art, and Planning, Professor of City and Regional Planning

The National Zoning Atlas (NZA) project, housed at the Cornell College of Architecture, Art, and Planning’s Legal Constructs Lab, is an ongoing effort to standardize and translate key aspects of zoning codes, display that information online, and make it available for a wide variety of educational and advocacy purposes. Credits will be used to set up and maintain the digital infrastructure of the NZA. A geospatially enabled database will be created, and the support services needed to deploy the NZA web application will be hosted on Azure cloud services such as Azure Analysis Services and Azure App Service.

 

Tanzeem Choudhury
Ann S. Bowers College of Computing and Information Science, Professor of Information Science

Developing machine learning modeling methods to create mental health digital biomarkers that are reliable across populations, time, and data types. 

 

 

Will Cong
SC Johnson College of Business, Associate Professor of Finance and the Rudd Family Professor of Management 

Developing AI models to help with high dimensional managerial decisions and learning large corporations’ management objectives. 

 

Deborah Estrin
Cornell Tech, Associate Dean for Impact and Robert V. Tishman ’37 Professor of Computer Science

We will explore the opportunities and challenges of implementing Extended Reality (XR) technologies to support the collaborative work between family caregivers and clinicians as they attend to the physical care needs of patients in the home setting. Our multidisciplinary approach integrates methods from Human-Computer Interaction (HCI) and Computer Graphics to lay the foundation for future research and development of diverse XR applications that could ultimately transform care at home.

 

Greeshma Gadikota 
College of Engineering, Croll Sesquicentennial Fellow and Assistant Professor of Civil and Environmental Engineering 

The aim of the project is to develop molecular-scale predictive controls on engineered accelerated weathering of silicate minerals for capturing CO2 from air. This project is motivated by the need to develop negative emissions technologies in response to a rapidly changing climate. Molecular-scale insights into the solvation behavior of CO2 and metals in silicate minerals inform the design of scalable engineered processes for carbon removal.

 

Nikhil Garg
Jacobs Technion-Cornell Institute at Cornell Tech, Assistant Professor of Operations Research and Information Engineering

We will create a simulation environment (mimicking Airbnb or Facebook Marketplace) in which multiple pricing algorithms compete with one another to sell goods to customers. Algorithms will price goods over time, receiving feedback and rewards based on sales, which depend on their prices, competitors’ prices, the customer choice model, and item and customer covariates. We will then host a pricing competition in which members of the research community can submit algorithms. The Azure credits will be used for virtual machines to host the competition. Unlike other common task challenges in which researchers can simply submit predictions on a test dataset, this competition requires running code submitted by the external public, potentially including reinforcement learning models.

 

Allison Koenecke
Ann S. Bowers College of Computing and Information Science, Assistant Professor of Information Science

We audit the fairness of commercial speech-to-text technologies by comparing word error rates on Korean “standard” speech to five regional Korean dialects often spoken by underserved populations in rural Korea.

 

Rachee Singh
Ann S. Bowers College of Computing and Information Science, Assistant Professor of Computer Science

We will use the empirical characterization of GPU interconnects to inform the design of collective communication algorithms, which are crucial to efficient distributed ML training.

 

Kevin Tang
College of Engineering, Professor of Electrical and Computer Engineering 

Determining the right data to pass around in edge-cloud-assisted IoT (Internet of Things) applications, with the goal of automatically generating inexpensive, task-specific representations of environment sensory data that can enhance the performance of downstream controllers.

 

Felix Thoemmes
College of Human Ecology, Associate Professor of Psychology

The aim of this project is to evaluate the use of instrumental variables in recovering parameter estimates in the presence of data that is not missing at random. In addition to our theoretical work, we will run large-scale simulation studies to assess the effectiveness of our proposed method for a variety of missing data patterns.

 

Co-PI’s: Robbert Van Renesse, Lorenzo Alvisi

 

Robbert Van Renesse
Ann S. Bowers College of Computing and Information Science, Professor of Computer Science  

 

Lorenzo Alvisi
Ann S. Bowers College of Computing and Information Science, Tisch University Professor of Computer Science 

Ziplog is a new approach to building fault-tolerant, totally ordered logs. Using RDMA and sharding, Ziplog achieves essentially unlimited throughput at tail latencies below 100 microseconds.

 

 

Lars Vilhuber
The College of Arts and Sciences, Department of Economics, Director of Labor Dynamics Institute and Data Editor, American Economic Association 

Developing a cloud-based, human-mediated, scalable workflow to streamline the computational reproducibility checks of hundreds of social science articles when data cannot be machine-acquired.

 

Matthew Wilkens
Ann S. Bowers College of Computing and Information Science, Associate Professor of Information Science 

Transfer Learning for Textual Geography. This project uses large neural language models to extract geographic references from hundreds of thousands of novels in support of library-scale literary historical analyses. 

 

Qian Yang
Ann S. Bowers College of Computing and Information Science, Assistant Professor of Information Science 

Developing novel methods and tools for interactive Natural Language Generation (NLG) application design.

 

 

Fengqi You
College of Engineering, Roxanne E. and Michael J. Zak Professor in Energy Systems Engineering 

We will use the granted resources to develop a novel framework that uses quantum computing-based techniques for molecular property estimation and computer-aided molecular design (CAMD) for agrochemical and/or pharmaceutical applications.

 

 

Zhiru Zhang
College of Engineering, Associate Professor of Electrical and Computer Engineering 

This project investigates efficient machine learning techniques to improve the accuracy of genotype imputation, a key technique for genome-wide association studies, which have recently been used to identify potential risk loci contributing to COVID-19 mortality.


Madeleine Udell
College of Engineering, Assistant Professor of Operations Research and Information Engineering

 

Co-PI’s:  Julio Giordano,  Ken Birman

Julio Giordano
College of Agriculture and Life Sciences, Professor of Animal Science  

Ken Birman
Ann S. Bowers College of Computing and Information Science, N. Rama Rao Professor of Computer Science