Mike Arpaia
Partner at Moonfire 🌗🔥

I'm a computer scientist professionally focused on applying machine learning and natural language processing to help accelerate and improve decision making within the domain of early-stage venture capital.

My core technical focus is on deep learning and natural language processing but I'm also passionate about distributed systems, cognitive science, and high-performance computing.

I'm also an extremely psyched skier and climber. I'm a big advocate for the power of the outdoors to help people develop meaningful human relationships through shared experiences in the mountains.

Machine Learning

I've been working with distributed data systems and data analytics for my entire career. For the past several years, I've spent most of my professional and scientific effort focusing on deep learning approaches to natural language processing and understanding. I'm especially interested in novel techniques for training deep learning models to perform well within data-constrained domains.

Transformer-based Natural Language Processing

I have an enduring passion for machine learning applied to text processing and understanding. I first started using Transformers in production shortly after the BERT paper was released, way back when HuggingFace was building a library called pytorch-pretrained-bert.

Data-constrained deep learning

I'm especially interested in techniques for training deep learning models within domains where limited directly-relevant data is available.

Multitask multimodal deep learning

I'm interested in multitask multimodal deep learning and the different ways that one can combine multiple tasks and data modalities within a single training process to improve performance.
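One common pattern for combining tasks in a single training process is a shared encoder feeding task-specific heads, with the per-task losses summed into one weighted objective. The sketch below is purely illustrative: the shapes, layer names, and loss weights are my own assumptions, not a reference architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

# Two modalities for one batch of 8 examples: a text embedding and an
# image embedding, each projected into a shared space and concatenated.
text = rng.normal(size=(8, 16))
image = rng.normal(size=(8, 32))

W_text = rng.normal(0, 0.1, (16, 24))
W_image = rng.normal(0, 0.1, (32, 24))
fused = np.concatenate([text @ W_text, image @ W_image], axis=1)  # (8, 48)

# Shared encoder layer.
W_shared = rng.normal(0, 0.1, (48, 24))
h = np.tanh(fused @ W_shared)

# Task heads: a regression head and a 3-way classification head.
W_reg = rng.normal(0, 0.1, (24, 1))
W_cls = rng.normal(0, 0.1, (24, 3))

y_reg = rng.normal(size=(8, 1))
y_cls = rng.integers(0, 3, size=8)

# Regression loss: mean squared error.
loss_reg = np.mean((h @ W_reg - y_reg) ** 2)

# Classification loss: softmax cross-entropy.
logits = h @ W_cls
logits -= logits.max(axis=1, keepdims=True)
probs = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)
loss_cls = -np.mean(np.log(probs[np.arange(8), y_cls]))

# A single scalar objective lets one backward pass update the shared
# encoder from both tasks; the weights (0.7 / 0.3 here, chosen
# arbitrarily) trade the tasks off against each other.
total_loss = 0.7 * loss_reg + 0.3 * loss_cls
print(f"total loss: {total_loss:.4f}")
```

Because both losses flow through the same shared encoder, each task acts as a regularizer for the other, which is part of what makes this setup interesting in data-constrained settings.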

Artificial neural networks as universal function approximators

The universal approximation theorem shows that a feed-forward neural network with enough hidden units can approximate any continuous function on a compact domain to arbitrary accuracy. My excitement about deep learning is driven by a vision for the future where neural networks are broadly deployed as a universal computation platform for learned algorithms.
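The theorem can be made concrete with a toy experiment: a single tanh hidden layer fit to sin(x) on [-π, π] with plain full-batch gradient descent. This is a minimal sketch, not a rigorous demonstration; the width, learning rate, and step count are arbitrary choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Target: sin(x) sampled on [-pi, pi].
x = np.linspace(-np.pi, np.pi, 256).reshape(-1, 1)
y = np.sin(x)

# One hidden layer of 32 tanh units, linear output.
hidden = 32
W1 = rng.normal(0, 1.0, (1, hidden))
b1 = np.zeros(hidden)
W2 = rng.normal(0, 0.1, (hidden, 1))
b2 = np.zeros(1)

lr = 0.05
n = len(x)
for step in range(5000):
    # Forward pass.
    h = np.tanh(x @ W1 + b1)
    pred = h @ W2 + b2
    err = pred - y

    # Backward pass: gradients of mean squared error.
    dpred = 2 * err / n
    dW2 = h.T @ dpred
    db2 = dpred.sum(axis=0)
    dh = dpred @ W2.T
    dz = dh * (1 - h**2)  # tanh'(z) = 1 - tanh(z)^2
    dW1 = x.T @ dz
    db1 = dz.sum(axis=0)

    # Gradient descent updates.
    W1 -= lr * dW1; b1 -= lr * db1
    W2 -= lr * dW2; b2 -= lr * db2

mse = float(np.mean((np.tanh(x @ W1 + b1) @ W2 + b2 - y) ** 2))
print(f"final MSE: {mse:.4f}")
```

Even a network this small drives the approximation error well below that of any constant predictor, which is the theorem showing up in practice on an easy target.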

Distributed Systems

I've spent a significant majority of my career working on large, stateful, distributed systems in high-throughput production environments. Within the stack, I'm really passionate about coordination and communication protocols, performance (of both systems and engineers), and autonomous infrastructure operations.

Kubernetes

I've been using Kubernetes in production since 2016. I was also a member of the Kubernetes core contributor team from 2018 until 2021, where I served on the Release Team for four releases and participated in several Working Groups and Special Interest Groups.

Search and recommendation

Two of my enduring passions are deep learning for natural language processing and distributed systems. Modern search engines and recommender systems are an awesome intersection of these two worlds.

Consensus and coordination

I am passionate about the theory and practice of building systems which need to maintain a coordinated consensus to manage the distribution of actions and information.

Blockchain Infrastructure

Given my background working with distributed systems, cryptography, and economics, my personal mental model of blockchains is one of coordinated distributed systems with cryptographic verifiability and economic security.

A consensus and validation approach to blockchain architecture

When reasoning about protocol and infrastructure architecture, I like to focus on first trying to understand the consensus systems and any cryptographic and/or economic validation systems at play.

Solana validator infrastructure and staking tooling

I work with Cogent Crypto, the best validator on Solana, to help create some of the software within the cog staking ecosystem. We built the first validator ecosystem profit-sharing scheme on Solana.