Mike Arpaia
Partner at Moonfire Ventures

I'm a computer scientist professionally focused on applying machine learning for natural language to early-stage venture capital in Europe.

My core research is in deep learning and natural language processing, with a focus on minority languages and multilingual transfer.

My research interests also include computational linguistics, sociolinguistics, distributed systems, snow science, and meteorology.

Machine Learning

I've been working with distributed data systems and data analytics for my entire career. For the past several years, I've spent most of my professional and scientific effort focusing on deep learning approaches to natural language processing and understanding.

Deep learning for natural language processing

I have an enduring passion for machine learning applied to text processing and understanding. I started using Transformers in production shortly after the BERT paper was released, way back when HuggingFace was building a library called pytorch-pretrained-bert.

Multi-task deep learning

Within deep learning, my main focus is on multi-task learning and the different ways that one can combine multiple tasks within a single training process to improve performance.
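
One common way to combine tasks in a single training process, given here only as an illustration, is to minimise a weighted sum of per-task losses over shared parameters. The symbols below are assumptions introduced for this sketch: \theta for the shared parameters, \phi_t for the parameters of task t's head, and \lambda_t for a non-negative task weight.

```latex
\mathcal{L}(\theta, \phi_1, \dots, \phi_T)
  = \sum_{t=1}^{T} \lambda_t \, \mathcal{L}_t(\theta, \phi_t),
  \qquad \lambda_t \ge 0
```

Other schemes, such as alternating between tasks or learning the weights during training, fit the same framing with a different choice of \lambda_t at each step.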

Multi-modal training and inference

My core focus is on the language modality but I am increasingly interested in integrating additional data modalities into the same model at both training and inference time.

Artificial neural networks as universal function approximators

The universal approximation theorem shows that a sufficiently large artificial neural network can approximate any continuous function on a compact domain to arbitrary accuracy. I am excited about a future where neural networks are broadly deployed as a universal computation platform for learned algorithms.
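
One classical form of the approximation result mentioned above, stated for a network with a single hidden layer, can be written as follows. The notation (\sigma for the activation, K for the domain, N for the hidden width) is chosen for this sketch, and the statement assumes \sigma is continuous and not a polynomial.

```latex
\text{Let } \sigma : \mathbb{R} \to \mathbb{R} \text{ be continuous and non-polynomial, and let }
K \subset \mathbb{R}^n \text{ be compact.} \\
\text{For every continuous } f : K \to \mathbb{R} \text{ and every } \varepsilon > 0
\text{ there exist } N \in \mathbb{N},\ a_i, b_i \in \mathbb{R},\ w_i \in \mathbb{R}^n
\text{ such that} \\
\sup_{x \in K} \left| f(x) - \sum_{i=1}^{N} a_i \, \sigma\!\left(w_i^{\top} x + b_i\right) \right| < \varepsilon .
```

The result guarantees that such weights exist. It does not say how large N must be or whether training will find them.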

Venture Capital

At Moonfire, we believe that the pre-seed and seed stages are underfunded in Europe. Europe captures less than 20% of global venture capital despite representing over 25% of global GDP. On top of this, European Series A investment is growing faster year over year. At Moonfire, we're helping to build a strong pre-seed and seed ecosystem of breakthrough entrepreneurs using software, data, and machine learning.

Machine Learning for sourcing, screening, and evaluation

At Moonfire, I spend most of my time building systems which aim to use scalable data infrastructure and learned models to reason about all aspects of the venture capital lifecycle.

Venture fund modelling, forecasting, and simulation

Traditional VC fund models are often created at the beginning of a fund and not maintained. This makes it hard to adapt your strategy to evolving market conditions with confidence. I'm passionate about using statistics, simulations, and projections to maintain real-time insight into the diversity of potential futures.
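
As a rough illustration of the kind of simulation described above, here is a minimal Monte Carlo sketch of a fund's gross return multiple. Everything in it is an assumption made for the example: the outcome probabilities and multiples are invented, cheques are equal-sized, and fees, follow-ons, and timing are ignored. It is not a description of any real fund model.

```python
import random
import statistics

# Hypothetical outcome distribution for one seed investment:
# (probability, return multiple on invested capital). Illustrative values only.
OUTCOMES = [
    (0.60, 0.0),
    (0.25, 1.0),
    (0.10, 5.0),
    (0.04, 20.0),
    (0.01, 100.0),
]


def simulate_fund(num_companies, rng):
    """Gross multiple of one simulated portfolio of equal-sized cheques."""
    weights = [p for p, _ in OUTCOMES]
    multiples = [m for _, m in OUTCOMES]
    draws = rng.choices(multiples, weights=weights, k=num_companies)
    return sum(draws) / num_companies


def simulate_many(num_companies, num_trials, seed=0):
    """Sorted list of fund multiples across many simulated portfolios."""
    rng = random.Random(seed)
    return sorted(simulate_fund(num_companies, rng) for _ in range(num_trials))


def summarize(results):
    """Summary statistics over a sorted list of simulated fund multiples."""
    n = len(results)
    return {
        "p10": results[int(0.10 * n)],
        "median": statistics.median(results),
        "p90": results[int(0.90 * n)],
        "prob_3x_or_better": sum(r >= 3.0 for r in results) / n,
    }


if __name__ == "__main__":
    print(summarize(simulate_many(num_companies=30, num_trials=10_000)))
```

Re-running the simulation as the portfolio and the assumed outcome distribution change gives a spread of possible fund outcomes rather than a single static projection.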

Distributed Systems

I've spent most of my career working on large, stateful, distributed systems in high-throughput production environments. Within the stack, I'm especially passionate about coordination and communication protocols, performance (of both systems and engineers), and autonomous infrastructure operations.


I’ve been using Kubernetes in production since 2016. I was also a member of the Kubernetes core contributor team from 2018 until 2021, during which I served on the Release Team for four releases and participated in several Working Groups and Special Interest Groups.

Search and recommendation

Two of my enduring passions have long been deep learning for natural language processing and distributed systems. Modern search engines and recommender systems are an awesome intersection of these two worlds.

Consensus and coordination

I am passionate about the theory and practice of building systems which need to maintain a coordinated consensus to manage the distribution of actions and information.

Blockchain Infrastructure

Given my background working with distributed systems, cryptography, and economics, my personal mental model of blockchains is one of Byzantine fault-tolerant distributed systems with cryptographic verifiability and economic security.

A consensus and validation approach to blockchain architecture

When reasoning about protocol and infrastructure architecture, I like to focus on first trying to understand the consensus systems and any cryptographic and/or economic validation systems at play.

Solana validator infrastructure and staking tooling

I work with Cogent Crypto, the best validator on Solana, to help create some of the software within the cog staking ecosystem. We built the first validator ecosystem profit-sharing scheme on Solana.

Open Source

I've been passionately working on open source software for my entire career.


While working at Facebook, I created and open sourced the osquery project in 2014.


I spent several years on the Kubernetes contributor team where I contributed to release, multi-tenancy, and architecture.

Information Security

A significant portion of my career has been spent focused on cryptography, secure protocols, and using data analytics and machine learning to detect and respond to compromise and fraud at scale.

Operating system security and osquery

I worked on building systems for host intrusion detection and analytics for the first several years of my engineering career.

Detecting intrusions, anomalies, and fraud

While at Facebook, Etsy, and Kolide, I built systems to detect and respond to anomalies and fraud.

Confidentiality, integrity, and authenticity of secure protocols

I started off my computer science career doing network, code, and specification analysis over secure protocols.