Mike Arpaia
Managing Partner at Moonfire 🌗🔥

I'm a computer scientist professionally focused on applying machine learning and natural language processing to help accelerate and improve decision making within the domain of early-stage venture capital.

My core technical focus is on deep learning and natural language processing but I'm also passionate about distributed systems, high-performance computing, and information security.

I've spent much of my career focused on operating system security and intrusion detection. While working at Facebook in 2014, I created and open-sourced osquery. Osquery has since become a foundational tool in the information security and monitoring industries.


Machine Learning

I've been working with distributed data systems and data analytics for my entire career. For the past several years, I've spent most of my professional and scientific effort focusing on deep learning approaches to natural language processing and understanding. I'm especially interested in novel techniques for training deep learning models to perform well within data-constrained domains.

Transformer-based Natural Language Processing: I have an enduring passion for machine learning applied to text processing and understanding. I first started using Transformers in production shortly after the BERT paper was released, way back when HuggingFace was building a library called pytorch-pretrained-bert.
Data-constrained deep learning: I'm especially interested in techniques for training deep learning models within domains where limited directly relevant data is available.
Multitask multimodal deep learning: I'm interested in the different ways that one can combine multiple tasks and data modalities within a single training process to improve performance.
Artificial neural networks as universal function approximators: The universal approximation theorem shows that a sufficiently large artificial neural network can approximate any continuous function on a bounded domain to arbitrary accuracy. My excitement about deep learning is driven by a vision for the future where neural networks are broadly deployed as a universal computation platform for learned algorithms.

Distributed Systems

I've spent most of my career working on large, stateful, distributed systems in high-throughput production environments. Within the stack, I'm really passionate about coordination and communication protocols, performance (of both systems and engineers), and autonomous infrastructure operations.

Kubernetes: I've been using Kubernetes in production since 2016. I was also a member of the Kubernetes core contributor team from 2018 until 2021, during which time I served on the Release Team for four releases and participated in several Working Groups and Special Interest Groups.
Search and recommendation: Two of my enduring passions have long been deep learning for natural language processing and distributed systems. Modern search engines and recommender systems are an awesome intersection of these two worlds.
Consensus and coordination: I am passionate about the theory and practice of building systems which need to maintain a coordinated consensus to manage the distribution of actions and information.

Information Security

While I've worked across the entire information security stack, a consistent passion of mine throughout my career has been building systems to generate telemetry and building machine learning systems to find anomalies in that telemetry.

Operating system security and osquery: I created and open-sourced osquery while working at Facebook in 2014. Osquery has since become a foundational tool in the information security and monitoring industries. Osquery is actively used by thousands of companies all over the world to detect and respond to security threats, data breaches, and other critical incidents.
Detecting intrusions, anomalies, and fraud: I worked on building systems for host intrusion detection and analytics for the first several years of my engineering career. While at Facebook, Etsy, and Kolide, I built systems to detect and respond to anomalies and fraud.
Confidentiality, integrity, and authenticity of secure protocols: I started off my career as an offensive computer security engineer performing security assessments of secure protocols and cryptographic standards. My passion for math and protocol analysis has been a constant throughout my career.

Venture Capital

At Moonfire, we believe that the pre-seed and seed stages are underfunded in Europe. Europe captures less than 20% of global venture capital despite representing over 25% of GDP. On top of this, the growth of European Series A investment is accelerating year over year. At Moonfire, we're helping to grow a strong pre-seed and seed ecosystem of breakthrough entrepreneurs using software, data, and machine learning.

Machine learning for sourcing, screening, and evaluation: At Moonfire, I spend most of my time building systems which aim to use scalable data infrastructure and learned models to reason about all aspects of the venture capital lifecycle.
Venture fund modelling, forecasting, and simulation: Traditional VC fund models are often created at the beginning of a fund and not maintained. This makes it challenging to adapt your strategy to evolving market conditions with confidence. I'm passionate about using statistics, simulations, and projections to maintain real-time insight into the diversity of potential futures.

Open Source

I've been passionately working on open source software for my entire career.