
Technical Portfolio 🦾

Here is a selection of projects I have worked on, each with a technical challenge I solved along the way. I only feature projects where I was responsible for most or all of the technical development. For most commercial projects, I was also heavily involved in acquisition, UX design, and client communication.

β-Variational Autoencoder

⚒️
Python, PyTorch, Pandas, NumPy
🏗️
Oxford University, Advanced Topics in Machine Learning
📅
2022

I implemented a β-Variational Autoencoder, reproducing the results of the original paper by Higgins et al. at DeepMind. An Autoencoder is a machine learning model which learns to represent high-dimensional data in a low-dimensional latent space. Conceptually, what an Autoencoder does is fascinating: it automatically distills the important information from your data and throws away the negligible parts. Once you have this low-dimensional latent space, you can also use it to generate new, unseen data. For example, you can generate images of people that do not exist, and interpolate between data points.

Here, you can see input images on the left, which are then encoded into the two-dimensional latent space in the middle. Scroll through the latent space to explore it! On the right, you can see the decoded image. This is really fascinating: only two dimensions are used to regenerate these!

With a standard Autoencoder, however, generating new data is problematic, since the distribution of the latent space is arbitrary and unknown. We therefore use a Variational Autoencoder instead, which forces a specific distribution onto the latent space that can then be sampled from. Usually, we use a multivariate normal distribution as the target distribution in the latent space. The name 'variational' comes from the fact that we use a method called 'variational inference' to approximate the distribution of the latent variables. In training, we introduce an additional loss term, the KL-divergence between the current and the target latent distribution. This can be seen as a measure of how far away the current latent distribution is from the target latent distribution. In the DeepMind paper, the authors introduce a new hyperparameter β, which controls the influence of the KL-divergence on the final loss. They give a mathematical justification for this hyperparameter, and show that it helps in creating a disentangled latent space, which is especially important for an interpretable generative model.
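
The training objective is compact enough to sketch. Below is a minimal, illustrative PyTorch version of the β-VAE loss, assuming a Gaussian encoder that outputs a mean mu and log-variance logvar, and a decoder that outputs reconstructions in [0, 1]; it is a sketch of the idea, not the exact code from our reproduction.

import torch
import torch.nn.functional as F

def beta_vae_loss(x, x_recon, mu, logvar, beta=4.0):
    # Reconstruction term: how well the decoder reproduces the input
    recon = F.binary_cross_entropy(x_recon, x, reduction="sum")

    # Closed-form KL divergence between the encoder's Gaussian q(z|x) = N(mu, sigma^2)
    # and the standard normal prior p(z) = N(0, I)
    kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())

    # beta = 1 recovers the standard VAE; beta > 1 weights the KL term more heavily,
    # pushing the model towards a disentangled latent space
    return recon + beta * kl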

For more details, here is our reproduction paper, and the code for the interactive model above.

AR App Chemnitz.ZeitWeise

⚒️
Unity, C#, ARKit, ARCore
🏗️
Studio NEEEU
📅
2021

With NEEEU, I built an Augmented Reality app for the Museum of Archaeology Chemnitz that lets visitors rediscover historical buildings.

Buildings and places carry a rich history that is easily forgotten when they are remodelled or torn down. AR gives us an opportunity to rediscover the city as it once was, and to tell a story that seemed lost. Visitors to Chemnitz can discover historical landmarks on a map, compare Then and Now views in 3D, and see the buildings as they once stood in Augmented Reality. Along the way, they can learn about the fascinating history of these places.

iPhone screenshot of the app

I built this application in Unity. We used Mapbox to display and download offline maps of Chemnitz, GPS position and device orientation to place AR content, and Strapi as a CMS so the client can edit and curate the building information online.
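
Placing AR content from GPS essentially means converting the difference between the user's coordinates and a building's coordinates into a local offset in metres, which the AR anchor is then translated by. As a rough illustration (in Python rather than the C# used in the app, and with made-up coordinates), the conversion could look like this:

import math

EARTH_RADIUS = 6371000  # metres

def gps_to_local_offset(user_lat, user_lon, target_lat, target_lon):
    # Equirectangular approximation: accurate enough over the few hundred
    # metres between a visitor and a nearby building
    d_lat = math.radians(target_lat - user_lat)
    d_lon = math.radians(target_lon - user_lon)
    north = d_lat * EARTH_RADIUS
    east = d_lon * EARTH_RADIUS * math.cos(math.radians(user_lat))
    return east, north  # metres to translate the AR anchor from the user's position

# Hypothetical coordinates somewhere in Chemnitz, for illustration only
east, north = gps_to_local_offset(50.8323, 12.9253, 50.8330, 12.9260)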

Try it on the App Store or Google Play!

Humboldt Forum AR App for Guided Museum Tours

⚒️
Unity, C#, ARKit, ARCore
🏗️
Studio NEEEU
📅
2020

With NEEEU, I created an application for guided museum tours for the Humboldt Forum in Berlin. With a networked AR experience, guides can lead visitors through the museum. In Augmented Reality, visitors can interact with fragile artifacts that would otherwise be locked behind glass.

AR Museum Tour

In a tour, every visitor receives an iPad that is controlled from the guide's iPad. To synchronize devices, I used a custom networking setup built on top of Unity Mirror Networking. A lot of effort went into getting networked guiding right: guides can choose which video or AR exhibit visitors see or let them roam freely, they can choose which AR objects are highlighted for the visitors, and they can rotate and zoom to show specific parts of an artifact. Enabling guides and visitors to interact with the same AR objects requires synchronized interaction, a bit like collaborating on a document in Google Docs.
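
Conceptually, the guide's device owns a small piece of shared state that is broadcast to every visitor device, which then applies it locally. The sketch below shows the idea in Python with hypothetical field names; the actual app implements this in C# on top of Mirror Networking.

from dataclasses import dataclass, asdict
import json

@dataclass
class TourState:
    exhibit_id: str          # which video or AR exhibit visitors currently see
    free_roam: bool          # whether visitors may explore on their own
    highlighted_object: str  # AR object the guide is currently pointing out
    rotation_y: float        # shared rotation of the artifact, in degrees
    zoom: float              # shared zoom factor

def on_guide_changed_state(state: TourState, send_to_visitors):
    # Serialise and broadcast; every visitor device applies the same state,
    # so all iPads show an identical view of the artifact
    send_to_visitors(json.dumps(asdict(state)))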

To let museum curators edit the content of tours, all content is downloaded from an online CMS. Before a tour starts, all devices automatically download the latest data, cache everything offline, make sure all content is synchronized with the guide's device, and establish the guide / visitor connection. This way, guides can show images, scrub through videos in sync without latency, and control AR scenes, all without an active internet connection. This was crucial for our client, since the networks at the museum can become unreliable at peak hours.

This application is now used daily by guides in the Humboldt Forum to offer richer guided tours for their exhibitions.

GDPR-compliant data collection App "Anima"

⚒️
Swift, SwiftUI, MongoDB, Realm, Python
🏗️
Oxford University Masters CS Project
📅
2021

As part of my master's, I created a research data collection application with a focus on transparency and GDPR compliance. The application collects iPhone and Apple Watch health data as well as self-reported mental wellbeing data, while giving users full information and granular control over which data is collected. The application is intended to provide a full data pipeline for scientific research: data collection, data storage, legal consent, and an interface for researchers to access the data. In my report, I provide an in-depth overview of the GDPR's legal requirements for data collection, and derive practical design principles from the legal text.

iPhone screenshot of the app

I chose the type of data collected to facilitate follow-up work that I have been interested in for a while: how well can we predict mental wellbeing from wearable data? People create data about themselves, their activities, and their behaviour patterns in abundance. If we find ways to leverage this data, we can invent whole new methodologies in psychological research and treatment. Traditionally, there is a fundamental divide between quantitative and qualitative research. I believe that if we find ways to understand intricate behaviour patterns through quantitative analysis, we can move towards a quantitative reframing of psychology. Hopefully, we will also be able to build products that leverage personal health data for mental health therapy and treatment.

The application was built in Swift, following a Model-View-ViewModel pattern. Data is collected from Apple HealthKit, which provides access to Apple Watch data. This includes regular heart rate measurements, distance walked, and workout times, but also data points like environmental noise exposure. Self-reported mental wellbeing data is collected via a custom interface and data browser in the app. Data is stored and synchronized to a server using MongoDB Realm, which allows for real-time synchronization between the server and any number of devices. The data lives in a NoSQL database hosted on an AWS server in Germany. Once set up, the online server interface looks like this:

MongoDB

Here, you can see the database for the application, and one data point in the list. Each user is assigned a unique anonymous user ID, and no directly identifying information is stored. Researchers must still be careful when handling and publishing data, as de-anonymization is often possible.

In my project report, I provide a proof-of-concept data analysis using a Python API to pull data from the server. In this analysis, I find a correlation between my average walking speed and my self-reported mood, a promising starting point for follow-up work that collects larger amounts of data from many participants and runs more in-depth analyses. For more details, see my project report.
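
In spirit, the analysis pipeline looks like the sketch below: pull documents from the MongoDB database, aggregate health and self-report data per day, and compute a correlation. The connection string, collection names, and field names are placeholders rather than the real ones from the project.

import pandas as pd
from pymongo import MongoClient

# Placeholder connection string and collection names, for illustration only
client = MongoClient("mongodb+srv://<cluster-url>")
db = client["anima"]

health = pd.DataFrame(db["health_samples"].find({"type": "walking_speed"}))
mood = pd.DataFrame(db["self_reports"].find({"type": "mood"}))

# Aggregate both data streams per day and join them
health["day"] = pd.to_datetime(health["timestamp"]).dt.date
mood["day"] = pd.to_datetime(mood["timestamp"]).dt.date
daily = (health.groupby("day")["value"].mean().rename("walking_speed").to_frame()
         .join(mood.groupby("day")["value"].mean().rename("mood"), how="inner"))

# Pearson correlation between average walking speed and self-reported mood
print(daily["walking_speed"].corr(daily["mood"]))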

Probabilistic Programming Language

⚒️
Python, Pandas, NumPy
🏗️
Oxford University, Bayesian Statistical Probabilistic Programming
📅
2021

I implemented a probabilistic programming language, a tool for statistical modelling. A probabilistic program is a simulation of a complex random process, such as a biological system, a social network, or a transport network. For example, Uber uses its own probabilistic programming language, Pyro, to determine which driver should pick you up when you order a cab. The heart of such a language is the inference engine, which updates the assumptions made in the model based on real-world data. A short probabilistic program might look like this:

import numpy as np

# `samplef` and `observef` are handles provided by the inference engine;
# `uniform` draws from a uniform prior via `samplef`.
def coin_flip(samplef, observef, data):
    p = uniform(samplef, 0, 1) # our prior belief about how biased the coin is

    # observing real-world data: score each flip under the current belief p
    for d in data:
        if d:
            observef(np.log(p))
        else:
            observef(np.log(1 - p))
    return p

This program models how biased a coin is, and compares that model to real-world data. In the absence of data, we have no idea how biased the coin is. Formally, we say the prior belief about the bias of the coin is distributed uniformly. Visualised, it looks like this:

# Running inference on the probabilistic program:
data = [] # first, running without data, for comparison
run_and_plot(coin_flip, data, LMH, n = 10000)

coin flip

Once we update that prior with data, we get a much better idea of how biased the coin actually is:

# Running inference on the probabilistic program:
data = [True, True, True, False, True, True] # data from a biased coin
run_and_plot(coin_flip, data, LMH, n = 10000)

coin flip

What's exciting about this technique is that it allows researchers to build models that are too complex for traditional statistical modelling. My implementation features multiple inference algorithms, including Lightweight Metropolis-Hastings and other Monte Carlo methods.
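
To give an idea of what an inference algorithm does with a program like coin_flip, here is a deliberately simplified Metropolis-Hastings sampler in the same spirit. It is an illustrative toy, not my actual Lightweight Metropolis-Hastings implementation: it hard-codes the model's single latent variable p and proposes new values directly from the prior.

import numpy as np

def log_likelihood(p, data):
    # Sum of the log-probabilities that coin_flip passes to observef
    return sum(np.log(p) if d else np.log(1 - p) for d in data)

def metropolis_hastings(data, n=10000, seed=0):
    rng = np.random.default_rng(seed)
    samples = []
    p = rng.uniform(0, 1)              # initial value drawn from the uniform prior
    ll = log_likelihood(p, data)
    for _ in range(n):
        p_new = rng.uniform(0, 1)      # independent proposal from the prior
        ll_new = log_likelihood(p_new, data)
        # Accept with probability min(1, likelihood ratio); prior and proposal cancel
        if np.log(rng.uniform()) < ll_new - ll:
            p, ll = p_new, ll_new
        samples.append(p)
    return samples

data = [True, True, True, False, True, True]
print(np.mean(metropolis_hastings(data)))  # posterior mean of the coin's bias, around 0.75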

This work was part of an exam. The university does not allow me to share the exam paper, but I can share my implementation.

micro:bit Parties Library

⚒️
C++, TypeScript
🏗️
Oxford University, Micro:bit Foundation, Group Project
📅
2019

During my undergrad, I worked with the BBC Micro:bit Foundation to extend the capabilities of their educational programming platform. The micro:bit is a small computer that can be used to teach children how to code. It looks like this:

microbit

We worked on a library we called Micro:bit Parties for creating multiplayer games. Because the target audience was kids learning how to code, the library had to be super simple to use. Kids could use Microsoft MakeCode or TypeScript to build games, like so:

microbit parties

This is a 'hot potato' game where the image of a duck on the micro:bit screen can be passed on to someone else by shaking the device.

Our goal was to make games like these easy to build. To achieve this, we needed to keep a shared list of all micro:bits in a room and enable them to send messages to each other. Our custom lightweight networking protocol sends out a 'heartbeat' on a radio frequency. These heartbeats are used to keep a list of all currently visible micro:bits. Unseen messages are forwarded so they can propagate through the network. This is a serverless synchronization protocol that requires no setup from the users at all. To get an idea of how messages are sent, here is the function that sends a heartbeat:

void sendHeartbeat(){
    if (radioEnable() != MICROBIT_OK) return;
    ownMessageId++;

    // Assemble the packet prefix: type, running message id, sender address, hop count
    uint8_t buf[PREFIX_LENGTH + MAX_PAYLOAD_LENGTH];
    Prefix prefix;
    prefix.type         = PacketType::HEARTBEAT;
    prefix.messageId    = ownMessageId;
    prefix.origAddress  = microbit_serial_number(); // unique per device
    prefix.destAddress  = 0;                        // heartbeats are broadcast to everyone
    prefix.hopCount     = 1;

    setPacketPrefix(buf, prefix);

    // The payload carries this device's current status value
    memcpy(buf + PREFIX_LENGTH, &status, sizeof(int));

    // Broadcast the heartbeat over the micro:bit's radio
    uBit.radio.datagram.send(buf, PREFIX_LENGTH + sizeof(int));
}

For the full implementation, see the parties.cpp file in the project repo.

The Parties Library received an award for the most production-ready student project. The Micro:bit Foundation now uses this library to teach kids worldwide how to code.

vvvv - a visual programming language

⚒️
C#, vvvv
🏗️
vvvv Group Berlin
📅
2014 - 2018

I worked for the developers of the visual programming language vvvv. A program in vvvv looks like this:

vvvv

vvvv is used for large-scale interactive media installations. This is how I learnt to code before going to university. Over the years, I worked on different parts of this comprehensive software product, which combines a programming language, an integrated development environment, and a compiler.

Later, I focussed on teaching vvvv in courses and at universities. This was one of the promotional videos for my courses:

Promotional video for vvvv courses 2017

Among the courses I taught were a workshop at the RCA in London and a month-long course for master's students at the UCL Interactive Architecture Lab.

Video: student projects I tutored at the UCL Interactive Architecture Lab

vvvv is where I learnt to understand programming languages as tools for thinking and creative expression, and it is what later brought me to study Computer Science and Philosophy at Oxford.

Visual:Drumset

⚒️
C++, openFrameworks
🏗️
Ars Electronica
📅
2013

Alongside high school, I learnt how to code and built an interactive drumset visualisation. I used it to make a music video together with the DJ project Davidecks & Drums:

I used piezo sensors to pick up vibrations from individual drums, and an Arduino board to process these signals.

visual:Drumset Piezo Sensors
Piezo vibration sensors attached to the drums

I built a custom projection mapping tool in openFrameworks to project onto 3D geometry with multiple projectors. Here is a video of the first tests I did:

This project won the Ars Electronica Golden Nica, referred to as the 'Oscars of Media Art', in the u19 category.