soph

Breaking the Central Limit Theorem

2019-06-14T13:46:00+00:00

AI is broken

2019-06-14T13:46:00+00:00

Below is the video from my talk at Disruption Labs in Berlin. Check out a bunch of amazing speakers on their channel. In the video below, Adam Harvey starts out and then I go on around 40 minutes in.

event
slides

Slash dataset: A toy dataset that stymies most non-convolutional models.

2019-05-28T17:38:00+00:00

TLDR

I’m releasing a dataset, the “Slash dataset”, that is easy for convolutional neural nets to learn but difficult for most other models.

It will live at soph.info/data/slash_data.npz and will be licensed under a Creative Commons Attribution 4.0 International License.

Code for generating the dataset and replicating the analysis is at the bottom of this post. Cheers!

Background

The convolution is an extremely important mathematical operation. It’s frequently used in digital signal processing, including compression techniques for images and audio, and it’s the namesake and heart of the “Convolutional Neural Network”.

But what actually is a convolution? There are a bunch of mesmerizing gifs like the one below, and there are some good explanations out there, but I think there’s room to improve.

I’m in the middle of a project to explain exactly what convolution does and, specifically, why it’s so useful for machine learning. As part of that effort, I’ve put together a toy dataset designed to be as simple as possible while also being difficult for all models except for convolutional ones.

Data

The dataset includes 1,000 images, each of which includes a slash. It is divided into two classes according to whether the slash is downward facing (like the backslash character “\”) or upward facing (like the forward slash “/”). Below is a random selection along with class labels.

Each image is 30x30 grayscale pixels. I’ve added a little bit of noise to the images to make things interesting.
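For a feel of the data, here is a minimal sketch of how images like these could be generated with NumPy. It is not the generator used for the released dataset: the stroke length, the random placement, the noise level, and the function names `make_slash_image` and `make_dataset` are all assumptions chosen for illustration.

```python
import numpy as np

SIZE = 30  # images are 30x30 grayscale, as described above


def make_slash_image(rng, downward, length=8, noise=0.1):
    """Draw one diagonal stroke at a random position, then add noise.

    `length` and `noise` are illustrative guesses, not the released settings.
    """
    img = np.zeros((SIZE, SIZE))
    # Top-left corner of the square that will contain the stroke.
    row = rng.integers(0, SIZE - length + 1)
    col = rng.integers(0, SIZE - length + 1)
    for i in range(length):
        if downward:   # like "\": column increases as row increases
            img[row + i, col + i] = 1.0
        else:          # like "/": column decreases as row increases
            img[row + i, col + length - 1 - i] = 1.0
    img += rng.normal(0.0, noise, size=img.shape)
    return np.clip(img, 0.0, 1.0)


def make_dataset(n=1000, seed=0):
    """Return (images, labels); label 1 means downward facing."""
    rng = np.random.default_rng(seed)
    labels = rng.integers(0, 2, size=n)
    images = np.stack([make_slash_image(rng, bool(y)) for y in labels])
    return images, labels


X, y = make_dataset()
print(X.shape, y.mean())
```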

Results

Model	Random Forest	SVM - Linear	SVM - RBF kernel	SVM - Polynomial kernel	Logistic Regression	K Nearest Neighbors	KNN, k=1	Vanilla CNN	Custom CNN
Trainable params	9183	498600	675000	675000	901	675000	675000	1041	80
Train score	1	0.994667	0.838667	0.84	0.98	0.842667	1	1	1
Test score	0.472	0.5	0.532	0.532	0.54	0.604	0.908	1	1

Hopefully by now, you’re looking at this dataset and thinking “gee, that doesn’t look too difficult”. This would be a boring task for any human and it might be difficult at first to see why it is so challenging for most models.

I’m going to save a detailed explanation of exactly why that is for later. For now, I want to share my results: I tested out several off-the-shelf models as well as two convolutional models.

Of the off-the-shelf models, I didn’t waste my time tuning most of them, as I feel quite confident that even when tuned they would do poorly. (I’d love to be proved wrong on that 😉.) The one that I did tune was K-Nearest Neighbors, so you’ll see a result for one with the default parameters as well as one with the best performing parameters ($k=1$).

All of the off-the-shelf models with default settings perform at or close enough to chance (50%) to dismiss them. KNN does pretty well when we tune it (about 91% test accuracy), but this performance looks less impressive when we note that KNN is memorizing the entire training set and then comparing new examples to memorized ones. That strategy is not one that we would expect to hold up in the real world, where data is much bigger than a few hundred B&W thumbnail images. This is why I’ve listed the trainable parameters for each model.
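Two of the parameter counts in the table can be sanity-checked with a little arithmetic. The sketch below assumes a 750/250 train/test split, which is inferred from the reported scores (for example, 0.472 × 250 = 118) rather than stated anywhere above.

```python
# Sanity check of two "trainable params" entries in the results table.
# ASSUMPTION: 750 training images (a 750/250 split is inferred, not stated).
PIXELS = 30 * 30          # each image is 30x30 grayscale
N_TRAIN = 750             # assumed training-set size

# Logistic regression: one weight per pixel plus a single bias term.
logreg_params = PIXELS + 1

# KNN "learns" nothing; it stores every training pixel for later comparison.
knn_stored_values = N_TRAIN * PIXELS

print(logreg_params)      # 901
print(knn_stored_values)  # 675000
```

Both numbers match the table, which puts the 80-parameter Custom CNN in perspective.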

On the other hand, the ConvNets both get perfect scores on both train and test sets. The Vanilla CNN uses the same cookie-cutter structure that I would apply as a first attempt on just about any image data (two rounds of Convolution → Batch Normalization → Pooling). This Vanilla CNN is not tuned to the particular dataset. The Custom CNN is one I built and hand-tuned to demonstrate how ConvNets can really punch above their weight in terms of parameters. In testing, this architecture was pretty fragile (tweaking the numbers slightly is likely to break it), but it’s mostly there to make a point.

Methods

All code used to generate this dataset is included below. Be on the lookout for more Convolution content shortly!

An opinionated guide to imbalanced classes

2019-05-07T16:32:00+00:00

Building Generative Adversarial Networks In Tensorflow and Keras

2019-05-01T00:01:00+00:00

How to make a gif in Colab

2019-04-12T10:20:00+00:00

I wrote a little demo of how to make a gif in Colab. You can check it out below or you can open it and run it yourself by clicking on the “Open in Colab” button at the top.

Neat papers in ML and DS: Dec 2018

2018-11-08T11:35:00+00:00

NeurIPS

Performance of different model types on MNIST by year

2018-11-08T11:35:00+00:00

Today I was trying to answer the question of why it seems like so much attention was given to support vector machines in the past. I had assumed that before the Deep Learning renaissance in 2006, SVMs were the dominant model because they outperformed Deep Learning models of the time. But if that was true, I wanted to see how it changed over time and how SVMs and DL models compared to less practical models like KNN.

To investigate this, I went to a table of top performances on the MNIST dataset maintained by Yann LeCun. This table is helpfully organized by model type and year.

I made a bunch of hand modifications (which you can find here) to allow me to plot the performance of different model types. To my surprise, SVMs were actually never dominant. The reasons SVMs were favored over Deep Learning turn out to be much more subtle and I’ll save them for later.

How to clean Apple’s butterfly keys with a metrocard

2018-09-21T18:46:00+00:00

In early 2015, Apple debuted their butterfly keyboard on their new “macbook”. Since then Apple has migrated this “feature” to their macbook pro line while several have voiced loud complaints about the shallow, fragile keyboard. Apple has since admitted that there’s a problem and offered to repair keyboards with sticky or unresponsive keys.

But turning your computer in for a repair is tedious and likely requires parting with your computer for a week or more. Also, perhaps you are like me and have already extensively repaired your unrepairable computer and violated the warranty in a number of ways. In these cases, it might make more sense to repair your keyboard on your own.

I’ve been using this dumb keyboard for 3.5 years now (what can I say, I love the romance of having the smallest possible computer that can do the things I need). I’ve cleaned these keys more times than I can count and I’ll document my process here. I know this process works for gen 1 and gen 2 of the butterfly keyboard and I’m guessing it works for the rest.

What you’ll need

A metrocard or similar flimsy card.
A q-tip, cloth, or paper towel to clean out the keyhole.

Step 1: Remove the key cap and switch

The back of the 'U' key cap and butterfly switch. I've marked the 4 points where the key cap connects to the switch. Sorry my potato phone stinks at macros!

Before you do anything, make sure you understand how the key cap and switch connect to each other. The key cap is what I’m calling the black piece with the key printed on it. The switch is what I’m calling the gray-white mechanism inside the key cap that flaps its wings like a butterfly.

There are four points that connect the key cap and switch: two clips on the top and two hooks on the bottom.

When removing the key cap from the keyboard, it is very important that you only lift from the top, where the clips are. The hooks should only be disconnected after the clips have been. If you get this wrong, you can break the hooks on the bottom of the key cap or the pins on the bottom of the switch. Trust me, I’ve done this!

Now, first thing you’ll do is insert one of the corners of your key cap removal tool (metrocard) under the top of the key and pry it up. (video of this below) Sometimes you’ll get both the key cap and switch together, sometimes the key cap will come loose first. The switch is held in by four little pins on the inner rim and you can easily remove it with a fingernail or your key cap removal tool.

Step 2: Clean up your mess

Remove whatever it is you got under there. It can be really small! Take your time here. I’ve typically used a damp cloth to do this but I also have a high tolerance for risk so you do you.

Step 3: Replace the switch and key cap

For this step, a similar word of caution to step 1. It is very important that you don’t squish the hooks in the key cap onto the pins of the switch.

If you haven’t already, remove the switch from the key cap.
Insert the switch itself (so, not the key cap) into the key hole. The side of the switch that should face the computer bulges out a bit, while the top should be relatively flat. You’ll press down on the switch until the pins click into the plastic brace.
Start with the bottom of the key cap. Slide the hooks over the pins on the bottom of the switch and then lay the key cap on top of the switch.
You should be able to gently press on the key and hear the top two clips engage.

I have a video of these steps below.

Bootcamp Guide for Everyone Else

2018-08-13T10:21:00+00:00

note 8/15: this is still a work in progress so check back as I fill it in

I’ve been a bootcamp instructor at Metis for about a year now¹. It’s been an incredible, rewarding experience and I plan to stick around for a while. In that time, I’ve been asked a lot about bootcamps. People want to know how bootcamps work, if bootcamps are right for them, holy cow are they really $15,000? I’m putting this guide together to answer those questions.

This guide is for anyone who is interested in moving to a technical career, like coding, data science, or similar, and who is able to participate in an immersive (roughly 3 months of full time work) bootcamp. I especially intend this for people who aren’t already familiar with the tech/coding world and who don’t have ~$15,000 lying around to drop on a bootcamp.

My TLDR:

Bootcamps are for real. They can be a legitimately great way to start a good career that too few people have access to.
First, make sure joining a bootcamp is right for you.
- Are you likely to finish and likely to be fully committed to the job search after finishing? Be very honest with yourself about this.
- A great way to figure this out is to complete a bootcamp prep program (many are free). Bootcamp prep is also a great way to figure out which type of bootcamp (like coding vs data science) is right for you.
Don’t put too much pressure on getting in and plan to get rejected from your first bootcamp. Bootcamps can be selective and admissions necessarily involves chance. You’ll learn each time you interview. You might have a favorite bootcamp but there are many good ones. Most bootcamps allow you to interview again later if you are rejected.
Don’t let cost be an issue.
- Any decent bootcamp will work with you on payment if you are admitted. They’ll walk you through alternative payment options.
- If cost is a problem, consider bootcamps with alternative payment options like deferred tuition (you aren’t required to make payments until you get a job) or income sharing (your payment comes as a percentage of your salary once you get hired). Beware that there’s typically a lot of fine print for both of these.
- Bootcamp loans can also be a great option.

What is a bootcamp?

I’m going to talk specifically about immersive tech/coding bootcamps. Still, it’s a big category. Bootcamps are also called “accelerated learning programs”; they teach technical skills primarily to people who are new. Most bootcamps are roughly 12 weeks long and in that time they are intended to prepare someone to get their first job in an entirely new field.

Let’s stop here and just appreciate how tall of an order that is. If someone wants to prepare for an entirely new career, they might go to a 4 year college or get a master’s degree (2+ years with low graduation rates). These bootcamps try to accomplish the same thing in 3 months for a much, much lower cost.

That’s an extremely tall order, but for those who get into and complete bootcamps, it tends to work: Course Report found that in 2017 “80% of graduates surveyed say they’ve been employed in a job requiring the technical skills learned at bootcamp, with an average salary increase of 50.5% or $23,724. The average starting salary of a bootcamp grad is $70,698.” One reason this is possible is the immersive bootcamp model, where students are present, full-time,

¹ I work for Metis and I’m super proud of what we do, but this guide isn’t an ad for my company. I’m honestly not going to mention Metis very often. But if you really want to know, I think that we’re the best in the business for people interested in Data Science and for whom our immersive bootcamp model and cost structure works.

How to: Backup iMessages using OSX and Google Drive

2018-03-01T12:57:45+00:00

This is pretty short but I wanted to leave it here for posterity. The process is to just add the archive folder to Google Drive. You don’t have to move it to your Google Drive folder or hard link, as other tutorials have suggested, just add it to the list of extra folders synced by Google Drive.

First, find the folder iMessages uses to store all its data. Mine is in /Users/soph/Library/Messages. If you aren’t able to see the Library folder, use cmd-shift-. to make hidden items viewable.

Select the Google Backup and Sync icon in your menu bar at the top of your screen. Go to …->Preferences->Choose Folder and select that folder.

And, there, you’re done!

A Case for Diversity

2018-02-21T08:26:09+00:00

Author’s note: This is a pre-publication draft. Parts of this material may appear in subsequent publications. I’m sharing in this format to enable transparency and dialogue but I do not wish to misrepresent the relationship this draft may have with any later work.

There’s a lot of talk of diversity, especially discussion of the reasons companies and other groups should prioritize diversity.

Diversity for fairness

One important idea is a general sense of fairness. Careers that pay well should be distributed roughly proportionately among genders, ethnic groups, and other ways of categorizing people. We have the sense that there’s nothing about being male that makes someone a better programmer, or about being white that makes someone a better hedge fund manager. Then fairness suggests to many that any unevenness in who gets access to these desirable careers reflects a lack of fairness. When a tech company is primarily located in a country where black people make up 12% of the population, but that company’s tech workers are only 1% black (as is true of Alphabet/Google, Amazon, and other tech companies) then something unfair is going on. Sure, there is some buck-passing here. There are problems at many levels that create unfairness: geographic segregation, school systems, gaps in generational wealth accumulation, gaps in laws. But the culture of many of these tech companies is one that famously fights (and wins!) against structural hurdles like these. If these companies can move fast and break things when it comes to new features and their bottom line, we should expect the same when it comes to fairness.

Diversity for social tech

There are other reasons companies prioritize diversity, like legal ones, that I won’t get into. Instead I want to focus on something I would like to hear more often and more clearly from the technology world. Independent of everything else, diversity will become essential to the success of any tech company. This is because the products made by tech companies are increasingly cultural ones, embedded in our everyday social lives, and producing social products that are successful requires engineers/designers/etc. who have a social understanding of those who use the products.

Let’s not forget when Google Maps pronounced Malcolm X Street “Malcolm the Tenth” as if he were a British monarch[Baratunde Thurston]; that Facebook’s name policy locks the accounts of trans and gender non-conforming users, including a Facebook employee[Zoe Cat]; when Google Images automatically labeled black people gorillas[Jacky Alciné]; the long history of racial and gender bias in face recognition[Joy Buolamwini: Gender Shades]; the many cases of technology ignoring dark skin [like soap dispensers and heart rate monitors]. These are just a few examples but they illustrate the point that we are interacting with our technology in social ways. Each of the above examples would have been trivial to catch for an engineer, project manager, or other employee from an underrepresented group.

What I’m arguing here is that it’s in tech companies’ self-interest to seriously prioritize diversity. Even a company that is unconvinced that diversity is important because of fairness or legal reasons should be concerned about how they plan to design products that people interact with in social contexts. These companies will want to ensure that their tech workforce is diverse at every level, from the people writing the code to those deciding which new features to develop and beyond, because each of these decisions will become more significantly social as time goes on.

Diversity to stop algorithmic violence

One specific concern involves the idea of algorithmic violence[Mimi Onuoha], a term coined by Mimi Onuoha that refers to the ways that automated decision-making does real harm to people. In college I studied Computer Engineering and took an ethics course along with civil and mechanical engineers where we discussed the ethical challenges involved in designing walkways and other physical things. What was missing then, and seems to largely be missing now, is a serious look at how decisions made in technology companies, including by software developers and data scientists, lead to real consequences.

Some algorithmic violence is quite easy to see, such as when Palantir (one of the most valuable data companies and one that was recently sued for racial discrimination[Vanity Fair]) builds an enormous data machine for the targeting and tracking of immigrants for deportation[The Intercept]. In most cases, though, algorithmic violence is less than obvious. Guillaume Chaslot writes[Medium] that YouTube’s massive recommendation engine, one that he helped design, is tasked with maximizing users’ viewing time. The recommendation engine does this in a single-minded way that ignores the effects that the kind of content one cannot turn away from (like disturbing videos targeted at kids[Medium] and conspiracy theories[Vanity Fair]) might have on its viewers and even elections[Chaslot @ Medium]. In ‘Automating Inequality’[Strand] Virginia Eubanks provides an extensive catalogue of ways that seemingly ‘objective’ automated systems harm vulnerable people, whether they were set up to do so intentionally or not.

Algorithmic violence is a problem, like information security and global warming, that is virtually guaranteed to become more important with time. Because the effects of algorithmic violence are often hidden except to those who are affected by it, companies without a diverse workforce will be at a disadvantage when trying to recognize and prevent such violence. So prioritizing diversity is one of the many steps, like a Hippocratic oath for technology[Marie], we need to take to counter algorithmic violence.

Soph’s VM tweaks

2018-02-14T11:11:48+00:00

What’s this

(hi, this is a test)

This is a scratch pad for me to use when I set up new virtual machines. I’m sharing it in case anyone else is interested.

fish is by far my favorite shell

Install it and configure it following this tutorial

omf is a handy package manager for fish.

I use the foreign environment interface to load my ~/.bash_profile and similar. (Thank you)

You can install themes with it and my fav is probably sushi.

If you’re using anaconda, you’ll have to remember to use activate instead of source activate (details).

tmux

I love tmux! I p much always install it first thing. To make it more useful, I add options by creating the following file at ~/.tmux.conf.

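# Enable mouse support (scrolling, pane selection, resizing)
set-option -g mouse on
# Start fish instead of the default shell in new windows and panes
set-option -g default-command /usr/bin/fish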

other tools

install dtrx for common sense file extraction
- sudo apt install dtrx
glances
gpustat

Access Jupyter from your server

I typically set up a jupyter server mostly according to Chris Albon’s instructions here.

Start Jupyter on restart

Previously, my workflow was something like this: go to console in browser and start ec2, go to terminal and mosh into ec2, start jupyter notebook, go back to browser and use jupyter. That’s an annoying number of steps. I like to simplify this so that on every device restart, my machine automagically starts a jupyter notebook server and glances (which I use to monitor the machine’s resource usage).

Here’s the solution I found, which is a modification of this. Modify your /etc/rc.local to include the following above the exit 0 line:

export PATH="$PATH:/home/ubuntu/miniconda3/bin"
nohup jupyter notebook --notebook-dir=/home/ubuntu/ &
nohup glances -w &

exit 0

Paperspace how-to (cheap cloud GPU)

2018-02-01T16:11:40+00:00

The use of GPUs has become quite important for applications of Deep Learning. The landscape of this hardware has lately become quite interesting. The cryptocurrency fad has led to GPUs becoming expensive and/or unavailable. One solution to this problem is to rent a computer with a GPU from Google, Amazon, Microsoft, and others. Google even recently lowered their GPU prices quite substantially. I’ll save doing a full comparison of each platform for later, but what I’ll go over today is getting started with a relatively new service, Paperspace, based in my home of Brooklyn <3.

The virtue of Paperspace is that their entry-level GPU offering costs $0.40/hr (compared to Google’s new $0.45/hr and Amazon’s $0.90/hr) but benchmarks ahead of Amazon and other offers based on the Tesla K80.

Below, I’ll show you how to get up and running quickly and how to set up cost-saving measures, like auto-shutdown.

Steps to get running

First, as with any new machine, update the current packages with

sudo apt update
sudo apt upgrade

(Note: I had to add the --fix-missing flag to sudo apt update)

Add a new user according to these instructions.

open ports with ufw

By default, Paperspace has a very strict firewall (this is a good thing). We’re going to want to get to our jupyter notebooks, though, so we need to open up some ports. You can do that with these instructions.

My version is pretty unsafe (it allows access from any IP) so feel free to check that link for info on restricting the IP that can access your jupyter port.

sudo ufw allow 8888
sudo ufw allow 60000:61000/udp

set up jupyter

fix autoshutdown with ssh

https://paperspace.zendesk.com/hc/en-us/articles/115002807447-How-do-I-use-Auto-Shutdown-on-my-Linux-machine-when-connecting-through-SSH-

set up ssh keys

https://www.digitalocean.com/community/tutorials/how-to-set-up-ssh-keys--2

https://apple.stackexchange.com/questions/48502/how-can-i-permanently-add-my-ssh-private-key-to-keychain-so-it-is-automatically

Install cuda 9.1

You can use the official guide but I prefer these instructions.

Install cudnn

http://docs.nvidia.com/deeplearning/sdk/cudnn-install/index.html

Deep Learning Tips

2017-12-04T13:15:11+00:00

I often find myself in roughly this situation:

I’ve done a lot of work to pre-process data, researched different DL architectures and selected one (or more), modified it so that it works well on my data. And now I’ve got a model that does well, but not quite as well as I want it to.

Below I’ll outline steps that I find very useful in this situation. Of course, more detail about how to get to that point is for another post. Also, I’m using Keras so many of these are specific to that tool, but they could absolutely be applied to other DL packages.

Clean up training code and improve logging.

My typical workflow when I’m building or tweaking a model by hand is to run Keras in a Jupyter notebook. This works fine if the entire pipeline runs in a few minutes but doesn’t work if I need to close my laptop while the pipeline is running. Jupyter often has trouble gracefully reconnecting and then I lose all the verbose info.

Porting my code into a self-contained python script allows me to connect to a machine over mosh and tmux, run the script, and then forget about it. This is handy on its own but it’ll be extremely handy when we get to the later tips.

The tools I use for this are:

CSVLogger takes the output from verbose and stores it in a csv file for later. If you’re using tmux, it should be preserving your terminal log (and the verbose output) but csv puts all of that history data in a format where you can easily use it.

TensorBoard is super handy if you want to be able to monitor the progress of a long-running model from somewhere other than your terminal. I can access this from my iPad and it looks great!

ModelCheckpoint saves your model every so often (you set the frequency as a parameter). This is nice on its own because it allows you to access a model that was trained in a script from anywhere (including my preferred tinkering environment, Jupyter). Even more than that, if you use save_best_only=True, then the script is automatically saving space by only saving the best performing model (you should use val_loss or some other validation metric with this option).
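Here is a minimal sketch of wiring these three callbacks together. The directory layout, the file names, and the helper name `make_logging_callbacks` are hypothetical, and the import path assumes Keras as bundled with TensorFlow; a standalone Keras install would import `keras` directly.

```python
import os


def make_logging_callbacks(run_dir):
    """Build the three logging callbacks described above.

    `run_dir` and the file names inside it are hypothetical choices.
    """
    # Imported here so this file can be imported on a machine without TensorFlow.
    from tensorflow import keras

    os.makedirs(run_dir, exist_ok=True)
    return [
        # Per-epoch metrics written to a CSV file.
        keras.callbacks.CSVLogger(os.path.join(run_dir, "training_log.csv")),
        # Event files that the TensorBoard web UI can read.
        keras.callbacks.TensorBoard(log_dir=os.path.join(run_dir, "tensorboard")),
        # Keep only the best model so far, judged on validation loss.
        # Recent Keras versions expect a .keras or .h5 suffix here.
        keras.callbacks.ModelCheckpoint(
            filepath=os.path.join(run_dir, "best_model.keras"),
            monitor="val_loss",
            save_best_only=True,
        ),
    ]


# Usage, assuming `model`, `x_train`, and `y_train` already exist:
# model.fit(x_train, y_train, validation_split=0.2, epochs=50,
#           callbacks=make_logging_callbacks("runs/experiment_1"))
```

Monitoring `val_loss` only works if validation data is supplied to `fit`, which is why the usage comment passes `validation_split`.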

Learning Rate Schedule

keras.callbacks.LearningRateScheduler
keras.callbacks.ReduceLROnPlateau: This is my preference. I use this in conjunction with EarlyStopping and ModelCheckpoint all the time.
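LearningRateScheduler needs a function that maps an epoch to a learning rate. Here is a minimal sketch of a step-decay schedule. The halving factor and the ten-epoch interval are arbitrary illustrations, and the two-argument `(epoch, lr)` signature assumes a reasonably recent Keras (older releases passed only the epoch).

```python
def step_decay(epoch, lr, drop=0.5, every=10):
    """Halve the learning rate every `every` epochs; otherwise leave it alone."""
    if epoch > 0 and epoch % every == 0:
        return float(lr * drop)
    return float(lr)


# Simulate 25 epochs without Keras to see what the schedule does.
lr = 0.01
history = []
for epoch in range(25):
    lr = step_decay(epoch, lr)
    history.append(lr)

print(history[0], history[10], history[20])  # 0.01 0.005 0.0025

# With Keras this would be passed as:
# keras.callbacks.LearningRateScheduler(step_decay)
```

ReduceLROnPlateau differs in that it reacts to a monitored metric instead of following a fixed timetable, so it needs no schedule function at all.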

Use noise and other transformations to enlarge your datasets
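As a minimal sketch of the idea with NumPy: keep the originals and append a noisy copy and a mirrored copy of each image. The noise level, the choice of a left-to-right flip, and the function name `augment` are assumptions. Whether a transformation preserves the label depends on the data (a horizontal flip would swap the two classes of the Slash dataset, for instance).

```python
import numpy as np


def augment(images, labels, noise_std=0.05, seed=0):
    """Return the originals plus a noisy copy and a mirrored copy of each image.

    Expects `images` with shape (n, height, width) and values in [0, 1].
    """
    rng = np.random.default_rng(seed)
    noisy = np.clip(images + rng.normal(0.0, noise_std, images.shape), 0.0, 1.0)
    flipped = images[:, :, ::-1]  # mirror left-to-right
    bigger_x = np.concatenate([images, noisy, flipped], axis=0)
    bigger_y = np.concatenate([labels, labels, labels], axis=0)
    return bigger_x, bigger_y


# Random stand-in data; real images would go here.
x = np.random.default_rng(1).random((100, 28, 28))
y = np.arange(100) % 10
x_aug, y_aug = augment(x, y)
print(x_aug.shape, y_aug.shape)  # (300, 28, 28) (300,)
```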

Optimize

There are many hyperparameters that can be optimized:

Learning rate
Dropout rate
Preprocessing steps
Different architectures (size and shape of layers as well as count) and parameters
Regularizers and parameters
Initializers and parameters
and more!

These create an enormous space of possible models to test. My recommendation is to find something that works and then from that functioning model determine plausible options for the above hyperparameters. Then throw what you have into hyperopt.

HyperOpt does this automatically. It’s poorly documented but fairly easy to use.
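The pattern that HyperOpt automates is: define a search space, define an objective that trains a model and returns a validation loss, then sample. The sketch below shows that pattern with plain random search from the standard library. It is a stand-in, not HyperOpt's API, and the objective is a made-up function in place of real training; the space and its ranges are hypothetical too.

```python
import math
import random

# Hypothetical search space: each entry draws one hyperparameter value.
SPACE = {
    "learning_rate": lambda rng: 10 ** rng.uniform(-5, -1),  # log-uniform
    "dropout": lambda rng: rng.uniform(0.0, 0.6),
    "hidden_units": lambda rng: rng.choice([32, 64, 128, 256]),
}


def objective(params):
    """Stand-in for 'train the model and return validation loss'.

    A real objective would build and fit a network with `params`.
    """
    return (
        (math.log10(params["learning_rate"]) + 3) ** 2
        + (params["dropout"] - 0.3) ** 2
        + abs(params["hidden_units"] - 128) / 128
    )


def random_search(n_trials=200, seed=0):
    """Sample the space `n_trials` times and keep the lowest-loss setting."""
    rng = random.Random(seed)
    best_params, best_loss = None, float("inf")
    for _ in range(n_trials):
        params = {name: draw(rng) for name, draw in SPACE.items()}
        loss = objective(params)
        if loss < best_loss:
            best_params, best_loss = params, loss
    return best_params, best_loss


best, loss = random_search()
print(best, loss)
```

HyperOpt's tree-structured Parzen estimator picks new points based on earlier results instead of sampling blindly, so it usually needs fewer trials than this.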

Deep learning from scratch with python

2017-10-18T10:09:10+00:00

Last week I presented at the Data Science Study Group on a project of mine where I built a deep learning platform from scratch in python.

For reference, here’s my code and slides.

First, my project drew primarily from two sets of sources, without which I never would have completed this project. First, there are many examples of folks doing this online. Here’s an incomplete list in python:

While all of these are useful, Sabinasz’s was what I based my project on because he implements a system that builds a computational graph and includes a true backpropagation algorithm. The others I saw do this implicitly by calculating the gradients operation-by-operation. That approach is fine for a single demo but I wanted something that mimicked the flexibility of tensorflow, allowing me to compare different network structures and activations without starting over each time.

In addition to these resources, I drew heavily from Deep Learning by Goodfellow, Bengio, and Courville. I’m certain that the other examples I looked toward used this book as well.

While I started with Sabinasz’s code, I made a few modifications and improvements including:

Add graph visualization with python Graphviz
Remove the use of globals for the computational graph
Simplify backprop algorithm by adding gradient calculations to the operation classes
Add a Relu activation function
Tweak the visualizations
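To make the idea of putting gradient calculations inside the operation classes concrete, here is a minimal sketch. It is not the project's code or Sabinasz's: the class names, the scalar-only values, and the `run` helper are assumptions made for illustration.

```python
class Node:
    """A value in the graph; `grad` accumulates d(output)/d(this node)."""

    def __init__(self, inputs=()):
        self.inputs = list(inputs)
        self.value = None
        self.grad = 0.0


class Variable(Node):
    def __init__(self, value):
        super().__init__()
        self.value = value


class Add(Node):
    def forward(self):
        a, b = self.inputs
        self.value = a.value + b.value

    def backward(self):
        # d(a + b)/da = 1 and d(a + b)/db = 1
        a, b = self.inputs
        a.grad += self.grad
        b.grad += self.grad


class Multiply(Node):
    def forward(self):
        a, b = self.inputs
        self.value = a.value * b.value

    def backward(self):
        # d(a * b)/da = b and d(a * b)/db = a
        a, b = self.inputs
        a.grad += self.grad * b.value
        b.grad += self.grad * a.value


class Relu(Node):
    def forward(self):
        (a,) = self.inputs
        self.value = max(0.0, a.value)

    def backward(self):
        # Gradient passes through only where the input was positive.
        (a,) = self.inputs
        a.grad += self.grad if a.value > 0 else 0.0


def run(operations):
    """Forward pass in order, then backward pass in reverse order.

    `operations` must be listed so that inputs come before their consumers.
    """
    for op in operations:
        op.forward()
    operations[-1].grad = 1.0
    for op in reversed(operations):
        op.backward()
    return operations[-1].value


# y = relu(w * x + b)
w, x, b = Variable(2.0), Variable(3.0), Variable(-1.0)
wx = Multiply([w, x])
z = Add([wx, b])
y = Relu([z])
print(run([wx, z, y]), w.grad, x.grad, b.grad)  # 5.0 3.0 2.0 1.0
```

Because each operation knows its own local derivative, adding a new activation means writing one class, and the backward pass itself never changes.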

Here’s the learning rate plotted along with the classification boundary for a relu network with 4 hidden nodes.

And here’s the computational graph. You can really see the benefit of tracking the graph and automating the backprop algorithm for a graph of this size.

What I still want to do

I want to write up a blog summarizing my talk and the process for creating this. I think it could be a very useful explanatory tool.
I have a strong feeling that some of the gradients in here are inaccurate.
- In many cases the network fails to learn for any learning rate schedule unless I give it a much higher capacity than it needs (e.g. 4+ hidden nodes in the XOR task).
- In simple cases, like separable data, the model should be able to get arbitrarily close to $J=0$ but fails to do so.
- The softmax gradient seems to differ from that found in other sources
I want to extend this model to larger datasets and deeper networks. Right now it runs into what I think are underflow errors in these cases but they should be possible to avoid.
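One common way to chase down suspect gradients like these is a finite-difference check: compare the analytic gradient against a numerical estimate. The sketch below does this for the well-known softmax plus cross-entropy gradient, which is the predicted probabilities minus the one-hot target. It is a generic illustration, not taken from the project's code; the helper names and the step size are assumptions.

```python
import math


def softmax(logits):
    m = max(logits)  # subtract the max to avoid overflow in exp
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, target):
    """Loss J = -log(softmax(logits)[target]) for one example."""
    return -math.log(softmax(logits)[target])


def analytic_grad(logits, target):
    """dJ/dlogits = softmax(logits) - one_hot(target)."""
    probs = softmax(logits)
    return [p - (1.0 if i == target else 0.0) for i, p in enumerate(probs)]


def numeric_grad(f, point, eps=1e-6):
    """Central differences: (f(x + eps) - f(x - eps)) / (2 * eps) per coordinate."""
    grads = []
    for i in range(len(point)):
        up = list(point)
        down = list(point)
        up[i] += eps
        down[i] -= eps
        grads.append((f(up) - f(down)) / (2 * eps))
    return grads


logits, target = [2.0, -1.0, 0.5], 0
exact = analytic_grad(logits, target)
approx = numeric_grad(lambda z: cross_entropy(z, target), logits)
max_error = max(abs(a - b) for a, b in zip(exact, approx))
print(max_error)  # a correct gradient gives a very small number here
```

Running the same comparison against each operation's backward step would show which gradient, if any, disagrees with its numerical estimate.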

Trans fam, update your documents!

2017-07-26T11:18:04+00:00

Trans friends, here’s a periodic reminder that it might be useful to update your documents (name, gender marker, id). I’ve recently done this. Here’s the quickest and cheapest path for US citizens:

1. Update your passport gender marker. For this you need proof of citizenship (like a birth certificate), a letter from your doctor, and a new passport photo. Use the template in the link below. The cost is $110 and this can be done in one or two days. (info here)
2. Update your name in court. This process differs state by state. Roughly, you’ll schedule a hearing, gather documents including a birth certificate, publish an announcement in the paper (trans folks are generally able to waive the publication requirement in NY), then get certified copies of the court order. (NYC, NY, NJ)
3. Update your passport name. Do this within a year of step one and it’s free and only requires the court order. (form here)
4. Everything else (except sometimes a birth certificate) can be done with an updated passport and possibly some additional documentation.

If you need help, I’d love to help or to find someone who can. If money is a problem, Trans Assistance Project, TLDEF, SRLP, and other places have provided that in the past. Let me know if you have any trouble.

Sophia

2016-04-02T00:00:00+00:00

One thing about being trans is that I get to come up with my own name. Plenty of non-trans people do this, sure, but to me it sort of feels like a rite of passage for trans folk. I’ve chosen “Sophia” for myself. I wanted to go through my thought process in choosing this name, both for myself and for others, as I found reading other people’s accounts of choosing their name to be quite helpful in choosing mine.

My checklist was something like the following (in approximate order of importance):

It had to feel right.
It had to feel feminine.
None of my close friends have that name.
It doesn’t stand out too much.

A common approach would be to feminize my current masculine name. So if my parents had named me Henry at birth, I might use Henrietta. This would have been fine, as I’m not personally bothered by deriving my female name from a male name, but it just doesn’t seem to fit. Another common option is to ask my parents what they would have named me if I had arrived as a girl to them. I think my mom has told me I would have been “Lindsey” which, for whatever reason, does not feel right. So back to the drawing board.

Sophia is a name that my wife and I have discussed for years as a potential name for a hypothetical daughter. But since we first started discussing the name, it has become enormously popular, rising from around 50th most popular girls name in the 90s, to around 10th in the 00s, to between the 1st and 3rd most popular now. We’ve always wanted to give our children names that fell farther down the list (because of course they will be unique little snowflakes!), so we’re not quite as excited about it as a name for a child now.

But for me, this could be a good thing. The popularity of the name means that “Sophia” is very recognizably feminine but, because it was more rare when I was young, I know essentially no one who has that name. It thus meets the last three criteria very well. It also has a weird personal connection because I’ve envisioned a daughter with that name, and I’m still not sure whether that is good or bad.

But outside of all of these practical considerations, Sophia has a certain importance to me. To explain, let’s fire up the flashback machine to 2004 or so. At that time in my life I had just begun to play tabletop role-playing games with friends. It’s the kind of thing where you build a character out of the clay of your imagination and a giant rulebook and then roll dice to see what that character is able to do. You do this with friends who have made characters in a similar way, and try to make the voices the characters would make, et cetera. For me, it presented this strange dilemma: I was quite curious about playing a female character but totally apprehensive of doing so in front of the teenage boys I played the game with. [I’d be interested to learn more about how these kinds of games impacted other trans people.]

Around that time I started playing what would become my favorite video game, The Knights of the Old Republic. It was a game for Xbox that followed the exact same rules as the pen and paper games I played with my friends. But this was single player on the Xbox. For the first time I got to choose the gender of my character, name her, and play her, all to myself. Sure I might have to explain to my brother or friends why my save game had a chick on it but that was so much less daunting than performing as a female in front of teenage boys.

Anyway, in this game I chose the character who looked most like me: dark hair, light skin, a woman. I named her Sophia after a character in the book I was reading at the time, Sophie Neveu in The Da Vinci Code. I’ve sort of grown out of The Da Vinci Code, though at the time my Dan Brown obsession was burning hot. And, yes, I realize Sophie’s character in The Da Vinci Code is problematic for several reasons, but let’s give 14-year-old me a break. The important thing is that it all just felt comfortable. And, props to Bioware, Sophia could do everything that any male character could, including pursuing plot-impacting romances with major female Non-Player-Characters. So adolescent me got to try on the skin of a lesbian jedi who always saved the day. And I fucking loved it.

For the next decade-plus, Sophia and I (or, I as Sophia) played so many role playing games: KOTOR 2, Mass Effect 1-3, Jade Empire, Dragon Age, … the list goes on (and certainly includes non-Bioware games, though those are my favorites). By the time I began to appreciate the fact that I am and have always been trans at age 27, I had already been trying on Sophia for nearly half of my life. After looking back and just realizing that fact, there is no way I can “choose” anything else. It doesn’t even feel like choosing at that point. I am Sophia and I have been for longer than even I realized.