Geoff Hulten, Author at Intelligent System

Top five career paths for data professionals

[ Video version here. ]

My first job was as a machine learning researcher in a product group at one of the big tech companies.

I remember thinking I needed a backup plan, because I was pretty sure the whole machine learning thing would turn out to be a fad, and then I’d have to figure out if I wanted to be a software engineer or a program manager…or if I’d just try to move back in with mom and dad.

A few years later I started managing other machine learning researchers. Every year at review time I encouraged them to keep up their non-machine learning skills. You know, because of the whole ‘machine learning might be a fad’ thing…

Boy was I wrong. Today machine learning is more than just a single stable career path. There are actually many different types of careers you can have in machine learning, depending on your interests and skills.

For example, my titles have included: researcher, applied researcher, program manager, applied research manager, principal software engineer, architect, machine learning scientist, and software engineering manager. And over the past 15 years I’ve worked with: data scientists, decision scientists, data analysts, machine learning engineers, data quality engineers, scientists, applied scientists, research scientists, and ranking engineers. And all of these people were doing similar data- and ML-focused work.

So what the heck is going on?

Well, the large scale use of data is relatively new and we’re inventing stuff as we go. Different organizations have different data cultures, and there are many strange evolutionary forces at work.

For example, one company I worked for got rid of the ‘software test engineer’ function. In the process, many software test engineers were given the option to change their titles to ‘data scientist’…and then to try to figure out what the heck a data scientist did for a living…

As you can imagine, this led to some chaos. And it strongly affected the data culture at the company. Because everyone who was a data scientist before this change did everything they could to avoid getting sucked into management chains that had no data experience, but huge data ambition.

The result? If you’re looking for a job at this company and search for ‘data science’ you might not end up with what you expect.

Another company I worked for thinks data skills will consolidate over time and eventually every software engineer will add ‘data and ML’ to their toolkit…and there won’t be any specialized data people…eventually. So if you search for ‘data scientist’ at this company you might find nothing – but the company would love to hire people with strong data science skills as ‘engineers’.

I’m sure there are hundreds of similar stories across the industry. And as people who learn data in one culture move between companies, things are diffusing and blending in crazy ways.

The point is: when you’re looking for a job in data or machine learning, keep an open mind – don’t get over-indexed on a particular title or a particular way of looking at the field.

So where to start? Here are five data professional job functions that I think will become stable over time. The names may vary, but the functions (hopefully) won’t. These are:

Machine Learning Researcher
Data Scientist
Machine Learning Scientist (Modeler)
Machine Learning Engineer
Machine Learning Architect (Program Manager)

Keep in mind that these are functions, not jobs. Most jobs will blend these to various degrees. I’ll go through and give a bit more detail.

Machine Learning Researcher

Machine Learning Researchers advance the state of human knowledge. They come up with theories about how the world works and they create experiments to test those theories. When they are right, the result is new algorithms or approaches that allow us to accomplish more than we thought we could.

And it is an exciting time to be a researcher in machine learning. Things are advancing crazy fast and small groups of people have accomplished incredible things.

Machine Learning researchers might be ‘applied’ in that they work in the context of a specific product, like a search engine or a self-driving car. But fundamentally research is not about building products. It is about understanding why a particular approach works in a particular setting, and creating knowledge that transcends any single feature or product.

Core skills for success with machine learning research include: scientific method, ability to deal with ambiguity, a high level of comfort with advanced math, good communication, the ability to advocate for crazy ideas, and just enough engineering skills to carry out some experiments.

To become a professional Machine Learning Researcher – like to get some company to sponsor you to sit around and try to advance human knowledge – you really need to publish papers at top scientific conferences. And the only practical way to learn how to do that is to get a PhD in Machine Learning.

There was a time when the only way to learn machine learning was to get a PhD; so there was a time when just about every professional machine learning practitioner had a PhD. But this is no longer necessary. In fact, unless you really want to advance human knowledge and write papers, a PhD is an inefficient way to become a data professional.

Data Scientist

Data Scientists find the stories in data and share them with others. They explore large data sets and answer questions, measure performance, track down problems, and find unexpected connections. A great data scientist is like a detective – they know how to interpret the clues they find in the data and track those clues to uncover valuable insights.

Most data scientists have a background in statistics or applied mathematics, coupled with enough programming skill to independently get at log data, process and clean it, query it, and automate repetitive tasks.

And data science requires a specific mentality. You have to like staring at data and dreaming up stories that explain it. But you also have to be meticulous and technical enough to prove or disprove your stories (before spouting off random theories and confusing everyone around you).

Core skills for success with data science include: a curious and flexible mind, deep statistical knowledge, familiarity with data querying languages, and moderate programming, probably in R or Python.

And what does this have to do with machine learning?

You might say that Data Science is about understanding what is happening in a big complicated system, while “machine learning” is about predicting what is going to happen in the future. There is a lot of overlap in tools and approaches. The differences are about the focus.

Machine Learning Scientist (Modeler)

Machine Learning Scientists build models. They find or create training data, they do feature engineering, they know what learning algorithm to use for any particular task, they tune model parameters, they measure, measure, measure, and they know how to evaluate the output of modeling runs and what changes to make based on these observations.

Modeling is an open ended, exploratory task. Kind of like constantly debugging a program written in a language you can’t understand. A machine learning scientist might spend weeks or months working on a single modeling task, making the model just a little bit better every single day.

Core skills for success at machine learning science include: a deep intuition for the core modeling algorithms and approaches, expertise in one or more domains (like NLP or computer vision), strong programming in a language like Python, a lot of comfort with data processing and querying, and a passion for measuring and debugging.

The best way to become a machine learning scientist is to get a degree in a field with a computational focus, like computer science, applied math, maybe statistics or even a hard science like physics or chemistry. This will give you the core statistics and computation skills. And then go to Kaggle.com and start entering their modeling competitions. Start doing well in Kaggle, and you’re well on your way to becoming a machine learning scientist.

Machine Learning Engineer

Machine Learning Engineers integrate machine learning into working systems to produce successful end-to-end experiences. They implement the runtimes where models execute, they build systems to deploy new models reliably, they connect model output into user experiences, and they build systems that collect telemetry about interactions between users and models, producing future training data.

Machine learning engineers create the systems that put guardrails around the machine learning modeling process, allowing creative exploration, but providing simple, reliable ways to take the resulting models and ingest them into the broader system.

Core skills for success in machine learning engineering start, of course, with a strong software engineering base. Beyond that, a good conceptual understanding of machine learning is key. Not the math behind the algorithms – that’s not super important to a machine learning engineer – but the pieces that make up a machine learning implementation, and where in the system they should live.

With a little study, any software engineer can get into machine learning engineering. You could take an online course, do a few Kaggle tutorials. But in my opinion, the best place to start is by reading this book. Building Intelligent Systems. Which I wrote. This book has all the stuff I wish I knew when I got started doing machine learning professionally.

Machine Learning Architect / Program Manager

Machine learning architects / program managers design ML-based solutions to real world problems. They know when machine learning is the right tool (and when it isn’t); they understand how to optimize a system end to end so that the machine learning is in position to shine; they know how to design around the mistakes that machine learning is guaranteed to make; and they know how to nurture a machine learning system through its lifecycle from a technical demo, to a viable product, to a world class solution.

When machine learning architects look at a problem they don’t ask: can my organization model that. They ask: should my organization model that. And if so, what’s the best approach to be efficient and reliable. You can learn more by watching this video or reading this blog post.

Core skills for a machine learning architect or program manager are strong software design skills, customer empathy, and a strong conceptual understanding of aspects of machine learning (but not the math and not the specific algorithms).

And the best way to get into machine learning architecture or program management? Work as a program manager or engineer for a while… and then read Building Intelligent Systems. I don’t know. I’m sorry. I guess I’m a bit biased.

Summary

It’s an exciting time to be a data professional. Data and machine learning are making the world a better place – and things are changing fast. Good luck. Stay safe!

ML Career	Core Activity	Core Skills
Machine Learning Researcher	Advance human knowledge	Scientific method, math, basic programming
Data Scientist	Stories from data	Statistics, data manipulation, communication
(Applied) Machine Learning Scientist	Build predictive models	Machine learning algorithms, domain specific feature engineering, basic programming
Machine Learning Engineer	Integrate machine learning into systems	Software engineering, conceptual machine learning
Machine Learning Architect / Program Manager	Design solutions that leverage machine learning	Software design skills, customer empathy, strong conceptual machine learning

Thinking like a Machine Learning Architect

I had a chance to give an overview of machine learning to a “leader” – think executive at a big company. This person is world class at what they do, but has no understanding of machine learning, and fears they are getting left behind.

And it was an eye-opening experience for me. Think about it. Machine learning is hot. The news is full of amazing stories of beyond-human-level successes with ML. What if a competitor gets there first? There’s a lot of pressure for a leader to make some good decisions.

And it’s not easy to know what to do. People throw around buzzwords, deep-boosted this, reinforcement BERT-ing of that, Bayesianized sigmoidization of neural activizationing, and blah-bla-die blah-da day. Ask a researcher, they’ll tell you how their latest technique is the linchpin to success and everyone else has been getting it wrong all along; ask someone just out of school and they’ll ask you where the training data is at; ask someone with a lot of experience, and, well, you can’t, because one of the big tech companies already hired them.

Where is the bridge between this potentially amazing tool, and a good decision about if and where to invest in it?

What’s needed is the ability to move beyond asking ‘can I model that?’ and start asking ‘should my organization model that?’ I call this type of thought: machine learning architecture.

And you don’t have to be super technical to be a great machine learning architect. Just think of it like any other investment a business could make. Should we buy a second delivery van? Well, is a van the right tool to solve the problems we’re having? Can we adapt our business to properly leverage it? What will it cost to run it month over month?

Basic questions, but they require understanding the strengths and weaknesses of machine learning in a kind of deep way and they require popping up and understanding the context. I’m going to go through three questions you can ask to start thinking like a machine learning architect:

Is machine learning the right way to solve the problem?
How do machine learning systems integrate with existing systems?
What is the cost to build and run the ML system over time?

Is machine learning the right way to solve the problem?

If you’re writing software for a bank to deal with withdrawals, you could use machine learning. You’ll have tons of training data, endless logs of transactions with info on: balance before, withdrawal amount, new balance. A simple regression problem…You could probably even get to like 99.5% accuracy if you worked at it hard enough…

Or you could write one line of code: newBalance = oldBalance - withdrawalAmount;

A bit of a silly example. But the point is that machine learning isn’t right for every problem.
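
To make the contrast concrete, here is a minimal Python sketch of the rule-based version. The function name `apply_withdrawal` and the non-negative check are assumptions made for illustration, not part of any real banking system. The point is only that an exact rule needs no training data and is never "about 99.5%" right.

```python
from decimal import Decimal


def apply_withdrawal(old_balance: Decimal, withdrawal_amount: Decimal) -> Decimal:
    """Return the new balance after a withdrawal (hypothetical helper)."""
    if withdrawal_amount < 0:
        raise ValueError("withdrawal amount must be non-negative")
    # The whole "model" is this one exact subtraction.
    return old_balance - withdrawal_amount


if __name__ == "__main__":
    print(apply_withdrawal(Decimal("100.00"), Decimal("30.25")))  # prints 69.75
```

A learned regression could only approximate this subtraction, and it would need retraining, monitoring, and mistake handling to do so.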

A machine learning architect will have a good understanding of the properties that make machine learning an efficient approach to solving a problem. Here are a few to get you started:

The problem is very large. Like if you have to organize tens of millions of web pages or pictures or social network posts and it’s just too much to do manually. Think about it, there are more web pages than 100 people could examine in their lifetime, more than 1000 people could. When a problem is huge, machine learning might be the right answer.
The problem is open ended. There are more books, buildings, products, people, and, well – stuff – every day. If you need to constantly make decisions about new things and it’s just not practical to keep up, machine learning might be the right answer.
The problem changes. What’s worse than building an expensive system once? Building it every week, over and over, forever. We’re living through a huge change right now. Every business projection and decision process designed in 2019 is out the window for 2020. Machine learning isn’t a magic bullet for dealing with change, but it can make it faster and cheaper to adapt.
The problem is hard. Things like human level perception, or where humans need some serious expertise to succeed. Think about a game like tic-tac-toe. Anyone can become ‘world class’ at that game by learning a few simple rules. Tic-tac-toe is not hard enough to need ML. Contrast this to chess, where experts are much, much better than beginners – that’s a hard problem and ML might help.

A machine learning architect will identify at least one of these four properties in a problem before suggesting machine learning. In fact, they’ll probably see several of them. If not, they’ll probably find a cheaper and more reliable solution without machine learning.
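
If it helps to see the four properties as an explicit checklist, here is one way it could be sketched in Python. The class and function names are made up for this illustration, and a real decision would involve judgment rather than four booleans.

```python
from dataclasses import dataclass


@dataclass
class ProblemProfile:
    """Hypothetical checklist of the four properties listed above."""
    is_very_large: bool = False
    is_open_ended: bool = False
    changes_over_time: bool = False
    is_hard: bool = False

    def matching_properties(self) -> list:
        # Collect the names of the properties that hold for this problem.
        return [name for name, holds in vars(self).items() if holds]


def worth_considering_ml(profile: ProblemProfile) -> bool:
    # At least one property should hold before ML is even on the table.
    return len(profile.matching_properties()) >= 1


if __name__ == "__main__":
    bank_withdrawals = ProblemProfile()
    web_search = ProblemProfile(is_very_large=True, is_open_ended=True,
                                changes_over_time=True)
    print(worth_considering_ml(bank_withdrawals))  # False
    print(web_search.matching_properties())
```

The bank withdrawal example matches none of the properties, which is the signal to reach for the one-line rule instead.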

How do machine learning systems integrate with your existing systems?

One of the great Program Managers I had the pleasure to work with used to say: Machine learning is an approach, it isn’t a solution. A machine learning architect will understand how the machine learning approach they select complements and is supported by the approach taken by their existing systems.

And there are several important approaches to machine learning. I call them machine learning design patterns.

One important ML design pattern is called corpus based, where you invest in creating a data asset and leveraging it across a series of hard (but not-time-changing) problems. When I worked on the Kinect, we took a corpus based approach, collecting tons of data and carefully annotating it for many different uses.

Another important ML design pattern is called closed loop, where you carefully shape the interactions your users will have with your system so that they automatically create training data as they go. I used a closed loop approach when working on anti-abuse systems, where an adversary changed the problem every day, so there was much less value in building up a long-lived corpus.

I’ll provide links to videos about these two important ML design patterns (corpus based, closed loop), including a breakdown of their properties, and a walk-through of case studies.

A machine learning architect will be familiar with the pros and cons of the common machine learning design patterns. They’ll know how each design pattern could interact with their current systems & processes, which match well, and which would require major rework.
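
As a rough illustration of the closed loop idea, the sketch below records each user interaction alongside the model's prediction, so normal usage produces labeled examples. Everything here (the `ClosedLoopLog` class, the field names, the spam example) is hypothetical and is not the design of any system mentioned in this post.

```python
from dataclasses import dataclass, field


@dataclass
class ClosedLoopLog:
    """Hypothetical telemetry store that turns user interactions into training data."""
    examples: list = field(default_factory=list)

    def record(self, features: dict, prediction: str, user_action: str) -> None:
        # The user's action (e.g. marking a message as spam) acts as an implicit label.
        self.examples.append(
            {"features": features, "prediction": prediction, "label": user_action}
        )

    def disagreements(self) -> list:
        # Cases where the user overrode the model are candidates for
        # review and for the next retraining run.
        return [e for e in self.examples if e["prediction"] != e["label"]]


if __name__ == "__main__":
    log = ClosedLoopLog()
    log.record({"sender_age_days": 1}, prediction="not_spam", user_action="spam")
    log.record({"sender_age_days": 900}, prediction="not_spam", user_action="not_spam")
    print(len(log.examples), len(log.disagreements()))  # prints 2 1
```

A corpus based approach would instead put the effort into collecting and annotating data up front, with no dependence on live user interactions.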

What is the cost to build and run the ML system over time?

Building an ML system is easy! Just install Python, maybe PyTorch, a bit of feature engineering, a few days of tuning, then compile the model into your current app and add ‘proven machine learning expert’ to your resume…right?

Well, that’s one way to do it, but if you’re doing it that way, you’re not thinking like a machine learning architect.

To be most valuable, machine learning needs a lot of support, and if you’re not building that support, you probably didn’t need machine learning to begin with.

This includes things like telemetry systems, automated retraining, model deployment and management systems, orchestration systems, and client integrations.

You probably won’t have to invest in all of these to make efficient use of machine learning, but a machine learning architect would understand how important they are (given the ML design pattern they’d selected) and how much work it would be to add them to their existing systems.

For example, you’ve got to be realistic about mistakes, because any machine learning based system is going to make mistakes. Wild, and crazy mistakes. Like you know how when a human expert makes a mistake, they are usually at least kind of right? In the ballpark? Because they pretty much know what’s going on? Well, machine learning isn’t like that. Machine learning makes bat-zo-bizarro mistakes.

So ask yourself, how does your existing system interact with the mistakes you expect? Are the mistakes easy to detect and mitigate? Or will you have to change your existing workflows to identify and mitigate the problems that ML will create?

So there are three steps to thinking like a machine learning architect: Do you have the right problem; how does machine learning integrate with existing systems; and what does it cost to build and run over time.

And developing the base skills to think this way can be valuable to anyone involved in machine learning systems, not just the machine learning scientist. So if you’re a machine learning professional, an engineering manager, a technical program manager, or even that leader I got a chance to talk to, who is trying to figure out if and how to invest in machine learning…

You can learn a lot more by reading this book, Building Intelligent Systems. Or by subscribing to my YouTube channel.

Good luck, and stay safe!

7 Tips for Engaging in Video Meetings

There are a lot more video meetings going on, and some of them are important: interviews, presentations, introductions to your boss’s boss – who knows, maybe even a first date. And if you’re lucky enough to have one of those lined up, you’re going to want to put your best foot forward.

Here are seven tips on how to look your best and engage in video meetings.

We all have experience on how to present ourselves well in person – wear nice clothes, a suit or maybe a collared shirt, comb your hair, a bit of makeup, maybe some jewelry. But then you sit down in front of your web camera, fire up the video preview – and bam! It’s not what you’re hoping to see.

But don’t worry, a few small things can make a huge difference. I’m going to give you seven simple tips you can (probably) implement with things you have around the house that will help you go from this:

To more like this:

Use lots of light
Modern web and phone cameras are awesome. They can take pictures in terrible low-light situations that would have been impossible a decade ago. But just because a camera can take a picture doesn’t mean that is the picture you’re going to want to use to represent yourself. If your light is poor your video will look grainy and dingy – fine for a regular team meeting, or a chat with a friend, but not the best way to make a great impression.

So bring in a few lamps, or flashlights, whatever you can find. Every bit of light you can get (within reason) will let your camera take higher quality video. But beware – you have to use the light correctly. In fact, adding light in the wrong way will make your face darker, or cast harsh shadows that make you look like a super-villain-wanna-be.

The next few tips will help you use light to achieve best results.
Avoid back-light
Your face should be the brightest thing your camera sees. Bright lights behind you cause your camera to compensate and can make your face darker instead of brighter. It’s easy to make a mistake with back light. One common fail-job is to set up in front of a window, because damn, your yard looks amazing and wouldn’t people like to get a look at that? No! Remember, the sun is a gigantic nuclear explosion in outer space. The sun makes a LOT of light. Even on a cloudy day, even with thin curtains drawn, if your camera can see a window behind you, you’re going to have a hard time making your face look its best.

Even dim lamps behind you can cause problems. Sure, they can create a neat effect, especially when they have colored lights, but setting up that effect takes some tweaking (and some gear). You can learn how to do it, if you want, but how about you start easy – where the camera can’t see any light source or window directly.
Diffuse your light
Ever wonder why photographers use those ridiculous gigantic umbrella-dome-things on their lights? It’s because the domes diffuse light. That is, they spread out a light source, so light emanates from the largest possible area, instead of all coming from a single point. When bright light comes from a single point it casts harsh shadows. Your face has all sorts of bumps and ridges that interact poorly with harsh light to make you look terrible.

It’s kind of like the difference between the shadows the sun makes on a sunny day (which are harsh) and the shadows the sun makes on a cloudy day (which are soft). Soft shadows look much, much better on your face than harsh shadows do. So you should be careful not to shine harsh light at your face – diffuse it first!
A window can be a great diffuse light source (as long as the sun isn’t shining through it directly onto your face or creating back light).

You can make a diffuse light source by taking a light (or window) and hanging a sheet in front of it, so the light passes through the sheet, diffuses across the sheet, and then reaches your face. Remember, you want to make the light source as big as possible, so put the light several feet behind the sheet, so the light-spot it makes is as large as possible.

If you don’t have any appropriate sheets, you can try reflecting the light off the wall too. Try to pick a wall with a neutral color.

3-point lighting
A bit more advanced, but if you are using multiple lights you want the brightest one to be in front of you, a few feet to one side of the camera, and maybe a little bit above eye level (this is called the ‘key’ light). It should be as diffuse as possible, but even a diffuse light will cast some shadow on your face, so you might want to use a second, less-bright light in front of you, a few feet on the other side of the camera from the key light to fill in those shadows (this is called a ‘fill’ light).

Finally, if you want to get even more fancy, you could put a light above you, shining down at the back of your head. Put it somewhere the camera won’t see it. It should create a bright rim on you, which will make you pop out from the background and look more alive and engaging (this is called the ‘rim’ light, or sometimes a ‘hair’ light). This one doesn’t have to be diffuse.
Give your audience a natural perspective
Doing a video conference with your laptop in your lap? Well, your audience perceives themselves as being below you, staring up your nose. That’s not a very natural thing for someone to do. Imagine you’re in a room with someone, sitting across a table from them. Their eyes would be generally at the same level as yours. You’ll make your audience most comfortable by giving them a similar perspective in the video chat.

So try to position your camera at eye height. You might want to pull over a table, stack some books to make a stand, and put your phone or laptop on it.

Make eye contact
Making eye contact is a great way to build a connection. It lets you communicate with micro expressions and is a big part of natural engagement between people. Unfortunately, when communicating via video chat, it’s hard to make eye contact. You’re doing what your brain is programmed to do, looking at the other person’s eyes in the video feed, but they aren’t looking back from that spot – they are looking through your camera. So to them, it will seem like you are looking down. The further apart your camera and the video on the screen, the worse the effect.

So if you have multiple monitors, make sure your video-chat app and camera are on the same monitor.

And if you have a big screen, position the chat app window as close as possible to your camera. You can do this by shrinking the video window way down and positioning it as close to the camera as possible. Or you can do it by rigging a stand to position your camera in front of your screen, right in line with the other person’s eyes on your monitor. You won’t see as much detail in a smaller video window, or with a camera blocking part of your screen, but your audience will have a much stronger illusion that you are meeting their eyes, which will make them feel more connected, more like you’re interested in what they have to say.

If these tricks won’t work for you, for example if you’re using a phone and can’t move the camera or the video window, try propping up the phone and standing back a few feet. The further away you are from the camera and video display, the less it will seem you are looking away from making eye contact – simple geometry.
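
The geometry can be sketched with a few lines of Python. This assumes a simplified model in which your eyes, the camera, and the video window form a right triangle; the function name and the 15 cm offset are made-up values for illustration.

```python
import math


def gaze_offset_degrees(camera_to_video_cm: float, distance_to_screen_cm: float) -> float:
    """Approximate angle between looking at the camera and looking at the video window."""
    if distance_to_screen_cm <= 0:
        raise ValueError("distance to screen must be positive")
    # Right-triangle model: the offset is the opposite side, the viewing distance the adjacent side.
    return math.degrees(math.atan2(camera_to_video_cm, distance_to_screen_cm))


if __name__ == "__main__":
    # Same 15 cm gap between camera and video window, viewed from further and further back.
    for distance_cm in (50, 100, 200):
        print(distance_cm, round(gaze_offset_degrees(15, distance_cm), 1))
```

With these assumed numbers the apparent gaze offset drops from roughly 17 degrees at 50 cm to roughly 4 degrees at 200 cm, which is why stepping back helps.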
Get a natural sound
Your computer probably has a microphone in it; your phone and laptop certainly do. You can use those for your video meeting, but they might not sound the best. In fact, they can sound quite echo-y and thin, particularly if your room has bad acoustics or if the microphone is more than a few feet away from your mouth (as it probably will be if you’re trying to set up an engaging visual connection).

One option is to get a little lapel mic (also called a lavalier mic). Clip it to your shirt and plug it into your computer’s audio input. You can get some amazing sound out of a pretty cheap lav if you spend a few minutes positioning it well. Here is an excellent lavalier microphone.

Another option, one that a lot of professional broadcasters have been using when getting sound from home during COVID, is AirPods (or some other Bluetooth earbuds). The sound is great, and people are getting used to seeing things dangling out of ears, so it probably won’t be too distracting.

With these tips, things you can find around the house, and an hour of experimentation, you can make your video meetings more effective and engaging.

Good luck and stay safe!

You can view this content on YouTube (with many more samples).

Born to Prep

July 1st, 1971 – 15,000 dockworkers on the West Coast and in Hawaii went on strike. The strike continued until February 1972 (134 days).

My prepper instinct was born in Hawaii, in 1971. Which is kind of strange because I wasn’t born until a few years later, but in 1971 the International Longshore and Warehouse Union (ILWU) went on strike, shutting down shipping to and from Hawaii for 134 days. And, poof! Supply chain cut. The Hawaiian Islands, and everyone living there, had to get by for 134 days on what they had, or what they could make. And that meant, unless it was a pineapple, sugarcane, or a chord played on a Uke, if you were in Hawaii in the second half of 1971, you weren’t going to get it.

Like I said. I hadn’t been born yet, but my mother was living in Hawaii at the time. And it affected her.

Growing up I would go to a closet and find a basket of a dozen toothbrushes, a stack of soap good enough for thousands of showers, many dozens of rolls of toilet paper, and enough dental floss to last for years. I’d take what I needed – maybe one of the twenty boxes of tissue paper – and get on with my life. We drank re-constituted dehydrated milk, made our own orange juice from the stock of frozen concentrate in the freezer, ate frozen vegetables and tofu!

(In fairness, the tofu might not have been related, but I prefer to give my mom the benefit of the doubt – she was doing it because of the strike-related-trauma, not because she liked to torture me…)

All of this was because that 1971 ILWU strike – and living through the consequences of it – had changed my mother’s outlook.

And it changed me too, I guess. Because thirty-five years later I got a house of my own. Before long I bought three big storage racks, put them in the garage, and bought toilet paper, paper towels, batteries, ketchup, rice, spam (yay for spam), sodas, canned foods, soap, shampoo, and about a dozen other things – all enough to last me for months. I didn’t think too much of it. It’s just what you’re supposed to do, right?

Well now coronavirus has arrived, and I’m very grateful for that 1971 strike. Because I’ve moved a small piece of my personal supply chain into my house. That means I don’t have to rush to the supermarket the moment I hit the last roll in one of my bathrooms – I just go to the garage. I’m not sure I have enough to last through the current crisis. But I have enough to take some pressure off the supply chain, to let others get what they think they need at the start of it. I even have a box of N95 masks I plan to donate to Evergreen Hospital (my closest hospital, and the center of the US outbreak) on Monday.

So what?

The reason I’m writing this is to share one possibility about how this corona outbreak is going to affect our children and their children. I bet houses built 5 years from now will have extra pantries, extra storage space in their garages, extra closet space in their bathrooms, extra storage in their attics. And I bet most of the people who buy those houses are going to use the space to keep three months of toilet paper on hand. If that 1971 strike is any guide, this might go on for the next fifty years.

And here are some tips to help you get started, for when this corona crisis is past – please wait for COVID-19 to pass and build your own personal supply chain during the good times, not during the crisis:

Get some shelving units
I like this style of wire shelving. They are easy to assemble, light to move, incredibly sturdy – and I think they look kind of neat. I have three of them in my garage (although I left off the casters). You might also like to get some shelf liners and dividers.
Install garage organizers
You’ll be surprised how much you can store in your garage without using any floor space. And installing these modular shelving kits was so easy that even I could do it…
Get some bins
You can stack stuff on your shelves, but many things don’t stack well (especially on wire shelves), like batteries or sticks of deodorant. Bins like these can help you make much better use of your space. There are all different sizes, so shop around and get a variety.
Use a label maker
Label your bins. Well, first, make sure you get bins with see through sides. But then also put labels on them. I put labels on the front, one side, and the top, so you can see what’s in them when they are stacked on a shelf and when you have them piled up because you’re reorganizing.
Never use your last thing
Identify your staple items and keep at least one backup. If you’re ever using your last thing – bottle of ketchup, bag of rice, box of detergent, whatever – get another one to replenish your stockpile. For these things I like to keep a running Amazon Pantry order, and add things to it for several weeks before submitting, which helps cut down on waste and packaging.
Find your local food bank
I’ve had three major food bank trips over the years because the things I bought were getting near expiring. Sometimes it happened because I changed my tastes and didn’t want to eat the stuff. Sometimes it happened because I super miscalculated how much ketchup I could use in six months… But I don’t see this as a failure. I see it as a great opportunity to help others while focusing my stores on things I actually like and use. Tastes change. It’s OK.

Writing the Book

Recently I posted about getting my book deal. Now I want to share about the process of writing the book. I hope this post will help you get excited to write your own book and I hope it will help you avoid some of the mistakes I made along the way.

First – writing a book is a lot of work. Mine took just about every weekend, holiday, and vacation day for twelve and a half months, Nov 2016 through Jan 2018. That was about 80 working days to produce the first draft, and another 36 days to edit that draft into final form. Some of those days were productive, some were miserable, but that’s what it took.

And I don’t think I could do it much faster if I had to do it again.

Cast of Characters

Writing is a solitary activity, and I wrote most of my book alone in a quiet room or in a coffee shop (with headphones on). But the day I got the book deal I also got a support team. The direct members of this team were:

The acquisitions editor – who’d made the deal with me (see my blog post on that). At this point I think her job was to make sure that the product I delivered was close enough to what was in the contract so that her management wouldn’t be grumpy with her (and/or fire me). She also dealt with a couple small contract changes we needed to make. Not much direct involvement in the content of the book.
The coordinating editor – who was great. Her job was to make sure I understood what I needed to do and that I was doing it in a timely fashion. She helped me with the chapter submission process, the formatting, the dates and deadlines. She also helped get critical questions answered when other participants were busy (or had my emails eaten by their spam filters).
The developmental editor – who was supposed to help if my writing skill, my ability to organize the content, or my general communication style weren’t cutting it.
The copy editor – who corrected all the grammar, spelling, and technical style issues (like the proper use of colons before bulleted lists, which I still think is done by flipping coins, although several smart people have told me otherwise).
The technical editor – who I got to select. He was someone I worked with for many years, another expert in the field. His job was to make sure I didn’t get anything (too) wrong.
The production people – who typeset the thing, produced the cover image, did legal stuff and whatever else. I’m not sure what else, but I know there was more work done by people I didn’t have direct contact with.

Having all these people watch me write was odd. It was a weird sort of pressure, lots of folks looking over my shoulder. I wasn’t expecting it, and having all of them and a deadline did make the process of writing a little less fun.

Although without them I wouldn’t have been able to publish a book…

So I’m grateful for all that they did.

Milestones

I had four major milestones to hit. Unfortunately I didn’t know exactly what or when they all were ahead of time. My sense is that most authors don’t hit the milestones the publisher would like, so publishers don’t always bother to set all the milestones out at the start. Rather, they set one or two, see where the author ends up, then set the next ones based on that and other business needs.

I actually hit every milestone (although one of them was particularly miserable), but in retrospect, I don’t think I had to. I think I could have asked for more time and the publisher would have gladly given it.
Whatever. It got done. The milestones were:

8/1/2017 – first three chapters

Early July I had the contract in hand. The publisher wanted three chapters in a month. I interpreted this as another audition. If the chapters weren’t good, they might cancel the deal, so I spent the time to do a good job.

When I submitted the first three, the acquisition editor and the developmental editor read them. Between the two of them, they left about four minor comments in the doc. The acquisitions editor said the chapters were great.

That’s the last I heard of the developmental editor.

Seriously. She was peace out!

My interpretation is that they decided my product met their quality bar and they put the developmental editor on other projects. They probably saved some money. Fine. I’m not sure if it was a mistake on my part to polish these chapters so much. If I hadn’t, the developmental editor might have stuck around, and I’m sure I would have ended up with a better book if I’d had her involvement.

Oh well!

12/31/17 – manuscript completion

This milestone was for the first draft of the entire book. Leading up to this deadline I submitted chapters as I thought they were at first draft quality. I talked to the coordinating editor a bunch, particularly every time I wanted to add or cut a chapter (or anything that would change the official table of contents). She made sure I was roughly on track, suggesting ‘soft milestones’ for how much content I should have at various points.

I think the biggest problem I encountered during this milestone was the length of the book. My word count came out just about where I said it would, but the number of pages was less than the publisher expected. This happened because I had very few figures or code listings. For a short while I thought I was going to have to figure out how to make 50 pages of padding. I even came up with about 30 illustrations (which the publisher didn’t like at all and vetoed).

Thankfully, they decided to take the book as I wrote it (good on word count, but short on page count).

1/8/2018 – author revisions

And here was my big mistake. I knew I was going to have a draft by 12/31. I knew I would have to do another pass on the book, because my first drafts are a bit sketchy. But I didn’t bother to ask how long I had to do this second pass.

Answer? One week.

Hah!

Impossible.

I don’t know why they thought this could work. Maybe they just said any random date and expected me to negotiate. Maybe most authors interpret ‘first draft’ differently than I do and submit more-polished first drafts. Or maybe the publisher had an opportunity to hit some release window if I could make that date, so they took a shot.

I don’t know.

I found out this was the deadline around Thanksgiving time. From that point I had about five weeks to finish draft one, do draft two, and incorporate all the comments from my technical editor. I decided to just…do…it…

So I took some extra vacation days. I worked on my Hawaii trip (which is fine, because I would have worked on something else if I didn’t have the book to work on). In the end, I submitted the first draft early on 12/4/2017. And I submitted the final draft on 1/4/2018.

But I probably should have done something differently, because I was pretty burned out by the end.

2/20/2018 – review copy edits

For the month of Jan I did nothing (on the book). Then on Feb 10th they sent me to a website where I could review the copy edited version of the book. The webpage was some weird web based workflow, no track changes, no word processor I was used to…hrm.

At this point I was supposed to check the layout and styling elements. I was also supposed to make sure they hadn’t made changes that changed the meaning of anything.

I tried to compare to the version I submitted, to find what and how much they changed, but it was pretty much impossible. All I could really do was read it.

In the end I think I had a couple small comments, but I pretty much just went with what they had, which worked out fine.

My process for writing the book

When I’m writing I try to write every weekend day, every vacation day, and every holiday. I set a word target and I work until I produce that many new words or until about six hours have passed, whichever comes first. I’ve tried writing on work days before or after work, but I’ve never had success with that. It makes me too tired.

For fiction my word target is 2,000 words. But this book was much harder for me, so I lowered my target to 1,000 words per day.

I also track my writing, recording the date and the word count after every writing session. As I said, I recorded 80 days drafting and 36 days editing. There were a few more writing days before I started recording, while I was conceptualizing, but I didn’t track those.
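If you want to try the same kind of tracking, here is a rough sketch of how a log like this could work. It is not the tool I used; the `Session` class, the function names, and the sample numbers are all made up for illustration. It assumes you record the total manuscript word count at the end of each session and that the log starts from an empty manuscript.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Session:
    day: date
    total_words: int  # manuscript word count at the end of the session


def words_added(sessions):
    """Return (day, words added in that session) pairs, in date order.

    Assumes the manuscript was empty before the first logged session.
    """
    ordered = sorted(sessions, key=lambda s: s.day)
    deltas = []
    previous_total = 0
    for session in ordered:
        deltas.append((session.day, session.total_words - previous_total))
        previous_total = session.total_words
    return deltas


def hit_target(delta, target=1000):
    """True if a session produced at least `target` new words."""
    return delta >= target


# Invented sample data, not my real log.
log = [
    Session(date(2017, 5, 6), 1200),
    Session(date(2017, 5, 7), 2250),
    Session(date(2017, 5, 13), 2100),  # an editing day: net negative
]

deltas = words_added(log)
for day, delta in deltas:
    print(day.isoformat(), delta, "target hit" if hit_target(delta) else "short")
```

Recording the running total rather than the per-session count makes editing days show up naturally as negative numbers.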

The following figure shows the word count progress for this book. I made slow progress the first few months, because I didn’t have the book deal yet, so I wasn’t hyper focused on the writing. Then about May I started working regularly. The bursts of output in September and November were made possible by taking vacation days from work. And December 2017 was all editing, so the word count didn’t move as much.

Graph of words over time during writing process.

I think this book was harder to write than fiction because I struggled to get an outline that flowed and explained concepts in order and in the right detail.

Most chapters I started writing by describing three related concepts simultaneously (because they were related in my brain). So my rough drafts were usually a mess. Then I had to step back and realize what I was actually trying to say. Then I had to rewrite three sections for the three concepts. Sometimes the concepts didn’t all fit in the same chapter, so it kind of messed up other things.

The writing wasn’t the hard part. The thinking it through and organizing it was.

The following chart shows a breakdown of the word count I was able to get across the 80 days of writing. Fourteen days were basically failures – less than 233 words! A few days were great – over 2,000 words.

Histogram of word count per day.

My average editing day added just 228 words to the manuscript. Eight of the editing days had negative word counts. One editing day I deleted over three thousand words.

Ugh.

Summary

Writing takes me a lot of time. I think I’m a bit slower than other writers, but not crazy slow. Finishing a book is a real grind.

But I love it. I love engaging my whole brain in the writing process, working really hard, and producing something that I’m proud of.

And it’s very nice to think that my effort might help others. I hope this book does help people be more productive at machine learning and produce amazing things.

I hope this post inspires you to write your book and to finish it easier than I did.

You can check out the final book here. You can also get an audio book version which is free if you sign up for a trial account with Audible.

You’re Machine Learning the Wrong Thing!

Arrows missing a target.

Well, you might be machine learning the wrong thing…

Because it’s easy to get complacent. You find something familiar, set it as a goal, work hard to achieve it, and get distracted from true success.

This is particularly true for machine learning people, because we have so many incredible tools for measuring the quality of the models we produce. It’s great. A classification task? Precision and recall. You move those numbers in the right direction and you’re achieving success. Move them further, you’re doing better. You have the game set up. You have the tools. You can win!
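For anyone who hasn’t computed these by hand, here is a minimal sketch of precision and recall for a binary classifier. The function name and the tiny label lists are invented for illustration; in practice you would probably reach for a library rather than write this yourself.

```python
def precision_recall(y_true, y_pred):
    """Compute (precision, recall) for binary labels, where 1 means positive."""
    pairs = list(zip(y_true, y_pred))
    true_pos = sum(1 for t, p in pairs if t and p)
    false_pos = sum(1 for t, p in pairs if not t and p)
    false_neg = sum(1 for t, p in pairs if t and not p)

    # Guard against division by zero when nothing is predicted or nothing is positive.
    precision = true_pos / (true_pos + false_pos) if (true_pos + false_pos) else 0.0
    recall = true_pos / (true_pos + false_neg) if (true_pos + false_neg) else 0.0
    return precision, recall


# Invented example: 4 real positives, the model flags 4 items and gets 3 right.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 1, 0, 0, 1]

precision, recall = precision_recall(y_true, y_pred)
print(precision, recall)  # 0.75 0.75
```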

Sometimes that works, but if you get tunnel vision on optimizing your models, before long you’ll be machine-learning the wrong thing.

For example, consider a system to stop phishing attacks.

Phishing involves web sites that look like legitimate banking sites but are actually fake sites, controlled by abusers. Users are lured to these phishing sites and tricked into giving their banking passwords to criminals. Not good.

But machine learning can help!

Talk to a machine-learning person and it won’t take long to get them excited. ML people will quickly see how to build models that examine web pages and predict whether they are phishing pages or not. These models will consider things like the text, the links, the forms, and the images on the web pages. If the model thinks a page is a phish, block it. If a page is blocked, a user won’t browse to it, won’t type their banking password into it. Perfect.

So the number of phishing pages you block seems like a great thing to optimize: block more phishing sites, and the system is doing a better job.

Or is it?

What if your model is so effective at blocking sites that phishers quit? Every single phisher in the world gives up and finds something better to do with their time? Perfect! But then there wouldn’t be any more phishing sites and the number of blocks would drop to zero. The system has achieved total success, but the metric indicates total failure. Not great.

Or what if the system blocks one million phishing sites per day, every day, but the phishers just don’t care? Every time the system blocks a site, the phishers simply make another site. Your machine learning is blocking millions of things, everyone on the team is happy, and everyone feels like they are helping people—but the number of users losing their credentials to abusers is the same as before your system was built. Not great.

And these are sort of toy examples, but there are two important points: Things change and your metrics aren’t right.
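The two toy scenarios above can be sketched in a few lines. Everything here is invented for illustration (the `DailyStats` class, the function names, and the numbers); the point is only that “sites blocked” and “users phished” can tell opposite stories about the same system.

```python
from dataclasses import dataclass


@dataclass
class DailyStats:
    sites_blocked: int   # what the model team tends to optimize
    users_phished: int   # what actually matters to users


def better_by_blocks(before, after):
    """The tempting metric: more blocks looks like progress."""
    return after.sites_blocked > before.sites_blocked


def better_by_outcome(before, after):
    """The user outcome: fewer people losing credentials is progress."""
    return after.users_phished < before.users_phished


# Made-up numbers for the situation before the system and the two scenarios.
baseline = DailyStats(sites_blocked=0, users_phished=500)
phishers_quit = DailyStats(sites_blocked=0, users_phished=0)
whack_a_mole = DailyStats(sites_blocked=1_000_000, users_phished=500)

for name, stats in [("phishers quit", phishers_quit), ("whack-a-mole", whack_a_mole)]:
    print(name,
          "| better by blocks:", better_by_blocks(baseline, stats),
          "| better by outcome:", better_by_outcome(baseline, stats))
```

In the first scenario the block count says nothing improved while users are completely safe; in the second the block count looks fantastic while users are no better off.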

Things change

Your problem will change, your users will change, the business environment will change. If you don’t also change your machine learning goals – you’ll be machine-learning the wrong thing in no time.

Some common sources of change include:

Users – new users come, old users leave, users change their behavior, users learn to use the system better, users get bored.
Problems – your problem changes, new news stories are published, fashion trends change, natural disasters occur, elections happen.
Costs – the cost of running your system might change, which puts new constraints on model execution and data and telemetry collection.
Objectives – the business environment might change, maybe a feature that attracted users last year is ho-hum this year.
Abuse – if people can make a buck by abusing your system, you can bet they will…

If you aren’t thinking about how these types of change are affecting your system on a regular basis, you’re machine-learning the wrong thing.

Your Metrics Aren’t Right

The true objective of your system isn’t to have high-quality intelligence. The true objective is something else, like keeping users from losing their passwords to abusers (or maybe even making your business some money).

A system’s true objective tends to be very abstract (like making money next quarter), but the things a system can directly affect tend to be very concrete (like deciding whether to block a web site or not). Finding a clear connection between the abstract and concrete is a key source of tension in setting goals for machine learning and Intelligent Systems. And it is really hard.

One reason it is hard is that different participants will care about different types of goals (and have their own tools for measuring them). For example:

Some participants will care about making money and attracting and engaging customers.
Some participants will care about helping users get good outcomes.
Some participants will care that the intelligence of the system is accurate.

These are all important goals, and they are related, but the connection between them is indirect: you won’t make much money if the system is always doing the wrong thing; but making the intelligence 1% better will not translate into 1% more profit.

If you don’t understand how your metrics relate to true success, you’re machine learning the wrong thing (Ok, Ok… I promise, I’ll only say it one more time…)

Machine learning the right thing…

So you’ll need to invest in keeping your goals healthy.

Start by defining success on different levels of abstraction and coming up with some story about how success at one layer contributes to the others. This doesn’t have to be a precise technical endeavor, like a mathematical equation, but it should be an honest attempt at telling a story that all participants can get behind.

Then meet with team members on a regular basis to talk about the various goals and their relationships. Look at some data to see if your stories about how your goals relate might be right – or how you can improve them. Don’t get too upset that things don’t line up perfectly, because they won’t.

For example:

On an hourly or daily basis: optimize model properties, like the false positive rate or the false negative rate of the model. For example: how many phishing sites are getting blocked?
On a weekly basis: review the user outcomes and make sure changes in model properties are affecting user outcomes as expected. For example: you blocked more phishing sites, did fewer users end up getting phished?
On a monthly basis: review the leading indicators – like customer sentiment and engagement – and make sure nothing has gone off the rails. For example: How many users say they feel safer using your browser because of the phishing protection? How many are irritated by it?
On a quarterly basis: look at the organizational objectives and make sure your work is moving in the right direction to affect them. For example: market share, particularly for visits to banking sites?
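One lightweight way to keep a cadence like this visible is to write it down as data. This is only a sketch: the structure, the names, and the choice to approximate a month as 30 days and a quarter as 90 days are my assumptions for the example, not a prescribed process.

```python
# Each entry pairs a level of success with how often to review it.
# Intervals are in days; 30 and 90 are rough stand-ins for month and quarter.
REVIEW_SCHEDULE = [
    {"every_days": 1, "level": "model properties",
     "question": "How many phishing sites are getting blocked?"},
    {"every_days": 7, "level": "user outcomes",
     "question": "Did fewer users end up getting phished?"},
    {"every_days": 30, "level": "leading indicators",
     "question": "Do users say they feel safer, or are they irritated?"},
    {"every_days": 90, "level": "organizational objectives",
     "question": "Is market share for visits to banking sites moving?"},
]


def reviews_due(day_number):
    """Return the levels to review on a given day, counting days from 1."""
    return [entry["level"] for entry in REVIEW_SCHEDULE
            if day_number % entry["every_days"] == 0]


print(reviews_due(7))
print(reviews_due(90))
```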

Your team members will make better decisions when they have some understanding of these different measures of success, and some intuition about how they relate.

And remember: you’ll need to revisit the goals of your Intelligent System often. Because things change, and if you don’t invest the time to keep your goals healthy – you’re machine learning the wrong thing!

You can learn much more in the book: Building Intelligent Systems. You can even get the audio book version for free by creating a trial account at Audible.

Design Patterns for Machine Learning

There are many skills that go into making working Intelligent Systems. As an analogy, in software you have base skills like:

Programming languages
Algorithms and data structures
Networking and other specialized skills

But then you have to take these skills and combine them to make a working system. And the ability to do this combination is a skill in its own right, sometimes called Software Engineering. To be good at software engineering you need to know about architecture, software lifecycles, management and program management — all different ways to organize the parts of the system and the people building the system to achieve success.

Software engineering skills are critical to moving beyond building small systems, with a couple of people, and to start having big impact.

When working with AI and machine learning you have to add a bunch of things to the base skills, including:

Statistics
Data science
Machine learning algorithms
And then maybe some specialized things like computer vision or natural language understanding

But then you also need to integrate these skills into your broader software engineering process, so that you can turn data into value at large scale.

And the ability to do this combination is a skill in its own right too. Not Software Engineering exactly, call it Machine Learning Engineering.

And here are two very important concepts in setting up an Intelligent System for success in practice:

The first is Closing the Loop between users and intelligence so that they support each other.
The second is Balancing the key components of your system, and maintaining that balance as your problem and your users evolve over time.

Taken together these form the basis of what I call the closed loop intelligent system pattern for applying machine learning.

Closing the Loop

Virtuous cycle between intelligence and users.

Closing the loop is about creating a virtuous cycle between the intelligence of a system and the usage of the system. As the intelligence gets better, users get more benefit from the system (and presumably use it more) and as more users use the system, they generate more data to make the intelligence better.

So, for example in a search engine, you type your query and get some answers. If you find a useful web page, you click it and are happy. Maybe you come back and use the search engine again. Maybe you tell your friends and they start using the search engine. As a user, you are getting value from the interaction. Great.

But the search engine is getting value from the interaction too. Because when you click your answers, the search engine gets to see which pages get clicked in response to which queries. Maybe the most popular answer to a particular query is 5th on the list. The search engine will see that users prefer the 5th answer to the answer it thought was best. The search engine can use this to adapt and improve. And the more users use the system, the more opportunities there are to improve.
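Here is a deliberately naive sketch of that feedback step: log which result gets clicked for each query, then move the most-clicked results up. The `ClickLog` class and the example URLs are hypothetical, and this is not how a real search engine ranks. It ignores things like position bias (people click the top result partly because it is on top) and abuse, but it shows the shape of the loop.

```python
from collections import Counter, defaultdict


class ClickLog:
    """Counts clicks per (query, url) and uses them to reorder results."""

    def __init__(self):
        self._clicks = defaultdict(Counter)

    def record_click(self, query, url):
        self._clicks[query][url] += 1

    def rerank(self, query, results):
        """Sort results by click count, most clicked first.

        sorted() is stable, so results with equal clicks keep their original order.
        """
        counts = self._clicks.get(query, Counter())
        return sorted(results, key=lambda url: -counts[url])


# Hypothetical results as the intelligence originally ranked them.
results = ["a.example", "b.example", "c.example", "d.example", "e.example"]

log = ClickLog()
for _ in range(3):
    log.record_click("cheap flights", "e.example")  # users prefer the 5th answer
log.record_click("cheap flights", "a.example")

print(log.rerank("cheap flights", results))
```

With these clicks, the fifth answer moves to the top for that query, and queries with no clicks keep their original ranking.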

This is a virtuous cycle between the intelligence of the system and the usage of the system. Closing the loop between users and intelligence is key to being efficient and scalable with Intelligent Systems.

Doing extra work to close the loop, and let your users help your Intelligent System grow, can be very efficient, and enable all sorts of systems that would be prohibitively expensive to build any other way.

Balancing Intelligent Systems

There are five things you need to keep in balance to have a successful Intelligent System.

The Objective. An Intelligent System must have a reason for being, one that is meaningful to users and accomplishes your goals. The objective should be one that requires an intelligent system (and that you can’t solve more easily and cheaply some other way), and it must also be achievable by the Intelligent System you will be able to build and run. Your objective might be relatively easy, or it might be hard. Either way, getting the objective right is critical for achieving success, and it is hard to do.

The Experience. An Intelligent System needs a user experience that takes the output of the intelligence (such as the predictions its machine learning makes) and presents it to users to achieve objectives. To do this the experience must put the intelligence in a position to shine when it is right—while minimizing the cost of mistakes it makes when it is wrong. The experience must not irritate users, and it must leave them feeling they are getting a good deal. And it must also elicit both implicit and explicit feedback from users to close the loop and help the system improve its intelligence over time.

The Implementation. The Intelligent System implementation includes everything it takes to execute intelligence. This involves things like deciding where the intelligence lives: in a client, a service or a backend. It involves building the pipes to move new intelligence to where it needs to be safely and cheaply. It involves controls on how and when the intelligence is exposed to users. And controlling what and how much to collect in telemetry to balance costs while improving over time.

The Intelligence. Most Intelligent Systems will have complex intelligences made up of many, many models and hand-crafted rules. The process of creating these can be quite complex too, involving many people working over many years. Intelligence creation must be organized so that the right types of intelligence address the right parts of the problem, and so it can be effectively created by a team of people over an extended time.

The Orchestration. Things change, and all the elements of an Intelligent System must be kept in balance to achieve its objectives. This orchestration includes keeping the experience in sync with the quality of the intelligence as it evolves, deciding what telemetry to gather to track down and eliminate problems, and how much money to spend building and deploying new intelligence. It also involves dealing with mistakes, controlling risk, and defusing abuse.

If you want to learn more you can watch the free webinar.

And if you really want to learn how to create Closed Loop Intelligent Systems check out the book or the audio book, which you can get for free if you start a trial account with Audible.

Getting the Book Deal

I got a book deal with a major publisher and I’m going to share everything about the process, the mistakes I made, and the things I learned. My book is a non-fiction work, about machine learning. So this might not all apply if you’re working on something different.

Building Intelligent Systems Book

To get a non-fiction book deal, you start with a proposal (not a finished book, which is common in fiction). And the proposal needs to convince a publisher they might not lose too much money by publishing your book. That’s a bit sarcastic, but in retrospect it’s a good way to think about it. Your proposal is asking a smart business person to bet 10, or 20, or 50 thousand dollars that you’re going to produce content valuable enough so they can recoup their investment and make some money.

So, you have a great idea, something unique to share, a real talent, and a desire to write a book. You’ve probably succeeded at a lot of things in your life, and this writing-a-book thing can’t be much different, can it?

But now you need to write a 5 or 10 page document that a smart person will read and say – I can see this being worth $50,000. In fact, I can see a chance to make some real profit here…so I can pay for all the other books I published that didn’t pay off…

And this is actually pretty hard.

At least it was for me.

Your proposal needs to convince this acquisition editor that:

There is a big enough audience willing to pay for the content you’re proposing.
You are credible enough, so this audience will care about the book if you write it.
You have a platform – a bunch of people who will read the book just because you wrote it.
You have the skills to produce a professional book, including expertise and writing ability.
You have thought through the whole thing, have a viable plan, and can articulate it.

You don’t need to be awesome at all of these to get a deal (I wasn’t), but you need to look good on most of them…and you can’t be too-obviously terrible at any of them…

The best way to get started on this is to go to a book store, find similar books to the one you want to write, and learn from them. Look at how long they are, how they market themselves, how they are organized. Then make a list of all the publishers who publish them, go to the publishers’ web sites and download their proposal forms – and get started.

Here is an example of a proposal form from a great publisher.

How this worked out for me

I’ll go through these five areas and share some of the challenges I had. I’ll also give examples of steps I took to improve.

The audience – my topic was machine learning, which was very hot at the time (and probably still is as you’re reading this). So that was good. But my book was conceptual, not exactly what everyone else was publishing in the space. So it wasn’t clear who would read it.

My mistake was starting too general and saying that my book would appeal to anyone in software and even many people outside of it, because, hey, who wouldn’t want to read my brilliant book, right?

Ahem…

Eventually I identified some specific personas – an engineer who wants to get into machine learning, a machine learning practitioner who wants to understand more context, a manager or program manager trying to deploy machine learning for their problems – and I wrote a very specific pitch for what each of these will get from the content.

That seemed to work.

Credibility – I’ve been working in the field for fifteen years now at one of the big companies, and I was lucky enough to win a nice award from a scientific publication. This was good. But I did get some feedback that my experience might be too academic (my title was ‘applied researcher’) and, combined with my conceptual approach, might not resonate with the audience.

I tried to address this by adding more practical experience and language to my proposal. I didn’t remove things or hide my experiences – just turned the dial on the ‘researcher’ stuff from a seven to a five.

Platform – I didn’t have one. I had about 200 connections on LinkedIn and some friends and parents. I didn’t bring many automatic sales with me, so this was a negative that I couldn’t do much about.

Platform is very important, because if you bring just a few thousand automatic sales, you basically totally remove the risk from the publisher. They are guaranteed to break even. Platform is also hard to get. I hadn’t worked on it because I felt it was embarrassing to write lots of blogs and build up social network connections and all that stuff. Not my thing.

But now I’m doing it, and it’s more fun than I thought it would be. I try to blog things that people will learn from, and I’ve gotten some nice feedback, which I really enjoyed. I do wish I’d started earlier.

Oh well.

Writing skills – I’ve been writing (and failing at) fiction for years, so the writing in my sample chapter was safely above the bar for what a technical publisher needs. This wasn’t a problem for me.

Thought through – and here is where I struggled most. I was lucky, because my overall pitch looked good enough that acquisition editors at major publishers took the time to interact with me. Several of them rejected me after we talked, but I learned a lot from them – thanks guys!

And based on these interactions, my proposal evolved a great deal from day one to success. See the table below for some statistics about this evolution.

Each proposal also included a sample chapter that was about 3,000 words which ended up turning into Chapter 1 of the book with little change.

A large part of this progression was me understanding more about what I wanted to write. Another big part of it was me understanding more about what publishers were looking for.

In retrospect, the problem in my early proposals was that I was too vague. I hadn’t created enough detail in my mind and I certainly didn’t have enough on the page. I’d listed topics, but the topic names did not communicate enough to smart non-experts – and my conceptual approach was a bit different, so the book didn’t look like what acquisition editors were expecting.

A detailed log of the process

This next section is a bit of a journal of the interactions I had along the way. Maybe this will interest you. Or maybe you just want to skim the summary table and move on.

1/15/2016 I had a first draft of the outline and started collecting proposal forms from publishers who might be interested. I was working with a co-author at this time and it was a back-burner project for us so things moved slowly.

3/8/2016 Initial pitch to a big publisher with an O in their name. This was a short email ~500 words with a back-cover-like description and a bit about the authors.

4/1/2016 Heard back from O’s acquisition editor, they asked for a full proposal.

5/27/2016 I sent the completed proposal #1 to O (remember, this was a bit of a back-burner project, which is why it took me two months).

7/1/2016 No response yet, so I asked…

7/2/2016 O’s acquisition editor responded saying they “Can’t quite sink their teeth into it…” They gave some helpful feedback – that the proposal wasn’t specific enough on audience, and the outline was too vague. They offered to let me write some blogs on the topic for their web site to test the concept and let me develop it. I sort of wanted to, but I didn’t end up doing this. We put the project firmly back on the back burner. But I did keep writing chapters on the weekends.

2/1/2017 Sometime in this time period my co-author dropped out of the project. Writing is a lot of work, and he just didn’t have the passion for it – too much else going on. This was sad, but it also allowed me to up the pace.

3/5/2017 Initial pitch to a big publisher with an M in their name. This included an updated proposal (proposal #2), 10 pages, ~1,000 words, 270 named sections and sub-sections. And, reviewing now, this was maybe 60% of the way to the final book’s outline.

4/4/2017 No response yet, so I asked…

4/7/2017 M’s assistant acquisition editor responded asking for a proposal and a phone conversation. I re-sent the proposal.

4/18/2017 Had a conversation with M’s acquisition editor who asked lots of questions – he clearly hadn’t read the proposal (his assistant must have done the screening). He seemed interested. He asked to have till the next week to review the proposal and that we talk again at that point.

5/8/2017 M decided we didn’t actually need to talk and instead carried out an external review of the project based on the proposal and samples I’d sent. This involved sending them to maybe fifteen or twenty potential members of the audience to get feedback.

5/26/2017 Didn’t hear anything back, so I asked…

5/26/2017 M said they didn’t get enough response from their external review. Not really negative feedback, just no feedback. They interpreted this as something not resonating with their audience, and asked that we talk again.

6/2/2017 Spoke with M’s acquisition editor again. Got very similar questions to our last conversation. Clearly my outline iteration wasn’t ‘there yet’. This finally sank into my brain and I decided to fix it for real.

Wrote the final version of the proposal (that ended up working). This one was 30 pages long, 5,334 words, ~300 named sections and subsections, a paragraph of text describing each of the five parts of the book, and text for each chapter that summarized what the reader would learn from each chapter, and what types of questions they would be able to answer after reading the chapter, like this:

After reading this chapter the reader should:

Know all the places intelligence can live, from client to the service back-end, and the pros and cons of each.
Understand the implications of intelligence placement and be able to design an implementation that is best for their system.

The reader should be able to answer questions like:

Imagine a system with a 1MB intelligence model, and 10KB of context for each intelligence call. If the model needs to be updated daily, at what number of users/intelligence call volume does it make sense to put the intelligence in a service vs in the client?
If your application needs to work on an airplane over the Pacific Ocean (with no Internet) what are the options for intelligence placement?
What if your app needs to function on an airplane, but the primary use case is at a user’s home? What are some options to enable the system to shine in both settings?

6/11/2017 Sent the new proposal to M.

6/11/2017 Sent the new proposal to O, they responded same day saying it was much improved and they would take another look, also asked if I’d be willing to teach some video courses as part of the deal. I said sure…

6/17/2017 I was getting serious about getting on with this book, so I’d decided to send the proposal to a new publisher every week till I ran out of publishers or got a deal… I let everyone know I was talking to multiple publishers at this point and then I sent my proposal to a wonderful publisher named Apress.

6/27/2017 Apress approved the deal and offered me a contract. I contacted M and O, but neither opted to make competing offers.

Here is a link to my successful proposal: Proposal-SUCCESS

Summary

Publishing moves slowly. Notice it took 3-4 months from my pitches to my rejections. Also notice I had to ask for a response several times. I think this is because publishers are busy, but also because my early proposals were borderline, so acquisition editors didn’t know exactly what to do with them.
Feedback is key. Giving feedback is hard, receiving feedback is hard. Being rejected is pretty hard too. Don’t take it personally. Also keep in mind: when someone gives you feedback that something isn’t right (like by rejecting you) they are pretty much always correct – something isn’t right. When someone tells you how to fix the problem, they are often wrong. In the end, it’s your project, your vision. Take what you can from the feedback and then do the right thing!
When I finally talked to Apress things moved quickly. Maybe it is because my work was a better match for them than it was for the other publishers (and I wouldn’t have had to go through all the revisions if I’d gone to them first). But I don’t think so. I think it was faster because I’d finally gotten my proposal to where it needed to be.
This is just my experience. I’m sure publishers take all sorts of other things into account. For example, what else they have in their catalog, what they are hearing about at conferences, what they think your work will do to improve (or potentially hurt) their brand, what else they have in the works.

I hope this helps you get your book deal faster and easier than I did. You might want to learn about the process of writing the book.

You can also check out the final book here. You can also get an audio book version, which is free if you sign up for a trial account with Audible.

Will Mistakes Ruin the AI Revolution?

Intelligent Systems make mistakes. There is no way around it. Some of the mistakes will be inconvenient, and some will actually be quite bad. If left unmitigated, the mistakes can make an Intelligent System seem stupid. They could even render an Intelligent System useless or dangerous.

Here are some example situations that might result from mistakes in an Intelligent System:

You are talking to your wife, but your personal assistant thinks you said ‘Tell Bob…all the stuff you said to your wife…’
Your self-driving car starts following a lane that doesn’t exist and you end up in an accident.
Your social network thinks your posts are offensive…but they aren’t.

These types of mistakes, and many others, are just part of the cost of using machine learning and artificial intelligence to build systems.

And these mistakes are not the fault of the people doing the machine learning. I mean, I guess the mistakes could be their fault — it’s always possible for people to be bad at their jobs — but even people who are excellent — world class — at applied machine learning will produce intelligence that makes mistakes.

Mistakes in intelligent systems can occur when:

A part of your Intelligent System has an outage.
Your model is created, deployed, or interpreted incorrectly.
Your intelligence isn’t a perfect match for the problem (and it isn’t).
The problem evolves, so yesterday’s answer is wrong for today.
Your user base changes, and new users act in ways you did not expect.

Why mistakes in Intelligent Systems are so damaging

Intelligent experiences succeed by meshing with their users in positive ways, making users happier, more efficient, helping them act in more productive ways (or ways that better align with positive business outcomes).

But dealing with Intelligent Systems can be stressful for some users, by challenging expectations.

One way to think about it is this: Humans deal with tools, like saws, books, cars, objects. These things behave in predictable ways. We’ve evolved over a long time to understand them, to count on them, to know what to expect out of them. Sometimes they break, but that’s rare. Mostly they are what they are, we learn to use them, and then stop thinking so much about them.

Tools become, in some ways, parts of ourselves, allowing us powers we wouldn’t have without them.

They can make us feel good, safe, comfortable.

Intelligent Systems aren’t like this, exactly.

Intelligent Systems make mistakes. They change their ‘minds’. They take very subtle factors into consideration in deciding to act. Sometimes they won’t do the same thing twice in a row, even though a user can’t tell that anything has changed. Sometimes they even have their own motivations that aren’t quite aligned with their user’s motivations.

Interacting with intelligent systems can seem more like a human relationship than like using a tool.

Here are some ways this can affect users:

Confusion — When the intelligent system acts in strange ways or makes mistakes, users will be confused. They might want to (or have to) invest some thought and energy in understanding what is going on.

Distrust — When the intelligent system influences user actions will the user like it or not? For example, a system might magically make the user’s life better, or it might nag them to do things, particularly things the user feels are putting others’ interests above theirs (e.g. by showing them ads).

Lack of Confidence — Does the user trust the system enough to let it do its thing or does the user come to believe the system is ineffective, always trying to be helpful, but always doing it wrong?

Fatigue — When the system demands user attention, is it using it well, or is it asking too much of the user? Users are good at ignoring things they don’t like.

Creep-o-ville — Will the interactions make the user feel uncomfortable? Maybe the system knows them too well. Maybe it makes them do things they don’t want to do, or post information they feel is private to public forums. If a smart TV sees a couple getting familiar on the couch it could lower the lights and play some romantic music — but should it?

If these emotions begin to dominate users’ thoughts when they think about systems built with AI — we have a problem.

Getting Ready for Mistakes in your own Intelligent System

So is it time to give up?

No way!

You can take control of the mistakes in your intelligent systems, embrace them, and design systems that protect users from them.

But in order to solve a problem, you have to understand it, so ask yourself: what is the worst thing my Intelligent System could do?

Maybe your Intelligent System will make minor mistakes, like flashing a light the user doesn’t care about or playing a song they don’t love.

Maybe it could waste time and effort, automating something that a user has to undo, or causing your user to take their attention off of the thing they actually care about and look at the thing the intelligence is making a mistake about.

Maybe it could cost your business money by deciding to spend a lot of CPU or bandwidth, or by accidentally hiding your best (and most profitable) content.

Maybe it could put you at legal risk by taking an action that is against the law somewhere, or by shutting down a customer or a competitor’s ability to do business, causing them damages you might end up being liable for.

Maybe it could do irreparable harm by deleting things that are important, melting a furnace, or sending an offensive communication from one user to another.

Maybe it could hurt someone — even get someone killed.

Most of the time when you think about your system you are going to think about how amazing it will be, all the good it will cause, all the people who will love it. You’ll want to dismiss its problems; you’ll even try to ignore them.

Don’t.

Find the worst thing your system can do.

Then find the second worst.

Then the third worst.

Then get five other people to do the same thing. Embrace their ideas, accept them.

And then when you have fifteen really bad things your Intelligent System might do, ask yourself: is that okay?

Because these types of mistakes are going to happen, and they will be hard to find, and they will be hard to correct.

Making Your Mistakes Less Costly

Random, low-cost mistakes are to be expected. But when mistakes spike, when they become systematic, or when they become risky or expensive, you might consider mitigation. Common approaches include:

Find mistakes fast — by building lots of great feedback systems into your product, including ways for users to report problems and telemetry systems to capture examples of problems occurring. This type of investment will help you solve problems before they cause serious trouble, but it will also help you get data to make the system better.

Build better intelligence management — systems that allow you to deploy new intelligence cheaply and reliably, expose it to users in a controlled fashion, and roll it back if something goes wrong. The faster you can react to a problem, the more you can control the cost of the problem.

Rebalancing the experience — so that mistakes are less costly to the user, are easier for the user to notice, and are easier for the user to correct. For example, prompting the user to ask if they want to send a message to their friend, instead of automatically sending it. Or moving a suspicious email to a junk folder instead of deleting it. Or by simply reducing the frequency of interaction between the user and the intelligent system.

Solving a different problem — if the mistakes your system can make are too bad to contemplate… you might consider doing something else. This could be a simpler version of what you are trying to do (e.g. lane following as opposed to full driving automation). And working on this simpler problem can give you time to build towards solving the problem you really want to solve.

Implementing guardrails — such as simple heuristic rules that prevent the system from making obvious mistakes, or from making the same mistake over and over and over. Sure, your machine learning should be able to learn these things. But sometimes you need to take control for a while and help keep users safe and happy. Used sparingly, guardrails can be an effective addition to any intelligent system.

Investing more in intelligence — by building better models. You can do this by investing in machine learning, in the data that fuels the machine learning (including collecting more telemetry from the live service). You can do this by allowing more CPU at training time or at run time. And even automating parts of the intelligence creation process.
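To make two of these approaches a little more concrete, here is a minimal sketch of rebalancing the experience and adding a guardrail, using the junk-mail example from above. The function name, the trusted-sender rule, and the 0.5 threshold are all hypothetical choices made for illustration, not something a real mail system is known to use.

```python
def choose_action(spam_probability, sender, trusted_senders, junk_threshold=0.5):
    """Pick a low-cost action for an incoming email.

    All names and the threshold value are illustrative assumptions.
    """
    # Guardrail: a simple heuristic rule that overrides the model so it
    # cannot make the obvious mistake of junking mail from a known contact.
    if sender in trusted_senders:
        return "inbox"
    # Rebalanced experience: suspicious mail goes to the junk folder
    # (easy for the user to notice and undo) instead of being deleted.
    if spam_probability >= junk_threshold:
        return "junk"
    return "inbox"
```

Note that "delete" is never returned. The most costly mistake is removed from the experience, and the guardrail covers one case where the model should not be trusted.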

An active mistake mitigation plan can allow the rest of your Intelligent System to be more aggressive — and achieve more impact. Embracing mistakes, and being wise and efficient at mitigating them, is an important part of creating systems that work in practice.

You can learn much more in the book: Building Intelligent Systems. You can even get the audio book version for free by creating a trial account at Audible.

Also, check out my friend’s small business, which is currently being seriously affected by mistakes in a big company’s AI systems https://togethermade.com/.

Acing the Machine Learning Interview

A whiteboard during a machine learning interview.

In my decade of managing applied machine learning teams I’ve interviewed maybe a hundred people. Over that time, I’ve come to rely on two main questions. I’m going to tell you what they are.

First, a bit of philosophy. There are lots of things we could talk about in an interview:

What do you like?
What did you do in your last project?
Can you tell a good story about yourself?
Have you read lots of papers about machine learning?
Can you program?
Do you know statistics?

All of that is great, and of course candidates must know those things to get a job, but what I also want to know is: what can you do when you have a blank screen in front of you and an open-ended machine learning task to complete?

That isn’t easy to figure out in an interview, but I try. The approach I take is to talk through an end-to-end problem. For example:

Let’s walk through an example of intelligence creation: a blink detector. Maybe your application is authenticating users by recognizing their irises, so you need to wait till their eyes are open to identify them. Or maybe you are building a new dating app where users wink at the profiles of the users they’d like to meet. How would you build it?

There are so many interesting things to discuss, so many ways to approach this question, and I still learn from the conversations I have. A good answer has discussion on the following topics:

Understanding the Environment
Defining Success
Getting Data
Getting Ready to Evaluate
Simple Features and Heuristics
Machine Learning
Understanding the Tradeoffs
Assessing and Iterating

Understanding the Environment

The first step in every applied intelligence-creation project is to understand what you are trying to do. Detect a blink, right? I mean, what part of “detect a blink” is confusing? Well, nothing. But there are some additional things you’ll need to know to succeed. Candidates might ask things like:

What kind of sensor will the eye images come from? Will the image source be standardized or will different users have different cameras?
What form will the input take? A single image? A short video clip? An ongoing live feed of video?
Where will the product be used? On desktop computers? Laptops? Indoors? Outdoors?
How will the system use the blink output? Should the output of the intelligence be a classification (that is, a flag that is true if the eye is closed and false if it is opened)? Should the output be a probability (1.0 if the eye is closed, and 0.0 if the eye is opened)? Or should the output be something else?
What type of resources can the blink detector use? How much RAM and CPU are available for the model? What are the latency requirements?

That’s a lot of questions before even getting started, and the answers are important to making good decisions about how to proceed.

Defining Success

To succeed, the blink detector will need to be accurate. But how accurate? This depends on what it will be used for. I want to know if a candidate can consider the experience that their model will drive and discuss how various levels of accuracy will change the way users perceive the overall system.

Questions include:

How many mistakes will a user see per day?
How many successful interactions will they have per unsuccessful interaction?
What will the mistakes cost the user?

I look for a discussion of options for how accuracy and experience will interact, how users will perceive the mistakes, and how they will be able to work around them.
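The questions above reduce to simple arithmetic once you assume a usage rate and an accuracy. The sketch below is only illustrative: the function names are made up, and it assumes every interaction is equally likely to be a mistake, which a real system would not guarantee.

```python
def mistakes_per_day(interactions_per_day, accuracy):
    """Expected number of mistakes a user sees per day."""
    return interactions_per_day * (1.0 - accuracy)


def successes_per_mistake(accuracy):
    """Average number of successful interactions per unsuccessful one."""
    if accuracy >= 1.0:
        return float("inf")
    return accuracy / (1.0 - accuracy)
```

For example, under these assumptions a detector that is 99% accurate and is used 200 times a day would still show a user about two mistakes every day, or roughly one mistake per 99 successes.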

Getting Data

Data is critical to creating intelligence. If you want to do machine learning right out of the gate, you’ll need lots of training data. I hope a candidate can discuss two distinct ways to think about getting data:

Getting data to bootstrap the intelligence:

Search the web and download images of people’s faces that are a good match for the sensor the blink detector will be using (resolution, distance to the eye, and so on). Then pay people to separate the images into ones where the eye is opened and ones where it is closed.
Take a camera (that is a good match to the one the system will need to run on) to a few hundred people, have them look into the camera and close and open their eyes according to some script that gets you the data you need.
Something else?

How to get data from users as they use the system:

A well-functioning Intelligent System will produce its own training data as users use it. But this isn’t always easy to get right. In the blink-detector case some options include:

Tie data collection to the performance task: For example, in the iris-login system, when the user successfully logs in with the iris system, that is an example of a frame that works well for iris login. When the user is unable to log in with their iris (and has to type their password instead), that is a good example of a frame that should be weeded out by the intelligence.
Creating a data collection experience: For example, maybe a setup experience that has users open and close their eyes so the system can calibrate (and capture training data in the process). Or maybe there is a tutorial in the game that makes users open and close their eyes at specific times and verify their eyes are in the right state with a mouse-click (and capture training data).

Getting Ready to Evaluate

A candidate should have a very good understanding of evaluating models, including:

1. Setting aside data for evaluation:

Make sure there is enough set aside, and the data you set aside is reasonably independent of the data you’ll use to create the intelligence. In the blink-detector case you might like to partition by user (all the images from the same person are either used to create intelligence or to evaluate it), and you might like to create sub-population evaluation sets for: users with glasses, ethnicity, gender, and age.

2. Creating a framework to run the evaluation:

That is, a framework that takes an “intelligence” and executes it on the test data exactly as it will be executed at runtime. Exactly. The. Same.

3. Generating reports on intelligence quality that can be used to know:

How accurate the intelligence is.
If it is making the right types of mistakes or the wrong ones.
If there is any sub-population where the accuracy is significantly worse.
Some of the worst mistakes it is making.
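To make the first and third points concrete, here is a minimal sketch of partitioning by user and of a report that breaks accuracy down by sub-population. The sample layout (dictionaries with `user`, `features`, `label`, and `group` keys, where a label of True means the eye is closed) and both function names are assumptions made for this illustration.

```python
import random
from collections import defaultdict


def split_by_user(samples, eval_fraction=0.2, seed=0):
    """Split samples so that no user appears in both sets."""
    users = sorted({s["user"] for s in samples})
    random.Random(seed).shuffle(users)
    n_eval = max(1, int(len(users) * eval_fraction))
    eval_users = set(users[:n_eval])
    train = [s for s in samples if s["user"] not in eval_users]
    held_out = [s for s in samples if s["user"] in eval_users]
    return train, held_out


def quality_report(samples, predict):
    """Count error types overall and compute accuracy per sub-population."""
    confusion = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    per_group = defaultdict(lambda: {"correct": 0, "total": 0})
    for s in samples:
        predicted, actual = predict(s["features"]), s["label"]
        if predicted and actual:
            confusion["tp"] += 1
        elif predicted and not actual:
            confusion["fp"] += 1
        elif not predicted and actual:
            confusion["fn"] += 1
        else:
            confusion["tn"] += 1
        per_group[s["group"]]["total"] += 1
        per_group[s["group"]]["correct"] += int(predicted == actual)
    accuracy = {g: c["correct"] / c["total"] for g, c in per_group.items()}
    return confusion, accuracy
```

Splitting on the user rather than on the individual image keeps near-duplicate frames of the same person from leaking into the evaluation set. The per-group accuracy is one simple way to spot a sub-population where the model is significantly worse.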

Simple Features and Heuristics

I like to have some discussion about simple heuristics that can solve the problem, because:

Making some heuristics can help you make sure the problem is actually hard (if your heuristic intelligence solves the problem you can stop right away, saving time and money).
It can create a baseline to compare with more advanced techniques—if your intelligence is complex, expensive, and barely improves over a simple heuristic, you might not be on the right track.

In the case of blink-detection you might try:

Measuring gradients in the image in horizontal and vertical directions, because the shape of the eye changes when eyes are opened and closed.
Measuring the color of the pixels and comparing them to common “eye” and “skin” colors, because if you see a lot of “eye” color the eye is probably open, and if you see a lot of “skin” color the eye is probably closed.

Then you might set thresholds on these measurements and make a simple combination of these detectors, like letting each of them vote “open” or “closed” and going with the majority decision.
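Here is a rough sketch of that voting idea, working on a grayscale image stored as a list of rows of 0-255 values. Everything specific in it is an assumption for illustration: the three detectors, the idea that an open eye shows more edges and more dark pixels, and every threshold value. Real thresholds would have to be tuned on data.

```python
def mean_abs_gradient(image, dx, dy):
    """Mean absolute difference between each pixel and its (dx, dy) neighbor."""
    rows, cols = len(image), len(image[0])
    total, count = 0, 0
    for r in range(rows - dy):
        for c in range(cols - dx):
            total += abs(image[r + dy][c + dx] - image[r][c])
            count += 1
    return total / count if count else 0.0


def dark_fraction(image, dark_below=80):
    """Fraction of pixels dark enough to count as 'eye' color."""
    pixels = [p for row in image for p in row]
    return sum(p < dark_below for p in pixels) / len(pixels)


def eye_is_open(image, h_thresh=10.0, v_thresh=10.0, dark_thresh=0.1):
    """Each detector votes open (True) or closed (False); the majority wins."""
    votes = [
        mean_abs_gradient(image, 1, 0) > h_thresh,  # horizontal gradient
        mean_abs_gradient(image, 0, 1) > v_thresh,  # vertical gradient
        dark_fraction(image) > dark_thresh,         # amount of "eye" color
    ]
    return sum(votes) >= 2
```

Using three detectors avoids tied votes. A baseline like this is cheap to build and gives you a number to beat when you move on to machine learning.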

If a candidate has computer vision experience their heuristics will be more sophisticated. If they don’t have computer vision experience their heuristics might be as bad as mine. It doesn’t matter as long as they come up with some reasonable ideas and have a good discussion about them.

Machine Learning

I look for candidates who can articulate a simple “standard” approach for the type of problem we’re discussing. And I am aware that standards change. It doesn’t matter what machine learning technique the candidate suggests, as long as they can defend their decisions and exchange ideas about the pros and cons.

And here is where I bring in the second question. I let the candidate pick their favorite machine learning algorithm and then ask them to teach me something about it.

This can mean different things for different people. They might go to the board and explain the math about how to train the model. Maybe they explain the model representation and how inference works. They could discuss what types of feature engineering work well with the approach. Maybe they explain what types of problems the approach works well on — and which it works poorly on. Or maybe they explain the parameters the training algorithm has and what the parameters do and how they know which to change based on the results of a training run.

What’s important is that they understand the tool and make me believe they can use it effectively in practice.

Understanding the Tradeoffs

I want a candidate to be able to discuss some of the realities of shipping a model to customers. This is a process of exploring constraints and trade-offs. Discussing questions like these:

How does the intelligence quality scale with computation in the run-time?
How many times will we need to plan to update the intelligence per week?
What is the end-to-end latency of executing the intelligence on a specific hardware setup?
What are the categories of worst customer-impacting mistakes the intelligence will probably make?

The answers to these questions will help decide where the intelligence should live, what support systems to build, how to tune the experiences, and more. The candidate should be able to talk about these.
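The latency question is one of the few that can be answered with a quick measurement. The sketch below times a callable on sample inputs. The function name and the warm-up count are assumptions, and it only measures the model-execution part of the latency on whatever machine runs it, not network or sensor delays.

```python
import statistics
import time


def measure_latency(run_intelligence, inputs, warmup=3):
    """Return (median, worst-case) execution time in milliseconds."""
    # Warm-up calls so one-time setup costs do not distort the timings.
    for x in inputs[:warmup]:
        run_intelligence(x)
    timings = []
    for x in inputs:
        start = time.perf_counter()
        run_intelligence(x)
        timings.append((time.perf_counter() - start) * 1000.0)
    return statistics.median(timings), max(timings)
```

Reporting the worst case next to the median matters because users tend to notice the slow calls, not the typical ones.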

Assessing and Iterating

And of course, machine learning is iterative. The candidate must be able to talk about the process of iterating, saying things like:

You could look at lots of false positives and false negatives.
You could try more or different data.
You could try more sophisticated features.
You could try more complex machine learning.
You could try to change people’s minds about the viability of the system’s objectives.
You could try influencing the experience to work better with the types of mistakes you are making.
And then you iterate and iterate and iterate.

A junior candidate might start in the middle of this list and might only be able to talk about one or two of these topics. A senior candidate should have a good sense of all of them and be able to discuss options as I probe and add constraints. There is no right answer — good discussion is key.

And if you really want to learn how to ace the machine learning interview, you can check out the book or the audio book, which you can get for free if you start a trial account with Audible.