A guide to hiring in AI
Everything I've learned

Every few months, some question rises to the surface, repeated over and over in conversation. If it's something I know a little about I'll contribute, but more often than not I try to listen and understand patterns and examples - what's worked, what hasn't.

Eventually I find myself writing one of these - about model selection, using connections in RAG systems, defensibility as an AI company, prompting or product management.

Today it's about hiring in AI. In the last two weeks I've had over twenty conversations - ranging from companies working directly in this wave of AI, to companies building AI-enabled products, to more traditional firms that want to improve operations and see what can change.

The problems are always the same:

  1. How do we know who to look for?
  2. How do I vet applicants?
  3. What should we look for once they're here?
  4. How do we set them (and us) up for success? How do we keep them around?

These are worthwhile problems to solve. Of course in every cycle, there is a point at which hype outpaces value. We might already be there, but I'm now firmly of the opinion that this AI summer holds quite a bit of value under the hype.

Modern LLMs (and LLM-powered tools) have been the single greatest force multiplier I've seen in my time, with the widest application surface, perhaps rivalled only by spreadsheets and code hints. Every single company I know could use at least some knowledge of, and expertise in, modern AI - either at the tool-usage level or the engineering level.

Before we get into it, I want to point out a few caveats:

  1. This is for companies looking to hire people to work with LLMs (or other parts of GenAI) - either to build their own AI systems, improve their business, or to figure out what needs to be built from scratch.
  2. This is not a guide for hiring ML engineers. There is now a very clear separation between people who build using LLMs and people who build the models themselves. The two regimes have become very different - on the ML side you can expect to have more data, more general-purpose compute, and it's okay to be months away from production deployments. Hiring in ML - should you need to - has been around longer, and you're far better served following established best practices in the field (citations, papers, PhD supervisors, Kaggle, etc.) specific to your need, rather than this guide.
  3. Almost anything here will need to evolve as the active engineering layer changes. The advice I was giving (myself) a year ago had to do more with simple knowledge around models, understanding attention and its limitations, and prompting. Today a lot of that has been abstracted away. Models are good enough and cheap enough that the fine-grained prompting tips may not matter as much.
  4. Finally, this is also not a guide to vet/hire contractors and consultants. If at all possible, you should play the longer game and build towards having in-house experience, and making it last. We might honestly be approaching the point where the run-and-gun opinions and work you'll get part-time can be replicated by an LLM armed with some good courses and guides. The industry-level changes being made in this cycle will last for a long time, and the likely winners will be companies that have put new ways of working and building products into their DNA - something that will take more time, but pay off in the medium and long term.

§Who do we look for?

This has become a tough question because of a few facts. Some of these took me a while to really get used to.

Once you get used to them, you can start to form a few guidelines for what makes a reasonable applicant. Let's see if we can't use a points system here.

  1. Using AI is a must: Unlike every other cycle of AI, this one started with reasonably priced models behind an easy-to-use API that consumes and outputs text. Since then, they've powered the fastest-growing consumer application, had insane amounts of press coverage, and have become 100x cheaper and faster. At this point, any applicant needs a good reason not to have used them, or built things with them.
👇

If they haven't built something involving AI before, -100 points from Gryffindor.

If they've built something simple, +10 points.

If they've built something that they (or someone they know) is a daily active user of, +100 points.

(I want to reiterate that this guide is entirely about hiring for AI engineering positions or teams. Please don't judge sysadmins or petrochemical engineers by these metrics.)
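For calibration, the bar for "has used it" is genuinely low. Here's a minimal sketch of building with a hosted model, assuming the OpenAI Python SDK and an API key in the environment (any provider with a compatible chat API looks almost identical):

```python
# Minimal "I've built something with an LLM" bar: one chat-completions call.
# Assumes the openai package is installed and OPENAI_API_KEY is set.
from openai import OpenAI

client = OpenAI()
response = client.chat.completions.create(
    model="gpt-4o-mini",  # any cheap, fast model is fine here
    messages=[
        {"role": "system", "content": "You are a terse assistant."},
        {"role": "user", "content": "Summarise this support ticket in one line: ..."},
    ],
)
print(response.choices[0].message.content)
```

Anything a candidate has built will have started somewhere around here.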

  2. Understanding search/retrieval is a big plus: This doesn't have to be in the context of LLMs or RAG. Any amount of understanding of how basic search systems work (BM25, Elastic, etc) and how to use them in practice is a huge leg up. As models get smarter, the problem space becomes learning how to feed them the right information at the right time, and transforming information into formats they can understand.
👇

If they know the basics of string search, string distance, etc, +10 points.

If they can spin up elastic or do some basic BM25, +50 additional points.

If they have a working understanding of embeddings and what they should not be used for, +50 points.
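To make that bar concrete, here's roughly the "knows the basics" level - lexical scoring plus simple string distance. This sketch assumes the rank_bm25 package (my pick for illustration, not a requirement); difflib ships with Python:

```python
# "Knows the basics" level: lexical (BM25) scoring plus fuzzy string matching.
# Assumes the rank_bm25 package is installed; difflib is in the standard library.
from difflib import SequenceMatcher
from rank_bm25 import BM25Okapi

docs = [
    "refund policy for annual plans",
    "how to rotate api keys",
    "invoice history and billing contacts",
]
bm25 = BM25Okapi([d.split() for d in docs])

query = "rotate my api key"
scores = bm25.get_scores(query.split())
best = max(range(len(docs)), key=lambda i: scores[i])
print("BM25 pick:", docs[best])

# String distance is handy for fuzzy matching of names, titles, and fields.
print(SequenceMatcher(None, "api keys", "api key").ratio())
```

Embeddings come in when lexical matching stops being enough - and knowing where that line is, is exactly what the last +50 is for.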

  3. Comfort with OSS models is a bonus: Open-source models are very much part of the future - either in industries with requirements to seal processing away from providers, or as small models embedded inside applications to help with tasks. Knowing your way around is useful.
👇

If they've used LM Studio or Ollama to run a model from Hugging Face, +10 points.

If they can outline idiosyncrasies between models, +20 additional.

If they want to talk about model architectures and how they're different, +20 additional.

If they know about quantization, and how to pick a model size, +20 additional points!
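For the local-model side, the floor is similarly low. A sketch against Ollama's local REST API, assuming Ollama is running and a model (say llama3.1) has already been pulled:

```python
# Local OSS model check: one request to Ollama's REST API on localhost.
# Assumes Ollama is running and `ollama pull llama3.1` has been done.
import requests

resp = requests.post(
    "http://localhost:11434/api/generate",
    json={
        "model": "llama3.1",
        "prompt": "In one sentence: what does 4-bit quantization trade away?",
        "stream": False,
    },
    timeout=120,
)
print(resp.json()["response"])
```

The interesting conversations start once this works: which quantization, which size fits the hardware, and where the model starts to fall over.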

The best AI engineers I've found have not always been engineers. Some have been CEOs, some product people, and some had only been learning to code for two months. What's been crazy is that the share of good ideas I've seen proposed and implemented has not skewed heavily towards those with the most engineering experience.

§How do we vet?

The previous section should hopefully serve as a guide to thin out your pipeline. Once you're past that point - and you'll notice almost everything in there can be done in a few days as a candidate if you're willing to spend the time - there isn't a replacement for real projects.

If a candidate has real projects under their belt, you should absolutely be using those instead of what I'm suggesting below. However, I've encountered many engineers without public projects who've gone on to build some truly amazing things. In this case, take-homes are a good idea - provided you can limit the time to a few hours.

Here's what I recommend. Assemble a pipeline of 3-6 projects, none of which should take more than a week - I'll provide some examples below. Line them up by difficulty. Place one of them in the hiring pipeline, and the rest in the probation/onboarding side of things.

The project in the pipeline should ideally be time-limited to a few hours. My template for doing this has been to suggest starting with an architectural overview, early proof of concept, and to effectively stop at the time limit and judge how far we've gotten.

For the projects on the onboarding side of things, what's interesting is that you can build a library of past work on the same problem. Once someone completes one, they can see the approaches everyone before them took, then compare, contrast, and discuss. Ideally the projects encourage learning and evaluation from both sides of the table - and it's a bonus if the result is something that can be useful to the individual or the firm.

If you set it up right, you have a chance to create an atmosphere of continuous learning. Even the same project repeated every month for the last year would have arrived at completely different methods and solutions - I know.

Here are some projects/take-homes you're free to use. These are also great if you're looking for projects to build as a way to learn AI engineering:

  1. Structured data extraction: Given a set of images or PDFs, can you extract data matching a given typespec? Given that this is simple with current models, you're judging model selection, cost, speed, success rates, and the overall quality of the engineering.
  2. Actions: Build an AI "agent" that uses at least five tools to do anything of your choice, on request from the user. One of the tools has to be an external API, and another has to do something with the disk. For bonus points, you can think about making it multi-model and multi-modal.
  3. Blind SQL Generation: Given nothing but the connection to a Postgres database, can you build an AI-based search engine that can take a question, generate a query, execute and return an answer? If it's too hard you can provide the table spec. If it's too easy some bonus points would be handling multi-turn questions and contexts, self-healing on error, streaming, etc.
  4. An email gauntlet: Doesn't have to be email, could be Slack or something else. Can you filter and forward emails based on a simple ruleset? Additional points for annotating the emails with relevant summaries and context, and stream processing incoming events (emails/messages) without much global state.

What's also really interesting is how much the floor for development has changed in the past year. Any of these (especially 3.) would have been projects that took me more than a week to get to a proof-of-concept, but today just pasting in this blog post and asking for an answer gave me a working prototype from 3.5 Sonnet in a single shot!
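For a sense of that scale, here's roughly the core loop of project 3 - a sketch assuming psycopg2, the OpenAI SDK, and a read-only Postgres user, with the schema pulled from information_schema rather than provided up front:

```python
# Core loop of the "blind SQL generation" take-home: introspect the schema,
# ask a model for a query, execute it, return rows.
# Assumes psycopg2 and openai are installed; use read-only credentials.
import psycopg2
from openai import OpenAI

conn = psycopg2.connect("dbname=app user=readonly")
client = OpenAI()

def get_schema() -> str:
    with conn.cursor() as cur:
        cur.execute(
            "SELECT table_name, column_name, data_type "
            "FROM information_schema.columns WHERE table_schema = 'public'"
        )
        return "\n".join(f"{t}.{c} ({d})" for t, c, d in cur.fetchall())

def answer(question: str):
    sql = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {"role": "system", "content": "Return one read-only Postgres query as plain SQL, no markdown."},
            {"role": "user", "content": f"Schema:\n{get_schema()}\n\nQuestion: {question}"},
        ],
    ).choices[0].message.content.strip()
    with conn.cursor() as cur:
        cur.execute(sql)  # a real submission should validate, retry, and self-heal here
        return cur.fetchall()

print(answer("How many users signed up last week?"))
```

Everything that separates candidates lives in what's missing from this sketch: error handling, self-healing on bad SQL, multi-turn context, streaming.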

My favorite tasks - like my favorite video games - are easy to do and hard to master. Of course, you can skip the take-home in the pipeline if they have enough projects under their belt - we're simply looking for how comfortable someone is with the business end of a modern AI model.

§What else?

The one thing that's helped me the most is writing. The more you can talk about yourself, your firm and what you want to be doing, the easier it becomes to attract an almost passive incoming stream of good people who agree or feel the same way.

If you manage a team pointed at (or anywhere near) AI, the biggest worry you'll have is churn. The floor has reset upwards, especially with the influx of capital, however long that lasts. It often seems that a single engineer with the right tools and the flexibility to keep learning can get to 1M ARR in a year on their own.

My best path to countering this has been to build an environment of learning and growth. The one massive disadvantage of a tiny company of one (or two) is that, with the right culture, many minds can learn faster than one. If you can make yourself and your team feel like they can move forward faster together than they could alone, you might just stop a leak.

Hope that helps!

This guide is the combination of a set of conversations and lived experiences around shared concerns, but it's still one man's opinion. People are the most diverse thing we have, which makes anecdotal advice rarely exactly right. If you have thoughts/disagreements/other concerns, I'd love to know!

Hrishi Olickel
19 Aug 2024