Misc

A personal take on the latest AI job market

Reading Time: 4 minutes

Over a month ago, I was back on the job market. I connected with my friends, and here are the opportunities I found, along with a few tips on what the market is looking for.

  1. Products that use search. I met a lot of knowledgeable individuals in this area and learned a great deal along the way. Needs in this area revolve around traditional search algorithms mixed with neural network approaches. I was surprised to find that keyword-based algorithms like BM25 are still among the most widely used methods. BM25 is similar to TF-IDF, but rather than just counting rare words, it also normalizes term frequency by the length of the document, so longer texts are not favored simply because they contain more words (a rough BM25 sketch follows after this list). Even though it captures no semantic meaning, the algorithm is still among the top performers on the BEIR benchmark. This surprised me because I have used similar algorithms in topic modeling and found the results to be much inferior to sentence embeddings. The groups I talked to in this area are more concerned with personalizing search results. Many of the techniques require “augmentation” of the search query. For example, using intent classification on the search query to infer its purpose. Since many search queries are short phrases with barely any context, the algorithm essentially needs to guess what people are asking for based on their personal history, location, and what other users found useful. Other preprocessing includes segmentation of the search context so that larger groups are identified; these groups, rather than individual documents, become the basic units to be searched.
  2. Fine-tuning of large language models. Since GPT-4 and large language models are the current hot topic, many companies are looking to take advantage and ride the wave. It’s not always the best solution for their business case, but companies do occasionally find it useful. More often, people use out-of-the-box LLMs to get answers out of their own documentation. For example, if we want to be able to search through written documents and find relevant answers, we can simply feed that large document, in chunks, to a GPT-4 model and ask for an answer. Or we can use an LLM to build embeddings, run the search query against those embeddings using cosine similarity, and rank the results (see the retrieval sketch after the list). Until recently, GPT-4 models had no access to internet data to fact-check against. Even if they do, they might not know business-specific knowledge, so there are packages that help users search through that information as well. Techniques like Retrieval-Augmented Generation (RAG) and Low-Rank Adaptation (LoRA) are particularly useful. A major advantage of these approaches is that they don’t require a lot of data to fine-tune the large language model.
  3. Causal inference modeling. This type of analysis is common in industries that need to figure out root causes. One example is the insurance industry. If I want to figure out the important factors that influence the price of auto insurance, I can simply run a generalized linear model with features like age, marital status, etc. to predict the insurance premium, and use models with feature importance to figure out which features influence the premium the most (a sketch of this baseline appears after the list). But this often does not solve the business problem, since the executives will question why a particular feature is considered important. So the data scientist needs to figure out not only what the factors are, but also how they act. One way people do this is by constructing equations for how they believe these features influence the premium, often based on industry experience and intuition. I cannot say I understand this part, so it seems that many years of experience are needed.
  4. Company-wide tool development. In larger corporations, there are research teams that build machine learning tools aimed at helping the whole company. Since ML talent is scarce at any company, these teams are asked to maximize its value: instead of targeting specific use cases, they develop generic tools. This is actually the most common theme among the jobs I applied to. The teams range from brand-new departments to seasoned organizations with years of ML research experience. The main difference is that newer teams focus more on tool building, whereas mature teams think more about large-scale production and data normalization.
  5. Specific product development. As opposed to company-wide tool development, many companies have specific productization goals. These may or may not include an ML-specific component, but I tried for these positions anyway. Most of them were not the best fit for me, some for ethical reasons, and some simply because I’m not that good of a programmer and they don’t need ML knowledge at all. All in all, I’m glad I tried, just to see where I stand and what I don’t want to do.
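
For the search work in item 1, here is a minimal BM25 scoring sketch over a made-up two-document corpus, just to illustrate the document-length normalization that a plain term count lacks. The parameters k1=1.5 and b=0.75 are conventional defaults, not something from the conversations above.

```python
import math
from collections import Counter

def bm25_scores(query_tokens, corpus_tokens, k1=1.5, b=0.75):
    """Score each document in corpus_tokens against the query (Okapi BM25)."""
    n_docs = len(corpus_tokens)
    avg_len = sum(len(d) for d in corpus_tokens) / n_docs
    # document frequency of each query term
    df = {t: sum(1 for d in corpus_tokens if t in d) for t in set(query_tokens)}
    scores = []
    for doc in corpus_tokens:
        tf = Counter(doc)
        score = 0.0
        for t in query_tokens:
            if df.get(t, 0) == 0:
                continue
            idf = math.log(1 + (n_docs - df[t] + 0.5) / (df[t] + 0.5))
            # term frequency saturates via k1 and is normalized by doc length via b
            denom = tf[t] + k1 * (1 - b + b * len(doc) / avg_len)
            score += idf * tf[t] * (k1 + 1) / denom
        scores.append(score)
    return scores

# toy corpus: the longer document is not favored just for having more words
docs = [d.lower().split() for d in [
    "cheap flights to tokyo",
    "tokyo travel guide with flights hotels and restaurants near tokyo station",
]]
print(bm25_scores("tokyo flights".split(), docs))
```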
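
For item 2, here is a sketch of the embedding-plus-cosine-similarity retrieval idea, using the sentence-transformers library as one possible embedding model. The model name and the toy document chunks are my own illustrative choices; the top-ranked chunks would then be passed to the LLM together with the question.

```python
import numpy as np
from sentence_transformers import SentenceTransformer  # one possible embedding backend

# Embed document chunks once, embed the query, rank chunks by cosine similarity,
# then hand the best chunks to an LLM as context for answering.
model = SentenceTransformer("all-MiniLM-L6-v2")

chunks = [
    "Refunds are processed within 5 business days.",
    "Our support line is open Monday through Friday.",
    "Enterprise plans include a dedicated account manager.",
]
chunk_vecs = model.encode(chunks, normalize_embeddings=True)

query = "How long do refunds take?"
query_vec = model.encode([query], normalize_embeddings=True)[0]

# cosine similarity reduces to a dot product on normalized vectors
scores = chunk_vecs @ query_vec
top = np.argsort(scores)[::-1][:2]
context = "\n".join(chunks[i] for i in top)
print(context)  # this context plus the query would be sent to the LLM
```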
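
And for item 3, a toy sketch of the descriptive baseline: fit a generalized linear model of premium on a couple of rating factors and read the fitted coefficients as a rough signal of which features move the premium. The data here is simulated purely for illustration, and this does not attempt the causal part that, as noted above, seems to require industry experience.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

# Made-up data: premium depends on age and marital status through a log-linear mean.
rng = np.random.default_rng(0)
n = 500
df = pd.DataFrame({
    "age": rng.integers(18, 70, n),
    "married": rng.integers(0, 2, n),
})
true_mu = np.exp(6.5 - 0.01 * df["age"] - 0.1 * df["married"])
df["premium"] = rng.gamma(shape=5.0, scale=true_mu / 5.0)

# Gamma GLM with a log link, a common choice for premium-style targets
model = smf.glm("premium ~ age + married", data=df,
                family=sm.families.Gamma(link=sm.families.links.Log()))
result = model.fit()
print(result.summary())  # coefficients hint at which features move the premium
```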

In the end, my decision was to go with the position that is most comfortable for family reasons. That same role also has enough ML needs that I can use my past experience and explore new things at the same time. To sum up the learnings from this experience: get as many options as possible, and really figure out what you want to do. Don’t shy away from tough interview questions, because sometimes they help you figure out your life purpose as well. And for offer negotiations, having multiple offers makes it a lot easier to play the “don’t tell your salary expectation” game. It helped me immensely, since I don’t have the natural game to get the maximum offer on my own.