
10 Tips for Hiring a Data Scientist into a Tech Company

What gap are you trying to fill? Before you get close to offering a Data Scientist a job, you should be clear in your own mind what skills gap in your organisation you are trying to fill. In my experience these are good reasons to be hiring a Data Scientist:
  • You need someone with mathematical, and in particular statistical, skills who can do a better job of understanding data and creating meaningful outputs than your average accountant or computer scientist.
  • You need someone who thinks and operates in a numerically framed way, someone who is comfortable with representing concepts as graphs and formulae.
Those 2 core competencies are to be found in any successful Data Scientist. You may be tempted to frame the role in the following terms:
  • You need someone to make sense of a large dataset, to understand the dimensionality and the "shape" or distribution of the key components of that data.
  • You need someone who can create, improve or debug some very sophisticated algorithms, i.e. the kind of algorithms that a software engineer would claim to be too complicated to be practical.
But these 2 role traits are per-project skills. If you only have 1 dataset or 1 key algorithm then you don't need to hire a Data Scientist; you need to offer a short-term contract to a Data Science consultancy or a trusted academic. If you don't have a stream of interesting data problems to throw at your Data Scientist then don't bring that skill in-house, because there are more effective options. Thus...
Tip 1 : Make sure the role is big [data] enough

You need to think about how the Data Scientist will integrate with the rest of the tech team. I have come across organisations where the Data Scientists are engaged in a full-on insult contest with the software developers. This kind of issue comes from software developers typically not being good at maths and Data Scientists not having day-to-day industrial coding skills, with both sides fearing that the other is a threat to their skill dominance. Bringing in a Data Scientist forms 3 relationships: with the developers (A), with the Product team (B) and with the business (C).


For success, the critical relationship is between the Product and Data teams (B); the other 2 relationships are harder to form. A good product manager will be able to maintain the motivation of both the developer and data science teams by drawing on their strengths and not exposing their weaknesses. Eventually a level of respect for each other's talents will lead to a Developer to Data Scientist working relationship (A). The relationship with the business (C) is also hard, because business expectations of data projects are currently very high and the cost and lead times of those projects make them a major investment; as a result it will take time for that relationship to form in a sustainable way. Product managers should already be adept at creating bridges between teams. Thus...
Tip 2 : Align the role with the Product Management team

A positive aspect of aligning with the Product team is that the Data Science effort is targeted at product goals, and the statistical skills are not sidetracked into more general BI. It is not that Data Scientists can't do BI, but if you want them to go in that direction you should put that in the job description (which will alter who applies).

In the same way that you should keep the job description focused on the product area, you should take care to avoid including detailed engineering or environment specifics. It is true that a Data Scientist will be more productive in a commercial context on day 1 if they are already familiar with the systems (source control, project management, build and release processes), but they can learn these systems, and the introduction of data-centric development is very likely to change these established processes over time. Should you teach Data Scientists to do BDD or Software Engineers to use Jupyter Notebooks? Either way you should keep the job description tight and avoid all the nice-to-have requirements that will cut down the applicant list. You can take these things into account, but you want the strongest Data Scientist, not the one who is most like your average developer. Thus...
Tip 3 : Stick to Data Science skills in the job description

At this point you might find that you are not looking for a Data Scientist at all; perhaps you are in fact looking for a Data Engineer or a Data Ops person to fulfil your business goals. These roles are more aligned with systems engineering and operations skills, and you are really looking for specific experience with big data and ML workloads. If this is the case please consider getting in touch via contact@infer.systems ;)

So before you advertise the position you will need to have a budget in mind; I suggest you take the average Developer salary and double it. Commercial Data Scientists are in high demand, and in such an inflationary market there will be little or no correlation between the salary and the strength of the candidate. By paying more you get more experience, but no real guarantee of more skill or of the ability to achieve ambitious goals. Thus...
Tip 4 : Do not put a salary on the job advert, you want to see all applicants

Data Scientists are rare beasts when compared to the overall tech market, so you need to use the same hiring techniques as you would do for any hidden talent pool. Thus...
Tip 5 : Get the word out via Meetups and specialist Recruiters

Whether you get a trickle or a flood of applications will depend on your specific advert and role; but you will notice that the applicants fall into 3 basic categories:
  1. People who wish they could do Data Science, but cannot demonstrate the capability;
  2. Engineers who can do enough maths and have learned commercial Data Science;
  3. Physicists (or graduates of a similarly computational and data-intensive science discipline).
It is possible to find what you are looking for in the first 2 categories, but unless you are certain you have found the best candidate ever you are taking a big risk. I am an Engineer myself, and I know that I am not an A-list hire for a Data Science role. Engineers make very good Data Engineers and Ops team players. But just filtering for Physicists is not enough: the world of academia is not a good place to develop the skills to handle the commercial pressures of Data Science. There is time pressure, but nothing like that in a commercial context; there is teamwork, but at an insignificant scale compared to large tech companies; there is openness and transparency, but nothing like the secrecy culture of companies like Apple. Thus...
Tip 6 : Are Engineers just as good as Physicists? No, so filter for solid computational science backgrounds AND commercial experience

A good Data Scientist will be well aware of the commercial value of their discoveries and insight in a role, so the impact they have had on the organisations and projects they have worked on will be hard to discuss in a public context - expect to see some very vague language in good CVs. Candidates will also struggle to describe their achievements in an interview without compromising their employer's IP. Thus...
Tip 7 : Don't interview applicants from competing businesses unless you want to end up in court

And for the sake of your own business ambitions, remember that you don't know who else this Data Scientist is interviewing with. You still need to be relatively closed about the goals and product directions, but with an NDA in place at least you can get deeper into the science with less worry around context. Any candidate with commercial experience will understand the need for an NDA. Thus...
Tip 8 : Interview under mutual NDA in all cases

During the interview process you are actually looking for research skills, so you need to give the applicants the opportunity to show that they can research your problem. Drip-feed them the areas you are working in during the first interview and then go back over those areas in a subsequent session. In particular you want them to have read and understood the generally applicable algorithms and models prevalent in your problem domain. Thus...
Tip 9 : Use at least 2 interviews and look for the homework they have done between them

In Developer interviews there is often a coding test. The equivalent for a Data Science role is harder to execute: the data is large, the candidate may not have access to the storage and processing capabilities required to analyse it, and the data may contain PII or commercially sensitive information that cannot be exported from the confines of the corporate IT systems. But you still need to validate the skill set, see how the candidates cope with a product description, and see how well they communicate their intermediate and final thinking. It should be possible to find a data set that you know well and can explore interactively with the candidate. Much as pair programming can create code, pair data exploration can lead to a better demonstration of real capabilities. Thus...
Tip 10 : Use paired interactive data exploration to assess data handling and soft skills
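
As an illustration only, a paired session might open with a quick profile of a single column before any modelling is discussed. The sketch below uses nothing but the Python standard library, so it runs on any interview laptop; the function name `profile_column` and the sample `session_minutes` data are hypothetical and not taken from a real project.

```python
import statistics


def profile_column(values):
    """Summarise one numeric column; None marks a missing value."""
    present = [v for v in values if v is not None]
    mean = statistics.mean(present)
    median = statistics.median(present)
    return {
        "rows": len(values),
        "missing": len(values) - len(present),
        "mean": mean,
        "median": median,
        "stdev": statistics.stdev(present),
        # A mean above the median hints at a long right tail.
        "right_skewed": mean > median,
    }


# Hypothetical sample: session lengths in minutes, with one gap and one outlier.
session_minutes = [3, 5, 4, None, 6, 5, 120]
profile = profile_column(session_minutes)
print(profile)
```

The numbers matter less than the conversation they start: does the candidate ask about the missing row, question the outlier, and explain what the gap between mean and median suggests about the distribution?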

If you manage to follow these tips then the role should be right, the management structure around that role should be right, and the job description and salary will pull in the right candidates once you get the word out. You will know what to filter the CVs on, when to set up the NDA and how to check the skills of the applicants. The only trouble is that demand still far outstrips the supply of good, commercially minded Data Scientists, so you may have to revise your budget upwards to meet the expectations of the candidates in terms of both salary and interesting projects.

Infer Systems can help you with the process of getting the most value out of your Data Science team by supporting them with the cloud infrastructure and software engineering tools required to effectively run a commercial ML or data driven product. Please get in touch via contact@infer.systems to find out more.

