I keep seeing this discussion on what skills make a good Data Scientist. I tried to answer it through the classic “What vs. How Matrix”. I listed all the key Whats with their relative importance, then matched them to the Hows (the skills) by rating the impact of each skill on each What.
Whats: clarity on business objectives (3), ability to translate a business objective into an analytics objective (7), ability to translate an analytics objective into a choice of models (7), ability to make a judicious choice among models (5), data exploration ability (5), ability to sell analytics results and get buy-in (5), ability to handle big data (3), and productivity (3).
Importance scale: 7 = very important, 5 = important, 3 = moderately important.
Remember that the importance ratings are for a Data Scientist. Clarity on business objectives is good to have but not a must (someone else can explain them to you), whereas translating a business objective into an analytics goal is the Data Scientist's job (hence 7). Similarly, real big data is needed in only 10-20% of cases, and if you are smart enough, so-called Big Data can often be handled with in-database computing (e.g. Oracle) or parallelization on a single machine. Think through the other ratings just as objectively and see what results you get (a simple sum-product in Excel will do this for you).
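For anyone who prefers code to Excel, here is a minimal Python sketch of that sum-product. The What weights are the ones listed above. The impact ratings, the 0-9 impact scale, the three example skills, and the function name `score_skills` are all made up for illustration; they are not the ratings I actually used, so plug in your own.

```python
# Importance of each What, taken from the post (7 / 5 / 3 scale).
what_weights = {
    "clarity on business objectives": 3,
    "business objective -> analytics objective": 7,
    "analytics objective -> choice of models": 7,
    "judicious choice of models": 5,
    "data exploration": 5,
    "selling analytics results": 5,
    "handling big data": 3,
    "productivity": 3,
}

# HYPOTHETICAL impact ratings of each skill (How) on each What,
# on an assumed 0-9 scale. Whats that are not listed count as 0.
impact = {
    "breadth of analytics models": {
        "business objective -> analytics objective": 3,
        "analytics objective -> choice of models": 9,
        "judicious choice of models": 9,
    },
    "python": {
        "data exploration": 9,
        "handling big data": 3,
        "productivity": 9,
    },
    "domain knowledge": {
        "clarity on business objectives": 9,
        "business objective -> analytics objective": 3,
        "selling analytics results": 3,
    },
}


def score_skills(weights, impact):
    """Return (skill, score) pairs sorted by score, highest first.

    A skill's score is the sum-product of the What weights and the
    skill's impact ratings on those Whats.
    """
    scores = {
        skill: sum(weights[what] * rating for what, rating in ratings.items())
        for skill, ratings in impact.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


ranking = score_skills(what_weights, impact)
for skill, score in ranking:
    print(f"{skill}: {score}")
```

A misspelled What in the impact table raises a `KeyError` rather than being silently ignored, which is usually what you want when filling in a matrix like this by hand.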
My results are (in descending order of importance):
- Knowledge of the breadth of Analytics models
- Depth in understanding of a few important models
- Python and Fundamentals (Statistics/Probability)
- R and SQL
- A tie between Domain knowledge, MBA education and Java
- SAS/SPSS/Hive/Spark etc. come as the next set of skills
Curious to know what others get.