What makes a good Data Scientist? Answered by leading Data Officers!

What makes a good Data Scientist? A question I got asked recently a lot by data science newbies as well as long-established CIOs and my answer ist probably not what you think:
In my opinion is a good Data Scientist somebody with, at least, a good knowledge of computer programming, statistics and the ability of understanding the customer´s business. Above all stands a strong interest in finding value in distributed data sources.

Debatable? Maybe. That’s why I forwarded this question to five other leading Data Scientists and Chief Data Officers in Germany, let’s have a look on their answers to this question and create your own idea of what a good Data Scientist might be:


Dr. Andreas Braun – Head of Global Data & Analytics @ Allianz SE

A data scientist connects thorough analytical and methodological understanding  with a technical hands-on/ engineering mentality.
Data scientists bridge between analytics, tech, and business. “New methods”, such as machine learning, AI, deep learning etc. are crucial and are continuously challenged and improved. (14 February 2017)


Dr. Helmut Linde – Head of Data Science @ SAP SE

The ideal data scientist is a thought leader who creates value from analytics, starting from a vision for improved business processes and an algorithmic concept, down to the technical realization in productive software. (09 February 2017)


Klaas Bollhoefer – Chief Data Scientist @ The unbelievable Machine Company

For me a data scientist thinks ahead, thinks about and thinks in-between. He/she is a motivated, open-minded, enthusiastic and unconventional problem solver and tinkerer. Being a team player and a lone wolf are two sides of the same coin and he/she definitely hates unicorns and nerd shirts. (27 March 2017)

 


Wolfgang Hauner – Chief Data Officer @ Munich Re

A data scientist is, from their very nature, interested in data and its underlying relationship and has the cognitive, methodical and technical skills to find these relationships, even in unstructured data. The essential prerequisites to achieve this are curiosity, a logical mind-set and a passion for learning, as well as an affinity for team interaction in the work place. (08 February 2017)

 


Dr. Florian Neukart – Principal Data Scientist @ Volkswagen Group of America

In my opinion, the most important trait seems to be driven by an irresistible urge to understand fundamental relations and things, whereby I summarize both an atom and a complex machine among “things”. People with this trait are usually persistent, can solve a new problem even with little practical experience, and strive for the necessary training or appropriate quantitative knowledge autodidactically. (08 February 2017)

Background idea:
That I am writing about atoms and complex machines has to do with the fact that I have been able to analyze the most varied data through my second job at the university, and that I am given a chance to making significant contributions to both machine learning and physics, is primarily rooted in curiosity. Mathematics, physics, neuroscience, computer science, etc. are the fundamentals that someone will acquire if she wants to understand. In the beginning, there is only curiosity… I hope this is not too out of the way, but I’ve done a lot of job interviews and worked with lots of smart people, and it has turned out that quantitative knowledge alone is not enough. If someone is not burning for understanding, she may be able to program a Convolutional Network from the ground but will not come up with new ideas.

 


Interview – Using Decision Science to forecast customer behaviour

Interview with Dr. Eva-Marie Müller-Stüler from KPMG about how to use Decision Science to forecast customer behaviour

Dr. Eva-Marie Müller-Stüler is Chief Data Scientist and Associate Director in Decision Science at KPMG LLP in London. She graduated as a mathematician at the Technical University of Munich with a year abroad in Tokyo, and completed her Doctorate at the Philipp University in Marburg.

linkedin-button xing-button

Read this article in German:
“Interview – Mit Data Science Kundenverhalten vorhersagen “

Data Science Blog: Ms Dr. Müller-Stüler, which path led you to the top of Analytics for KPMG?

I always enjoyed analytical questions, and have a great interest in people and finance. For me, understanding how people work and make decisions is incredibly exciting. In my Master’s and my PhD theses I had to analyse large amounts of data and had to program various algorithms. Now, combining a solid mathematical education with specific industry and business knowledge enables me to understand my clients’ businesses and to develop methods that disrupt the market and uncover new business strategies.

Data Science Blog: What kind of analytical solutions do you offer your clients? What benefits do you generate for them?

Our team focuses on Behaviour and Customer Science under a mantra and mission: “We understand human behaviour and we change it”. We look at all the data artefacts a person (for example, the customer or the employee) leaves behind and try to solve the question of how to change their behaviour or to predict future behaviour. With advanced analytics and data science we develop “always-on” forecasting models, which enable our clients to act in advance. This could be forecasting customer demand at a particular location, how it can be improved or influenced in the desired direction, or which kind of promotions work best for which customer. Also the challenge of predicting where, and with what product mix, a new store should be opened can be solved much more accurately with Predictive Analytics than by conventional methods.


Data Science Blog: What prerequisites must be fulfilled to ensure that predictive analyses work adequately for customer behaviour?

The data must, of course, have a certain quality and history to recognize trends and cycles. Often, however, one can also create an advantage by using additional new data sources. Experience and creativity are enormously important to understand what is possible and how to improve the quality of our work, or whether something only increases the noise.

Data Science Blog: What external data sources do you need to integrate? How do you handle unstructured data?

As far as external data sources are concerned, we are very spoiled here in England. We use about 10,000 different signals on average, and which vary depending on the question. These might include signals that show the composition of the population, local traffic information, the proximity of sights, hospitals, schools, crime rates and many more. The influence of each signal is also different for each problem. So, a high number of pick pocketing incidences can be a positive sign of the vibrancy of an area, and that people carry a lot of cash on average. For a fast food retailer with a presence in the city centre, for example, this could have a positive influence on a decision to invest in a new outlet in the area, in another area the opposite.

Data Science Blog: What possibilities does data science provide for forensics or fraud detection?

Every customer is surrounded by thousands of data signals and produces and transmits more by through his behaviour. This enables us to get a pretty good picture about the person online. As every kind of person also has a certain behavioural pattern (and this also applies to fraudsters) it is possible to recognise or predict these patterns in time.

Data Science Blog: What tools do you use in your work? When do you rely on proprietary software or on open source?

This depends on what stage we are in the process and the goal defined. We differentiate our team into different groups: Our Data Wranglers (who are responsible for extracting, generating and processing the data) work with other tools than our Data Modellers. Basically our tool kit covers the entire range of SQL Server, R, Python, but sometimes also Matlab or SAS. More and more, we are working with cloud-based solutions. Data visualization and dashboards in Qlik, Tableau or Alteryx are usually passed on to other teams.

Data Science Blog: What does your working day as a data scientist look like from after the morning café until the end of the evening?

My role is perhaps best described as the player’s coach. At the beginning of a project, it is primarily about working with the client to understand and develop the project. New ideas and methods have to be developed. During a project, I manage the teams and knowledge transfer; the review and the questioning of the models are my main tasks. In the end I do the final sign-off of the project. Since I often run several projects at different stages at the same time, it is guaranteed never boring.

Data Science Blog: Are good Data Scientists of your experience more likely to be consultant types or introvert nerds?

That depends upon what one is focused. A Data Visualizer or Data Artist reduces the information and visualise it in a great and understandable way. This requires creativity, a good understanding of business and safe handling of the tools.

The Data Analyst is more concerned with the “Slicing and Dicing” of data. The aim is to analyse the past and to recognize relationships. It is important to have good mathematical and statistical abilities in addition to the financial knowledge.

The Data Scientist is the most mathematical type. His job is to recognize deeper connections in the data and to make predictions. This involves the development of complicated models or Machine Learning Algorithms. Without a good mathematical education and programming skills it is unfortunately not possible to understand the risk of potential errors in full depth. The danger of drawing wrong conclusions or interpreting correlations counterfactually is very great. A simple example of this is that, in summer, when the weather is beautiful, more people eat ice cream and go swimming. Therefore, there is a strong correlation between eating ice and the number of drowned people, although eating ice cream does not lead to drowning. The influencing variable is the temperature. To minimise the risk for wrong conclusions I think it is important have worked and studied mathematics, data science, machine learning and statistics in depth – this usually means a PhD in science related subject.

Beyond that, business and industry knowledge is also important for a Data Scientist. His solutions must be relevant to the client and solve their problems or improve their processes. The best AI machine does not give any bank a competitive advantage if it predicts the sale of ice cream based on the weather. This may be 100% correct, but has no relevance for the client.

It is quite similar to other areas (e.g., medicine) too. There are many different areas, but for serious problems it is best to ask a specialist so that you do not draw wrong conclusions.

Data Science Blog: For all students who have soon finished their bachelor’s degree in computer science, mathematics, or economics, what would they advise these young ladies how to become good Data Scientists?

Never stop learning! The market is currently developing incredibly fast and has so many great areas to focus on. You should dive into it with passion, enthusiasm and creativity and have fun with the recognition of patterns and relationships. If you also surround yourself with interesting and inspiring people from whom you can learn more, I predict that you’ll do well.

This interview is also available in German: https://data-science-blog.com/de/blog/2016/11/10/interview-mit-advanced-analytics-kundenverhalten-verstehen/