7 Ways To Advance Your Data Science Knowledge and Expertise

As a data scientist, your knowledge and expertise are what power industries. Businesses in all sectors of the economy now rely on data to inform their processes. As many as 53% of companies have already adopted big data analytics, highlighting the upward trend in data science within the private sector.

Businesses rely on data scientists to stay competitive in this market. But how can you advance your data science knowledge and expertise to bring the most value to your work?

These seven strategies will help you build your resources and improve your opportunities to grow.

1. Recognize the Need for Growth

It may seem disheartening at first to realize that there is no end to the progress you can make in honing your data science skills. There is simply too much to master in just a few years. However, what this really means is that there is no end to the progress and advancement you can make as a data scientist.

Consider the breadth of what there is to know. Skills to master include probability, new programming languages, data visualization, data intuition, and so much more. Recognize the scope of your field to open the door to learning opportunities in data science.

2. Brush Up on the Latest Trends

Your opportunities as a data scientist are largely dependent on how well you can utilize new software and data analytics trends. Modern data analytics relies on artificial intelligence and machine learning processes to drive insights with unprecedented detail. Meanwhile, data communication and storage platforms like blockchain are emerging to supplement data management infrastructures.

An awareness of these modern developments paired with basic general knowledge and qualifications will be key to getting hired as a data scientist in 2021 and beyond. As companies across industries look to pivot to new tech and competitive data strategies, it is more important than ever to keep abreast of the latest data science trends.

3. Enroll in Data Science Bootcamps

Data science is a constantly changing field, driven by technological innovation. At the same time, the breadth of opportunities that exists in a tech field invites career flexibility. Data scientists can make the most of these opportunities for advancement and flexibility by enrolling in bootcamps and training courses designed to fill skills gaps.

These programs cover a range of topics within the field of data science. No matter your level of expertise and education, engaging in supplemental training can help you advance your expertise and bring value-building benefits to your role as a data scientist.

4. Look for Guidance Online

Because of the increasingly virtual nature of all kinds of work and education, opportunities for data science growth may be better sought out online. There are many ways you can go about increasing your data science expertise on a virtual platform. From finding a mentor through social media like LinkedIn to participating in training courses crafted by other data science professionals, you can expand your knowledge base.

First, however, ensure that you have a productive workspace at home that will allow you to learn and grow while staying motivated. This means setting up a home office to accommodate the virtual shift, complete with a comfortable chair and desk set up to avoid neck strain and health problems.

With virtual guidance in a productive environment, you can advance your expertise to secure the value of your position.

5. Expand Your Horizons

Data science is a multifaceted arena. The role of a data scientist typically consists of harnessing and categorizing raw data to draw out useful and predictive insights. Meanwhile, other positions in analytics and IT contribute to more powerful data results.

Customer analytics, for example, is another subset of data science that involves harnessing information to describe and predict customer journeys. This entails focusing on customer demographics and behaviors to assemble more carefully targeted buyer personas, which can then be used to increase customer engagement and conversion rates.

Through broadening your data skills to account for areas like customer analytics, you can advance your professional opportunities.

6. Let Your Passions Inspire You

Every data scientist has a reason they got into their field. Your passions and inspirations can inform new avenues of exploration into the many designations surrounding data science. For example, big data analysts, machine learning specialists, and data visualization experts all play vital roles in modern business.

Finding your niche and specialization can come down to what drove you into data science in the first place. Perhaps you have a talent for creating comprehensive visuals that expertly summarize the point you want to be taken from your graphic. Alternatively, diving deep into the ins and outs of algorithmic functions may be what inspires you most.

Explore your passions and commit to a lifetime of learning and growing.

7. Never Stop Improving

With rapid technological change, data scientists must maintain their awareness of new systems and processes at all times. Innovations in AI, for example, have created a skills gap in the market. Eighty percent of business leaders say that lack of talent is the biggest obstacle in AI implementation.

For data scientists, closing this skills gap can be a simple matter of improving your technological training over time. Learning how machine learning functions, for example, can assist in your application of this tech to increase the value you add to your business.

Never stop improving through new courses and credentials that explore changing technology and how these changes affect the world of data science. With a commitment to lifelong learning, your skills as a data scientist will never go out of vogue.

These seven strategies can help you formulate a plan to expand your expertise into new territory, leading to new opportunities and a lucrative financial future.

Data Security for Data Scientists & Co. – Infographic

Data becomes information, and information becomes knowledge. For this reason, companies are nowadays also evaluated on their data and their data quality. Data is also the raw material for management decisions and artificial intelligence. IT security is therefore very important, and specialized consulting and auditing companies offer services specifically for the security of IT systems.

However, Data Scientists, Data Analysts and Data Engineers rarely work only with open data; far more often they work intensively with customer data. Therefore, every expert in the storage and analysis of data should have at least a basic knowledge of Data Security and work according to certain principles in order to guarantee the security of the data and the legality of the data processing.

There are a number of rules and principles for data security that must be observed. Some of them – in our opinion the most important ones – we from DATANOMIQ have summarized in an infographic for Data Scientists, Data Analysts and Data Engineers. You can download the infographic here: DataSecurity_Infographic

Data Security for Data Scientists, Data Analysts and Data Engineers

Select the Right Career Path Between Software Developer and Data Scientist

In today’s digital age, a software development career is one of the most lucrative. Custom software developers abound, offering all sorts of services to business organizations anywhere in the world. Software developers of all kinds, whether vendors, full-time staff, contract workers, or part-time workers, are important members of the Information Technology community.

There are different career paths to choose from in the world of software development. Two of the most promising are software developer and data scientist. What exactly are these?

Software developers are the creative masterminds behind all kinds of computer programs. Some focus on a specific app or program, while others build giant networks or underlying systems that power other programs. That’s why there are two classifications of software developer: application software developers and systems software developers.

On the other hand, data scientists are a new breed of analytical data experts with the technical skills to resolve complex issues, as well as the curiosity to explore which problems need solving. Data scientists, in any custom software development service, are part trend-spotter, part mathematician, and part computer scientist. And, since they straddle both the IT and business worlds, they’re in high demand and, of course, well paid.

When it comes to the field of custom software development and software development in general, which career is the most promising? Let’s find out. 

Data Science and Software Development, the Differences

Although both fields are extremely technical and share many of the same skills, there are huge differences in how these skills are applied. Thus, to determine which career path to choose, let’s compare them and find the most critical differences.

The Methodologies

Data Science Methodology

There are different places at which a person could enter the data science pipeline. If they are gathering data, then they are probably called a data engineer, and they would be pulling data from different sources, cleaning and processing it, and storing it in a database. This is usually referred to as the ETL (extract, transform, and load) process.
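
As a rough illustration of the extract, transform, and load steps described above, here is a minimal sketch that uses only Python's standard library. The CSV content, the table name `readings`, and the column names are hypothetical and chosen just for this example; a real pipeline would pull from files, APIs, or other databases.

```python
import csv
import io
import sqlite3

# Hypothetical raw data; in practice this would come from files, APIs, etc.
RAW_CSV = """city,temperature_c
Berlin, 21.5
Tokyo,
london,18.0
"""

def extract(text):
    """Extract: read rows from a CSV source into dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))

def transform(rows):
    """Transform: drop rows with missing values and normalize the rest."""
    cleaned = []
    for row in rows:
        value = (row["temperature_c"] or "").strip()
        if not value:
            continue  # skip incomplete records
        cleaned.append((row["city"].strip().title(), float(value)))
    return cleaned

def load(records, connection):
    """Load: store the cleaned records in a database table."""
    connection.execute(
        "CREATE TABLE IF NOT EXISTS readings (city TEXT, temperature_c REAL)"
    )
    connection.executemany("INSERT INTO readings VALUES (?, ?)", records)
    connection.commit()

connection = sqlite3.connect(":memory:")  # in-memory database for the demo
load(transform(extract(RAW_CSV)), connection)
stored = connection.execute(
    "SELECT city, temperature_c FROM readings ORDER BY city"
).fetchall()
print(stored)
```

Keeping the three steps as separate functions makes it easy to swap the source or the target database without touching the cleaning logic.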

If they use data to create models and perform analysis, they’re probably called a ‘data analyst’ or a ‘machine learning engineer’. The critical aspects of this part of the pipeline are making certain that any models built don’t violate their underlying assumptions, and that they drive worthwhile insights.

Methodology in Software Development 

In contrast, the development of software makes use of the SDLC methodology or the software development life cycle. The workflow or cycle is used in developing and maintaining software. The steps are planning, implementing, testing, documenting, deploying, and maintaining. 

Following one of the different SDLC models, in theory, could lead to software that runs at peak efficiency and would boost any future development. 

The Approaches

Data science is a very process-oriented field. Its practitioners consume and analyze sets of data to understand a problem better and come up with a solution. Software development is more about approaching tasks with existing methodologies and frameworks. For example, the Waterfall model is a popular method that requires every phase of the software development life cycle to be completed and reviewed before moving on to the next.

Other frameworks used in development include the V-shaped model, Agile, and Spiral. There is simply no equivalent process for data science, although a lot of data scientists work within one of these approaches as part of a bigger team. Pure software developers have a lot of roles to fill outside data science, from front-end development to DevOps and infrastructure roles.

Moreover, although data analytics pays well, the roles of software developers of all kinds are still higher in demand. Thus, if machine learning isn’t your thing, then you could spend your spare time in developing expertise in your area of interest instead. 

The Tools

The wheelhouse of a data scientist includes data analytics tools, machine learning, data visualization, working with databases, and predictive modeling. If they do plenty of data ingestion and storage, they would probably use MongoDB, Amazon S3, PostgreSQL, or something similar. For building a model, there’s a great chance that they would be working with Scikit-learn or Statsmodels.

Distributed processing of big data often relies on Apache Spark. Software engineers use software design and analysis tools, programming languages, software testing tools, web app tools, and so on. In data science, much depends on what you’re attempting to accomplish. For actually writing code, TextWrangler, Atom, Emacs, Visual Studio Code, and Vim are popular.

Django and Flask (both Python) and Ruby on Rails see plenty of use in the backend web development world. Vue.js has emerged as one of the popular ways of creating lightweight web apps, and AJAX is similarly common for creating asynchronously updating, dynamic web content. Everyone must know how to use a version control system such as Git, often together with a hosting service like GitHub.

The Skills

To become a data scientist, some of the most important things to know include machine learning, programming, data visualization, statistics, and the willingness to learn. Various positions may need more than these skills, but it’s a safe bet to say that these are the bare minimum when you pursue a data science career. 

Often, the skills necessary to be a software developer are a little more intangible. The ability to program in various programming languages is of course required, but you should also be able to work well in development teams, resolve issues, adapt to various scenarios, and be willing to learn. Again, this isn’t an exhaustive list of skills, but these would certainly serve you well if you are interested in this career.

Conclusion

At the end of the day, you should choose a career path that’s based on your strengths and interests. The salaries of data scientists and software developers are, on average at least, about the same. However, before choosing which is better for you, consider experimenting with various projects and interacting with different aspects of the business to determine where your skills and personality best fit, since that is where you’ll grow the most in the future.

Connections Between Data Science & Finance

The world of finance is changing at an unprecedented rate. Data science has completely altered the face of traditional finance management. Though data has long been a critical component of finance, the introduction of big data and artificial intelligence has created new tools that are strengthening the predictive ability of many financial institutions.

These changes have led to a rapid increase in the need for financial professionals with data science skills. Nearly every sector of finance, from the stock market and retirement accounts to credit score calculation, is moving toward greater use of data science and data management. A solid understanding of the interplay between data and finance is a key skill that remains in short supply.

Likewise, they have opened many doors for those that are interested in analyzing their personal finances. More and more people are taking their finances into their own hands and using the data tools available to make the best decisions for them. In today’s world, the sky’s the limit for financial analysis and management!

The Rise of the Financial Analyst

Financial analysts are the professionals who are responsible for the general management of money and investments both in an industrial and personal finance realm. Typically a financial analyst will spend time reviewing and understanding the overall stock portfolio and financial standing of a client including:

  • Stocks
  • Bonds
  • Retirement accounts
  • Financial history
  • Current financial statements and reports
  • Overarching business and industry trends

From there, the analyst will provide a recommendation with data-backed findings to the client on how they should manage their finances going into the future.

As you can imagine, with all of this data to analyze, the need for financial analysts to have a background or understanding of data science has never been higher! Finance jobs requiring skills such as artificial intelligence and big data increased by over 60% in the last year. Though these new jobs are typically rooted in computer science and data analytics, most professionals still need a background in financial management as well.

The unique skills required for a position like this mean there is a huge (and growing) skills gap in the financial sector. Those professionals who are qualified and able to fill the need are seeing substantial pay increases and hundreds of job opportunities across the nation and the globe.

A Credit Score Example

But where does all of this data science and professional financial account management come back to impact the everyday person making financial decisions? Surprisingly, pretty much in every facet of their lives. From things like retirement accounts to faster response times in financial analysis to credit scores — data science in the financial industry is like a cloaked hand pulling the strings in the background.

Take, for example, your credit score. It is one of the single most important numbers in your life, for better or worse. A high credit score can open all sorts of financial doors and get you better interest rates on the things you need loans for. A bad score can limit the amount lenders are willing to lend you and increase the interest rate substantially, meaning you will end up paying far more money in the end.

Your credit score is calculated from several factors. Though we understand the basic outline of what goes into the formula, the finer points are somewhat of a mystery. We know the big factors are:

  • Personal financial history
  • Debt-to-credit ratio
  • Length of credit history
  • Number of new credit hits or applications

All of this data and number crunching can have a real impact on your life, just one example of how data in the financial world is relevant.
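
To make one of these factors concrete: the ratio of debt to available credit (often called credit utilization) is commonly computed as total revolving balances divided by total credit limits. The sketch below shows only that arithmetic. The account figures are made up, and this is not a credit score formula, since the finer points of scoring are not public.

```python
def debt_to_credit_ratio(accounts):
    """Return total balance divided by total credit limit.

    `accounts` is a list of (balance, credit_limit) pairs (hypothetical data).
    """
    total_balance = sum(balance for balance, _ in accounts)
    total_limit = sum(limit for _, limit in accounts)
    if total_limit == 0:
        raise ValueError("total credit limit must be positive")
    return total_balance / total_limit

# Two hypothetical credit cards: (balance, limit)
cards = [(500.0, 2000.0), (250.0, 3000.0)]
ratio = debt_to_credit_ratio(cards)
print(f"{ratio:.0%}")  # 15%
```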

Using Data Science in Personal Finance

Given all this information, you might be thinking to yourself that what you really need is a certificate in data science. Certainly, that will open a number of career doors for you in a multitude of realms, not just the finance industry. Data science is quickly becoming a cornerstone of how most major industries do business.

However, that isn’t necessarily required to get ahead on managing your personal finances. Just a little information about programs such as Excel can get you a long way. Some may even argue that Excel is the original online data management tool as it can be used to do things like:

  • Create schedules
  • Manage budgets
  • Visualize data in charts and graphs
  • Track revenues and expenses
  • Conditionally format information
  • Manage inventory
  • Identify trends in large data sets

There are even several tools and guides out there that will help you to get started!

***

Data analysis and management are here to stay, especially when it comes to the financial industry. The tools are likely to keep growing in importance, and skills in their use will increase in value. Though many professional roles use big data to manage finances, there are also plenty of tools that make it easier than ever to glean insights into your personal finances and make informed financial decisions.

Must-have Skills to Master Data Science

The need to process massive amounts of data is making Data Science one of the most in-demand jobs across diverse industry verticals. Organizations today are actively looking for Data Scientists.

But what does a Data Scientist do?

Data Scientists design data models, create algorithms to extract the data the organization needs, analyze the gathered data, and communicate the resulting insights to business stakeholders.

If you are looking forward to pursuing a career in Data Science, then this blog is for you 🙂

Data Scientists come from many different educational and professional backgrounds, but a few skills are common and essential.

Let’s have a look at all the essential skills required to become a Data Scientist:

  1. Multivariable Calculus & Linear Algebra
  2. Probability & Statistics
  3. Programming Skills (Python & R)
  4. Machine Learning Algorithms
  5. Data Visualization
  6. Data Wrangling
  7. Data Intuition

Let’s dive deeper into all these skills one by one.

 

Multivariable Calculus & Linear Algebra:

Having a solid understanding of math concepts is very helpful for a Data Scientist.

Key Concepts:

  • Matrices
  • Linear Algebra Functions
  • Derivatives and Gradient
  • Relational Algebra

Probability & Statistics:

Probability and Statistics play a major role in Data Science for estimation and prediction purposes.

Key concepts required:

  • Probability Distributions
  • Conditional Probability
  • Bayesian Thinking
  • Descriptive Statistics
  • Random Variables
  • Hypothesis Testing and Regression
  • Maximum Likelihood Estimation
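
Conditional probability and Bayesian thinking from the list above can be illustrated with Bayes' rule, P(A|B) = P(B|A) P(A) / P(B). The following sketch applies it to a toy spam-filter question. All the numbers are invented for illustration.

```python
def bayes_posterior(prior, likelihood, likelihood_given_not):
    """Return P(A | B) from P(A), P(B | A) and P(B | not A)."""
    evidence = likelihood * prior + likelihood_given_not * (1.0 - prior)
    return likelihood * prior / evidence

# Hypothetical numbers: 20% of mail is spam; the word "offer" appears in
# 50% of spam messages and in 5% of non-spam messages.
posterior = bayes_posterior(prior=0.2, likelihood=0.5, likelihood_given_not=0.05)
print(round(posterior, 3))  # 0.714
```

Seeing the word raises the probability of spam from the 20% prior to about 71%, which is the kind of belief update that "Bayesian thinking" refers to.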

Programming Skills (Python & R):

Python :

Start with Python fundamentals using a Jupyter Notebook, which is often installed as part of a distribution that bundles common Python libraries (Anaconda, for example).

Important Python Libraries used:

  • NumPy (For Data Exploration)
  • Pandas (For Data Exploration)
  • Matplotlib (For Data Visualization)

R:

It is a programming language and software environment used for statistical computing and graphics. 

Key Concepts required:

  • R language fundamentals and basic syntax
  • Vectors, Matrices, Factors
  • Data frames
  • Basic Graphics

Machine Learning Algorithms

Machine Learning is an innovative and essential field in the industry. There are quite a few algorithms out there; the major ones are as follows:

  • Linear Regression
  • Logistic Regression
  • Decision Trees
  • Random Forest
  • Naïve Bayes
  • Support Vector Machines
  • Dimensionality Reduction
  • K-means
  • Artificial Neural Networks
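
To give a feel for the first entry in the list, here is a minimal sketch of linear regression with a single feature, fitted by ordinary least squares in plain Python. The toy data are made up; in practice one would typically reach for a library such as Scikit-learn.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept (one feature)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Toy data lying exactly on y = 2x + 1
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]
slope, intercept = fit_line(xs, ys)
print(slope, intercept)  # 2.0 1.0
```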

Data Visualization:

Data visualization is essential when it comes to analyzing massive amounts of information and data.

To make data-driven decisions, data visualization tools and technologies are essential in the world of Data Science.

Data Visualization tools:

  • Tableau
  • Microsoft Power BI
  • ECharts
  • Datawrapper
  • Highcharts

Data Wrangling:

Data wrangling refers to the process of cleaning and refining messy, complex data into a more usable format.

It is considered one of the most crucial parts of working with data.

Important Steps to Data Wrangling:

  1. Discovering
  2. Structuring
  3. Cleaning
  4. Enriching
  5. Validating
  6. Documenting

Tools used:

  • Tabula
  • Google DataPrep
  • Data Wrangler
  • CSVkit

Data Wrangling can be done using Python and R.
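
As a small illustration of the cleaning and validating steps in Python, the sketch below tidies a handful of invented records. The field names and the validity rule are assumptions made for the example, not a fixed recipe.

```python
# Hypothetical messy records, e.g. as collected from a web form
raw_records = [
    {"name": "  Alice ", "age": "34", "country": "de"},
    {"name": "Bob", "age": "n/a", "country": "US"},
    {"name": "alice", "age": "34", "country": "DE"},
]

def clean(record):
    """Cleaning: trim whitespace, unify casing, convert types."""
    age_text = record["age"].strip()
    return {
        "name": record["name"].strip().title(),
        "age": int(age_text) if age_text.isdigit() else None,
        "country": record["country"].strip().upper(),
    }

def is_valid(record):
    """Validating: keep only records with a plausible age."""
    return record["age"] is not None and 0 < record["age"] < 120

cleaned = [clean(r) for r in raw_records]
valid = [r for r in cleaned if is_valid(r)]

# Remove duplicates that only became visible after cleaning
unique = []
for record in valid:
    if record not in unique:
        unique.append(record)

print(unique)
```

Note that the two "Alice" rows only turn out to be duplicates once whitespace and casing have been normalized, which is why deduplication comes last here.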

Data Intuition:

Data Intuition in Data Science is an intuitive understanding of concepts. It’s one of the most significant skills required to become a Data Scientist.

It’s about recognizing patterns where none are observable on the surface.

This is something that you need to develop. It is a skill that will only come with experience.

A Data Scientist should know which Data Science methods to apply to the problem at hand.

Conclusion:

As you can see, all these skills, from programming to algorithmic methods, build on one another to help you gather deeper data insights.

There are a great many courses available online for developing these skills and helping you become a true talent in the data industry.

Sure, this journey isn’t an easy one to follow but it’s not impossible. With sheer determination and consistency, you will be able to cross all the hurdles in your Data Science career path.

Simple RNN

Prerequisites for understanding RNN at a more mathematical level

Writing an article series on recurrent neural networks (RNNs), which I call “A gentle introduction to the tiresome part of understanding RNN,” is hardly a creative or ingenious idea. It is quite an ordinary topic. Still, I am going to write my own articles on it because I have been frustrated by the lack of sufficient explanations of RNNs for slow learners like me.

I think many readers of articles on this website at least know that an RNN is a type of neural network used for AI tasks such as time series prediction, machine translation, and voice recognition. But if you do not understand how RNNs work, especially during back propagation, this blog series is for you.

After reading this article series, I think you will be able to understand RNNs in more mathematical and abstract ways. But in case some readers are allergic to or intolerant of mathematics, I have tried to use as little mathematics as possible.

Ideal prerequisite knowledge:

  • Some understanding of densely connected layers (also called fully connected layers or multilayer perceptrons) and how their forward/back propagation works.
  • Some understanding of the structure of Convolutional Neural Networks.

*In this article “Densely Connected Layers” is written as “DCL,” and “Convolutional Neural Network” as “CNN.”

1, Difficulty of Understanding RNN

I bet part of the difficulty of understanding RNNs comes from the variety of their structures. If you search for “recurrent neural network” on Google Images, you will see what I mean. But that cannot be helped, because RNNs enable a variety of tasks.

Another major difficulty is understanding the back propagation algorithm of RNNs. I think some of you found it hard to understand the chain rule when calculating back propagation of densely connected layers, where you have to make the most of linear algebra. And I have to say that backprop of RNNs, especially LSTM, is a monster of chain rules. I am planning to upload not only a blog post on RNN backprop, but also presentation slides with animations, available through external links, to make it more understandable.

In order to avoid such confusions, I am going to introduce a very simplified type of RNN, which I call a “simple RNN.” The RNN displayed as the head image of this article is a simple RNN.

2, How Neurons are Connected

How to connect neurons and how to activate them is what neural networks are all about. The structures of those neurons are easy to grasp as long as we are talking about DCL or CNN. But when it comes to the structure of RNNs, many study materials avoid showing that RNNs are also connections of neurons, just like DCL and CNN. (*If you are not sure how neurons are connected in a CNN, this link should be helpful. Draw a random digit in the square at the corner.) In fact, an RNN is built the same way, and as long as it is a simple RNN, it is not hard to visualize its structure.

Even though an RNN is also a set of connected neurons, most RNN charts are simplified using black boxes. In the case of a simple RNN, most study materials would display it as the chart below.

But that also cannot be helped, because fancier RNNs have more complicated connections of neurons. At that point there is no longer any advantage in displaying an RNN as connections of neurons, and you need to understand RNNs in a more abstract way, as you see in most textbooks.

I am going to explain details of simple RNN in the next article of this series.

3, Neural Networks as Mappings

If you still think that neural networks are something like magical spider webs or models of brain tissues, forget that. They are just ordinary mappings.

If you have been allergic to mathematics in your life, you might never have heard the word “mapping.” If so, please at least keep in mind that the equation y=f(x), which most people have seen in compulsory education, is an example of a mapping. If you give it a value x, you get a value y corresponding to that x.

But in the case of deep learning, x is a vector or a tensor, and it is denoted by $\boldsymbol{x}$. If you have never studied linear algebra, imagine that a vector is a column of Excel data (only one column), a matrix is a sheet of Excel data (with some rows and columns), and a tensor is a stack of Excel sheets (each sheet does not necessarily contain only one column).

CNNs are mainly used for image processing, so their inputs are usually image data. Image data are in many cases (3, height, width) tensors, because an image usually has red, green, and blue channels, and the image in each channel can be expressed as a height*width matrix (the “height” and the “width” are numbers of pixels, so they are discrete numbers).
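
The Excel analogy and the (3, height, width) image tensor can be made concrete with nested Python lists. This is only a sketch: the helper `shape` is written for this example and assumes regular, non-empty nesting, and the tiny 4*5 "image" is made up.

```python
def shape(data):
    """Return the shape of a regular, non-empty nested list as a tuple."""
    dims = []
    while isinstance(data, list):
        dims.append(len(data))
        data = data[0]
    return tuple(dims)

vector = [1.0, 2.0, 3.0]                        # one column of numbers
matrix = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]   # one sheet: 3 rows, 2 columns
tensor = [matrix, matrix]                       # two sheets stacked together

# A tiny all-black "image": 3 color channels, each a height*width matrix
height, width = 4, 5
image = [[[0.0] * width for _ in range(height)] for _ in range(3)]

print(shape(vector), shape(matrix), shape(tensor), shape(image))
# (3,) (3, 2) (2, 3, 2) (3, 4, 5)
```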

The convolutional part of CNN (which I call “feature extraction part”) maps the tensors to a vector, and the last part is usually DCL, which works as classifier/regressor. At the end of the feature extraction part, you get a vector. I call it a “semantic vector” because the vector has information of “meaning” of the input image. In this link you can see maps of pictures plotted depending on the semantic vector. You can see that even if the pictures are not necessarily close pixelwise, they are close in terms of the “meanings” of the images.

In the example of a dog/cat classifier introduced by François Chollet, the developer of Keras, the CNN maps (3, 150, 150) tensors to 2-dimensional vectors, (1, 0) or (0, 1) for (dog, cat).

Wrapping up the points above, at least you should keep two points in mind: first, DCL is a classifier or a regressor, and CNN is a feature extractor used for image processing. And another important thing is, feature extraction parts of CNNs map images to vectors which are more related to the “meaning” of the image.

Importantly, I would like you to understand RNN this way. An RNN is also just a mapping.

*I recommend you to at least take a look at the beautiful pictures in this link. These pictures give you some insight into how CNN perceive images.

4, Problems of DCL and CNN, and the Need for RNN

Looking at an example RNN task should be helpful here. Machine translation is probably the most famous application of RNNs, and it is also a good example for showing why DCL and CNN are not suitable for some tasks. Its algorithms are out of the scope of this article series, but it gives good insight into some features of RNNs. I prepared three sentences in German, English, and Japanese, which have the same meaning. Assume that each sentence is divided into some parts as shown below and that each vector corresponds to one part. In machine translation we want to convert one set of vectors into another set of vectors.

Then let’s see why DCL and CNN are not suitable for such a task.

  • The input size is fixed: In the case of the dog/cat classifier I have mentioned, even though the sizes of the input images vary, they are first molded into (3, 150, 150) tensors. But in machine translation, the length of the input is usually supposed to be flexible.
  • The order of inputs does not matter: In the case of the dog/cat classifier in the last section, it makes no difference whether the inputs arrive as “cat,” “cat,” “dog” or as “dog,” “cat,” “cat.” And in the case of DCL, the network is symmetric, so even if you shuffle the inputs, as long as you shuffle all of the input data in the same way, the DCL gives the same outcome. And if you have learned at least one foreign language, it is easy to imagine that the order of vectors in sequence data matters in machine translation.
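
The shuffling remark about DCL can be checked numerically. In the sketch below (plain Python, with made-up weights and no activation function), permuting the input components and permuting the weight columns in the same way leaves the output of a dense layer unchanged. This is one way to read the statement that a DCL has no built-in notion of input order.

```python
def dense(weights, bias, inputs):
    """One densely connected layer without activation: W x + b."""
    return [
        sum(w * x for w, x in zip(row, inputs)) + b
        for row, b in zip(weights, bias)
    ]

def permute(values, order):
    """Reorder a list according to a list of indices."""
    return [values[i] for i in order]

weights = [[0.5, -1.0, 2.0], [1.5, 0.0, -0.5]]  # hypothetical 2x3 weights
bias = [0.1, -0.2]
inputs = [1.0, 2.0, 3.0]
order = [2, 0, 1]  # an arbitrary shuffle of the three input positions

original = dense(weights, bias, inputs)
shuffled = dense(
    [permute(row, order) for row in weights],  # shuffle weight columns...
    bias,
    permute(inputs, order),                    # ...and inputs the same way
)
print(original, shuffled)
```

Both results agree, so a network trained on consistently shuffled inputs can represent exactly the same function as one trained on the original order.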

*It is said that English has a phrase structure grammar, while Japanese has a dependency grammar. In English, the order of words is important, but in Japanese, as long as the particles and conjugations are correct, word order is very flexible. In my impression, German grammar is in between. As long as you put the verb in the second position and the cases of the words are correct, the order is also relatively flexible.

5, Sequence Data

We can say DCL and CNN are not useful when you want to process sequence data. Sequence data are a type of data which are lists of vectors. And importantly, the orders of the vectors matter. The number of vectors in sequence data is usually called time steps. A simple example of sequence data is meteorological data measured at a spot every ten minutes, for instance temperature, air pressure, wind velocity, humidity. In this case the data is recorded as 4-dimensional vector every ten minutes.

But this “time step” does not necessarily mean “time.” In the case of natural language processing (including machine translation), which I mentioned in the last section, the numberings of the vectors denoting the parts of a sentence are the “time steps.”

And RNNs are mappings from one piece of sequence data to another.

*At least I found a paper on the RNN’s capability of universal approximation on many-to-one RNN task. But I have not found any papers on universal approximation of many-to-many RNN tasks. Please let me know if you find any clue on whether such approximation is possible. I am desperate to know that. 

6, Types of RNN Tasks

RNN tasks can be classified into some types depending on the lengths of input/output sequences (the “length” means the number of time steps of the input/output sequence data).

If you want to predict the temperature in 24 hours, based on several time series data points in the last 96 hours, the task is many-to-one. If you sample data every ten minutes, the input size is 96*6=576 (the input data is a list of 576 vectors), and the output size is 1 (which is a value of temperature). Another example of a many-to-one task is sentiment classification. If you want to judge whether a post on a social networking service (SNS) is positive or negative, the input size is very flexible (the length of the post varies). But the output size is one: either (1, 0) or (0, 1), denoting (positive, negative).
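A many-to-one task can be sketched with a toy recurrent cell that reads a whole sequence and emits a single value at the end. This is only an illustration with a scalar hidden state; the function name and the weights `w_x`, `w_h`, `w_out` are arbitrary assumptions and nothing here is trained.

```python
import math

def rnn_many_to_one(inputs, w_x=0.8, w_h=0.5, w_out=1.0):
    """Toy scalar RNN: consume every time step, return one output at the end."""
    h = 0.0  # initial hidden state
    for x in inputs:
        # The new hidden state depends on the current input and the previous state.
        h = math.tanh(w_x * x + w_h * h)
    return w_out * h

# Inputs of different lengths are both mapped to a single number.
short_result = rnn_many_to_one([1.0, 0.0, -1.0])
long_result = rnn_many_to_one([1.0, 0.0, -1.0, 0.5, 0.2])

# The same values in reverse order give a different result.
reversed_result = rnn_many_to_one([-1.0, 0.0, 1.0])
```

Unlike the densely connected layer, the loop accepts inputs of any length, and the result changes when the order of the inputs changes.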

*The charts in this section are simplified models of the RNN used for each task. Please keep in mind that they are not 100% correct, but I tried to make them as exact as possible compared to those in other study materials.

Music/text generation can be one-to-many tasks. If you give the first sound/word you can generate a phrase.

Next, let’s look at many-to-many tasks. Machine translation and voice recognition are likely to be major examples of many-to-many tasks, but here named entity recognition seems to be a proper choice. Named entity recognition is the task of finding proper nouns in a sentence. For example, given the two sentences “He said, ‘Teddy bears on sale!’” and “He said, ‘Teddy Roosevelt was a great president!’”, judging whether “Teddy” is a proper noun or a common noun is named entity recognition.

Machine translation and voice recognition, which are more popular, are also many-to-many tasks, but they use more sophisticated models. In case of machine translation, the inputs are sentences in the original language, and the outputs are sentences in another language. When it comes to voice recognition, the input is data of air pressure at several time steps, and the output is the recognized word or sentence. Again, these are out of the scope of this article but I would like to introduce the models briefly.

Machine translation uses a type of RNN called a sequence-to-sequence model (often called a seq2seq model). This model is also very important for other natural language processing tasks in general, such as text summarization. A seq2seq model is divided into an encoder part and a decoder part. The encoder gives out a hidden state vector, which is used as the input of the decoder part. The decoder part then generates text, using the output of the last time step as the input of the next time step.
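The control flow of an encoder and a decoder can be sketched as follows. This is a toy with scalar states and arbitrary, untrained weights; the names `encode` and `decode`, the start input of 0.0, and the fixed number of decoding steps are all assumptions for illustration, not how a real translation model is built.

```python
import math

def encode(inputs, w_x=0.8, w_h=0.5):
    """Encoder: read the whole input sequence, return the final hidden state."""
    h = 0.0
    for x in inputs:
        h = math.tanh(w_x * x + w_h * h)
    return h

def decode(h, steps, w_y=0.9, w_h=0.5):
    """Decoder: start from the encoder's hidden state and generate step by step."""
    outputs = []
    y = 0.0  # assumed start-of-sequence input
    for _ in range(steps):
        h = math.tanh(w_y * y + w_h * h)
        y = h  # the output of this step becomes the input of the next step
        outputs.append(y)
    return outputs

hidden = encode([1.0, 0.5, -0.3])     # input sequence with 3 time steps
generated = decode(hidden, steps=4)   # output sequence with 4 time steps
```

The input and output lengths are decoupled: the only thing passed from the encoder to the decoder is the hidden state.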

Voice recognition is also a famous application of RNN, but it also needs a special type of RNN.

*To be honest, I don’t know what the state-of-the-art voice recognition algorithm is. The example in this article is a combination of RNN and a collapsing function made using Connectionist Temporal Classification (CTC). In this model, the output of the RNN is much longer than the recorded words or sentences, so a collapsing function reduces it to an output of normal length.
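The usual CTC collapsing rule is to merge consecutive repeated symbols and then drop the blank symbol. A minimal sketch of that rule follows; the blank character "_" and the function name `ctc_collapse` are assumptions for illustration, and the frame string is made up rather than taken from a real model.

```python
from itertools import groupby

BLANK = "_"  # assumed blank symbol

def ctc_collapse(frames):
    """Merge runs of repeated symbols, then remove blanks."""
    merged = [symbol for symbol, _ in groupby(frames)]
    return "".join(s for s in merged if s != BLANK)

# Twelve per-frame outputs collapse to a five-letter word.
word = ctc_collapse("hh_e_ll_lloo")
```

The blank between the two runs of "l" is what keeps the double letter from being merged into one.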

You might have noticed that the RNNs in the charts above are connected in both directions. Depending on the RNN task, you need such bidirectional RNNs. I think it is also easy to imagine that such networks are necessary. Again, machine translation is a good example.

And interestingly, image captioning, which enables a computer to describe a picture, is a one-to-many task. As the output is a sentence, it is easy to imagine that the output is “many.” If it is a one-to-many task, the input is supposed to be a vector.

Where does the input come from? I told you that I was obsessed with the beauty of the last vector of the feature extraction part of CNN. Surprisingly, the “beautiful” vector, which I call a “semantic vector,” is the input of the image captioning task (after some transformations, depending on the network models).

I think this article includes the major things you need to know as prerequisites when you want to understand RNN at a more mathematical level. In the next article, I would like to explain the structure of a simple RNN, and how its forward propagation works.

* I make study materials on machine learning, sponsored by DATANOMIQ. I do my best to make my content as straightforward but as precise as possible. I include all of my reference sources. If you notice any mistakes in my materials, please let me know (email: yasuto.tamura@datanomiq.de). And if you have any advice for making my materials more understandable to learners, I would appreciate hearing it.

As Businesses Struggle With ML, Automation Offers a Solution

In recent years, machine learning technology and the business solutions it enables have developed into a big business in and of themselves. According to the industry analysts at IDC, spending on ML and AI technology is set to grow to almost $98 billion per year by 2023. In practical terms, that figure represents a business environment where ML technology has become a key priority for companies of every kind.

That doesn’t mean that the path to adopting ML technology is easy for businesses. Far from it. In fact, survey data seems to indicate that businesses are still struggling to get their machine learning efforts up and running. According to one such survey, it currently takes the average business as many as 90 days to deploy a single machine learning model. For 20% of businesses, that number is even higher.

From the data, it seems clear that something is missing in the methodologies that most companies rely on to make meaningful use of machine learning in their business workflows. A closer look at the situation reveals that the vast majority of data workers (analysts, data scientists, etc.) spend an inordinate amount of time on infrastructure work – and not on creating and refining machine learning models.

Streamlining the ML Adoption Process

To fix that problem, businesses need to turn to another growing area of technology: automation. By leveraging the latest in automation technology, it’s now possible to build an automated machine learning pipeline (AutoML pipeline) that cuts down on the repetitive tasks that slow down ML deployments and lets data workers get back to the work they were hired to do. With the right customized solution in place, a business’s ML team can:

  • Reduce the time spent on data collection, cleaning, and ingestion
  • Minimize human errors in the development of ML models
  • Decentralize the ML development process to create an ML-as-a-service model with increased accessibility for all business stakeholders

In short, an AutoML pipeline turns the high-effort functions of the ML development process into quick, self-adjusting steps handled exclusively by machines. In some use cases, an AutoML pipeline can even allow non-technical stakeholders to self-create ML solutions tailored to specific business use cases with no expert help required. In that way, it can cut ML costs, shorten deployment time, and allow data scientists to focus on tackling more complex modelling work to develop custom ML solutions that are still outside the scope of available automation techniques.

The Parts of an AutoML Pipeline

Although the frameworks and tools used to create an AutoML pipeline can vary, they all contain elements that conform to the following areas:

  • Data Preprocessing – Taking available business data from a variety of sources, cleaning it, standardizing it, and conducting missing value imputation
  • Feature Engineering – Identifying features in the raw data set to create hypotheses for the model to base predictions on
  • Model Selection – Choosing the right ML approach or hyperparameters to produce the desired predictions
  • Tuning Hyperparameters – Determining which hyperparameters help the model achieve optimal performance
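Two of these stages, data preprocessing and hyperparameter tuning, can be sketched in a few lines. This is a deliberately tiny illustration with invented data and a one-parameter threshold model; the function names are hypothetical, feature engineering and model selection are left out, and a real pipeline would score candidates on held-out data rather than on the training set.

```python
from statistics import mean

def impute(values):
    """Data preprocessing: replace missing values (None) with the column mean."""
    known = [v for v in values if v is not None]
    fill = mean(known)
    return [fill if v is None else v for v in values]

def accuracy(threshold, xs, ys):
    """Score a one-parameter model that predicts 1 when x exceeds the threshold."""
    preds = [1 if x > threshold else 0 for x in xs]
    return sum(p == y for p, y in zip(preds, ys)) / len(ys)

def tune(xs, ys, grid):
    """Hyperparameter tuning: pick the grid value with the best accuracy."""
    return max(grid, key=lambda t: accuracy(t, xs, ys))

raw = [1.0, None, 2.0, 8.0, 9.0, 10.0]   # hypothetical feature with a gap
labels = [0, 0, 0, 1, 1, 1]              # hypothetical targets

xs = impute(raw)
best_threshold = tune(xs, labels, grid=[0.0, 2.5, 5.0, 7.5])
```

Each function is a step that needs no manual intervention once the data arrives, which is the property an AutoML pipeline relies on.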

As anyone familiar with ML development can tell you, the steps in the above process tend to represent the majority of the labour and time-intensive work that goes into creating a model that’s ready for real-world business use. It is also in those steps where the lion’s share of business ML budgets gets consumed, and where most of the typical delays occur.

The Limitations and Considerations for Using AutoML

Given the scope of the work that can now become part of an AutoML pipeline, it’s tempting to imagine it as a panacea – something that will allow a business to reduce its reliance on data scientists going forward. Right now, though, the technology can’t do that. At this stage, AutoML technology is still best used as a tool to augment the productivity of business data teams, not to supplant them altogether.

To that end, there are some considerations that businesses using AutoML will need to keep in mind to make sure they get reliable, repeatable, and value-generating results, including:

  • Transparency – Businesses must establish proper vetting procedures to make sure they understand the models created by their AutoML pipeline, so they can explain why it’s making the choices or predictions it’s making. In some industries, such as in medicine or finance, this could even fall under relevant regulatory requirements.
  • Extensibility – Making sure the AutoML framework may be expanded and modified to suit changing business needs or to tackle new challenges as they arise.
  • Monitoring and Maintenance – Since today’s AutoML technology isn’t a set-it-and-forget-it proposition, it’s important to establish processes for the monitoring and maintenance of the deployment so it can continue to produce useful and reliable ML models.

The Bottom Line

As it stands today, the convergence of automation and machine learning holds the promise of delivering ML models at scale for businesses, which would greatly speed up the adoption of the technology and lower barriers to entry for those who have yet to embrace it. On the whole, that’s great news both for the businesses that will benefit from increased access to ML technology, as well as for the legions of data professionals tasked with making it all work.

It’s important to note, of course, that complete end-to-end ML automation with no human intervention is still a long way off. While businesses should absolutely explore building an automated machine learning pipeline to speed up development time in their data operations, they shouldn’t lose sight of the fact that they still need plenty of high-skilled data scientists and analysts on their teams. It’s those specialists that can make appropriate and productive use of the technology. Without them, an AutoML pipeline would accomplish little more than telling the business what it wants to hear.

The good news is that the AutoML tools that exist right now are sufficient to alleviate many of the real-world problems businesses face in their road to ML adoption. As they become more commonplace, there’s little doubt that the lead time to deploy machine learning models is going to shrink correspondingly – and that businesses will enjoy higher ROI and enhanced outcomes as a result.

Six properties of modern Business Intelligence

Regardless of the industry in which you operate, you need information systems that evaluate your business data in order to provide you with a basis for decision-making. These systems are commonly referred to as business intelligence (BI). In practice, most BI systems suffer from deficiencies that can be eliminated. In addition, modern BI can partially automate decisions and enable comprehensive analyses with a high degree of flexibility in use.


Read this article in German:
“Sechs Eigenschaften einer modernen Business Intelligence“


Let us discuss the six characteristics that distinguish modern business intelligence. They involve taking technical details into account, but always in the context of a larger vision for your own company’s BI:

1. Uniform database of high quality

Every managing director certainly knows the situation in which managers do not agree on what costs and revenues actually arise in detail and what the margins per category look like. And if they do agree, this information is often only available months too late.

Every company has to make hundreds or even thousands of decisions at the operational level every day. With good information these decisions can be made on a much sounder basis, which increases sales and saves costs. However, there are many source systems in the company’s internal IT landscape as well as external data sources. Gathering and consolidating this information often takes up entire groups of employees and offers plenty of room for human error.

What is needed is a system that provides at least the most relevant data for business management at the right time and in good quality, in a trusted data zone that serves as a single point of truth (SPOT). SPOT is the core of modern business intelligence.

In addition, the BI system may also make other data available that can be useful for qualified analysts and data scientists. The particularly trustworthy zone, however, is the one through which all decision-makers across the company can synchronize.

2. Flexible use by different stakeholders

Even if all employees across the company should be able to access central, trustworthy data, a clever architecture does not prevent each department from receiving its own views of this data. Many BI systems fail due to a lack of company-wide acceptance because certain departments or technically defined employee groups are largely excluded from BI.

Modern BI systems enable views and the necessary data integration for all stakeholders in the company who rely on information and benefit equally from the SPOT approach.

3. Efficient ways to expand (time to market)

The core users of a BI system are particularly dissatisfied when the expansion or partial redesign of the information system requires too much patience. Historically grown, incorrectly designed and not particularly adaptable BI systems often keep a whole team of IT staff busy with tickets for change requests.

Good BI is a service for stakeholders with a short time to market. The correct design, selection of software and the implementation of data flows / models ensures significantly shorter development and implementation times for improvements and new features.

Furthermore, it is not only the technology that is decisive, but also the choice of organizational form, including the design of roles and responsibilities – from the technical system connection to data preparation, pre-analysis and support for the end users.

4. Integrated skills for Data Science and AI

Business intelligence and data science are often viewed and managed separately from each other. Firstly, because data scientists are often unmotivated to work with – from their point of view – boring data models and prepared data. Secondly, because BI is usually already established as a traditional system in the company, despite the many problems that BI still has today.

Data science, often referred to as advanced analytics, deals with deep immersion in data using exploratory statistics and methods of data mining (unsupervised machine learning) as well as predictive analytics (supervised machine learning). Deep learning is a sub-area of machine learning and is used for data mining or predictive analytics. Machine learning is a sub-area of artificial intelligence (AI).

In the future, BI and data science or AI will continue to grow together, because at the latest after going live, the prediction models flow back into business intelligence. BI will probably develop into ABI (Artificial Business Intelligence). However, many companies are already using data mining and predictive analytics in the company, using uniform or different platforms with or without BI integration.

Modern BI systems also offer data scientists a platform to access high-quality and more granular raw data.

5. Sufficiently high performance

Most readers of these six points will probably have had experience with slow BI before. In many classic BI systems it takes several minutes to load a report that is used daily. If loading a dashboard can be combined with a little coffee break, it may still be acceptable for certain reports from time to time. With frequent use, however, long loading times and unreliable reports are no longer acceptable.

One reason for poor performance is the hardware, which can be almost linearly scaled to higher data volumes and more analysis complexity using cloud systems. The use of cloud also enables the modular separation of storage and computing power from data and applications and is therefore generally recommended, but not necessarily the right choice for all companies.

In fact, performance does not depend on the hardware alone: the right choice of software and the right design for data models and data flows also play a crucial role. While hardware can be changed or upgraded relatively easily, changing the architecture requires much more effort and BI competence. Unsuitable data models or data flows will certainly bring even the latest hardware in its maximum configuration to its knees.

6. Cost-effective use and conclusion

Providers of professional cloud systems that can be used for BI, such as Microsoft Azure, Amazon Web Services and Google Cloud, offer total cost calculators. With these calculators – with instruction from an experienced BI expert – not only can costs for the use of hardware be estimated, but ideas for cost optimization can also be calculated. Nevertheless, the cloud is still not the right solution for every company, and classic calculations for on-premise solutions remain necessary.

Incidentally, cost efficiency can also be increased with a good selection of the right software. Proprietary solutions are tied to different license models and can only be compared using application scenarios. Apart from that, there are also good open source solutions that can be used largely free of charge and for many applications without compromises.

However, it is wrong to assess the cost of a BI only according to its hardware and software costs. A significant part of cost efficiency is complementary to the aspects for the performance of the BI system, because suboptimal architectures work wastefully and require more expensive hardware than neatly coordinated architectures. The production of the central data supply in adequate quality can save many unnecessary processes of data preparation and many flexible analysis options also make redundant systems unnecessary and lead to indirect savings.

In any case, for companies with many operational processes, having BI is always cheaper than having no BI. And a closer look with BI expertise often reveals further potential for cost efficiency.

Interview – There is no stand-alone strategy for AI, it must be part of the company-wide strategy

Ronny Fehling is Partner and Associate Director for Artificial Intelligence at the Boston Consulting Group GAMMA. With more than 20 years of continually progressive experience in leading business and technology innovation, spearheading digital transformation, and aligning the corporate strategy with Artificial Intelligence, he helps industry-leading organizations to grow their top-line and kick-start their digital transformation.

Ronny Fehling is also a speaker at Predictive Analytics World for Industry 4.0 in May 2020.

Data Science Blog: Mr. Fehling, you are consulting companies and business leaders about AI and how to get started with it. AI as a definition is often misleading. How do you define AI?

This is a good question. I think there are two ways to answer this:

From a technical definition, I often see expressions about “simulation of human intelligence” and “acting like a human”. I find these terms more often misleading than helpful. I studied AI back when it wasn’t yet “cool” and we were still in the middle of the AI winter. And yes, we have much more compute power and access to data, but we also think about data in a very different way. For me, I typically distinguish between machine learning, which uses algorithms and statistical methods to identify patterns in data, and AI, which for me attempts to interpret the data in a given context. So machine learning can help me identify and analyze frequency patterns in text and even predict the next word I will type based on my history. AI will help me identify ‘what’ I’m writing about – even if I don’t explicitly name it. It can tell me that when I’m asking “I’m looking for a place to stay” that I might want to see a list of hotels around me. In other words: machine learning can detect correlations and similar patterns, AI uses machine learning to generate insights.

I always wondered why top executives are so frequently asking about the definition of AI because at first it seemed to me not as relevant to the discussion on how to align AI with their corporate strategy. However, I started to realize that their question is ultimately about “What is AI and what can it do for me?”.

For me, AI can do three things really well, which humans cannot really do and previous approaches couldn’t cope with:

  1. Finding similar patterns in historical data. Imagine 20 years of data like maintenance or repair documents of a manufacturing plant. Although they describe work done on a multitude of products due to a multitude of possible problems, AI can use this to look for a very similar situation based on a current problem description. This can be used to identify a common root cause as well as a common solution approach, saving valuable time for the operation.
  2. Finding correlations across time or processes. This is often used in predictive maintenance use cases. Here, the AI tries to see what similar events typically happen some time before a failure happens. This way, it can alert the operator much earlier about an impending failure, say due to a change in the vibration pattern of the machine.
  3. Finding an optimal solution path based on many constraints. There are many problems in the business world where choosing the optimal path based on complex situations is critical. Let’s say that a sudden severe weather warning at an airport forces an airline to change its scheduling because of reduced airport capacity. Delays for some aircraft can cause disruptions because passengers or personnel are no longer able to connect. Knowing which aircraft to delay, which to cancel, and which to switch while causing the minimal amount of disruption to passengers, crew, maintenance and ground-crew is something AI can help with.

The key now is to link these fundamental capabilities with the business context of the company and how it can ultimately help transform.

Data Science Blog: Companies are still starting with their own company-wide data strategy. And now they are talking about AI strategies. Is that something which should be handled separately?

In my experience – both based on having seen the implementations of several corporate data strategies as well as my upbringing at Oracle – the data strategy and AI strategy are co-dependent and cannot be separated. Very often I hear from clients that they think they first need to get their data in order before doing an AI project. And yes, without good data access, AI cannot really work. In fact, most of the time spent on AI is spent on processing, cleansing, understanding and contextualizing the data. However, you cannot really know what data will be needed in which form without knowing what you want to use it for. This is why strategies that handle data and AI separately mostly fail and generate huge costs.

Data Science Blog: What are the important steps for developing a good data strategy? Is there something like a general approach?

In my eyes, the AI strategy defines the data strategy step by step as more use cases are implemented. Rather than focusing too quickly on how to get all corporate data into a data lake, it will be much more important to start creating a use-case, technology and data governance. This governance has to be established once the AI strategy is starting to mature, to enable the scale-up and productization. The starting point is to find the (very few) use-cases that can serve as lighthouse projects to demonstrate (1) value impact, (2) a way to go from MVP to Pilot, and (3) how to address the data challenge. This will then more naturally identify the elements of governance, data access and technology that are required.

Data Science Blog: What are the most common questions from business leaders to you regarding AI? Why do they hesitate to get started?

By far the most common question I get is: how do I get started? The hesitations often come from multiple sources like: “We don’t have the talent in house to do AI”, “Our data is not good enough”, “We don’t know which use-case to start with”, “It’s not easy for us to embrace agile and failure culture because our products are mission critical”, “We don’t know how much value this can bring us”.

Data Science Blog: Most managers prefer to start small and with lower risk. They seem to postpone bigger ideas to a later stage, at least some milestones should be reached. Is that a good idea or should they think bigger?

AI is often associated (rightfully so) with a new way of working – agile and embracing failures. Similarly, there is also the perception of significant cost to starting with AI (talent, technology, data). These perceptions often lead managers to want to start with several use-cases of smaller ambition where failure isn’t that grave. Once these have proven themselves somehow, they would then move on to bigger projects. The problem with this strategy is, on the one hand, that you fragment your few precious AI resources across too many projects and, on the other hand, that you cannot really demonstrate an impact since the projects weren’t chosen based on their impact potential.

The AI pioneers typically were successful by “thinking big, starting small and scaling fast”. You start by assessing the value potential of a use-case, for example: my current OEE (Overall Equipment Efficiency) is at 65%. There is an addressable loss of 25% which would grow my top line by $X. With the help of AI experts, you then create a hypothesis of how you think you can reduce that loss. This might be by choosing one specific piece of equipment and 50% of the addressable loss. This is now the measure against which you define your failure or non-failure criteria. Once you have proven an MVP that can solve this loss, you scale up by piloting it in a real-life setting and then scaling it to all the equipment. At every step of this process, you have a failure criterion that is measured by the impact value.
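The arithmetic behind such a failure criterion can be written down as a short sketch. It assumes that the 25% addressable loss is meant in percentage points of OEE and that the MVP targets 50% of it on one piece of equipment; both readings, and all variable names, are assumptions made for illustration.

```python
current_oee = 0.65        # current OEE from the example
addressable_loss = 0.25   # assumed to be percentage points of OEE
mvp_share = 0.50          # share of the addressable loss the MVP targets

target_gain = addressable_loss * mvp_share   # 12.5 percentage points
target_oee = current_oee + target_gain       # 77.5% OEE

def mvp_succeeded(measured_oee):
    """Failure criterion: the MVP passes only if it reaches the target OEE."""
    return measured_oee >= target_oee
```

Under these assumptions a pilot that lifts the chosen equipment to 80% OEE would pass, while one that reaches only 70% would count as a failure against the stated hypothesis.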


Predictive Analytics World for Industry 4.0, the premier machine learning conference for Industry 4.0, takes place as a virtual edition on 11-12 May 2020. This year it runs alongside Deep Learning World and Predictive Analytics World for Healthcare.

Interview – Predictive Maintenance and how it can unleash cost savings

Interview with Dr. Kai Goebel, Principal Scientist at PARC, a Xerox Company, about Predictive Maintenance and how it can unleash cost savings.

Dr. Kai Goebel is principal scientist at PARC with more than two decades of experience in corporate and government research organizations. He is responsible for leading applied research on state awareness, prognostics and decision-making using data analytics, AI, hybrid methods and physics-based methods. He has also fielded numerous applications for Predictive Maintenance at General Electric, NASA, and PARC for uses as diverse as rocket launchpads, jet engines, and chemical plants.

Data Science Blog: Mr. Goebel, predictive maintenance is not just a hype since industrial companies are already trying to establish this use case of predictive analytics. What benefits do they really expect from it?

Predictive Maintenance is a good example for how value can be realized from analytics. The result of the analytics drives decisions about when to schedule maintenance in advance of an event that might cause unexpected shutdown of the process line. This is in contrast to an uninformed process where the decision is mostly reactive, that is, maintenance is scheduled because equipment has already failed. It is also in contrast to a time-based maintenance schedule. The benefits of Predictive Maintenance are immediately clear: one can avoid unexpected downtime, which can lead to substantial production loss. One can manage inventory better since lead times for equipment replacement can be managed well. One can also manage safety better since equipment health is understood and safety averse situations can potentially be avoided. Finally, maintenance operations will be inherently more efficient as they shift significant time from inspection to mitigation of.

Data Science Blog: What are the most critical success factors for implementing predictive maintenance?

Critical for success is to get the trust of the operator. To that end, it is imperative to understand the limitations of the analytics approach and to not make false performance promises. Often, success factors for implementation hinge on understanding the underlying process and the fault modes reasonably well. It is important to be able to recognize the difference between operational changes and abnormal conditions. It is equally important to recognize rare events reliably while keeping false positives in check.
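The tension between recognizing rare events and keeping false positives in check can be made concrete with a small calculation. The counts below are invented for illustration and the function name is hypothetical; the formulas are the standard definitions of recall, precision and false positive rate.

```python
def alarm_metrics(tp, fp, fn, tn):
    """Return recall, precision and false positive rate from confusion counts."""
    recall = tp / (tp + fn)                 # share of real failures that were caught
    precision = tp / (tp + fp)              # share of alarms that were real failures
    false_positive_rate = fp / (fp + tn)    # share of healthy cases that raised an alarm
    return recall, precision, false_positive_rate

# Hypothetical year of monitoring: 10 real failures among 10,000 observations.
recall, precision, fpr = alarm_metrics(tp=8, fp=40, fn=2, tn=9950)
```

With these made-up counts the false positive rate is below half a percent, yet only one alarm in six is a real failure, which is the kind of result that can cost an operator's trust.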

Data Science Blog: What kind of algorithm does predictive maintenance work with? Do you differentiate between approaches based on classical machine learning and those based on deep learning?

Well, there is no one kind of algorithm that works for Predictive Maintenance everywhere. Instead, one should look at the plurality of all algorithms as tools in a toolbox. Then analyze the problem – how many examples for run-to-failure trajectories are there; what is the desired lead time to report on a problem; what is the acceptable false positive/false negative rate; what are the different fault modes; etc. – and use the right kind of tool to do the job. Just because a particular approach (like the one you mentioned in your question) is all the hype right now does not mean it is the right tool for the problem. Sometimes, approaches from what you call “classical machine learning” actually work better. In fact, one should consider approaches even outside the machine learning domain, either as a stand-alone approach or in a hybrid configuration. One may also have to invent new methods, for example to perform online learning of the dynamic changes that a system undergoes through its (long) life. In the end, a customer does not care about what approach one is using, only if it solves the problem.

Data Science Blog: There are several providers for predictive analytics software. Is it all about software tools? What makes the difference for having success?

Frequently, industrial partners lament that they have to spend a lot of effort in teaching a new software provider about the underlying industrial processes as well as the equipment and their fault modes. Others are tired of false promises that any kind of data (as long as you have massive amounts of it) can produce any kind of performance. If one does not physically sense a certain modality, no algorithmic magic can take place. In other words, it is not just all about the software. The difference for having success is understanding that there is no cookie-cutter approach. And that realization means that one may have to roll up one’s sleeves and install new instrumentation.

Data Science Blog: What are the coming trends? What do you think will be the main topic in 2020 and 2021?

Predictive Maintenance is slowly evolving towards Prescriptive Maintenance. Here, one does not only seek to inform about an impending problem, but also what to do about it. Such an approach needs to integrate with the logistics element of an organization to find an optimal decision that trades off several objectives with regards to equipment uptime, process quality, repair shop loading, procurement lead time, maintainer availability, safety constraints, contractual obligations, etc.