Articles written about Big Data Analytics

Customer Journey Mapping: The data-driven approach to understanding your users

Businesses across the globe are on a mission to know their customers inside out – something commonly referred to as customer-centricity. It’s an attempt to better understand the needs and wants of customers in order to provide them with a better overall experience.

But while this sounds promising in theory, it’s much harder to achieve in practice. To really know your customers, you must understand not only what they want, but also how they want it, when they want it and how often.

In essence, your business should use customer journey mapping. It allows you to visualise customer feelings and behaviours through the different stages of their journey – from the first interaction, right up until the point of purchase and beyond.

The Data-Driven Approach 

To ensure your customer journey mapping is successful, you must conduct some extensive research on your customers. You can’t afford to make decisions based on feelings and emotions alone. There are two types of research that you should use for customer journey mapping – quantitative and qualitative research.

Quantitative data is best for analysing the behaviour of your customers as it identifies their habits over time. It’s also extremely useful for confirming any hypotheses you may have developed. That said, relying solely upon quantitative data presents one major issue: it doesn’t give you the specific reasons behind those behaviours.

That’s where qualitative data comes to the rescue. Through data collection methods like surveys, interviews and focus groups, you can figure out the reasoning behind some of your quantitative data trends. The obvious downside to qualitative data is its lack of evidence and its tendency to be subjective. Therefore, a combination of both quantitative and qualitative research is most effective.

Creating A Customer Persona

A customer persona is designed to help businesses understand the key traits of specific groups of people, for example those defined by their age range or geographic location. A customer persona can help improve your customer journey map by providing more insight into the behavioural trends of your “ideal” customer.

The one downside to using customer personas is that they can be over-generalised at times. Just because a group of people shares a similar age, for example, it does not mean they all share the same beliefs and interests. Nevertheless, creating a customer persona is still beneficial to customer journey mapping – especially if used in combination with the correct customer journey analytics tools.

All Roads Lead To Customer-centricity 

To achieve customer-centricity, businesses must consider using a data-driven approach to customer journey mapping. First, it requires that you achieve a balance between both quantitative and qualitative research. Quantitative research will provide you with definitive trends while qualitative data gives you the reasoning behind those trends. 

To further increase the effectiveness of your customer journey map, consider creating customer personas. They will give you further insight into the behavioural trends within specific groups. 

This article was written by TAP London. Experts in the Adobe Experience Cloud, TAP London help brands organise data to provide meaningful insight and memorable customer experiences. Find out more at wearetaplondon.com.

How Important is Customer Lifetime Value?

This is the third article in the series Getting started with the top eCommerce use cases.

Customer Lifetime Value

Many studies have shown that the cost of acquiring a new customer is higher than the cost of retaining an existing one, which makes Customer Lifetime Value (CLV or LTV) one of the most important KPIs. Marketing is about building a relationship with your customer, and quality service matters a lot when it comes to customer retention. CLV is a metric that estimates the total amount of money a customer is expected to spend with your business.

CLV allows a company’s marketing department to understand how much money a customer is going to spend over their life cycle, which helps determine how much the company should spend to acquire each customer. Using CLV, a company can better understand its customers and come up with different strategies to retain existing ones, such as sending personalized emails or discount vouchers or providing better customer service. It also helps a company narrow its focus to acquiring similar customers by applying customer segmentation or look-alike modeling.

In today’s competitive eCommerce market, growth is one of the main goals of every company, and price is not the only factor when a customer makes a decision. CLV is a metric that revolves around the customer and helps to retain valuable customers, increase revenue from less valuable customers and improve the overall customer experience. Don’t look at CLV as just one metric: the journey to calculate it involves answering some really important questions that can be crucial for the business. Metrics and questions like:

  1. Number of sales
  2. Average number of times a customer buys
  3. Full Customer journey
  4. How many marketing channels were involved in one purchase?
  5. When was the purchase made?
  6. Customer retention rate
  7. Marketing cost
  8. Cost of acquiring a new customer

and so on are all associated with the calculation of CLV, and exploring them can be quite insightful. Lately, many companies have started to use this metric and shift their focus in order to make more profit. Amazon is the perfect example: in 2013, a study by Consumer Intelligence Research Partners found that Prime members spend more than non-Prime members. So Amazon has focused on Prime members over the past few years to increase its profit. The whole article can be found here.

How to calculate CLV?

There are several methods to calculate CLV; a few of them are listed below.

Method 1: By calculating average revenue per customer

 

Figure 1: Using average revenue per customer

 

Let’s suppose three customers brought a company 745 € in profit over a period of 2 months. Then:

CLV (2 months) = Total Profit over a period of time / Number of Customers over a period of time

CLV (2 months) = 745 / 3 ≈ 248 €

Now the company can use this to calculate CLV for a year. However, this is a naive approach that works only if customer preferences stay the same over that period. So let’s explore other approaches.
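
The arithmetic of Method 1 can be sketched in a few lines of Python. The function name is hypothetical, and the one-year figure assumes a simple linear extrapolation of the two-month value, which is exactly the naive assumption the text warns about.

```python
def average_clv(total_profit, num_customers):
    """Average profit per customer over the observed period."""
    if num_customers <= 0:
        raise ValueError("num_customers must be positive")
    return total_profit / num_customers


# Figures from the example above: 745 EUR profit, 3 customers, 2 months.
clv_two_months = average_clv(745, 3)

# Assumption (not from the article): customers behave the same in every
# two-month window, so the yearly value is six times the two-month value.
clv_one_year = clv_two_months * (12 / 2)

print(f"CLV (2 months): {clv_two_months:.2f} EUR")
print(f"CLV (1 year, naive): {clv_one_year:.2f} EUR")
```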

Method 2

This method requires first calculating KPIs such as retention rate and discount rate.

 

CLV = Gross margin per lifespan * ( Retention rate per month / (1 + Discount rate – Retention rate per month) )

where

Retention rate = ( (Customers at the end of the month – New customers acquired during the month) / Customers at the beginning of the month ) * 100
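
A minimal sketch of Method 2 follows. It assumes that "customers during the month" means customers newly acquired in that month, and it keeps the retention and discount rates as fractions (0.8 rather than 80) inside the CLV formula, since the percentage form would not fit the expression 1 + discount rate – retention rate. All function names and input numbers are hypothetical.

```python
def retention_rate(customers_start, customers_end, new_customers):
    """Share of the starting customers still present at the end (as a fraction)."""
    if customers_start <= 0:
        raise ValueError("customers_start must be positive")
    return (customers_end - new_customers) / customers_start


def clv_with_retention(gross_margin, retention, discount):
    """CLV = gross margin * retention / (1 + discount - retention)."""
    denominator = 1 + discount - retention
    if denominator <= 0:
        raise ValueError("1 + discount - retention must be positive")
    return gross_margin * retention / denominator


# Hypothetical month: 100 customers at the start, 95 at the end, 15 of them new.
monthly_retention = retention_rate(100, 95, 15)

# Hypothetical gross margin of 500 EUR and a 10% discount rate.
clv = clv_with_retention(500, monthly_retention, 0.10)
print(f"Retention: {monthly_retention:.0%}, CLV: {clv:.2f} EUR")
```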

Method 3

This method lets us look at other metrics as well and can be calculated in the following steps:

  1. Calculate average number of transactions per month (T)
  2. Calculate average order value (OV)
  3. Calculate average gross margin (GM)
  4. Calculate customer lifespan in months (ALS)

After calculating these metrics CLV can be calculated as:

 

CLV = T*OV*GM*ALS / No. of Clients for the period

where

Transactions (T) = Total transactions / Period

Average order value (OV) = Total revenue / Total orders

Gross margin (GM) = ((Total revenue – Cost of sales) / Total revenue) * 100 [but how you calculate cost of sales is debatable]

Customer lifespan in months (ALS) = 1 / Churn Rate %
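
The four metrics and the final formula of Method 3 can be combined as in the sketch below. The function name and all input figures are hypothetical, and gross margin and churn rate are kept as fractions rather than percentages so that the product stays in currency units.

```python
def clv_method_3(total_transactions, months, total_revenue, total_orders,
                 cost_of_sales, monthly_churn_rate, num_clients):
    """CLV = T * OV * GM * ALS / number of clients for the period."""
    t = total_transactions / months                        # transactions per month
    ov = total_revenue / total_orders                      # average order value
    gm = (total_revenue - cost_of_sales) / total_revenue   # gross margin (fraction)
    als = 1 / monthly_churn_rate                           # lifespan in months
    return t * ov * gm * als / num_clients


# Hypothetical six-month period for a small shop.
clv = clv_method_3(
    total_transactions=120,
    months=6,
    total_revenue=6000,
    total_orders=120,
    cost_of_sales=3600,
    monthly_churn_rate=0.05,
    num_clients=40,
)
print(f"CLV: {clv:.2f} EUR")
```

With these inputs T = 20, OV = 50, GM = 0.4 and ALS = 20 months, so the result is 20 * 50 * 0.4 * 20 / 40 = 200.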

 

CLV can be calculated using any of the above-mentioned methods, depending on how robust your company wants the analysis to be. Some companies also use machine learning models to predict CLV, perhaps not directly, but they use ML models to predict customer churn rate, retention rate and other marketing KPIs. Some companies take advantage of all the methods by taking an average at the end.

5 Applications for Location-Based Data in 2020

Location-based data makes it possible to give people relevant information based on where they are at any given moment. Here are five location data applications to look for in 2020 and beyond.

1. Increasing Sales and Reducing Frustration

One 2019 report indicated that 89% of the marketers who used geo data saw increased sales within their customer bases. Sometimes, the ideal way to boost sales is to convert what would be a frustration into something positive. 

A French campaign associated with the Actimel yogurt brand achieved this by sending targeted, encouraging messages to drivers who used the Waze navigation app and appeared to have made a wrong turn or got caught in traffic. 

For example, a driver might get a message that said, “Instead of getting mad and honking your horn, pump up the jams! #StayStrong.” The three-month campaign saw a 140% increase in ad recall. 

More recently, home furnishing brand IKEA launched a campaign in Dubai where people can get free stuff for making a long trip to a store. The freebies get more valuable as a person’s commute time increases. The catch is that participants have to activate location settings on their phones and enable Google Maps. Driving five minutes to a store got a person a free veggie hot dog, and they’d get a complimentary table for traveling 49 minutes. 

2. Offering Tailored Ad Targeting in Medical Offices

Pharmaceutical companies are starting to rely on companies that send targeted ads to patients connected to the Wi-Fi in doctors’ offices. One such provider is Semcasting. A recent effort involved sending ads to cardiology offices for a type of drug that lowers cholesterol levels in the blood. 

The company has taken a similar approach for an over-the-counter pediatric drug and a medication to relieve migraine headaches, among others. Such initiatives cause a 10% boost in the halo effect, plus a 1.5% uptick in sales. The first perk relates to the favoritism that people feel towards other products a company makes once they like one of them.

However, location data applications related to health care arguably require special attention regarding privacy. Patients may feel uneasy if they believe that companies are watching them and know they need a particular kind of medical treatment. 

3. Facilitating the Deployment of the 5G Network

The 5G network is coming soon, and network operators are working hard to roll it out. Statistics indicate that the 5G infrastructure investment will total $275 billion over seven years. Geodata can help network brands decide where to deploy 5G connectivity first.

Moreover, once a company offers 5G in an area, marketing teams can use location data to determine which neighborhoods to target when contacting potential customers. Most companies that currently have 5G within their product lineups have carefully chosen which areas are at the top of the list to receive 5G, and that practice will continue throughout 2020. 

It’s easy to envision a scenario whereby people can send error reports to 5G providers by using location data. For example, a company could say that having location data collection enabled on a 5G-powered smartphone allows a technician to determine if there’s a persistent problem with coverage.

Since the 5G network is still new, it’s impossible to predict all the ways that a telecommunications operator might use location data to make their installations maximally profitable. However, the potential is there for forward-thinking brands to seize.

4. Helping People Know About the Events in Their Areas

SoundHound, Inc. and Wcities recently announced a partnership that will rely on location-based data to keep people in the loop about upcoming local events. People can use a conversational intelligence platform that has information about more than 20,000 cities around the world. 

Users also don’t need to mention their locations in voice queries. They could say, for example, “Which bands are playing downtown tonight?” or “Can you give me some events happening on the east side tomorrow?” They can also ask something associated with a longer timespan, such as “Are there any wine festivals happening this month?”

People can say follow-up commands, too. They might ask what the weather forecast is after hearing about an outdoor event they want to attend. The system also supports booking an Uber, letting people get to the happening without hassles. 

5. Using Location-Based Data for Matchmaking

In honor of Valentine’s Day 2020, students from more than two dozen U.S. colleges signed up for a matchmaking opportunity that relies, at least in part, on their location data.

Participants answer school-specific questions, and their responses help them find a friend or something more. The platform uses algorithms to connect people with like-minded individuals. 

However, the company that provides the service can also give a breakdown of which residence halls have the most people taking part, or whether people generally live off-campus. This example is not the first time a university used location data by any means, but it’s different from the usual approach. 

Location Data Applications Abound

These five examples show there are no limits to how a company might use location data. However, they must do so with care, protecting user privacy while maintaining a high level of data quality. 

Solving CAPTCHAs via Machine Learning

How far has machine learning advanced in the field of CAPTCHA solving?

Machine learning is more than a buzzword: under the hood are many algorithms that can solve a whole range of problems. Solving CAPTCHAs is just one of many tasks machine learning can handle. While working on a few problems involving convolutional neural networks, we found that there is still a lot of room for improvement in this area. Recognition accuracy is often still not good enough. Let’s look in detail at which services are available to tackle this problem and which of them prove to be the best.

What is CAPTCHA?

CAPTCHA is no longer an unfamiliar term for web users. It is the annoying human-validation check added to many websites. The name is an acronym for Completely Automated Public Turing test to tell Computers and Humans Apart. A CAPTCHA can be described as a computer program designed to tell humans and machines apart in order to prevent any kind of illegal activity on websites. The point of a CAPTCHA is that only a human should be able to pass the test, while bots and any other form of automated script fail it. The result is a race between CAPTCHA providers and hacker solutions that rely on self-learning systems.

Why do we need to solve CAPTCHAs?

Today, users employ automated CAPTCHA solving for various use cases. An important note here: as with penetration testing, using it against third parties without prior permission is illegal. Using it against your own applications or with permission (e.g. as part of an IT security test) is allowed. Hackers and spammers use CAPTCHA solving to harvest users’ email addresses so that they can generate as much spam as possible, or to carry out brute-force attacks. Legitimate examples are cases in which a new customer or business partner has come to you and needs access to your application programming interface (API), which is not yet ready or cannot be shared because of a security problem or the abuse it could cause.

For these use cases, automated scripts are meant to solve the CAPTCHA. There are different types of CAPTCHA: text-based and image-based CAPTCHA, reCAPTCHA and mathematical CAPTCHA.

There is a race between CAPTCHA providers and automated solving attempts. The technology used in CAPTCHA and reCAPTCHA is therefore becoming ever more intelligent, and updates to the access methods more frequent. The arms race has begun.

Popular methods for solving CAPTCHAs

The following methods are available to users for solving CAPTCHA and reCAPTCHA:

  1. OCR (optical character recognition) via OCR-enabled bots – This approach solves CAPTCHAs automatically using OCR. Tools such as Ocrad and tesseract solve CAPTCHAs, but with very low accuracy.
  2. Machine learning – Using computer vision, convolutional neural networks, and Python frameworks and libraries such as Keras with TensorFlow, we can train deep convolutional neural network models to find the letters and digits in the CAPTCHA image.
  3. Online CAPTCHA-solving services – Some of these services have human workers who are constantly online to solve CAPTCHAs. When you send your CAPTCHA-solving request, the service forwards it to the solvers, who solve it and send back the solutions.

Performance analysis of the OCR-based solution

OCR is a low-cost solution when it comes to solving a large number of trivial CAPTCHAs, but it still does not deliver the required accuracy. OCR-based solutions have become rare since Google released reCAPTCHA v3. OCR-enabled bots are therefore not suited to bypassing the CAPTCHAs used by giants such as Google, Facebook or Twitter. That would require a better-equipped CAPTCHA-solving system.

OCR-based solutions solve 1 in 3 trivial CAPTCHAs correctly.

Performance analysis of the ML-based method

Let’s look at how solutions based on machine learning work:

ML-based approaches use OpenCV to find contours in an image, which identifies the connected regions. The images are preprocessed with thresholding, and all images are converted to black and white. We split the CAPTCHA image into individual letters with the OpenCV function findContours(). The processed images are now just single letters and digits. These are then fed to the CNN model to train it. The trained CNN model is then ready to solve real CAPTCHAs.
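
The preprocessing idea (binarize, then cut the image into single characters) can be illustrated without any dependencies. The sketch below is a deliberate simplification and not the OpenCV pipeline described above: it thresholds a tiny hand-made grayscale grid and splits it wherever a column contains no ink, which stands in for contour detection only when characters do not touch. The function names and the sample image are hypothetical.

```python
def threshold(gray, cutoff=128):
    """Binarize a grayscale grid: 1 = ink (dark pixel), 0 = background."""
    return [[1 if pixel < cutoff else 0 for pixel in row] for row in gray]


def split_characters(binary):
    """Return (start, end) column ranges of ink separated by empty columns."""
    width = len(binary[0])
    has_ink = [any(row[col] for row in binary) for col in range(width)]
    spans, start = [], None
    for col, ink in enumerate(has_ink):
        if ink and start is None:
            start = col                  # a character begins
        elif not ink and start is not None:
            spans.append((start, col))   # a character ends
            start = None
    if start is not None:
        spans.append((start, width))
    return spans


# Tiny hypothetical 3x7 "image": two dark blobs on a white background.
image = [
    [255,   0,  20, 255, 255,  10, 255],
    [255,  30, 255, 255, 255,   0, 255],
    [255,   0,  10, 255, 255,  40, 255],
]
segments = split_characters(threshold(image))
print(segments)
```

Each column range would then be cropped out and passed to the classifier as one letter or digit.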

The precision of such a solution is far better than that of the OCR solution for all text-based CAPTCHAs. This solution also has many drawbacks, because it solves only one particular type of CAPTCHA and Google constantly updates its reCAPTCHA generation algorithm. The latest update appeared to be the most effective reCAPTCHA update to affect this kind of service so far: regular users hardly noticed any change in difficulty, while automated solutions either stopped working entirely or worked only very slowly or inaccurately.

The model was trained with 10^4 iterations with correct and random samples and 10^5 test images, and a mean accuracy of ~60% was achieved.

Image source: “CAPTCHA Recognition with Active Deep Learning” @ TU München https://www.researchgate.net/publication/301620459_CAPTCHA_Recognition_with_Active_Deep_Learning

So if your use case is to solve one type of CAPTCHA of fairly low complexity, such a trained ML model will serve you very well. It is a better CAPTCHA-solving approach than OCR, but it still needs to cover quite a lot more ground before its accuracy can be relied on.

Online CAPTCHA-solving service

Online CAPTCHA-solving services are so far the best available solution to this problem. They track every reCAPTCHA update by Google and advertise an accuracy of up to 99%.

Why are online anti-CAPTCHA services more capable than other methods?

Based on the research and development so far, the OCR-based and ML solutions have many drawbacks. They can solve only trivial CAPTCHAs, and without substantial accuracy. By comparison, online services offer the following:

– A higher percentage of correct solutions (OCR produces an extremely high rate of wrong answers on really complicated CAPTCHAs, not to mention that some types of CAPTCHA cannot be solved with OCR at all, at least for now).

– Continuous, error-free operation without interruptions, with fast adaptation to newly added complexity.

– Low cost with limited resources and low maintenance, since there are no software or hardware problems; all you need is an internet connection to send simple jobs via the anti-CAPTCHA service’s API.

The major providers of online solving services

Now that we have established the better technique for solving your CAPTCHAs, let’s pick the best among all the anti-CAPTCHA services. Some services offer high solution accuracy, API support for automation, and fast responses to requests. These include services such as 2captcha, Imagetyperz, CaptchaSniper, etc.

2captcha is one of the services that rely on a combination of machine learning and real people to solve CAPTCHAs reliably. Services like 2captcha promise:

  • Fast solving: 17 seconds for graphical and text CAPTCHAs and ~23 seconds for reCAPTCHA
  • Support for all popular programming languages, with comprehensive documentation of the ready-made libraries.
  • High accuracy (up to 99%, depending on the CAPTCHA type).
  • Refunds for wrong answers.
  • The capacity to solve a large number of CAPTCHAs (more than 10,000 per minute)

Conclusion

Convolutional neural networks (CNNs) can handle the simplest types of CAPTCHA and will be able to keep pace with further developments. We are dealing with a race between increasingly complicated CAPTCHAs and ever more capable automated recognition. For the time being, online anti-CAPTCHA services that rely on a mix of machine learning and human intelligence will stay ahead of these solutions.

Article series: 5 Clean Coding Tips

This series of articles will cover 5 clean coding tips to follow as soon as you’ve made the first steps into your coding career, using Python as the example.

At the beginning of your adventure with coding, you might find that getting your code to compile without any errors and give you the output that you expect is hard enough. Conforming to any standards and style guides is at the very bottom of your concerns. You might be at the beginning of your career or you might have a lot of domain experience but not that much in coding. Or maybe until now you worked mostly on your own and never had to make your code available for others to work with. In any case, it is worth acknowledging how crucial it is to write your code in a concise, readable and understandable way, and how many benefits it will eventually bring you.

The first thing to realize is that the whole clean coding concept has been developed for people, your fellow travelers, not for computers. The compiler doesn’t care how you name your variables, how you split your lines or whether everything is aligned in a pretty way. You could even write your code as one gigantic, few-meters-long line, giving the interpreter just one signal, a semicolon, that the line should be split, and it will execute it perfectly.
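
To make the point concrete, here is a small made-up Python example (the prices and variable names are purely illustrative). Both versions run and compute the same number; only the second one tells a human reader what is going on.

```python
# Legal Python, but hard to read: several statements on one line,
# separated by semicolons, with names that carry no meaning.
p = [19.99, 5.49, 3.50]; t = sum(p); d = t * 0.10; r = round(t - d, 2)

# The same logic, written for human readers.
prices = [19.99, 5.49, 3.50]
discount_rate = 0.10

total = sum(prices)
discount = total * discount_rate
discounted_total = round(total - discount, 2)

print(r, discounted_total)
```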

However, it is likely that the deeper you are into your career, the more people will have to read, understand and modify the code that you wrote. You will write code to communicate certain ideas and solutions to other people. Therefore, you need to be sure that what you want to communicate is understandable and quick and easy to read. The coding best practice is to always code in a clean way, treating the code itself and not just the output as the result of your work.

There usually are fixed rules and standards regarding code readability. For Python, it is PEP 8[i]. Some companies elaborate on those standards where PEP 8 is a bit vague or leaves room for interpretation. The exact formatting styles might differ at Facebook, Google[ii] or at the company you happen to work for. But before you get lost in the art of perfect line splitting, bracket alignment techniques, or the hopeless tabs-or-spaces battle, have a look at the 5 tips in the upcoming articles in this series. They are universal and might help you make your code less of a chaotic mess and more of a blissful delight.

List of articles in this series:

  1. Be consistent
  2. Name variables in a meaningful way (to be published soon)
  3. Take advantage of the formatting tools (to be published soon)
  4. Stop commenting the obvious (to be published soon)
  5. Put yourself in somebody else’s shoes (to be published soon)

References:

[i] https://www.python.org/dev/peps/pep-0008/
[ii] http://google.github.io/styleguide/pyguide.html

Integrate Unstructured Data into Your Enterprise to Drive Actionable Insights

In an ideal world, all enterprise data is structured – classified neatly into columns, rows, and tables, easily integrated and shared across the organization.

The reality is far from it! Datamation estimates that unstructured data accounts for more than 80% of enterprise data, and it is growing at a rate of 55 – 65 percent annually. This includes information stored in images, emails, spreadsheets, etc., that cannot fit into databases.

Therefore, it becomes imperative for a data-driven organization to leverage their non-traditional information assets to derive business value. We have outlined a simple 3-step process that can help organizations integrate unstructured sources into their data eco-system:

1. Determine the Challenge

The primary step is narrowing down the challenges you want to solve through the unstructured data flowing in and out of your organization. Financial organizations, for instance, use call reports, sales notes, or other text documents to get real-time insights from the data and make decisions based on the trends. Marketers make use of social media data to evaluate their customers’ needs and shape their marketing strategy.

Figuring out which process your organization is trying to optimize through unstructured data can help you reach your goal faster.

2. Map Out the Unstructured Data Sources Within the Enterprise

An actionable plan starts with identifying the range of data sources that are essential to creating a truly integrated environment. This enables organizations to align the sources with business objectives and streamline their data initiatives.

Deciding which data should be extracted, analyzed, and stored should be a primary concern in this regard. Even if you can ingest data from any source, it doesn’t mean that you should.

Collecting a large volume of unstructured data is not enough to generate insights. It needs to be properly organized and validated for quality before integration. Full, incremental, online, and offline extraction methods are generally used to mine valuable information from unstructured data sources.

3. Transform Unstructured Assets into Decision-Ready Insights

Now that you have all the puzzle pieces, the next step is to create a complete picture. This may require making changes in your organization’s infrastructure to derive meaning from your unstructured assets and get a 360-degree business view.

IDC recommends creating a company culture that promotes the collection, use, and sharing of both unstructured and structured business assets. Therefore, finding an enterprise-grade integration solution that offers enhanced connectivity to a range of data sources, ideally structured, unstructured, and semi-structured, can help organizations generate the most value out of their data assets.

Automation is another feature that can help speed up integration processes, minimize error probability, and generate time-and-cost savings. Features like job scheduling, auto-mapping, and workflow automation can optimize the process of extracting information from XML, JSON, Excel or audio files, and storing it into a relational database or generating insights.
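
As a rough illustration of the last point, the sketch below parses a small JSON payload and stores it in a relational table using only the Python standard library. The payload, the table name `call_notes` and its columns are invented for the example; a real integration tool would add validation, scheduling and error handling on top.

```python
import json
import sqlite3

# Hypothetical semi-structured input, e.g. exported call notes.
raw = """
[
  {"customer": "Ada", "note": "asked about pricing"},
  {"customer": "Ben", "note": "requested a demo"}
]
"""

records = json.loads(raw)

# An in-memory SQLite database stands in for the relational target.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE call_notes (customer TEXT, note TEXT)")
conn.executemany(
    "INSERT INTO call_notes (customer, note) VALUES (:customer, :note)",
    records,
)
conn.commit()

rows = conn.execute(
    "SELECT customer, note FROM call_notes ORDER BY customer"
).fetchall()
print(rows)
```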

The push to become a data-forward organization has enterprises re-evaluating the way to leverage unstructured data assets for decision-making. With an actionable plan in place to integrate these sources with the rest of the data, organizations can take advantage of the opportunities offered by analytics and stand out from the competition.

Introduction to Recommendation Engines

This is the second article in the series Getting started with the top eCommerce use cases. If you are interested in reading the first article, you can find it here.

What are Recommendation Engines?

Recommendation engines are automated systems that help surface similar things whenever a user selects something online. Netflix, Amazon, Spotify, Facebook, YouTube and others all now use some sort of recommendation engine to improve their user experience. A recommendation engine not only helps to predict whether a user prefers an item, but also helps to increase sales, understand customer behavior, increase the number of registered users and save users time. For instance, Netflix will suggest movies you might want to watch, and Amazon will suggest other products you might want to buy. All of these platforms rely on the same basic ideas in the background, and in this article we are going to discuss them.

What are the techniques?

There are two fundamental techniques that come into play when there’s a need to generate recommendations. The next sections discuss them in detail.

Content-Based Filtering

The idea behind content-based filtering is to analyse a set of features that provides a measure of similarity between the items themselves, i.e. between two movies, two products or two songs. Once compared, these features give a similarity score that can be used as a reference for the recommendations.

There are several steps involved in getting to this similarity score. The first step is to construct a profile for each item by representing some of its important features. In other words, this step requires defining a set of characteristics that are easy to discover. For instance, suppose there’s an article that a user has already read; once you know the user likes it, you may want to recommend similar articles. Using content-based filtering, you can find them. The easiest way is to define some features for the article, such as publisher, genre and author. Based on these features, similar articles can be recommended to the user (as illustrated in Figure 1). Three main similarity measures, described below, can be used to find similar articles.

 

Figure 1: Content-Based Filtering

 

 

Minkowski distance

Minkowski distance between two variables can be calculated as:

d(x,y) = (\sum_{i=1}^{n}{|x_{i} - y_{i}|^{p}})^{1/p}
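
A direct translation of the formula into Python might look like this; the function name is an assumption. With p = 1 it gives the Manhattan distance and with p = 2 the Euclidean distance.

```python
def minkowski_distance(x, y, p=2):
    """Minkowski distance between two equal-length numeric vectors."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    if p < 1:
        raise ValueError("p must be at least 1")
    return sum(abs(a - b) ** p for a, b in zip(x, y)) ** (1 / p)


print(minkowski_distance([0, 0], [3, 4], p=1))  # Manhattan
print(minkowski_distance([0, 0], [3, 4], p=2))  # Euclidean
```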

 

Cosine Similarity

Cosine similarity between two variables can be calculated as :

\mbox{Cosine Similarity}(x,y) = \frac{\sum_{i=1}^{n}{x_{i} y_{i}}} {\sqrt{\sum_{i=1}^{n}{x_{i}^{2}}} \sqrt{\sum_{i=1}^{n}{y_{i}^{2}}}}
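
The same formula as a small Python function (the name is an assumption). Returning 0.0 for an all-zero vector is a convention chosen here to avoid division by zero, not something the formula itself defines.

```python
import math


def cosine_similarity(x, y):
    """Cosine of the angle between two equal-length numeric vectors."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    if norm_x == 0 or norm_y == 0:
        return 0.0  # convention for zero vectors
    return dot / (norm_x * norm_y)


print(cosine_similarity([1, 2, 3], [2, 4, 6]))  # same direction
print(cosine_similarity([1, 0], [0, 1]))        # orthogonal
```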

 

Jaccard Similarity

 

  J(X,Y) = |X ∩ Y| / |X ∪ Y|
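
For set-valued features such as tags or keywords, the Jaccard formula is a one-liner in Python. The function name and the sample tags are made up, and treating two empty sets as having similarity 0.0 is a convention chosen for this sketch.

```python
def jaccard_similarity(a, b):
    """Size of the intersection divided by the size of the union."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 0.0  # convention when both sets are empty
    return len(a & b) / len(union)


tags_one = {"data", "python", "ml"}
tags_two = {"python", "ml", "sql"}
print(jaccard_similarity(tags_one, tags_two))
```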

 

These measures can be used to build a matrix of pairwise similarities between all articles; a function can then return, for example, the 10 articles most similar to a given one.
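The similarity matrix and the "top N" lookup described above might be sketched as follows. The article catalogue, the feature names and the helper names (`profile`, `similarity_matrix`, `top_similar`) are hypothetical, and Jaccard similarity over feature sets is just one of the three measures that could be used here.

```python
# Hypothetical catalogue: article id -> descriptive features.
ARTICLES = {
    "a1": {"publisher": "DataDaily", "genre": "analytics", "author": "Lee"},
    "a2": {"publisher": "DataDaily", "genre": "analytics", "author": "Kim"},
    "a3": {"publisher": "TechWeek", "genre": "analytics", "author": "Kim"},
    "a4": {"publisher": "TechWeek", "genre": "hardware", "author": "Roy"},
}


def profile(features):
    """Turn a feature dict into a set of 'name=value' tokens (the item profile)."""
    return {f"{name}={value}" for name, value in features.items()}


def jaccard(a, b):
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def similarity_matrix(items):
    """Pairwise similarity between all items, stored as a dict of dicts."""
    profiles = {item_id: profile(f) for item_id, f in items.items()}
    return {
        i: {j: jaccard(profiles[i], profiles[j]) for j in profiles}
        for i in profiles
    }


def top_similar(matrix, item_id, n=10):
    """Return up to n (item, score) pairs most similar to item_id, best first."""
    scores = [(other, s) for other, s in matrix[item_id].items() if other != item_id]
    scores.sort(key=lambda pair: pair[1], reverse=True)
    return scores[:n]


matrix = similarity_matrix(ARTICLES)
recommendations = top_similar(matrix, "a1", n=2)
```

For a large catalogue the full matrix grows quadratically with the number of items, so in practice it is often computed in batches or approximated.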

 

Collaborative filtering

This filtering method focuses on how similar two users or two products are by analyzing user behavior and preferences rather than the content of the items. For instance, consider three users A, B and C, and suppose we want to recommend movies to user A. A first approach is to find users similar to A and recommend the movies they liked that A has not yet watched. This approach, in which we look for similar users, is called User-User Collaborative Filtering.
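The user-user idea with users A, B and C can be sketched as below. The ratings, the movie titles, the `min_rating` threshold and the use of a single nearest neighbour are all assumptions made to keep the example short; real systems typically aggregate over several neighbours.

```python
import math

# Hypothetical ratings on a 1-5 scale: user -> {movie: rating}.
RATINGS = {
    "A": {"Inception": 5, "Heat": 4},
    "B": {"Inception": 5, "Heat": 5, "Alien": 4},
    "C": {"Inception": 1, "Up": 5},
}


def cosine(u, v):
    """Cosine similarity between two sparse rating dicts (missing = 0)."""
    shared = set(u) & set(v)
    dot = sum(u[k] * v[k] for k in shared)
    norm_u = math.sqrt(sum(r * r for r in u.values()))
    norm_v = math.sqrt(sum(r * r for r in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def recommend_user_user(ratings, target, min_rating=4):
    """Recommend movies the most similar user rated highly and the target has not seen."""
    others = [user for user in ratings if user != target]
    neighbour = max(others, key=lambda user: cosine(ratings[target], ratings[user]))
    seen = set(ratings[target])
    return sorted(
        movie
        for movie, rating in ratings[neighbour].items()
        if movie not in seen and rating >= min_rating
    )


suggestions = recommend_user_user(RATINGS, "A")
```

Here user B rates the movies A has seen almost identically, so B becomes the neighbour and B's remaining highly rated movie is suggested to A.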

The other approach is to find similar movies based on the ratings given by other users; this is called Item-Item Collaborative Filtering. Research suggests that item-item collaborative filtering often works better than user-user collaborative filtering, because user behavior is dynamic and changes over time. In addition, the number of users is large and grows every day, whereas item characteristics stay largely the same. To calculate the similarities we can use cosine similarity.
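A comparable sketch for the item-item variant is shown below. Each movie is represented by the ratings it received, movies are compared with cosine similarity, and unseen movies are scored by their similarity to what the target user already rated. The data and the scoring rule (similarity weighted by the user's own rating) are illustrative assumptions.

```python
import math
from collections import defaultdict

# Hypothetical ratings on a 1-5 scale: user -> {movie: rating}.
RATINGS = {
    "A": {"Inception": 5, "Heat": 4},
    "B": {"Inception": 5, "Heat": 5, "Alien": 4},
    "C": {"Inception": 1, "Up": 5},
}


def item_vectors(ratings):
    """Invert user -> {movie: rating} into movie -> {user: rating}."""
    vectors = defaultdict(dict)
    for user, user_ratings in ratings.items():
        for movie, rating in user_ratings.items():
            vectors[movie][user] = rating
    return dict(vectors)


def cosine(u, v):
    shared = set(u) & set(v)
    dot = sum(u[k] * v[k] for k in shared)
    norm_u = math.sqrt(sum(r * r for r in u.values()))
    norm_v = math.sqrt(sum(r * r for r in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def recommend_item_item(ratings, target):
    """Rank unseen movies by similarity to the movies the target user rated."""
    vectors = item_vectors(ratings)
    rated = ratings[target]
    scores = {}
    for candidate in vectors:
        if candidate in rated:
            continue
        scores[candidate] = sum(
            cosine(vectors[candidate], vectors[movie]) * rating
            for movie, rating in rated.items()
        )
    return sorted(scores, key=scores.get, reverse=True)


ranking = recommend_item_item(RATINGS, "A")
```

Because item vectors change more slowly than user tastes, the item-to-item similarities can be precomputed and reused across many users.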

 

Figure 2: Collaborative Filtering

 

Recently, some companies have started to combine content-based and collaborative filtering into a hybrid recommendation engine. The results from both models are merged into one hybrid model, which can provide more accurate recommendations.

Five steps are involved in making a recommendation engine work: collecting data, storing data, analyzing the data, filtering the data and providing recommendations. Collecting user data involves many attributes, including browsing history, page views, search logs, order history and marketing channel touch points, and therefore requires a strong data architecture. Collecting the data is fairly straightforward, but analyzing this amount of data can be overwhelming. Storing it can also get tricky, because you need a scalable database. With the rise of graph databases this area is improving for many use cases, including recommendation engines; graph databases such as Neo4j can also help to find similar users and the relationships among them. The analysis can be carried out in different ways: depending on how strong and scalable your architecture is, you can run real-time, near-real-time or batch analysis. The fourth step is filtering the data, where you can use any of the approaches described above to find similarities and, finally, provide the recommendations.
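To illustrate the hybrid approach mentioned above, one simple option is a weighted average of the two models' scores. The article does not say how the results are combined, so the blending rule, the `weight` parameter and the sample scores are assumptions; both inputs are assumed to be scaled to the range 0 to 1.

```python
def hybrid_scores(content_scores, collaborative_scores, weight=0.5):
    """Blend two score dicts; weight is the share given to the content-based model."""
    if not 0 <= weight <= 1:
        raise ValueError("weight must be between 0 and 1")
    items = set(content_scores) | set(collaborative_scores)
    return {
        item: weight * content_scores.get(item, 0.0)
        + (1 - weight) * collaborative_scores.get(item, 0.0)
        for item in items
    }


# Hypothetical scores from a content-based and a collaborative model.
content = {"a2": 0.5, "a3": 0.2}
collaborative = {"a3": 0.9, "a4": 0.4}

blended = hybrid_scores(content, collaborative, weight=0.5)
ranking = sorted(blended, key=blended.get, reverse=True)
```

Items known to only one model still receive a score, which is one reason hybrids can help with new items or new users.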

Building a good recommendation engine can be time-consuming initially, but it is beneficial in the long run. It not only helps to generate revenue but also helps to improve your product catalog and customer service.

How process mining in 2020 ensures your successful business transformation

A lack of information about existing processes causes 70% of all large transformation projects and around 50% of all RPA projects to fail. The reasons are a poor understanding of existing processes and the missing link between the discovery, visualization, analysis and execution of existing data. Process mining technology gives you the information, transparency and quantifiable figures needed to improve end-to-end processes for a sustainable transformation.


Read this article in English:

Six ways process mining in 2020 can save your business transformation

 


Process mining in 2020

Your data fingerprint

Looking at the figures above (from McKinsey and Ernst & Young (EY) respectively), one thing becomes clear: the digitization of products and services is forcing companies of all sizes and in all industries to drastically rethink their existing business models and processes. This makes process mining all the more important. The technique uses unique data (your company's business fingerprint, so to speak) to automatically piece together all existing business processes and represent them digitally.

This digital evidence lets us visualize exactly how processes work (both the conventional path and variant executions), down to individual process instances. In other words, process mining uncovers hidden or dormant processes, reveals hidden value and enables immediate understanding.

Success with the right processes

With standardized and configurable notifications and KPIs, you can better understand the immediate impact of process changes. This lowers error rates and strengthens confidence in the company. And that is not all: everyone, from new employees to the C-suite, can better visualize, understand and explain their organization's processes. This ensures that process changes succeed in the long term.

Unlock the full potential of processes

In business, it is not only communication that is crucial, but also responding to problems with suitable solutions. A company's everyday operations, meaning the underlying processes, form the connection to the business technology in use, from process mining to robotic “process” automation. Without an understanding of the processes and of how a company actually works, however, the technology is redundant. Processes are, so to speak, the lifeblood of a company.

 

Process mining: your differentiator

Integrating transformative digital technologies

Process mining offers far more than discovery, visualization and analysis: using your existing data, you can automatically monitor process execution in real time. This simple point-and-click assessment enables an immediate understanding of complex processes. In transformation projects, which by their nature require profound changes to business and organizational activities, process mining provides the visual overview and enables immediate action.

This self-sustaining approach leads to sustainable results and creates a process culture across the entire company. With such an approach, digital transformation and excellence professionals can more easily leverage processes, substantiate their projects and programs, and overcome the challenges of behavioral change. This includes easier integration of transformative digital technologies, greater operational agility and flexibility, improved leadership and culture, and workforce enablement.

Three ways to a successful transformation project with process mining:

  • You need 100% operational transparency: Charting all your transactions requires complete process transparency. It enables a direct comparison between the actual state and the planned process flow. This conformance check can automatically identify the highest-priority problems and tasks and highlight the root causes of discrepancies between target and actual, so that action can be taken immediately.
  • You need to cut costs and increase efficiency: Research by Signavio shows that almost 60% of companies have had to bear unnecessary additional costs due to process inefficiencies. Process mining can help your company reduce costs because it discovers weaknesses and deviations while showing which processes are slowing you down, including the bottlenecks and inefficiencies that affect revenue. Process mining opens the way to process improvements and forward-looking strategies, and thus to positive business change.
  • You need to optimize the buying and selling cycle: Is shipping taking too long? Which supplier supports you inadequately? Which supplier is the best? Process mining is your one-click trick for answering such questions and determining which units perform best and which only waste time and money.

Process mining and robotic process automation (RPA)

The beneficial combination of both technologies

RPA (robotic process automation) enables the automation of manual, repetitive and error-prone tasks. However, this requires process owners to know exactly how and to what end they deploy software robots, and to measure their performance continuously. The combination of RPA and process mining therefore offers companies many advantages: across the entire RPA initiative, they can measure the performance and benefits of their software robots and deploy them in the best possible way for their scenario.

Upgrade robot-led automation

With these insights, process mining is an excellent preparation for process automation: to fully realize the benefits of robot-led automation, organizations must not only understand their existing systems but also identify opportunities for automation. Process mining tools provide valuable insights into process data throughout the entire RPA cycle, from defining the strategy to continuous improvement and innovation.

 Three ways to a successful RPA lifecycle project with process mining:

  • You need process overviews based on specific criteria: To get a complete overview of end-to-end processes, you must identify the high-ROI processes that are suitable for RPA implementation. This lets you determine the optimal process flow/path and uncover redundant processes that you may not even have been aware of before automating.
  • You are unsure how best to optimize human-machine cycles: By determining the optimal process flow/path, you can also better detect inefficient human-robot hand-offs and obtain quantifiable data on the financial impact of each “digital worker” or process. This lets you compare human and robot work in terms of accuracy, efficiency, cost and project duration.
  • You need to better understand how RPA supports legacy processes and systems: By integrating with cloud and web/app-based services, RPA lets companies keep using their legacy systems. Legacy functions can thus be connected to modern tools, applications and even mobile apps. Efficiency and effectiveness improve across all key departments, including HR, finance and legal.

Process mining for a better customer experience and mapping

Rethink customer satisfaction

Integrating process mining with other technologies is also crucial for better process quality and growth in the market. Process management already focuses on customer retention. Successful process management enables companies to delight customers at the lowest possible cost as part of comprehensive effectiveness goals, instead of pursuing one-sided efficiency goals.

Furthermore, within customer journey mapping (CJM), especially when linked to the underlying processes, process mining makes it possible to gain better business insights and to view these processes from an outside-in customer perspective. By combining process mining with a customer-centric view of business activities, customer satisfaction becomes a strategic factor for business success.

Realize the full potential of processes

Rely on Signavio Process Intelligence for your process mining initiatives and learn in our free white paper, Erfolgreiches Process Mining mit Signavio Process Intelligence, how your company can harness the hidden value of processes, generate new ideas, and save time and money.

Six ways process mining in 2020 can save your business transformation

The lack of information about existing processes kills 70% of large transformation projects and around 50% of RPA projects…alarming statistics. Triggering this failure rate is a lack of understanding about existing processes, and the disconnect between the discovery, visualization, analysis, and execution of existing data. So, banish the process guesswork! Utilizing process mining technology unlocks the information, visibility, and quantifiable numbers needed to improve end-to-end processes for sustainable transformation.


Read this article in German:

Wie Process Mining 2020 Ihre erfolgreiche Geschäftstransformation 2020 sicherstellt

 


Process mining in 2020

Your data fingerprint

If we consider the figures again (from McKinsey and Ernst & Young (EY) respectively), the digitization of products and services is forcing companies of all shapes and sizes, and in all industries, to dramatically reconsider their existing business models and the processes they implement. Because all activities are different, process mining uses the unique data—your company’s business fingerprint—to automatically piece together a digital representation of all your existing business processes.

This digital evidence enables us to visualize exactly how processes are operating (both the conventional path and variable executions) down to individual process instances. In other words, you can unearth processes which lie unseen or dormant, revealing hidden value, and providing an instant understanding of complex processes in minutes rather than days.
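As a rough illustration of how a process picture can be pieced together from existing data, the sketch below counts which activity directly follows which in an event log. The log, the case ids and the activity names are hypothetical, and counting directly-follows relations is only one simple discovery technique; it is not presented as how any particular product works.

```python
from collections import Counter

# Hypothetical event log: case id -> ordered activities of that process instance.
EVENT_LOG = {
    "case-1": ("receive order", "check credit", "ship goods"),
    "case-2": ("receive order", "ship goods"),
    "case-3": ("receive order", "check credit", "ship goods"),
}


def directly_follows(event_log):
    """Count how often activity b directly follows activity a across all cases."""
    counts = Counter()
    for trace in event_log.values():
        for a, b in zip(trace, trace[1:]):
            counts[(a, b)] += 1
    return counts


def variants(event_log):
    """Count how many cases follow each distinct path (process variant)."""
    return Counter(event_log.values())


edges = directly_follows(EVENT_LOG)
paths = variants(EVENT_LOG)
```

The edge counts form a simple process graph, and the variant counts show the conventional path next to less common executions such as the case that skips the credit check.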

Triggering dormant success

Then, with standardized and configurable notifications and KPIs, you can further understand the immediate impact of any process change made—meaning that failure rates decrease, and company confidence is improved. And that’s not all: everyone from new employees to the C-suite can better visualize, understand, and explain their organization’s processes. This ensures that the right process change is secure and that improvement has the intended impact, every time.

Unleash the power of process

In business, we all answer to somebody, and it is critical to connect problems to real solutions. The everyday functions of companies—the processes upon which they are built—are the connection to business tech, from “process” mining to robotic “process” automation. Without process understanding, the tech is redundant because we have no idea how work has flowed in an existing application. Process is the lifeblood of operations.

Process mining: your point of differentiation

Transformative digital technology integration

Beyond the DVA of process mining (discover, visualize, analyze), there is the power to monitor real-time process execution automatically from your existing data. This simple point-and-click assessment can provide an instant understanding of complex processes. Within transformation projects, which by their very nature require the profound transformation of business and organizational activities, processes, competencies, and models, process mining provides the visual map to facilitate immediate action.

This self-sustaining approach across an entire organization is what leads to genuinely sustainable outcomes, and builds a process culture within an organization. By taking this holistic approach, digital transformation and excellence professionals will find it easier to leverage processes, justify their projects and programs, and address behavioral change challenges.

This includes the facilitation of transformative digital technology integration, operational agility and flexibility, leadership and culture, and workforce enablement.

Three ways process mining can save your business in a transformation project:

  • You require 100% operational transparency: To chart all your transactions requires complete process transparency. This capability allows the direct comparison of actual operations to the ways that processes were designed to occur. This conformance checking can automatically identify the highest priority issues and tasks, and highlight root causes, so we can take immediate action.
  • You must reduce costs and increase efficiency: Signavio research shows that almost 60% of companies incurred additional charges from suppliers due to process inefficiencies. Process mining can help your business reduce costs because it finds vulnerabilities and deviations, whilst highlighting what is slowing you down, including the bottlenecks and inefficiencies hampering revenue. Process mining beefs up operational health via process improvements and pre-emptive strategies.
  • You must optimize the buying and selling cycle: Is shipping taking too long? Which of your suppliers supports you least? Who is outperforming whom? Process mining is your one-click trick to finding these answers and identifying which units are performing best and which are wasting time and money.
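The conformance checking mentioned in the first point above can be sketched as a comparison between recorded cases and the designed path. The designed path, the event log and the function name are hypothetical, and strict whole-trace comparison is a deliberate simplification; dedicated tools use more refined techniques that also locate where a case deviates.

```python
# Hypothetical designed ("to-be") path of an order process.
DESIGNED_PATH = ("receive order", "check credit", "ship goods", "send invoice")

# Hypothetical event log: case id -> activities actually recorded ("as-is").
EVENT_LOG = {
    "case-1": ("receive order", "check credit", "ship goods", "send invoice"),
    "case-2": ("receive order", "ship goods", "send invoice"),
    "case-3": ("receive order", "check credit", "ship goods", "send invoice"),
    "case-4": ("receive order", "check credit", "check credit", "ship goods", "send invoice"),
}


def conformance_report(event_log, designed_path):
    """Return the share of conforming cases and the cases that deviate."""
    deviations = {
        case: trace for case, trace in event_log.items() if trace != designed_path
    }
    total = len(event_log)
    conforming_share = (total - len(deviations)) / total if total else 0.0
    return conforming_share, deviations


share, deviating_cases = conformance_report(EVENT_LOG, DESIGNED_PATH)
```

In this toy log, one case skips the credit check and another repeats it, which are the kinds of deviations a team would then prioritize and investigate.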

Process mining and robotic process automation (RPA)

The beneficial fusion of both technologies

Robotic process automation (RPA) provides a virtual workforce to automate manual, repetitive, and error-prone tasks. However, successful process automation requires exact knowledge about the intended (and potential) benefits, effective training of the robots, and continuous monitoring of their performance. With this, process mining supports organizations throughout the lifecycle of RPA initiatives by monitoring and benchmarking robots to ensure sustainable benefits.

Upgrade robot-led automation

These insights are especially valuable for process miners and managers with a particular interest in process automation. To further upgrade the impact of robot-led automation, there is also a need for a solid understanding of legacy systems, and an overview of automation opportunities. Process mining tools provide critical insights throughout the entire RPA journey, from defining the strategy to continuous improvement and innovation.

 Three ways process mining can save your business in an RPA lifecycle project

  • You require process overviews, based on specific criteria: Providing a complete overview of end-to-end processes involves identifying high-ROI processes suitable for RPA implementation. This, in turn, helps determine the best-case process flow/path, enabling you to spot redundant processes, which you may not be aware of, before automating.
  • You are unsure how best to optimize human-digital worker cycles: By mining the optimal process flow/path, we can better discover inefficient human-robot hand-off, providing quantifiable data on the financial impact of any digital worker or process. This way, we can compare human vs. digital labor in terms of accuracy, efficiency, cost, and project duration.
  • You need to understand better how RPA supports legacy processes and systems: RPA enables enterprises to keep legacy systems by integrating them with cloud and web/app-based services, making it possible to connect legacy functions with modern tools, applications, and even mobile apps. Efficiency and effectiveness will be improved across crucial departments, including HR, finance, and legal.

Process mining for improved customer experience and mapping

Reconfigure customer delight

The integration of process mining with other technologies is also essential in growing the process excellence and management market. With process management, we already talk about customer engagement, which empowers companies to shift away from lopsided efficiency goals, which often frustrate customers, towards all-inclusive effectiveness goals, built around delighting customers at the lowest organizational cost possible.

Further, the application of process mining within customer journey mapping (CJM)—especially when linked to the underlying processes—offers the bundled capability of better business understanding and outside-in customer perspective, connected to the processes that deliver them. So, by connecting process mining with a customer-centric view across producing, marketing, selling, and providing products and services, customer delight becomes a strategic catalyst for success.

Unlock the full potential of process

Trigger process mining initiatives with Signavio Process Intelligence, and see how it can help your organization uncover the hidden value of process, generate fresh ideas, and save time and money. Discover more in our white paper, Managing Successful Process Mining Initiatives with Signavio Process Intelligence.

5 Things You Should Know About Data Mining

The majority of people spend about twenty-four hours online every week. In that time they give out enough information for big data to know a lot about them. Having people collecting and compiling your data might seem scary but it might have been helpful for you in the past.

 

If you have ever been surprised to find an ad targeted toward something you were talking about earlier or an invention made based on something you were googling, then you already know that data mining can be helpful. Advanced education in data mining can be an awesome resource, so it may pay to have a personal tutor skilled in the area to help you understand. 

 

It is understandable to be unsure of a system that collects information online so that it can learn more about you. Luckily, so much data is put out every day that it is unlikely data mining is focusing on any of your important information. Here are a few facts you should know about data mining.

 

1. Data Mining Is Used In Crime Scenes

Using a variation of earthquake prediction software and data, the Los Angeles police department and researchers were able to predict crime within five hundred feet. As they learn how to compile and understand more data patterns, crime detecting will become more accurate.

 

Using their data, the Los Angeles police department was able to reduce theft by thirty-three percent. They were also able to reduce violent crime by about twenty-one percent. Those are not perfect numbers, but they are better than before and will get even more impressive as time goes on.

 

The fact that data mining is able to pick up on crime statistics and compile all of that data to give an accurate picture of where crime is likely to occur is amazing. It gives a place to look and is able to help stop crime as it starts.

 

2. Data Mining Helps With Sales

A great story about data mining in sales is the example of Walmart putting beer near the diapers. The story claims that through measuring statistics and mining data it was found that when men purchase diapers they are also likely to buy a pack of beer. Walmart collected that data and put it to good use by putting the beer next to the diapers.

 

The amount of truth in that story/example is debatable, but it has made data mining popular in most retail stores. Finding which products are often bought together can give insight into where to put products in a store. This practice has increased sales in both items immensely just because people tend to purchase items near one another more than they would if they had to walk to get the second item. 
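Finding products that are often bought together can be sketched with simple co-occurrence counts. The baskets below are invented purely for illustration and do not come from any retailer; the two helpers compute how often a pair appears together and how often the second product is bought given the first (commonly called confidence).

```python
from collections import Counter
from itertools import combinations

# Invented shopping baskets, for illustration only.
BASKETS = [
    {"diapers", "beer", "chips"},
    {"diapers", "beer"},
    {"diapers", "milk"},
    {"beer", "chips"},
    {"milk", "bread"},
]


def pair_counts(baskets):
    """Count in how many baskets each pair of products appears together."""
    counts = Counter()
    for basket in baskets:
        for pair in combinations(sorted(basket), 2):
            counts[pair] += 1
    return counts


def confidence(baskets, antecedent, consequent):
    """Share of baskets containing antecedent that also contain consequent."""
    with_antecedent = [b for b in baskets if antecedent in b]
    if not with_antecedent:
        return 0.0
    return sum(consequent in b for b in with_antecedent) / len(with_antecedent)


together = pair_counts(BASKETS)[("beer", "diapers")]
beer_given_diapers = confidence(BASKETS, "diapers", "beer")
```

Pairs with both a high count and a high confidence are candidates for placing next to each other, which is the kind of decision the story describes.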

 

Putting a lot of stock in the data-gathering teams that big stores build does not always work. There have been plenty of times when data teams failed and sales plummeted. Often, the benefits outweigh the potential failure, however, and many stores now use data mining to make a lot of big decisions about their sales.

 

3. It’s Helping With Predicting Disease 

 

In 2009, Google began work on predicting the winter flu. Google went through the fifty million most searched terms and compared them with what the CDC had found during the 2003-2008 flu seasons. With that information, Google was able to help predict the next winter flu outbreak, even down to the states it hit the hardest.

 

Since 2009, data mining has gotten much better at predicting disease. Since the internet is a newer invention, it is still growing and data mining is still getting better. Hopefully, in the future, we will be able to predict disease outbreaks quickly and accurately.

 

With new data mining techniques and research in the medical field, there is hope that doctors will be able to narrow down problems in the heart. As the information grows and more data is entered the medical field gets closer to solving problems through data. It is something that is going to help cure diseases more quickly and find the root of a problem.

 

4. Some Data Mining Gets Ignored

Interestingly, very little of the data that companies collect from you is actually used. “Big data Companies” do not use about eighty-eight percent of the data they have. It is incredibly difficult to use all of the millions of bits of data that go through big data companies every day.

 

The more people that are used for data mining and the more data companies are actually able to filter through, the better the online experience will be. It might be a bit frightening to think of someone going through what you are doing online, but no one is touching any of the information that you keep private. Big data is using the information you put out into the world and using that data to come to conclusions and make the world a better place.

 

There is so much information being put onto the internet at all times. Twenty-four hours a week is the average amount of time a single person spends on the internet, but there are plenty of people who spend more time than that. All of that information takes a lot of people to sift through and there are not enough people in the data mining industry to currently actually go through the majority of the data being put online.

 

5. Too Many Data Mining Jobs

Interestingly, the data industry is booming. In general, an amazing number of careers are opening up on the internet every day. The industry is growing so quickly that there are not enough people to fill the jobs that are being created.

 

The lack of talent in the industry means there is plenty of room for new people who want to go into the data mining industry. It was predicted that by 2018 there would be a shortage of 140,000 people with deep analytical skills. Given how often a lack of jobs is discussed, it is amazing that there is such a shortage in the data industry.

 

If big data is only able to wade through less than half of the data being collected then we are wasting a resource. The more people who go into an analytics or computer career the more information we will be able to collect and utilize. There are currently more jobs than there are people in the data mining field and that needs to be corrected.

 

To Conclude

The data mining industry is making great strides. Big data is trying to use the information they collect to sell more things to you but also to improve the world. Also, there is something very convenient about your computer knowing the type of things you want to buy and showing you them immediately. 

 

Data mining has been able to help predict crime in Los Angeles and lower crime rates. It has also helped companies know what items are commonly purchased together so that stores can be organized more efficiently. Data mining has even been able to predict the outbreak of disease down to the state.

 

Even with so much data being ignored and so many jobs left empty, data mining is doing incredible things. The entire internet is constantly growing and data mining is growing right along with it. As the data mining industry climbs and more people build their careers mining data, the more we will learn and the more facts we will find.