Home - Data Science Blog

What Actually Is Machine Learning? Article Series

June 6, 2017/in Artificial Intelligence, Business Analytics, Business Intelligence, Data Mining, Data Science, Deep Learning, Machine Learning, Main Category/by Benjamin Aunkofer

Machine learning is both technology and myth. What follows is an attempt at an accessible explanation, spread across the following articles: Unsupervised vs. supervised learning; Regression vs. classification; Parametric vs. non-parametric learning methods; Online vs. offline learning […]

Is Artificial Intelligence Only Good for “Painting by Numbers”?

May 17, 2017/in Artificial Intelligence, Big Data, Business Analytics, Data Mining, Data Science, Data Science News, Deep Learning, Machine Learning, Predictive Analytics, Projectmanagement/by Conny Dethloff

In this post I want to set out the limits imposed on us in complex environments in the context of open-loop and closed-loop control. On that basis, I then aim to draw a distinction with respect to […]

Unsupervised Learning in R: K-Means Clustering

May 15, 2017/in Data Mining, Data Science, Machine Learning, R Statistics, Statistics, Tutorial/by Markus Lang

Cluster analysis is a grouping method that assigns objects to groups, so-called clusters. The objects assigned to a cluster should be as homogeneous as possible, whereas the objects assigned to different clusters […]

On Our Own Behalf: The Data Leader Day 2017

May 11, 2017/in Data Science News, Use Cases/by events

The Data Science Blog is co-organizer of the Data Leader Day 2017. The Data Leader Day on November 9, 2017 is an event for companies from the German-speaking region that deals with the opportunities and solutions […]

Artificial Intelligence and Data Science in the Automotive Industry

May 6, 2017/in Artificial Intelligence, Big Data, Business Analytics, Business Intelligence, Cloud, Connected Car, Data Science, Machine Learning, Main Category, Manufacturing, Projectmanagement, Use Case, Use Cases/by Volkswagen

Data science and machine learning are the key technologies for the self-learning, self-optimizing processes and products that the automotive industry of the future will use. This article defines the terms “data science” (also referred to as “data analytics”) and “machine learning” and explains how they are related.

Entropy – And Other Measures of Impurity in Data

May 2, 2017/in Artificial Intelligence, Business Analytics, Data Mining, Data Science, Data Science Hack, Machine Learning, Python/by Benjamin Aunkofer

This article is part 1 of 4 of the article series Machine Learning with Decision Tree Methods. Hierarchical classification models, which include the decision tree method, split a data set iteratively or recursively with the […]

What makes a good Data Scientist? Answered by leading Data Officers!

April 24, 2017/in Career, Education / Certification, General, Interviews, Main Category/by Benjamin Aunkofer

What makes a good Data Scientist? It is a question I have been asked a lot recently, by data science newbies as well as long-established CIOs, and my answer is probably not what […]

Consider Anonymization – Process Mining Rule 3 of 4

April 19, 2017/in Audit Analytics, Big Data, Business Analytics, Business Intelligence, Cloud, Data Migration, Data Mining, Data Science, Data Security, Data Warehousing, Main Category, Process Mining/by Anne Rozinat & Christian W. Günther

This is article no. 3 of the four-part article series Privacy, Security and Ethics in Process Mining. Read this article in German: “Datenschutz, Sicherheit und Ethik beim Process Mining – Regel […]

Interview with Prof. Dr. Kai Uwe Barthel on Data Science with Deep Learning

April 2, 2017/in Artificial Intelligence, Big Data, Data Mining, Data Science, Deep Learning, Experience, Interview with CIO, Interviews, Machine Learning, Main Category/by Benjamin Aunkofer

Interview with Prof. Dr. Barthel, Chief Visionary Officer of Pixolution GmbH in Berlin, on how artificial neural networks work, where they are used, and how to get started with them. Prof. Kai Barthel is founder and CVO of Pixolution […]

An Eye for the Essentials: Feature Selection

March 30, 2017/in Big Data, Business Analytics, Data Mining, Data Science, Data Science Hack, Machine Learning, Predictive Analytics, Python, Tool Introduction, Tutorial/by Christoph Gresch

In many knowledge bases, data records are described by very large feature spaces. While a knowledge base is being generated, an attempt is made to capture every possible feature in order to describe a data record as precisely as possible. In doing so, […]

All/Apache Spark/Data Science Hack/Hadoop/Java/JavaScript/Neo4J/Octave/optimization/Python/R Statistics/Scala/SQL/Tools/Tutorial

Continuous Integration and Continuous Delivery (CI/CD) for Data Pipelines

October 1, 2024

The Crucial Intersection of Generative AI and Data Quality: Ensuring Reliable Insights

Continuous Integration and Continuous Delivery (CI/CD) for Data Pipelines

August 22, 2024

Looking Ahead: The Future of Data Preparation for Generative AI

September 20, 2023

Why using Infrastructure as Code for developing Cloud-based Data Warehouse Systems?

Lambda Architecture vs Kappa Architecture for Big Data Cloud Platforms? Let us discuss which architecture suits best for what use cases.

June 27, 2023

Big Data – Lambda or Kappa Architecture?

November 10, 2022

Graph Database Neo4j 5 Release Published

Google Cloud run with Infrastructure by Code using Terraform

November 8, 2022

Google Cloud Run – Tutorial

October 6, 2022

Control the visibility of the PowerBI visuals based on condition

July 4, 2022

5 Apache Spark Best Practices

December 6, 2021

process.science presents a new release

August 30, 2021

Process Mining with Fluxicon Disco – Article Series

August 3, 2016

What Actually Is Apache Spark?

November 3, 2015

Estimating Pi with Apache Spark

June 1, 2016

A Hadoop Architecture with Enterprise-Level Security

May 20, 2016

A Hadoop Architecture with Enterprise-Level Security

August 22, 2019

How Does Machine Learning Fit into a Modern Data & Analytics Architecture?

January 7, 2019

On the Integration of Symbolic Inference into Deep Neural Networks

March 26, 2018

Distributed Computing – MapReduce Algorithm

November 14, 2017

Big Data Essentials – Intro

October 25, 2017

Aika: A Semantic Neural Network

August 18, 2015

Extracting Software Metrics from Java Files with ANTLR4

May 17, 2016

Acting in Networks without the Enmesh Effect

November 10, 2022

Graph Database Neo4j 5 Release Published

February 29, 2020

Introduction to Recommendation Engines

February 18, 2016

Data Science with Neo4j and R

April 12, 2016

ANN: Backward Pass

January 13, 2019

Training a Neuron with Gradient Descent

September 15, 2016

The Rastrigin Function

April 26, 2016

Machine Learning with Python – Minimal Example

November 3, 2015

Estimating Pi with Apache Spark

October 30, 2015

How Do Machines Learn?

October 13, 2015

How Do Machines Learn?

July 12, 2021

Coffee Shop Location Predictor

June 6, 2021

Rethinking linear algebra part two: ellipsoids in data science

May 10, 2021

How to make a toy English-German translator with multi-head attention heat maps: the overall architecture of Transformer

April 22, 2021

Positional encoding, residual connections, padding masks: covering the rest of Transformer components

April 11, 2021

Bag of Words: Convert text into vectors

April 7, 2021

Multi-head attention mechanism: “queries”, “keys”, and “values,” over and over again

January 27, 2021

On the difficulty of language: prerequisites for NLP with deep learning

December 4, 2020

Top 10 Python Libraries Of All Time

April 28, 2020

Article series: 5 Clean Coding Tips – 5.Put yourself in somebody else’s shoes

April 15, 2020

Article series: 5 Clean Coding Tips – 4. Stop commenting the obvious

Support Vector Machines for Text Recognition

March 20, 2021

Hand Written Alphabet recognition Using Support Vector Machine

November 18, 2020

Web Scraping Using R..!

May 19, 2020

Introduction to and Deep Dive into R Statistics with the Dortmund R Courses!

February 18, 2020

Python vs R: Which Language to Choose for Deep Learning?

February 4, 2020

Multi-touch attribution: A data-driven approach

January 13, 2020

Getting started with the top eCommerce use cases

February 21, 2019

A common trap when it comes to sampling from a population that intrinsically includes outliers

January 17, 2019

Cross-industry standard process for data mining

December 10, 2018

Fuzzy Matching with the Jaro-Winkler Score for Evaluating Brand Awareness and Ad Recall

August 13, 2018

How To Remotely Send R and Python Execution to SQL Server from Jupyter Notebooks

February 6, 2016

Neural Nets: Time Series Prediction

November 3, 2015

Estimating Pi with Apache Spark

August 22, 2019

How Does Machine Learning Fit into a Modern Data & Analytics Architecture?

June 24, 2019

Creating and Using a Geodatabase

August 13, 2018

How To Remotely Send R and Python Execution to SQL Server from Jupyter Notebooks

June 20, 2018

Bringing intelligence to where data lives: Python & R embedded in T-SQL

Continuous Integration and Continuous Delivery (CI/CD) for Data Pipelines

October 1, 2024

The Crucial Intersection of Generative AI and Data Quality: Ensuring Reliable Insights

Continuous Integration and Continuous Delivery (CI/CD) for Data Pipelines

August 22, 2024

Looking Ahead: The Future of Data Preparation for Generative AI

July 4, 2022

5 Apache Spark Best Practices

December 6, 2021

process.science presents a new release

August 30, 2021

Process Mining with Fluxicon Disco – Article Series

May 10, 2021

How to make a toy English-German translator with multi-head attention heat maps: the overall architecture of Transformer

September 15, 2020

Process Mining with PAFnow – Article Series

April 9, 2020

A Look at the Stock Markets in Light of COVID-19

December 23, 2019

Article Series: BI Tools Compared – Power BI by Microsoft

December 6, 2019

Article Series: BI Tools Compared – Data Basis

September 20, 2023

Why using Infrastructure as Code for developing Cloud-based Data Warehouse Systems?

July 12, 2021

Coffee Shop Location Predictor

June 6, 2021

Rethinking linear algebra part two: ellipsoids in data science

May 10, 2021

How to make a toy English-German translator with multi-head attention heat maps: the overall architecture of Transformer

April 7, 2021

Multi-head attention mechanism: “queries”, “keys”, and “values,” over and over again

November 18, 2020

Web Scraping Using R..!

September 9, 2020

Test-data management support in Test Automation Development

April 28, 2020

Article series: 5 Clean Coding Tips – 5.Put yourself in somebody else’s shoes

April 15, 2020

Article series: 5 Clean Coding Tips – 4. Stop commenting the obvious

April 6, 2020

Article series: 5 Clean Coding Tips – 3. Take Advantage of the Formatting Tools.