What's it all about

Decision Trees, Regression Trees, Classification Trees, Boosted Trees, Rotation Forest, Random Forest... To help you see the machine learning forest for the trees, here is an overview of the most important terms:

Data

Data can be divided into structured and unstructured data. Structured data is, for example, sensor data stored as text files or sales figures from a data warehouse. Unstructured data is, for example, images or free text. Both types of data can be analyzed with supervised or unsupervised machine learning methods.

Machine Learning

Machine learning (ML) draws on statistics, computer science, and artificial intelligence to extract knowledge from data, and it has led to breakthrough results in recent years. The use of ML techniques to find patterns in large amounts of data is also known as data mining. Classical coding typically translates a conceptual model into code that converts inputs into outputs. ML changes this approach: the machine "learns" the model itself by comparing input data to output data. "Machine learning is the science of getting computers to act without being explicitly programmed." The advantage of this approach is that the system can often handle even new inputs correctly without anyone having to code explicit rules first.

Supervised Machine Learning

The idea behind supervised learning is that correct answers (labels) already exist for data from the past. For example, if you have historical sales figures, you can extrapolate them into the future using a simple linear regression. If instead you want to classify images from your production process as scrap or yield, you can use logistic regression. To classify a risk into more than two classes, use multinomial logistic regression. Product recommendations for customers are another frequently used example, in this case using matrix factorization.
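
As a rough illustration of the sales extrapolation mentioned above, the following sketch fits a simple linear regression by ordinary least squares and forecasts the next month. The monthly figures and the helper name `fit_line` are made up for this example; real sales data would usually also need checks for trend changes and seasonality.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance and variance terms (both unnormalized; the factor cancels).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical past sales: the "correct answers" the model learns from.
months = [1, 2, 3, 4, 5, 6]
sales = [102, 108, 121, 129, 142, 148]

slope, intercept = fit_line(months, sales)

# Extrapolate one month beyond the observed data.
forecast = intercept + slope * 7
print(f"Forecast for month 7: {forecast:.1f}")
```

The labeled history (month, sales) is what makes this supervised: the line is chosen so that its predictions match the known answers as closely as possible.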

Unsupervised Machine Learning

Unsupervised learning finds patterns by clustering your multidimensional data or by isolating unusual data points. Unlike supervised learning, it does not require your data to be labeled before you can gain insights from it.
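
To make the clustering idea more concrete, here is a minimal k-means sketch in plain Python. The points, the function name `k_means`, and the naive initialization (the first k points serve as starting centroids) are assumptions for illustration only; production code would typically rely on a tested library implementation with better initialization.

```python
from math import dist


def k_means(points, k, iterations=10):
    """Very small k-means: returns (labels, centroids)."""
    # Naive initialization (assumption): use the first k points.
    centroids = list(points[:k])
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(k), key=lambda c: dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = tuple(
                    sum(axis) / len(members) for axis in zip(*members)
                )
    return labels, centroids


# Hypothetical unlabeled 2-D measurements forming two groups.
points = [(1.0, 1.0), (8.0, 8.0), (1.2, 0.8), (7.8, 8.2), (0.9, 1.1), (8.1, 7.9)]
labels, centroids = k_means(points, k=2)
print(labels)
print(centroids)
```

No labels are passed in: the grouping is derived purely from the distances between the points.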

Transfer Learning: standing on the shoulders of giants

Transfer learning involves the use of pre-trained models, which can significantly reduce the effort, time, and cost of training. This can be a fruitful approach for computer vision projects: instead of starting from scratch, you can build on models that were successfully trained on similar images. It is also very useful when only a limited number of training images is available.

Data Driven Company

The opportunities in the field of machine learning are many and the pace of development is rapid. Whether your organization is already staffed with analysts, data engineers, data scientists, chief data officers, and statisticians, or you're just starting out: Take advantage of the opportunity offered by the "sexiest job of the 21st century" (Harvard Business Review: Data Scientist) because:

What the future might bring us

“By 2020, some 50 billion smart devices will be connected, along with additional billions of smart sensors, ensuring that the global supply of data will continue to more than double every two years” (McKinsey Quarterly: Straight Talk About Big Data). The real surprise about this McKinsey survey, however, is that today it is estimated that only about 1% of all this data is analyzed at all. Changing that is our mission.

Our offer to you

If you are looking for assistance in discovering, interpreting, and communicating meaningful patterns in your data, we look forward to hearing from you. We can help you gain new insights from your data by uncovering unknown trends, seasonality, or patterns. We offer analysis as a service as well as training.

Below you'll find successful seeds for data analysis. Click on the "Data Science Petri dish" for more details:

Tutorials

  • Shopping cart analysis with FP-Growth by Spark: Learn how to analyze shopping carts in terms of sales items per sales transaction ID (click here to read on Medium).

  • Recommender Systems: Collaborative Filtering between article and customer: Sparsity, Similarity and Implicit Binary Collaborative Filtering explained step by step with Python code (click here to read on Medium).
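
As a small taste of the implicit binary collaborative filtering idea named in the second tutorial, the sketch below computes item-to-item cosine similarity from bought / not bought data and recommends items a customer does not own yet. It is not the tutorial's code: the customers, items, and the function names `item_similarity` and `recommend` are hypothetical.

```python
from math import sqrt

# Hypothetical implicit feedback: which customer bought which items.
purchases = {
    "alice": {"bread", "butter", "jam"},
    "bob": {"bread", "butter"},
    "carol": {"bread", "jam"},
    "dave": {"coffee", "milk"},
}


def item_similarity(purchases):
    """Cosine similarity between items based on binary purchase vectors."""
    # Invert the data: item -> set of customers who bought it.
    buyers = {}
    for customer, items in purchases.items():
        for item in items:
            buyers.setdefault(item, set()).add(customer)
    similarity = {}
    for a in buyers:
        for b in buyers:
            if a != b:
                # For 0/1 vectors the dot product is the number of shared buyers.
                overlap = len(buyers[a] & buyers[b])
                similarity[(a, b)] = overlap / sqrt(len(buyers[a]) * len(buyers[b]))
    return similarity


def recommend(customer, purchases, top_n=2):
    """Rank items the customer has not bought by summed similarity."""
    similarity = item_similarity(purchases)
    owned = purchases[customer]
    scores = {}
    for (a, b), value in similarity.items():
        if a in owned and b not in owned and value > 0:
            scores[b] = scores.get(b, 0.0) + value
    # Highest score first; ties are broken alphabetically.
    return sorted(scores, key=lambda item: (-scores[item], item))[:top_n]


print(recommend("bob", purchases))
```

With real shopping data the customer-item matrix is very sparse, so most item pairs share no buyers and are skipped here by the `value > 0` check.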

You can find an overview of all our other articles here.

Data Driven Dealings Development

The Data Driven Dealings Development book (click on this link to read an extract on Amazon) is for anyone who wants to analyze sales data and shopping carts and create per-customer product recommendations using Python. The book covers both the theoretical aspects of analytics and the practical part of coding, including complete code and data. Both unsupervised and supervised machine learning techniques are applied using the pandas, scikit-learn, TensorFlow, and Turi Create stacks (among others).

Statistical Process Control

If you need to deliver consistently high quality at a high number of cycles and a high level of automation, statistical process control (SPC) can help you do this (Jesko Rehberg for Additive Academy: Quality Control Charts for stable processes in the food industry). Quality control charts and process capability analyses help you keep your process under control and alert you in time when systematic changes occur in your process.
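
The basic mechanics of a quality control chart can be sketched in a few lines. The example below derives three-sigma control limits from a reference sample and flags new measurements outside them. The measurements and the function names are invented, and estimating sigma from the plain sample standard deviation is a simplification: many control charts for individual values estimate it from the moving range instead.

```python
from statistics import mean, stdev


def control_limits(reference):
    """Return (lower limit, center line, upper limit) at +/- 3 sigma."""
    center = mean(reference)
    # Simplification (assumption): sigma from the sample standard deviation.
    sigma = stdev(reference)
    return center - 3 * sigma, center, center + 3 * sigma


def out_of_control(values, lcl, ucl):
    """Indices of measurements outside the control limits."""
    return [i for i, v in enumerate(values) if v < lcl or v > ucl]


# Hypothetical measurements from a period when the process was stable.
reference = [10.1, 9.9, 10.0, 10.2, 9.8, 10.0, 10.1, 9.9]
lcl, center, ucl = control_limits(reference)

# New measurements to monitor against those limits.
new_values = [10.0, 10.1, 11.5, 9.9]
flagged = out_of_control(new_values, lcl, ucl)

print(f"LCL={lcl:.3f}, center={center:.3f}, UCL={ucl:.3f}")
print("Out-of-control indices:", flagged)
```

A point outside the limits is a signal to look for a systematic cause, not proof of one; practical SPC usually adds further run rules on top of this check.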

Let's get in touch

DAR combines a Python professor physicist and a closely cutting controller to drive your business analytics. We look forward to meeting you.

Our point of contact

Benefit from our data products. We love making Data Science easily accessible for you.