When starting an AI project, there’s a laundry list of machine learning models to choose from: linear regression, decision trees, SVMs, Naive Bayes, kNN, k-Means, Random Forest, and more.
Each model has a unique set of strengths and weaknesses, and is best suited for different applications. At Obviously.AI, we’re building out a feature that makes it easier than ever to select the right model for your application.
Picking an AI Model
Obviously.AI is a no-code AI tool that lets you build machine learning models in clicks — no technical expertise needed.
Soon, you’ll be able to select a specific AI algorithm to use on your data. Regardless of whether you’re using Obviously.AI for your AI needs, here are some of the strengths and weaknesses of popular models.
Decision Trees
Decision trees are a great option if you need a highly explainable, simple model.
While they tend to be less accurate than neural networks, they’re much easier to understand, and consist of just three elements:
- Root node, or the very first decision split
- Intermediate nodes, or decision splits after the root node
- Leaf nodes, or the final decisions
Decision trees mirror how we intuitively make decisions, which makes them a format anyone can understand. In a simple example, the question “Am I hungry?” would be the root node, “Have I already had chocolate?” an intermediate node, and “Don’t eat it” and “Eat it!” the leaf nodes.
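Those three elements map naturally onto nested conditionals. Here’s a minimal sketch of the chocolate example as code (the function and argument names are just illustrative):

```python
def chocolate_decision(hungry: bool, already_had_chocolate: bool) -> str:
    # Root node: the very first decision split
    if not hungry:
        return "Don't eat it"  # leaf node
    # Intermediate node: a decision split after the root
    if already_had_chocolate:
        return "Don't eat it"  # leaf node
    return "Eat it!"  # leaf node

print(chocolate_decision(hungry=True, already_had_chocolate=False))  # Eat it!
```

A trained decision tree is just a learned version of this structure: the split questions are chosen from data rather than written by hand.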
Random Forest
Random Forest is decision trees on steroids.
Random forests are an “ensemble” method: they build many decision trees and combine their predictions (averaging for regression, majority vote for classification). In practice, you could have dozens, hundreds, or even thousands of trees, which makes Random Forest less explainable than a single decision tree, but typically more accurate.
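To make the ensemble idea concrete, here’s a toy sketch in plain Python. Each “tree” is a stand-in trained with a slightly different random threshold (real trees would each be fit on a random subsample of the data), and the forest takes a majority vote:

```python
import random
from collections import Counter

def make_tree(seed):
    # Stand-in for a real decision tree: each "tree" sees slightly
    # different training conditions, so individual trees disagree.
    rng = random.Random(seed)
    threshold = rng.uniform(0.4, 0.6)
    return lambda x: "yes" if x > threshold else "no"

def random_forest_predict(trees, x):
    # Ensemble step: every tree votes, the majority wins.
    votes = Counter(tree(x) for tree in trees)
    return votes.most_common(1)[0][0]

forest = [make_tree(seed) for seed in range(101)]
print(random_forest_predict(forest, 0.9))  # well above every threshold: "yes"
print(random_forest_predict(forest, 0.1))  # well below every threshold: "no"
```

The averaging is what buys accuracy: individual trees make uncorrelated mistakes near their thresholds, and the vote washes those mistakes out.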
Naive Bayes
Naive Bayes is a classification method that uses probability theory to make decisions: given the probabilities of certain events, it estimates the probability of a related event.
Naive Bayes is often used for tasks like spam filtering, text classification, sentiment analysis, and recommender engines. In comparison to Random Forest, Naive Bayes is less likely to overfit.
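The probability-theory core is just Bayes’ theorem. A toy spam-filtering example, with made-up numbers for illustration:

```python
# Toy spam filter: P(spam | word appears) via Bayes' theorem.
# All probabilities below are invented for illustration.
p_spam = 0.3                 # prior: 30% of mail is spam
p_word_given_spam = 0.6      # the word "free" appears in 60% of spam
p_word_given_ham = 0.05      # ...but in only 5% of legitimate mail

# Total probability of seeing the word at all
p_word = p_word_given_spam * p_spam + p_word_given_ham * (1 - p_spam)

# Bayes' theorem: P(spam | word) = P(word | spam) * P(spam) / P(word)
p_spam_given_word = p_word_given_spam * p_spam / p_word
print(round(p_spam_given_word, 3))  # 0.837
```

The “naive” part is that, with many words, the model multiplies such per-word probabilities together as if the words were independent of each other; that independence assumption is rarely true, but it keeps the model simple and hard to overfit.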
Multi-Layered Neural Networks
Multi-layered neural networks are at the heart of state-of-the-art AI. The most common phrase you’ll hear is “deep learning,” which refers to neural networks with many layers.
Essentially, these are compound mathematical functions that make predictions by finding parameters that minimize error. When people talk about “black box AI,” they’re usually referring to deep learning, because we can’t intuitively follow how these models arrive at their predictions, in contrast to something like a decision tree.
However, neural networks are powerful precisely because of their complexity, and they have an extremely wide range of use cases. They’re generally a lot more accurate than simpler models, especially when trained on big data.
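To see what “compound mathematical function” means, here’s a tiny two-layer network in plain Python that computes XOR, something no single linear model can do. The weights are hand-picked for illustration; training would find values like these by minimizing error:

```python
import math

def sigmoid(z):
    # Squashes any number into the range (0, 1)
    return 1 / (1 + math.exp(-z))

def forward(x, w1, b1, w2, b2):
    # Layer 1: weighted sums of the inputs, passed through a nonlinearity
    hidden = [sigmoid(sum(wi * xi for wi, xi in zip(row, x)) + b)
              for row, b in zip(w1, b1)]
    # Layer 2: weighted sum of the hidden activations -> final prediction
    return sigmoid(sum(wi * hi for wi, hi in zip(w2, hidden)) + b2)

# Hand-picked weights: hidden unit 1 acts like OR, hidden unit 2 like NAND,
# and the output unit ANDs them together, which yields XOR.
w1 = [[10.0, 10.0], [-10.0, -10.0]]
b1 = [-5.0, 15.0]
w2 = [10.0, 10.0]
b2 = -15.0

outputs = [round(forward(x, w1, b1, w2, b2), 3)
           for x in ([0, 0], [0, 1], [1, 0], [1, 1])]
print(outputs)  # approximately [0, 1, 1, 0]
```

Stacking more layers just composes more of these functions, which is where both the power and the opacity come from.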
Perceptrons
A perceptron is the simplest form of a neural network. Unlike “deep learning,” which stacks many hidden layers, a perceptron has no hidden layers at all: the inputs connect directly to a single output unit.
They’re far less commonly used nowadays, and are naturally less accurate, but more explainable, than multi-layered neural networks, making them suitable for very small datasets.
k-Nearest Neighbors
As the name implies, the k-Nearest Neighbors (kNN) algorithm works by assuming that a given data point is similar to the nearest K points, where K is a value you choose (a common rule of thumb is the square root of the number of data points).
In the classic illustration, the goal is to classify a new point (a green circle) as one of two classes, represented by blue squares and red triangles.
If K is 3, then the 3 “nearest neighbors” are two red triangles and one blue square, so it’ll be classified as a red triangle.
If K is 5, then the 5 “nearest neighbors” are three blue squares and two red triangles, so it’ll be classified as a blue square.
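kNN is simple enough to sketch from scratch. The toy layout below is chosen so the answer flips with K, just like in the example:

```python
from collections import Counter

def knn_classify(points, query, k):
    # points: list of ((x, y), label). Sort by squared distance to the
    # query, keep the k nearest, and let them vote.
    nearest = sorted(points, key=lambda p: (p[0][0] - query[0]) ** 2
                                         + (p[0][1] - query[1]) ** 2)[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

# Toy layout: nearest three neighbors are 2 triangles + 1 square,
# nearest five are 2 triangles + 3 squares.
points = [((1.0, 0.0), "triangle"), ((0.0, 1.2), "triangle"),
          ((1.5, 0.0), "square"), ((0.0, 2.0), "square"),
          ((-2.1, 0.0), "square"), ((3.0, 0.0), "triangle")]

print(knn_classify(points, (0, 0), k=3))  # "triangle"
print(knn_classify(points, (0, 0), k=5))  # "square"
```

This sensitivity to K is why it’s worth trying several values (or using the square-root heuristic as a starting point) rather than picking one arbitrarily.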
k-Means
k-Means, despite its similar-sounding name, is a completely different algorithm from kNN.
kNN is used for labelled data, which makes it a “supervised learning” problem, while k-Means is used for unlabelled data, making it an “unsupervised learning” problem.
While in kNN, “k” refers to the number of nearest neighbors, the “k” in k-Means refers to the number of clusters.
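Here’s a minimal sketch of the standard k-Means loop (Lloyd’s algorithm) on one-dimensional data with k = 2; the data and starting centroids are illustrative:

```python
def k_means_1d(data, centroids, iterations=10):
    # Lloyd's algorithm: repeatedly (1) assign each point to its nearest
    # centroid, then (2) move each centroid to the mean of its cluster.
    for _ in range(iterations):
        clusters = {i: [] for i in range(len(centroids))}
        for x in data:
            nearest = min(range(len(centroids)),
                          key=lambda i: abs(x - centroids[i]))
            clusters[nearest].append(x)
        centroids = [sum(c) / len(c) if c else centroids[i]
                     for i, c in clusters.items()]
    return centroids

# Two obvious clumps of unlabelled data; k = 2 recovers their centers.
data = [1.0, 1.2, 0.8, 10.0, 10.2, 9.8]
print(k_means_1d(data, centroids=[0.0, 5.0]))  # roughly [1.0, 10.0]
```

Note that no labels are involved anywhere: the algorithm discovers the two groups on its own, which is what makes it unsupervised.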
Logistic Regression
Logistic regression is similar to the linear regression we all learned in grade school (y = mx + b), except it’s used when the dependent variable, the thing you’re predicting, is a class like “yes” or “no” instead of a number.
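The only new ingredient on top of y = mx + b is the sigmoid function, which squashes the linear score into a probability between 0 and 1. A sketch with illustrative coefficients:

```python
import math

def predict_proba(x, m, b):
    # Same linear score as y = mx + b, squashed into a probability
    return 1 / (1 + math.exp(-(m * x + b)))

# Illustrative model: the probability of "yes" rises with x.
m, b = 1.5, -3.0
print(round(predict_proba(2.0, m, b), 2))  # score 0 -> probability 0.5
print(predict_proba(5.0, m, b) > 0.9)      # large positive score -> near 1
```

To get a hard “yes”/“no” answer, you simply threshold the probability, typically at 0.5.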
Gradient Boosting
Gradient Boosting is very similar to Random Forest, in that it’s an ensemble method built from many decision trees.
The difference is that Gradient Boosting builds its trees sequentially rather than independently, with each new tree trained to correct the errors of the trees before it. This can make it more accurate than Random Forest, but it’s also more sensitive to overfitting, slower to train (because trees are built one at a time), and harder to tune.
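A from-scratch sketch of the idea using depth-1 trees (“stumps”): each new stump is fit to the residual errors left by everything built so far, so the ensemble’s error shrinks round by round. The data and settings here are illustrative:

```python
def fit_stump(xs, residuals):
    # A depth-1 "tree": the single split that minimizes squared error.
    best = None
    for split in xs:
        left = [r for x, r in zip(xs, residuals) if x <= split]
        right = [r for x, r in zip(xs, residuals) if x > split]
        if not left or not right:
            continue
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        sse = (sum((r - lm) ** 2 for r in left)
               + sum((r - rm) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, split, lm, rm)
    _, split, lm, rm = best
    return lambda x: lm if x <= split else rm

def gradient_boost(xs, ys, rounds=20, lr=0.3):
    base = sum(ys) / len(ys)  # start from the mean prediction
    stumps = []
    predict = lambda x: base + sum(lr * s(x) for s in stumps)
    for _ in range(rounds):
        # Each new tree is trained on what the ensemble still gets wrong.
        residuals = [y - predict(x) for x, y in zip(xs, ys)]
        stumps.append(fit_stump(xs, residuals))
    return predict

xs = [1, 2, 3, 4, 5, 6]
ys = [1.0, 1.0, 1.0, 5.0, 5.0, 5.0]
model = gradient_boost(xs, ys)
print([round(model(x), 2) for x in xs])  # very close to ys after 20 rounds
```

The learning rate (`lr`) shrinks each tree’s contribution; smaller values need more rounds but overfit less, which is one of the extra knobs that makes boosting harder to tune than Random Forest.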
Elastic Net
Elastic Net is an alternative to least squares regression that tries to discard less important variables and reduce overfitting, by combining an L1 (lasso) penalty with an L2 (ridge) penalty.
As with other “alternative” solutions, Elastic Net is more complex, and thus less explainable, than the standard approach.
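Concretely, the “discarding” comes from the L1 penalty, which pushes unimportant coefficients to exactly zero, while the L2 penalty shrinks large coefficients to curb overfitting. A sketch of the combined penalty term, using scikit-learn’s parameterization as a reference (`alpha` and `l1_ratio` are the knobs you tune):

```python
def elastic_net_penalty(weights, alpha=1.0, l1_ratio=0.5):
    # Penalty added to the least-squares loss:
    #   alpha * (l1_ratio * sum(|w|) + (1 - l1_ratio) / 2 * sum(w**2))
    # l1_ratio=1 is pure lasso (L1), l1_ratio=0 is pure ridge (L2).
    l1 = sum(abs(w) for w in weights)
    l2 = sum(w * w for w in weights)
    return alpha * (l1_ratio * l1 + (1 - l1_ratio) / 2 * l2)

# Two weight vectors with the same total magnitude: the L1 part treats
# them equally, but the L2 part penalizes the concentrated one more.
print(elastic_net_penalty([2.0, 0.0, 0.0]))  # 2.0
print(elastic_net_penalty([1.0, 0.5, 0.5]))  # 1.375
```

During training, minimizing the loss plus this penalty is what drives some coefficients all the way to zero, effectively removing those variables from the model.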
Selecting the right machine learning model can be tricky, but it ultimately comes down to your goals.
If you have a lot of data and need a highly accurate model, but don’t care as much about explainability, then multi-layered neural networks may be your best bet.
If explainability is a priority, then decision trees are the way to go. Random Forest and Gradient Boosting fall somewhere in the middle.
Ultimately, however, selecting the right model isn’t as big a concern as it used to be. Trying out a new approach used to be a large engineering effort; now it can be done with the click of a button.