Euclidean distance is the length of the straight line segment connecting two points. It’s commonly used in machine learning algorithms. Learn how to calculate it in Python.
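As a quick illustration of the calculation, here is a minimal sketch using only the standard library. The function name `euclidean_distance` and the sample points are assumptions made for this example, not something taken from the article.

```python
import math

def euclidean_distance(p, q):
    """Return the straight-line distance between two points of equal dimension."""
    if len(p) != len(q):
        raise ValueError("points must have the same number of coordinates")
    # Square root of the sum of squared coordinate differences.
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

# A 3-4-5 right triangle: the hypotenuse is the distance between the points.
print(euclidean_distance((0, 0), (3, 4)))  # 5.0

# math.dist (Python 3.8+) computes the same quantity.
print(math.dist((0, 0), (3, 4)))  # 5.0
```

The hand-written version makes the formula visible, while `math.dist` is the shorter option when Python 3.8 or later is available.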
Density-based spatial clustering of applications with noise (DBSCAN) is a clustering algorithm used to define clusters in a data set and identify outliers. Here’s how it works.
A Gaussian mixture model is a soft clustering machine learning method used to determine the probability that each data point belongs to a given cluster. Learn more.
Term frequency-inverse document frequency (TF-IDF) is an NLP technique that measures how important a word is to a document relative to a collection of documents. Here’s how to create your own.
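To make the idea concrete, the sketch below scores each term as its relative frequency in a document multiplied by log(N / df), where N is the number of documents and df is the number of documents containing the term. This is one common textbook formulation, assumed here for illustration; library implementations often use smoothed variants. The function name `tf_idf`, the whitespace tokenizer, and the sample sentences are likewise assumptions.

```python
import math
from collections import Counter

def tf_idf(documents):
    """Return one {term: score} dict per document, using tf * log(N / df)."""
    # Naive tokenization: lowercase and split on whitespace.
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    scores = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        scores.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return scores

docs = ["the cat sat", "the dog barked", "the cat purred"]
weights = tf_idf(docs)
for doc, doc_weights in zip(docs, weights):
    print(doc, doc_weights)
```

In this toy corpus, "the" appears in every document, so its score is zero, while a term unique to one document, such as "sat", scores higher than "cat", which appears in two.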
Tesseract is an optical character recognition engine used to extract text from images, and it can be accessed in Python through the library pytesseract. Here’s what to know.