Data science is a “concept to unify statistics, data analysis, machine learning and their related methods” in order to “understand and analyze actual phenomena” with data. It employs techniques and theories drawn from many fields within the broad areas of mathematics, statistics, information science, and computer science, in particular from the subdomains of machine learning, data mining, databases, and visualization.

Topology and business

Nothing is as applicable as an abstract theory. Topology might seem abstract and foundational (like number theory), but it has a wide range of practical applications. The combination of topology and data science, in particular, is called topological…

What is persistent homology?

About topological data analysis and persistent homology in particular.

Grakn has something new and exciting, but it is not yet ready for prime time.

Graph nets

Learning from graphs rather than from tabular data.

Cross-validated predictions

Cross-validated predictions instead of the usual fit-predict approach.
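
As a rough illustration of the idea, the sketch below uses scikit-learn's `cross_val_predict`, so that every prediction comes from a model that did not see that sample during training. The iris dataset, the logistic regression model, and the five folds are assumptions chosen for the example, not details from the article.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_predict

# Assumed toy data and model, picked only for illustration.
X, y = load_iris(return_X_y=True)
model = LogisticRegression(max_iter=1000)

# Each sample is predicted by a model fitted on the other folds only.
y_pred = cross_val_predict(model, X, y, cv=5)

# Out-of-fold accuracy, as opposed to accuracy on the training data itself.
accuracy = (y_pred == y).mean()
print(f"out-of-fold accuracy: {accuracy:.3f}")
```

With a plain `fit` followed by `predict` on the same data, the score would be measured on samples the model has already seen, which is what this approach avoids.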

Kernel visualization

Kernels can be seen as histogram generalizations.
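
One way to read this, assuming the teaser refers to kernel density estimation, is that a histogram stacks fixed boxes while a kernel estimate stacks smooth bumps centred on each data point. The sketch below compares the two with NumPy only; the helper name `gaussian_kde_estimate`, the sample data, and the bandwidth are hypothetical choices for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
samples = rng.normal(loc=0.0, scale=1.0, size=200)  # assumed toy data


def gaussian_kde_estimate(x, data, bandwidth):
    """Average of Gaussian bumps, one centred on each data point."""
    z = (x[:, None] - data[None, :]) / bandwidth
    bumps = np.exp(-0.5 * z**2) / np.sqrt(2.0 * np.pi)
    return bumps.mean(axis=1) / bandwidth


grid = np.linspace(-5.0, 5.0, 501)
density = gaussian_kde_estimate(grid, samples, bandwidth=0.4)

# The histogram is the blocky counterpart of the same density estimate.
hist, edges = np.histogram(samples, bins=20, range=(-5.0, 5.0), density=True)

# Both estimates should integrate to roughly one.
kde_area = density.sum() * (grid[1] - grid[0])
hist_area = (hist * np.diff(edges)).sum()
print(kde_area, hist_area)
```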

Visualizing decision tree boundaries

Visualizing the decision boundaries of a 2D classification with decision trees.
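
A minimal sketch of how such a picture is usually prepared: fit a tree on two features, then predict the class on a dense grid covering the plane. The dataset, the tree depth, and the grid resolution are assumptions made for the example; the plotting step itself is left out.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
X2 = X[:, :2]  # keep two features so the boundaries live in a plane

tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X2, y)

# Dense grid over the feature plane, with a small margin around the data.
xx, yy = np.meshgrid(
    np.linspace(X2[:, 0].min() - 1, X2[:, 0].max() + 1, 200),
    np.linspace(X2[:, 1].min() - 1, X2[:, 1].max() + 1, 200),
)

# Predicted class at every grid point; the regions are axis-aligned rectangles.
Z = tree.predict(np.c_[xx.ravel(), yy.ravel()]).reshape(xx.shape)
```

The array `Z` can then be passed to a filled contour plot (for example matplotlib's `contourf`) with the training points drawn on top.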

Isotonic regression

A lesser-known, step-like function approximation method.
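
To make the step-like shape concrete, the sketch below fits scikit-learn's `IsotonicRegression` to noisy but broadly increasing data. The synthetic data and the increasing constraint are assumptions for the example.

```python
import numpy as np
from sklearn.isotonic import IsotonicRegression

rng = np.random.default_rng(0)
x = np.arange(50, dtype=float)
# Assumed toy signal: a rising trend with noise added.
y = np.log1p(x) + rng.normal(scale=0.3, size=x.size)

# Fit the closest non-decreasing function in the least-squares sense.
iso = IsotonicRegression(increasing=True)
y_fit = iso.fit_transform(x, y)

# The fit is piecewise constant: it only moves upward, in steps.
print(y_fit[:10])
```

Unlike linear regression, no functional form is assumed beyond monotonicity, which is why the result looks like a staircase.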

Forests for feature importance

A classic example of using a (random) forest classifier to sort features.
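
As a rough sketch of the approach, a fitted random forest exposes impurity-based scores through `feature_importances_`, which can be used to rank the features. The iris dataset and the forest size are assumptions made for the example.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

data = load_iris()  # assumed toy dataset
forest = RandomForestClassifier(n_estimators=200, random_state=0)
forest.fit(data.data, data.target)

# Impurity-based importances, normalised to sum to one.
importances = forest.feature_importances_

# Sort features from most to least important.
order = np.argsort(importances)[::-1]
ranked = [(data.feature_names[i], float(importances[i])) for i in order]
for name, score in ranked:
    print(f"{name}: {score:.3f}")
```

Impurity-based importances can favour high-cardinality features, so permutation importance is a common cross-check.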

Grid searching the optimal hyperparameters

Using sklearn's GridSearchCV to optimize hyperparameters.
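
A minimal sketch of a grid search with scikit-learn's `GridSearchCV`: every combination in the grid is scored by cross-validation and the best one is kept. The estimator, the parameter grid, and the dataset are assumptions chosen for the example.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)

# Assumed search space: 3 values of C times 2 kernels = 6 candidates.
param_grid = {"C": [0.1, 1, 10], "kernel": ["linear", "rbf"]}

# Each candidate is evaluated with 5-fold cross-validation.
search = GridSearchCV(SVC(), param_grid, cv=5)
search.fit(X, y)

best_params = search.best_params_
best_score = search.best_score_
print(best_params, round(best_score, 3))
```

The cost grows multiplicatively with each added parameter, so a randomized search is often preferred for larger spaces.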