Much Ado About Machine Learning
Updated: Sep 2, 2021

Discussions of artificial intelligence (AI), machine learning (ML), and even “deep learning” permeate every industry on earth now. Evangelists proclaim that AI will change everything we do. They make us all think, “If I’m not using AI in every meaningful business process, then I am behind the power curve…and somebody else is coming to eat my lunch.” Business leaders praise their AI/ML initiatives and pour investment into their data science capabilities. AI/ML solutions get published with results most people do not understand, like the ROC curve above (which actually happens to be demonstrating poor performance). And those who are not fully committed to AI/ML are not sure how much they are missing out on, or whether it matters.
For businesses without billion-dollar R&D budgets or dozens (hundreds even?) of data scientists, what does this torrent of AI/ML chatter really mean? Should they be standing up data science departments, or paying for specialized AI software licenses? For most businesses the answer is “Probably not.” Most businesses encounter an AI/ML use case relatively infrequently, so maintaining the capability in-house is probably not worthwhile. Open source packages for common programming languages (e.g., Python and R) cover almost any business need that a licensed AI/ML product would, without the hefty licensing fee.
Before the main discussion, a brief history lesson. The term “artificial intelligence” was coined in its modern usage at a Dartmouth College conference in 1956 [Harvard note on history of AI]. Generally, AI refers to any sort of technology that seeks to mimic human intelligence. The ultimate test of AI is often considered to be the “Turing Test,” named for British computing pioneer Alan Turing. An AI solution passes the Turing Test if it can convince a human that it is also human. Machine learning is generally understood to mean the set of algorithms that improve the performance of an artificial intelligence solution by “learning” over time. It is therefore considered a subset of artificial intelligence [2016 Forbes Article].
Some AI/ML methods are very sophisticated (image recognition, natural language processing, adversarial networks) and are appropriate for tough problems with very large markets. Most businesses face problems that do not require those specialized solutions, yet they are still at a loss for how to develop answers. For a large portion of these businesses, solutions can be found with statistical learning techniques (the algorithms on which sophisticated artificial intelligence is built) without the time and cost of building something that could be termed “AI.”
Let’s think about two business leaders and the choices they make in statistical learning projects.
Streaming Reed
Streaming Reed leads one of the world’s largest streaming video companies, with licensed content from every major production company in the world. He has stood up a production company himself and his home-grown content is now competitive with the best in the marketplace. In the United States, virtually every person is a paying subscriber or is one degree of separation from a paying subscriber. Streaming Reed has thousands, if not tens of thousands, of data scientists and machine learning engineers who have built algorithms that identify the behaviors of subscribers and accurately predict the content they will respond to. A LinkedIn people search of “data science” and his company’s name produces 27,000 results. The data science army’s outputs direct Streaming Reed’s production company, advertising campaigns, and real-time suggestions for viewers. Streaming Reed’s business is built around AI/ML and he has a clear need for a large, dedicated staff of professionals working around the clock.
Apparel Annie
Apparel Annie is a leader in retail apparel stores. She leads two brands: Luxury Label and Common Clothes. The two brands originated as separate companies but came together under Apparel Annie in an acquisition. Supply chain efficiencies, customer preferences, and shared leadership have made the two brands seem less and less distinct over time. Apparel Annie is trying to determine (1) whether she should retain two separate brands, and (2) if she keeps two separate brands, whether to use shared advertising campaigns. She has data on foot traffic, customer demographics, sales volumes, and individual customer purchase activity. For customers who carry a store-branded credit card she has much more data. But Apparel Annie does not have a team of data scientists and she is not looking to build an always-on machine learning platform to offer real-time suggestions to her customers. Apparel Annie needs answers that are faster, tailored to the problem in front of her, and accessible to her audience (the marketing and branding leaders she works with). And she only encounters a problem like this a couple of times each year.
Streaming Reed does not need any help. He can carry out any AI/ML project he can imagine with his in-house staff. Apparel Annie has no interest in retaining AI/ML talent as permanent staff, but she could use some help right now. Apparel Annie is in luck because the proliferation of open source statistical learning packages makes it possible to get an answer to her business problem – without the permanent staff and without great expense.
So, what kinds of solutions would Apparel Annie pursue? With her customer and transaction data she could seek to build classifier models [Ox Road Observation on Classifier Models]. These could tell her whether the customers are really different between her two branded stores, Luxury Label and Common Clothes. One example is a logistic regression model that predicts the likelihood that a Common Clothes customer also shops at Luxury Label. Or she might benefit from a clustering algorithm that more clearly defines the characteristics which differ between the two stores’ customer groups. A clustering algorithm might even give her insight into customer segments she had never thought of before. Selecting the algorithm, adjusting parameters, and interpreting results all have their challenges, but they do not require full-time staff or years of work.
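To give a feel for how little code such a project can start from, here is a minimal Python sketch using the open source scikit-learn package. Everything about the data is an assumption made for illustration: the customers are synthetic, and the names annual_spend, visits, and also_luxury are hypothetical stand-ins for whatever Annie's systems actually record. It is a starting point under those assumptions, not a recommended analysis.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for Common Clothes customer data (all values are made up).
rng = np.random.default_rng(0)
n = 500
annual_spend = rng.gamma(shape=2.0, scale=400.0, size=n)  # dollars per year
visits = rng.poisson(lam=6, size=n)                        # store visits per year
X = np.column_stack([annual_spend, visits])

# Hypothetical label: 1 if the customer also shops at Luxury Label.
# Generated here so that higher spend makes cross-shopping more likely.
p_true = 1.0 / (1.0 + np.exp(-(annual_spend - 800.0) / 300.0))
also_luxury = rng.binomial(1, p_true)

# Logistic regression: estimate the probability of cross-shopping.
clf = make_pipeline(StandardScaler(), LogisticRegression())
clf.fit(X, also_luxury)
low, high = clf.predict_proba([[200.0, 6], [2500.0, 6]])[:, 1]
print(f"P(also shops Luxury Label): low spender {low:.2f}, high spender {high:.2f}")

# Clustering: look for customer segments without using any label.
# Three segments is an arbitrary choice for this sketch.
km = make_pipeline(
    StandardScaler(),
    KMeans(n_clusters=3, n_init=10, random_state=0),
)
segments = km.fit_predict(X)
for k in range(3):
    members = X[segments == k]
    print(
        f"segment {k}: {len(members)} customers, "
        f"mean spend ${members[:, 0].mean():.0f}, "
        f"mean visits {members[:, 1].mean():.1f}"
    )
```

With real data, most of the effort would go into the steps this sketch skips: choosing which customer attributes to include, deciding how many segments are meaningful, and checking the model against customers it has not seen.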
If you are an Apparel Annie (or any other kind of Annie), facing a problem without a clear solution, don’t think that you need to become a Streaming Reed. Talk to us at Ox Road Partners and we’ll see if there is a better way to meet your needs.