Why 99% of business users find it hard to use a self-serve data platform. Because it’s actually DIY.
Many a time I have witnessed conversations between the business team and technology team that go something like this:
Biz: We don’t have data.
Tech: We have already given you the data.
Biz: We can’t find it.
Tech: Let me show you. It’s very simple. You go here, then you do this, click here, then click here, then do this, then do that …
Biz: OK. (scratches his head)
Every few weeks, this conversation would repeat for the same data. …
How can you use anomaly detection to identify customers likely to churn, and also provide the information required to reduce churn?
A few years ago, I helped an Indian payments company build its data platform. The company provided card swipe machines to its customers. Its customers ranged from large retail chains to mom-n-pop stores. The company had a huge feet-on-street team to acquire and service customers.
The company’s revenue had two components — a fixed rental fee per machine, and a variable transaction processing fee. The higher the number of card swipes on the machine, the better for the company.
How time-lag in your funnel impacts derived metrics.
Say you work for a financial services company in the lending business. You receive loan applications that you process in exactly 2 days. For every 20 applications, you approve one. This means your approval ratio is 5%.
Say below is what your applications and approvals look like. Your approvals lag the applications by 2 days.
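The effect of the lag can be sketched in a few lines of Python. The daily application counts here are made up for illustration (they are not the figures from the original post), and `naive_ratio` and `lag_adjusted_ratio` are hypothetical helper names. The sketch assumes every application is decided exactly 2 days after it arrives and that 1 in 20 is approved.

```python
LAG_DAYS = 2                    # processing time stated in the post
APPLICATIONS_PER_APPROVAL = 20  # 1 approval per 20 applications (5%)

# Hypothetical daily application counts (illustrative only).
applications = [100, 100, 200, 200, 400, 400, 400]

# Approvals on day d come from applications received on day d - LAG_DAYS.
approvals = [
    applications[d - LAG_DAYS] // APPLICATIONS_PER_APPROVAL if d >= LAG_DAYS else 0
    for d in range(len(applications))
]

def naive_ratio(day):
    """Same-day approvals divided by same-day applications."""
    return approvals[day] / applications[day]

def lag_adjusted_ratio(day):
    """Approvals on `day` divided by applications received LAG_DAYS earlier."""
    return approvals[day] / applications[day - LAG_DAYS]

for day in range(LAG_DAYS, len(applications)):
    print(day, naive_ratio(day), lag_adjusted_ratio(day))
```

Under these assumptions, the same-day ratio drops below 5% whenever application volume is growing, because today's approvals are divided by today's larger application count. The lag-adjusted ratio stays at 5% throughout.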
Why anomaly detection at scale is hard, expensive, and noisy.
Say you work for an online retailer. Your store sells 1000 products. You want to run anomaly detection on daily orders for each of these 1000 products. This means the following:
Number of Metrics = 1 (Orders)
Number of Dimension Values = 1000 (1000 products)
Number of Metric Combinations = 1000 (1 metric * 1000 dimension values)
This means the anomaly detection algorithm runs 1000 times every day, once for each metric combination.
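The counting above can be reproduced with a short Python sketch. The product identifiers are placeholders, and `metric_combinations` is a hypothetical helper, not part of any particular tool.

```python
from itertools import product

def metric_combinations(metrics, dimension_values):
    """Pair every metric with every value of a single dimension."""
    return list(product(metrics, dimension_values))

metrics = ["Orders"]
# Placeholder identifiers standing in for the 1000 products.
products = [f"product_{i}" for i in range(1, 1001)]

combinations = metric_combinations(metrics, products)
daily_runs = len(combinations)  # one detection run per combination per day
print(daily_runs)
```

Each tuple in `combinations` is one time series the detector has to evaluate, so the daily workload grows directly with the number of dimension values.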
Let’s say you want to monitor Orders by another dimension — State (50 unique values). You also…
or is it Natural Language Query being sold as Search?
Say you want to search for “augmented analytics”. You go to Google, enter “augmented analytics” and click Search. You get 10 results from different websites on the first page.
Now imagine the following. You go to search engine XXX. You must first enter the website, so you enter “gartner.com”. Then you enter “augmented analytics”. Click Search and all you get is one result from gartner.com.
This is the current state of Search-driven Analytics.
You first select a table. Then you specify the query in natural language. Click Search. The system uses…
Let’s say I recently joined a fictitious online fashion store as an analyst. One morning I receive a message from Bill, our VP of Analytics.
Bill: Jack from reverse logistics team called. said he sees lots of return requests being placed yesterday. Most of these requests are for Nike t-shirts. @Sachin can you please investigate?
Me: Sure @Bill. Looking into it right away.
So I go ahead and start investigating.
I write a query to pull total return request data for the last 30 days. Total return requests indeed went up by ~10% yesterday.
I then modify my query to pull…
An anomaly is when a metric deviates from normal. But how do you define what is normal?
Over the last few weeks, I gave demos of our product to a few friends and acquaintances. These people ranged from engineers to product managers to analysts to non-technical business folks. One of the key takeaways from these conversations was that most people equated anomalies with rule-based alerts: alerts that trigger when a metric goes outside a specified range. For example, alert when CPU utilization is greater than 90%.
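A rule-based alert of this kind is just a fixed threshold check. Here is a minimal Python sketch, where the function name `cpu_alert` and the sample readings are made up for illustration:

```python
CPU_THRESHOLD_PERCENT = 90.0  # the example rule: alert above 90% utilization

def cpu_alert(utilization_percent, threshold=CPU_THRESHOLD_PERCENT):
    """Return True when the reading is strictly above the fixed threshold."""
    return utilization_percent > threshold

readings = [42.0, 88.5, 90.0, 93.2]  # hypothetical CPU utilization samples
alerts = [r for r in readings if cpu_alert(r)]
print(alerts)
```

The rule compares each reading with a static limit and nothing else; it has no notion of what is normal for the metric.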
If you do a Google Search for anomalies, you get the following result…
Why anomaly detection for a business metric differs from that for a technical metric.
In my previous post, “What is an Anomaly”, we looked at why it is not simple to define what is normal for a metric. What’s normal for a metric depends on at least two factors: granularity and the amount of data. In this post, we’ll look at how the metric type also matters in anomaly detection.
Let’s classify metrics into two types: technical and business.
Technical metrics are the ones that engineering teams monitor. Examples include CPU and memory utilization and the number of API requests.
As an entrepreneur in India, I frequently came across comparisons of Indian and Chinese economies — Indian economy is just 10–15 years behind China’s; Indian economy will follow China’s footsteps, rather than America’s; Indian entrepreneurs should emulate successful business models from China;…
Intrigued, I set out to build my own mental model.
I wrote this post a few years ago. This is one of my learnings from consulting for a few Indian companies across manufacturing, telecom & CPG.
In my previous post, I talked about the broken last leg in the Indian distribution system, one which is fully dependent on the salesman.
In developed markets, technology takes over the salesman’s job. There’s no manual order taking by a salesman. Whenever the store wants more product, an order is electronically sent to the manufacturer. In industry jargon, it’s a pull-based system: the store pulls, or requests, stock whenever it wants.