If you have ever worked with relational databases, you have probably come across situations where a JOIN between certain combinations of tables has produced incorrect outputs. In some cases, we might catch these right away, but at other times, these issues escape attention until a user or customer notifies us of the incorrect results.
This has a high cost because trust in data systems is hard to earn and easy to lose.
When something like this happens, we usually scramble to fix the problem on a case-by-case basis, but if we pause for a moment, we can actually start to…
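The excerpt does not say which table combinations it has in mind. As an illustration only, here is a minimal sketch of one common cause of wrong JOIN results: row fan-out, where a one-to-many join repeats a value before it is summed. The `orders` and `shipments` tables and their contents are hypothetical, invented for this example.

```python
import sqlite3

# Hypothetical schema and data, invented for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE orders (order_id INTEGER PRIMARY KEY, amount REAL);
    CREATE TABLE shipments (shipment_id INTEGER PRIMARY KEY, order_id INTEGER);
    INSERT INTO orders VALUES (1, 100.0), (2, 50.0);
    -- Order 1 was shipped in two parcels, so it matches two shipment rows.
    INSERT INTO shipments VALUES (10, 1), (11, 1), (12, 2);
    """
)

# Revenue computed from the orders table alone.
true_total = conn.execute("SELECT SUM(amount) FROM orders").fetchone()[0]

# The same sum after joining to shipments: order 1 is counted twice.
joined_total = conn.execute(
    """
    SELECT SUM(o.amount)
    FROM orders AS o
    JOIN shipments AS s ON s.order_id = o.order_id
    """
).fetchone()[0]

print("total from orders only:", true_total)
print("total after the join:  ", joined_total)
```

The join itself is valid SQL and returns no error, which is why this kind of mistake can go unnoticed until someone compares the totals.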
Everybody and their dog want to be data-driven. There are many ways data can augment our decision-making and drive incredible results.
We are witnessing an unbelievable transformation in every sector, driven by advances in AI and analytics. Who doesn’t want in on the magic?
Yet, for most of us, building these systems in-house is still out of reach because most organizations have not built the muscle or acquired the resources to drive such initiatives.
So, where can we begin our data-driven journey? Do we need analysts to give us golden insights? Do we need data scientists to build predictive models?
In the previous post, I wrote about how you can measure the quality of your data assets. I also suggested that you should prioritize your measurement efforts based on the value the data brings to your business, since the act of measurement itself has a cost that may exceed the benefit the information provides. In this post, I’ll go over how you can actually quantify the value of data, and ways to apply it to your information strategy to drive meaningful results.
Most companies have an obsession with measuring OKRs, KPIs, and business metrics. The mantra…
You need reliable information to make decisions about risk and business outcomes. Often this information is so valuable that you may seek to purchase it directly from third-party data providers to augment your internal data.
Yet, how often do you consider the impact of data quality on your decisions? Poor information can have detrimental effects on business decisions, and consequently, your business performance, innovation, and competitiveness. In fact, according to Gartner, 40% of all business initiatives fail to achieve their targeted benefits as a result of poor data quality.
Organizations that treat information as a corporate asset should engage in…
The advantage of using data to drive innovation and business results is often discussed, with mounds of evidence backing up its efficacy. However, data by itself has zero value. It is people who dissect data (digital or analog) to extract the information required to solve tough problems. It makes sense, then, to talk about how the human component of our systems affects the quality of our solutions.
Among the human factors that lead to extraordinary performance, team diversity has been one of the least understood and most under-utilized levers at our disposal.
One of the most precious pieces of information you can have to run your business effectively is to understand the revenue being generated as a direct result of some business activity.
Knowing which efforts are paying off allows you to double down and scale them up, while understanding where time and money are being wasted allows you to get more done by doing less.
This is the problem of attribution.
Marketers deal with this issue head-on because they must spend limited resources on multiple channels of influence. Prospects have many touch-points with the business during their customer journey. …
Conversion rates, or the percentage of people taking some desired action in your business, are among the most fundamental metrics you need to be aware of.
Track the conversion rates over a series of steps, and you have yourself a (conversion) funnel.
Looking at each step’s conversion rate can be critical for identifying bottlenecks and other issues in your process; moreover, looking at the funnel as a whole gives you a mathematical engine that tells you how many outputs you can expect from the inputs into your process.
The latter is very useful when you know how…
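To make the funnel arithmetic concrete, here is a minimal sketch. The step names and counts are hypothetical, and the helper functions `step_conversion_rates` and `expected_outputs` are illustrative names rather than anything from the original post. It assumes every step has a non-zero count.

```python
from typing import List, Tuple


def step_conversion_rates(counts: List[Tuple[str, int]]) -> List[Tuple[str, float]]:
    """Return the conversion rate from each funnel step to the next one."""
    rates = []
    for (prev_name, prev_n), (name, n) in zip(counts, counts[1:]):
        # Assumes prev_n > 0; a zero count would need special handling.
        rates.append((f"{prev_name} -> {name}", n / prev_n))
    return rates


def expected_outputs(inputs: float, rates: List[float]) -> float:
    """Multiply the inputs through every step rate to estimate final outputs."""
    result = inputs
    for rate in rates:
        result *= rate
    return result


# Hypothetical funnel counts, invented for illustration.
funnel = [("visits", 10_000), ("signups", 800), ("trials", 200), ("purchases", 50)]

rates = step_conversion_rates(funnel)
for label, rate in rates:
    print(f"{label}: {rate:.1%}")

rate_values = [rate for _, rate in rates]
overall = expected_outputs(1.0, rate_values)
print(f"overall conversion: {overall:.2%}")

# Using the funnel as an "engine": projected purchases from 25,000 visits.
projected = expected_outputs(25_000, rate_values)
print(f"projected purchases: {projected:.0f}")
```

The per-step rates point at the weakest step, while the product of all rates is the single number that converts inputs into expected outputs.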
Too often in business, we ignore measurements with small sample sizes because they are not “statistically significant,” but rarely, if ever, do we actually do the math to support that decision!
The fact is that it is precisely when we are highly uncertain about something that any data will greatly reduce our uncertainty.
I’m going to play out two scenarios from a fantastic book called “How to Measure Anything” by Douglas Hubbard so you can experience the “Aha!” moment for yourself.
By the end of this blog post, you’ll have an understanding of situations where small data can tell you…
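The excerpt does not say which two scenarios it uses, so the following is an assumption on my part: a sketch of the "Rule of Five" from the same book, which says that the median of a population falls between the smallest and largest values of a random sample of five with probability 93.75%. The function names are illustrative, and the simulation uses a uniform distribution purely as a convenient test population.

```python
import random


def prob_median_within_sample_range(n: int) -> float:
    """Chance that the population median lies between the sample min and max.

    Each random draw falls above the median with probability 0.5, so the
    sample misses the median only if all n draws land on the same side.
    """
    return 1 - 2 * 0.5 ** n


def simulate(n: int, trials: int, seed: int = 0) -> float:
    """Estimate the same probability by sampling from Uniform(0, 1)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        sample = [rng.random() for _ in range(n)]
        # The median of Uniform(0, 1) is 0.5.
        if min(sample) <= 0.5 <= max(sample):
            hits += 1
    return hits / trials


print(f"exact, n=5:     {prob_median_within_sample_range(5):.4f}")
print(f"simulated, n=5: {simulate(5, 20_000):.4f}")
```

The exact figure comes from simple arithmetic: 1 - 2 × (1/2)^5 = 0.9375. Five data points are enough to bracket the median with high confidence, which is the kind of result that "not statistically significant" tends to hide.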
When I first started building the Data Vault at Georgian, I couldn’t decide what column data type to use as my tables’ primary key.
I had heard that integer joins vastly outperform string joins, and I was worried about degrading join performance as our data grew.
In the SQL databases of the operational world, this decision is pretty much made for you by giving you auto-incrementing integer primary keys out of the box.
However, in the Data Warehousing world, whether you’re building a Kimball model, a Data Vault, or something else, you need to make this choice explicitly.
If you’ve ever tried to build an enterprise data warehouse using BigQuery, you’ll have quickly realized that the auto-incrementing primary keys you were so fond of in operational databases are not a thing in BigQuery.
This is by design. It’s not the job of your data warehouse to ensure referential integrity as your operational databases do, but rather to store and process tons of historical data.
However, this shift in perspective does not make the problem of finding a way to represent unique entities go away. …
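The excerpt stops before describing a solution. One widely used approach, shown here as an assumption rather than as the author's method, is to derive a deterministic key by hashing the normalized business key. The `hash_key` helper, the `||` delimiter, and the sample values are all hypothetical.

```python
import hashlib


def hash_key(*business_key_parts: str, delimiter: str = "||") -> str:
    """Build a deterministic surrogate key from one or more business key parts.

    Parts are trimmed and upper-cased first so that cosmetic differences
    in the source data do not produce different keys.
    """
    normalized = delimiter.join(part.strip().upper() for part in business_key_parts)
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


# Hypothetical business keys, invented for illustration.
key_a = hash_key("acme ", "ca")
key_b = hash_key("ACME", "CA")
key_c = hash_key("ACME", "US")

print(key_a)
print("same entity, same key:     ", key_a == key_b)
print("different entity, same key:", key_a == key_c)
```

Because the key depends only on the business key, it can be computed independently in any load job without a central sequence. The trade-off is that the key is a string, not an integer.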