Jennifer Jones’ Post

The Power of Running Analytics on Incomplete Data – Better Than Guessing

In 2022, Sentier published a thought leadership article that opened with the admittedly oversimplified statement "Good data equals good analytics and bad data equals bad analytics." No one wants to work with bad data, but what qualifies as good data, or, more importantly, good enough data?

In the world of data-driven decision-making, there's a common misconception that data must be pristine and complete before it can be useful. Waiting for perfect data, however, can lead to missed opportunities for financial growth and strategic planning. Running analytics on imperfect data is often a far better approach than relying on gut feelings or guesswork.

Incomplete data occurs more often than we would like. Some examples of data limitations include:
• Limited Quantity – New situations such as a brand launch or COVID
• Sharing Limitations – Data unavailable from co-promote partners or other departments such as Medical Affairs
• Limited Reporting – Not all payers report identified sales data, or privacy restrictions block a subset of the data

Despite these data gaps, the show must go on. Stakeholders still want data-driven answers to critical business questions so they can make the best decisions for the company. Therefore, we must maximize the utility of, and the insights derived from, the data we do have. Effective approaches for overcoming incomplete data include:
• Data Augmentation: Use domain knowledge or interpolation to supplement with similar data sources, creating proxy data
• Synthetic Data Generation: Use techniques like bootstrapping or generative adversarial networks (GANs) to create synthetic data that mimics the properties of the original dataset
• Iterative Analytics: As additional data is gathered, perform the analysis iteratively, refining models and methods as more data becomes available. Each iteration improves the accuracy and relevance of the insights.
• Focus on Key Metrics or Segmentations: Analyze the subset of data that is complete, using techniques like pairwise deletion, in which a case is excluded only from the analyses its missing values affect rather than from the entire dataset. Then determine whether other segmentations are likely to respond in a similar manner.

Waiting for perfect data is a luxury most businesses can't afford. Incomplete data can still point you in the right direction and reduce the risk of significant missteps, and decisions based on partial data are more grounded than those based purely on intuition. Remember, the goal is not to achieve perfection but to make the best possible decisions with the data at hand.

#sentier #dataanalytics #dataengineer
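Two of the approaches above, bootstrapping and pairwise deletion, can be sketched in a few lines of Python. This is a minimal illustration with made-up numbers and hypothetical column names, not an actual Sentier workflow:

```python
import numpy as np
import pandas as pd

# A toy sales dataset with gaps (NaN = value not reported).
df = pd.DataFrame({
    "units_sold":  [120, 135, np.nan, 150, 160, np.nan, 170, 180],
    "promo_spend": [10, 12, 11, np.nan, 15, 14, 16, np.nan],
})

# Synthetic data via bootstrapping: resample the complete rows with
# replacement to build a larger proxy dataset whose distribution
# mimics the observed data.
complete = df.dropna()
bootstrap = complete.sample(n=100, replace=True, random_state=42)

# Pairwise deletion: pandas' corr() drops a row only from the pair of
# columns where a value is missing, so every available observation
# still contributes to some statistic instead of being discarded.
pairwise_corr = df.corr()

print(len(bootstrap))       # 100 synthetic rows
print(pairwise_corr.shape)  # (2, 2)
```

Bootstrapping is the simplest of the synthetic-data techniques named above; a GAN would replace the resampling step with a trained generator but serves the same purpose of mimicking the original dataset's properties.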

