Data Types, Error Types, Analytics Types and Robocalls
This article provides an overview of important data types, analytics types, and error types. While these concepts apply to virtually any form of statistical analysis, the following focuses on analytics engines that seek to protect consumers from unwanted robocalls while minimizing harm to legitimate businesses that seek to reach their clients and/or potential customers.
Consider the Do Not Originate (DNO) database, a repository of telephone numbers that should never appear as the originating number of a call. It is an example of deterministic data because DNO entries are telephone numbers of specific types: malformed numbers, non-NANPA numbers, non-allocated numbers, non-assigned numbers, and numbers that only receive calls.
DNO Telephone Numbers are an example of Deterministic Data
Therefore, use of DNO data is an example of how analytics engines may deterministically identify unwanted robocalls with an extremely low false positive rate. DNO data may be used with either content-based or event-based analytics.
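The deterministic nature of a DNO check can be illustrated with a minimal Python sketch. Everything named here is an assumption for illustration: the `DNO_NUMBERS` set holds made-up entries, the function names are hypothetical, and the format test is a simplified North American numbering shape check rather than an actual registry lookup.

```python
import re

# Hypothetical DNO entries for illustration only; a real list would come from
# an industry-maintained registry, not from this snippet.
DNO_NUMBERS = {"8005550100", "2025550147"}


def normalize(number: str) -> str:
    """Keep digits only and drop a leading country code '1' if present."""
    digits = re.sub(r"\D", "", number)
    if len(digits) == 11 and digits.startswith("1"):
        digits = digits[1:]
    return digits


def is_well_formed(digits: str) -> bool:
    """Simplified shape check: 10 digits, area code and exchange start with 2-9."""
    return re.fullmatch(r"[2-9]\d{2}[2-9]\d{6}", digits) is not None


def should_not_originate(caller_id: str) -> bool:
    """Deterministic rule: malformed numbers and listed numbers never originate calls."""
    digits = normalize(caller_id)
    return (not is_well_formed(digits)) or digits in DNO_NUMBERS
```

Because the rule depends only on the number itself, the same input always yields the same answer, which is what keeps the false positive rate low for this kind of data.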
Important Statistics Concepts for Robocall Analysis: False Positives and False Negatives
False Positives: Incorrect Identification of a Condition
In statistics, a false positive occurs when a test incorrectly indicates the presence of a condition or attribute that is not actually present. It's a type I error, where the test results in a positive outcome when it should have been negative.
In the context of robocalls, a false positive measurement from an analytics perspective would mean that the system incorrectly identifies a legitimate call as a robocall.
This could lead to the blocking or flagging of valid calls, causing inconvenience or missed important communications for users. Minimizing false positives is crucial in developing effective robocall detection systems to maintain the accuracy of call blocking or filtering mechanisms.
False Negatives: Wrongful Indication of Condition Not Present
A false negative in statistics occurs when a test incorrectly indicates the absence of a condition or attribute that is actually present. It's a type II error, where the test results in a negative outcome when it should have been positive.
In the context of robocalls, a false negative from an analytics perspective would mean that the system fails to detect a robocall, incorrectly allowing it to go through as if it were a legitimate call.
This situation is problematic as it can lead to users receiving unwanted and potentially malicious calls, undermining the effectiveness of the robocall detection system. Minimizing false negatives is essential to enhance the accuracy of identifying and blocking undesired robocalls.
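The two error types can be measured on a set of calls whose true status is known. The following is a minimal sketch, assuming each record is a hypothetical `(is_robocall, flagged)` pair; the function name and the sample data are illustrative and not taken from any particular analytics engine.

```python
def error_rates(records):
    """Return (false_positive_rate, false_negative_rate).

    records: iterable of (is_robocall, flagged) boolean pairs, where
    is_robocall is the ground truth and flagged is the system's decision.
    """
    tp = fp = tn = fn = 0
    for is_robocall, flagged in records:
        if is_robocall and flagged:
            tp += 1  # robocall correctly flagged
        elif is_robocall and not flagged:
            fn += 1  # robocall missed (Type II error)
        elif not is_robocall and flagged:
            fp += 1  # legitimate call flagged (Type I error)
        else:
            tn += 1  # legitimate call correctly allowed

    legitimate = fp + tn
    robocalls = tp + fn
    fpr = fp / legitimate if legitimate else 0.0
    fnr = fn / robocalls if robocalls else 0.0
    return fpr, fnr


# Made-up sample: 4 robocalls (1 missed) and 4 legitimate calls (1 flagged).
sample = [
    (True, True), (True, True), (True, True), (True, False),
    (False, False), (False, False), (False, False), (False, True),
]
fpr, fnr = error_rates(sample)
```

With this sample, one of four legitimate calls is flagged and one of four robocalls is missed, so both rates come out to 0.25.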
False Positives and Missed Business Calls
False positives may cause consumers to miss important business calls such as:
It's essential for call filtering or blocking systems to minimize false positives to ensure consumers don't miss these vital communications.
Content-based Analytics vs. Event-based Analytics
Content-based Analytics
Content-based analytics refers to the process of analyzing and extracting insights from the actual content of data. This approach involves examining the characteristics, patterns, and features within the content itself, rather than relying solely on metadata or external information.
In various fields, content-based analytics can be applied:
Content-based analytics is particularly valuable in gaining a deeper understanding of data, enabling more nuanced and context-aware insights across different types of content.
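As a rough illustration of content-based analysis in a robocall setting, the sketch below scores a call transcript by the share of suspicious phrases it contains. The phrase list, the function name, and the assumption that a transcript is available are all hypothetical; a production system would likely rely on far richer features.

```python
# Hypothetical phrase list, chosen only to illustrate the idea.
SUSPICIOUS_PHRASES = (
    "car's extended warranty",
    "press 1",
    "final notice",
)


def content_score(transcript: str) -> float:
    """Fraction of suspicious phrases found in the transcript (0.0 to 1.0)."""
    text = transcript.lower()
    hits = sum(1 for phrase in SUSPICIOUS_PHRASES if phrase in text)
    return hits / len(SUSPICIOUS_PHRASES)
```

The score is derived from what is said during the call, not from metadata such as the calling number or call volume.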
Event-based Analytics
Event-based analytics involves analyzing and deriving insights from specific occurrences or events within a system or dataset. Instead of continuously monitoring all data, this approach focuses on capturing and analyzing events that are significant or relevant to a particular context. The events may represent occurrences, transactions, changes in state, or other noteworthy activities.
Key aspects of event-based analytics include:
Event-based analytics finds applications in various domains, including finance, cybersecurity, Internet of Things (IoT), and business intelligence, where timely and context-aware insights are crucial.
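An event-based approach to robocalls might treat each call attempt as an event and look at how many events a caller generates in a short period. The sketch below is one possible illustration, assuming timestamps in seconds that arrive in non-decreasing order; the class name, window size, and call limit are hypothetical values, not industry thresholds.

```python
from collections import defaultdict, deque


class CallRateMonitor:
    """Flags a caller whose call attempts exceed a limit within a sliding window."""

    def __init__(self, window_seconds: float = 60.0, max_calls: int = 3):
        self.window_seconds = window_seconds
        self.max_calls = max_calls
        self._events = defaultdict(deque)  # caller -> recent call timestamps

    def record(self, caller: str, timestamp: float) -> bool:
        """Record one call attempt; return True if the caller is over the limit."""
        recent = self._events[caller]
        recent.append(timestamp)
        # Discard events that have fallen out of the window.
        while recent and timestamp - recent[0] > self.window_seconds:
            recent.popleft()
        return len(recent) > self.max_calls
```

Only the occurrence and timing of call events are used here; nothing about the content of the call is inspected.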
Analytics Engines use both Deterministic and Stochastic Data
Analytics Engine use of Deterministic Data
Analytics engines use deterministic data to perform analysis based on known and certain information. Deterministic data consists of explicit values or facts without uncertainty. Here's how analytics engines leverage deterministic data:
Overall, deterministic data plays a crucial role in building reliable and accurate analytical models and systems. It forms the foundation for making informed decisions based on explicit and certain information.
Analytics Engine use of Probabilistic Data
Analytics engines make determinations based on stochastic data by employing probabilistic methods and statistical techniques. Stochastic data involves uncertainty and randomness, and analytics engines adapt to this by using probability distributions and modeling techniques. Here's how it's done:
By incorporating probabilistic thinking and statistical methods, analytics engines can handle stochastic data effectively, providing valuable insights even in situations where outcomes are not deterministic.
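One common probabilistic technique is Bayes' theorem, which updates the probability that a call is a robocall after a signal is observed. The sketch below is illustrative only: the function name is hypothetical and the probabilities in the example are made-up inputs, not measured values.

```python
def posterior_robocall(prior: float,
                       p_signal_given_robocall: float,
                       p_signal_given_legitimate: float) -> float:
    """P(robocall | signal) via Bayes' theorem.

    prior: P(robocall) before observing the signal.
    p_signal_given_robocall: P(signal | robocall).
    p_signal_given_legitimate: P(signal | legitimate call).
    """
    evidence = (p_signal_given_robocall * prior
                + p_signal_given_legitimate * (1.0 - prior))
    if evidence == 0.0:
        raise ValueError("signal has zero probability under both hypotheses")
    return p_signal_given_robocall * prior / evidence


# Made-up inputs: 20% of calls are robocalls; the signal appears in 90% of
# robocalls and 10% of legitimate calls.
p = posterior_robocall(0.2, 0.9, 0.1)
```

With these inputs the posterior is roughly 0.69, so the signal raises suspicion substantially but still leaves real uncertainty, unlike a deterministic rule.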
Analytics Engines use Data to Protect Consumers from Unwanted Robocalls
Various analytics engines and technologies are employed to protect consumers from unwanted robocalls. These systems use a combination of rule-based algorithms, machine learning, and real-time analysis to identify and filter out potentially harmful or nuisance calls. Some key components include:
Various technologies and approaches are often leveraged as comprehensive solutions aimed at minimizing false positives and negatives, ensuring that consumers are protected from unwanted robocalls without missing important calls.
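The tension between false positives and false negatives can be seen by varying the score threshold at which a call is flagged. The sketch below assumes a hypothetical engine that assigns each call a score between 0 and 1; the scores and labels are made up for illustration.

```python
def count_errors(scored_calls, threshold):
    """Return (false_positives, false_negatives) at a given flagging threshold.

    scored_calls: iterable of (score, is_robocall) pairs; a call is flagged
    when its score is at or above the threshold.
    """
    calls = list(scored_calls)
    false_positives = sum(
        1 for score, is_robocall in calls
        if score >= threshold and not is_robocall
    )
    false_negatives = sum(
        1 for score, is_robocall in calls
        if score < threshold and is_robocall
    )
    return false_positives, false_negatives


# Made-up scored calls: (score, is_robocall).
scored = [
    (0.95, True), (0.80, True), (0.60, False),
    (0.55, True), (0.30, False), (0.10, False),
]
```

On this sample, a low threshold of 0.2 flags two legitimate calls and misses no robocalls, while a high threshold of 0.9 flags no legitimate calls but misses two robocalls. Choosing the threshold is therefore a decision about which kind of error is more costly.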
About the Author: Gerry Christensen
In his current role, Gerry Christensen is responsible for regulatory compliance as an internal advisor to Caller ID Reputation® and its customers as well as externally in terms of policy-making, industry solutions and standards. In this capacity, Gerry relies on his knowledge of regulations regarding B2C communications engagement. This includes the Truth in Caller ID Act, the Telephone Consumer Protection Act of 1991, state "mini-TCPA" laws and statutes governing consumer contact, various Federal Communications Commission rules, and the Federal Trade Commission's Telemarketing Sales Rule (FTC TSR).
Christensen coined the term "Bad Actor's Dilemma", which conveys the notion that unlawful callers often (1) don't self-identify and/or (2) commit brand impersonation (explicit or implied) when calling consumers. These behaviors are addressed explicitly in the FTC TSR (see 310.3 and 310.4) and implicitly in the Truth in Caller ID Act. Christensen has expertise in VoIP, messaging and other IP-based communications. Gerry is also an expert in solutions necessary to identify unwanted robocalls as well as to enable wanted business calls. This includes authentication, organizational identity, and use of various important data resources such as the DNO, DNC and RND.
Gerry is also an expert in technologies and solutions to facilitate accurate and consistent communications identity. This includes authentication and validation methods such as STIR/SHAKEN as well as various non-standard techniques. His expertise also includes non-network/telephone number methods such as cryptographically identifiable means of verifying organizational identity. In total, Christensen's knowledge and skills make him uniquely qualified as an industry expert in establishing a trust framework for supporting wanted business communications.