Top 5 Reasons Why PolicyScope Data is on Databricks -- #4: Safe LLMs
www.bcmstrategy2.com

This is a story about game-changing innovation and the ability to position proactively for market demand just as it emerges.

The Back Story

During the Great Financial Crisis, I was advising large hedge funds on a range of geo-economic & geopolitical risks. By 2011, I was measuring public policy risks for them every morning using publicly available information. My clients did well when so many were in the red.

While the economic and financial worlds were in upheaval, IBM launched an impressive publicity stunt: its AI language model (Watson) played Jeopardy! and won.

It was a lightbulb moment. I realized (i) that every bit of my analytical process could be automated and (ii) that the language WAS the data. The problem was that text-based language could not be used to train the predictive analytics models of the day. Text is not objective; integers were needed.

So I patented an automated metadata tagging process that measures momentum in the public policy process for the purpose of powering predictive analytics models. If you want to measure momentum and directionality in public policy, our award-winning process is the only way to do it.

It was very early.

Enter Databricks

Last year, amid the LLM frenzy, Databricks was first to the market with an innovative offering. They understood from the beginning that enterprises would not be able to deploy fully open LLMs due to security and data privacy concerns.

They were the first to unveil a small language model (Dolly) that could be deployed in a dedicated instance. Users would not be required to share their proprietary information, knowledge, and data with the rest of the world. Databricks shares with BCMstrategy, Inc. a deep respect for data and a deep understanding of what the enterprise B2B market needs: data security, ownership, and reliability.

Our company's process generates proprietary data predominantly from open source official sector language inputs. This is language that moves markets and changes people's lives. It needs to be treated with respect and great care.

Consequently, we save every word in context, tagged and structured using a proprietary ontology crafted from deep subject matter expertise and leadership experience as policymakers on the global stage. We could not use the LLMs on the market last year for experimentation without undermining our patented tech.

Partnering with Databricks was a no-brainer.

So last year we tokenized our language data and, with the help of external contractors and Amazon Web Services (AWS), we trained Databricks' Dolly model to read, write, and connect the dots like a senior government official. All in a dedicated instance.

The patented way that we ingest, tag and structure the language data made training runs efficient and accurate even with a small language model. It's still not cheap, but it is far more cost-effective and accurate compared with the kitchen sink approach to training and tuning language models.

What Comes Next -- DBRX and Our Clients

Innovation in this space is accelerating. Databricks did not stop with Dolly. 2024 brings a new state-of-the-art LLM at Databricks (DBRX) that prioritizes what enterprise clients need: dedicated instances that solve for security and data privacy while maximizing the capacity to experiment in a safe sandbox environment before deploying to production.

Firms retain full ownership of both their input data AND outputs. Firms retain full flexibility to use a broad range of open source GenAI solutions and related libraries (particularly Hugging Face) with a deep bench of enhancement options like production-ready retrieval augmented generation that facilitate efficient and effective model training. The Databricks Mosaic Research foundation model APIs streamline the training and deployment process. Implementing private, self-contained hosting is as easy as downloading the model from the Databricks Marketplace.

These powerful tools enable us to offer customized GenAI models to our clients focused like a laser on the key public policy issues of the day: monetary policy, renewable energy policy, climate-related policy, digital currency policy. The Databricks platform enables our small company to offer enterprise-grade technology infrastructure and streamlined data sharing capabilities that otherwise would cost a fortune to build internally.

Today, we gladly introduce to you the Poli suite of language data and signals powered by Databricks & AWS. You will never again see public policy as a random exogenous variable.

Early adopters in the next few months will reap the largest alpha gains. This means that institutional investors serving savers, 401k plans, pension plans, and retirees will soon benefit from the informational advantages that accrue to those that deploy PolicyScope data and signals to inform investment decisions with state-of-the-art public policy data.

Our clients can now pair PolicyScope Data and Signals (the quantitative momentum data generated by our award-winning patented process) with the underlying language we have tagged and stored to meet a wide range of use cases, starting with automated research assistant tasks and, of course, predictive analytics.

Yes, you can measure the path towards a policy decision and anticipate accurately the outcome using PolicyScope Data and Signals. And if you know the path, anticipating the market reaction function becomes a matter of mathematics. Our tickerized data enables firms to measure for the first time the MACD of policy activity on a par with the MACD of any given tradeable asset. The efficient frontier has been bumped out.
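For readers less familiar with MACD (moving average convergence/divergence), here is a minimal sketch of how the standard indicator could be computed over a daily series of policy activity counts. This is not our patented process and not the PolicyScope data format: the sample numbers, the variable name policy_activity, and the function names are all hypothetical, and the 12/26/9 spans are simply the conventional defaults used for tradeable assets.

```python
def ema(values, span):
    """Exponential moving average with smoothing factor 2 / (span + 1).

    The first value seeds the average (one common convention).
    """
    alpha = 2.0 / (span + 1)
    out = []
    for v in values:
        if not out:
            out.append(float(v))
        else:
            out.append(alpha * v + (1 - alpha) * out[-1])
    return out


def macd(values, fast=12, slow=26, signal=9):
    """Return (macd_line, signal_line, histogram) for a numeric series."""
    fast_ema = ema(values, fast)
    slow_ema = ema(values, slow)
    # MACD line: fast EMA minus slow EMA
    macd_line = [f - s for f, s in zip(fast_ema, slow_ema)]
    # Signal line: EMA of the MACD line
    signal_line = ema(macd_line, signal)
    # Histogram: distance between the MACD line and its signal line
    histogram = [m - s for m, s in zip(macd_line, signal_line)]
    return macd_line, signal_line, histogram


# Hypothetical daily activity counts for one policy issue (illustrative only)
policy_activity = [3, 4, 4, 6, 9, 12, 11, 15, 18, 17, 21, 25]
macd_line, signal_line, histogram = macd(policy_activity)
```

Because the output has the same shape as the MACD of a price series, the two can be placed side by side, which is the comparison described above. A positive MACD line indicates that recent activity is running above its longer-term average.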

Our strategic partners at Databricks and AWS continue to innovate at a rapid clip, increasing the accessibility of a broad range of ML/AI tools. We look forward to continuing that journey with them. It's going to be a great ride.

#alternativedata #innovation #datascience #globalmacro #predictiveanalytics #femalefounder
