Expert Data: Bedrock’s Path to Leading the Race for AI Dominance
For those interested, I am reposting a blog post we ran at Bedrock Knowledge, Inc. (https://thebedrock.co) back in June that continues to age very well.
At Bedrock Knowledge, we are always encouraged when we see others picking up on the same signals we have been seeing for a long time about the trajectory of technology. The most recent example of this was Mark Sullivan’s latest “AI Decoded” newsletter for Fast Company in which he argues (correctly) that expert training data will define which companies and AI-enabled products (note I didn’t say “AI companies,” which is already almost as obsolete a term as “technology companies”) will rise to the top. Sullivan quotes Ali Golshan of Gretel:
“We built really great general purpose machines that talk like humans, but just like humans [who] are not experts, they’re generalists…Now, what we’re saying is that these general purpose machines need to become experts.”
As the operating system for lifelong learning for any enterprise with global interests, from the Fortune 500 to the armed services, Bedrock is always generating expert data from multiple sources: first from our own ever-growing libraries of courses, serious games, and intelligence streams, and second from our users, who are professionals and typically experts themselves in one or more subjects. Bedrock is designed to use all of that data within clear ethical limits. We don’t compromise the privacy of the user or the integrity of our customers.
This is the best of both worlds: tremendously useful and unique expert data, without ethical downsides.
If the future of AI dominance hinges on access to high-quality, expert-curated data — and we think it does — Bedrock is positioned to dominate the race to productize AI to serve human knowledge needs.
The Growing Importance of Expert Data
General AI models, which are trained on massive, unfiltered datasets from the open web, suffer from inaccuracies and inconsistencies. These shortcomings stem in large part from training on noisy, unreliable data (including random social media posts), and they contribute to so-called “hallucinations,” in which the AI generates incorrect or nonsensical information. This is why, for example, when you ask Gemini or ChatGPT for quotes about a historical figure, it will often simply make the quotes up, even when you ask it to be truthful and link to sources.
In contrast, AI models trained on carefully curated, high-quality datasets demonstrate superior performance, reliability, and relevance. This distinction is crucial for applications that require precision, such as enterprise solutions and mission-critical operations, like what we do at Bedrock.
Bedrock Knowledge’s Strategic Advantage
Bedrock is not just another educational platform; it is a comprehensive ecosystem designed to facilitate lifelong learning through expertly curated content and network effects. Our platform includes a wide range of resources, from detailed courses and intelligence reports to interactive experiences (including our library of games) — all generated with subject matter experts with whom we have unique relationships, because Bedrock is a spin-off of War on the Rocks. This extensive repository of high-quality, niche-specific data is our strategic advantage in the AI arms race.
Strategic Directions for Leveraging Expert Data
We have identified several strategic initiatives to harness the power of our expert data for AI development:
2. Enhanced Data Extraction and Utilization: To maximize the value of our content, we will ensure that it is easily indexable for AI training purposes. This involves redesigning our courseware and games to facilitate automated data indexing. Additionally, we will continue to enhance our business intelligence tools to capture valuable user-generated content, which will serve as a valuable data source for ingestion into AI models. This dual approach of leveraging both in-house-curated and user-generated content will strengthen our AI solutions.
3. Integrating the Competency Framework: Bedrock’s comprehensive competency framework provides a structured taxonomy for global issues, making it an invaluable resource for training AI models. This framework helps AI systems understand and organize complex knowledge domains effectively, ensuring that our AI solutions are both comprehensive and accurate. By integrating our competency framework into AI training, we can enhance the relevance and reliability of our AI solutions.
4. Addressing Privacy and Anonymization: Privacy concerns remain a significant challenge in the AI landscape. Implementing anonymization technologies allows us to leverage user data for AI training without compromising individual privacy. Companies like Gretel, Privly, and Authentic8 offer solutions that can help us de-identify data, ensuring that we can use it effectively while protecting user privacy. Balancing data utility and privacy protection will enhance the trustworthiness of our AI solutions. Our default position is (a) our customers own their own data, and (b) we will never sell personal information to third parties for advertising or sales. That’s not our business model. We embrace our SOC 2 compliance obligations and go far beyond them.
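To make the de-identification idea in the privacy item above concrete, here is a minimal sketch in Python. It is not Bedrock’s pipeline and does not use any vendor’s API. The record fields (`user_id`, `text`), the key, and the regular expressions are all assumptions made for illustration, and the patterns are deliberately crude. It shows two common steps: replacing a user identifier with a keyed hash, and redacting obvious contact details from free text.

```python
import hashlib
import hmac
import re

# Hypothetical key for illustration only; a real system would load this
# from a secrets manager and rotate it.
PSEUDONYM_KEY = b"example-key-not-for-production"

# Deliberately simple patterns (assumptions); real PII detection needs more.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE_RE = re.compile(r"\+?\d[\d\s().-]{7,}\d")


def pseudonymize(user_id: str) -> str:
    """Map a user ID to a keyed hash: stable for linking, not directly identifying."""
    digest = hmac.new(PSEUDONYM_KEY, user_id.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]


def redact(text: str) -> str:
    """Replace email addresses and phone-like numbers with placeholders."""
    text = EMAIL_RE.sub("[EMAIL]", text)
    return PHONE_RE.sub("[PHONE]", text)


def deidentify(record: dict) -> dict:
    """Return a copy of the record that is safer to pass to a training corpus."""
    return {
        "author": pseudonymize(record["user_id"]),
        "text": redact(record["text"]),
    }


if __name__ == "__main__":
    sample = {
        "user_id": "u-123",
        "text": "Reach me at jane.doe@example.com or +1 (555) 010-9999.",
    }
    print(deidentify(sample))
```

Keyed hashing and pattern redaction are only a starting point. Free text can still identify a person through context, which is one reason dedicated anonymization tooling exists.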
Enhancing Bedrock’s Market Position
To further strengthen our market position, we will focus in the near term on several key areas:
2. Continued User Growth: This is a no-brainer, but user engagement and growth, in line with our existing business model, will generate increasingly valuable user-sourced data. Expanding discussions so that users can comment inline on transcripts, anchored to specific timestamps, can foster richer interactions and ensure that our content remains current and relevant. Scraping (ethically) and moderating these discussions will help us identify and address contentious or outdated information, maintaining the quality and value of our content. Remember, this isn’t just any user-sourced data. Our users are professionals and experts.
3. Strategic Partnerships and Government Data: Forming strategic partnerships with government agencies to become a gateway for anonymized government data can provide long-term benefits. This approach parallels institution-focused offerings like OpenAI’s ChatGPT Edu, which targets universities. By positioning ourselves as a trusted intermediary, we can access valuable data while maintaining strict privacy and security standards. Again, this is where being a spin-off of War on the Rocks confers advantages: We have a trusted relationship with the Defense Department and are already serving Air Force users.
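The idea of comments tied to transcripts and timestamps, from the user-growth item above, can be sketched as a small data structure. This is an illustrative sketch only, not Bedrock’s implementation: the `Segment` and `Comment` types, their fields, and the `busiest_segments` helper are hypothetical names. It maps each comment to the transcript segment that contains its timestamp, then surfaces the segments that attract the most discussion as candidates for moderation or content review.

```python
from bisect import bisect_right
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    start: float  # seconds from the start of the recording
    text: str


@dataclass(frozen=True)
class Comment:
    timestamp: float  # seconds from the start of the recording
    author: str
    text: str


def segment_index(segments: list[Segment], timestamp: float) -> int:
    """Return the index of the segment containing the timestamp.

    Assumes segments are sorted by start time and the first starts at 0.
    """
    starts = [s.start for s in segments]
    return max(bisect_right(starts, timestamp) - 1, 0)


def busiest_segments(
    segments: list[Segment], comments: list[Comment], top_n: int = 3
) -> list[tuple[Segment, int]]:
    """Rank segments by how many comments are anchored inside them."""
    counts = Counter(segment_index(segments, c.timestamp) for c in comments)
    return [(segments[i], n) for i, n in counts.most_common(top_n)]


if __name__ == "__main__":
    segments = [
        Segment(0.0, "Introduction"),
        Segment(30.0, "Deterrence theory"),
        Segment(90.0, "Questions and answers"),
    ]
    comments = [
        Comment(5.0, "a1", "Helpful framing."),
        Comment(45.0, "b2", "This figure looks outdated."),
        Comment(60.0, "c3", "Agree, the doctrine changed."),
        Comment(95.0, "d4", "Good question."),
    ]
    for segment, count in busiest_segments(segments, comments, top_n=1):
        print(segment.text, count)
```

Counting comments is a crude proxy for contention. It is used here only because it keeps the anchoring logic easy to see.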
Conclusion
The AI arms race is increasingly defined by the quality of training data. Bedrock Knowledge, with its extensive repository of expertly curated content, is uniquely positioned to lead this race. By focusing on high-quality, niche-specific data, we can develop superior AI solutions that meet the demanding standards of mission-critical applications. Our strategic initiatives, from developing customized AI models to licensing our data and enhancing user engagement, position us at the forefront of AI-driven lifelong learning.
As the AI landscape continues to evolve, Bedrock’s commitment to quality, precision, and innovation ensures that we remain a leader in providing reliable, high-quality AI solutions. By leveraging our expert data, we can navigate the complexities of the AI arms race and deliver unparalleled value to our users and partners. The future of AI is bright, and Bedrock is poised to lead the way.
Key Takeaways
Strategic Directions for Leveraging Expert Data:
(A) Develop customized AI models using proprietary content.
(B) Enhance data extraction and utilization for better AI training.
(C) Integrate a comprehensive competency framework to improve AI relevance and accuracy.
(D) Address privacy concerns through anonymization and strict ethical standards while maintaining data utility.
Enhancing Bedrock’s Market Position:
(A) Develop an AI tutor and chat/query experience for reliable, high-quality data access.
(B) Encourage user engagement and growth to generate valuable user-sourced data.
(C) Form strategic partnerships with government agencies for anonymized data access.