We're hiring! We're building an IDE that lets engineering teams develop AI-powered products 10x faster. If you're a highly driven, talented individual interested in shaping the future of AI-powered software, apply to work at Athina. Location: San Francisco
Athina AI (YC W23)
Technology, Information and Internet
San Francisco, California 3,573 followers
A data-centric IDE for teams to prototype, experiment, evaluate and monitor production-grade AI
About us
Athina helps LLM developers prototype, experiment, evaluate and monitor production-grade AI pipelines.
- Website
- https://athina.ai
- Industry
- Technology, Information and Internet
- Company size
- 2-10 employees
- Headquarters
- San Francisco, California
- Type
- Privately Held
- Founded
- 2022
Locations
- San Francisco, California, US (Primary)
- Remote, US
- Bangalore, Karnataka, IN
Updates
-
2025 will be all about AI agents and how they bring innovation to our world through intelligent decision-making.
Over the past year, AI agents evolved from basic tools into advanced systems capable of reasoning, collaborating, solving complex tasks, and taking informed actions.
To sum it up, we curated a list of the top 10 Hacker News posts of 2024 on AI agents, with upvotes, comments, and a summary of the conversation for each. Check it out:
1️⃣ A real time AI video agent with under 1 second of latency 🧠✨
2️⃣ Agent.exe, a cross-platform app to let Claude 3.5 Sonnet control your machine
3️⃣ LlamaGym – fine-tune LLM agents with online reinforcement learning
4️⃣ Tarsier – Vision utilities for web interaction agents 🤖
5️⃣ Flow – A dynamic task engine for building AI agents 📖
6️⃣ Nous – Open-Source Agent Framework with Autonomous, SWE Agents, WebUI 🗃
7️⃣ Steel.dev – An open-source browser API for AI agents and apps
8️⃣ Windsurf – Agentic IDE
9️⃣ Use functional tokens for AI agents to simplify app workflows 🛠
1️⃣0️⃣ Codel – Autonomous Open Source AI Developer Agent 🚀
Curious about the technical details and impact of each post? Read the full blog linked in the first comment 👇
-
RAG systems played a key role in how enterprise AI evolved in 2024 🧠
To highlight what captured the most attention around RAG in 2024, we’ve curated a list of the Top 10 Hacker News Posts of the Year about RAG. Each post includes stats on upvotes and comments, plus a summary of the key conversations. Check it out:
1️⃣ FastGraphRAG – Better RAG using good old PageRank 🧠✨
2️⃣ Pg_vectorize - Vector search and RAG on Postgres
3️⃣ Open-source Rule-based PDF parser for RAG
4️⃣ Autoflow, a Graph RAG based and conversational knowledge base tool 🤖
5️⃣ Solving the out-of-context chunk problem for RAG 📖
6️⃣ Greptile (YC W24) - RAG on codebases that actually works 🗃
7️⃣ R2R V2 - A open source RAG engine with prod features
8️⃣ Better RAG Results with Reciprocal Rank Fusion and Hybrid Search
9️⃣ Txtai: Open-source vector search and RAG for minimalists 🛠
1️⃣0️⃣ I want flexible queries, not RAG 🚀
Curious about the technical details of each post? Read the full blog linked in the first comment 👇
-
Hacker News has become an invaluable resource for developers exploring the latest in AI development and innovation 🧑💻🧠
This week, we’ve curated the top 5 most insightful posts on RAG (Retrieval-Augmented Generation), highlighting key discussions and practical takeaways.
1️⃣ Title: RAG Logger: An Open-Source Alternative to LangSmith
Upvotes: 95
Link: https://lnkd.in/gcbcH98E
What is it about: RAG Logger is a simple, open-source RAG pipeline logging tool with suggested enhancements like visualization, OpenTelemetry support, and replay features.
2️⃣ Title: Collab Notebook – RAG on Your Unstructured Data
Upvotes: 14
Link: https://lnkd.in/ghK-EPQ8
What is it about: The post outlines using LangChain and Unstructured IO to address unstructured data challenges in RAG with FAISS, LLMs, and Athina AI evaluation.
3️⃣ Title: Web RAG to generate perplexity like answers from your docs in browser
Upvotes: 5
Link: https://lnkd.in/gH7_wj5X
What is it about: The system offers a private, browser-based solution for indexing, searching, and generating responses using GTE-small, SQLite, and WebLLM, with zero API costs 👩💻
4️⃣ Title: LLM apps, AI Agents, and RAG tutorials with step-by-step instructions
Upvotes: 3
Link: https://lnkd.in/gTt8qMy8
What is it about: A curated repository of RAG-powered LLM applications, showcasing models from OpenAI, Anthropic, Google, and open-source options like LLaMA.
5️⃣ Title: GraphRAG SDK 0.4.0: Simplify RAG with Graph Databases
Upvotes: 2
Link: https://lnkd.in/gDtM5CGA
What is it about: The module simplifies RAG application development with graph databases, multi-LLM support, smarter queries, LiteLLM integration, and cost-effective deployment 🚀
-
Unable to keep track of the latest LLM research? 🧠
We put together this list of the top 10 LLM papers of the week to help you keep up with the advancements. Here’s a list of all the papers we covered:
1️⃣ Proposer-Agent-Evaluator (PAE): Autonomous Skill Discovery For Foundation Model Internet Agents 🧠✨
2️⃣ MultiCodeBench: How Well Do LLMs Generate Code for Different Application Domains? Benchmark and Evaluation
3️⃣ Precise Length Control in Large Language Models
4️⃣ PROMO: Prompt Tuning for Item Cold-start Recommendation 🤖
5️⃣ Qwen 2.5 Technical Report 📖
6️⃣ AutoFeedback: Using Generative AI and Multi-Agents to Provide Automatic Feedback 🗃
7️⃣ Robustness-aware Automatic Prompt Optimization
8️⃣ DRUID: A Reality Check on Context Utilisation for Retrieval-Augmented Generation
9️⃣ Alignment Faking in Large Language Models 🛠
1️⃣0️⃣ TheAgentCompany: Benchmarking AI for Real-World Tasks 🚀
Curious about the details of each paper and its influence on LLM pipelines? Read the full blog linked in the first comment 👇
-
Athina AI (YC W23) reposted this
🚀 DeepSeek just dropped DeepSeek V3, their latest open-source model, and it's turning heads! 🌟
This model has been making waves by outperforming some of the best on standard benchmarks, earning a spot among the top 5 models alongside Qwen 2.5, Llama 3.1, Claude Sonnet, and GPT-4o.
✨ Key Highlights of DeepSeek V3:
- 671B MoE parameters with 37B activated at any time 💡
- Input token cost: $0.27/M tokens
- Output token cost: $1.1/M tokens
- Speed: Processes 60 tokens/sec ⚡
- Training data: A whopping 14.8T tokens 🧠
- ~11x cheaper than OpenAI o1-mini! 💰
API access is live now 👉 https://lnkd.in/d69mycCh
Which of these features excites you the most?
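To put the quoted prices in context, here is a rough sketch of the per-request arithmetic. Only the two per-million-token prices come from the post; the `request_cost` helper and the token counts are made up for illustration, and real billing may differ (for example, with cached-input discounts).

```python
# Prices quoted in the post, in USD per 1M tokens.
INPUT_PRICE_PER_M = 0.27
OUTPUT_PRICE_PER_M = 1.10


def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of one request (hypothetical helper)."""
    return (
        input_tokens / 1_000_000 * INPUT_PRICE_PER_M
        + output_tokens / 1_000_000 * OUTPUT_PRICE_PER_M
    )


# Illustrative workload: 2,000 prompt tokens and 500 completion tokens.
cost = request_cost(2_000, 500)
print(f"${cost:.6f} per request")
```

At these prices the output tokens cost about four times as much as the input tokens, so long completions dominate the bill even when prompts are large.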
-
Athina AI (YC W23) reposted this
Evaluating your LLM's performance is crucial, but knowing when, why, and how to use the right metrics can make all the difference. 🎯
Our team published an in-depth article that covers all of these aspects: https://lnkd.in/dQbjD4aU
Here’s a quick breakdown:
⚙️ Broad evaluation categories to focus on:
⚡️ Text similarity metrics: BLEU, ROUGE, METEOR, Levenshtein distance 📝
⚡️ Semantic similarity metrics: cosine similarity, MoverScore 🤖
⚡️ LLM as a Judge: use eval libraries like RAGAS, OpenAI Evals, Guardrails AI, Protect AI, etc., that utilize LLMs as judges to assess quality 🧐
⚡️ Qualitative metrics: don’t forget user feedback (score, thumbs up/down) or edit distance of responses (compression-based edit distance) 💬
🔍 When to use LLM evaluation metrics:
1️⃣ During development to identify model strengths and weaknesses
2️⃣ Before deployment to ensure reliability
3️⃣ For continuous improvement as part of your AI lifecycle
Evaluation also helps meet ethical and fair AI standards.
Check out the full article by Haziqa Sajid here 👇
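As a rough illustration of two of the metrics named above, here is a minimal, dependency-free sketch of Levenshtein distance and cosine similarity. It is not taken from the linked article. In practice cosine similarity is usually computed over embedding vectors from a model; the function below just takes plain lists of numbers.

```python
import math
from typing import Sequence


def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions,
    or substitutions needed to turn string a into string b."""
    previous = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, char_a in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to ""
        for j, char_b in enumerate(b, start=1):
            substitution = previous[j - 1] + (char_a != char_b)
            current.append(min(previous[j] + 1, current[j - 1] + 1, substitution))
        previous = current
    return previous[-1]


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)
```

A lower Levenshtein distance means the response is closer to the reference at the character level, while a cosine similarity near 1 means the two vectors point in nearly the same direction.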
-
Athina AI (YC W23) reposted this
[Colab Notebook]: Improve your LLM output using RAG Fusion
If you're building a domain-specific RAG (e.g. medical, financial, or legal) and struggling to improve its performance, try RAG Fusion.
💡 When should you consider RAG Fusion?
1️⃣ Ambiguous or poorly formulated queries: Users often struggle to articulate their questions, especially when unfamiliar with domain-specific vocabulary.
2️⃣ Large-scale information retrieval: Perfect for applications handling massive datasets or diverse information sources.
3️⃣ Complex queries: When nuanced or intricate user queries demand a broader context to deliver accurate results.
🧠 What is RAG Fusion?
RAG Fusion is an advanced technique that builds on the multi-query retriever method. Here's how it works:
- Creates multiple variations of the user’s query.
- Retrieves results for each query from your vector database.
- Applies Reciprocal Rank Fusion to score and re-rank the retrieved documents.
- Uses the re-ranked results to generate more accurate responses.
Our team has simplified the implementation for you with a ready-to-run Colab notebook: https://lnkd.in/dSAwX8QV
⭐️ If you find this useful, please leave a star!
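The Reciprocal Rank Fusion step can be sketched in a few lines. This is a minimal illustration, not the code from the linked notebook: it assumes each query variation has already returned a ranked list of document IDs, and it uses k = 60, a commonly used smoothing constant that the notebook may or may not share. The document IDs are made up.

```python
from collections import defaultdict
from typing import Dict, List


def reciprocal_rank_fusion(rankings: List[List[str]], k: int = 60) -> List[str]:
    """Fuse several ranked lists of document IDs into one ranking.

    Each document scores 1 / (k + rank) for every list it appears in
    (rank starts at 1), and the scores are summed across lists.
    """
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first.
    return sorted(scores, key=lambda doc_id: scores[doc_id], reverse=True)


# Hypothetical retrieval results for three variations of one user query.
rankings = [
    ["doc_a", "doc_b", "doc_c"],
    ["doc_b", "doc_a", "doc_d"],
    ["doc_b", "doc_c", "doc_a"],
]
fused = reciprocal_rank_fusion(rankings)
print(fused)
```

Documents that rank well across several query variations rise to the top, even if no single query ranked them first. The fused list is then what gets passed to the LLM as context.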
-
Athina AI (YC W23) reposted this
[Colab Notebook] Build a RAG on Your Unstructured Data 📄➡️💡
Building a RAG application is a powerful way to unlock insights from your data! But when you move to real-world data, things get tricky. 🤔
🔑 Key Challenges:
🏗️ Prototyping RAG with structured data is easy. But what about unstructured data such as PDFs, emails, images, tables, and Excel sheets?
🧩 Making unstructured data LLM-ready is often a pain. If it is not handled correctly, you end up with broken tables, poor chunking, and low-quality outputs.
🛠️ To help solve this, our team created a Colab notebook that:
- Uses unstructured.io to parse and prepare unstructured data for LLMs
- Integrates LangChain to build the RAG on top of FAISS, the open-source vector search library
🔥 Ready to give it a try? Here's the link to the notebook: https://lnkd.in/dWxfnsQa
⭐️ If you find this useful, please leave a star!
-
Athina AI (YC W23) reposted this
🚀 Build AI Workflows in Minutes with Flows on Athina AI (YC W23) 🌟
We built a Flow that checks the sentiment of a keyword across multiple channels in just a few clicks. 🧩
Check it out here: https://lnkd.in/dzmGYHup
Here’s how it works:
1️⃣ Takes a keyword as input (e.g., “AI trends”).
2️⃣ Runs a neural search using Exa to fetch relevant results.
3️⃣ Focuses on results from two channels: News and Twitter.
4️⃣ Uses a custom code block to extract, process, and format data from both channels.
5️⃣ Calls GPT-4o mini to calculate sentiment scores and renders them in a clean, structured format.
6️⃣ Finally, calls GPT-4o mini again to summarize the overall sentiment for easy interpretation.
This is just the beginning: Flows lets you build and deploy multi-step AI workflows faster than ever. 🌍
What will you build today? 😉