Heliospan provides meaning - the future of AI-based information search

AVA-X: AI Swiss Made.

Face the Future: Advanced AI for Facial Recognition in Access Control and Investigations.

Published Sep 13, 2022


Tobias Bolliger is acceleration manager for the AI-based solutions of AVA-X, the Swiss address for artificial intelligence. Here he talks about the future of AI-based search.

What are you actually working on?

Tobias Bolliger: Among other things, we are currently working on an innovative way to make sense of large volumes of documents and unstructured data thanks to artificial intelligence. We call it: Heliospan.

What is Heliospan?

Heliospan contains a set of state-of-the-art technologies for data mining, text analysis and search. The search engine is built around machine intelligence: it understands the meaning of both the user’s query and the document pool, an approach called semantic search. This sets it apart from lexical search, where the search engine looks for literal matches of the query words without understanding what the query means.

By mining and analysing text data, Heliospan can discover interesting patterns, extract useful knowledge, and support decision making, with an emphasis on machine learning and statistical approaches that can be generally applied to arbitrary text data in any natural language with no or minimal human effort.

What are the coming challenges?

We are competing with the big players on the market. As a small company from Switzerland, our resources for marketing and sales are limited (we prefer to put our resources into research and development 😉). Even if we are convinced of the quality and innovation of our products and can compete with the best, the community still does not know and trust us enough.

What are you proud of?

With Sentinel, we have already developed a market-leading facial recognition system that is used by investigative authorities. With Heliospan, we now want to provide a solution that revolutionizes the handling of large amounts of unstructured data. We are particularly proud that 100% of Heliospan's research and development takes place in Winterthur. We attach particular importance to the fact that our solutions are developed from scratch and without any dependencies on third-party products. Therefore we can also guarantee full compliance with GDPR and other data protection regulations.

For whom is Heliospan useful?

Heliospan can maximize its benefits wherever large unstructured data volumes are present. This is now the case in almost all industries (Financial, Media, Government and many more). Areas in which Heliospan has already proven itself successfully are:

An online media publisher such as www.inside-it.ch uses Heliospan to improve the customer experience and to automate the tagging and clustering of content.
A collection of several thousand health-related articles was made semantically searchable with the help of Heliospan.


Interview: Matthias Mueller

Contact Tobias if you want to have a chat about Heliospan.

Are you familiar with semantic search, topic modelling and word association mining? If not, please keep reading.

Semantic Search Engine

The core and most important module of Heliospan is the state-of-the-art semantic search and explore engine. It enables users to browse and link words, paragraphs and documents based on their meaning, not just through classical string comparison or an N-gram vector space model. Heliospan uses deep learning to build vector representations not only of words, but also of paragraphs and whole documents. In the scientific literature these techniques are known as Word2Vec and Doc2Vec (paragraph vectors). Training the document vectors at the same time as the word embeddings allows Heliospan to connect meaning from single words to paragraphs to documents.
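The idea of searching by meaning rather than by literal match can be sketched in a few lines. This is a minimal illustration, not Heliospan's implementation: the three-dimensional `WORD_VECTORS` are hand-assigned stand-ins for learned embeddings, averaging word vectors is a simplification of jointly trained document vectors, and all function names are hypothetical.

```python
import math

# Hypothetical word vectors, hand-assigned for illustration only.
# A real system would learn hundreds of dimensions from a corpus.
WORD_VECTORS = {
    "doctor":    (0.90, 0.10, 0.00),
    "physician": (0.85, 0.15, 0.00),
    "hospital":  (0.80, 0.20, 0.10),
    "bank":      (0.00, 0.90, 0.10),
    "loan":      (0.10, 0.85, 0.00),
    "football":  (0.00, 0.10, 0.90),
    "match":     (0.10, 0.00, 0.80),
}

DOCUMENTS = [
    "the physician works at the hospital",
    "the bank approved the loan",
    "the football match was exciting",
]

def embed(text):
    """Average the vectors of known words; unknown words are ignored."""
    vectors = [WORD_VECTORS[w] for w in text.lower().split() if w in WORD_VECTORS]
    if not vectors:
        return (0.0, 0.0, 0.0)
    return tuple(sum(component) / len(vectors) for component in zip(*vectors))

def cosine(a, b):
    """Cosine similarity of two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def lexical_search(query, documents):
    """Return documents that share at least one literal word with the query."""
    terms = set(query.lower().split())
    return [d for d in documents if terms & set(d.lower().split())]

def semantic_search(query, documents):
    """Rank documents by the similarity of their vectors to the query vector."""
    q = embed(query)
    return sorted(documents, key=lambda d: cosine(q, embed(d)), reverse=True)

print(lexical_search("doctor", DOCUMENTS))       # no document contains "doctor"
print(semantic_search("doctor", DOCUMENTS)[0])   # best match by meaning
```

The lexical search finds nothing for "doctor" because no document contains that exact word, while the vector-based ranking puts the document about a physician first.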

Topic Modelling

Managing the explosion of electronic documents requires new technologies for automatically organising, searching, indexing and browsing large collections of unstructured text. Heliospan can cluster a large set of unstructured documents into topics and return, for each cluster, the most relevant words of that topic. These words can be seen as automatically discovered tags: a summary of the topic in just a few words. Displayed as word clouds, they can serve as an entry point for browsing and navigating a document archive. Another use case is getting a quick overview of what is trending in the news, on Twitter or in any other streaming data source.
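As a rough sketch of clustering documents and labelling each cluster with its most frequent words, the example below runs a k-means-style loop over bag-of-words vectors with cosine similarity. This is one simple approach chosen for illustration; the article does not say which algorithm Heliospan uses, and the sample documents, stop word list and function names are assumptions.

```python
import math
import re
from collections import Counter

STOP_WORDS = {"the", "as", "a", "of", "and", "in"}

DOCUMENTS = [
    "Stock market shares fall as bank shares drop",
    "Bank raises interest rates, market reacts",
    "Football team wins the cup final",
    "Cup final: football fans celebrate team victory",
]

def tokenize(text):
    """Lowercase, keep alphabetic tokens, drop stop words."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOP_WORDS]

def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(count * b[word] for word, count in a.items() if word in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def cluster_documents(docs, k, iterations=10):
    """Group documents into k clusters; returns (labels, centroids)."""
    vectors = [Counter(tokenize(d)) for d in docs]
    # Deterministic seeding: start with the first document, then keep adding
    # the document that is least similar to the seeds chosen so far.
    centroids = [vectors[0]]
    while len(centroids) < k:
        centroids.append(min(vectors, key=lambda v: max(cosine(v, c) for c in centroids)))
    labels = [0] * len(docs)
    for _ in range(iterations):
        # Assign every document to its most similar centroid.
        labels = [max(range(k), key=lambda i: cosine(v, centroids[i])) for v in vectors]
        # Recompute each centroid as the summed word counts of its members.
        # Cosine similarity ignores scale, so a sum works as well as a mean.
        updated = []
        for i in range(k):
            members = [v for v, label in zip(vectors, labels) if label == i]
            updated.append(sum(members, Counter()) if members else centroids[i])
        centroids = updated
    return labels, centroids

def top_words(centroid, n):
    """The n most frequent words of a cluster, usable as automatic tags."""
    return [word for word, _ in centroid.most_common(n)]

labels, centroids = cluster_documents(DOCUMENTS, k=2)
for i, centroid in enumerate(centroids):
    print(i, top_words(centroid, 3))
```

On this toy corpus the two finance headlines end up in one cluster and the two football headlines in the other, and the top words of each cluster read like tags for its topic.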

Word Association Mining

Words with strong syntagmatic relations tend to co-occur frequently while occurring relatively rarely on their own. There are many applications where syntagmatic relation mining is important. In retrieval, for example, words that have strong syntagmatic relations with the original query words can be used to expand the search query and improve the results. Another application is opinion summarization: we can extract the top K words most syntagmatically related to “iPhone 11” from a corpus of customer reviews in order to summarise the users’ feedback. Knowledge of word associations can also help in clustering words that are related to each other.

The techniques used to mine word associations can be generally classified into two categories. The first is hypothesis testing, where statistical tests are used to determine if the co-occurrence of two words happened by chance or due to an actual correlation. The second category is information-theoretic and is based on measures such as mutual information. This method is part of Heliospan.
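The information-theoretic category can be illustrated with mutual information between the presence of two words in a text segment. This is a minimal sketch under stated assumptions: segments are short review sentences, probabilities are plain relative frequencies without smoothing, and the sample reviews and function names are invented for illustration.

```python
import math

# Hypothetical customer review snippets, one segment per line.
REVIEWS = [
    "battery life is great",
    "battery life is short",
    "screen is bright",
    "camera is sharp",
]

def mutual_information(segments, w1, w2):
    """Mutual information (in bits) between the presence of w1 and of w2."""
    word_sets = [set(s.lower().split()) for s in segments]
    n = len(word_sets)
    mi = 0.0
    for x in (True, False):
        for y in (True, False):
            p_xy = sum(1 for s in word_sets if (w1 in s) == x and (w2 in s) == y) / n
            p_x = sum(1 for s in word_sets if (w1 in s) == x) / n
            p_y = sum(1 for s in word_sets if (w2 in s) == y) / n
            # Terms with zero joint probability contribute nothing (0 * log 0 = 0).
            if p_xy > 0:
                mi += p_xy * math.log2(p_xy / (p_x * p_y))
    return mi

def top_associations(segments, target, k):
    """The k words whose presence is most informative about the target word."""
    vocabulary = {w for s in segments for w in s.lower().split()} - {target}
    ranked = sorted(vocabulary, key=lambda w: (-mutual_information(segments, target, w), w))
    return ranked[:k]

print(top_associations(REVIEWS, "battery", 1))
print(mutual_information(REVIEWS, "battery", "is"))
```

A word such as "is" that appears in every segment carries no information about "battery", so its score is zero, whereas "life" always appears together with "battery" and ranks first. Note that mutual information measures dependence in either direction, so in practice it is usually combined with a check that the words actually co-occur.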

Named Entity Recognition

Named entity recognition (NER) is the ability to identify the names of things, such as people, companies or locations, in a text. NER is an important area of research in machine learning and natural language processing (NLP) because it can be used to answer many real-world questions, such as: “Does a tweet contain the name of a person?”, “Does the tweet also give that person’s current location?”, “Which companies were mentioned in a news article?” or “Were specific products mentioned in complaints or reviews?” NER is integrated in Heliospan and is a subject of active research. Note that at the moment only English is supported.
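To make the task concrete, the sketch below tags entities by looking them up in a small dictionary (a gazetteer). This is a deliberately naive baseline, not the learned models the article refers to: the `GAZETTEER` entries, labels and function name are assumptions made for illustration.

```python
import re

# Hypothetical gazetteer: known names grouped by entity type.
GAZETTEER = {
    "PERSON": ["Tobias Bolliger"],
    "ORG": ["AVA-X"],
    "LOC": ["Winterthur", "Switzerland"],
}

def find_entities(text):
    """Return (name, label) pairs in the order they appear in the text."""
    found = []
    for label, names in GAZETTEER.items():
        for name in names:
            # Match the name only when it is not part of a longer word.
            pattern = r"(?<!\w)" + re.escape(name) + r"(?!\w)"
            for match in re.finditer(pattern, text):
                found.append((match.start(), match.group(), label))
    return [(name, label) for _, name, label in sorted(found)]

print(find_entities("Tobias Bolliger works for AVA-X in Winterthur."))
```

A lookup like this only finds names it already knows. Statistical NER models exist precisely to recognise unseen names from their context.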
