What is Helios?
Helios is a B2B technology to allow searching of large commercial music libraries by using music itself as the search key.
There are many uses for Helios. Here are a few examples.
You have a digital jukebox in bars, restaurants, and pubs. You want venue patrons to be able to play more music like whatever they just paid to hear.
You have a music catalogue management platform that publishers and labels use to track their digital assets. Your customers want to be able to search within their own catalogue using your slick platform.
You have an online digital music store and you’d like to be able to make intelligent product recommendations to your customers based on what songs they’ve already got in their shopping cart before they check out.
You have a streaming music service whose DJs need inspiration for coming up with either channel-based or custom-curated playlists faster.
You market software for DJs, such as plugins to manage their library. While they’re playing live, a plugin in their favourite software suggests some tracks to mix or play next.
You have a commercial music library of any size. Perhaps you are a major record label, perhaps independent, or maybe you’ve amalgamated music from multiple labels, independent artists, or publishers. You need to help your clients quickly obtain synchronization licenses for appropriate pieces of music for their film, TV, documentary, commercials, video games, or some other context.
There are countless other examples, but let’s talk about this last one. Nearly always your client approaches you with samples already in hand. “Hey, do you have anything like this?” This could be an MP3 or a YouTube video URL. Because Helios allows you to search the catalogue using music itself as the search key, you could use the customer’s samples directly to help them find what they’re looking for.
Traditionally, in the absence of such technology, the way this has been done for decades may surprise many. It is costly and involves many hours or even days of manual labour, which delays the business process. The business must search manually, usually using textual tags, and listen to a great deal of irrelevant music in the hope of finding the one track the client is actually willing to spend money on.
Helios works by analyzing the actual sound of each audio track it’s provided. This is called sub-symbolic analysis. It does this by performing a complex digital signal processing analysis on both time and spectral domains.
It actually listens to your music rather than just guessing what you want based on what it thinks someone like you listened to before.
The scientific research Helios was based on is extensive and draws upon topics in physics, mathematics, and computing science.
Sub-symbolic audio analysis is a kind of analysis that listens to the actual audio itself. A sub-symbolic music similarity engine makes recommendations to help users discover music based on what a given song actually sounds like.
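As a rough illustration of what analyzing the actual sound in the time and spectral domains can mean, here is a minimal sketch in Python. It is not the Helios algorithm, which is proprietary and far more extensive. The two features chosen here (RMS energy and spectral centroid), the function names, and the synthetic test tones are all assumptions made purely for the example.

```python
import cmath
import math


def rms_energy(samples):
    """Time-domain feature: root-mean-square amplitude of the signal."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def spectral_centroid(samples, sample_rate):
    """Spectral-domain feature: magnitude-weighted mean frequency in Hz.

    Uses a naive O(n^2) discrete Fourier transform, which is fine for a
    short demonstration but far too slow for real audio.
    """
    n = len(samples)
    weighted = 0.0
    total = 0.0
    for k in range(n // 2 + 1):  # non-negative frequency bins only
        coeff = sum(
            samples[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)
        )
        magnitude = abs(coeff)
        weighted += (k * sample_rate / n) * magnitude
        total += magnitude
    return weighted / total if total else 0.0


def sine(freq_hz, sample_rate=8000, n=256):
    """Generate a test tone (a hypothetical stand-in for decoded audio)."""
    return [math.sin(2 * math.pi * freq_hz * t / sample_rate) for t in range(n)]


low = sine(250)
high = sine(2000)

# Both tones are equally loud, but the spectral feature tells them apart.
print(round(rms_energy(low), 3), round(rms_energy(high), 3))
print(spectral_centroid(low, 8000) < spectral_centroid(high, 8000))  # True
```

A similarity engine built along these lines would compare songs by the distance between their feature values rather than by tags or listening history.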
The statistical method is an alternative and traditional approach. It is “tone deaf” in the sense that it has no idea what any of your music actually sounds like. If Bob likes songs A, B, and C; and Sue likes A and B too; and the system thinks Bob and Sue are pretty similar, it will suggest song C to Sue.
The system doesn’t listen to the song, but it still might have made a relevant recommendation. This works to a certain extent, but quickly starts to fall apart due to the cold start problem.
If a song is new to the system and nobody has rated it yet, it can’t recommend it to anyone. Or, if a user is new, it can’t make a recommendation either because the user hasn’t listened to any music yet.
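The Bob and Sue example, and the cold start problem, can be sketched in a few lines of Python. This is a deliberately tiny illustration of the statistical approach, not how any particular service implements it. The Jaccard similarity measure, the 0.5 threshold, and the names used are assumptions chosen for the example.

```python
def jaccard(a, b):
    """Similarity of two users as the overlap of the sets of songs they like."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend(likes, user, min_similarity=0.5):
    """Suggest songs liked by sufficiently similar users that `user` lacks."""
    mine = likes.get(user, set())
    suggestions = set()
    for other, theirs in likes.items():
        if other != user and jaccard(mine, theirs) >= min_similarity:
            suggestions |= theirs - mine
    return sorted(suggestions)


# Song "D" is in the catalogue, but nobody has rated it yet.
catalogue = {"A", "B", "C", "D"}
likes = {
    "Bob": {"A", "B", "C"},
    "Sue": {"A", "B"},
    "New user": set(),
}

print(recommend(likes, "Sue"))       # ['C']
print(recommend(likes, "New user"))  # []
```

Sue is offered song C because she resembles Bob. Song D can never be suggested to anyone, and the new user receives nothing at all, because the system has no listening history to work from in either case.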
Not quite. Fingerprinting is a term used by music identification services like SoundHound or Shazam. These platforms are used primarily for music identification, not music discovery. That’s an interesting problem, and although Helios can do that too, it was not the primary problem the studio was concerned with.
What industry asked for was a method to find similar music based on music itself. That is, not using statistical tricks like trying to correlate user generated song ratings with predictions for a given user, but actual sub-symbolic analysis of the song’s objective traits.
No. Shazam is a music identification service. Give it a song and it identifies it. Give Helios a song and it finds others similar to it.
Spotify is a music discovery service because it does help users find similar music, except that it is limited to Spotify’s catalogue. That means you cannot use it with your own catalogue.
Consider that most of the world’s music is not on Spotify, but even if it was, Spotify does not allow you to use its service for either non-personal or commercial purposes.
Yes! Please see the next question.
Yes! Helios does not require you to upload or send your music anywhere to be analyzed. You can have your own local installation.
Analysis of your commercial music catalogue never requires any of it to leave your network. To do otherwise would be reckless and would invite customers to get sued by whoever owns the rights to your digital assets.
If you’re asking whether you can run your own Helios server, the answer is yes. But try to avoid using the term “cloud” because it is a marketing buzzword with no coherent meaning that spreads confusion.
Although Helios is proprietary, it is not a Service as a Software Substitute (SaaSS) because you are allowed and encouraged to host your own server. SaaSS denies you the right to run your own copy of the server software.
Many different organizations were consulted throughout the design of Helios. Since development of the technology began in 2015, it has been accompanied by regular and lengthy industry consultations with key stakeholders.
These stakeholders included regulatory bodies, legal experts, commercial broadcasters, streaming services, record labels, recording artists, songwriters, one of the world’s largest digital music stores, and many others.
One of the recurring themes that came up over the studio’s many industry consultations over the years was a major restriction not well understood outside of industry. Catalogue digital assets must not leave the internal network for legal reasons. This is a deal breaker for many.
Helios factored this otherwise catastrophic legal constraint into its architectural design from inception. Customers get their own copy to run locally on their own network and, if they choose, to serve requests inside or outside of it.
Besides the legal reason, there are a number of other technical and financial reasons. If you needed to send your music outside of your network, you would normally be expected to pay hosting fees to a third party to “manage” redundant data you already have.
For many, such as digital music libraries and record labels, uploading tens of terabytes or even petabytes of digital audio over the internet to a third party’s storage facility is simply not practical, even if the legalities weren’t already reason enough. Even the fastest commercial broadband connection only controls the width of the pipe between you and your ISP. What lies beyond that is entirely outside your ISP’s control.
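To put the upload problem in perspective, the following Python sketch estimates best-case transfer times. The catalogue sizes and the 1 Gbit/s link speed are assumed figures for illustration only. The calculation also assumes the link is fully saturated for the entire transfer, which rarely holds in practice.

```python
def upload_days(size_bytes, link_bits_per_second, efficiency=1.0):
    """Best-case days needed to upload size_bytes over the given link.

    efficiency is the fraction of the nominal link speed actually
    achieved end to end (1.0 means a perfectly saturated link).
    """
    seconds = size_bytes * 8 / (link_bits_per_second * efficiency)
    return seconds / 86_400  # seconds per day


GIGABIT = 1e9  # assumed link speed in bits per second

print(round(upload_days(50e12, GIGABIT), 1))  # 50 TB: 4.6 days
print(round(upload_days(1e15, GIGABIT), 1))   # 1 PB: 92.6 days
```

If only half of the nominal speed is sustained, pass `efficiency=0.5` and the figures double.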
When your costly digital assets do not need to leave your network, any insurer worth their salt will reduce your premiums, because the architecture is an excellent risk mitigation strategy when your entire business revolves around your digital assets.
If the service you built around Helios instead required access to a centralized hosted platform we controlled and there were internet connectivity issues one day that were not your fault, your system would be crippled. Your customers would not be able to find the music they need.
Further, consider the scenario where internet connectivity isn’t an issue, but the centralized hosted solution is taken offline at our end for any reason. The entire service would be rendered useless for every customer, not just a single business.
Pricing varies based on a number of factors, but the studio is confident it can eventually undercut any competitor’s pricing – if not already. Please get in touch with us and request a quotation if you haven’t already.
Yes! But we recommend asking us about more secure next-generation alternatives to AWS.
Yes! But first you must make sure you are allowed to redistribute your digital audio assets outside of your network. In many cases you may be legally restricted from doing so.
But if that is not an issue, there will be an additional reasonable monthly fee for renting and maintaining your server. If at a later date your organization decides it would like to host within its own data centre, that isn’t a problem and this fee will be waived.
There is no theoretical limit, but practical limitations are governed only by the hardware you run it on. Helios will take advantage of specialized hardware acceleration whenever available, such as MMX, Streaming SIMD Extensions, and multi-core parallel computing.
Helios was designed to scale, in particular to very large catalogues of hundreds of thousands to millions of songs.
The studio hears regularly that vendors X or Y are allegedly already doing the same thing. Nine times out of ten they are not, and the remaining minority of the time they are (or were), but there was a major limitation that made it unusable for many.
To build something like Helios is not a casual undertaking and requires a significant scientific, financial, and technological investment.
SoundCloud has a mostly great platform for what it aims to do. It has a music recommendation feature called Autoplay. Autoplay makes an automated recommendation for the next song to play based on the one you are currently listening to.
Unfortunately, despite SoundCloud’s best efforts, its users have been lamenting for years what they believe are poor quality recommendations.
Based on extensive experimentation and observation on how SoundCloud’s Autoplay feature appears to work, Helios greatly outperforms it in terms of the quality of its recommendations and the broad versatility of its applications.
On the latter point, SoundCloud’s Autoplay is not reusable. This means it cannot be integrated into another music related product or service outside of SoundCloud’s own platform.
Helios has been professionally developed. It aims to be compliant with the relevant aspects of the following standards and conventions among others whenever possible:
Most of Helios is written in portable, high performance, native C++17, with build environment assistance from the industry standard GNU Autotools.
Helios supports WAV, FLAC, Ogg/Vorbis, Speex, WMA, MP3, M4A, AAC, Dolby Digital, DTS, Apple Lossless, Dolby TrueHD, Monkey’s Audio, Opus, RealAudio 1/2, WavPack, WMA Lossless / Professional, and many others. If you’re not sure, just ask.
Short answer is no. But the longer answer you may find of interest.
Note that embedded metadata in this context refers to the usual song data like artist, title, album, cover art, and so forth. It is normally stored inside of each song file.
The Helios DSP metadata on the other hand is different and is not embedded inside of each song file, but exists separately within your Helios database. This latter metadata describes how each song actually sounded.
It’s always great if your files already contain embedded metadata, such as ID3 for MP3s or the much better Vorbis comment tags for FLAC and Ogg/Vorbis. However, many businesses do not do this and instead prefer to store their catalogue assets in WAV master files and identify each song by a unique hash or other identifier meaningful internally to their organization.
The WAV format is derived from the RIFF container format which was designed in 1991. It did not anticipate contemporary requirements, such as the need for metadata like artist, title, album, cover art, and so forth.
Nevertheless, the way WAV format is generally used today, it still has some benefits. It is not compressed which makes it fast to load. It is a simple format that is easy to work with from a software development standpoint. It therefore remains in widespread use for storing high quality studio masters.
But this can be done without sacrificing any audio quality, preserving the original audio data bit for bit, and using only about half the disk space on average. Consider migrating your catalogue to FLAC.
If Helios detects embedded metadata in your files, whatever format they are in, it will extract and store that data. Although convenient, embedded metadata is not essential and should not normally have any impact on how songs are analyzed.
Helios can take some time to analyze your catalogue, but it only needs to analyze each song once. On a modern computer you can assume, as a rough estimate, about one song per second per CPU for the initial analysis.
However, the installation itself could potentially take under a minute or two. Helios uses the most advanced package management system in the world which has been under continuous development for nearly two decades. It is used in hundreds of operating systems.
Once installed Helios runs as a system service exporting a RESTful API. Helios will even take care of configuring your router automatically via UPnP if you ask it to.
If your IT is experienced with standard server environments, there is a very good chance they will be required to learn little to nothing that is new to them in order to install Helios.
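The rough estimate above of about one song per second per CPU for the initial analysis translates into a simple back-of-envelope calculation. The Python sketch below assumes that the analysis scales linearly across CPUs. The catalogue size and CPU count are illustrative figures only.

```python
def analysis_hours(song_count, cpu_count, songs_per_second_per_cpu=1.0):
    """Estimated hours for the one-time initial analysis of a catalogue.

    Assumes perfectly linear scaling across CPUs, which is an idealization.
    """
    seconds = song_count / (cpu_count * songs_per_second_per_cpu)
    return seconds / 3600


# One million songs on a 16 CPU machine (illustrative figures).
print(round(analysis_hours(1_000_000, 16), 1))  # 17.4
```

Since each song only needs to be analyzed once, this cost is paid up front and then only for new additions to the catalogue.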
Yes and no, but mostly yes.
Users are strongly discouraged from trying to run Helios in a type-2 or hosted hypervisor like VMware Workstation or VirtualBox. These are not appropriate to host any kind of high performance application or system service in a production environment. That’s not to say it wouldn’t work, but you would be putting a high performance application in an environment with a very limited capacity.
Having said that, type-1 or native hypervisors like KVM are perfectly acceptable. This is commonplace in data centres around the world and is probably what you want.
System administrators need to manage large infrastructures by abstracting customer needs from the underlying hardware that may need to change from time to time. Software engineers require the efficiencies of high performance computing that must have direct access to hardware to be useful in some circumstances. A type-1 hypervisor reconciles these needs.
Docker containers should be fine, though they may need some tweaking.
Helios runs as a system service where it services incoming client requests. It integrates into your environment like any other system service.
There are different competing Init systems that have been proposed over the years. In striking the balance between different needs Helios supports both legacy SysV Init and the industry standard systemd(1). Most major distributions support the latter. You can verify here.
Potentially nothing on site is required to host Helios if you’d prefer to not host it yourself and have us host it for you.
But if you’ve already got infrastructure, great. Helios is designed for flexibility. It can be deployed in large scale data centres with distributed environments right down to the small scale off the shelf commodity hardware of a single machine.
Experienced system administrators familiar with standard GNU/Linux environments will have no difficulty at all deploying Helios.
Helios exports a RESTful API which is very simple to use and cross platform. As a result, it can be integrated into websites and into mobile, embedded, and desktop applications.
Helios ships with a pure Python module and a library of sample code out of the box to get you up and running as quickly as possible.
Adding a button in your catalogue next to a song to search for others similar to it is very easy to do.
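To give a feel for what sits behind such a button, here is a hedged Python sketch of a client preparing a similarity search over HTTP. It uses only the standard library rather than the Python module that ships with Helios. The host name, the `/similar` path, and the JSON field names are hypothetical placeholders and not the actual Helios API, so consult the API documentation supplied with your installation for the real ones.

```python
import json
import urllib.request


def build_similar_request(base_url, search_key, limit=10):
    """Build (but do not send) a POST asking for songs similar to search_key.

    The "/similar" path and the JSON field names are hypothetical
    placeholders used only for this illustration.
    """
    body = json.dumps({"search_key": search_key, "limit": limit}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/similar",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# "helios.example.internal" and "track-12345" are made-up values.
request = build_similar_request("https://helios.example.internal", "track-12345")
print(request.get_method(), request.full_url)

# To actually send it, pass the request to urllib.request.urlopen()
# and decode the JSON response body.
```

The button handler in a catalogue page would do little more than this: pass along the identifier of the selected song and render the list of similar songs that comes back.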
You. Period. It’s your data. You do what you want with it, but just make sure that you are not legally restricted from using the music that you’d like to provide Helios.
Note that DSP metadata in this context is not embedded inside of each song file, but exists separately within your Helios database. This metadata describes how each song actually sounded.
Embedded metadata on the other hand refers to the usual song data like artist, title, album, cover art, and so forth. It is normally stored inside of each song file.
Helios allows customers the freedom from the archaic bondage of textual tags to describe things like “melancholy”, “dramatic”, and other labels. Textual tags are not only labour intensive to manually generate, but even if generated automatically they have many serious problems.
Consider a thought experiment in another world. Imagine you were not searching for music, but instead were looking for a picture. You use a system that contains a catalogue of pictures. Each picture has been manually tagged by someone with a smell they recorded of the object in the picture. You can directly experience the smell when you click a button in the catalogue next to the picture. That is, one set of symbols (smells) is used to find objects in another set (pictures).
The people in that world figured out how to make their computers work with smells long before they figured out how to get them to understand pictures. Eventually they figured out the latter so that they could search for pictures with pictures.
Unfortunately the social inertia in that world still encouraged some people to search for pictures using smells because that’s what they’d been accustomed to doing for a really long time. Of course, they forgot that what they were really looking for was a picture and not a smell.
But there are other problems with textual tags. Imagine you have a customer on the other side of the planet looking for a song in your catalogue. You’ve got the perfect song for them, but the keywords the customer uses aren’t appropriate.
Perhaps your tags are in English, but they speak something else. Or maybe they understand English, or you’ve managed to have your tags translated into a language they’re familiar with. There’s still a problem. Merely translating the semantics perfectly from one language to another, when possible, still doesn’t mean that the tags become equally useful. Consider that your customer, in their culture, might have thought “dramatic” rather than “melancholy”, which is what the perfect song buried in your catalogue was tagged with.
In our experience, the customer almost always comes already with music samples in hand that describe the perfect tune to accompany their commercial, documentary, or other medium. Just use the music, save yourself the trouble, and make the sale in a fraction of the time.
You can use URLs as search keys for Bandcamp, Mixcloud, SoundCloud, Vimeo, YouTube, and many others.
The likelihood of Helios actually causing any kind of loss to your digital music assets or any other kind of major business disruption is astronomically small.
Helios was designed from the ground up to be portable. It will take advantage of specialized hardware to increase its performance where available to make the best use of your existing infrastructure.
Helios is currently ported to:
Progress is also currently underway to port Helios to:
All communication with your Helios server, whether on site or hosted elsewhere, can optionally be encrypted using industry standard Transport Layer Security (TLS). You may use either your own self-signed certificate or one issued by any certificate authority.
Encryption helps ensure that none of your digital assets can be read in transit by an unauthorized third party.
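On the client side, connecting to a server that uses a self-signed certificate usually means telling the client explicitly to trust that certificate. The Python sketch below shows one way to do this with the standard `ssl` module. The function name and the certificate path in the usage note are assumptions for illustration, and how the Helios server itself is configured for TLS is not shown here.

```python
import ssl


def make_tls_context(ca_file=None):
    """Return a client-side TLS context that verifies the server certificate.

    ca_file is an optional path to a PEM file holding your self-signed
    server certificate or private CA. When omitted, the system trust
    store is used, which suits certificates issued by a public
    certificate authority.
    """
    context = ssl.create_default_context(cafile=ca_file)
    context.minimum_version = ssl.TLSVersion.TLSv1_2  # refuse older protocols
    return context


context = make_tls_context()
print(context.verify_mode == ssl.CERT_REQUIRED, context.check_hostname)
```

With a self-signed certificate you would call something like `make_tls_context("/path/to/helios-server.pem")`, where the path is hypothetical, and pass the resulting context to your HTTP client.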