LyrAIcs: Artificial Intelligence for Lyrics

Music streaming services are becoming the most significant way for people to listen to music, with around 304.9 million subscribers worldwide in 2019. Latin music took a 9.4% share of the market in 2018 and is among the top five music genres: in 2019, half of the songs on Spotify's annual 20-song list were in Spanish. The success of streaming services rests mainly on the tailor-made playlists they offer to users based on their listening habits, and 54% of consumers say playlists are replacing albums in their listening habits. To build those lists and fit them to the user’s preferences, music streaming providers have developed software tools and techniques, called Recommender Systems, which apply big data and machine learning to the large amounts of data collected from their users to provide personalized suggestions.

Current Recommender Systems do not take into account the semantic information of song lyrics, which form a rich set of unstructured data and carry qualitative information that metadata does not capture. In a song we distinguish between the music and the lyrics; the lyrics are text, usually in the form of poetry. Retrieving this knowledge with Natural Language Processing technologies provides a source of information that can enhance Recommender Systems and open unexplored possibilities for the music market, such as automatic lyrics screening to detect topics that are non-compliant for specific age groups or for racial and gender groups. By combining some of the most advanced technologies, we will create an application that analyzes song lyrics and enriches Recommender Systems. This will open the door to new opportunities for music streaming companies.

The Project

Artificial Intelligence for Lyrics Comprehension

The aim of LyrAIcs is to build an AI-based recommendation engine for song lyrics that will enrich existing music and song classifications with content extracted from the lyrics text, using Artificial Intelligence and Natural Language Processing technologies. It will be built by combining the algorithms generated by POSTDATA, an ERC-funded project devoted to automated poetry analysis and classification.

The algorithms generated by POSTDATA (Grant Agreement: 679528) are able to analyze and classify Spanish poetry at three levels: 1) metrical and rhythmic, 2) semantic and conceptual, and 3) emotional and sentimental. These algorithms have been built by combining existing open-source libraries with the latest Deep Learning libraries. They extract poetry features classified according to a poetry ontology, which can be used to process the texts of lyrics and enrich Recommender Systems with qualitative metadata analyzed in real time.
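As an illustration only, the three analysis levels could be carried in a simple record like the one sketched below. This is not the POSTDATA data model or ontology: the class name `LyricsAnnotation`, the function `to_metadata`, the field names, and the sample values are all hypothetical and chosen just to show how per-level features might be grouped and flattened into a metadata record.

```python
from dataclasses import dataclass, field, asdict


@dataclass
class LyricsAnnotation:
    """Hypothetical container for the three analysis levels named in the text."""

    # Level 1: metrical and rhythmic features (e.g. syllable counts per line)
    metrical: dict = field(default_factory=dict)
    # Level 2: semantic and conceptual features (e.g. topic labels)
    semantic: list = field(default_factory=list)
    # Level 3: emotional and sentimental features (e.g. emotion -> score)
    emotional: dict = field(default_factory=dict)


def to_metadata(track_id: str, annotation: LyricsAnnotation) -> dict:
    """Flatten an annotation into a plain dict keyed by analysis level."""
    record = {"track_id": track_id}
    record.update(asdict(annotation))
    return record


# Made-up sample values, for illustration only.
example = LyricsAnnotation(
    metrical={"syllables_per_line": [8, 8, 8, 8]},
    semantic=["love", "journey"],
    emotional={"joy": 0.7, "sadness": 0.1},
)
metadata = to_metadata("track-001", example)
```

Keeping the three levels as separate fields means a consumer of the record can use one level (for example, only the emotional scores) without depending on the others.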


LyrAIcs proposes to create a product that will turn these algorithms into a web service to be commercialized in the music market by linking its APIs to the main music streaming services. The automatically generated metadata extracted from the lyrics analysis will enrich their existing Recommender Systems in real time, without the need for manual tagging. The algorithms will be able to “learn” and improve their efficiency and accuracy by being trained on existing metadata and tags, and through continuous interaction with new datasets, classifications, and data provided by users.
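One way to picture this enrichment is a weighted blend between a streaming service's existing similarity score and a similarity computed from lyric-derived features. The sketch below is an assumption made for illustration, not the LyrAIcs engine: the function names, the use of cosine similarity over emotion scores, and the `weight` parameter are all hypothetical.

```python
import math


def cosine_similarity(a: dict, b: dict) -> float:
    """Cosine similarity between two sparse feature vectors stored as dicts."""
    keys = set(a) | set(b)
    dot = sum(a.get(k, 0.0) * b.get(k, 0.0) for k in keys)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # an empty or all-zero vector has no defined direction
    return dot / (norm_a * norm_b)


def enriched_score(base_score: float, lyrics_a: dict, lyrics_b: dict,
                   weight: float = 0.3) -> float:
    """Blend an existing similarity score with lyric-feature similarity.

    `weight` (hypothetical) sets how much the lyric features count:
    0.0 keeps the existing score unchanged, 1.0 uses lyrics only.
    """
    lyric_score = cosine_similarity(lyrics_a, lyrics_b)
    return (1.0 - weight) * base_score + weight * lyric_score


# Made-up emotion scores for two songs, for illustration only.
song_a = {"joy": 0.7, "sadness": 0.1}
song_b = {"joy": 0.6, "sadness": 0.2}
score = enriched_score(base_score=0.5, lyrics_a=song_a, lyrics_b=song_b)
```

A linear blend is used here only because it leaves the existing score untouched when `weight` is zero; a production system could combine the signals in many other ways.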

Our engine will target a market of more than 1.5 billion songs per year, as it will focus mainly on the Spanish-language content of the streaming world. No other tool or platform offers content and language analysis for Spanish (whereas beta solutions are already available for English). It is also the fastest-growing market (16%), with Latin music and the number of consumers listening to it increasing exponentially.


Research Team

Elena González-Blanco

IE University

Pablo Gervás

Universidad Complutense

Álvaro Torrente

Universidad Complutense

This project has received funding from the European Research Council (ERC) under the European Union’s
Horizon 2020 research and innovation programme (grant agreement No 964009).



Interested in the program?

Contact us for more information.