Unearthing Spotify: the hidden world behind everyday plays
Do you listen to music every day? Do you remember how you listened to music a few years ago? As the traditional music industry collapsed, Spotify became the world’s most popular audio streaming subscription service, with almost 300 million users. When music is no longer a physical product, it is worth asking what hidden world lies behind it. This blog therefore takes an “inverse” method to analyze the infrastructure, the hidden world behind the plays, and to explain how it works when you search for and listen to a song. Now, before diving into the infrastructure of Spotify, enjoy this song and join us on a voyage into the black box of Spotify.
What is Infrastructure?
Infrastructures are not easily visible and mostly function quietly in the background, an invisibility reinforced by how little people know about the way they work and how they influence our everyday lives. This raises the question of how infrastructures can shape a deeper understanding of media platforms.
A social and theoretical understanding of infrastructure is key to the design of new media applications in our highly networked, information convergent society (Star and Bowker 230). Infrastructure is, essentially, what runs beneath structures. One can look to bridges, highways, and even sewage (Star and Bowker 230) as practical, physical examples of it.
People seldom think about those structures in daily life, even though modern life could not take place without them. Infrastructure is also present in new media objects such as digital media. It runs beneath the surface of the interface, where the user cannot see it. Therefore, a focus on infrastructure encourages media scholars to think more “elementally” about “what media are made of” (Mattern 1).
To better understand what composes Spotify’s infrastructure, Star and Ruhleder’s analysis of salient features, mentioned in Star and Bowker’s “How to Infrastructure” chapter of the Handbook of New Media, can be of use. It offers the following properties for conceptualizing what can be seen as infrastructure:
- Embeddedness:
“Infrastructure is sunk into, and inside of, other structures, social arrangements, and technologies” (Star and Bowker 231). Infrastructure, therefore, can be understood as the processes and structures that are constantly running underneath the platform, providing technical support to what is seen and de facto used in Spotify.
- Transparency:
“Infrastructure is transparent to use, in the sense that it does not have to be reinvented each time or assembled for each task, but invisibly supports those tasks” (Star and Bowker 231). Users neither need nor are allowed access to Spotify’s back end. Consumers only get to see the surface because that is what they fundamentally use.
- Reach or scope:
“This may be either spatial or temporal: infrastructure has reached beyond a single event or one-site practice” (Star and Bowker 231). Spotify’s infrastructure exists to sustain a continuous process of usage. This is a general trait of platforms, as their operators constantly tinker with their governing instruments to keep end-users and complementors tied to the platform (Poell et al. 8).
- Learned as part of membership:
The taken-for-grantedness of artifacts and organizational arrangements is essential for membership in a community of practice. Strangers and outsiders find infrastructure as a target object to be learned about. New participants acquire a naturalized familiarity with its objects as they become members (Star and Bowker 231). Upon understanding and learning the processes underneath what is seen in Spotify, one may, nonetheless, gather a new perspective on the platform.
- Links with conventions of practice:
“Infrastructure both shapes and is shaped by the conventions of a community of practice, e.g. the ways that cycles of day–night work are affected by and affect electrical power rates and needs” (Star and Bowker 231). The way Spotify’s infrastructure was assembled affects how end-users use the platform and, conversely, the way users demand data has an impact on how the infrastructure is organized.
- Embodiment of standards:
“Modified by scope and often by conflicting conventions, infrastructure takes on transparency by plugging into other infrastructures and tools in a standardized fashion” (Star and Bowker 231). Infrastructure is seen as a whole. Loose strings of information can be as meaningless as a railroad without a train. To give a simplified example from Spotify: one server is part of its infrastructure, but the platform’s infrastructure as a whole is composed of a standardized organization that combines this server with other infrastructural parts such as databases, protocols, etc.
- Built on an installed base:
“Infrastructure does not grow de novo; it wrestles with the inertia of the installed base and inherits strengths and limitations from that base” (Star and Bowker 231). The structures and processes that made digital music streaming possible existed before Spotify, and the platform inherited strengths and limitations from them. One example is the physical data servers Spotify acquired: these previously built structures became a nuisance to the expansion of the business because of their high maintenance costs (Google). Spotify has since been shifting its data storage infrastructure to another base, the Google Cloud, which brings its own strengths (cloud-based storage) and limitations (Spotify no longer completely owns the means to store its data).
- Becomes visible upon breakdown:
“The normally invisible quality of working infrastructure becomes visible when it breaks: the server is down, the bridge washes out, there is a power blackout. Even when there are back-up mechanisms or procedures, their existence further highlights the now visible infrastructure” (Star and Bowker 231). Users of the streaming platform, for example, only remember that programming exists when the interface glitches. Some aspects of the platform’s UX design help keep its infrastructure invisible. The spinning wheels, loading bars, and ‘throbbers’ we regularly encounter are important cultural icons for how time is treated and managed today (Dieter and Gauthier 62), and in the case of Spotify, they give the user something to see while the infrastructure fails to deliver instant results.
Based on this analysis of Spotify’s salient features, the following technical components will be considered infrastructures of the platform: its servers, databases, protocols, internal processes, and selected aspects of its user experience design, specifically those that relate to infrastructural affairs by maintaining salient features such as transparency and embeddedness.
Inverting the infrastructure to unearth Spotify
The development of the internet added new infrastructures that are even more hidden than those that previously defined socio-technological systems. This development, and the increased complexity of infrastructures that came with it, led to the creation of new conceptual methods for better understanding the significance of studying infrastructures.
Analyzing cyber infrastructure requires a systematic method that enables us to uncover the underlying power of infrastructures. In their work “Toward Information Infrastructure Studies: Ways of Knowing in a Networked Environment”, Bowker et al. explain the rising interest in infrastructures as a topic of investigation and how the “calls to study infrastructure in science have lead to the emergence of methods for making it and associated, emergent roles visible” (98). These include practical methods, such as observing during moments of failure, and conceptual methods, such as the “infrastructural inversion” (98). Infrastructural inversion means shifting attention from the activities invisibly supported by an infrastructure to the activities that enable the infrastructure to function.
Star and Bowker explain that conducting an infrastructural inversion is a constant battle against the tendency of infrastructure to disappear, except when breaking down (233). They therefore emphasize that this method looks closely at technologies and arrangements which, by design and development, tend to be invisible and challenging to uncover. Infrastructural inversion helps reveal these normally invisible structures and furthermore gives them what they call “causal prominence in many areas normally accredited to heroic actors, social movements, or cultural mores” (233).
Bowker et al. explain that this is a method of understanding the underlying nature of infrastructural work through uncovering the ethical, political, and social decisions that were made during the implementation, development, and maintenance of these infrastructures (99). Moreover, Bowker et al. point to the fact that analytically examining infrastructure requires “going backstage” and focussing on changes in infrastructural relations, rather than infrastructural components (99). Therefore, this blog will follow this method and ‘invert’ the underlying infrastructure of Spotify in order to get a different perspective on the platform’s inner workings.
What happens when you ‘search’ for a song?
Before we explain what happens when you search, we need to know what databases Spotify uses and how you are connected to them. The infrastructural process starts as soon as you log in to Spotify. Right after logging in, your client establishes a long-lived TCP connection to Spotify, which enables quick searches and instant plays (Kreitz and Niemela). TCP stands for Transmission Control Protocol. It provides a reliable data stream, and because the connection stays open, the client does not have to repeat TCP’s three-way handshake for every request, which would increase the latency.
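To make the latency argument concrete, here is a rough back-of-the-envelope sketch. It assumes a deliberately simplified model that is not taken from Spotify: every new TCP connection costs one round trip for the handshake, and every request costs one more round trip. The function name and the numbers are illustrative only.

```python
def total_wait_ms(requests: int, rtt_ms: float, reuse_connection: bool) -> float:
    """Estimate total waiting time under a simplified latency model.

    Assumption (hypothetical model): one round trip per TCP handshake,
    plus one round trip per request. Real networks are messier.
    """
    # A long-lived connection pays for the handshake once;
    # otherwise every request opens a fresh connection.
    handshakes = 1 if reuse_connection else requests
    return (handshakes + requests) * rtt_ms


# Ten searches over a link with a 50 ms round-trip time.
print(total_wait_ms(10, 50, reuse_connection=True))   # 550
print(total_wait_ms(10, 50, reuse_connection=False))  # 1000
```

Under these assumptions, keeping one connection open nearly halves the waiting time for ten requests, which is the intuition behind the long-lived connection.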
In recent years Spotify has moved from running its own database servers to hosting them in the Google cloud. Google offers an IaaS, short for Infrastructure-as-a-Service, which means that Google “uses virtualization technology to make their physical data centers and hardware accessible via the Internet, often referred to as the cloud” (Ekholm 1). This change was a necessity for Spotify: the company was expanding quickly, and owning physical data servers was a bad financial decision because of the high costs of maintaining and expanding them. The move has other advantages too: it reduces the complexity of the system, which frees up time that Spotify can use to improve the user experience and to focus more on data insights and machine learning (Google). It allowed Spotify to expand its database without expanding its own physical data servers.
Google’s cloud environment is divided into 31 regions and each region has one or more zones with data centers (Google). Spotify uses a number of these regions for its users to connect to. The connection that is established depends on your location and on the interconnectedness between different regions, which increases the speed of your searches and plays (Ekholm 5).
So how does this infrastructure enable search? When you log in, you are connected to an Access Point over a TCP connection. Through this connection you can reach Spotify’s storage servers, which are now in the Google cloud, with very little delay. When you search for a song, your query goes to a production storage server. It first looks through the primary database to which you are connected; if it cannot find what you are looking for, it looks through the other databases that are connected to the primary database (Yanggratoke et al. 213).
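The lookup order described above can be sketched in a few lines. This is only an illustration of the "primary first, then the connected databases" idea: the dictionaries, the `search` function, and the sample entries are hypothetical stand-ins, not Spotify's actual storage system or API.

```python
from typing import Optional


def search(query: str, primary: dict, linked: list) -> Optional[str]:
    """Look in the primary database first, then in each linked database."""
    key = query.lower()
    if key in primary:
        return primary[key]
    # Fall back to the databases connected to the primary one, in order.
    for database in linked:
        if key in database:
            return database[key]
    return None  # nothing matched anywhere


# Hypothetical sample data: track titles mapped to made-up track ids.
primary_db = {"song a": "track-001"}
linked_dbs = [{"song b": "track-002"}, {"song c": "track-003"}]

print(search("Song A", primary_db, linked_dbs))  # found in the primary database
print(search("Song C", primary_db, linked_dbs))  # found in a linked database
```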
What happens when you ‘play’ a song?
Spotify has an extensive database of more than 60 million songs and almost 300 million users, yet it can still play a song almost instantly when you press ‘play’. This is due to the low latency, or fast response time, that the company has achieved to improve quality for its listeners. To make this possible, Spotify has taken several infrastructural roads.
Spotify uses data from three different sources to play a song: your local cache, other clients, and a backend site. This is a peer-assisted system that Spotify uses to offload backend servers (Yanggratoke et al. 211). If you start playing a song that you have already played, the song might still be in your cache and it will load instantly. This shows that it is not always useful to delete your cache. At the same time, the client also searches for the song in Spotify’s storage and on the devices of other users through peer-to-peer technology (211).
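The three sources can be pictured as a simple preference order. The sketch below is a simplification under stated assumptions: it checks the sources one after another, whereas the text describes the client searching storage and peers at the same time, and every name in it (`locate_track`, the sample sets and dictionaries) is hypothetical rather than part of Spotify's client.

```python
from typing import Optional


def locate_track(track_id: str, cache: set, peers: dict, backend: set) -> Optional[str]:
    """Return where a track could be loaded from, preferring the cheapest source."""
    if track_id in cache:
        return "cache"  # already on this device, loads instantly
    for peer_name, peer_tracks in peers.items():
        if track_id in peer_tracks:
            return f"peer:{peer_name}"  # another client can serve it
    if track_id in backend:
        return "backend"  # fall back to the storage servers
    return None


# Hypothetical sample data.
my_cache = {"t1"}
other_clients = {"alice": {"t2"}, "bob": {"t3"}}
backend_site = {"t1", "t2", "t3", "t4"}

for track in ["t1", "t3", "t4"]:
    print(track, "->", locate_track(track, my_cache, other_clients, backend_site))
```

Only the last track has to come from the backend, which is the offloading effect the peer-assisted design aims for.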
In addition to this peer-assisted system, Spotify also uses fragmented audio files. This means that every “audio file is split into a header file and several fragments, which can be downloaded one after the other” (Schwind et al.). Together with the TCP connection, this allows for a constant data stream. Instead of downloading the full song or all fragments, your client downloads the first part of a track and starts playing. It will “avoid downloading data from the server unless it is necessary to maintain playback quality or keep down latency” (Kreitz and Niemela). Each fragment has a fixed playtime of 10 seconds (Schwind et al.). Pressing ‘pause’ to buffer the whole song is therefore not possible, since the client only requests the next fragment while the song is playing.
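The arithmetic behind fixed-length fragments is simple enough to sketch. The 10-second fragment length comes from the text above; the function names and the example track length are assumptions made for illustration, and the header file is left out of the count.

```python
import math

FRAGMENT_SECONDS = 10  # fixed playtime per fragment (Schwind et al.)


def fragment_count(track_seconds: float) -> int:
    """Number of audio fragments needed to cover a whole track."""
    return math.ceil(track_seconds / FRAGMENT_SECONDS)


def fragment_index(position_seconds: float) -> int:
    """Zero-based index of the fragment that contains a playback position."""
    return int(position_seconds // FRAGMENT_SECONDS)


# A hypothetical track of 3 minutes 35 seconds (215 seconds).
print(fragment_count(215))  # 22 fragments, the last one only partly filled
print(fragment_index(34))   # 3, i.e. the fourth fragment is playing
```

Because playback only ever needs the current fragment and the next one, the client can start a song long before the other fragments have been downloaded.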
The hidden information of Spotify
With infrastructure as our entry point, we “go backstage” through the fog of Spotify’s digital music streaming, arrive at the data servers and various protocols, and connect them to the process of searching for and playing music.
When we analyze “what Spotify is made of” elementally, we get a picture of an embedded, invisible, continuously changing, and standardized infrastructure.
We then followed the path from the infrastructure to the user. For the “search” function, the query travels over the TCP connection to a production storage server in the Google Cloud, which checks for a match in the primary database and then in the others. The “play” function, in turn, is built on a peer-assisted system and fragmented audio files.
Looking at the history of the music industry, it suffered a major downfall in its profits until 2015 when, for the first time in the millennium, it returned to growth, with labels reporting profits higher than 5 billion, of which streaming alone was responsible for 1 billion. Listeners may take the dramatically increasing profit of digital music as the end of physical music formats like CDs. Yet the physical format does not disappear. Rather, it shifts to the database servers, storage systems, and caches, all of them owned by Spotify instead of listeners.
What is being observed in consumer behavior is a general shift from ownership to access (Godelnik 42), which might provide a reasonable explanation for this shift in ownership. It is more relevant to consumers to actually be able to listen to songs than to own the data. Spotify had a major role in providing that access, which could only be done by building on previously built infrastructures, such as the protocols mentioned above.
Marshall McLuhan coined the famous phrase “The Medium is the Message”, with message meaning the change of scale, pace, or pattern that the medium introduces in human affairs (8). Bowker, however, seems to add a new layer of complexity to that concept. Infrastructural inversion is a method that describes the fact “that historical changes frequently ascribed to some spectacular product of age are frequently more a feature of an infrastructure permitting the development of that product” (Bowker 233). Through the lens of this inversion, one can see the utmost importance that infrastructure held in Spotify’s significant disruption of the music industry. Or, in other words, infrastructure is the message.
References
Bowker, Geoffrey C. Science on the Run: Information Management and Industrial Geophysics at Schlumberger, 1920-1940. MIT Press, 1994.
Dieter, Michael, and David Gauthier. “On the Politics of Chrono-Design: Capture, Time and the Interface.” Theory, Culture & Society, vol. 36, no. 2, Mar. 2019, pp. 61–87. DOI.org (Crossref), doi:10.1177/0263276418819053.
Ekholm, Harald, and Daniel Englund. “Cost optimization in the cloud: An analysis on how to apply an optimization framework to the procurement of cloud contracts at Spotify.” (2020).
Godelnik, Raz. “Millennials and the Sharing Economy: Lessons from a ‘Buy Nothing New, Share Everything Month’ Project.” Environmental Innovation and Societal Transitions, vol. 23, June 2017, pp. 40–52. DOI.org (Crossref), doi:10.1016/j.eist.2017.02.002.
Kreitz, Gunnar, and Fredrik Niemela. “Spotify–large scale, low latency, P2P music-on-demand streaming.” 2010 IEEE Tenth International Conference on Peer-to-Peer Computing (P2P). IEEE, 2010.
Logan, Robert K. Understanding new media: extending Marshall McLuhan. Peter Lang, 2010.
Mattern, Shannon. “Infrastructural Tourism.” Places Journal, no. 2013, July 2013. DOI.org (Crossref), doi:10.22269/130701.
Poell, Thomas, et al. “Platformisation.” Internet Policy Review, vol. 8, no. 4, Nov. 2019. DOI.org (Crossref), doi:10.14763/2019.4.1425.
‘Regions and zones | Compute Engine Documentation’. Google Cloud, https://cloud.google.com/compute/docs/regions-zones?hl=nl. 15 October 2020.
Schwind, Anika, et al. “Streaming characteristics of Spotify sessions.” 2018 Tenth International Conference on Quality of Multimedia Experience (QoMEX). IEEE, 2018.
‘Spotify — Company Info’. Spotify, https://newsroom.spotify.com/company-info/. 15 October 2020.
‘Spotify Case Study’. Google Cloud, https://cloud.google.com/customers/spotify?hl=nl. 15 October 2020.
Star, Susan Leigh, and Geoffrey C. Bowker. “How to infrastructure.” Handbook of new media: Social shaping and social consequences of ICTs (2006): 230-245.
Yanggratoke, Rerngvit, et al. “On the performance of the Spotify backend.” Journal of Network and Systems Management 23.1 (2015): 210-237.