The relevance of music information representation metadata from the perspective of expert users 1

The general goal of this research was to verify which metadata elements of music information representation are relevant for its retrieval from the perspective of expert music users. Based on a bibliographical research, a comprehensive metadata set of music information representation was developed and transformed into a questionnaire for data collection, which was applied to students and professors of the Graduate Program in Music at the Federal University of Rio Grande do Sul. The results show that the most relevant information for expert music users is related to identification and authorship responsibilities. The respondents from Composition and Interpretative Practice areas agree with these results, while the respondents from Musicology/Ethnomusicology and Music Education areas also consider the metadata related to the historical context of composition relevant.


Introduction
When music is made available to a community of users, regardless of the environment (traditional library, Web, database) and the document format (sound recording or score), the practices of information representation are governed by the principle stated by Kobashi (2007, online, our translation)4: "[...] whatever the perspective adopted, why, what for, and for whom the information is organized determine its construction". Thus, it is important to know which target audience is to be reached so that the available document does not fall into oblivion because the information was poorly organized. Martinez-Silveira and Oddone (2007) state that, since the early years of the twenty-first century, approaches to user studies have clearly become more social, i.e., the historical and social contexts to which the user belongs are strongly considered.
Therefore, the present study considers the relationship that a certain group of users has with music information when they search for and retrieve it. Starting from a comprehensive set of metadata elements based on the literature, the aim of the study was to find which metadata are relevant for music information retrieval. "Expert users" were defined as students or professors of graduate courses in Music. The objective was to establish a minimum set of metadata elements for music information representation and to establish links between the relevance of the metadata reported by the respondents and the educational and research context to which they belong. Establishing these links makes it possible to check the overlap between music information representation and the social context to which a particular group of users belongs.

Users and representation of music information: studies
According to Downie (2003), there are four major challenges for music information: multirepresentational, multicultural, multidisciplinary, and multiexperiential. This approach indicates that the user community of music information is heterogeneous in terms of the characteristics of each individual, the relationship each one has with music (pleasure, professional, research) and, therefore, the possibilities of addressing this kind of information. However, it is possible to predict some general aspects related to the social context that may assist in targeting specific user studies. To this end, Cotta (1998) highlights three basic types of music users: 1) Those seeking music for pleasure, probably searching for the material using the name of the record, music, singer, composer, band or genre; 2) Those who use music as a professional activity, who require more detailed information about the musical works, such as the level of technical performance difficulty, songs performed by certain instruments, or certain types of performance such as solo, two voices, among others; 3) Those who seek music as an object of research. In this case, the classification of music must be related to other areas of knowledge, for example, a particular historical and socioeconomic context of musical creation, or the relationship between music and religion, etc.
Kim and Belkin (2002) considered the possibility of an access point that is not usually used in the description of music information. In their study, the authors identified how emotional experiences induced by music in human beings could be used as descriptors. Kim and Belkin (2002, online) affirm that "[...] we can also think about other music information needs of those who cannot, or do not wish to represent their music information needs in musical terms".
The research was conducted at Rutgers University in the United States. Students of the Master's program in Information Science, students of the PhD program in Communication, professors from different university programs, and undergraduate students participated in the study. Participants who were music experts were excluded.
The participants were divided into two groups. The participants in the first group were asked to write down three or more words or expressions that they judged useful to describe excerpts of music from seven different songs. The second group heard the same seven excerpts and was asked to write down three or more words or expressions that they would use if they needed to search for the excerpt heard.
The expressions were grouped into seven categories: emotions, musical features, movements, occasions and events, objects, nature, and concepts. Most expressions collected in both groups (description and search query) were related to the categories "emotions" and "occasions and events". The authors also compared the expressions suggested by the two groups and found many similarities.
Based on these results, Kim and Belkin (2002, online) found that the relationships individuals had with music were based on subjective aspects, "Perceived as in an affective relationship to anything perceptible in association with, but at the same time in some way structurally distinguishable from, the strictly 'musical' structures". Michels (1992, p.11, our translation)5 affirms: "Since the meaning of the song becomes real through sound, the most appropriate interpretation of music is sonorous", which supports the view that music information representation should not be linked only to the physical, technical, and bibliographic aspects, but also to the interpretative aspect.
Based on previous studies, Kim and Belkin (2002) argue that users who search for music by subject or emotional experience most often do not have their needs met, given the complexity of information representation. Although not conclusive, the authors' study highlights the possibility, and especially the need, to explore new techniques of music information representation and to specialize user studies in order to serve users better.
In a similar study, Lesaffre et al. (2008) selected a group of 94 people with the following characteristics, among others: they were under the age of 35, devoted a third of their time on the Internet to music-related activities, and preferred pop, rock, and classical music. These people were invited to listen to 160 excerpts of music over 4 sessions (40 excerpts per session) and to attribute to these excerpts semantic descriptors related to emotional, structural and kinaesthetic aspects of the music. The descriptors were determined in advance, and respondents only had to associate them with the excerpts according to their personal evaluation. The model used to describe the music in the experiment consisted of 3 categories of semantic descriptors: 1) Affective/emotive: this category included two subcategories: "appraisal", composed of descriptors such as cheerful, aggressive, anxious, among others; and "interest", with descriptors such as pleasant, indifferent, annoying. 2) [...] 3) Kinaesthetic: this category includes descriptors associated with memory, such as style recognition, not recognizing the music, or well-known song, and descriptors connected to judgment, such as beautiful, easy/difficult.
The descriptors were analyzed to observe correlations between the different descriptions. In the analysis, the authors grouped the descriptions according to the demographic and musical characteristics of respondents. The strongest correlations were observed between the structural and the affective/emotive descriptors, particularly those related to "appraisal". However, the authors state that this correlation was observed mainly for excerpts of music identified as familiar. Familiarity is therefore the characteristic with the strongest impact on the semantic description of music by users. According to Lesaffre et al. (2008), users are able to express cognition-related semantic observations about music through verbal language, which indicates the possibility of using these observations in the description of music information in information retrieval systems.
From the perspective of Cruz (2008), of expert and lay music users, the latter have more difficulty accessing musical material. This is primarily because lay users have less knowledge of music theory and structure and, therefore, are often unable to recognize or analyze the available musical work and identify its characteristics: "[...] the select audience of expert users can still benefit from more sophisticated music systems and databases because they dominate the musical language while lay people - which are a much larger amount of users - have become marginalized" (Cruz, 2008, p.6, our translation)6.
5 "Puesto que el sentido de la musica cobra realidad em el sonido, la interpretación más apropriada de la musica es la sonora".
On the other hand, the statement of Cruz (2008) about expert users allows us to reflect on the relationship between the domain of this kind of language and the expectation of being able to use it successfully when searching for information.
Focusing on the expert music users, Lai and Chan (2010) conducted a study with students taking different undergraduate and postgraduate courses in Music at the Hong Kong Baptist University (HKBU).
The authors divided the students into two groups according to the curriculum of the course to which they belonged. The first group was composed of students from courses that placed greater emphasis on instrumental practice, and the second group of students from courses that focused more on theoretical aspects. The research concerned access to and use of the musical material available in the university library.
The relevant finding of the research conducted by Lai and Chan (2010) is that the highest rate of dissatisfaction among users of both groups is related to access to sound recordings and scores. Access to music literature was better evaluated, i.e., the difficulties that professionals face in the treatment of sound recordings and scores are reflected in user dissatisfaction with access to this material. The main user characteristic raised in the authors' analysis is the need to conduct searches by genre and historical period (romanticism, classicism, etc.).
A user study intended to guide the practices of a particular information service must be adapted to each situation. Thus, it is necessary to consider the material available in the collection, the team of professionals involved, the type of database and the availability of documents, the use that users make of music, and other specific characteristics of the information environment and the user community being studied.

Methods
The bibliographic survey conducted to develop a comprehensive set of metadata elements for music information representation covered documents published since 2003, the year in which Stephen Downie published a chapter called "Music Information Retrieval" in the Annual Review of Information Science and Technology (ARIST), which was adopted as the chronological marker for the survey. Combining the information from the literature with metadata standards (W3C and JISC) resulted in a table of 47 metadata elements of music information representation, which was transformed into the questionnaire used to collect data. Data collection was carried out with 15 professors and 61 students of the Programa de Pós-Graduação em Música da Universidade Federal do Rio Grande do Sul (PPGMus/UFRGS, Graduate Program in Music of the Federal University of Rio Grande do Sul), a program that obtained the best evaluation (grade 7) from the Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (Capes, Coordination for the Improvement of Higher Education Personnel) in 2010 (2007-2009 triennium).
When answering the questionnaire, the respondents had to assign one of the following four ratings to each metadata element of music information representation: extremely relevant, relevant, little relevant, and irrelevant. Respondents were instructed that relevance referred to searching for music information in libraries and databases on the Web. The first part of the questionnaire included questions about the profile of the respondents. The answers were tabulated in a database developed in Microsoft Access, which made it possible to extract the reports and graphs that support the analysis of the research data.
Based on Pereira (2004), we calculated the Relative Weighted Frequency (RWF) of each metadata element using the following equation (Chart 1):
RWF = (ER × 1) + (R × 0.5) - (LR × 0.5) - (I × 1)
In which: ER = number of respondents who reported "extremely relevant", R = number of respondents who reported "relevant", LR = number of respondents who reported "little relevant", and I = number of respondents who reported "irrelevant".
The metadata elements that make up the minimum set for the representation of music information are those that achieved 50% or more positive points, i.e., those with a greater tendency toward maximum relevance than toward the average position (zero) or irrelevance.
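The RWF calculation and the 50% cut-off can be illustrated with a minimal sketch. The weights come from the equation above; everything else is an assumption made for illustration: the names `Tally`, `rwf_score`, `rwf_percent` and `in_minimum_set` are hypothetical, the answer counts are invented and are not research data, and the conversion to a percentage (dividing the weighted sum by the number of answers) is one plausible reading of "50% or more positive points" rather than a procedure stated in the text.

```python
from dataclasses import dataclass


@dataclass
class Tally:
    """Number of answers given to one metadata element, per rating."""
    er: int  # "extremely relevant"
    r: int   # "relevant"
    lr: int  # "little relevant"
    i: int   # "irrelevant"

    @property
    def total(self) -> int:
        return self.er + self.r + self.lr + self.i


def rwf_score(t: Tally) -> float:
    """Weighted sum as in the equation: ER*1 + R*0.5 - LR*0.5 - I*1."""
    return t.er * 1.0 + t.r * 0.5 - t.lr * 0.5 - t.i * 1.0


def rwf_percent(t: Tally) -> float:
    """ASSUMPTION: express the score as a percentage of the number of answers,
    so that the result ranges from -100 (all "irrelevant") to +100."""
    if t.total == 0:
        raise ValueError("no answers recorded for this metadata element")
    return 100.0 * rwf_score(t) / t.total


def in_minimum_set(t: Tally, threshold: float = 50.0) -> bool:
    """True when the element reaches the 50% cut-off described in the text."""
    return rwf_percent(t) >= threshold


# Hypothetical counts, invented for illustration only.
example = Tally(er=40, r=20, lr=10, i=6)
print(rwf_score(example))              # 39.0
print(round(rwf_percent(example), 2))  # 51.32
print(in_minimum_set(example))         # True
```

Under this reading, an element rated "extremely relevant" by every respondent scores 100%, and one rated "irrelevant" by every respondent scores -100%, which matches the description of zero as the average position.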

Results and Discussion
The extraction of the statements of the authors of the 11 documents surveyed resulted in a set of 47 metadata elements (Chart 2) comprising 7 categories. [...] Practices, showing a balance between theoretical research and more practical research.
Accordingly, we deemed it relevant to also read the results by subgroup, in order to find any peculiarity in the sets of metadata. The results were divided into two major groups. The first covers the theoretical areas, which approach music from a social, anthropological and pedagogical perspective, and gathers the questionnaires of respondents from Music Education and Musicology/Ethnomusicology. The second group consists of the areas of Composition and Interpretive Practices, which were deemed more practical, focusing on music from the perspective of compositional techniques and performance. The result of the analysis of the full data set is presented first, followed by the partial results.
A total of 18 metadata were considered to comprise the minimum set of music information representation, as shown in Table 1.
Thus, among all the facets of music, which unfold into characteristics with different levels of representation complexity, those in Table 1 are the key concepts that expert users rely on when searching for a musical document. Therefore, music information representation would consist of retrievable metadata (the relevant metadata) and of other metadata elements of music information representation. This information would, in the first instance, allow information retrieval and the selection of documents by the user.
As shown in Table 1, the metadata with the highest averages of relevance are "album title or set of scores", "song title" and "name(s) of composer(s)". According to Smiraglia (2001), the title of the musical work is a significant [...] responsibility for the composition. When Chaves (2010, p.93, our translation)7 emphasizes that "Access to music information - and within it the access to the repertoire - is unlimited, and this is one of the contradictions of our time [...]", he discusses how important it is for Composition students to have the documental knowledge needed to develop different styles of composition, represented by different composers and periods, and to develop their own expressiveness based on these data. The author further states that the "Management of sources" directly affects the development of the repertoire. Note that the metadata "lyric author" also refers to an intellectual responsibility of authorship that requires a specific identity, distinguishing it from the responsibility for the composition.
7 "O acesso à informação musical - e dentro dela o acesso ao repertório -, é ilimitado e essa é uma das contradições de nosso tempo [...]".

Chart 2 (continued). Description of the metadata elements.
Free-text notes on the content of the document.
Free-text notes on the condition of the work and aspects of conservation.
Indication of file compression format; printed or digital score (e.g. PDF, CD, mp3, disc).
Availability for score retrieval by images or symbols.
Categorization of music based on rhythmic and instrumental composition (e.g. jazz, blues).
Indication if the song has vocals or if it is only instrumental.
Indication if the vocal is male, female, child.
A note or tonal center around which the music is composed (e.g. A major, B flat).
Indication of rhythm of work (e.g. 2/4, 6/8).
Display individually when there is a set of songs.
Structure adopted for musical performance (e.g. piano reduction, two voices) and description of the instruments that are part of the musical performance.
Sonata, concerto, fugue, etc.
Original nationality of song composer.
Nationality of the artist/band that interprets the song.
Name and geographical location of the studio, event, or program where the recording was made (if sound recording).
Temporal musical style (e.g. classicism, romanticism).
Date when the song was written (exact date, century).
Place and date of first publication or recording (e.g. date of CD recording or score publication).
Recommendation for use of music (e.g. rest, activity).
Relationship with the subjective sensation caused by music (e.g. sorrow, joy).
Social identity of music (e.g. wedding, funeral, for children).
Interoperability information, file compression, transfer protocols.
Music retrieval by the similarity of the voice or instrument melody.
Indicates a link or URL to access the digital file in the database or elsewhere in the network.
Person responsible for filling out the metadata.
Number of times that the recorded information of the document has been accessed by users.
Source: By authors (2011).
Clearly and specifically differentiating the composer from the performer becomes crucial when the performer's interpretation plays a strong role in the characterization of the song. For example (among many that could be cited), imagine the song Tico-Tico no Fubá by Zequinha de Abreu played by Hermeto Pascoal and Sivuca or interpreted by the Berlin Philharmonic Orchestra. We would note the cultural differences between classical music and popular music, expressed in the different arrangements of the two performances, as well as in the actual performance of the musicians (gestural movements, clothing, etc.) and in the visual aspect, which blends with other subjective sensations to form a single image of the music (Michels, 1992; Clarke, 2004) that causes certain feelings in the listener. Freire (2010, p.22, our translation)8, discussing the philosophical assumptions involved in music research, states that:
[...] Performance situations [...] will always be understood as interpretive manifestations, socially and historically conditioned, and thus embedded in broader processes of esthetics or even the expectation of the performer and audience of this interpretation. That is, music interpretation is not understood as predetermined by the author or by time [...].
According to Clarke (2004), the performance occupies the central position in the musical culture of a particular group, making the name of the artist relevant information both for studies in the area of performance and Ethnomusicology.
Within this context, the relevance of the metadata "arrangement" was considered. An arrangement can be an adaptation, for example, when the music is prepared to be performed by a larger group of instruments than the original, or when the opposite occurs (Grove; Sadie, 1994). Thus, a piano student, for example, is naturally interested in knowing whether the score available in the library is a piano reduction or an arrangement for other instruments. A work originally composed for orchestra undergoes significant adaptations when rewritten for piano, which gives special importance to the metadata "name of the arranger", the person responsible for the instrumental preparation and sound adaptation.
The description of musical instruments raises a more significant terminological problem. The study by Ballesté (2011, p.679, our translation)9 on the conceptual and terminological organization of plucked stringed musical instruments from the nineteenth century states that "The spellings and meanings of related terms vary according to the region, social group, and historical period". A score states which instrument the document refers to; however, describing instruments accurately is more laborious when the information comes from sound. Currently, with synthesizers and other electronic instruments, it is often not possible to identify whether the sound is being produced by the musical instrument or by a synthesizer (Caesar, 2010). In the same way, similarities in the timbre or sound of instruments from different cultures can cause the sound to be misinterpreted. We also have to consider the multitude of existing types of musical instruments, which makes it difficult to recognize all of them simply by listening.
The problem described does not diminish the importance of this information in the representation of music information. On the contrary, it increases its importance, especially in ethnological research, in which the cultural gap observed (the definition of the "other" in relation to the culture of the observer) is exactly what justifies these studies (Cambria, 2008; Piedade, 2010), as well as in performance-based studies, in which the technique of musical performance, connected to the instrument used, becomes more evident.
Music genres, according to Janotti Jr. (2003, p.37, our translation)10, are not only marked by the "Form or style of a musical text in the strict sense, but rather by the perception of the audience of its 'forms' and 'styles' through performances presupposed by genres". Thus, from a geographical and chronological point of view, one can understand, as stated by Janotti Jr. (2003), that any mapping of a stylistic genre is temporary. The references of a musical genre evolve and adapt to new cultural conceptions and to the market context. In this sense, the analysis of musical genre involves a significant degree of perceptual subjectivity. Thus, given the relevance that the respondents attributed to the metadata "genre", it becomes evident that the librarian needs to know the conceptual universe involved in the analysis of music genres and also needs supporting tools that ensure maximum precision in this definition.
The relevance of information regarding the editorial facet of music, represented by the metadata elements in the category "aspects of production and editing", is justified because there are significant differences between editions of the same work, as stated by Downie (2003) and Cruz (2008). Thus, the editorial information (editor's name, place and date of issue) is connected to the "version" of the work that is being recorded (original, remixed) and to the "type of recording" (live or in studio). Downie (2003) and Cruz (2008) point out the importance of editorial information about scores, particularly information related to music performance, which may appear in different ways when the score is reissued. According to Freire (2010, p.33, our translation)11, "[...] comparing, e.g., different forms of music notation or different versions of the same work can provide an understanding of many different aspects of nature, such as epistemological, socio-cultural and esthetic characteristics of the time, among others".
Research in music can be documental and analytical in nature, and sound recordings and scores are important materials for analysis, together with other documental forms such as audiovisual and textual documents. In this sense, the "collection to which the work belongs" may be an important aspect in locating and analyzing retrieved information, especially with regard to the editor responsible for the collection and the criteria on which the collection is based (composer, chronology, or genre). Research may also be conducted using scores or other forms of notation, which makes it possible to visualize and compare musical structures. In this regard, Freire (2010) points out that editors and performers are seen as mediators who interfere in the audience's understanding of music (the audience can be another performer, a listener, etc.).
Although the metadata "historical period" did not reach the 50% threshold and is therefore not within the minimum set, the metadata "date of creation" and "publication date", which refer to the creation period (exact date, decade or century) and the date of first publication, provide information that allows the user to understand, for example, whether a piece in the classical genre belongs to the classical historical period or was composed in the contemporary period. This analysis helps the user gain an overview of the cultural context prevailing at the time of creation (particularly when accompanied by the metadata "nationality of music", "nationality of performer" and "original data"). For Piedade (2010), studies in ethnomusicology break the music/culture dichotomy, emphasizing that music needs to be examined from a holistic perspective that includes elements of the cultural domain and is not reduced to the sound dimension. Thus, it is possible to infer that recording geographical and chronological information about the musical document becomes relevant insofar as this information is part of the research interests in Music and can therefore be fundamental in the process of documentary survey.
The metadata "music notation" may be relevant when it offers a possibility beyond textual language for the search and retrieval of music. The "file format" is a significant metadata element because it is essential information for the selection of materials by the user, including the potential retrieval of the soundtrack and score of the same song, which makes the study of the music possible (Clarke, 2004).
Note that in the category "emotional and social aspects" no metadata reached the mean of 50% and, moreover, the three metadata elements in this category presented the highest negative averages. The emotional and social value of music is closely related to cultural issues, and the personal experiences of the listener are added to these variables (Moraes, 1986). Therefore, metadata elements that intend to determine certain emotional and social meanings of the musical document raise doubts about the universality of the meaning and the reliability of this information with respect to the music information being described. One can also consider that a study of this type of music feature may be more related to the "creative role of the listener" (Piedade, 2010). In this case, the social and emotional characteristics of music become indicative of cultural or individual particularities rather than information about the music.
Regarding the partial analysis of the results by grouping the areas of research, we found that the areas of Musicology/Ethnomusicology and Music Education indicate a larger number of relevant metadata, totaling 22 metadata elements.
In addition to the previously discussed metadata, the RWF in the questionnaires of the respondents from the areas of Musicology/Ethnomusicology and Music Education was higher than 50% for the following metadata: "complete song lyrics", "original nationality of song", "historical period" and "technical description". Note that there is, in fact, greater interest in information that refers to the historical and social character of the musical document. As for complete lyrics, the verbal information provides rich semantic content for historical, ethnographic and pedagogical analysis. According to Piedade (2010), all prerequisites that organize the musical dimension, including those covered by the linguistic categories, should be considered in studies in Ethnomusicology.
The respondents from the research areas of Composition and Interpretive Practices were more succinct in their set of relevant metadata elements, pointing to only 15 metadata elements. These areas leave out three metadata elements that are part of the minimum set of representation, "version", "editing" and "musical genre", although the last of these achieved a relatively high average score (40.00%). The metadata "song title" and "name of composer" reached 98.33% of relevance, exceeding the average relevance of these metadata in both the minimum set and the more theoretical areas.

Conclusion
The classification of the types of users of music information (those seeking music for pleasure, professionally or as an object of research) seems to find support in the results of this research that confirm the relevance of information about date of creation, publication date, and arrangement (including a description of the instruments) for users who consider music as a professional practice and/or object of research.It is noteworthy, however, that further studies comparing the observations of relevance from lay users with those from expert users would elucidate the categorization of types of users.
Another aspect of the results that is similar to what was found in the literature review is that the non-musical aspects (such as the social and emotional dimension) are probably of greater interest to lay and non-expert users.
In the present research, however, one can notice that the interest of experts is not related to the "musical language" (notation system, rhythmic figures, musical elements such as rhythm, tone, etc.), but rather to the information that identifies the musical document objectively. The partial observations of the subfields of Music made clearer the interdisciplinary expansion of Ethnomusicology and Music Education, which probably explains why users from these areas are interested in the external contextual aspects of music.
It seems that, of all the music characteristics that can be represented, the expert user community gives special importance to information related to the intellectual responsibility of authorship and to other information that can be verbally described, an aspect that does not invalidate the need for familiarity with the musical language.
Therefore, the present study points to three themes for further research that would continue and complement its results. The first is the analysis and mapping of specific fields of knowledge, in this case Music and its subfields, with the aim of understanding the paradigms and mechanisms of development of this science, the main problems pointed out by researchers, and the documents and forms of expression of this community. The second theme, related to the first, is the importance of conceptual studies in the field of Music, in the sense of recognizing the terms and their relationships within this field of knowledge. Thus, the relevance of the metadata elements would be combined with terminological precision.
The third issue is related to the study of the profile of users that needs to consider the specific context of each library, the type of material available, and the system used to describe and search for documents.Within this context, it is also possible to investigate the relationship between the expectations of the users regarding the library services and the possibility of using certain metadata to search for and retrieve information.This aspect indicates a further reflection on the following question: could the non-relevance of certain metadata (such as sound aspects) be related to the low expectations of users regarding library services and databases?
It was found that music information is an issue that still has many promising areas of research, therefore representing a compelling trajectory for the development of Information Science.
The bibliographical and documentary sources used were as follows: a) Proceedings of the International Society of Music Information Retrieval Conference (ISMIR); b) Annual Review of Information Science and Technology (ARIST); c) Encontro Nacional de Pesquisa em Ciência da Informação (ENANCIB, Proceedings of the National Meeting on Research in Information Science); d) World Wide Web Consortium (W3C); e) Joint Information Systems Committee (JISC).

Figure 1. Number of respondents per area of concentration. Source: Research data (2011).

Chart 1. Example of calculation of Relative Weighted Frequency (RWF).
Date of creation. Date of publication.

Chart 2. Description of the metadata elements.
CD title, album, compilation or set of scores.
Title of each song that makes up the recording or score.
Name of person responsible for the intellectual production of original music.
Name of the arranger responsible for adapting the music to the execution context.
Name of the lyric author.
Name of the artist, band, or orchestra that interprets the song.
In case of non-original work, indication of the song title and original composer.
Name of responsible music producer (individual and/or company).
Indication if the work is original, remixed, adapted.
Type of copyright (e.g. creative commons, rights reserved).
Name of the copyright owner.
Name of the individual editor or organization responsible for the editing of the CD, disc, score.
Place, date and edition number in case of reissue of the same work.
Name of the record label.
Specifies the type of sound capture and sound recording (live, in studio, etc.).
Language of the album artwork or edition.
Name of collection to which the work belongs.
Duration time of the set and of individual songs.
Previously published discography by the same performer.
Complete song lyrics.
Song lyrics translated.
Language of song lyrics.

Table 1. Minimal set of metadata for representation of music information.