Super-Convenience for Non-musicians: Querying MP3 and the Semantic Web
Stephan Baumann
German Research Center for AI (DFKI)
Erwin Schrödinger Str.
67608 Kaiserslautern
+49-631-205-3447 Stephan.Baumann@dfki.de
Andreas Klüter
sonicson GmbH
Luxemburger Str. 3
67657 Kaiserslautern
+49-631-303-2800 Andreas.Klueter@sonicson.de
ABSTRACT
Digital music distribution, the success of MP3, and current activities concerning the semantic web of music call for convenient music information retrieval (MIR). In this paper we give an overview of the concepts behind our “super-convenience” approach to MIR. By using natural language as input for human-oriented queries to large-scale music collections, we were able to address the needs of non-musicians. The entire system is applicable to future semantic web services, existing music web sites and mobile devices. Besides the framework, we present a novel idea for incorporating the processing of lyrics based on standard information retrieval methods, i.e. the vector space model.
1. INTRODUCTION
The digital distribution of music is one of the most attractive and challenging topics for musicians and computer scientists these days. Despite the ongoing legal debates, we find a lot of potential for convenient man-machine interfaces to music on the technical side. Our long-term goal is to provide a system architecture that gives as much flexibility as needed to build powerful applications as customized instances of such an approach.
Our goal is to reach maximum convenience of use with a minimum amount of manual indexing of the underlying large-scale data. We subsume this as super-convenience: (1) human-oriented interface paradigm, (2) uniform feature handling and automatic metadata generation, (3) retrieval and recommendations.
Our overall approach is targeted at hybrid processing, ranging from pure surface structure recognition to symbolic inferences among the concepts of the ontologies. As a unique novelty we present the seamless incorporation of lyrics in this approach in order to gain, at the upper end, insight into the perception of moods. We focus on naïve listeners or non-musicians in order to provide applications for the masses.
2. ONTOLOGICAL BACKBONE
The semantic web is on its way to reaching the masses. Real killer applications may be convenient music information retrieval systems for naïve listeners. These contributions can be seen in the tradition of established standards such as MPEG-7. Indeed, authors report successful transformations of MPEG-7 to the RDF(S) standard used for the semantic web [1]. Furthermore, the collaborative effects of a broad user base can be used to make recommendations or to compute the similarity between musical tracks. KANDEM is such an approach, as described by the group at the MIT Media Lab [2]. Answering real-life questions of non-musicians requires real-life knowledge of the music domain to be used within the MIR system. For this purpose we modelled an ontology of the music domain. In our application scenario the terms of the ontology are the concepts of required know-how in the music domain. The relations are of several types; is-a and part-of relations are used quite often. Is-a relations indicate specializations (e.g. acid jazz is-a jazz), while part-of relations denote required parts (e.g. track part-of compilation).
Sharing knowledge about conceptualizations with others is the most relevant aspect of building ontologies: in this way different agents can share access to the semantic web of music. These activities are still in their infancy, and the problems of the status quo are described thoroughly in a recent publication by Pachet [3]. At present our ontology can handle multiple inheritance for the concepts of tracks, albums and artists, which outperforms the standard subsumption hierarchies found on many MP3 sites. Further concepts are the musical properties, which are linked to the automatic audio analysis (e.g. loudness, tempo, timbre). As a novelty we introduced a semantic link contains_lyrics, which is grounded in the ASCII text in our document database.
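The is-a and part-of relations with multiple inheritance described above can be sketched as a small concept graph. This is only an illustration under our own assumptions: the class name `MusicOntology`, its methods, and the second parent "electronic" for acid jazz are hypothetical and are not taken from the ontology built for the system.

```python
from collections import defaultdict


class MusicOntology:
    """Toy concept graph with is-a and part-of relations (illustrative only)."""

    def __init__(self):
        self.is_a = defaultdict(set)     # concept -> direct parent concepts
        self.part_of = defaultdict(set)  # part -> wholes that contain it

    def add_is_a(self, child, parent):
        self.is_a[child].add(parent)

    def add_part_of(self, part, whole):
        self.part_of[part].add(whole)

    def ancestors(self, concept):
        """All concepts reachable via is-a, following every parent."""
        seen = set()
        stack = [concept]
        while stack:
            for parent in self.is_a.get(stack.pop(), ()):
                if parent not in seen:
                    seen.add(parent)
                    stack.append(parent)
        return seen

    def is_kind_of(self, concept, candidate):
        """True if concept equals candidate or specializes it via is-a."""
        return candidate == concept or candidate in self.ancestors(concept)


onto = MusicOntology()
onto.add_is_a("acid jazz", "jazz")
onto.add_is_a("acid jazz", "electronic")  # assumed second parent (multiple inheritance)
onto.add_is_a("jazz", "genre")
onto.add_is_a("electronic", "genre")
onto.add_part_of("track", "compilation")

print(sorted(onto.ancestors("acid jazz")))
```

With more than one parent per concept, a query for either "jazz" or "electronic" would reach the same acid jazz tracks, which a strict tree-shaped genre hierarchy cannot express.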
3. MUSIC DATABASE
The MIR system accesses the musical data from an underlying database. In our first prototype we ripped a private CD collection to MP3 format at 128 kbps. The scope of this dataset is about 1000 tracks covering 60 artists and approx. 50 different genres. The administrative information about artist, title and album was gathered from the CDDB. Unfortunately, the data quality was insufficient for automatic processing. While the inconsistencies in artist, title and volume tags could be removed, the genre information remained useless for automatic processing. Therefore the genre tags have been set manually. For the experiments at hand about 500 lyrics have been added as plain ASCII text.
4. AUDIO ANALYSIS AND NLP
The automatic audio analysis recognizes properties such as loud/quiet and fast/slow, as well as MP3 subband features for the determination of similarity. For the extraction we used the approaches of Pfeiffer [4]. Natural Language Processing (NLP) approaches lie between the two extremes of keyword processing (which disregards word relations and context) and complete understanding. Neither extreme is applicable to the pragmatic processing of natural language music queries. The approach of example-based processing with partial abstraction is especially suited to music search requests (limited domain, high speed requirements) and offers an optimum trade-off between processing speed and graceful reaction to off-scope requests.
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page.
© 2002 IRCAM – Centre Pompidou
Our query interface in front of the NLP component is not confused by typing errors. Additionally, the system is able to connect artist names that sound similar to each other: it still produces results when there is phonetic similarity (such as “fil collins” vs. “phil collins”). Many general-purpose sequence distance methods have been investigated in the past. The phonetic fuzzy match used by our system is based on earlier work at the German Research Center for Artificial Intelligence on this subject. Currently the phonetic fuzzy match is online for a large-scale music information system covering 50,000 different artists, accessible via a web service. An evaluation of recall and precision will be possible by logfile analysis in the future.
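The behaviour described above (tolerance to typing errors and to phonetically similar artist names) can be approximated with a general-purpose sequence distance. The sketch below is not the DFKI phonetic fuzzy match, whose details are not given here. It substitutes a crude, hypothetical phonetic normalisation followed by Levenshtein edit distance; the function names, the rewrite rules and the threshold `max_distance` are all assumptions.

```python
import re


def phonetic_key(name):
    """Crude, hypothetical phonetic normalisation (not the DFKI method)."""
    key = name.lower()
    key = re.sub(r"[^a-z ]", "", key)  # drop punctuation and digits
    for src, dst in (("ph", "f"), ("ck", "k"), ("c", "k"), ("z", "s")):
        key = key.replace(src, dst)
    return re.sub(r"(.)\1+", r"\1", key)  # collapse repeated characters


def edit_distance(a, b):
    """Levenshtein distance computed row by row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,                 # deletion
                            curr[j - 1] + 1,             # insertion
                            prev[j - 1] + (ca != cb)))   # substitution
        prev = curr
    return prev[-1]


def match_artist(query, artists, max_distance=2):
    """Artists whose phonetic key is within max_distance of the query's key."""
    qkey = phonetic_key(query)
    scored = [(edit_distance(qkey, phonetic_key(a)), a) for a in artists]
    return [a for d, a in sorted(scored) if d <= max_distance]


artists = ["Phil Collins", "Cat Stevens", "Madonna", "Sheryl Crow"]
print(match_artist("fil collins", artists))
print(match_artist("madnna", artists))
```

Normalising before measuring distance means that spelling variants with the same sound cost nothing, so the distance budget is left for genuine typing errors.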
5. EXPLORING LYRICS
We started some experiments concerning the similarity of lyrics and its implications for the perception of music similarity. Here we use state-of-the-art document retrieval and classification approaches, which have recently been commercialized and successfully applied to real-world problems. We used both an API to a commercial tool and the text classification workbench and its submodules developed at our institute. We used the Protégé 2000 tool for convenient design of ontologies. The top-level concept lyrics is broken down into a taxonomy of typical topics covered by mainstream music. In the future such a handcrafted topic ontology may be supported by semi-automatic ontology learning through document clustering approaches.
For the current experiments we focused first on the “subsymbolic” level of lyrics. State-of-the-art document retrieval and classification approaches still lack in-depth ontological support. Nevertheless, the basic techniques have a long-standing tradition in information retrieval and can be applied to the domain of lyrics. The tools offer different functionalities. A query in the boolean retrieval model consists of a boolean combination of tests on the occurrence of specific words. For instance, the query (hate or love) and girls tests whether a document contains one of the words hate or love as well as the word girls.
To go beyond boolean retrieval, additional functionality that we integrated is based on the vector space model (VSM). In this model, lyrics as well as queries are represented as vectors. Each dimension of a vector corresponds to a specific term, and the value of a component is the number of times the respective term occurs in the lyrics or query being represented. Standard document retrieval in the VSM is done by defining a similarity measure between vectors. The most frequently used measure is the cosine measure, which computes the cosine of the angle between two vectors. Given a vector representing the query, the documents corresponding to the most similar document vectors are returned as answer documents. Since queries and lyrics are both represented as vectors in the VSM, the similarity between vectors representing just lyrics can be computed as well; in this way we realize the computation of similarity among lyrics. Roughly speaking, those lyrics that share many important words will have a high similarity. A kind of summarization can be performed by computing the most relevant terms. As a further functionality, the similarity between terms can be computed, allowing for automated term expansion and mapping to the taxonomy of topics in the music ontology.
The lyrics collection contains 500 documents. While querying for terms or topics is easy to perform, the more challenging approach is to examine term similarities or even document similarities. For the latter we show some typical results as stereotypes for the most common result cases of the approach in the following. For simplicity we reduced the presentation to the 5 most relevant terms of a given reference song and the top 3 similar songs, obtained by applying standard metrics of the vector space model.
Reference Song 193: Phil Collins - One More Night
Most-relevant terms: forever wait night cos
Similar: P. Collins - You Can't Hurry Love, P. Collins - Inside Out, P. Collins - This Must Be Love

Reference Song 297: Cat Stevens - Father And Son
Most-relevant terms: fault decision marry son settle
Similar: P. Collins - We're Sons Of Our Fathers, Sheryl Crow - No One Said It Would Be Easy, George Michael - Father Figure

Reference Song 112: Lucy Pearl - Dance Tonight
Most-relevant terms: toast spend tonight dance money
Similar: Lucy Pearl - You (feat. Snoop Dogg and Q-Tip), Phil Collins - Please Come Out Tonight, Madonna - Into the Groove

Reference Song 56: Fanta4 - Das Kind Vor
Most-relevant terms: wollten euch sehn entsetzt selben
Similar: Fanta4 - Auf Der Flucht, Freundeskreis - Mit Dir, Fanta4 - Populär

Reference Song 145: Madonna - Paradise
Features: remains pas encore fois moi
Similar: Zero Hits
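The boolean query and the vector space model described in this section can be sketched in a few lines. This is an illustration under stated assumptions, not the commercial tool or the institute's workbench used for the results above. The tokenizer, the function names and the three toy lyrics are invented, and raw term counts are used without any term weighting.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lower-case word tokens (a deliberately simple, assumed tokenizer)."""
    return re.findall(r"[a-z']+", text.lower())


def tf_vector(text):
    """Sparse term-frequency vector: term -> number of occurrences."""
    return Counter(tokenize(text))


def cosine(u, v):
    """Cosine of the angle between two sparse term-frequency vectors."""
    dot = sum(count * v[term] for term, count in u.items() if term in v)
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def most_similar(query_vec, vectors, top_n=3):
    """Rank documents by cosine similarity to query_vec, best first."""
    scored = [(doc_id, cosine(query_vec, vec)) for doc_id, vec in vectors.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_n]


def matches_boolean(text):
    """The example boolean query: (hate or love) and girls."""
    terms = set(tokenize(text))
    return ("hate" in terms or "love" in terms) and "girls" in terms


# Invented toy lyrics, not taken from the 500-document collection.
lyrics = {
    "song_a": "dance tonight dance with me tonight",
    "song_b": "come out tonight and dance",
    "song_c": "father and son talk about the war",
}
vectors = {doc_id: tf_vector(text) for doc_id, text in lyrics.items()}

print(most_similar(tf_vector("dance tonight"), vectors, top_n=2))
```

Because a query and a lyric are both plain term-frequency vectors, the same `most_similar` call serves for retrieval (query against lyrics) and for song-to-song similarity (one lyric against the others).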
6. DISCUSSION AND FUTURE WORK
Non-musicians may query the musical database by remembering parts of the lyrics. In a recent evaluation with 100 naïve listeners we found that this class of queries is used very often in a non-restricted user interface. The integrated approach can handle these queries. Some artists seem to pursue an overall theme on a complete album or even across a set of albums. Similarity metrics for term frequencies deliver appropriate results for these phenomena (see example 193). Some topics can be found across genre boundaries (see example 297), which is indeed the intention of topic-based queries that neglect musical genres. Other topics are more often represented in specific genres (see example 112): dance music often talks about dancing, parties and good vibes. Specific vocabularies are typical of some very specific German hip-hop (see example 56). This is a first impression, which has to be evaluated thoroughly in the future. Large corpora with multilingual entities are obviously necessary to cope with lyrics in different languages. Our initial corpus was too small to cope with languages other than English or German (see example 145).
We still see a lot of potential in this kind of work if it is combined with the theory of affective computing. We could use lyrics and IR techniques to create meaningful terms and topics automatically. The emotional perception of such a topic (war vs. peace) may be coupled with the emotional perception of the audio surface structure (minor vs. major). In this way the concept of moods [5] could be provided automatically for end-user queries.
We presented the concept of super-convenience in this work for the first time. Our framework was established through cross-fertilization from different research disciplines, mainly in the area of AI. NLP, IR and ontologies are the most prominent ones incorporated in this work to get close to our initial goal.
7. REFERENCES
[1] Hunter J., Adding Multimedia to the Semantic Web - Building an MPEG-7 Ontology, in Int. SWWS, Stanford, July 30 - August 1, 2001
[2] Whitman B., KANDEM: Community Metadata for Informed Music Retrieval, MIT MediaLab, May 2002, Website <dia.mit.edu/~bwhitman/kandem/>
[3] Pachet F., A Taxonomy of Musical Genre, Proc. of ISMIR 2001, Paris, France, 2001
[4] Pfeiffer S., Vincent T., Formalisation of MPEG-1 compressed domain audio features, Technical Report Number 01/196, CSIRO, Australia, 2001
[5] Huron D., Perception and Musical Applications in Music Information Retrieval, in Proc. of the ISMIR 2000, Plymouth, Massachusetts, 2000