Thanks a lot Matej for this in-depth review. I will address all the issues you pointed out and make another draft.
Reagards, On Tue, Apr 30, 2013 at 2:11 AM, Matěj Laitl <ma...@laitl.cz> wrote: > On 29. 4. 2013 vedant agarwala wrote: > > In the meantime, I have updated my project proposal as shown below. Your > > inputs are is still as valuable so please comment :) > > Hi, I'm very sorry to reply this late, I haven't unfortunately found time > earlier, neither did any other developer apparently. > > Alberto, perhaps you'd be able to review this proposal (unless you want to > submit a similar one) too? (as you've lately touched the code > significantly) > > > Proposal Title: Improving and modularizing tag guessing > > > > Motivation for Proposal / Goal: Currently, Amarok “guesses” tags of music > > files through the MusicBrainz web service using either existing tags or > > MusicIP (PUID) audio fingerprints. This method, however, is somewhat > > outdated. MusicBrainz is phasing out MusicIP in favour of AcousID and > other > > methods of guessing tags have emerged. The aim of this project is to > create > > an abstract base class for tag guessing that can be suitably inherited by > > other classes that aim to guess tags. > > > > Getting tags from MusicBrainz is hard coded into the Amarok source code. > > Hence tag guessing is not modularized; this makes it hard to add new Tag > > Getters (like, for example, Last.fm). Moreover, MusicBrainz code is not > > very well documented. So, it'll be very time consuming for a programmer > > trying to add features even just to MusicBrainz since they will have to > > read and understand the code. As of now, every track’s tags can only be > > changed one by one that also after waiting a few seconds for each. > > The last sentence is not true, you can change then all in batch using the > Edit > Tags dialog or throught existing MusicBrainz support. See also how Alberto > Villa > > > More often than not, the tags returned by MusicBrainz is incorrect or > tags > > of different kinds are returned. > > This might get better thanks to recent Alberto's patch. I view MusicBrainz > as > a good way for slight corrections to existing tags, but is often provides > very > bad results for files with no tags at all (which is expected). > > > Say when a user is listening to a particular song he/she realizes that > some > > tags to the song are missing. He/she will try to correct this by himself > > and will be very happy to see a “Get tags from MusicBrainz” button. > Rather > > than updating the tags himself he/she clicks on the button and selects > the > > correct tags. But this is where it all ends. The user is only saved from > > the trouble of typing. He/she still has to choose the correct tags. Many > > times the user will not find any tags at all or they will be guessed > wrong > > entirely. > > I don't think that this project, even is successful, will make the > situation > above significantly better. > > > Moreover, there is no way of updating/correcting tags of multiple files. > > This is entirely false. > > > If > > a user adds some new songs to the local collection with poorly formatted > > tags. It is not uncommon for tracks to contain a track name in the format > > “trackname - albumname - artistname” with the other details blank. Many > > tracks contain names of websites (like downloaded from) at the end of > one > > or more tags of the file. A user will want these tags to be fixed but it > > will be a real pain to do this one by one. If he/she clicks on the “guess > > tags from file” he/she is greeted by another dialog that takes a long > while > > to understand. Its likely that he/she will just close this dialog. Even > if > > he/she gets tags from MusicBrainz (assuming the fetched tags are > correct), > > getting them from each file one by one will be too much of a task for > him. > > This paragraph is very misleading - such problem doesn't exist in current > Amarok. > > > Implementation Details: > > - > > Creating a generic framework for tag getters: I will replace the > > existing musicbrainz directory by a taggetter directory. It will > contain > > Controller and Provider classes (similar to StatSyncing) under the > > namespace TagGetter. The settings UI and related classes will also be > > present in this directory. I will change the musicbrainzTagger() in > the > > TagDialog class to tagGetterController(). Once this is called, the > > TagGetter::Controller (a singleton class) will be invoked and process > of > > tag guessing will begin. The Controller will have a list (a QList of > > ProviderPointers) of available Providers and it will create objects of > > each of them. Each of those objects will implementing the abstract base > > class Provider. As required, the Controller will be calling the methods > > (and connecting signals to slots) of the Provider on the main thread and > > via polymorphism appropriate methods of each Provider will be called. > > Right. > > > Each TagGetter::Provider will have its own directory under the > TagGetter > > directory. They will contain classes and one that will implement the > > Provider class. Data will be shared among the classes (in the > TagGetter > > namespace) probably via QExplicitlySharedDataPointer. > > what data? > > > Each provider will > > be running in parallel. As the different Providers do their work, results > > and progress of the work will available to the Controller via the signals > > connected to the slots of the Controller. Establishing a connection to > the > > web service server, converting the music into data sendable over the > > internet, fetching results and the error handling will all be part of > each > > Provider’s work. Separate threads won’t be required because the network > > operations will be handled by Qt (QNetworkRequest and QNetworkReply) and > > they are the most time consuming. If something else takes time to run > (like > > decoding the audio and creating the fingerprint) then the provider will > > have to manage this on another thread (probably by using the > > ThreadWeaver::Job). As part of the contract of each provider, each of > its > > methods must return quickly. This is important because the Controller > will > > guarantee calling these methods on the main thread. > > Right. > > > The Controller will provide the Providers with track details (through > > the track pointer)such as track data (name, album, artist, length, > file > > location etc.). The framework will contain a TagGetterMeta class. > Objects > > of this class will store a TrackPtr to identify the the track in the > > Collection and metadata (name, artist, album, etc.) returned by the > > providers. Providers will return a QList of objects of this class. The > > metadata that has been updated will be non-zero. > > Oh, I'm getting a bit lost in this, but I hope I get the point, which seems > okay. > > > Providers should not store > > this data but they should keep the data that authenticates the provider > > (probably an API key that is received on authentication) as long as the > > objects of themselves exist since many lists of tracks can sent to be the > > Provider in succession, either by the user or programmatically (refer to > > last point of “implementation details”). > > > A AbstractSettings class will also > > be present so that users can customize the settings of each Provider and > > each Provider can store a small amount of data in the data using the > > KGlobal::config(). This will also be beneficial for the programmer(s) who > > add more Providers. It will also keep a consistent look and feel of all > the > > TagGetterProviders. > > Hmmm, the original intent was to make the providers rather dumb and almost > invisible to the user. What do you think would be need to be configurable > by > the user? > > > Rewriting MusicBrainz tag getter: I will rewrite MusicBrainz code > > according to the above framework. All the existing classes will > shifted > > to the new musicbrainz subdirectory inside the taggetter directory. The > > MusicBrainzFinder will become MusicBrainzProvider, implementing the > > abstract Provider class. > > Right, but I wouldn't say "rewrite", but rather "adapt to be an > implementation > of the new abstract class". > > > Other code will also be re-written but its work > > will mainly remain the same, except for one major difference. Currently, > > libofa is used to create the MusicIP (PUID) that is sent to MusicBrainz. > > MusicBrainz is phasing out PUIDs in favour of AcustIDs. > > > > Hence, the MusicBrainzProvide will use > > Chromaprint<http://acoustid.org/chromaprint>to compute AcustIDs rather > > than the MusicIP generated by libofa. > > Oh beware! The musizbrainz and musicip are rather independent and were > intended to be factored out into 2 separate providers! (one changing to > acoustid) > > > Codecs will identified using ffmpeg. > > What/why? > > > Phonon will be used for decoding while Chromaprint will be used to > generate > > the AcoustID. > > Have you checked this is possible? What Phonon methods will you use? > > > Chromaprint uses the > > standard C > > library[*]< > https://bitbucket.org/acoustid/chromaprint/src/master/src/chroma > > print.h>so using this won’t be a problem. > > Eh? What is a connection between a project using a standard C library and > feasibility of its usage? > > > I will make Chromaprint a part of > > the optional tag guessing package (replacing libofa) and update the > > cmakelists, making Chromaprint as a requirement for MusicBrainz tag > > guessing. Hence, the MusicBrainzProvider will be available only if > > Chromaprint has been installed. A HAS_CHROMAPRINT macro will keep track > of > > this. The provider will be “available” only if the macro is defined. Just > > as now, the “The::networkAccessManager()” > > will handle the connection to the MusicBrainz server. The audio > > fingerprint generated by Chromaprint will be sent to MusicBrainz via a > > QNetoworkRequest and a slot will be connected to listen for its reply. > > The above is obvious and it is perhaps redundant to mention it. > > > Creating Last.fm tag getter: Create the Last.fm tag getter. First, I > > will add the tag based service and then, if schedule permits, I can > add > > the fingerprint based service. Hence, two LastFm Providers will be > created. > > Implementation will be very easy thanks to > > liblastfm<https://github.com/lastfm/liblastfm>-> A Qt C++ library > for the Last.fm webservices. Both the Providers will > > share the authentication data and reply so that network requests aren’t > > needlessly duplicated. They will be in the same directory but will have > two > > available providers. The network replies by Last.fm is in XML format. > > Hence, the QtXML module will parse the replies accordingly. > > In fact, liblastfm does it. (and we already use it in Amarok) > > > Now the > > similarities end. The network requests and replies will have to handled > > differently by the different providers, since one will be tag based and > the > > other fingerprint based. > > > > For the LastfmTagProvider the track/artist name has to sent to the > > Last.fm webservice API. The webservice will then return the other > track > > metadata. After being correctly parsed it will signal the required > slot > > so that the Controller receives the tracks whose metadata has been > fetched. > > For the LastFmFingerprintProvider we will first have to generate a > > “fingerprint” of the track using the Last Fm > > Fingerprinter<https://github.com/lastfm/Fingerprinter>. > > This audio fingerprint will then be sent to the Last fm web service > via a > > call to the > > track.getFingerprintMetadata< > http://www.last.fm/api/show/track.getFingerpri > > ntMetadata>. Then the web service will return the track metadata along > with > > a “confidence rating” to specify how accurately the track has been > > identified. All this data will be parsed by the QtXML module and sent to > > the Controller in a manner similar to above. > > ^^ the above could be shortened, many things mentioned are rather obvious. > > > A better GUI: A better GUI is required. Though I will need to discuss > > with the Amarok developers what the new GUI should like so that it > > follows the general look and feel of Amarok, I am proving my > > interpretation as follows: I will replace the existing "Get tags from > > MusicBrainz" button by "Get tags from the following services: ". It will > be > > followed by a drop-down list with check-boxes for each item containing > > names of the services. > > > > [image: http://troll.ws/image/458a61bc#.UX0fS0GH7gx] > > I don't like it. We should just query all enabled services, users don't > want > to think about it every time. If some service doesn't work well for them, > they > should be able to disable it in Plugins section. > > > In “Amarok Settings” there will be a tag guessing section and every > > TagGetterProvider can create a tab to change settings of the > individual > > provider. > > I don't like this either. Every setting just bothers the user, the > providers > should just work without any configuration. > > > As getting tags from the individual services will run in > > parallel, results will be displayed as soon as they are fetched via > the > > TagGetter::Controller. This list will persist in memory so that if the > > user closes the dialog and then reopens it, tags won’t have to be fetched > > again (until Amarok restart). > > Oh please not, this is very convoluted and may lead to very unexpected > memory > usage. > > > The user can choose the best tags from the > > generated list. If tags from multiple sources are the same they should be > > clubbed and appear higher up the list compared to the others. > > clubbed: yes; higher in the lists: yes, but I propose using the max(a, b) > rather than a + b function to combine the scores. > > > Tags from > > different sources have to cross-checking for this similarity. > > I don't understand this sentence. > > > Each set of > > tags that have been fetched will be accompanied by a “confidence rating”- > > between 0 and 1, and provided by the respective Provider. For similar > tags > > the ratings will be added (or maybe similar way of increasing the rating > > that can be found out by testing). The list will be continually sorted > > according to the rating of each tag. The list will be a bullet list with > > colors depicting the “confidence rating”- 0 to 1 = red to green (as it is > > now). > > Well, this is largely what the current music brainz dialog is, isn't it? > > > All user visible strings (buttons and tool-tips) will be written > > using i18n(). i18nc() will be used wherever requiered- like, for a label > > stating "number of track(s)". > > This is an obvious requirement, no need to mention it. > > > A GUI for getting tags of multiple songs simultaneously: Its a > feature > > by which tags for multiple tracks (if not the entire local collection) > > can be changed. Getting tags requires many user clicks as of now, > > especially for multiple tracks. To solve this, a "Tag Getter Wizard" will > > select the best tags for every track. > > This is already working, please try to understand *actual* Amarok features > more! > > > Best ones are written to files > > without user approval but in a reversible way. Tags will be saved to > > database and can be reversed/changed anytime (even after amarok restart) > in > > both ways- 1) for individual tracks 2) through a list showing the changed > > tags. > > Oh, please don't. This would be really convoluted to implement and use. > > > A user can start the wizard select the playlists and/or tacks whose > tags > > need to be guesses, then make it run in the background or make it run > in > > the foreground showing the details of each track whose tags are being > > changed (and missing ones added). > > Hmm, no, please don't make this part of the project (+ it is already > working > to some extent). > > > Or, he/she can set it to run on the > > entire local collection and/or as new tracks are added to the local > > collection. > > This doesn't look like a good idea to me. > > > I will make decoupled and modularized code so that support > > for this wizard can easily be extended to other collections. > > After the > > wizard has completed its operation a list will show the user the > updated > > tags each a button that will reverse the the changes. Not all fetched > > tags will be updated. Though I can achieve the right balance only by > user > > testing, as of now I set the default as "tags with confidence rating >1 > > will be updated". This can easily be changed by the user in the settings > > page. Again, I will write every code according to the KDE > > internationalization guildlines. The exact look of this wizard will be > > decided after discussion with the Amarok team but a rough idea of > > features is depicted below:[image: > > http://troll.ws/image/438a77c9#.UX0fXEGH7gx] > > No, please don't do this. I'm already pretty fine this is current > "MusicBrainz > Tags Dialog" - especially when recent Alberto's patch is applied. > > > <--- GSoC commences---> > > > > week 3: Make the directories and write the abstract classes (with > > documentation). > > > > week 4: Polish, make cmakelists for, write make tests and compile the > > written abstract classes. > > > > July- > > > > week 1: Re-write the MusicBrainz tag getter according to the framework > with > > better documentation > > > > week 2: Add the features (like using AcoustID) to this MusicBrainz tag > > getter > > > > week 3: Compile and run the new MusicBrainz (displaying results in the > old > > UI itself) > > > > week 4: Write the new GUI code. Run MusicBrainz tag getter in this > fashion > > > > <--- Mid term ---> > > > > August- > > > > week 1: Write the Last.fm tag getter (as many features as possible) > > > > week 2: Finish writing and test the Last.fm tag getter > > > > week 3: Test the Last.fm and MusicBrainz tag getter together in the new > GUI > > > > week 4: Make the tag getter wizard > > > > September- > > > > week 1: Test the new wizard. Write make tests to make sure that new tag > > getters follow the set framework > > > > week 2: Improve documentation. Fix bugs that have been discovered over > the > > weeks > > > > <--- suggested “pens down” ---> > > > > week 3: Fix Bugs, streamline and optimize the code. > > > > <--- firm “pens down” ---> > > > > week 4: Do some code cleaning and fix bugs so that my project can be > pushed > > into the master branch. > > > > October- > > > > week 1: Improve documentation. Fix bugs. > > Vedant, I really think that implementing all the stuff you've outlined is > unachievable, at least not in a code quality we'd expect. Please take some > time to significantly reduce the amount of work you plan to do. Many of the > ideas you have on top of the idea (published by devs) simply don't align > with > Amarok goals and/or are based on misunderstanding of how current tag > guessing > works. > > > Other Obligations: I have no other obligations. I can easily spent about > 50 > > hours a week: around 8 hrs a day in slots of 2 to 3 hrs, one each in the > > afternoon (10am-1pm), evening (4pm-6pm) and at night (10pm-1am), with > some > > hours less on weekends. > > Hehe, I don't think you need to be this detailed. :-) > > > Summer vacations will be going on till mid July. > > Even after college starts, I will be following the same schedule for GSoC > > coding. It is easily possible because very few classes are held in the > > beginning of the semester and even if they do clash with my coding > > time-period, the college teachers excuse GSoC students' absence from > class. > > Since I will have no other coursework obligations, I can continue to code > > 50 hours a week (with a similar schedule) even up till September. > > > > About Me: I am currently in my second undergraduate year in National > > Institute of Technology, Durgapur, India, studying Computer Science and > > Engineering. I have experience coding experience with C/C++, Java > > (including Android and making GUI using Java swing) and web services. I > > have submitted 3 patches (Junior Jobs, one each for > > Rekonq[1]<https://git.reviewboard.kde.org/r/107662/>and Amarok > > [2] <https://git.reviewboard.kde.org/r/109283/> as well as a improved > > formatting patch for Amarok[3] < > https://git.reviewboard.kde.org/r/109295/>) > > and 2 more are under review (both Junior Jobs for > > Amarok[4]<https://git.reviewboard.kde.org/r/110101/> > > [5] <https://git.reviewboard.kde.org/r/110082/>). > > > > I love coding for open source. I’m sure working with a mentor who is > > virtually present won’t be a problem. I have interacted (mainly with > Matej > > a.k.a. Strohel on IRC) over IRC, the amarok mailing list, reviewboard and > > also the KDE bug tracking system. During the work period code can easily > be > > shared via GitHub. > > We prefer personal scratch git repositories on KDE infrastructure for this. > > > If it is possible, meeting a mentor face to face will > > obviously be much more helpful. I have worked with my college seniors in > > this fashion who have introduced me to Linux and Open-Source. Countless > > times I have been to their rooms for advice and to solve my problems. > > > > After GSoC I plan to become an active developer for Amarok and also for > > other projects of KDE. > > As stated above, this looks to me like a way too much work to do. Perhaps > you > are trying to make the proposal more interesting, but I fear that the > effect is > the opposite - we'd actually appreciate if you wanted to implement much > less > things (ideally just what the initial idea was about), but make it > interesting > by level of insight, research and thought that has been put to it. > > Also please study/use the current tag guessing in Amarok more (and do have > a > look at Alberto's patch) to avoid a situation where you describe lack of > Amarok feature while it is not the case. > > Regards, > Matěj > _______________________________________________ > Amarok-devel mailing list > Amarok-devel@kde.org > https://mail.kde.org/mailman/listinfo/amarok-devel >
_______________________________________________ Amarok-devel mailing list Amarok-devel@kde.org https://mail.kde.org/mailman/listinfo/amarok-devel