Re: baloo metadata scrapers
Hey Martin

Baloo is only designed for searching for files. It is not a meta-data store. If the data you're extracting is present in the file, then it could potentially be added to Baloo.

The important thing to look at is how to present the data, and how the data helps in searching. For example, providing the name of an actor in a movie could make the movie pop up when searching for that name. This is great; however, the user needs to be given a clear explanation of why that file is in the search results.

Our main search interface, krunner, isn't designed to provide this additional context. Therefore, adding actor information is not something I would want to ship.

--
Vishesh Handa

On Fri, Sep 30, 2016 at 9:47 PM, Martin Bednar wrote:
> Hi,
>
> having very fond memories of the potential I saw in the tvshows://
> experimental kioslave (from the days of nepomuk), I was left wondering
> whether something like that would be possible with baloo.
>
> I currently have a working scraper (currently for movies, tvshows should be
> easy to tack on) and am wondering if there is a way to use baloo to store the
> metadata. I read a bit about baloo on api.kde.org, but am not able to find any
> indication how (if) such a project would be feasible.
>
> Thanks,
>
> Martin.
>
> --
> Why top posting?
> Why are forums read from top to bottom?
Re: baloo metadata scrapers
Hi,

thanks for answering.

On Friday, 7 October 2016 10:44:02 CEST Vishesh Handa wrote:
> Hey Martin
>
> Baloo is only designed for searching for files. It is not a meta-data
> store. If the data you're extracting is present in the file, then it
> could potentially be added to Baloo.

It is not, I get that data from an external source (currently themoviedb).

> The important thing to look at is how to present the data, and how
> does the data help in searching. For example - providing the name of
> an actor in a movie could make the movie popup when searching for that
> name. This is great, however, the user needs to be given a clear
> explanation of why that file is in the search results.

Agreed, the dream actually is creating something along the lines of the multimedia library described on the nepomukport page[1] for potential use in Plasma Media Center.

I found https://community.kde.org/Baloo/Architecture since my email, and am now thinking of creating a new data/search store plugin.

> Our main search interface, krunner, isn't designed to provide this
> additional context. Therefore, adding actor information, is not
> something I would want to ship.

Understood.

I'd like to ask for a few pointers though:
Does documentation on how to write new search/store plugins exist?
From what I understood, I'll have to take care of storing the data myself. I see that baloo uses LMDB, Xapian and SQLite. Any pointers as to which db system (or a completely different one) is well adapted to this kind of use? At the moment I tend to lean towards LMDB, but have no real technical argument for it...

Thanks,

Martin

[1] https://community.kde.org/Baloo/NepomukPort#Bangarang
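To make the "storing the data myself" idea concrete, here is a minimal sketch of a side store for scraped metadata, kept entirely outside Baloo. It uses SQLite only because Python ships a binding for it; this is not a recommendation over LMDB, and nothing here is a Baloo API. The table names, column names and the functions `open_store`, `add_movie` and `find_by_actor` are all hypothetical. Each search result carries a `matched` field, so a front end could show why a file appeared in the results.

```python
import sqlite3


def open_store(path=":memory:"):
    """Open (or create) a small side database for scraped metadata.

    Hypothetical schema: one row per movie file, one row per (file, actor).
    """
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS movie ("
        " file_path TEXT PRIMARY KEY,"
        " title TEXT NOT NULL)"
    )
    conn.execute(
        "CREATE TABLE IF NOT EXISTS cast_member ("
        " file_path TEXT NOT NULL,"
        " actor TEXT NOT NULL,"
        " PRIMARY KEY (file_path, actor))"
    )
    return conn


def add_movie(conn, file_path, title, actors):
    """Insert or refresh the scraped data for one file in a single transaction."""
    with conn:
        conn.execute(
            "INSERT OR REPLACE INTO movie VALUES (?, ?)", (file_path, title)
        )
        conn.execute("DELETE FROM cast_member WHERE file_path = ?", (file_path,))
        conn.executemany(
            "INSERT OR IGNORE INTO cast_member VALUES (?, ?)",
            [(file_path, actor) for actor in actors],
        )


def find_by_actor(conn, name):
    """Return matching files together with the reason each one matched."""
    rows = conn.execute(
        "SELECT m.file_path, m.title, c.actor"
        " FROM movie m JOIN cast_member c ON c.file_path = m.file_path"
        " WHERE c.actor LIKE ?"
        " ORDER BY m.title",
        ("%" + name + "%",),
    )
    return [
        {"file": file_path, "title": title, "matched": "actor: " + actor}
        for file_path, title, actor in rows
    ]
```

A caller would scrape a title and cast list from the external source, pass them to `add_movie`, and later call `find_by_actor(conn, "some name")` to get the files plus the matching actor for display.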
Re: baloo metadata scrapers
On Oct 7, 2016 12:07, "Martin Bednar" wrote:
>
> Hi,
>
> thanks for answering.
>
> On Friday, 7 October 2016 10:44:02 CEST Vishesh Handa wrote:
> > Hey Martin
> >
> > Baloo is only designed for searching for files. It is not a meta-data
> > store. If the data you're extracting is present in the file, then it
> > could potentially be added to Baloo.
>
> It is not, I get that data from an external source (currently themoviedb).
>
> > The important thing to look at is how to present the data, and how
> > does the data help in searching. For example - providing the name of
> > an actor in a movie could make the movie popup when searching for that
> > name. This is great, however, the user needs to be given a clear
> > explanation of why that file is in the search results.
>
> Agreed, the dream actually is creating something along the lines of the
> multimedia library described on the nepomukport page[1] for potential use in
> Plasma Media Center.
>
> I found https://community.kde.org/Baloo/Architecture since my email, and am
> now thinking of creating a new data/search store plugin.
>

Ignore this. Actually, ignore all docs you find online on any wiki. They are too old. The git repo is the only thing that is valid.

> > Our main search interface, krunner, isn't designed to provide this
> > additional context. Therefore, adding actor information, is not
> > something I would want to ship.
>
> Understood.
>
> I'd like to ask for a few pointers though :
> Does documentation on how to write new search/store plugins exist?
> From what I understood, I'll have to take care of storing the data myself. I
> see that baloo uses LMDB, Xapian and SQLite. Any pointers as to which db
> system (or a completely different one) is well adapted to this kind of use? At
> the moment I tend to lean towards LMDB, but have no real technical argument
> for it...

Basically, I would recommend looking at this from the presentation and usability side first. Technology later.

Do you want some kind of kioslave? Through Krunner? Something else? What kind of workflow, and what kind of users? Once this is clearer, you can figure out the best way to go about what you want.

>
> Thanks,
>
> Martin
>
> [1] https://community.kde.org/Baloo/NepomukPort#Bangarang
Scrap Baloo Thread Feedback
Hey guys

I was told there is a thread about scrapping Baloo. All Baloo discussion used to happen on kde-devel and that's where the review requests go. It's the only reason I am still subscribed to kde-devel.

I must say, the thread is overall quite disappointing. There seems to be no scientific or rational, cost-based analysis of this. How about a list of requirements and priorities is drawn up, and then possible solutions are evaluated against it?

Right now, random requirements such as NFS and 32-bit systems are coming up. Are these really that important? I specifically designed Baloo to not care about both network mounts and 32-bit systems. Yes, Baloo has bugs and it won't handle inodes larger than 32 bits. These things, like all others, can be fixed. It's really a question of what is important. Let's not target the outliers. Many of these decisions were taken deliberately.

How about taking requirements such as resource consumption, ease of integration, and search speed into consideration? Come on guys. We're engineers over here.

(If the discussion continues on kde-frameworks-devel, I probably won't see it)

--
Vishesh Handa
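To illustrate why an inode wider than 32 bits can be a problem for an indexer, here is a small sketch. It assumes a hypothetical document id that packs a 32-bit device id and a 32-bit inode into one 64-bit integer; the thread does not describe Baloo's actual id layout, so `pack_id` and `fits_in_32_bits` are illustrative names only. Under that assumption, two different files on the same device can end up with the same id once the inode is truncated.

```python
MASK32 = 0xFFFFFFFF


def pack_id(device_id: int, inode: int) -> int:
    """Pack device id and inode into a 64-bit id, keeping only 32 bits of each.

    Hypothetical scheme for illustration; any inode bits above bit 31 are lost.
    """
    return ((inode & MASK32) << 32) | (device_id & MASK32)


def fits_in_32_bits(inode: int) -> bool:
    """True if the inode survives the packing above without truncation."""
    return 0 <= inode <= MASK32


if __name__ == "__main__":
    small = 42
    large = 42 + 2**32  # a distinct inode that differs only above bit 31
    print(fits_in_32_bits(small), fits_in_32_bits(large))
    print(pack_id(1, small) == pack_id(1, large))
```

A check like `fits_in_32_bits` would let such a scheme skip or report a file it cannot represent, instead of silently storing it under another file's id.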
Re: Scrap Baloo Thread Feedback
On Friday, 7 October 2016 17:24:26 CEST Vishesh Handa wrote:
> Hey guys
>
> I was told there is a thread about scrapping Baloo. All Baloo
> discussion used to happen on kde-devel and that's where the review
> requests go. It's the only reason I am still subscribed to kde-devel.

Heya,

Baloo is a framework nowadays, therefore it totally makes sense to have the discussion on kde-frameworks-devel.

There's been tons of discussion around Baloo on kde-frameworks-devel already. kde-frameworks-devel is also where the CI messages for the baloo repo go.

It likely makes sense for you to subscribe, no?

Cheers,
Kevin

> (snip)

--
Kevin Funk | kf...@kde.org | http://kfunk.org
Re: Scrap Baloo Thread Feedback
On Fri, Oct 7, 2016 at 5:57 PM, Kevin Funk wrote:
> On Friday, 7 October 2016 17:24:26 CEST Vishesh Handa wrote:
>> Hey guys
>>
>> I was told there is a thread about scrapping Baloo. All Baloo
>> discussion used to happen on kde-devel and that's where the review
>> requests go. It's the only reason I am still subscribed to kde-devel.
>
> Heya,
>
> Baloo is a framework nowadays, therefore it totally makes sense to have the
> discussion on kde-frameworks-devel.
>
> There's been tons of discussion around Baloo on kde-frameworks-devel already.
> kde-frameworks-devel is also where the CI messages for the baloo repo go to.
>
> It likely makes sense for you to subscribe, no?
>

I don't understand why all framework discussions must happen on the same list. It just adds to a crazy amount of noise, which one then needs to parse through.

> Cheers,
> Kevin
>
>> (snip)
>
> --
> Kevin Funk | kf...@kde.org | http://kfunk.org
Re: Scrap Baloo Thread Feedback
On Fri, Oct 7, 2016 at 6:14 PM, Christoph Cullmann wrote:
>>
>> I don't understand why all framework discussions must happen on the
>> same list. It just adds to a crazy amount of noise, which one then
>> needs to parse through.
>
> If you would have baloo-devel I could understand that point,
> but not with some other generic mailing list like kde-devel which
> has the same amount of noise and is not even dedicated to 'frameworks'
> or 'baloo'.

If you guys plan to use frameworks-devel, then please change the review requests.

It was just too much noise for me, and I found the noise/signal ratio way lower in kde-devel. Baloo-devel was specifically not chosen, as it would just be another silo in KDE. Nepomuk used to suffer from that.

--
Vishesh Handa
Re: Scrap Baloo Thread Feedback
On Fri, Oct 7, 2016 at 6:01 PM, Vishesh Handa wrote:
> On Fri, Oct 7, 2016 at 5:57 PM, Kevin Funk wrote:
>> On Friday, 7 October 2016 17:24:26 CEST Vishesh Handa wrote:
>>> Hey guys
>>>
>>> I was told there is a thread about scrapping Baloo. All Baloo
>>> discussion used to happen on kde-devel and that's where the review
>>> requests go. It's the only reason I am still subscribed to kde-devel.
>>
>> Heya,
>>
>> Baloo is a framework nowadays, therefore it totally makes sense to have the
>> discussion on kde-frameworks-devel.
>>
>> There's been tons of discussion around Baloo on kde-frameworks-devel already.
>> kde-frameworks-devel is also where the CI messages for the baloo repo go to.
>>
>> It likely makes sense for you to subscribe, no?
>>
>
> I don't understand why all framework discussions must happen on the
> same list. It just adds to a crazy amount of noise, which one then
> needs to parse through.

Arguing that it should be elsewhere because you'd like to ignore the rest of the traffic in kde-frameworks doesn't sound very constructive, especially considering that this is the "noise" that actually improves the frameworks.

Maybe you can configure your e-mail client differently so we can focus on the issue at hand?

Aleix
Re: Scrap Baloo Thread Feedback
On Fri, Oct 7, 2016 at 6:20 PM, Aleix Pol wrote:
>>
>> I don't understand why all framework discussions must happen on the
>> same list. It just adds to a crazy amount of noise, which one then
>> needs to parse through.
>
> Arguing that it should be elsewhere because you'd like to ignore the
> rest of the traffic in kde-frameworks doesn't sound very constructive,
> especially considering how they're the "noise" that actually improves
> the frameworks.
>
> Maybe you can better configure your e-mail client differently so we
> can focus on the issue at matter?

This is not about how it should be. I'm explaining why it was chosen to be somewhere else. This decision can be changed.

Frameworks collectively may or may not improve by having everything in one place. Let's not treat it as an axiom.

An analogy could be that we get commit emails, but we get to choose which projects we are interested in. We don't make everyone subscribe to kde-commits, and then put their own complex filters on top.

--
Vishesh Handa
Re: Scrap Baloo Thread Feedback
On Friday, 7 October 2016 18:27:30 CEST Vishesh Handa wrote:
> On Fri, Oct 7, 2016 at 6:20 PM, Aleix Pol wrote:
> >> I don't understand why all framework discussions must happen on the
> >> same list. It just adds to a crazy amount of noise, which one then
> >> needs to parse through.
> >
> > Arguing that it should be elsewhere because you'd like to ignore the
> > rest of the traffic in kde-frameworks doesn't sound very constructive,
> > especially considering how they're the "noise" that actually improves
> > the frameworks.
> >
> > Maybe you can better configure your e-mail client differently so we
> > can focus on the issue at matter?
>
> This is not about how it should be. I'm informing them why it was
> chosen to be somewhere else. This decision can be changed.
>
> Frameworks collectively may or may not improve by having everything in
> one place. Lets not treat it as a axiom.
>
> An analogy could be that we get commit emails, but we get to choose
> which projects we are interested in. We don't make everyone subscribe
> to kde-commits, and then put their own complex filters on top.

We are moving away from the main point of the discussion, but I'd like to point out that the support for topics in mailman[1] covers this use case. I'm not sure how to make reviewboard or phabricator add a topic to the notification email, though.

(Also, we have not made a final decision on whether we should go back to kde-core-devel for Frameworks-related topics.)

[1] http://www.list.org/mailman-member/node29.html

Ciao
--
Luigi
Re: Scrap Baloo Thread Feedback
Hi,

> Hey guys
>
> I was told there is a thread about scrapping Baloo. All Baloo
> discussion used to happen on kde-devel and that's where the review
> requests go. It's the only reason I am still subscribed to kde-devel.

That is nice, but given that baloo is a framework, that was unexpected, sorry.

>
> I must say, the thread is overall quite disappointing. There seems to
> be no scientific or rational, cost-based analysis of this. How about a
> list of requirements and priorities is drawn up, and then possible
> solutions are evaluated against it?

Actually, the bugs.kde.org page tells you the facts: the number of bugs has been increasing constantly for more than a year. The thread lists some other facts about what is wrong at the moment and should be fixed. And replacing baloo with something else, based for example on tracker, was just one proposal. Another was to fix baloo and port it to another database.

>
> Right now, random requirements such as NFS and 32-bit systems are
> coming up. Are these really that important? I specifically designed
> Baloo to not care about both network mounts and 32-bit systems. Yes,
> Baloo has bugs and it won't handle inodes larger than 32 bits. These
> things, like all others, can be fixed. It's really a question of what is
> important. Let's not target the outliers. Many of these decisions were
> taken deliberately.

Those are not random requirements, sorry; you could just as well call them random restrictions. That is not very productive, is it?

1) 32-bit systems are still out there. If it is a design decision NOT to support them, that is OK, but it is bad for Plasma: it means no official support for 32-bit systems, and baloo is IMHO the only framework with such a requirement. And I don't see that we have told any distro not to compile it for 32-bit.

2) No NFS: OK, fair game, but then it should check for that and disable itself completely if the $HOME where the db is stored is on NFS. I can live with that, too, but not with the current "we crash randomly" behavior.
=> That is a user experience we don't want, is it?

3) Inodes larger than 32 bits: that is normal and should work, but even if it should not, at the moment you get an inconsistent state and then, later, assertion failures or crashes.

=> I can live with all the restrictions, but the current handling of them, which always ends in a crash, is IMHO not acceptable. But that is "my" opinion; it might differ in the eyes of others.

>
> How about taking requirements such as resource consumption, ease of
> integration, and search speed into consideration? Come on guys.
> We're engineers over here.

What is the argument here? If you take a look at bugs.kde.org, you see that people are complaining about all of that with baloo. I see no evidence anywhere that baloo is "superior" to what GNOME uses or to any other solution (perhaps besides nepomuk, OK...).

In a few days I fixed more bugs than were fixed in one year and triaged more than ever, and a lot still remains to be done. (And I really did not do a lot, just removed things like 'self destruct if index > 5GB' or 'crash forever on db corruption'.)

A graph tells more than words:

https://bugs.kde.org/reports.cgi?product=frameworks-baloo&output=show_chart&datasets=CONFIRMED&datasets=ASSIGNED&datasets=REOPENED&datasets=UNCONFIRMED&datasets=RESOLVED&banner=1

Given the currently open bugs, one will need to:

1) review all extractors; they still have close to zero error handling and will just crash or OOM on bad files
2) review and fix the complete database handling so that it handles errors, and perhaps swap the DB
3) fix the indexer to have some resource limits, to avoid OOM and co. if e.g. extractors fail
...

Hence my proposal, given that we lack manpower, to implement the baloo API on top of e.g. tracker, to avoid all this and let tracker handle it.

To check whether that is feasible at all, I did a quick and dirty implementation (still missing the filling of the metadata in the results and tagging, which is a problem, but this was only to see if e.g. search works):

https://quickgit.kde.org/?p=clones%2Fbaloo%2Fcullmann%2Ftbaloo.git

That is just a proposal, and then I started the discussion. So far we have one other proposal, by Boudhayan, to fix up baloo.

>
> (If the discussion continues on kde-frameworks-devel, I probably won't see it)

I won't see it on kde-devel. Please, frameworks-related stuff should really be discussed on the frameworks list.

Greetings
Christoph

--
Dr.-Ing. Christoph Cullmann
AbsInt Angewandte Informatik GmbH    Email: cullm...@absint.com
Science Park 1                       Tel: +49-681-38360-22
66123 Saarbrücken                    Fax: +49-681-38360-20
GERMANY                              WWW: http://www.AbsInt.com
Managing Director: Dr.-Ing. Christian Ferdinand
Registered in the commercial register of the Saarbrücken district court, HRB 11234
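The "check for NFS and disable itself" suggestion in point 2 above could look roughly like the following sketch. It is not Baloo code: it assumes a Linux-style mount table in the `/proc/mounts` format (device, mount point, filesystem type, ...), and the set `NETWORK_FS` and the functions `fs_type_for` and `should_index` are hypothetical names. The list of network filesystem types is illustrative, not exhaustive, and mount points containing escaped spaces are not handled.

```python
import os

# Illustrative, non-exhaustive set of filesystem types treated as "network".
NETWORK_FS = {"nfs", "nfs4", "cifs", "smb3", "fuse.sshfs"}


def fs_type_for(path, mounts_text):
    """Return the filesystem type of the mount that contains `path`.

    `mounts_text` is the content of a mount table in /proc/mounts format.
    The longest matching mount point wins; on ties, the later entry wins.
    """
    path = os.path.normpath(path)
    best_mount, best_type = "", None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue
        mount_point, fs_type = fields[1], fields[2]
        prefix = mount_point.rstrip("/") + "/"
        if (path + "/").startswith(prefix) and len(mount_point) >= len(best_mount):
            best_mount, best_type = mount_point, fs_type
    return best_type


def should_index(db_dir, mounts_text):
    """False if the index directory lives on a network filesystem."""
    return fs_type_for(db_dir, mounts_text) not in NETWORK_FS
```

On Linux, a caller could pass the text read from `/proc/mounts` together with the directory where the index is stored, and refuse to start the indexer when `should_index` returns `False` instead of crashing later.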
Re: Scrap Baloo Thread Feedback
Hi,

> On Fri, Oct 7, 2016 at 5:57 PM, Kevin Funk wrote:
>> On Friday, 7 October 2016 17:24:26 CEST Vishesh Handa wrote:
>>> Hey guys
>>>
>>> I was told there is a thread about scrapping Baloo. All Baloo
>>> discussion used to happen on kde-devel and that's where the review
>>> requests go. It's the only reason I am still subscribed to kde-devel.
>>
>> Heya,
>>
>> Baloo is a framework nowadays, therefore it totally makes sense to have the
>> discussion on kde-frameworks-devel.
>>
>> There's been tons of discussion around Baloo on kde-frameworks-devel already.
>> kde-frameworks-devel is also where the CI messages for the baloo repo go to.
>>
>> It likely makes sense for you to subscribe, no?
>>
>
> I don't understand why all framework discussions must happen on the
> same list. It just adds to a crazy amount of noise, which one then
> needs to parse through.

If you had a baloo-devel list I could understand that point, but not with some other generic mailing list like kde-devel, which has the same amount of noise and is not even dedicated to 'frameworks' or 'baloo'.

Greetings
Christoph
Re: Scrap Baloo Thread Feedback
Hi,

> On Fri, Oct 7, 2016 at 6:14 PM, Christoph Cullmann wrote:
>>>
>>> I don't understand why all framework discussions must happen on the
>>> same list. It just adds to a crazy amount of noise, which one then
>>> needs to parse through.
>>
>> If you would have baloo-devel I could understand that point,
>> but not with some other generic mailing list like kde-devel which
>> has the same amount of noise and is not even dedicated to 'frameworks'
>> or 'baloo'.
>
> If you guys plans to use frameworks devel, then please change the
> review requests.
>
> It was just too much noise for me, and I found the noise/signal ratio
> way lower in kde-devel. Baloo-devel was specifically not chosen as it
> would just an another silo in kde. Nepomuk used to suffer from that.

I use the power of e-mail filters to sort the review requests into a subfolder; I think others might do the same (same for CI).

I don't see a point in changing that policy if all others can live with it.

Greetings
Christoph
Applying as a mentor
Hey

I want to apply to SOK as a mentor, as I have a few proposals for some projects. Could anyone please guide me on where to submit the proposals and what the minimum requirements to be a mentor are?

Regards
Prathmesh Ranaut
Re: Applying as a mentor
I would also like to know more about it. Also, I'd like to know if there's something I should do after applying as a student.

On 7 Oct 2016 13:41, "Prathmesh Ranaut" wrote:
> Hey
>
> I want to apply to SOK as a mentor as I have a few proposal for some
> projects. Could anyone please guide me where to submit the proposals and
> what are the minimum requirements to be a mentor?
>
> Regards
> Prathmesh Ranaut
Re: Scrap Baloo Thread Feedback
Each project could have its own list, with a hook that copies its content to the kde-frameworks list. Anyone who wants to watch all framework-related mail can then follow everything in one place, and anyone who wants to follow just one project can subscribe to that project's list.

For example, Baloo could have a list like baloo-devel, and all mail sent to it would be copied to the kde-frameworks list, which would solve all the related problems. We could also add a tag with each list's name (for example [Baloo] for the baloo-devel list) so that the content on kde-frameworks is easier to follow.

Ömer Fadıl Usta
about.me/omerusta

2016-10-07 19:20 GMT+03:00 Christoph Cullmann:
> Hi,
>
>> On Fri, Oct 7, 2016 at 6:14 PM, Christoph Cullmann wrote:
>>>>
>>>> I don't understand why all framework discussions must happen on the
>>>> same list. It just adds to a crazy amount of noise, which one then
>>>> needs to parse through.
>>>
>>> If you would have baloo-devel I could understand that point,
>>> but not with some other generic mailing list like kde-devel which
>>> has the same amount of noise and is not even dedicated to 'frameworks'
>>> or 'baloo'.
>>
>> If you guys plans to use frameworks devel, then please change the
>> review requests.
>>
>> It was just too much noise for me, and I found the noise/signal ratio
>> way lower in kde-devel. Baloo-devel was specifically not chosen as it
>> would just an another silo in kde. Nepomuk used to suffer from that.
>
> I use the power of e-mail filters to filter the review requests in a
> subfolder, I think others might do the same. (same for CI)
>
> I don't see a point in changing that policy if all others can live with it.
>
> Greetings
> Christoph