Re: baloo metadata scrapers

2016-10-07 Thread Vishesh Handa
Hey Martin

Baloo is only designed for searching for files. It is not a meta-data
store. If the data you're extracting is present in the file, then it
could potentially be added to Baloo.

The important thing to look at is how to present the data, and how
the data helps in searching. For example, providing the name of
an actor in a movie could make the movie pop up when searching for that
name. This is great; however, the user needs to be given a clear
explanation of why that file is in the search results.

Our main search interface, krunner, isn't designed to provide this
additional context. Therefore, adding actor information is not
something I would want to ship.

--
Vishesh Handa


On Fri, Sep 30, 2016 at 9:47 PM, Martin Bednar  wrote:
> Hi,
>
> I have very fond memories of the potential I saw in the tvshows://
> experimental kioslave (from the days of nepomuk), and I was left wondering
> whether something like that would be possible with baloo.
>
> I currently have a working scraper (currently for movies, tvshows should be
> easy to tack on) and am wondering if there is a way to use baloo to store the
> metadata. I read a bit about baloo on api.kde.org, but am not able to find any
> indication how (if) such a project would be feasible.
>
> Thanks,
>
> Martin.
>
>
> --
> Why top posting?
> Why are forums read from top to bottom?


Re: baloo metadata scrapers

2016-10-07 Thread Martin Bednar
Hi,

thanks for answering.

On Friday, 7 October 2016 10:44:02 CEST Vishesh Handa wrote:
> Hey Martin
> 
> Baloo is only designed for searching for files. It is not a meta-data
> store. If the data you're extracting is present in the file, then it
> could potentially be added to Baloo.

It is not; I get that data from an external source (currently themoviedb).

> The important thing to look at is how to present the data, and how
> does the data help in searching. For example - providing the name of
> an actor in a movie could make the movie popup when searching for that
> name. This is great, however, the user needs to be given a clear
> explanation of why that file is in the search results.

Agreed, the dream actually is creating something along the lines of the 
multimedia library described on the nepomukport page[1] for potential use in 
Plasma Media Center.

I found https://community.kde.org/Baloo/Architecture since my email, and am 
now thinking of creating a new data/search store plugin.

> Our main search interface, krunner, isn't designed to provide this
> additional context. Therefore, adding actor information, is not
> something I would want to ship.

Understood. 

I'd like to ask for a few pointers though:
Does documentation on how to write new search/store plugins exist?
From what I understood, I'll have to take care of storing the data myself. I 
see that baloo uses LMDB, Xapian and SQLite. Any pointers as to which db 
system (or a completely different one) is well suited to this kind of use? At 
the moment I lean towards LMDB, but have no real technical argument 
for it...

Thanks,

Martin

[1] https://community.kde.org/Baloo/NepomukPort#Bangarang
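
As a rough illustration of the "store the data yourself" route discussed
above, here is a minimal sketch in Python using the standard sqlite3 module.
It is not Baloo's API and not a store plugin: the table layout and the helper
names (open_store, add_metadata, search) are assumptions made up for this
example. Each row records which field matched, so a front end could explain
why a file shows up in the results.

```python
import sqlite3


def open_store(path=":memory:"):
    """Open (or create) a small side store for scraped metadata.

    Hypothetical layout: one row per (file, field, value) triple.
    """
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS metadata ("
        " file_path TEXT NOT NULL,"
        " field TEXT NOT NULL,"
        " value TEXT NOT NULL,"
        " PRIMARY KEY (file_path, field, value))"
    )
    return db


def add_metadata(db, file_path, field, value):
    """Attach one scraped value (e.g. an actor name) to a file."""
    db.execute(
        "INSERT OR IGNORE INTO metadata VALUES (?, ?, ?)",
        (file_path, field, value),
    )
    db.commit()


def search(db, term):
    """Return (file_path, field, value) rows whose value contains term.

    The field and value are returned so the caller can show the reason
    for the match. Note that '%' and '_' inside term act as LIKE wildcards.
    """
    pattern = "%" + term + "%"
    cur = db.execute(
        "SELECT file_path, field, value FROM metadata"
        " WHERE value LIKE ? ORDER BY file_path, field, value",
        (pattern,),
    )
    return cur.fetchall()
```

A search for "weaver" against a store holding ("/m/alien.mkv", "actor",
"Sigourney Weaver") returns that whole row, so the UI can say "matched actor:
Sigourney Weaver" instead of showing an unexplained file.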


Re: baloo metadata scrapers

2016-10-07 Thread Vishesh Handa
On Oct 7, 2016 12:07, "Martin Bednar"  wrote:
>
> Hi,
>
> thanks for answering.
>
> On Friday, 7 October 2016 10:44:02 CEST Vishesh Handa wrote:
> > Hey Martin
> >
> > Baloo is only designed for searching for files. It is not a meta-data
> > store. If the data you're extracting is present in the file, then it
> > could potentially be added to Baloo.
>
> It is not, I get that data from an external source (currently themoviedb).
>
> > The important thing to look at is how to present the data, and how
> > does the data help in searching. For example - providing the name of
> > an actor in a movie could make the movie popup when searching for that
> > name. This is great, however, the user needs to be given a clear
> > explanation of why that file is in the search results.
>
> Agreed, the dream actually is creating something along the lines of the
> multimedia library described on the nepomukport page[1] for potential use in
> Plasma Media Center.
>
> I found https://community.kde.org/Baloo/Architecture since my email, and am
> now thinking of creating a new data/search store plugin.
>

Ignore this.

Actually, ignore all docs you find online on any wiki. They are too old. The
git repo is the only thing that is valid.

> > Our main search interface, krunner, isn't designed to provide this
> > additional context. Therefore, adding actor information, is not
> > something I would want to ship.
>
> Understood.
>
> I'd like to ask for a few pointers though :
> Does documentation on how to write new search/store plugins exist?
> From what I understood, I'll have to take care of storing the data myself. I
> see that baloo uses LMDB, Xapian and SQLite. Any pointers as to which db
> system (or a completely different one) is well adapted to this kind of use? At
> the moment I tend to lean towards LMDB, but have no real technical argument
> for it...

Basically I would recommend looking at this from the presentation and
usability side first. Technology later.

Do you want some kind of kioslave? Through Krunner? Something else? What
kind of workflow, and for what kind of users?

Once this is more clear, you can figure out which is the best way to go
about what you want.

>
> Thanks,
>
> Martin
>
> [1] https://community.kde.org/Baloo/NepomukPort#Bangarang


Scrap Baloo Thread Feedback

2016-10-07 Thread Vishesh Handa
Hey guys

I was told there is a thread about scrapping Baloo. All Baloo
discussion used to happen on kde-devel and that's where the review
requests go. It's the only reason I am still subscribed to kde-devel.

I must say, the thread is overall quite disappointing. There seems to
be no scientific or rational cost-based analysis of this. How about
drawing up a list of requirements and priorities, and then evaluating
possible solutions against it?

Right now, random requirements such as NFS and 32-bit systems are
coming up. Are these really that important? I specifically designed
Baloo to not care about either network mounts or 32-bit systems. Yes,
Baloo has bugs and it won't handle inode numbers wider than 32 bits. These
things, like all others, can be fixed. It's really a question of what is
important. Let's not target the outliers. Many of these decisions were
taken deliberately.

How about taking requirements such as resource consumption, ease of
integration, and search speed into consideration? Come on guys.
We're engineers over here.

(If the discussion continues on kde-frameworks-devel, I probably won't see it)

--
Vishesh Handa


Re: Scrap Baloo Thread Feedback

2016-10-07 Thread Kevin Funk
On Friday, 7 October 2016 17:24:26 CEST Vishesh Handa wrote:
> Hey guys
> 
> I was told there is a thread about scrapping Baloo. All Baloo
> discussion used to happen on kde-devel and that's where the review
> requests go. It's the only reason I am still subscribed to kde-devel.

Heya,

Baloo is a framework nowadays, therefore it totally makes sense to have the 
discussion on kde-frameworks-devel.

There's been tons of discussion around Baloo on kde-frameworks-devel already. 
kde-frameworks-devel is also where the CI messages for the baloo repo go.

It likely makes sense for you to subscribe, no?

Cheers,
Kevin

> (snip)


-- 
Kevin Funk | kf...@kde.org | http://kfunk.org



Re: Scrap Baloo Thread Feedback

2016-10-07 Thread Vishesh Handa
On Fri, Oct 7, 2016 at 5:57 PM, Kevin Funk  wrote:
> On Friday, 7 October 2016 17:24:26 CEST Vishesh Handa wrote:
>> Hey guys
>>
>> I was told there is a thread about scrapping Baloo. All Baloo
>> discussion used to happen on kde-devel and that's where the review
>> requests go. It's the only reason I am still subscribed to kde-devel.
>
> Heya,
>
> Baloo is a framework nowadays, therefore it totally makes sense to have the
> discussion on kde-framework-devel.
>
> There's been tons of discussion around Baloo on kde-framework-devel already.
> kde-frameworks-devel is also where the CI messages for the baloo repo go to.
>
> It likely makes sense for you to subscribe, no?
>

I don't understand why all framework discussions must happen on the
same list. It just adds to a crazy amount of noise, which one then
needs to parse through.

> Cheers,
> Kevin
>
>> (snip)
>
>
> --
> Kevin Funk | kf...@kde.org | http://kfunk.org


Re: Scrap Baloo Thread Feedback

2016-10-07 Thread Vishesh Handa
On Fri, Oct 7, 2016 at 6:14 PM, Christoph Cullmann  wrote:
>>>
>>
>> I don't understand why all framework discussions must happen on the
>> same list. It just adds to a crazy amount of noise, which one then
>> needs to parse through.
>
> If you would have baloo-devel I could understand that point,
> but not with some other generic mailing list like kde-devel which
> has the same amount of noise and is not even dedicated to 'frameworks'
> or 'baloo'.

If you guys plan to use frameworks-devel, then please change where the
review requests go.

It was just too much noise for me, and I found the noise/signal ratio
way lower in kde-devel. Baloo-devel was specifically not chosen as it
would just be another silo in kde. Nepomuk used to suffer from that.

--
Vishesh Handa


Re: Scrap Baloo Thread Feedback

2016-10-07 Thread Aleix Pol
On Fri, Oct 7, 2016 at 6:01 PM, Vishesh Handa  wrote:
> On Fri, Oct 7, 2016 at 5:57 PM, Kevin Funk  wrote:
>> On Friday, 7 October 2016 17:24:26 CEST Vishesh Handa wrote:
>>> Hey guys
>>>
>>> I was told there is a thread about scrapping Baloo. All Baloo
>>> discussion used to happen on kde-devel and that's where the review
>>> requests go. It's the only reason I am still subscribed to kde-devel.
>>
>> Heya,
>>
>> Baloo is a framework nowadays, therefore it totally makes sense to have the
>> discussion on kde-framework-devel.
>>
>> There's been tons of discussion around Baloo on kde-framework-devel already.
>> kde-frameworks-devel is also where the CI messages for the baloo repo go to.
>>
>> It likely makes sense for you to subscribe, no?
>>
>
> I don't understand why all framework discussions must happen on the
> same list. It just adds to a crazy amount of noise, which one then
> needs to parse through.

Arguing that it should be elsewhere because you'd like to ignore the
rest of the traffic in kde-frameworks doesn't sound very constructive,
especially considering that it's the "noise" that actually improves
the frameworks.

Maybe you can configure your e-mail client differently so we
can focus on the issue at hand?

Aleix


Re: Scrap Baloo Thread Feedback

2016-10-07 Thread Vishesh Handa
On Fri, Oct 7, 2016 at 6:20 PM, Aleix Pol  wrote:
>>>
>>
>> I don't understand why all framework discussions must happen on the
>> same list. It just adds to a crazy amount of noise, which one then
>> needs to parse through.
>
> Arguing that it should be elsewhere because you'd like to ignore the
> rest of the traffic in kde-frameworks doesn't sound very constructive,
> especially considering how they're the "noise" that actually improves
> the frameworks.
>
> Maybe you can better configure your e-mail client differently so we
> can focus on the issue at matter?

This is not about how it should be. I'm informing them why it was
chosen to be somewhere else. This decision can be changed.

Frameworks collectively may or may not improve by having everything in
one place. Let's not treat it as an axiom.

An analogy could be that we get commit emails, but we get to choose
which projects we are interested in. We don't make everyone subscribe
to kde-commits, and then put their own complex filters on top.

--
Vishesh Handa


Re: Scrap Baloo Thread Feedback

2016-10-07 Thread Luigi Toscano
On Friday, 7 October 2016 18:27:30 CEST Vishesh Handa wrote:
> On Fri, Oct 7, 2016 at 6:20 PM, Aleix Pol  wrote:
> >> I don't understand why all framework discussions must happen on the
> >> same list. It just adds to a crazy amount of noise, which one then
> >> needs to parse through.
> > 
> > Arguing that it should be elsewhere because you'd like to ignore the
> > rest of the traffic in kde-frameworks doesn't sound very constructive,
> > especially considering how they're the "noise" that actually improves
> > the frameworks.
> > 
> > Maybe you can better configure your e-mail client differently so we
> > can focus on the issue at matter?
> 
> This is not about how it should be. I'm informing them why it was
> chosen to be somewhere else. This decision can be changed.
> 
> Frameworks collectively may or may not improve by having everything in
> one place. Lets not treat it as a axiom.
> 
> An analogy could be that we get commit emails, but we get to choose
> which projects we are interested in. We don't make everyone subscribe
> to kde-commits, and then put their own complex filters on top.

We are moving away from the main point of the discussion, but I'd like to
point out that the support for topics in mailman[1] covers this use case. I'm
not sure how to make reviewboard or phabricator add a topic to the
notification email, though.

(Also, we did not reach a final decision on whether we should go back to
kde-core-devel for Frameworks-related topics.)

[1] http://www.list.org/mailman-member/node29.html

Ciao
-- 
Luigi


Re: Scrap Baloo Thread Feedback

2016-10-07 Thread Christoph Cullmann
Hi,

> Hey guys
> 
> I was told there is a thread about scrapping Baloo. All Baloo
> discussion used to happen on kde-devel and that's where the review
> requests go. It's the only reason I am still subscribed to kde-devel.
That is nice, but given baloo is a framework, that was unexpected, sorry.

> 
> I must say, the thread is overall quite disappointing. There seems to
> be no scientific or rationale cost based analysis of this. How about a
> list of requirements and priorities are drawn up and then possible
> solutions are evaluated according to it?
Actually, the bugs.kde.org page tells you the facts: the number of bugs
has been constantly increasing for more than a year. The thread lists some
other facts about what is wrong at the moment and should be fixed.

And replacing baloo with something else, based for example on tracker, was
just one proposal.

Another was to fix baloo and port it to another database.

> 
> Right now, random requirements such as NFS and 32bit systems are
> coming up. Are these really that important? I specifically designed
> Baloo to not care about both network mounts and 32-bit systems. Yes,
> Baloo has bugs and it won't handle more than 32bit-inodes. These
> things, as all others, can be fixed. It's really a question of what is
> important. Lets not target the outliers. Many of these decisions were
> deliberately taken.
Those are not random requirements, sorry; you could just as well call them
random restrictions. That is not that productive, is it?

1) 32-bit systems are still around. If it is a design decision to NOT support
them, that is ok, but it is bad for Plasma: there is then no official support
for 32-bit systems, and baloo is IMHO the only framework with such a
requirement. And I don't see that we have told any distro not to compile it
for 32-bit.

2) No NFS: Ok, fair game, but then it should check for that and disable itself
completely if $HOME, where the db is stored, is on NFS. I can live with that,
too, but not with the current "we randomly crash" behavior. => That is a user
experience we don't want, is it?

3) > 32-bit inodes: That is normal and should work, but even if it should not:
at the moment you get inconsistencies and then, later, assertion failures or
crashes.

=> I can live with all the restrictions, but the current handling of them,
which always ends in a "crash", is IMHO not that acceptable. But that is "my"
opinion; it might differ in the eyes of others.
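
The check asked for in point 2 could look roughly like the following Python
sketch. It is an illustration only, not Baloo code: it assumes a Linux system
where mounts are listed in /proc/mounts, the function names are invented for
this example, and the set of filesystem types treated as "network" is an
assumption. Mount points containing spaces (escaped as \040 in /proc/mounts)
are not handled.

```python
import os

# Assumption: these fstype strings are what we want to refuse to index on.
NETWORK_FS = {"nfs", "nfs4"}


def fstype_for(path, mounts_text):
    """Return the filesystem type of the mount that contains path.

    mounts_text is the content of /proc/mounts: one mount per line,
    "device mountpoint fstype options ...". The longest matching mount
    point wins; for equal mount points the later line wins, because a
    later mount shadows an earlier one.
    """
    path = os.path.abspath(path)
    best_mount, best_type = "", None
    for line in mounts_text.splitlines():
        parts = line.split()
        if len(parts) < 3:
            continue
        mount_point, fstype = parts[1], parts[2]
        prefix = mount_point.rstrip("/") + "/"
        if (path + "/").startswith(prefix) and len(mount_point) >= len(best_mount):
            best_mount, best_type = mount_point, fstype
    return best_type


def should_disable_indexing(db_dir, mounts_text):
    """True if the directory holding the index lives on a network filesystem."""
    return fstype_for(db_dir, mounts_text) in NETWORK_FS


def index_dir_is_on_network_fs(db_dir, mounts_file="/proc/mounts"):
    """Convenience wrapper that reads the real mount table (Linux only)."""
    with open(mounts_file, encoding="utf-8") as handle:
        return should_disable_indexing(db_dir, handle.read())
```

With a mount table in which /home is an nfs4 mount, should_disable_indexing
on a directory under /home returns True, and the indexer could then refuse to
start with a clear message instead of crashing later.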

> 
> How about requirements such as resource consumption, ease of
> integration, search speed are taken into consideration? Come on guys.
> We're engineers over here.
What is the argument here? If you take a look at bugs.kde.org, you see that
people are complaining about all of that with baloo. I see no evidence
anywhere that e.g. baloo is "superior" to what GNOME uses or to any other
solution (perhaps besides nepomuk, ok...).

In a few days I fixed more bugs than were fixed in 1 year and triaged more
than ever; still, a lot remains to be done.
(and I really did not do a lot, just removed things like 'self destruct if
index > 5GB' or 'crash forever on db corruption')

A graph tells more than words:

https://bugs.kde.org/reports.cgi?product=frameworks-baloo&output=show_chart&datasets=CONFIRMED&datasets=ASSIGNED&datasets=REOPENED&datasets=UNCONFIRMED&datasets=RESOLVED&banner=1

Given the currently open bugs, one will need to:

1) review all extractors; they still have close to zero error handling and will
just crash or OOM on bad files
2) review + fix the complete database handling to handle errors, and perhaps
swap the DB
3) fix the indexer to have some resource limits to avoid OOM and co. if e.g.
extractors fail
...
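
One possible shape for the resource limits in point 3, sketched in Python
rather than in Baloo's own C++: run each extractor as a child process with an
address-space cap and a timeout, and treat any failure as "skip this file".
This is a hedged illustration for POSIX systems, not how Baloo's indexer
works; run_limited and its default limits are made up for this example.

```python
import resource
import subprocess


def run_limited(argv, max_bytes=512 * 1024 * 1024, timeout_s=30):
    """Run argv as a child with a memory cap and a wall-clock timeout.

    Returns the child's stdout as text, or None if the child failed,
    could not be started, or ran past the timeout.
    """

    def apply_limits():
        # Runs in the child just before exec. Only the soft limit is
        # lowered (never above the existing hard limit), so this works
        # without extra privileges.
        _soft, hard = resource.getrlimit(resource.RLIMIT_AS)
        if hard == resource.RLIM_INFINITY:
            limit = max_bytes
        else:
            limit = min(max_bytes, hard)
        resource.setrlimit(resource.RLIMIT_AS, (limit, hard))

    try:
        done = subprocess.run(
            argv,
            stdout=subprocess.PIPE,
            stderr=subprocess.DEVNULL,
            preexec_fn=apply_limits,
            timeout=timeout_s,
            check=False,
        )
    except (subprocess.TimeoutExpired, subprocess.SubprocessError, OSError):
        return None
    if done.returncode != 0:
        return None
    return done.stdout.decode("utf-8", errors="replace")
```

The point of the design is that a bad file can at worst cost one child
process: the indexer gets None back and moves on to the next file.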

Therefore my proposal, given that we lack manpower, was to implement the baloo
API on top of e.g. tracker to avoid all this and let tracker handle it.

To check whether that is feasible at all, I did a quick and dirty
implementation (still missing the filling of the metadata in the results and
tagging, which is a problem, but the point was only to see if e.g. search
works):

https://quickgit.kde.org/?p=clones%2Fbaloo%2Fcullmann%2Ftbaloo.git

That is just a proposal, and with it I started the discussion.

Until now, we have one other proposal, by Boudhayan, to fix up baloo.

> 
> (If the discussion continues on kde-frameworks-devel, I probably won't see it)
I won't see it on kde-devel, please, frameworks related stuff should really
be discussed on the frameworks list.

Greetings
Christoph

-- 
- Dr.-Ing. Christoph Cullmann -
AbsInt Angewandte Informatik GmbH      Email: cullm...@absint.com
Science Park 1                         Tel:   +49-681-38360-22
66123 Saarbrücken                      Fax:   +49-681-38360-20
GERMANY                                WWW:   http://www.AbsInt.com

Geschäftsführung: Dr.-Ing. Christian Ferdinand
Eingetragen im Handelsregister des Amtsgerichts Saarbrücken, HRB 11234


Re: Scrap Baloo Thread Feedback

2016-10-07 Thread Christoph Cullmann
Hi,

> On Fri, Oct 7, 2016 at 5:57 PM, Kevin Funk  wrote:
>> On Friday, 7 October 2016 17:24:26 CEST Vishesh Handa wrote:
>>> Hey guys
>>>
>>> I was told there is a thread about scrapping Baloo. All Baloo
>>> discussion used to happen on kde-devel and that's where the review
>>> requests go. It's the only reason I am still subscribed to kde-devel.
>>
>> Heya,
>>
>> Baloo is a framework nowadays, therefore it totally makes sense to have the
>> discussion on kde-framework-devel.
>>
>> There's been tons of discussion around Baloo on kde-framework-devel already.
>> kde-frameworks-devel is also where the CI messages for the baloo repo go to.
>>
>> It likely makes sense for you to subscribe, no?
>>
> 
> I don't understand why all framework discussions must happen on the
> same list. It just adds to a crazy amount of noise, which one then
> needs to parse through.
If you had a baloo-devel list I could understand that point,
but not with some other generic mailing list like kde-devel, which
has the same amount of noise and is not even dedicated to 'frameworks'
or 'baloo'.

Greetings
Christoph



Re: Scrap Baloo Thread Feedback

2016-10-07 Thread Christoph Cullmann
Hi,

> On Fri, Oct 7, 2016 at 6:14 PM, Christoph Cullmann wrote:

>>>
>>> I don't understand why all framework discussions must happen on the
>>> same list. It just adds to a crazy amount of noise, which one then
>>> needs to parse through.
>>
>> If you would have baloo-devel I could understand that point,
>> but not with some other generic mailing list like kde-devel which
>> has the same amount of noise and is not even dedicated to 'frameworks'
>> or 'baloo'.
> 
> If you guys plans to use frameworks devel, then please change the
> review requests.
> 
> It was just too much noise for me, and I found the noise/signal ratio
> way lower in kde-devel. Baloo-devel was specifically not chosen as it
> would just an another silo in kde. Nepomuk used to suffer from that.
I use the power of e-mail filters to sort the review requests into a subfolder;
I think others might do the same. (Same for CI.)

I don't see a point in changing that policy if all others can live with it.

Greetings
Christoph



Applying as a mentor

2016-10-07 Thread Prathmesh Ranaut
Hey


I want to apply to SOK as a mentor, as I have a few proposals for some projects. 
Could anyone please tell me where to submit the proposals and what the 
minimum requirements to be a mentor are?


Regards
Prathmesh Ranaut





Re: Applying as a mentor

2016-10-07 Thread André Brait
I would also like to know more about it. Also, I'd like to know if there's
something I should do after applying as a student.

On 7 Oct 2016 13:41, "Prathmesh Ranaut" wrote:

> Hey
>
> I want to apply to SOK as a mentor as I have a few proposal for some
> projects. Could anyone please guide me where to submit the proposals and
> what are the minimum requirements to be a mentor?
>
> Regards
> Prathmesh Ranaut
>


Re: Scrap Baloo Thread Feedback

2016-10-07 Thread Ömer Fadıl USTA
Each project could have its own list, with a hook that copies its content to
the kde-frameworks list. Anyone who wants to watch all framework-related
mail can then follow everything in one place, and anyone who wants to follow
just one project can subscribe to that project's list only.
Ex: Baloo could have a list like baloo-devel, and all mail sent to this list
would be copied to the kde-frameworks list, which would solve all the related
problems. We could also add a tag with each list's name (for example [Baloo]
for the baloo-devel list), so the content on kde-frameworks would be easier
to follow.

Ömer Fadıl Usta
about.me/omerusta



2016-10-07 19:20 GMT+03:00 Christoph Cullmann :
> Hi,
>
>> On Fri, Oct 7, 2016 at 6:14 PM, Christoph Cullmann wrote:
>
>>>>
>>>> I don't understand why all framework discussions must happen on the
>>>> same list. It just adds to a crazy amount of noise, which one then
>>>> needs to parse through.
>>>
>>> If you would have baloo-devel I could understand that point,
>>> but not with some other generic mailing list like kde-devel which
>>> has the same amount of noise and is not even dedicated to 'frameworks'
>>> or 'baloo'.
>>
>> If you guys plans to use frameworks devel, then please change the
>> review requests.
>>
>> It was just too much noise for me, and I found the noise/signal ratio
>> way lower in kde-devel. Baloo-devel was specifically not chosen as it
>> would just an another silo in kde. Nepomuk used to suffer from that.
> I use the power of e-mail filters to filter the review requests in a subfolder,
> I think others might do the same. (same for CI)
>
> I don't see a point in changing that policy if all others can live with it.
>
> Greetings
> Christoph