> On May 5, 2014, 2:05 p.m., Milian Wolff wrote:
> > What is the speed difference compared to the old QString API, but without 
> > the word-count there? Afaik, the word-count is the major bottleneck and 
> > removing it alone should greatly speed up the test.
> > 
> > Having a QByteArray in the API would be fine if you document that the data 
> > _must_ be UTF8. But a meaningful performance test here must include the 
> > later conversion to std::string for xapian, imo. I.e. what you want to test 
> > is file -> qbytearray -> std::string vs. file -> qstring -> std::string.
> 
> Vishesh Handa wrote:
>     Original: 60 msecs
>     Without Word Count: 30 msecs
>     Without Word Count + ByteArray: 8 msecs
> 
> Milian Wolff wrote:
>     Cool, looks promising. And how slow would your patch be right now with 
> just result->append(QString::fromUtf8(arr))? Or is that the 30 msecs? 
> Just wondering what the impact of using the STL instead of QIODevice/QFile 
> is here.
> 
> Vishesh Handa wrote:
>     With QString::fromUtf8: about 9 msecs. So it's all QFile.
>     
>     I've tried actually indexing a file before and after this patch (plus a 
> small patch in Baloo to convert the data directly into a QString). I can 
> barely make out any difference: only about 100-200 msecs on a file which 
> takes 11 seconds.
>     
>     It might just make sense to discard the whole append(QByteArray) and just 
> ship the QFile parts.

Yes, that sounds like a good approach.
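
For illustration only, here is a rough sketch of what reading the file through 
the STL instead of QFile could look like. This is not the code from the patch: 
the helper name readPlainText is hypothetical, the sketch uses plain C++ with 
no Qt types, and it simply assumes the file content is UTF-8.

```cpp
#include <fstream>
#include <sstream>
#include <string>

// Hypothetical helper, not taken from the actual patch: read a whole file
// into a std::string via std::ifstream rather than QFile/QIODevice.
// The bytes are returned unchanged and are assumed to be UTF-8.
std::string readPlainText(const std::string& path)
{
    // Binary mode so that no newline translation touches the data.
    std::ifstream in(path, std::ios::in | std::ios::binary);
    if (!in) {
        // File missing or unreadable: return an empty string.
        return std::string();
    }

    // Copy the complete stream buffer in one go.
    std::ostringstream buffer;
    buffer << in.rdbuf();
    return buffer.str();
}
```

On the Qt side, the resulting bytes could then be handed to the existing 
QString API via QString::fromUtf8(), so no append(QByteArray) overload would 
be needed.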


- Milian


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://git.reviewboard.kde.org/r/117996/#review57314
-----------------------------------------------------------


On May 5, 2014, 2:01 p.m., Vishesh Handa wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://git.reviewboard.kde.org/r/117996/
> -----------------------------------------------------------
> 
> (Updated May 5, 2014, 2:01 p.m.)
> 
> 
> Review request for Baloo and Milian Wolff.
> 
> 
> Repository: kfilemetadata
> 
> 
> Description
> -------
> 
>     Add an append(QByteArray) method to the ExtractionResult
> 
>     This way plugins can choose to return the data in utf8 or as a QString,
>     and the clients can either just let the standard QString::fromUtf8
>     function do its magic, or implement some special handling if they wish.
> 
>     This speeds up the PlainTextExtractor quite a bit (60msec vs 8.3msec)
> 
>     Unfortunately, this meant discarding the extraction of WordCount from
>     the Plain Text extractor. Though considering the speed difference, I
>     think it is worth it.
> 
> 
> Diffs
> -----
> 
>   autotests/indexerextractortests.cpp 6b7c605 
>   autotests/simpleresult.h f3793b5 
>   src/extractionresult.h 76dfe59 
>   src/extractionresult.cpp 9bc7946 
>   src/extractors/plaintextextractor.cpp 5a38857 
> 
> Diff: https://git.reviewboard.kde.org/r/117996/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Vishesh Handa
> 
>

