[ 
https://issues.apache.org/jira/browse/MINDEXER-151?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tamás Cservenák updated MINDEXER-151:
-------------------------------------
    Fix Version/s: 7.0.0

> Speed up Index update from remote
> ---------------------------------
>
>                 Key: MINDEXER-151
>                 URL: https://issues.apache.org/jira/browse/MINDEXER-151
>             Project: Maven Indexer
>          Issue Type: Improvement
>            Reporter: Tamás Cservenák
>            Priority: Major
>             Fix For: 7.0.0
>
>
> Currently, running BasicUsageExample from the examples performs a "full" 
> update, and that full update (going from an "empty" index to an "up to 
> date" index) takes 15 minutes or more. Yes, the Central index is huge, but 
> there is room for improvement.
> Steps performed during an update:
>  * the properties file is downloaded
>  * the GZ file(s) are downloaded (which ones depends on whether the update 
> is incremental or full)
>  * the GZ files are processed into a temporary Lucene index
>  * the index of the target indexing context (the one being updated) is 
> "replaced" with (or, depending on the case, merged with) the temporary 
> Lucene index
> Downloading the files takes several seconds, but it is the processing of 
> the raw GZIP records into a Lucene index that takes a long time. This can 
> be improved.
> The work done here is somewhat interleaved with MINDEXER-150 as well, since 
> the shared code (incremental download handling, consuming the downloaded GZ 
> files) should be reused rather than duplicated across modules.
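
The update steps described above can be illustrated with a minimal, self-contained sketch. It does not use the Maven Indexer or Lucene APIs: the chunk format (a record count followed by UTF strings), the class name UpdateStepsSketch, and the use of an in-memory set in place of a Lucene index are all assumptions made purely for illustration. It only shows the shape of "decompress GZ records into a temporary index, then replace or merge into the target".

```java
import java.io.*;
import java.util.*;
import java.util.zip.*;

// Illustrative only: a Set<String> stands in for a Lucene index, and the
// chunk layout below is hypothetical, not the real Maven Indexer format.
public class UpdateStepsSketch {

    // Writes records as a GZIP-compressed chunk: int count, then UTF strings.
    static byte[] writeChunk(List<String> records) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out =
                 new DataOutputStream(new GZIPOutputStream(bytes))) {
            out.writeInt(records.size());
            for (String record : records) {
                out.writeUTF(record);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Streams a GZIP chunk into a temporary "index" (the slow step in the
    // real update, where each record becomes a Lucene document).
    static Set<String> readChunk(byte[] gz) {
        Set<String> temp = new LinkedHashSet<>();
        try (DataInputStream in = new DataInputStream(new BufferedInputStream(
                 new GZIPInputStream(new ByteArrayInputStream(gz))))) {
            int count = in.readInt();
            for (int i = 0; i < count; i++) {
                temp.add(in.readUTF());
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return temp;
    }

    // Full update: the target is replaced. Incremental update: the temporary
    // index is merged into the existing target.
    static Set<String> apply(Set<String> target, Set<String> temp, boolean full) {
        if (full) {
            return new LinkedHashSet<>(temp);
        }
        Set<String> merged = new LinkedHashSet<>(target);
        merged.addAll(temp);
        return merged;
    }

    public static void main(String[] args) {
        Set<String> target = new LinkedHashSet<>(List.of("org.example:a:1.0"));
        byte[] chunk = writeChunk(List.of("org.example:a:1.0", "org.example:b:2.0"));
        Set<String> temp = readChunk(chunk);
        System.out.println("incremental: " + apply(target, temp, false));
        System.out.println("full:        " + apply(target, temp, true));
    }
}
```

In the real indexer the expensive part corresponds to readChunk, where each decoded record has to be turned into an indexed document, so that is where any speed-up would have to come from.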



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
