[
https://issues.apache.org/jira/browse/OPENNLP-1661?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Martin Wiesner resolved OPENNLP-1661.
-------------------------------------
Resolution: Fixed
> Fix custom models being wiped from OpenNLP user.home directory
> --------------------------------------------------------------
>
> Key: OPENNLP-1661
> URL: https://issues.apache.org/jira/browse/OPENNLP-1661
> Project: OpenNLP
> Issue Type: Bug
> Components: Models
> Affects Versions: 2.5.0
> Reporter: Martin Wiesner
> Assignee: Martin Wiesner
> Priority: Major
> Fix For: 2.5.1
>
> Time Spent: 1h 10m
> Remaining Estimate: 0h
>
> Currently, a Maven build ({{mvn clean test}}) wipes existing models in the
> '{{user.home/.opennlp}}' directory, because the code in
> {{AbstractDownloadUtilTest#cleanupWhenOnline}} removes them before the
> related methods in {{DownloadUtil}} are tested.
> This is a problem if custom-trained models with similar name patterns
> exist in that directory, because the following calls are executed:
> _wipeExistingModelFiles("\-tokens\-");_
> _wipeExistingModelFiles("\-sentence\-");_
> _wipeExistingModelFiles("\-pos\-");_
> _wipeExistingModelFiles("\-lemmas\-");_
> This also causes a lot of overhead for developers, as each run of the
> whole test suite cleans up either the target directory of the
> {{opennlp-tools}} module or, even worse, the local
> '{{user.home/.opennlp}}' directory, causing at least 128 (32 languages x 4
> model types) models to be downloaded over and over again.
> Aims:
> * Ensure no (custom) model is accidentally removed from
> '{{user.home/.opennlp}}'.
> * Ensure model downloads aren't repeated if the models exist locally and
> are "valid" (_sha512_)
> * Validate freshly downloaded models AND existing ones to discover broken
> model files
> * Reduce download volume required for full (IT) builds
> * Reduce load for ASF infrastructure
> * Reduce overall ecological footprint
> Note: The same applies to the 'ci' Maven profile. As long as no "mvn clean"
> is executed, existing models kept in a build's {{target}} folder should
> neither be wiped nor re-downloaded per test suite execution.
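
The "download only if missing or invalid" aim above could be sketched roughly as follows. This is a minimal illustration, not the change that was actually committed: the class ModelCacheCheck and its methods are hypothetical names and are not part of the DownloadUtil API, and the expected SHA-512 value is assumed to be available as a hex string (for example, read from a checksum file published next to the model).

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper for illustration only; not part of OpenNLP.
final class ModelCacheCheck {

  private ModelCacheCheck() {
  }

  /** Computes the SHA-512 digest of a file as a lowercase hex string. */
  static String sha512Hex(Path file) throws IOException {
    final MessageDigest digest;
    try {
      digest = MessageDigest.getInstance("SHA-512");
    } catch (NoSuchAlgorithmException e) {
      throw new IllegalStateException("SHA-512 is not available", e);
    }
    // Stream the file in chunks so large models are not loaded into memory.
    try (InputStream in = Files.newInputStream(file)) {
      byte[] buffer = new byte[8192];
      int read;
      while ((read = in.read(buffer)) != -1) {
        digest.update(buffer, 0, read);
      }
    }
    StringBuilder hex = new StringBuilder();
    for (byte b : digest.digest()) {
      hex.append(Character.forDigit((b >> 4) & 0xF, 16));
      hex.append(Character.forDigit(b & 0xF, 16));
    }
    return hex.toString();
  }

  /**
   * Returns true if the model is missing locally or its checksum does not
   * match the expected value. The existing file is never deleted here.
   */
  static boolean needsDownload(Path localModel, String expectedSha512Hex)
      throws IOException {
    if (!Files.isRegularFile(localModel)) {
      return true;
    }
    return !sha512Hex(localModel).equalsIgnoreCase(expectedSha512Hex.trim());
  }
}
```

With a check like this, a test setup would only fetch a model when needsDownload(...) returns true, and a checksum mismatch would surface a broken model file without removing anything from the directory.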
--
This message was sent by Atlassian Jira
(v8.20.10#820010)