[ 
https://issues.apache.org/jira/browse/MRESOLVER-274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17623657#comment-17623657
 ] 

ASF GitHub Bot commented on MRESOLVER-274:
------------------------------------------

cstamas commented on code in PR #197:
URL: https://github.com/apache/maven-resolver/pull/197#discussion_r1004161901


##########
src/site/markdown/remote-repository-filtering.md:
##########
@@ -0,0 +1,70 @@
+# Remote Repository Filtering
+<!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements.  See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership.  The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License.  You may obtain a copy of the License at
+
+    http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied.  See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+
+A new Maven Resolver feature that allows filtering of Artifact by 
RemoteRepository based on various (extensible) 
+criteria.
+
+## Why?
+
+Remote Repository Filtering (RRF) is a long-requested Maven feature, and plays a 
huge role when your build uses
+several remote repositories. In such cases Maven "searches" the ordered list 
(effective POM) of remote repositories,
+and the artifact gets resolved using a "first wins" strategy. This has several 
implications:
+
+* your build gets slower: if your artifact is in the Nth repository, Maven must 
make N-1 requests that will result in
+  404 Not Found only to get to the Nth repository and finally get the artifact.
+* your build "leaks" artifact requests, as repositories are asked for 
artifacts that they do not (or worse,
+  cannot) have. Still, those remote repository operators do get your 
requests in their access logs.
+* to "simplify" things, users tend to use MRM "group" (or "virtual") 
repositories, which causes data loss on
+  the Maven project side (the project loses artifact origin information) and ends up 
in disasters: in the end these
+  "super-uber groups" grow uncontrollably, their member count becomes 
uncontrollable (as new members are
+  added as time passes), or the number of created groups grows uncontrollably, and 
projects start losing the knowledge
+  about the remote repositories required to (re)build them, 
hence these projects become
+  unbuildable without the MRM: projects become bound to the MRM.
+
+So Maven by default gets slower as remote repositories are added, leaks your 
build information to remote
+repository operators, and the current solutions offered for this problem 
most often end up in disasters.
+
+## What is it?
+
+Imagine you could tell Maven which repository can contain which artifact. 
Instead of "round robin" searching
+for artifacts in remote repositories, Maven could be instructed in a controlled 
way to directly reach only the
+needed remote repository.
+
+With RRF, a Maven build does NOT have to slow down as new remote repositories 
are added, and it will not leak
+build information either, as it will get things from where they should be 
fetched.
+
+## What is it not?
+
+When it comes solely to dependencies, don't forget the
+[maven-enforcer-plugin](https://maven.apache.org/enforcer/enforcer-rules/bannedDependencies.html)
 rules, which do
+exactly that. RRF is NOT an alternative to these enforcer rules; it is 
a separate tool to make your build
+faster, more private and optimized, without losing build information 
(remote repositories should be in the POM).
+
+## Maven Central is special
+
+The Maven Central (MC) repository is special in this respect, as Maven will always 
try to get things from it: your build,
+plugins, plugin dependencies, extensions, etc. will most often come from there. 
While you CAN filter MC, filtering MC is
+most often a bad idea (filtering, as in "limiting what can come from it"). On 
the other hand, MC itself offers help
+to prevent request leakage to it (it publishes available prefixes, see below).
+
+So, **most often** limiting "what can be fetched" from MC is a bad idea. It 
**can be done**, but only in a very cautious way,
+as otherwise you put your build at risk. RRF does not distinguish the "context" of an 
artifact, it merely filters artifacts out
+by {artifact, remoteRepository} pair, and by limiting MC you can easily get 
into a state where you break your build (as
+a plugin depends on a filtered artifact).

Review Comment:
   I plan to extend doco, probably to reuse this "demo" (not code but text from 
it): https://github.com/cstamas/rrf-demo
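
To make the "first wins" search and the {artifact, remoteRepository} filtering from the text above concrete, here is a minimal sketch. It is not the Maven Resolver API: the class `FirstWinsSketch`, the method `requestedRepos`, the repository ids and the artifact names are all hypothetical, and repositories are modelled as plain in-memory sets.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.BiPredicate;

// Hypothetical model, NOT the Maven Resolver API: repositories are plain ids,
// artifacts are "groupId:artifactId" strings.
final class FirstWinsSketch {

    /** Returns the repositories that receive a request, in order, stopping at the first hit. */
    static List<String> requestedRepos(
            List<String> orderedRepos,
            Map<String, Set<String>> contents,
            String artifact,
            BiPredicate<String, String> filter) {
        List<String> asked = new ArrayList<>();
        for (String repo : orderedRepos) {
            if (!filter.test(artifact, repo)) {
                continue; // filtered out: no request is sent, so nothing leaks
            }
            asked.add(repo); // this repository sees the request in its access log
            if (contents.getOrDefault(repo, Set.of()).contains(artifact)) {
                break; // "first wins"
            }
        }
        return asked;
    }

    public static void main(String[] args) {
        List<String> repos = List.of("central", "vendor-a", "vendor-b");
        Map<String, Set<String>> contents = Map.of(
                "central", Set.of("org.apache.maven:maven-core"),
                "vendor-a", Set.of("com.vendora:lib"),
                "vendor-b", Set.of("com.vendorb:sdk"));
        String artifact = "com.vendorb:sdk";

        // No filtering: every repository before the hit is asked (N-1 misses).
        System.out.println(requestedRepos(repos, contents, artifact, (a, r) -> true));

        // Assumed rule: only vendor-b may serve com.vendorb artifacts.
        BiPredicate<String, String> filter =
                (a, r) -> !a.startsWith("com.vendorb:") || r.equals("vendor-b");
        System.out.println(requestedRepos(repos, contents, artifact, filter));
    }
}
```

Without a filter the sketch asks all three repositories; with the assumed rule it asks only `vendor-b`, which is the speed and privacy gain the text describes.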





> Introduce Remote Repository Filter feature
> ------------------------------------------
>
>                 Key: MRESOLVER-274
>                 URL: https://issues.apache.org/jira/browse/MRESOLVER-274
>             Project: Maven Resolver
>          Issue Type: New Feature
>          Components: Resolver
>            Reporter: Tamas Cservenak
>            Assignee: Tamas Cservenak
>            Priority: Major
>             Fix For: 1.9.0
>
>
> The feature, as its name says, should be able to "filter" RemoteRepositories 
> by some criteria ("known bad GAVs", "allowed groupId", etc).
> In short, this feature answers the question "should this Artifact be 
> available from this RemoteRepository?" and is able to combine 
> (via consensus, or later possibly other strategies) several "filter 
> sources" that are extensible (via adding new components).
> Filter is used in two places:
>  * in connector, preventing a remote artifact from being fetched from a remote 
> repository (100% reliable)
>  * in resolution, preventing a locally *cached* artifact from being resolved 
> (only as reliable as your local repository is "clean", i.e. if you used Simple 
> LRM on it, which does not track remote origins, it will fail to filter, while 
> EnhancedLRM does track them and will work as expected).
> By default this feature is "dormant" (resolver behaves exactly the same as before, 
> without it). This is intended as a "low level" feature that can later be built 
> upon to implement more user-friendly solutions like MNG-6763. Hence, 
> this issue and the resolver code changes are NOT meant to completely implement 
> MNG-6763, but rather to provide the needed (lower level) functionality to 
> make it possible.
> Filters implemented in this round:
>  * groupId - provide a list of groupIds per remote repository
>  * prefix - use prefixes file for allowed prefixes (example central 
> [https://repo.maven.apache.org/maven2/.meta/prefixes.txt] or ASF releases 
> [https://repository.apache.org/content/repositories/releases/.meta/prefixes.txt])
>  * maybe package up an artifact holding list of "known" bad artifacts and 
> consume that (and enforce it)
>  * etc...
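
The "consensus" combination of a groupId source and a prefix source described in the issue could look roughly like the sketch below. This is an illustration under assumptions, not the resolver's implementation: `ConsensusFilterSketch`, `FilterSource`, `groupIdSource` and `prefixSource` are hypothetical names, prefixes are assumed to be path-like entries such as `/org/apache`, and a source with no data for a repository is assumed to accept.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch, NOT the Maven Resolver API.
final class ConsensusFilterSketch {

    /** A filter source answers: may this artifact come from this repository? */
    interface FilterSource {
        boolean accepts(String repoId, String groupId, String artifactId);
    }

    /** groupId source: a repository with a configured list serves only the listed groupIds. */
    static FilterSource groupIdSource(Map<String, Set<String>> allowedGroupIds) {
        return (repoId, groupId, artifactId) -> {
            Set<String> allowed = allowedGroupIds.get(repoId);
            return allowed == null || allowed.contains(groupId); // no list: accept (assumption)
        };
    }

    /** prefix source: the artifact path must fall under one of the repository's prefixes. */
    static FilterSource prefixSource(Map<String, List<String>> prefixes) {
        return (repoId, groupId, artifactId) -> {
            List<String> known = prefixes.get(repoId);
            if (known == null) {
                return true; // no prefixes known for this repository: accept (assumption)
            }
            String path = "/" + groupId.replace('.', '/') + "/" + artifactId;
            return known.stream().anyMatch(p -> path.equals(p) || path.startsWith(p + "/"));
        };
    }

    /** Consensus: the pair is accepted only if every source accepts it. */
    static boolean accepts(List<FilterSource> sources, String repoId, String groupId, String artifactId) {
        return sources.stream().allMatch(s -> s.accepts(repoId, groupId, artifactId));
    }

    public static void main(String[] args) {
        List<FilterSource> sources = List.of(
                groupIdSource(Map.of("vendor", Set.of("com.vendor"))),
                prefixSource(Map.of("central", List.of("/org/apache", "/junit"))));

        System.out.println(accepts(sources, "central", "org.apache.maven", "maven-core")); // true
        System.out.println(accepts(sources, "central", "com.vendor", "sdk"));              // false
        System.out.println(accepts(sources, "vendor", "com.vendor", "sdk"));               // true
        System.out.println(accepts(sources, "vendor", "org.apache.maven", "maven-core"));  // false
    }
}
```

Requiring every source to agree keeps the combination conservative: adding a new source can only narrow what a repository is asked for, never widen it.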



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
