On 2016-05-02 21.33, Junio C Hamano wrote:
> Junio C Hamano <[email protected]> writes:
> Let's step back a bit and make sure we are on the same page. I
> think this "series" conflates a bit too many things into a single
> topic.
>
> * The comparison between the index and the working tree, i.e. "git
> diff", should compare result of convert_to_git() with what is in
> the index, and the world around it should be made consistent with
> that. Your separate "git blame" fix to add missing knowledge
> that convert_to_git() would not do s/CRLF/LF/ for a path whose
> index entry already is contaminated with CR falls into this
> category and was a very good thing to do.
>
> * A convert_to_git() and convert_to_working_tree() pair that do not
> roundtrip would (by definition) leave contents in the working
> tree, that, when passed through convert_to_git(), will be
> different from the index, upon completion of "reset --hard". We
> _should_ fix it so that "git diff" _reports_ differences.
> Currently, lstat(2) based optimization hides this in a racy way
> (when racy Git kicks in to reinspect the index and the working
> tree file actually matches, it finds out that they do not match),
> it is a bug that needs to be fixed, not 10/10 where it tries to
> hide the differences consistently and spreads the bug. I haven't
> studied 8/10 carefully yet, but it seems to attempt the same.
>
> * I think the "text=auto eol=THIS" that did not mean "I do not care
> to specify which ones are text files. Please detect the file
> type, and for those automatically detected, please make sure that
> the contents follow THIS eol convention." was a bug, and what
> 07/10 tries to do is a good thing.
>
> By the way, lack of the cover letter of this series made it more
> painful to write a reply like this than necessary. A cover letter
> for a trivial 3-patch series might be overkill, but for anything
> with substance that spans more than 4-5 patches, a cover letter to
> describe the overall direction would really help.
10/10 needs to be replaced with something different, and I am starting to
get a better picture of why.
read-cache.c/ce_compare_data() checks what "git add" would change in the index.
sha1_file.c/index_fd() reads the file content from the working tree,
runs convert_to_git() and calculates the sha1, which read-cache.c feeds into
hashcmp().
When convert_to_git() is run, 3 steps are taken:
- apply the filter, if any
- run crlf_to_git(), if attributes and core.autocrlf say so
- run ident, if specified.
Now crlf_to_git() uses has_cr_in_index(const char *path) to find
out whether we should keep the CRLF or not.
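To make the CRLF decision concrete, here is a minimal sketch. It is not the
code in convert.c: toy_crlf_to_git() and blob_has_cr() are hypothetical
stand-ins, the index blob is passed in directly instead of being looked up
by path, and the attribute/core.autocrlf checks are left out.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for has_cr_in_index(): does the indexed blob contain a CR? */
static int blob_has_cr(const char *blob, size_t len)
{
	return memchr(blob, '\r', len) != NULL;
}

/*
 * Toy version of the CRLF step (assumption: simplified, no attributes).
 * Copies src into dst (which must hold at least len bytes) and returns
 * the resulting length.  If the blob already in the index has a CR,
 * the content is kept as is; otherwise every CRLF becomes LF.
 */
static size_t toy_crlf_to_git(const char *src, size_t len,
			      const char *index_blob, size_t index_len,
			      char *dst)
{
	size_t i, out = 0;

	assert(src && dst);
	if (index_blob && blob_has_cr(index_blob, index_len)) {
		memcpy(dst, src, len);
		return len;
	}
	for (i = 0; i < len; i++) {
		/* Drop a CR only when it is immediately followed by LF. */
		if (src[i] == '\r' && i + 1 < len && src[i + 1] == '\n')
			continue;
		dst[out++] = src[i];
	}
	return out;
}
```

The point of the sketch is that the result depends on which blob is
inspected, not only on the working tree content.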
Side note: the suggested patches will use
get_convert_stats_sha1(get_sha1_from_cache(path),)
This works pretty well under normal "git add", but fails in t6038,
when we do a merge.
The new function get_sha1_from_index() can not find the sha1,
"read-cache.c/get_sha1_from_index:2392 path=file pos=-3"
and falls into the code path for
/*
* We might be in the middle of a merge, in which
* case we would read stage #2 (ours).
*/
read-cache.c/get_sha1_from_index:2408 path=file pos=3
read-cache.c/get_sha1_from_index:2416 path=file sha1=ad55e2
(The line numbers are with debug code added.)
The problem is that ad55e2 is _not_ the sha1 which
read-cache.c/ce_compare_data() had been asked to look at.
Instead we should check the blob with 99b633.
The result is that convert/get_convert_stats_sha1() is called on the
wrong blob (the one with CRLF instead of the one without CRLF), and this
makes t6038 from 9/10 fail.
10/10 rescues the situation, by using the correct blob :-)
In short, ce_compare_data() needs to forward ce->sha1 the whole way into
convert.c/crlf_to_git() and get_convert_stats_sha1().
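A minimal sketch of the difference between the two lookups, under stated
assumptions: struct toy_entry and both functions are hypothetical and only
model the behaviour described above (path lookup falls back to stage #2,
while the caller already holds the entry it wants compared). This is not
the read-cache.c code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, stripped-down index entry. */
struct toy_entry {
	const char *path;
	int stage;       /* 0 = merged, 1 = base, 2 = ours, 3 = theirs */
	const char *id;  /* abbreviated blob id, for illustration only */
};

/*
 * Path-based lookup: return the stage 0 id if there is one, otherwise
 * fall back to stage #2 (ours), as happens in the middle of a merge.
 */
static const char *toy_id_from_index(const struct toy_entry *idx, size_t n,
				     const char *path)
{
	const char *ours = NULL;
	size_t i;

	assert(idx && path);
	for (i = 0; i < n; i++) {
		if (strcmp(idx[i].path, path))
			continue;
		if (idx[i].stage == 0)
			return idx[i].id;
		if (idx[i].stage == 2)
			ours = idx[i].id;
	}
	return ours;
}

/* Entry-based lookup: use the id of the entry the caller was given. */
static const char *toy_id_from_entry(const struct toy_entry *ce)
{
	assert(ce);
	return ce->id;
}
```

With an unmerged "file" whose stage #2 entry is ad55e2, the path lookup
always answers ad55e2, whichever entry the caller is comparing. If the
caller holds the 99b633 entry (which stage that is, is an assumption here),
only the entry-based variant returns the id that was actually asked about.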
While at it, I realized that we call convert_attrs(&ca, path) a couple
of times (e.g. in would_convert_to_git()), only to find out that we really
don't have any attributes set.
It would be nice to do that only once.
The next step will be to add the improvements/fixes for the ce_compare_data()
chain as described above, and then put 7/10..9/10 on top of that.
This will probably take some time, which is why I asked whether 1/10..4/10
could proceed as is.
(And the next version will come with a cover letter, sorry for that.)