Re: [PATCH] diff-highlight: Fix broken multibyte string

2015-04-04 Thread Jeff King
On Fri, Apr 03, 2015 at 03:24:09PM -0700, Kyle J. McKay wrote: > >I thought that meant we could also optimize out the "map" call entirely, > >and just use the first split (with "*") to end up with a list of $COLOR > >chunks and single characters, but it does not seem to work. So maybe I > >am misr

Re: [PATCH] diff-highlight: Fix broken multibyte string

2015-04-03 Thread Kyle J. McKay
On Apr 3, 2015, at 15:08, Jeff King wrote: Doing: diff --git a/contrib/diff-highlight/diff-highlight b/contrib/diff-highlight/diff-highlight index 08c88bb..1c4b599 100755 --- a/contrib/diff-highlight/diff-highlight +++ b/contrib/diff-highlight/diff-highlight @@ -165,7 +165,7 @@ sub highlight_

Re: [PATCH] diff-highlight: Fix broken multibyte string

2015-04-03 Thread Jeff King
On Fri, Apr 03, 2015 at 11:19:24AM +0900, Yi, EungJun wrote: > > I timed this one versus the existing diff-highlight. It's about 7% > > slower. That's not great, but is acceptable to me. The String::Multibyte > > version was a lot faster, which was nice (but I'm still unclear on > > _why_). > > I

Re: [PATCH] diff-highlight: Fix broken multibyte string

2015-04-03 Thread Jeff King
On Thu, Apr 02, 2015 at 06:59:50PM -0700, Kyle J. McKay wrote: > It should work as well as the original did for any 1-byte encoding. That > is, if it's not valid UTF-8 it should pass through unchanged and any single > byte encoding should just work. But, as you point out, multibyte encodings > o

Re: [PATCH] diff-highlight: Fix broken multibyte string

2015-04-02 Thread Yi, EungJun
> I timed this one versus the existing diff-highlight. It's about 7% > slower. That's not great, but is acceptable to me. The String::Multibyte > version was a lot faster, which was nice (but I'm still unclear on > _why_). I think the reason is here: > sub split_line { >local $_ = shift; >

Re: [PATCH] diff-highlight: Fix broken multibyte string

2015-04-02 Thread Kyle J. McKay
On Apr 2, 2015, at 18:24, Jeff King wrote: On Thu, Apr 02, 2015 at 05:49:24PM -0700, Kyle J. McKay wrote: Subject: [PATCH v2] diff-highlight: do not split multibyte characters When the input is UTF-8 and Perl is operating on bytes instead of characters, a diff that changes one multibyte chara
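
One way to avoid splitting a multibyte character while Perl still operates on bytes is to keep each UTF-8 lead byte together with its continuation bytes. The sketch below is only an illustration of that idea, not the code from the v2 patch; the helper name split_utf8_bytes and the regex are assumptions made for this example. Bytes that do not form a lead-plus-continuation sequence fall through as single-byte chunks, so input in a 1-byte encoding passes through unchanged.

```perl
use strict;
use warnings;

# Hypothetical helper (not taken from the patch): split a byte string into
# chunks so that a UTF-8 lead byte (0xC0-0xFF) stays attached to the
# continuation bytes (0x80-0xBF) that follow it. Any other byte becomes a
# one-byte chunk of its own.
sub split_utf8_bytes {
    my ($bytes) = @_;
    my @chunks = $bytes =~ /([\xC0-\xFF][\x80-\xBF]*|[\x00-\xFF])/g;
    return @chunks;
}

# "a", then the three bytes EC A7 84 (U+C9C4 in UTF-8), then "b".
my $line   = "a\xEC\xA7\x84b";
my @chunks = split_utf8_bytes($line);

# Print each chunk as hex so the output is independent of the terminal.
for my $chunk (@chunks) {
    print join(' ', map { sprintf('%02X', ord($_)) } split(//, $chunk)), "\n";
}
```

The regex does not validate UTF-8; it only groups bytes, which is enough to keep a highlight boundary from landing inside a character.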

Re: [PATCH] diff-highlight: Fix broken multibyte string

2015-04-02 Thread Jeff King
On Thu, Apr 02, 2015 at 05:49:24PM -0700, Kyle J. McKay wrote: > Subject: [PATCH v2] diff-highlight: do not split multibyte characters > > When the input is UTF-8 and Perl is operating on bytes instead > of characters, a diff that changes one multibyte character to > another that shares an initia

Re: [PATCH] diff-highlight: Fix broken multibyte string

2015-04-02 Thread Kyle J. McKay
On Mar 30, 2015, at 15:16, Jeff King wrote: > Yeah, I agree the current output is not ideal, and this should address > the problem. I was worried that multi-byte splitting would make things > slower, but in my tests, it actually speeds things up! [...] > Unfortunately, String::Multibyte is not a

Re: [PATCH] diff-highlight: Fix broken multibyte string

2015-03-30 Thread Jeff King
On Tue, Mar 31, 2015 at 12:55:33AM +0900, Yi EungJun wrote: > From: Yi EungJun > > Highlighted string might be broken if the common subsequence is a proper > subset > of a multibyte character. For example, if the old string is "진" and the new > string is "지", then we expect the diff is rendered

[PATCH] diff-highlight: Fix broken multibyte string

2015-03-30 Thread Yi EungJun
From: Yi EungJun Highlighted string might be broken if the common subsequence is a proper subset of a multibyte character. For example, if the old string is "진" and the new string is "지", then we expect the diff is rendered as follows: -진 +지 but actually it was rendered as follo
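
To make the example above concrete: in UTF-8, "진" (U+C9C4) is the bytes EC A7 84 and "지" (U+C9C0) is EC A7 80, so the two characters share their first two bytes. A minimal sketch, assuming the comparison is a simple common-prefix scan (the helper common_prefix_len is hypothetical and is not diff-highlight's own code), shows how a byte-wise comparison finds a shared prefix inside the character while a character-wise comparison finds none:

```perl
use strict;
use warnings;
use Encode qw(encode decode);

# Hypothetical helper: number of leading elements two lists have in common.
sub common_prefix_len {
    my ($x, $y) = @_;
    my $n = 0;
    $n++ while $n < @$x && $n < @$y && $x->[$n] eq $y->[$n];
    return $n;
}

my $old = encode('UTF-8', "\x{C9C4}");   # raw bytes EC A7 84
my $new = encode('UTF-8', "\x{C9C0}");   # raw bytes EC A7 80

# Byte-wise view: each element is one byte.
my @old_bytes = split //, $old;
my @new_bytes = split //, $new;
my $byte_prefix = common_prefix_len(\@old_bytes, \@new_bytes);

# Character-wise view: decode first, so each element is one character.
my @old_chars = split //, decode('UTF-8', $old);
my @new_chars = split //, decode('UTF-8', $new);
my $char_prefix = common_prefix_len(\@old_chars, \@new_chars);

printf "old bytes: %s\n", join(' ', map { sprintf('%02X', ord($_)) } @old_bytes);
printf "new bytes: %s\n", join(' ', map { sprintf('%02X', ord($_)) } @new_bytes);
print "shared prefix in bytes: $byte_prefix\n";       # 2
print "shared prefix in characters: $char_prefix\n";  # 0
```

With a two-byte shared prefix, a highlighter working on bytes would treat only the last byte as changed and place the color escape in the middle of the character, which is the broken rendering described above.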