Re: [gentoo-user] [OT] Need advice from people who use non-ascii all day long

2009-12-15 Thread J. Roeleveld
On Thursday 03 December 2009 20:20:03 fe...@crowfix.com wrote: > I have a project which requires normalizing names, and by that, I mean > converting to lower case etc, whatever eliminates redundancies. I > know Unicode has a different "normalize" meaning, but for my purposes, > that has already be

Re: [gentoo-user] [OT] Need advice from people who use non-ascii all day long

2009-12-05 Thread felix
On Sun, Dec 06, 2009 at 11:45:43AM +0900, daid kahl wrote: > Well, I don't think "n" is really a syllable. It's a sound, and it's > the only part of the syllabary in Japanese that doesn't have a vowel. > I'm not really convinced this is a syllable in reality. It's certainly a syllable in their sy

Re: [gentoo-user] [OT] Need advice from people who use non-ascii all day long

2009-12-05 Thread daid kahl
>> such as (I am guessing now) saw-umm-bee-yaw-koo.  To write Tokyo in >> the proper furigana is probably something like toh-o-kee-yoh-o. Oh, I should mention that this is correct in writing. But the yo is a subscript, so it's also a modifier, so the ki part isn't pronounced, it's modified into a

Re: [gentoo-user] [OT] Need advice from people who use non-ascii all day long

2009-12-05 Thread daid kahl
> Our handling is simple -- we don't yet. I don't know how to handle > things like that, or the previous example of Copenhagen in different > languages. Look at Naples -- that's not what Italians call it. Venice > is really bad -- no idea how English got it so mangled. Speaking of > Japanese, th

Re: [gentoo-user] [OT] Need advice from people who use non-ascii all day long

2009-12-05 Thread felix
On Sun, Dec 06, 2009 at 10:58:59AM +0900, daid kahl wrote: > I'm curious about your handling of Japanese, just because I'm living > outside Tokyo these days. My grasp on Japanese is basically rubbish, > but I can at least claim to know a thing or two. Our handling is simple -- we don't yet. I d

Re: [gentoo-user] [OT] Need advice from people who use non-ascii all day long

2009-12-05 Thread daid kahl
> I have a project which requires normalizing names, and by that, I mean > converting to lower case etc, whatever eliminates redundancies.  I > know Unicode has a different "normalize" meaning, but for my purposes, > that has already been done.  Maybe I should call it standardization or > make up a

Re: [gentoo-user] [OT] Need advice from people who use non-ascii all day long

2009-12-04 Thread Neil Bothwick
On Fri, 4 Dec 2009 22:50:52 +0200, Alan McKinnon wrote: > Three consecutive "e"'s looks weird Are you calling my laptop weird? ;-) -- Neil Bothwick THE BORG: Calm, Cool and Collective...

Re: [gentoo-user] [OT] Need advice from people who use non-ascii all day long

2009-12-04 Thread Alan McKinnon
On Friday 04 December 2009 15:42:56 Volker Armin Hemmann wrote: > On Freitag 04 Dezember 2009, fe...@crowfix.com wrote: > > If enough Europeans are in the habit of taking > > shortcuts and skipping umlauts and accents and cedilla and tildes, > > we don't. Because skipping Umlaut, accent&co create

Re: [gentoo-user] [OT] Need advice from people who use non-ascii all day long

2009-12-04 Thread Volker Armin Hemmann
On Freitag 04 Dezember 2009, fe...@crowfix.com wrote: > If enough Europeans are in the habit of taking > shortcuts and skipping umlauts and accents and cedilla and tildes, we don't. Because skipping Umlaut, accent&co creates a completely new word. Probably one that is already there. Munster is a

Re: [gentoo-user] [OT] Need advice from people who use non-ascii all day long

2009-12-04 Thread felix
On Fri, Dec 04, 2009 at 10:17:30AM +0100, Patrick Holthaus wrote: > You cannot simply leave the umlaut out since it is considered as a separate > letter for itself. You cannot choose whether to write an "ö" or an "o". Like > Renat said, there are words that completely change their meaning when

Re: [gentoo-user] [OT] Need advice from people who use non-ascii all day long

2009-12-04 Thread Patrick Holthaus
Hey! > So do people type in Busingen different ways depending on how they > feel, do some people always leave off the umlaut, do some always use > it? You cannot simply leave the umlaut out since it is considered as a separate letter for itself. You cannot choose whether to write an "ö" or an "

Re: [gentoo-user] [OT] Need advice from people who use non-ascii all day long

2009-12-03 Thread felix
On Fri, Dec 04, 2009 at 01:03:23AM +0100, Volker Armin Hemmann wrote: > look at my name, ok? > > Just dropping the Umlaut is wrong. No if, but, maybe. It is wrong. Error. > Mistake. Fail. If you can not enter ä, ö or ü, you must transform them to ae, > oe or ue. I'd like to find a program which

Re: [gentoo-user] [OT] Need advice from people who use non-ascii all day long

2009-12-03 Thread Alan McKinnon
On Friday 04 December 2009 02:03:23 Volker Armin Hemmann wrote: > look at my name, ok? > > Just dropping the Umlaut is wrong. No if, but, maybe. It is wrong. Error. > Mistake. Fail. If you can not enter ä, ö or ü, you must transform them to > ae, oe or ue. > Your name shows here in 7-bit ASCII

Re: [gentoo-user] [OT] Need advice from people who use non-ascii all day long

2009-12-03 Thread Volker Armin Hemmann
look at my name, ok? Just dropping the Umlaut is wrong. No if, but, maybe. It is wrong. Error. Mistake. Fail. If you can not enter ä, ö or ü, you must transform them to ae, oe or ue.
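
The rule stated above (write ae, oe or ue when ä, ö or ü cannot be entered) can be sketched in a few lines of Python. This is only an illustration, not something posted in the thread: the names `GERMAN_FALLBACK` and `ascii_fallback` are made up for the example, and the upper-case mappings are an assumption added on top of the three lower-case letters mentioned in the message.

```python
# Hypothetical helper, not from the thread: German-style ASCII fallback
# that expands umlauts instead of silently dropping the dots.
GERMAN_FALLBACK = str.maketrans({
    "ä": "ae", "ö": "oe", "ü": "ue",   # the three letters named in the message
    "Ä": "Ae", "Ö": "Oe", "Ü": "Ue",   # assumption: title-case expansion
})


def ascii_fallback(name: str) -> str:
    """Replace German umlauts with their two-letter ASCII spellings."""
    return name.translate(GERMAN_FALLBACK)


if __name__ == "__main__":
    print(ascii_fallback("Büsingen"))  # Buesingen
    print(ascii_fallback("drücken"))   # druecken, still distinct from "drucken"
```

Because the expansion keeps "drücken" and "drucken" apart, it avoids the collisions that plain umlaut stripping would cause. It is specific to German, though; other languages treat the same letters differently, so a table like this would have to be chosen per language.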

Re: [gentoo-user] [OT] Need advice from people who use non-ascii all day long

2009-12-03 Thread felix
On Fri, Dec 04, 2009 at 12:38:34AM +0200, Arttu V. wrote: > I assume you have already removed the language problem from the > equation? I.e., the fact that København, Copenhague, Kööpenhamina and > Copenhagen all mean the same place, just in different European > languages (Danish, Spanish, Finnish

Re: [gentoo-user] [OT] Need advice from people who use non-ascii all day long

2009-12-03 Thread felix
On Thu, Dec 03, 2009 at 08:32:45PM -0200, Francisco Ares wrote: > What about a set of dictionaries? And also a library for mistyped word > search? Way too much effort for this. Nice idea, might even be fun, but it's just trying to avoid the common things, and I mainly wondered about how often pe

Re: [gentoo-user] [OT] Need advice from people who use non-ascii all day long

2009-12-03 Thread Arttu V.
On 12/3/09, fe...@crowfix.com wrote: > I have a project which requires normalizing names, and by that, I mean > converting to lower case etc, whatever eliminates redundancies. I assume you have already removed the language problem from the equation? I.e., the fact that København, Copenhague, Kööp

Re: [gentoo-user] [OT] Need advice from people who use non-ascii all day long

2009-12-03 Thread Francisco Ares
On Thu, Dec 3, 2009 at 6:29 PM, Renat Golubchyk wrote: > On Thu, 3 Dec 2009 12:07:26 -0800 > fe...@crowfix.com wrote: > > So do people type in Busingen different ways depending on how they > > feel, do some people always leave off the umlaut, do some always use > > it? > > If you want to leave off

Re: [gentoo-user] [OT] Need advice from people who use non-ascii all day long

2009-12-03 Thread Alan McKinnon
On Friday 04 December 2009 00:07:33 Volker Armin Hemmann wrote: > On Donnerstag 03 Dezember 2009, Renat Golubchyk wrote: > > Hi! > > > > On Thu, 3 Dec 2009 11:20:03 -0800 > > > > fe...@crowfix.com wrote: > > > In Germany is a district "Busingen", with an umlauted 'u'. Is it > > > reasonable to con

Re: [gentoo-user] [OT] Need advice from people who use non-ascii all day long

2009-12-03 Thread Volker Armin Hemmann
On Donnerstag 03 Dezember 2009, Renat Golubchyk wrote: > Hi! > > On Thu, 3 Dec 2009 11:20:03 -0800 > > fe...@crowfix.com wrote: > > In Germany is a district "Busingen", with an umlauted 'u'. Is it > > reasonable to consider it the same word whether with or without the > > umlauted u? > > No. Fo

Re: [gentoo-user] [OT] Need advice from people who use non-ascii all day long

2009-12-03 Thread Renat Golubchyk
On Thu, 3 Dec 2009 12:07:26 -0800 fe...@crowfix.com wrote: > So do people type in Busingen different ways depending on how they > feel, do some people always leave off the umlaut, do some always use > it? If you want to leave off the umlaut you have to be absolutely sure that there exists no other

Re: [gentoo-user] [OT] Need advice from people who use non-ascii all day long

2009-12-03 Thread felix
On Thu, Dec 03, 2009 at 08:50:08PM +0100, Renat Golubchyk wrote: > I'd suggest you use a unicode library. BTW, what about cyrillic > letters or other alphabets? Those may have nothing to do with ASCII. Or > is your project restricted to latin letters? The data is already in normalized Unicode. M

Re: [gentoo-user] [OT] Need advice from people who use non-ascii all day long

2009-12-03 Thread Renat Golubchyk
Hi! On Thu, 3 Dec 2009 11:20:03 -0800 fe...@crowfix.com wrote: > In Germany is a district "Busingen", with an umlauted 'u'. Is it > reasonable to consider it the same word whether with or without the > umlauted u? No. For many words it would be ok, but not for all. For example, "drucken" means "

[gentoo-user] [OT] Need advice from people who use non-ascii all day long

2009-12-03 Thread felix
I have a project which requires normalizing names, and by that, I mean converting to lower case etc, whatever eliminates redundancies. I know Unicode has a different "normalize" meaning, but for my purposes, that has already been done. Maybe I should call it standardization or make up a new cromu