https://bugs.documentfoundation.org/show_bug.cgi?id=89578
--- Comment #3 from Nick Levinson <[email protected]> ---

It appears that multi-word support currently works by treating each constituent one-word string as correctly spelled, even when that string on its own is wrong or too rare for inclusion and is legitimate only when adjacent to a specific other word on its left or right (including within phrases of 3 or more words). This applies to legal and medical terminology, place names, business names, personal names, foreign phrases that have been accepted into English usage albeit italicized (recommending italicization to a user might be a separate feature request), and probably an unlimited number of other categories.

So, now, "York", "Los", "Hampshire", "est", and "Francisco" are accepted, even though as standalone words in English they're probably very rare. They should be marked as wrong by default unless the user wants to allow exceptions. Rarities are usually omitted from spell-check dictionaries because, in a typical user's context, the string is more likely to be a misspelling the user will want to correct.

Merriam-Webster's Third (approximate title) dictionary, unabridged, says in its front matter that if a word occurs in English in solid, hyphenated, and spaced forms, it is entered in the dictionary in only one form. Usually the senses, pronunciations, etymologies, etc. would be the same anyway, and that saves space, but it means that even that unabridged dictionary is not an authority for determining whether unlisted forms are uncommon in English.

An introductory book on computers, I think on Linux, said that "file system" and "filesystem" do not have the same meaning. The only way that occurs to me to handle that in a spell-check would be a tooltip or similar display asking the user which meaning is intended.

Back to accepting "York", "Los", etc.: I disagree with that being the solution to recognizing "New York", "Los Angeles", "New Hampshire", "id est" (the expansion of "i.e."), and "San Francisco", respectively.
But I also know that designing spell-check to recognize multi-word strings is harder. My guess is to do multiple passes, with a separate dictionary for each number of spaces in a string. The first pass, through the whole document or through recent edits, would look for the strings with the most spaces per string; the passes would then repeat with one fewer space each time, ending with a pass for spaceless strings. This also needs a way to route a string that the user accepts into a supplemental dictionary to the supplemental dictionary for the right number of spaces within the string.

It is possible to use one dictionary sorted first by number of spaces and then by today's sort order, but for user-editable dictionaries that would be confusing when a user is trying to find, edit, or add an entry.

I'm not clear how https://bugs.documentfoundation.org/show_bug.cgi?id=154499 relates to this, but I think it does, at least indirectly.
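
To make the multi-pass idea concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions, not how LibreOffice or Hunspell work: the function names (build_dictionaries, add_to_supplemental, check) are hypothetical, matching is case-sensitive, and tokenization is a bare regular expression that ignores punctuation and sentence boundaries. Dictionaries are keyed by the number of spaces in an entry, the passes run from the most spaces down to none, and a word already claimed by a longer phrase is not checked again.

```python
import re
from collections import defaultdict


def build_dictionaries(entries):
    """Group entries into one set per number of spaces (hypothetical helper)."""
    by_spaces = defaultdict(set)
    for entry in entries:
        normalized = " ".join(entry.split())  # collapse runs of whitespace
        by_spaces[normalized.count(" ")].add(normalized)
    return dict(by_spaces)


def add_to_supplemental(by_spaces, entry):
    """Route a user-accepted string to the set for its own space count."""
    normalized = " ".join(entry.split())
    by_spaces.setdefault(normalized.count(" "), set()).add(normalized)


def check(text, by_spaces):
    """Return the words that no pass accepted, in document order."""
    # Assumption: a word is a run of ASCII letters or apostrophes.
    words = re.findall(r"[A-Za-z']+", text)
    accepted = [False] * len(words)
    # One pass per dictionary, starting with the most spaces per string.
    for spaces in sorted(by_spaces, reverse=True):
        n = spaces + 1  # number of words in a phrase with this many spaces
        phrases = by_spaces[spaces]
        for i in range(len(words) - n + 1):
            if any(accepted[i:i + n]):
                continue  # part of this window belongs to a longer phrase
            if " ".join(words[i:i + n]) in phrases:
                accepted[i:i + n] = [True] * n
    return [word for word, ok in zip(words, accepted) if not ok]


dictionaries = build_dictionaries(
    ["New York", "San Francisco", "id est",
     "I", "live", "in", "and", "visited"]
)
flagged = check("I live in New York and visited York", dictionaries)
```

With this sample data, "New York" is accepted in the two-word pass, while the standalone "York" at the end is left for the final pass and flagged because it is not in the spaceless dictionary. Keeping one set per space count also means a user-facing editor could still present all entries in a single ordinary alphabetical list, which would avoid the confusing sort order mentioned above.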
