This is an automated email from the ASF dual-hosted git repository.

ggregory pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/commons-codec.git

The following commit(s) were added to refs/heads/master by this push:
     new a38cf236 Javadoc
a38cf236 is described below

commit a38cf236ca02ab54b2d518324cde08dbd3313c78
Author: Gary Gregory <garydgreg...@gmail.com>
AuthorDate: Sun Jun 4 11:31:27 2023 -0400

    Javadoc

    Close HTML tags
---
 .../commons/codec/language/bm/BeiderMorseEncoder.java      | 12 ++++++++----
 .../java/org/apache/commons/codec/language/bm/Lang.java    | 11 +++++++++--
 .../org/apache/commons/codec/language/bm/Languages.java    |  4 ++++
 .../org/apache/commons/codec/language/bm/NameType.java     | 12 +++++++++---
 .../apache/commons/codec/language/bm/PhoneticEngine.java   |  5 +++++
 .../java/org/apache/commons/codec/language/bm/Rule.java    |  8 ++++++--
 .../org/apache/commons/codec/language/bm/RuleType.java     | 14 +++++++++++---
 .../org/apache/commons/codec/net/QuotedPrintableCodec.java |  8 ++++++++
 .../java/org/apache/commons/codec/net/RFC1522Codec.java    |  3 +++
 9 files changed, 63 insertions(+), 14 deletions(-)

diff --git a/src/main/java/org/apache/commons/codec/language/bm/BeiderMorseEncoder.java b/src/main/java/org/apache/commons/codec/language/bm/BeiderMorseEncoder.java
index a8768282..21ae48c6 100644
--- a/src/main/java/org/apache/commons/codec/language/bm/BeiderMorseEncoder.java
+++ b/src/main/java/org/apache/commons/codec/language/bm/BeiderMorseEncoder.java
@@ -25,12 +25,13 @@ import org.apache.commons.codec.StringEncoder;
  * <p>
  * Beider-Morse phonetic encodings are optimised for family names. However, they may be useful for a wide range of
  * words.
+ * </p>
  * <p>
  * This encoder is intentionally mutable to allow dynamic configuration through bean properties. As such, it is mutable,
  * and may not be thread-safe. If you require a guaranteed thread-safe encoding then use {@link PhoneticEngine}
  * directly.
- * <p>
- * <b>Encoding overview</b>
+ * </p>
+ * <h2>Encoding overview</h2>
  * <p>
  * Beider-Morse phonetic encodings is a multi-step process. Firstly, a table of rules is consulted to guess what
  * language the word comes from. For example, if it ends in "{@code ault}" then it infers that the word is French.
@@ -42,14 +43,15 @@ import org.apache.commons.codec.StringEncoder;
  * representation. Again, sometimes there are multiple ways this could be done and sometimes things that can be
  * pronounced in several ways in the source language have only one way to represent them in this average phonetic
  * language, so the result is again a set of phonetic spellings.
+ * </p>
  * <p>
  * Some names are treated as having multiple parts. This can be due to two things. Firstly, they may be hyphenated. In
  * this case, each individual hyphenated word is encoded, and then these are combined end-to-end for the final encoding.
  * Secondly, some names have standard prefixes, for example, "{@code Mac/Mc}" in Scottish (English) names. As
  * sometimes it is ambiguous whether the prefix is intended or is an accident of the spelling, the word is encoded once
  * with the prefix and once without it. The resulting encoding contains one and then the other result.
- * <p>
- * <b>Encoding format</b>
+ * </p>
+ * <h2>Encoding format</h2>
  * <p>
  * Individual phonetic spellings of an input word are represented in upper- and lower-case roman characters. Where there
  * are multiple possible phonetic representations, these are joined with a pipe ({@code |}) character. If multiple
@@ -57,6 +59,7 @@ import org.apache.commons.codec.StringEncoder;
  * these blocks are then joined with hyphens. For example, "{@code d'ortley}" has a possible prefix. The form
  * without prefix encodes to "{@code ortlaj|ortlej}", while the form with prefix encodes to "
  * {@code dortlaj|dortlej}". Thus, the full, combined encoding is "{@code (ortlaj|ortlej)-(dortlaj|dortlej)}".
+ * </p>
  * <p>
  * The encoded forms are often quite a bit longer than the input strings. This is because a single input may have many
  * potential phonetic interpretations. For example, "{@code Renault}" encodes to "
@@ -64,6 +67,7 @@ import org.apache.commons.codec.StringEncoder;
  * encodings as they consider a wider range of possible, approximate phonetic interpretations of the original word.
  * Down-stream applications may wish to further process the encoding for indexing or lookup purposes, for example, by
  * splitting on pipe ({@code |}) and indexing under each of these alternatives.
+ * </p>
  * <p>
  * <b>Note</b>: this version of the Beider-Morse encoding is equivalent with v3.4 of the reference implementation.
  * </p>
diff --git a/src/main/java/org/apache/commons/codec/language/bm/Lang.java b/src/main/java/org/apache/commons/codec/language/bm/Lang.java
index b6657b24..d03d6fbe 100644
--- a/src/main/java/org/apache/commons/codec/language/bm/Lang.java
+++ b/src/main/java/org/apache/commons/codec/language/bm/Lang.java
@@ -36,18 +36,23 @@ import org.apache.commons.codec.Resources;
  * <p>
  * This class encapsulates rules used to guess the possible languages that a word originates from. This is
  * done by reference to a whole series of rules distributed in resource files.
+ * </p>
  * <p>
  * Instances of this class are typically managed through the static factory method instance().
  * Unless you are developing your own language guessing rules, you will not need to interact with this class directly.
+ * </p>
  * <p>
  * This class is intended to be immutable and thread-safe.
- * <p>
- * <b>Lang resources</b>
+ * </p>
+ * <h2>Lang resources</h2>
  * <p>
  * Language guessing rules are typically loaded from resource files. These are UTF-8 encoded text files.
  * They are systematically named following the pattern:
+ * </p>
  * <blockquote>org/apache/commons/codec/language/bm/lang.txt</blockquote>
+ * <p>
  * The format of these resources is the following:
+ * </p>
  * <ul>
  * <li><b>Rules:</b> whitespace separated strings.
  * There should be 3 columns to each row, and these will be interpreted as:
@@ -65,6 +70,7 @@ import org.apache.commons.codec.Resources;
  * </ul>
  * <p>
  * Port of lang.php
+ * </p>
  *
  * @since 1.6
  */
@@ -119,6 +125,7 @@ public class Lang {
      * <p>
      * In normal use, you will obtain instances of Lang through the {@link #instance(NameType)} method.
      * You will only need to call this yourself if you are developing custom language mapping rules.
+     * </p>
      *
      * @param languageRulesResourceName
      *            the fully-qualified resource name to load
diff --git a/src/main/java/org/apache/commons/codec/language/bm/Languages.java b/src/main/java/org/apache/commons/codec/language/bm/Languages.java
index 6f121ec2..4dd05f12 100644
--- a/src/main/java/org/apache/commons/codec/language/bm/Languages.java
+++ b/src/main/java/org/apache/commons/codec/language/bm/Languages.java
@@ -33,10 +33,12 @@ import org.apache.commons.codec.Resources;
  * <p>
  * Language codes are typically loaded from resource files. These are UTF-8
  * encoded text files. They are systematically named following the pattern:
+ * </p>
  * <blockquote>org/apache/commons/codec/language/bm/${{@link NameType#getName()}
  * languages.txt</blockquote>
  * <p>
  * The format of these resources is the following:
+ * </p>
  * <ul>
  * <li><b>Language:</b> a single string containing no whitespace</li>
  * <li><b>End-of-line comments:</b> Any occurrence of '//' will cause all text
@@ -48,8 +50,10 @@ import org.apache.commons.codec.Resources;
  * </ul>
  * <p>
  * Ported from language.php
+ * </p>
  * <p>
  * This class is immutable and thread-safe.
+ * </p>
  *
  * @since 1.6
  */
diff --git a/src/main/java/org/apache/commons/codec/language/bm/NameType.java b/src/main/java/org/apache/commons/codec/language/bm/NameType.java
index df3d5f53..038e160a 100644
--- a/src/main/java/org/apache/commons/codec/language/bm/NameType.java
+++ b/src/main/java/org/apache/commons/codec/language/bm/NameType.java
@@ -26,13 +26,19 @@ package org.apache.commons.codec.language.bm;
  */
 public enum NameType {
 
-    /** Ashkenazi family names */
+    /**
+     * Ashkenazi family names.
+     */
     ASHKENAZI("ash"),
 
-    /** Generic names and words */
+    /**
+     * Generic names and words.
+     */
     GENERIC("gen"),
 
-    /** Sephardic family names */
+    /**
+     * Sephardic family names.
+     */
     SEPHARDIC("sep");
 
     private final String name;
diff --git a/src/main/java/org/apache/commons/codec/language/bm/PhoneticEngine.java b/src/main/java/org/apache/commons/codec/language/bm/PhoneticEngine.java
index 267e42b6..b1d81557 100644
--- a/src/main/java/org/apache/commons/codec/language/bm/PhoneticEngine.java
+++ b/src/main/java/org/apache/commons/codec/language/bm/PhoneticEngine.java
@@ -41,12 +41,15 @@ import org.apache.commons.codec.language.bm.Rule.Phoneme;
  * into account the likely source language. Next, this phonetic representation is converted into a
  * pan-European 'average' representation, allowing comparison between different versions of essentially
  * the same word from different languages.
+ * </p>
  * <p>
  * This class is intentionally immutable and thread-safe.
  * If you wish to alter the settings for a PhoneticEngine, you
  * must make a new one with the updated settings.
+ * </p>
  * <p>
  * Ported from phoneticengine.php
+ * </p>
  *
  * @since 1.6
  */
@@ -97,6 +100,7 @@ public class PhoneticEngine {
      * <p>
      * This will lengthen phonemes that have compatible language sets to the expression, and drop those that are
      * incompatible.
+     * </p>
      *
      * @param phonemeExpr the expression to apply
      * @param maxPhonemes the maximum number of phonemes to build up
@@ -237,6 +241,7 @@ public class PhoneticEngine {
 
     /**
      * Joins some strings with an internal separator.
+     *
      * @param strings Strings to join
      * @param sep String to separate them with
      * @return a single String consisting of each element of {@code strings} interleaved by {@code sep}
diff --git a/src/main/java/org/apache/commons/codec/language/bm/Rule.java b/src/main/java/org/apache/commons/codec/language/bm/Rule.java
index 6b16f5f4..d695b47f 100644
--- a/src/main/java/org/apache/commons/codec/language/bm/Rule.java
+++ b/src/main/java/org/apache/commons/codec/language/bm/Rule.java
@@ -39,6 +39,7 @@ import org.apache.commons.codec.language.bm.Languages.LanguageSet;
  * <p>
  * Rules have a pattern, left context, right context, output phoneme, set of languages for which they apply
  * and a logical flag indicating if all languages must be in play. A rule matches if:
+ * </p>
  * <ul>
  * <li>the pattern matches at the current position</li>
  * <li>the string up until the beginning of the pattern matches the left context</li>
@@ -49,16 +50,19 @@ import org.apache.commons.codec.language.bm.Languages.LanguageSet;
  * <p>
  * Rules are typically generated by parsing rules resources. In normal use, there will be no need for the user
  * to explicitly construct their own.
+ * </p>
  * <p>
  * Rules are immutable and thread-safe.
- * <p>
- * <b>Rules resources</b>
+ * </p>
+ * <h2>Rules resources</h2>
  * <p>
  * Rules are typically loaded from resource files. These are UTF-8 encoded text files. They are systematically
  * named following the pattern:
+ * </p>
  * <blockquote>org/apache/commons/codec/language/bm/${NameType#getName}_${RuleType#getName}_${language}.txt</blockquote>
  * <p>
  * The format of these resources is the following:
+ * </p>
  * <ul>
  * <li><b>Rules:</b> whitespace separated, double-quoted strings. There should be 4 columns to each row, and these
  * will be interpreted as:
diff --git a/src/main/java/org/apache/commons/codec/language/bm/RuleType.java b/src/main/java/org/apache/commons/codec/language/bm/RuleType.java
index a5feb75d..860a19b7 100644
--- a/src/main/java/org/apache/commons/codec/language/bm/RuleType.java
+++ b/src/main/java/org/apache/commons/codec/language/bm/RuleType.java
@@ -24,11 +24,19 @@ package org.apache.commons.codec.language.bm;
  */
 public enum RuleType {
 
-    /** Approximate rules, which will lead to the largest number of phonetic interpretations. */
+    /**
+     * Approximate rules, which will lead to the largest number of phonetic interpretations.
+     */
     APPROX("approx"),
-    /** Exact rules, which will lead to a minimum number of phonetic interpretations. */
+
+    /**
+     * Exact rules, which will lead to a minimum number of phonetic interpretations.
+     */
     EXACT("exact"),
-    /** For internal use only. Please use {@link #APPROX} or {@link #EXACT}. */
+
+    /**
+     * For internal use only. Please use {@link #APPROX} or {@link #EXACT}.
+     */
     RULES("rules");
 
     private final String name;
diff --git a/src/main/java/org/apache/commons/codec/net/QuotedPrintableCodec.java b/src/main/java/org/apache/commons/codec/net/QuotedPrintableCodec.java
index 7ab77076..43176b7c 100644
--- a/src/main/java/org/apache/commons/codec/net/QuotedPrintableCodec.java
+++ b/src/main/java/org/apache/commons/codec/net/QuotedPrintableCodec.java
@@ -247,6 +247,7 @@ public class QuotedPrintableCodec implements BinaryEncoder, BinaryDecoder, Strin
      * <p>
      * This function implements a subset of quoted-printable encoding specification (rule #1 and rule #2) as defined in
      * RFC 1521 and is suitable for encoding binary data and unformatted text.
+     * </p>
      *
      * @param printable
      *            bitset of characters deemed quoted-printable
@@ -264,6 +265,7 @@ public class QuotedPrintableCodec implements BinaryEncoder, BinaryDecoder, Strin
      * Depending on the selection of the {@code strict} parameter, this function either implements the full ruleset
      * or only a subset of quoted-printable encoding specification (rule #1 and rule #2) as defined in
      * RFC 1521 and is suitable for encoding binary data and unformatted text.
+     * </p>
      *
      * @param printable
      *            bitset of characters deemed quoted-printable
@@ -347,6 +349,7 @@ public class QuotedPrintableCodec implements BinaryEncoder, BinaryDecoder, Strin
      * <p>
      * This function fully implements the quoted-printable encoding specification (rule #1 through rule #5) as
      * defined in RFC 1521.
+     * </p>
      *
      * @param bytes
      *            array of quoted-printable characters
@@ -387,6 +390,7 @@ public class QuotedPrintableCodec implements BinaryEncoder, BinaryDecoder, Strin
      * Depending on the selection of the {@code strict} parameter, this function either implements the full ruleset
      * or only a subset of quoted-printable encoding specification (rule #1 and rule #2) as defined in
      * RFC 1521 and is suitable for encoding binary data and unformatted text.
+     * </p>
      *
      * @param bytes
      *            array of bytes to be encoded
@@ -403,6 +407,7 @@ public class QuotedPrintableCodec implements BinaryEncoder, BinaryDecoder, Strin
      * <p>
      * This function fully implements the quoted-printable encoding specification (rule #1 through rule #5) as
      * defined in RFC 1521.
+     * </p>
      *
      * @param bytes
      *            array of quoted-printable characters
@@ -421,6 +426,7 @@ public class QuotedPrintableCodec implements BinaryEncoder, BinaryDecoder, Strin
      * Depending on the selection of the {@code strict} parameter, this function either implements the full ruleset
      * or only a subset of quoted-printable encoding specification (rule #1 and rule #2) as defined in
      * RFC 1521 and is suitable for encoding binary data and unformatted text.
+     * </p>
      *
      * @param sourceStr
      *            string to convert to quoted-printable form
@@ -571,6 +577,7 @@ public class QuotedPrintableCodec implements BinaryEncoder, BinaryDecoder, Strin
      * Depending on the selection of the {@code strict} parameter, this function either implements the full ruleset
      * or only a subset of quoted-printable encoding specification (rule #1 and rule #2) as defined in
      * RFC 1521 and is suitable for encoding binary data and unformatted text.
+     * </p>
      *
      * @param sourceStr
      *            string to convert to quoted-printable form
@@ -592,6 +599,7 @@ public class QuotedPrintableCodec implements BinaryEncoder, BinaryDecoder, Strin
      * Depending on the selection of the {@code strict} parameter, this function either implements the full ruleset
      * or only a subset of quoted-printable encoding specification (rule #1 and rule #2) as defined in
      * RFC 1521 and is suitable for encoding binary data and unformatted text.
+     * </p>
      *
      * @param sourceStr
      *            string to convert to quoted-printable form
diff --git a/src/main/java/org/apache/commons/codec/net/RFC1522Codec.java b/src/main/java/org/apache/commons/codec/net/RFC1522Codec.java
index 16bbcf48..f5051865 100644
--- a/src/main/java/org/apache/commons/codec/net/RFC1522Codec.java
+++ b/src/main/java/org/apache/commons/codec/net/RFC1522Codec.java
@@ -56,6 +56,7 @@ abstract class RFC1522Codec {
      * <p>
      * This method constructs the "encoded-word" header common to all the RFC 1522 codecs and then invokes
      * {@link #doEncoding(byte[])} method of a concrete class to perform the specific encoding.
+     * </p>
      *
      * @param text
      *            a string to encode
@@ -86,6 +87,7 @@ abstract class RFC1522Codec {
      * <p>
      * This method constructs the "encoded-word" header common to all the RFC 1522 codecs and then invokes
      * {@link #doEncoding(byte[])} method of a concrete class to perform the specific encoding.
+     * </p>
      *
      * @param text
      *            a string to encode
@@ -112,6 +114,7 @@ abstract class RFC1522Codec {
      * <p>
      * This method processes the "encoded-word" header common to all the RFC 1522 codecs and then invokes
      * {@link #doDecoding(byte[])} method of a concrete class to perform the specific decoding.
+     * </p>
      *
      * @param text
      *            a string to decode
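
Illustrative note, not part of the commit: the BeiderMorseEncoder Javadoc in the diff says alternatives are joined with a pipe, blocks are wrapped in parentheses and joined with hyphens, and that down-stream applications may split on the pipe for indexing. A minimal sketch of that post-processing follows, assuming a well-formed encoding string. The class BmEncodingSplitter and its split method are hypothetical names invented here. They are not Commons Codec API, and the sketch uses only the JDK.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/**
 * Hypothetical helper (not part of Commons Codec): splits a Beider-Morse
 * encoding string into its blocks and per-block alternatives.
 */
public final class BmEncodingSplitter {

    private BmEncodingSplitter() {
        // static helper only
    }

    /**
     * Splits an encoding such as "(ortlaj|ortlej)-(dortlaj|dortlej)" into one
     * list of alternatives per parenthesised block. An encoding without
     * parentheses is treated as a single block. Assumes well-formed input.
     */
    public static List<List<String>> split(final String encoded) {
        final List<List<String>> blocks = new ArrayList<>();
        if (encoded.startsWith("(")) {
            // Drop the outermost parentheses, then cut at each ")-(" boundary.
            final String inner = encoded.substring(1, encoded.length() - 1);
            for (final String block : inner.split("\\)-\\(")) {
                blocks.add(Arrays.asList(block.split("\\|")));
            }
        } else {
            // Single block: the alternatives are separated by pipes only.
            blocks.add(Arrays.asList(encoded.split("\\|")));
        }
        return blocks;
    }

    public static void main(final String[] args) {
        // prints [[ortlaj, ortlej], [dortlaj, dortlej]]
        System.out.println(split("(ortlaj|ortlej)-(dortlaj|dortlej)"));
        // prints [[ortlaj, ortlej]]
        System.out.println(split("ortlaj|ortlej"));
    }
}
```

An index could then store the input word under every alternative in every block. Whether that is appropriate depends on the application, and the Javadoc only suggests it as one option.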