[commons-math] function or Number class to count/track number of significant figures

2023-08-09 Thread Daniel Watson
I noticed there is not (or I could not find) a function within commons-math
to count the number of significant figures in a number string. I wrote a
function to do it and want to make sure I'm not missing something within
commons-math before submitting a PR.

This feature is necessary when working with scientific/clinical data which
was reported with significant figures in mind, and for which calculation
results must respect the sigfig count. As far as I could tell there is no
Number implementation which correctly respects this. e.g.

"11000" has 2 significant figures,
"11000." has 5
".11000" has 5
"11000.0" has 6

Other points:
* BigDecimal.precision() is not a substitute because it treats the trailing
zeros of a whole number as significant
* Floats, which can report scientific notation, are not a substitute when
calculations must be exact
* I've also considered extending BigDecimal to support tracking and
enforcing sigfigs. This would still require the function to initially count
them.
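
A minimal sketch of the counting rules illustrated above, assuming plain ASCII decimal input with an optional sign and exponent. The class name SigFigCounter and the method count are hypothetical and not part of any Commons component. The last line of main only illustrates why BigDecimal.precision() on its own reports a different figure for "11000".

```java
import java.math.BigDecimal;

class SigFigCounter {

    /** Counts significant figures in a decimal string such as "11000.", ".11000" or "1.10e4". */
    static int count(String text) {
        String s = text.trim();
        // An optional sign carries no significance.
        if (s.startsWith("+") || s.startsWith("-")) {
            s = s.substring(1);
        }
        // Exponent digits never count; the exponent itself is not validated in this sketch.
        int exp = Math.max(s.indexOf('e'), s.indexOf('E'));
        if (exp >= 0) {
            s = s.substring(0, exp);
        }
        if (!s.matches("\\d+\\.?\\d*|\\.\\d+")) {
            throw new NumberFormatException("Not a decimal number: " + text);
        }
        boolean hasPoint = s.indexOf('.') >= 0;
        String digits = s.replace(".", "");
        // Leading zeros are never significant.
        int start = 0;
        while (start < digits.length() && digits.charAt(start) == '0') {
            start++;
        }
        // Trailing zeros are significant only when a decimal point is present.
        int end = digits.length();
        if (!hasPoint) {
            while (end > start && digits.charAt(end - 1) == '0') {
                end--;
            }
        }
        return end - start;
    }

    public static void main(String[] args) {
        for (String s : new String[] {"11000", "11000.", ".11000", "11000.0"}) {
            System.out.println(s + " -> " + count(s));
        }
        // BigDecimal counts every digit of the unscaled value, zeros included.
        System.out.println(new BigDecimal("11000").precision());
    }
}
```

Note that this sketch returns 0 for an input of "0"; how zero should be treated is left open here.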

Is this appropriate for a PR? Or have I missed an existing feature?

Dan


Re: [commons-math] function or Number class to count/track number of significant figures

2023-08-09 Thread Alex Herbert
Hi,

On Wed, 9 Aug 2023 at 12:27, Daniel Watson  wrote:

> This feature is necessary when working with scientific/clinical data which
> was reported with significant figures in mind, and for which calculation
> results must respect the sigfig count. As far as I could tell there is no
> Number implementation which correctly respects this. e.g.
>
> "11000" has 2 significant figures,
> "11000." has 5
> ".11000" has 5
> "11000.0" has 6

This functionality is not in Commons AFAIK. Is the counting to accept
a String input?

Q. What is the use case that you would read data in text format and
have to compute the significant figures? Or are you reading data in
numeric format and computing the decimal significant figures of the
base-2 data representation? Note: Differences between base-10 and
base-2 representations can lead to an implementation that satisfies
one use case and not others due to rounding conversions (see
NUMBERS-199 [1]). I would advise against this and only support text
input when referring to decimal significant figures.

I presume you have input data of unknown precision, are computing
something with it, then outputting a result and wish to use the
minimum significant figures from the input data. If the output
significant figures are critical then this is a case where I would
expect the reported significant figures to be manually verified; thus
automation only partly increases efficiency by providing a first pass.

Note: I do not think there is an easy way to handle trailing zeros
when not all the zeros are significant [2]. This is in part due to the
different formats used for this such as an overline/underline on the
last significant digit. I do not think we wish to support parsing of
non-ascii characters and multiple options to identify significant
zeros. Thoughts on this?

Secondly, you are reliant on the text input being correctly formatted.
For example if I include the number 1, it would have to be represented
as e.g. 1.000 to not limit the significant figures of the rest of the
input data.

Thirdly, if accepting string input then you have the issue of first
identifying if the string is a number. This is non-trivial. For
example there is a function to do it in o.a.c.lang3.math in
NumberUtils.isCreatable.

Finding a home in commons will elicit different opinions from the
various contributors. The math and numbers projects are more related
to numeric computations. They output results in a canonical format,
typically the JDK toString representation of the number(s).
Conversions to different formats are not in scope, and parsing is
typically from the canonical format using JDK parse methods. There is
a o.a.c.text.numbers package in the text component that currently has
formatting for floating-point numbers to various text representations.
But parsing of Strings is not supported there. And lang has the
NumberUtils.

As for a class that tracks significant figures through a computation,
that will require some consideration. Do we create the class to wrap
Number and track significant digits of a configurable base? This would
allow BigDecimal with base 10, or int and double with base 2. Since
Number does not specify arithmetic then this has to be captured
somehow. It may be able to use the Numbers implementation of Field [3]
in o.a.c.numbers.field. Or simplify to only using a BigDecimal
wrapper.

In summary this may be simpler with an ideal use case. For example the
input is text, it must be parsable to a BigDecimal and the number of
significant figures is identified using the text input. Support of
significant zeros is limited to the full length of the trailing zeros,
or the first zero if no decimal point is provided. This could be
handled with a parse method that returns both the BigDecimal and the
significant figures.
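
A rough sketch of such a parse method, assuming the input is accepted by the BigDecimal(String) constructor. The class name ParsedDecimal and its members are hypothetical. The count relies on BigDecimal.precision(), which counts every digit of the unscaled value, and strips trailing zeros first when the text contains no decimal point.

```java
import java.math.BigDecimal;

final class ParsedDecimal {
    final BigDecimal value; // the exact parsed value
    final int sigFigs;      // significant figures implied by the text

    private ParsedDecimal(BigDecimal value, int sigFigs) {
        this.value = value;
        this.sigFigs = sigFigs;
    }

    /** Parses the text and derives the significant figures from its written form. */
    static ParsedDecimal parse(String text) {
        String s = text.trim();
        BigDecimal value = new BigDecimal(s); // throws NumberFormatException on bad input
        // Without a decimal point, trailing zeros are treated as not significant.
        BigDecimal counted = s.indexOf('.') >= 0 ? value : value.stripTrailingZeros();
        return new ParsedDecimal(value, counted.precision());
    }

    public static void main(String[] args) {
        for (String s : new String[] {"11000", "11000.", ".11000", "11000.0", "1.000e4"}) {
            ParsedDecimal p = parse(s);
            System.out.println(s + " -> value=" + p.value + ", sigFigs=" + p.sigFigs);
        }
    }
}
```

BigDecimal reports a precision of 1 for zero, so inputs such as "0.00" would need a separate rule if they matter.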

Alex

[1] https://issues.apache.org/jira/browse/NUMBERS-199
[2] https://en.wikipedia.org/wiki/Significant_figures#Ways_to_denote_significant_figures_in_an_integer_with_trailing_zeros
[3] http://mathworld.wolfram.com/Field.html

-
To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
For additional commands, e-mail: dev-h...@commons.apache.org



Re: [commons-math] function or Number class to count/track number of significant figures

2023-08-09 Thread Daniel Watson
Before I answer your questions - I'll say that looking at the commons-math
codebase it is apparent that it's focused on specific functional
computation, rather than util-like features. So I agree this probably
doesn't fit well there. I honestly did not know commons-numbers existed.
I'll check there and then either move this discussion there or commons-lang.

(I'll respond to your questions anyway just in case this ever comes up
again or anyone is curious)

The use case is reading of text data (e.g. CSV) where significant figures
are implied according to the standard rules. Data that is already typed to
a standard java Number would have no inherent significant figure tracking
and it cannot be reliably determined (for the reasons you mentioned). If
the data is represented in that fashion then sigfigs must be
provided/applied separately.

The significant figures of the input data are inherently "verified" because
scientific calculations of this nature are provided by humans (obviously
can't account for some forms of human error) and humans will know
the precision of their apparatus, and can communicate it using the standard
rules of sigfigs - if that's not the case then the user should not be using
this api. Because the input data is verified, the output data is also
"verified" as long as this logic is correct.

I don't believe there is a need for repeating special characters when a
number of significant figures is known. In the case of infinite precision,
the BigDecimal class already handles that. When significant figures are
known then something like 1000/3 can and should be reported as 300 (or in
scientific notation) because there is only a single significant figure in
that calculation. A repeating 3 would imply precision that does not exist.
(Admittedly I need to double check this. I know that for pure mathematical
values e.g. conversion from feet to inches, the conversion has infinite
precision. However as long as the initial measurement has a precision then
the output will also necessarily have that same precision). Intermediate
calculations can use infinite precision, which could be handled internally
via BigDecimal. But final results should be reported with proper sigfig
rules applied.

You are correct that "1" would not be the same as "1.000" and for clinical
/ scientific data this is known to be important. "1" implies 1 sigfig,
"1.000" implies 4. This is why the data most likely will be represented as
text.

Determining if the String is a number is simpler in this case, I think.
Assuming decimal base (and potentially scientific notation) there are a
limited number of characters and syntax. isCreatable() attempts to handle
different bases as well as type qualifiers whereas this logic would be
restricted to decimal base and syntax. (Theoretically I suppose you could
use different bases, but scientific calculations are rarely, if ever,
carried out in anything other than decimal. Seems natural that they would
be out of scope.)
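
A sketch of what a decimal-only check could look like, assuming the accepted syntax is an optional sign, digits with an optional decimal point, and an optional exponent. The class name DecimalSyntax and the method isDecimal are hypothetical, and the exact grammar is an assumption rather than an agreed spec.

```java
import java.util.regex.Pattern;

class DecimalSyntax {

    // sign? (digits with optional point | point followed by digits) exponent?
    private static final Pattern DECIMAL =
        Pattern.compile("[+-]?(\\d+\\.?\\d*|\\.\\d+)([eE][+-]?\\d+)?");

    /** Returns true if the text is a base-10 number in plain or scientific notation. */
    static boolean isDecimal(String text) {
        return text != null && DECIMAL.matcher(text).matches();
    }

    public static void main(String[] args) {
        for (String s : new String[] {"11000.", ".11000", "1.000e4", "0x1F", "10L", "."}) {
            System.out.println(s + " -> " + isDecimal(s));
        }
    }
}
```

Hexadecimal prefixes and type qualifiers such as "L" or "f" are rejected simply because the pattern has no place for them.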

As for a wrapped class, my initial thought (though I haven't worked out the
details) would be to extend BigDecimal and use its arithmetic logic.
Relevant methods would be overridden to ensure the sigfig subclass is
returned. There may be issues with that, I haven't fleshed it out.

Ultimately the initial goal would be to simply count the number of sigfigs
through some text util/parse method. The fact that sigfigs are normally
conveyed via textual representation means that many of the issues you might
encounter trying to derive them from pure numbers don't apply.

Hope that answers more questions than it creates!

Dan


Re: [commons-math] function or Number class to count/track number of significant figures

2023-08-09 Thread Alex Herbert
On Wed, 9 Aug 2023 at 15:43, Daniel Watson  wrote:
>
> Hope that answers more questions than it creates!

It does not address the issue of the last significant zero, e.g:

1 (4 sf)
1 (3 sf)
1 (2 sf)

One way to solve this with standard parsing would be to use scientific notation:

1.000e4
1.00e4
1.0e4

Note that for the example of inch to cm conversions the value 2.54
cm/inch is exact. This leads to the issue that there should be a way
to exclude some input from limiting the detection of the lowest
significant figure (i.e. mark numbers as exact). This puts some
responsibility on the provider of the data to follow a format; and
some on the parser to know what fields to analyse.

Alex




Re: [commons-math] function or Number class to count/track number of significant figures

2023-08-09 Thread Daniel Watson
Ah I see what you were asking. Yes it is up to the human entering data to
understand that 1 has exactly one sigfig according to standard
convention. If you need it to have more then you must write it in full
scientific notation. Obviously, if a specific precision is required due to
some flaw in the dataset then the user could manually override the detected
sigfig count. But the assumption of the parsing logic is that the input
abides by the standard convention, which is well defined. I don't see it as
being much different than any other Number class expecting the input to
abide by a specific format. Conventions for SigFig counting are well
defined. It just so happens that most people don't often need them (but the
same could be said for o.a.c.numbers.Complex).

As far as exact calculations, if the user did:

BigSigFig result = new BigSigFig("1.1").multiply(new BigDecimal("2.54"))

I would expect the BigSigFig class should understand that BigDecimal has no
sigfig limit, and would retain its current minimum of 2. It would only
apply a new minimum in the case of operating against another BigSigFig...

BigSigFig result = new BigSigFig("1.1").multiply(new BigSigFig("2"))

The result of that should be a BigSigFig with an internal value of exactly
2.2 but would output as "2" to respect the new sigfig count. I think
something like that should be possible. In the end this is more of a
parsing / formatting exercise. The wrinkle is the tracking aspect, where we
need to dynamically reduce the sigfigs based on other operations. That's
where a wrapper class I think comes in handy.
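
A minimal sketch of the multiplication behaviour described above. It uses composition rather than extending BigDecimal, which is a swapped-in simplification, and every member shown is an assumption about a class that does not exist yet. Rounding for output uses MathContext, whose default rounding mode is HALF_UP.

```java
import java.math.BigDecimal;
import java.math.MathContext;

final class BigSigFig {
    private final BigDecimal value; // exact internal value, never rounded
    private final int sigFigs;      // current minimum significant figures

    BigSigFig(String text) {
        this(new BigDecimal(text), countSigFigs(text));
    }

    private BigSigFig(BigDecimal value, int sigFigs) {
        this.value = value;
        this.sigFigs = sigFigs;
    }

    // Trailing zeros count only when the text contains a decimal point.
    private static int countSigFigs(String text) {
        BigDecimal v = new BigDecimal(text);
        return (text.indexOf('.') >= 0 ? v : v.stripTrailingZeros()).precision();
    }

    /** Multiplying by an exact constant leaves the sigfig count unchanged. */
    BigSigFig multiply(BigDecimal exact) {
        return new BigSigFig(value.multiply(exact), sigFigs);
    }

    /** Multiplying by another measured value keeps the smaller sigfig count. */
    BigSigFig multiply(BigSigFig other) {
        return new BigSigFig(value.multiply(other.value), Math.min(sigFigs, other.sigFigs));
    }

    BigDecimal exactValue() {
        return value;
    }

    int sigFigs() {
        return sigFigs;
    }

    /** Rounds only for display; the exact value is retained internally. */
    @Override
    public String toString() {
        return value.round(new MathContext(sigFigs)).toPlainString();
    }
}
```

With this sketch, new BigSigFig("1.1").multiply(new BigSigFig("2")) holds exactly 2.2 internally and prints as "2", while multiplying by new BigDecimal("2.54") keeps two significant figures.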

Dan




[Crypto] Compile From Source and Openssl3 Support

2023-08-09 Thread Daniel Thertell
Hey All,

I am looking to compile Commons Crypto from source and I am wondering if
there is any documentation for this process? I am trying to build Gary
Gregory's OpenSSL3 branch but I am encountering the following error. I know
this isn't the main branch but I am hoping someone will still be able to
help out. I receive the following error when I run "make linux64" (I
received a similar error on an M1 when I ran 'make mac64')

*** No rule to make target
'target/jni-classes/org_apache_commons_crypto_random_OpenSslCryptoRandomNative.h',
needed by
'target/commons-crypto-3_0_x-Linux-x86_64/OpenSslCryptoRandomNative.o'.
Stop.

Any help or ideas would be appreciated!

Thanks,
Dan Thertell


[crypto] Compiling from source

2023-08-09 Thread Daniel Thertell
Hey All,

I am looking to compile Commons Crypto from source and I am wondering if
there is any documentation for this process? I am trying to build Gary
Gregory's OpenSSL3 branch but I am encountering the following error, I know
this isn't the main branch but I am hoping someone will still be able to
help out. I receive the following error when I run "make linux64" on an
Ubuntu server VM (I received a similar error on a Mac M1 when I ran 'make
mac64')

*** No rule to make target
'target/jni-classes/org_apache_commons_crypto_random_OpenSslCryptoRandomNative.h',
needed by
'target/commons-crypto-3_0_x-Linux-x86_64/OpenSslCryptoRandomNative.o'.
Stop.

Any help or ideas would be appreciated! Also I am sorry if duplicate emails
were received, my first attempt to send this resulted in an error message
being sent back!

Thanks,
Dan Thertell


Re: [commons-math] function or Number class to count/track number of significant figures

2023-08-09 Thread Alex Herbert
On Wed, 9 Aug 2023 at 17:13, Daniel Watson  wrote:

> BigSigFig result = new BigSigFig("1.1").multiply(new BigSigFig("2"))

Multiply is easy as you take the minimum significant figures. What
about addition?

12345 + 0.0001

Here the significant figures should remain at 5.

And for this:

12345 + 10.0
12345 + 10
12345 + 1
12345 + 1.0
12345 + 1.00

You have to track the overlap of significant digits somehow.

Alex




Re: [Crypto] Compile From Source and Openssl3 Support

2023-08-09 Thread Gary Gregory
I should have kept notes!

Gary



Re: [Crypto] Compile From Source and Openssl3 Support

2023-08-09 Thread Daniel Thertell
Hey Gary,

lol ya I also have that note keeping issue!
By any chance do you know what the version env variable should be? I am
using 3_0_X right now.

Thanks,
Dan Thertell



[Codec] clearing input byte array vs not

2023-08-09 Thread Gary Gregory
Hi all,

Any thoughts on https://github.com/apache/commons-codec/pull/197

Gary


Re: [commons-math] function or Number class to count/track number of significant figures

2023-08-09 Thread Daniel Watson
I believe the convention is to take the *least* precise term and apply that
precision (here "precision" != "sigfigs" - I've been using both terms to
mean sigfigs, but for these purposes precision is actually defined as how
small a fraction the measurement is able to convey - e.g. 0.01 is more
precise than 1.1, despite the latter having more sigfigs).

The results should be...

12345 + 10.0 = 12355
12345 + 10 = 12355
12345 + 1 = 12346
12345 + 1.0 = 12346
12345 + 1.00 = 12346

None of these will have decimal places because the left term was not
precise enough to have them. When adding/subtracting you can end up with
more significant figures in your result than you had in one of your terms;
you just can't end up with a result more "precise" than your least precise
term, e.g.

999.0 + 9.41 = 1008.4
4 sigfigs + 3 sigfigs = 5 sigfigs - It's perfectly fine that we ended up
with more here, as long as we didn't increase the "precision".

So in this case I think the correct logic is to add the two terms together
in the normal way, reduce the precision to that of the limiting term, and
then recalculate the number of significant figures on the result.

I believe that, conveniently, the BigDecimal class already tracks this as
scale(). So the information is available to determine the new precision. It
would just be a matter of retaining it within the wrapper class and
applying it when producing the final output string. I'd need to play around
with a few more examples, but I think that's the logic at a high level.
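
The addition logic above can be sketched with scale() directly. This assumes each BigDecimal's scale already marks its last significant decimal place, which holds for values parsed from text such as "999.0" but not for a value like "11000" with two significant figures (that would first need its trailing zeros stripped). The class name SigFigAddition and the choice of HALF_UP rounding are assumptions.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

class SigFigAddition {

    /** Adds exactly, then rounds the sum to the scale of the least precise term. */
    static BigDecimal add(BigDecimal a, BigDecimal b) {
        int limitingScale = Math.min(a.scale(), b.scale());
        return a.add(b).setScale(limitingScale, RoundingMode.HALF_UP);
    }

    public static void main(String[] args) {
        BigDecimal sum = add(new BigDecimal("999.0"), new BigDecimal("9.41"));
        // The sigfig count is recalculated from the rounded result.
        System.out.println(sum + " has " + sum.precision() + " significant figures");
        System.out.println(add(new BigDecimal("12345"), new BigDecimal("0.0001")));
        System.out.println(add(new BigDecimal("12345"), new BigDecimal("10.0")));
    }
}
```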

Dan



Re: [Codec] clearing input byte array vs not

2023-08-09 Thread Mark Thomas

Reject it. And document the existing behavior.

Mark





Re: [Crypto] Compile From Source and Openssl3 Support

2023-08-09 Thread Daniel Thertell
Hey Gary

I believe I managed to get it to build, however I do have a few questions.

1. Why were the make targets for the header files commented out and
pointing to the wrong locations (in the make file)?
2. After successfully running make, how do I package everything into a JAR
for testing?

Thanks,
Dan Thertell



Re: [commons-text] Additional CaseUtils type functionality that can handle snake, kebab, camel, pascal, and others

2023-08-09 Thread Daniel Watson
Here's my stab at a spec. Wanted to clarify some parts of the Case
interface first before jumping into the implementations. Wondering what a
good package name for this stuff is, given that "case" is a reserved word?

Case (interface)
The Case interface defines two methods:
* String format(Iterable<String> tokens)
The format method accepts an Iterable of String tokens and returns a single
String formatted according to the implementation. The format method is
intended to handle transforming between cases, thus tokens passed to the
format() method need not be properly formatted for the given Case instance,
though they must still respect any reserved character restrictions.
* List<String> parse(String string)
The parse method accepts a single string and returns a List of string
tokens that abide by the Case implementation.
Note: format() and parse() methods must be fully reciprocal, i.e. on a
single Case instance, when calling parse() with a valid string, and passing
the resulting tokens into format(), a matching string should be returned.

DelimitedCase (base class for kebab and snake)
Defines a Case where all tokens are separated by a single character
delimiter. The delimiter is considered a reserved character and is not
allowed to appear within tokens when formatting. No further restrictions
are placed on token contents by this base implementation. Tokens can
contain any valid Java String character. DelimitedCases can support
zero-length tokens, which can occur if there are no characters between two
instances of the delimiter or if the parsed string begins or ends with the
delimiter.
Note: Other Case implementations may not support zero-length tokens, and
attempts to call format(...) with empty tokens may fail.

KebabCase
Extends DelimitedCase and initializes the delimiter as the hyphen '-'
character. This case allows only alphanumeric characters within tokens.

SnakeCase
Extends DelimitedCase and initializes the delimiter as the underscore '_'
character. This case allows only alphanumeric characters within tokens.

PascalCase
Defines a Case where tokens begin with an uppercase alpha character. All
subsequent token characters must be lowercase alpha or numeric characters.
Whenever an uppercase alpha character is encountered, the previous token is
considered complete and a new token begins, with the uppercase character
being the first character of the new token. PascalCase does not allow
zero-length tokens when formatting, as it would violate the reciprocal
contract of format() and parse().

CamelCase
Extends PascalCase and sets one additional restriction - that the first
character of the first token (i.e. the first character of the full string)
must be a lowercase alpha character (rather than the uppercase requirement
of PascalCase). All other restrictions of PascalCase apply.


On Tue, Aug 8, 2023 at 8:55 PM Daniel Watson  wrote:

> Kebab case is extremely common for web identifiers, eg html element ids,
> classes, attributes, etc.
>
> In regards to PascalCase, i agree that most people won't understand the
> reasoning behind the name, but it is nevertheless a widely accepted term
> for that case style. If an alternative is deemed necessary then
> "ProperCase" might work - since that is also how English proper nouns are
> cased. Understanding that name just depends on your knowledge of English
> grammar.
>
> A spec can definitely be written for the 4 provided concrete
> implementations. And... I may eat these words but... the spec should not be
> all that complex. I will take a stab at it.
>
> Thanks for the feedback!
> Any other thoughts or comments are welcome!
>
> Dan
>
>
> On Tue, Aug 8, 2023, 7:45 PM Elliotte Rusty Harold 
> wrote:
>
>> This is a good idea and seems like useful functionality. In order to
>> accept it into commons, it needs solid documentation and excellent
>> test coverage. I've worked on code like this in another language (not
>> Java) and the production bugs were bad. E.g. what happens when a
>> string contains numbers as well as letters?
>>
>> I'd like to see a full spec that unambiguously defines how every
>> Unicode string is converted into camel/snake/kebab case. The spec
>> should be independent of the code. That's not easy to write but it's
>> essential.
>>
>> I don't want any loose/strict modes. It should all be strict according to
>> spec.
>>
>> I've never heard of kebab cases before. Is that a common name? I'd
>> also like to rename Pascal case. How many programmers under 40 have
>> even heard of Pascal, much less are familiar with its case
>> conventions?
>>
>> Long story short - a PR is premature until there's an agreed upon spec.
>>
>> On Tue, Aug 8, 2023 at 8:04 PM Daniel Watson 
>> wrote:
>> >
>> > I have a bit of code that adds the ability to parse and format strings
>> into
>> > various case patterns. Wanted to check if it's of worth and in-scope for
>> > commons-text...
>> >
>> > Its a bit broader than the existing CaseUtils.toCamelCase(...) method.
>> > Rather than simply formatting to

Re: [Codec] clearing input byte array vs not

2023-08-09 Thread Gary D. Gregory
Done and done in git master.

Next, is how to document or change 
org.apache.commons.codec.digest.Crypt.crypt(byte[], String): The method clears 
the input byte array for all input types _except_ when calling UnixCrypt [1].

I could: 
(1) Document the inconsistency (right now, I left it unsaid)
(2) Make UnixCrypt.crypt() clear its input password for consistency.

WDYT?

TY!
Gary
[1]:
    public static String crypt(final byte[] keyBytes, final String salt) {
        if (salt == null) {
            return Sha2Crypt.sha512Crypt(keyBytes);
        }
        if (salt.startsWith(Sha2Crypt.SHA512_PREFIX)) {
            return Sha2Crypt.sha512Crypt(keyBytes, salt);
        }
        if (salt.startsWith(Sha2Crypt.SHA256_PREFIX)) {
            return Sha2Crypt.sha256Crypt(keyBytes, salt);
        }
        if (salt.startsWith(Md5Crypt.MD5_PREFIX)) {
            return Md5Crypt.md5Crypt(keyBytes, salt);
        }
        return UnixCrypt.crypt(keyBytes, salt);
    }
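
From a caller's point of view, the inconsistency means the array may or may not be usable after the call. A hedged sketch of a defensive pattern follows: pass a copy whenever the original bytes are still needed. The method consumeAndClear is a hypothetical stand-in for any method that zeroes its input and is not part of Commons Codec.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class ClearingDemo {

    /** Hypothetical stand-in for a method that uses its input and then zeroes it. */
    static int consumeAndClear(byte[] keyBytes) {
        int result = Arrays.hashCode(keyBytes); // placeholder for the real work
        Arrays.fill(keyBytes, (byte) 0);        // wipe the caller's array
        return result;
    }

    public static void main(String[] args) {
        byte[] password = "secret".getBytes(StandardCharsets.UTF_8);
        byte[] copy = password.clone(); // the callee may wipe whatever it receives
        consumeAndClear(copy);
        System.out.println("copy wiped: " + Arrays.equals(copy, new byte[copy.length]));
        System.out.println("original intact: "
            + Arrays.equals(password, "secret".getBytes(StandardCharsets.UTF_8)));
    }
}
```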





Re: [Crypto] Compile From Source and Openssl3 Support

2023-08-09 Thread Gary Gregory
The branch is work in progress from a while ago, and it did not work
completely, that much I remember. I can't take the time to look at it
today, I'm looking at other issues in Commons.

Gary



Re: [Crypto] Compile From Source and Openssl3 Support

2023-08-09 Thread Daniel Thertell
ya that's totally fine!
I will continue to try and figure this out.

Thanks,
Dan Thertell




Re: [Codec] clearing input byte array vs not

2023-08-09 Thread Elliotte Rusty Harold
This makes sense to me. The existing behavior seems surprising and
incorrect. Is there a reason for it?

On Wed, Aug 9, 2023 at 6:53 PM Gary Gregory  wrote:
>
> Hi all,
>
> Any thoughts on https://github.com/apache/commons-codec/pull/197
>
> Gary



-- 
Elliotte Rusty Harold
elh...@ibiblio.org

-
To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
For additional commands, e-mail: dev-h...@commons.apache.org



Re: [commons-text] Additional CaseUtils type functionality that can handle snake, kebab, camel, pascal, and others

2023-08-09 Thread Elliotte Rusty Harold
What happens when a token contains an unpermitted character?

On Wed, Aug 9, 2023 at 8:30 PM Daniel Watson  wrote:
>
> Here's my stab at a spec. Wanted to clarify some parts of the Case
> interface first before jumping into the implementations. Wondering what a
> good package name for this stuff is, given that "case" is a reserved word?
>
> Case (interface)
> The Case interface defines two methods:
> * String format(Iterable tokens)
> The format method accepts an Iterable of String tokens and returns a single
> String formatted according to the implementation. The format method is
> intended to handle transforming between cases, thus tokens passed to the
> format() method need not be properly formatted for the given Case instance,
> though they must still respect any reserve character restrictions.
> * List parse(String string)
> The parse method accepts a single string and returns a List of string
> tokens that abide by the Case implementation.
> Note: format() and parse() methods must be fully reciprocal. ie. On a
> single Case instance, when calling parse() with a valid string, and passing
> the resulting tokens into format(), a matching string should be returned.
>
> DelimitedCase (base class for kebab and snake)
> Defines a Case where all tokens are separated by a single character
> delimiter. The delimiter is considered a reserved character and is not
> allowed to appear within tokens when formatting. No further restrictions
> are placed on token contents by this base implementation. Tokens can
> contain any valid Java String character. DelimitedCases can support
> zero-length tokens, which can occur if there are no characters between two
> instances of the delimiter or if the parsed string begins or ends with the
> delimiter.
> Note: Other Case implementations may not support zero-length tokens, and
> attempts to call format(...) with empty tokens may fail.
>
> KebabCase
> Extends DelimitedCase and initializes the delimiter as the hyphen '-'
> character. This case allows only alphanumeric characters within tokens.
>
> SnakeCase
> Extends DelimitedCase and initializes the delimiter as the underscore '_'
> character. This case allows only alphanumeric characters within tokens.
>
> PascalCase
> Defines a Case where tokens begin with an uppercase alpha character. All
> subsequent token characters must be lowercase alpha or numeric characters.
> Whenever an uppercase alpha character is encountered, the previous token is
> considered complete and a new token begins, with the uppercase character
> being the first character of the new token. PascalCase does not allow
> zero-length tokens when formatting, as it would violate the reciprocal
> contract of format() and parse().
>
> CamelCase
> Extends PascalCase and sets one additional restriction - that the first
> character of the first token (ie the first character of the full string)
> must be a lowercase alpha character (rather than the uppercase requirement
> of PascalCase). All other restrictions of PascalCase apply.
>
>
> On Tue, Aug 8, 2023 at 8:55 PM Daniel Watson  wrote:
>
> > Kebab case is extremely common for web identifiers, eg html element ids,
> > classes, attributes, etc.
> >
> > In regards to PascalCase, i agree that most people won't understand the
> > reasoning behind the name, but it is nevertheless a widely accepted term
> > for that case style. If an alternative is deemed necessary then
> > "ProperCase" might work - since that is also how English proper nouns are
> > cased. Understanding that name just depends on your knowledge of English
> > grammar.
> >
> > A spec can definitely be written for the 4 provided concrete
> > implementations. And... I may eat these words but... the spec should not be
> > all that complex. I will take a stab at it.
> >
> > Thanks for the feedback!
> > Any other thoughts or comments are welcome!
> >
> > Dan
> >
> >
> > On Tue, Aug 8, 2023, 7:45 PM Elliotte Rusty Harold 
> > wrote:
> >
> >> This is a good idea and seems like useful functionality. In order to
> >> accept it into commons, it needs solid documentation and excellent
> >> test coverage. I've worked on code like this in another language (not
> >> Java) and the production bugs were bad. E.g. what happens when a
> >> string contains numbers as well as letters?
> >>
> >> I'd like to see a full spec that unambiguously defines how every
> >> Unicode string is converted into camel/snake/kebab case. The spec
> >> should be independent of the code. That's not easy to write but it's
> >> essential.
> >>
> >> I don't want any loose/strict modes. It should all be strict according to
> >> spec.
> >>
> >> I've never heard of kebab cases before. Is that a common name? I'd
> >> also like to rename Pascal case. How many programmers under 40 have
> >> even heard of Pascal, much less are familiar with its case
> >> conventions?
> >>
> >> Long story short - a PR is premature until there's an agreed upon spec.
> >>
> >> On Tue, Aug 8, 2023 at 8:04 PM Daniel Watson 
> >>

Re: [Codec] clearing input byte array vs not

2023-08-09 Thread Gary Gregory
The class comment says the code originates in FreeBSD C.

Gary

On Wed, Aug 9, 2023, 6:03 PM Elliotte Rusty Harold 
wrote:

> This makes sense to me. The existing behavior seems surprising and
> incorrect. Is there a reason for it?
>
> On Wed, Aug 9, 2023 at 6:53 PM Gary Gregory 
> wrote:
> >
> > Hi all,
> >
> > Any thoughts on https://github.com/apache/commons-codec/pull/197
> >
> > Gary
>
>
>
> --
> Elliotte Rusty Harold
> elh...@ibiblio.org
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
> For additional commands, e-mail: dev-h...@commons.apache.org
>
>


Re: [commons-text] Additional CaseUtils type functionality that can handle snake, kebab, camel, pascal, and others

2023-08-09 Thread Gary Gregory
Probably should be an IAE...?

Gary

On Wed, Aug 9, 2023, 6:07 PM Elliotte Rusty Harold 
wrote:

> What happens when a token contains an unpermitted character?
>
> On Wed, Aug 9, 2023 at 8:30 PM Daniel Watson  wrote:
> >
> > Here's my stab at a spec. Wanted to clarify some parts of the Case
> > interface first before jumping into the implementations. Wondering what a
> > good package name for this stuff is, given that "case" is a reserved
> word?
> >
> > Case (interface)
> > The Case interface defines two methods:
> > * String format(Iterable tokens)
> > The format method accepts an Iterable of String tokens and returns a
> single
> > String formatted according to the implementation. The format method is
> > intended to handle transforming between cases, thus tokens passed to the
> > format() method need not be properly formatted for the given Case
> instance,
> > though they must still respect any reserve character restrictions.
> > * List parse(String string)
> > The parse method accepts a single string and returns a List of string
> > tokens that abide by the Case implementation.
> > Note: format() and parse() methods must be fully reciprocal. ie. On a
> > single Case instance, when calling parse() with a valid string, and
> passing
> > the resulting tokens into format(), a matching string should be returned.
> >
> > DelimitedCase (base class for kebab and snake)
> > Defines a Case where all tokens are separated by a single character
> > delimiter. The delimiter is considered a reserved character and is not
> > allowed to appear within tokens when formatting. No further restrictions
> > are placed on token contents by this base implementation. Tokens can
> > contain any valid Java String character. DelimitedCases can support
> > zero-length tokens, which can occur if there are no characters between
> two
> > instances of the delimiter or if the parsed string begins or ends with
> the
> > delimiter.
> > Note: Other Case implementations may not support zero-length tokens, and
> > attempts to call format(...) with empty tokens may fail.
> >
> > KebabCase
> > Extends DelimitedCase and initializes the delimiter as the hyphen '-'
> > character. This case allows only alphanumeric characters within tokens.
> >
> > SnakeCase
> > Extends DelimitedCase and initializes the delimiter as the underscore '_'
> > character. This case allows only alphanumeric characters within tokens.
> >
> > PascalCase
> > Defines a Case where tokens begin with an uppercase alpha character. All
> > subsequent token characters must be lowercase alpha or numeric
> characters.
> > Whenever an uppercase alpha character is encountered, the previous token
> is
> > considered complete and a new token begins, with the uppercase character
> > being the first character of the new token. PascalCase does not allow
> > zero-length tokens when formatting, as it would violate the reciprocal
> > contract of format() and parse().
> >
> > CamelCase
> > Extends PascalCase and sets one additional restriction - that the first
> > character of the first token (ie the first character of the full string)
> > must be a lowercase alpha character (rather than the uppercase
> requirement
> > of PascalCase). All other restrictions of PascalCase apply.
> >
> >
> > On Tue, Aug 8, 2023 at 8:55 PM Daniel Watson 
> wrote:
> >
> > > Kebab case is extremely common for web identifiers, eg html element
> ids,
> > > classes, attributes, etc.
> > >
> > > In regards to PascalCase, i agree that most people won't understand the
> > > reasoning behind the name, but it is nevertheless a widely accepted
> term
> > > for that case style. If an alternative is deemed necessary then
> > > "ProperCase" might work - since that is also how English proper nouns
> are
> > > cased. Understanding that name just depends on your knowledge of
> English
> > > grammar.
> > >
> > > A spec can definitely be written for the 4 provided concrete
> > > implementations. And... I may eat these words but... the spec should
> not be
> > > all that complex. I will take a stab at it.
> > >
> > > Thanks for the feedback!
> > > Any other thoughts or comments are welcome!
> > >
> > > Dan
> > >
> > >
> > > On Tue, Aug 8, 2023, 7:45 PM Elliotte Rusty Harold  >
> > > wrote:
> > >
> > >> This is a good idea and seems like useful functionality. In order to
> > >> accept it into commons, it needs solid documentation and excellent
> > >> test coverage. I've worked on code like this in another language (not
> > >> Java) and the production bugs were bad. E.g. what happens when a
> > >> string contains numbers as well as letters?
> > >>
> > >> I'd like to see a full spec that unambiguously defines how every
> > >> Unicode string is converted into camel/snake/kebab case. The spec
> > >> should be independent of the code. That's not easy to write but it's
> > >> essential.
> > >>
> > >> I don't want any loose/strict modes. It should all be strict
> according to
> > >> spec.
> > >>
> > >> I've never heard of kebab cases befo

Re: [commons-text] Additional CaseUtils type functionality that can handle snake, kebab, camel, pascal, and others

2023-08-09 Thread Hasan Diwan
[content inline]

On Wed, 9 Aug 2023 at 15:08, Elliotte Rusty Harold 
wrote:

> What happens when a token contains an unpermitted character?
>

Three possibilities:
1. null -- favoured by Square's HTTP implementation.
2. a checked Exception -- preferred by the JDK.
3. an unchecked Exception -- leveraged by various commons projects, like
commons-math's interpolate method. -- H
-- 
OpenPGP: https://hasan.d8u.us/openpgp.asc
If you wish to request my time, please do so using
bit.ly/hd1AppointmentRequest.
If you would like to get acquainted, go to bit.ly/hd1AppointmentRequest.

Sent from my mobile device


Re: [commons-text] Additional CaseUtils type functionality that can handle snake, kebab, camel, pascal, and others

2023-08-09 Thread Elliotte Rusty Harold
Checked exception is almost certainly wrong. But I'm not sure we need
any exception at all here. I don't think these methods need any
exceptions at all aside from NullPointerException for null inputs.
Otherwise, every string should have a deterministic representation in
camel case, pascal case, etc.

Is there any external documentation of these forms we could reference?

On Wed, Aug 9, 2023 at 10:21 PM Hasan Diwan  wrote:
>
> [content inline]
>
> On Wed, 9 Aug 2023 at 15:08, Elliotte Rusty Harold 
> wrote:
>
> > What happens when a token contains an unpermitted character?
> >
>
> Three possibilities:
> 1. null -- favoured by Square's HTTP implementation.
> 2. a checked Exception --preferred by the JDK.
> 3. an unchecked Exception -- leveraged by various commons projects,  like
> commons-math's interpolate method. -- H
> --
> OpenPGP: https://hasan.d8u.us/openpgp.asc
> If you wish to request my time, please do so using
> *bit.ly/hd1AppointmentRequest
> *.
> Si vous voudrais faire connnaisance, allez a *bit.ly/hd1AppointmentRequest
> *.
>
> Sent
> from my mobile device
> Envoye de mon portable



-- 
Elliotte Rusty Harold
elh...@ibiblio.org




Re: [commons-text] Additional CaseUtils type functionality that can handle snake, kebab, camel, pascal, and others

2023-08-09 Thread Hasan Diwan
[content inline]

On Wed, 9 Aug 2023 at 15:47, Elliotte Rusty Harold 
wrote:

> Checked exception is almost certainly wrong. But I'm not sure we need
> any exception at all here. I don't think these methods need any
> exceptions at all aside from NullPointerException for null inputs.
> Otherwise, every string should have a deterministic representation in
> camel case, pascal case, etc.
>
I was just adumbrating possibilities here. Personally, I prefer checked
exceptions to unchecked, but think null in this case is best. -- H


Re: [commons-text] Additional CaseUtils type functionality that can handle snake, kebab, camel, pascal, and others

2023-08-09 Thread Gary Gregory
How is null helpful when you provide a string with a typo for example? I
would expect something like "Illegal character '[' at index 34".
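
A minimal sketch of a check that produces a message of that shape. The class and method names (TokenValidator, requireAlphanumeric) are hypothetical and are not part of commons-text; the alphanumeric-only rule is an assumption taken from the draft spec quoted below.

```java
// Hypothetical helper, not an existing commons-text class.
final class TokenValidator {
    private TokenValidator() {
    }

    // Throws IllegalArgumentException naming the first non-alphanumeric
    // character and its index. Works on UTF-16 chars, so supplementary
    // code points are not treated specially in this sketch.
    static void requireAlphanumeric(String token) {
        for (int i = 0; i < token.length(); i++) {
            char c = token.charAt(i);
            if (!Character.isLetterOrDigit(c)) {
                throw new IllegalArgumentException(
                        "Illegal character '" + c + "' at index " + i);
            }
        }
    }
}
```

A null token fails with a NullPointerException from token.length(), which matches the usual convention for null inputs.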

Gary

On Wed, Aug 9, 2023, 6:52 PM Hasan Diwan  wrote:

> [content inline]
>
> On Wed, 9 Aug 2023 at 15:47, Elliotte Rusty Harold 
> wrote:
>
> > Checked exception is almost certainly wrong. But I'm not sure we need
> > any exception at all here. I don't think these methods need any
> > exceptions at all aside from NullPointerException for null inputs.
> > Otherwise, every string should have a deterministic representation in
> > camel case, pascal case, etc.
> >
> I was just adumbrating possibilities here. Personally, I prefer checked
> exceptions to unchecked, but think null in this case is best. -- H
> --
> OpenPGP: https://hasan.d8u.us/openpgp.asc
> If you wish to request my time, please do so using
> *bit.ly/hd1AppointmentRequest
> *.
> Si vous voudrais faire connnaisance, allez a *bit.ly/hd1AppointmentRequest
> *.
>
>  >Sent
> from my mobile device
> Envoye de mon portable
>


Re: [commons-text] Additional CaseUtils type functionality that can handle snake, kebab, camel, pascal, and others

2023-08-09 Thread Daniel Watson
Currently I'm planning a set of exceptions that are thrown for various
reasons. I created multiple classes to allow for clearer testing.

ReservedCharacterException (extends InvalidCharacterException below) -
thrown specifically when a reserved character is encountered within a token.

InvalidCharacterException (extends IllegalArgumentException) - thrown
directly any time an illegal character is encountered.

ZeroLengthTokenException (extends IllegalArgumentException) - thrown when a
zero-length token is encountered and the Case does not support it.

There are a few other error cases I believe. I'm not looking at the code
right this moment but I'm fairly certain about the need for the above 3.
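
As a rough sketch, the three classes described above could be declared as follows. The class names come from the description; the single-String constructors are an assumption, since the actual code is not shown in this thread.

```java
// Sketch only: constructor signatures are assumed, not taken from the PR.
class InvalidCharacterException extends IllegalArgumentException {
    InvalidCharacterException(String message) {
        super(message);
    }
}

// Specialization for characters reserved by the Case, such as a delimiter.
class ReservedCharacterException extends InvalidCharacterException {
    ReservedCharacterException(String message) {
        super(message);
    }
}

// Raised when an empty token reaches a Case that cannot represent it.
class ZeroLengthTokenException extends IllegalArgumentException {
    ZeroLengthTokenException(String message) {
        super(message);
    }
}
```

Because all three extend IllegalArgumentException, a caller that does not care about the distinction can catch that one type.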


On Wed, Aug 9, 2023, 6:08 PM Elliotte Rusty Harold 
wrote:

> What happens when a token contains an unpermitted character?
>
> On Wed, Aug 9, 2023 at 8:30 PM Daniel Watson  wrote:
> >
> > Here's my stab at a spec. Wanted to clarify some parts of the Case
> > interface first before jumping into the implementations. Wondering what a
> > good package name for this stuff is, given that "case" is a reserved
> word?
> >
> > Case (interface)
> > The Case interface defines two methods:
> > * String format(Iterable tokens)
> > The format method accepts an Iterable of String tokens and returns a
> single
> > String formatted according to the implementation. The format method is
> > intended to handle transforming between cases, thus tokens passed to the
> > format() method need not be properly formatted for the given Case
> instance,
> > though they must still respect any reserve character restrictions.
> > * List parse(String string)
> > The parse method accepts a single string and returns a List of string
> > tokens that abide by the Case implementation.
> > Note: format() and parse() methods must be fully reciprocal. ie. On a
> > single Case instance, when calling parse() with a valid string, and
> passing
> > the resulting tokens into format(), a matching string should be returned.
> >
> > DelimitedCase (base class for kebab and snake)
> > Defines a Case where all tokens are separated by a single character
> > delimiter. The delimiter is considered a reserved character and is not
> > allowed to appear within tokens when formatting. No further restrictions
> > are placed on token contents by this base implementation. Tokens can
> > contain any valid Java String character. DelimitedCases can support
> > zero-length tokens, which can occur if there are no characters between
> two
> > instances of the delimiter or if the parsed string begins or ends with
> the
> > delimiter.
> > Note: Other Case implementations may not support zero-length tokens, and
> > attempts to call format(...) with empty tokens may fail.
> >
> > KebabCase
> > Extends DelimitedCase and initializes the delimiter as the hyphen '-'
> > character. This case allows only alphanumeric characters within tokens.
> >
> > SnakeCase
> > Extends DelimitedCase and initializes the delimiter as the underscore '_'
> > character. This case allows only alphanumeric characters within tokens.
> >
> > PascalCase
> > Defines a Case where tokens begin with an uppercase alpha character. All
> > subsequent token characters must be lowercase alpha or numeric
> characters.
> > Whenever an uppercase alpha character is encountered, the previous token
> is
> > considered complete and a new token begins, with the uppercase character
> > being the first character of the new token. PascalCase does not allow
> > zero-length tokens when formatting, as it would violate the reciprocal
> > contract of format() and parse().
> >
> > CamelCase
> > Extends PascalCase and sets one additional restriction - that the first
> > character of the first token (ie the first character of the full string)
> > must be a lowercase alpha character (rather than the uppercase
> requirement
> > of PascalCase). All other restrictions of PascalCase apply.
> >
> >
> > On Tue, Aug 8, 2023 at 8:55 PM Daniel Watson 
> wrote:
> >
> > > Kebab case is extremely common for web identifiers, eg html element
> ids,
> > > classes, attributes, etc.
> > >
> > > In regards to PascalCase, i agree that most people won't understand the
> > > reasoning behind the name, but it is nevertheless a widely accepted
> term
> > > for that case style. If an alternative is deemed necessary then
> > > "ProperCase" might work - since that is also how English proper nouns
> are
> > > cased. Understanding that name just depends on your knowledge of
> English
> > > grammar.
> > >
> > > A spec can definitely be written for the 4 provided concrete
> > > implementations. And... I may eat these words but... the spec should
> not be
> > > all that complex. I will take a stab at it.
> > >
> > > Thanks for the feedback!
> > > Any other thoughts or comments are welcome!
> > >
> > > Dan
> > >
> > >
> > > On Tue, Aug 8, 2023, 7:45 PM Elliotte Rusty Harold  >
> > > wrote:
> > >
> > >> This is a good idea and seems like useful functionality. In order to
> > >> accept it i

Re: [commons-text] Additional CaseUtils type functionality that can handle snake, kebab, camel, pascal, and others

2023-08-09 Thread Daniel Watson
Meant to add...

The reason I would favor exceptions is that the underlying implementation
can be easily customized. If the user needs to allow non-alphanumeric
characters, there is a boolean flag in the underlying abstract class
(AbstractConfigurableCase) that will simply turn that validation off. I
don't think we need to make any specific implementation significantly
error tolerant.

An extension of snake case that allows all characters would look something
like this:


class MySnakeCase extends SnakeCase {
    MySnakeCase() {
        super();
        // Turn off the alphanumeric-only validation.
        this.alphanumeric = false;
    }
}
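
For readers without the PR at hand, here is a self-contained sketch of how such a flag might interact with format() and parse(). Every name in it (SketchDelimitedCase, SketchSnakeCase, LenientSnakeCase, the alphanumeric field) is a hypothetical stand-in; this is not the proposed AbstractConfigurableCase code, only an illustration of the behavior described in the draft spec.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the proposed DelimitedCase.
class SketchDelimitedCase {
    private final char delimiter;
    // When true, tokens may contain only letters and digits.
    protected boolean alphanumeric = true;

    SketchDelimitedCase(char delimiter) {
        this.delimiter = delimiter;
    }

    // Joins tokens with the delimiter after validating each one.
    String format(Iterable<String> tokens) {
        StringBuilder sb = new StringBuilder();
        boolean first = true;
        for (String token : tokens) {
            validate(token);
            if (!first) {
                sb.append(delimiter);
            }
            sb.append(token);
            first = false;
        }
        return sb.toString();
    }

    // Splits on every delimiter, keeping zero-length tokens.
    List<String> parse(String string) {
        List<String> tokens = new ArrayList<>();
        int start = 0;
        for (int i = 0; i <= string.length(); i++) {
            if (i == string.length() || string.charAt(i) == delimiter) {
                String token = string.substring(start, i);
                validate(token);
                tokens.add(token);
                start = i + 1;
            }
        }
        return tokens;
    }

    private void validate(String token) {
        for (int i = 0; i < token.length(); i++) {
            char c = token.charAt(i);
            if (c == delimiter) {
                throw new IllegalArgumentException(
                        "Reserved character '" + c + "' at index " + i);
            }
            if (alphanumeric && !Character.isLetterOrDigit(c)) {
                throw new IllegalArgumentException(
                        "Illegal character '" + c + "' at index " + i);
            }
        }
    }
}

class SketchSnakeCase extends SketchDelimitedCase {
    SketchSnakeCase() {
        super('_');
    }
}

// Same shape as the MySnakeCase example: only the flag changes.
class LenientSnakeCase extends SketchSnakeCase {
    LenientSnakeCase() {
        super();
        this.alphanumeric = false;
    }
}
```

In this sketch the delimiter stays reserved even when the flag is off, so parse() followed by format() still returns the original string.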


On Wed, Aug 9, 2023, 7:29 PM Daniel Watson  wrote:

> Currently I'm planning a set of exceptions that are thrown for various
> reasons. I created multiple classes to allow for clearer testing.
>
> ReservedCharacterException (extends InvalidCharacterException below) -
> thrown specifically when a reserved character is encountered within a token.
>
> InvalidCharacterException (extends IllegalArgumentException) thrown
> directly any time an illegal character is encountered.
>
> ZeroLengthTokenException (extends Illegal arg excep) - thrown when a zero
> length token is encountered and Case does not support it.
>
> There are a few other error cases I believe. I'm not looking at the code
> right this moment but I'm fairly certain about the need for the above 3.
>
>
> On Wed, Aug 9, 2023, 6:08 PM Elliotte Rusty Harold 
> wrote:
>
>> What happens when a token contains an unpermitted character?
>>
>> On Wed, Aug 9, 2023 at 8:30 PM Daniel Watson 
>> wrote:
>> >
>> > Here's my stab at a spec. Wanted to clarify some parts of the Case
>> > interface first before jumping into the implementations. Wondering what
>> a
>> > good package name for this stuff is, given that "case" is a reserved
>> word?
>> >
>> > Case (interface)
>> > The Case interface defines two methods:
>> > * String format(Iterable tokens)
>> > The format method accepts an Iterable of String tokens and returns a
>> single
>> > String formatted according to the implementation. The format method is
>> > intended to handle transforming between cases, thus tokens passed to the
>> > format() method need not be properly formatted for the given Case
>> instance,
>> > though they must still respect any reserve character restrictions.
>> > * List parse(String string)
>> > The parse method accepts a single string and returns a List of string
>> > tokens that abide by the Case implementation.
>> > Note: format() and parse() methods must be fully reciprocal. ie. On a
>> > single Case instance, when calling parse() with a valid string, and
>> passing
>> > the resulting tokens into format(), a matching string should be
>> returned.
>> >
>> > DelimitedCase (base class for kebab and snake)
>> > Defines a Case where all tokens are separated by a single character
>> > delimiter. The delimiter is considered a reserved character and is not
>> > allowed to appear within tokens when formatting. No further restrictions
>> > are placed on token contents by this base implementation. Tokens can
>> > contain any valid Java String character. DelimitedCases can support
>> > zero-length tokens, which can occur if there are no characters between
>> two
>> > instances of the delimiter or if the parsed string begins or ends with
>> the
>> > delimiter.
>> > Note: Other Case implementations may not support zero-length tokens, and
>> > attempts to call format(...) with empty tokens may fail.
>> >
>> > KebabCase
>> > Extends DelimitedCase and initializes the delimiter as the hyphen '-'
>> > character. This case allows only alphanumeric characters within tokens.
>> >
>> > SnakeCase
>> > Extends DelimitedCase and initializes the delimiter as the underscore
>> '_'
>> > character. This case allows only alphanumeric characters within tokens.
>> >
>> > PascalCase
>> > Defines a Case where tokens begin with an uppercase alpha character. All
>> > subsequent token characters must be lowercase alpha or numeric
>> characters.
>> > Whenever an uppercase alpha character is encountered, the previous
>> token is
>> > considered complete and a new token begins, with the uppercase character
>> > being the first character of the new token. PascalCase does not allow
>> > zero-length tokens when formatting, as it would violate the reciprocal
>> > contract of format() and parse().
>> >
>> > CamelCase
>> > Extends PascalCase and sets one additional restriction - that the first
>> > character of the first token (ie the first character of the full string)
>> > must be a lowercase alpha character (rather than the uppercase
>> requirement
>> > of PascalCase). All other restrictions of PascalCase apply.
>> >
>> >
>> > On Tue, Aug 8, 2023 at 8:55 PM Daniel Watson 
>> wrote:
>> >
>> > > Kebab case is extremely common for web identifiers, eg html element
>> ids,
>> > > classes, attributes, etc.
>> > >
>> > > In regards to PascalCase, i agree that most people won't understand
>> the
>> > > reasoning behind the name, but it is nevertheless a widely accepted
>> term
>> > > fo

[ANNOUNCEMENT] Apache Commons DbUtils 1.8.0

2023-08-09 Thread Gary Gregory
The Apache Commons DbUtils team is pleased to announce the release of
Apache Commons DbUtils 1.8.0.

The Apache Commons DbUtils package is a set of Java utility classes
for easing JDBC development.

This is a feature and bug fix release, read the change log here:
https://commons.apache.org/proper/commons-dbutils/changes-report.html#a1.8.0

For complete information on Apache Commons DbUtils, including
instructions on how to submit bug reports, patches, or suggestions for
improvement, see the Apache Commons DbUtils website:

https://commons.apache.org/proper/commons-dbutils/

Download it from
https://commons.apache.org/proper/commons-dbutils/download_dbcp.cgi

Enjoy,
Gary Gregory
Apache Commons Team




Re: [commons-text] Additional CaseUtils type functionality that can handle snake, kebab, camel, pascal, and others

2023-08-09 Thread Gary Gregory
IMO, these can all be replaced by IAE because there is nothing I would
do differently at a call site if I caught one of these custom exceptions
vs. another; it's all the same issue, probably bad user input. The only
reason to create a custom exception would be to wrap additional
information like a location (line number, column number), but that's
not what you describe here. You can imagine an editor catching a
syntax error exception, extracting a line and column number, and
changing the style for that area of the text.
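
To illustrate the one case where a custom type would earn its keep, here is a sketch of an exception that carries the location. The name IllegalCharacterException and its accessors are hypothetical and are not proposed API.

```java
// Hypothetical exception that exposes where the problem occurred,
// so a caller can act on the location instead of parsing the message.
class IllegalCharacterException extends IllegalArgumentException {
    private final char character;
    private final int index;

    IllegalCharacterException(char character, int index) {
        super("Illegal character '" + character + "' at index " + index);
        this.character = character;
        this.index = index;
    }

    char getCharacter() {
        return character;
    }

    int getIndex() {
        return index;
    }
}
```

A caller that only wants to report bad input can still catch IllegalArgumentException; only a caller that needs the index has to know about the subtype.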

Gary

On Wed, Aug 9, 2023 at 7:30 PM Daniel Watson  wrote:
>
> Currently I'm planning a set of exceptions that are thrown for various
> reasons. I created multiple classes to allow for clearer testing.
>
> ReservedCharacterException (extends InvalidCharacterException below) -
> thrown specifically when a reserved character is encountered within a token.
>
> InvalidCharacterException (extends IllegalArgumentException) thrown
> directly any time an illegal character is encountered.
>
> ZeroLengthTokenException (extends Illegal arg excep) - thrown when a zero
> length token is encountered and Case does not support it.
>
> There are a few other error cases I believe. I'm not looking at the code
> right this moment but I'm fairly certain about the need for the above 3.
>
>
> On Wed, Aug 9, 2023, 6:08 PM Elliotte Rusty Harold 
> wrote:
>
> > What happens when a token contains an unpermitted character?
> >
> > On Wed, Aug 9, 2023 at 8:30 PM Daniel Watson  wrote:
> > >
> > > Here's my stab at a spec. Wanted to clarify some parts of the Case
> > > interface first before jumping into the implementations. Wondering what a
> > > good package name for this stuff is, given that "case" is a reserved
> > word?
> > >
> > > Case (interface)
> > > The Case interface defines two methods:
> > > * String format(Iterable tokens)
> > > The format method accepts an Iterable of String tokens and returns a
> > single
> > > String formatted according to the implementation. The format method is
> > > intended to handle transforming between cases, thus tokens passed to the
> > > format() method need not be properly formatted for the given Case
> > instance,
> > > though they must still respect any reserve character restrictions.
> > > * List parse(String string)
> > > The parse method accepts a single string and returns a List of string
> > > tokens that abide by the Case implementation.
> > > Note: format() and parse() methods must be fully reciprocal. ie. On a
> > > single Case instance, when calling parse() with a valid string, and
> > passing
> > > the resulting tokens into format(), a matching string should be returned.
> > >
> > > DelimitedCase (base class for kebab and snake)
> > > Defines a Case where all tokens are separated by a single character
> > > delimiter. The delimiter is considered a reserved character and is not
> > > allowed to appear within tokens when formatting. No further restrictions
> > > are placed on token contents by this base implementation. Tokens can
> > > contain any valid Java String character. DelimitedCases can support
> > > > zero-length tokens, which can occur if there are no characters between two
> > > > instances of the delimiter or if the parsed string begins or ends with the
> > > > delimiter.
> > > Note: Other Case implementations may not support zero-length tokens, and
> > > attempts to call format(...) with empty tokens may fail.
> > >
> > > KebabCase
> > > Extends DelimitedCase and initializes the delimiter as the hyphen '-'
> > > character. This case allows only alphanumeric characters within tokens.
> > >
> > > SnakeCase
> > > Extends DelimitedCase and initializes the delimiter as the underscore '_'
> > > character. This case allows only alphanumeric characters within tokens.
> > >
> > > PascalCase
> > > Defines a Case where tokens begin with an uppercase alpha character. All
> > > > subsequent token characters must be lowercase alpha or numeric characters.
> > > > Whenever an uppercase alpha character is encountered, the previous token is
> > > > considered complete and a new token begins, with the uppercase character
> > > being the first character of the new token. PascalCase does not allow
> > > zero-length tokens when formatting, as it would violate the reciprocal
> > > contract of format() and parse().
> > >
> > > CamelCase
> > > Extends PascalCase and sets one additional restriction - that the first
> > > > character of the first token (i.e. the first character of the full string)
> > > > must be a lowercase alpha character (rather than the uppercase requirement
> > > > of PascalCase). All other restrictions of PascalCase apply.
> > >
> > >
> > > On Tue, Aug 8, 2023 at 8:55 PM Daniel Watson 
> > wrote:
> > >
> > > > Kebab case is extremely common for web identifiers, eg html element
> > ids,
> > > > classes, attributes, etc.
> > > >
> > > > In regards to PascalCase, i agree that most people won't understand the
> > > > reasoning behind the name, but it is nevertheless a wide

Re: [commons-text] Additional CaseUtils type functionality that can handle snake, kebab, camel, pascal, and others

2023-08-09 Thread Daniel Watson
Currently those exceptions do capture token and character index
information, but I think I'm only using it to build the message. I get what
you're saying, but without them testing becomes less accurate. If IAE is
thrown all over the place, then asserting a failure can't actually
guarantee that it failed in the expected way.
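
To illustrate the testing argument, here is a minimal sketch of such a hierarchy. The three exception names come from this thread; the fields, constructors, messages, and the TokenChecks.validate helper are hypothetical and are not part of any Commons API.

```java
import java.util.List;

// Exception names are taken from the thread; fields and constructors are assumptions.
class InvalidCharacterException extends IllegalArgumentException {
    final int tokenIndex; // position of the offending token
    final int charIndex;  // position of the offending character within that token

    InvalidCharacterException(String message, int tokenIndex, int charIndex) {
        super(message + " (token " + tokenIndex + ", char " + charIndex + ")");
        this.tokenIndex = tokenIndex;
        this.charIndex = charIndex;
    }
}

class ReservedCharacterException extends InvalidCharacterException {
    ReservedCharacterException(int tokenIndex, int charIndex) {
        super("Reserved character in token", tokenIndex, charIndex);
    }
}

class ZeroLengthTokenException extends IllegalArgumentException {
    final int tokenIndex;

    ZeroLengthTokenException(int tokenIndex) {
        super("Zero-length token at index " + tokenIndex);
        this.tokenIndex = tokenIndex;
    }
}

// Hypothetical helper: validates tokens for a delimited case that allows
// only alphanumeric characters and does not accept empty tokens.
final class TokenChecks {
    private TokenChecks() {}

    static void validate(List<String> tokens, char delimiter) {
        for (int t = 0; t < tokens.size(); t++) {
            String token = tokens.get(t);
            if (token.isEmpty()) {
                throw new ZeroLengthTokenException(t);
            }
            for (int c = 0; c < token.length(); c++) {
                char ch = token.charAt(c);
                if (ch == delimiter) {
                    // The more specific failure is reported first.
                    throw new ReservedCharacterException(t, c);
                }
                if (!Character.isLetterOrDigit(ch)) {
                    throw new InvalidCharacterException("Illegal character in token", t, c);
                }
            }
        }
    }
}
```

A test can then catch ReservedCharacterException specifically and check tokenIndex and charIndex, rather than only observing that some IllegalArgumentException was thrown.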


In regards to what Elliotte said...


Not every set of tokens can actually be represented deterministically in
every case, which is why I think exceptions are needed.

my-component-1

Is a valid kebab cased string, with tokens my,component,1

However, this cannot be formatted in camel case or Pascal case, because their
tokens are delimited by uppercase alpha characters and "1" has none.

If those tokens were passed to those cases I would expect an exception to
be thrown; otherwise the result is not reciprocal. E.g. MyComponent1 parses
as only two PascalCase tokens.
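
A minimal sketch of the uppercase-delimited rule, assuming the PascalCase description from the spec quoted in this thread. PascalSketch and its method bodies are illustrative only and are not the proposed implementation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical sketch of an uppercase-delimited case.
final class PascalSketch {
    private PascalSketch() {}

    // Capitalizes the first character of each token and lowercases the rest.
    static String format(List<String> tokens) {
        StringBuilder out = new StringBuilder();
        for (String token : tokens) {
            if (token.isEmpty()) {
                throw new IllegalArgumentException("Zero-length token");
            }
            out.append(Character.toUpperCase(token.charAt(0)));
            out.append(token.substring(1).toLowerCase(Locale.ROOT));
        }
        return out.toString();
    }

    // Starts a new token at every uppercase character.
    static List<String> parse(String s) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char ch = s.charAt(i);
            if (Character.isUpperCase(ch) && current.length() > 0) {
                tokens.add(current.toString());
                current.setLength(0);
            }
            current.append(ch);
        }
        if (current.length() > 0) {
            tokens.add(current.toString());
        }
        return tokens;
    }
}
```

With this sketch, formatting the tokens my, component, 1 gives MyComponent1, and parsing that string gives the two tokens My and Component1, so the third token is lost on the way back.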

On Wed, Aug 9, 2023, 7:36 PM Daniel Watson  wrote:

> Meant to add...
>
> The reason I would favor exceptions is that the underlying implementation
> can be easily customized. If the user needs to allow non alphanumeric
> characters there is a boolean flag in the underlying abstract class
> (AbstractConfigurableCase) that will simply turn that validation off. I
> don't think we need to make any specific implementation be significantly
> error tolerant.
>
> An extension of snake case to allow all characters should look like..
>
>
> class MySnakeCase extends SnakeCase {
>     MySnakeCase() {
>         super();
>         // turn off the alphanumeric-only validation
>         this.alphanumeric = false;
>     }
> }
>
>
> On Wed, Aug 9, 2023, 7:29 PM Daniel Watson  wrote:
>
>> Currently I'm planning a set of exceptions that are thrown for various
>> reasons. I created multiple classes to allow for clearer testing.
>>
>> ReservedCharacterException (extends InvalidCharacterException below) -
>> thrown specifically when a reserved character is encountered within a token.
>>
>> InvalidCharacterException (extends IllegalArgumentException) thrown
>> directly any time an illegal character is encountered.
>>
>> ZeroLengthTokenException (extends Illegal arg excep) - thrown when a zero
>> length token is encountered and Case does not support it.
>>
>> There are a few other error cases I believe. I'm not looking at the code
>> right this moment but I'm fairly certain about the need for the above 3.
>>
>>
>> On Wed, Aug 9, 2023, 6:08 PM Elliotte Rusty Harold 
>> wrote:
>>
>>> What happens when a token contains an unpermitted character?
>>>
>>> On Wed, Aug 9, 2023 at 8:30 PM Daniel Watson 
>>> wrote:
>>> >
>>> > Here's my stab at a spec. Wanted to clarify some parts of the Case
>>> > interface first before jumping into the implementations. Wondering what a
>>> > good package name for this stuff is, given that "case" is a reserved word?
>>> >
>>> > Case (interface)
>>> > The Case interface defines two methods:
>>> > * String format(Iterable<String> tokens)
>>> > The format method accepts an Iterable of String tokens and returns a single
>>> > String formatted according to the implementation. The format method is
>>> > intended to handle transforming between cases, thus tokens passed to the
>>> > format() method need not be properly formatted for the given Case instance,
>>> > though they must still respect any reserved character restrictions.
>>> > * List<String> parse(String string)
>>> > The parse method accepts a single string and returns a List of string
>>> > tokens that abide by the Case implementation.
>>> > Note: format() and parse() methods must be fully reciprocal, i.e. on a
>>> > single Case instance, when calling parse() with a valid string, and passing
>>> > the resulting tokens into format(), a matching string should be returned.
>>> >
>>> > DelimitedCase (base class for kebab and snake)
>>> > Defines a Case where all tokens are separated by a single character
>>> > delimiter. The delimiter is considered a reserved character and is not
>>> > allowed to appear within tokens when formatting. No further restrictions
>>> > are placed on token contents by this base implementation. Tokens can
>>> > contain any valid Java String character. DelimitedCases can support
>>> > zero-length tokens, which can occur if there are no characters between two
>>> > instances of the delimiter or if the parsed string begins or ends with the
>>> > delimiter.
>>> > Note: Other Case implementations may not support zero-length tokens, and
>>> > attempts to call format(...) with empty tokens may fail.
>>> >
>>> > KebabCase
>>> > Extends DelimitedCase and initializes the delimiter as the hyphen '-'
>>> > character. This case allows only alphanumeric characters within tokens.
>>> >
>>> > SnakeCase
>>> > Extends DelimitedCase and initializes the delimiter as the underscore '_'
>>> > character. This case allows only alphanumeric characters within tokens.
>>> >
>>> > PascalCase
>>> > Defines a Case where tokens begin with an uppercase alpha character. All
>>> > subsequent token characters must be lowercase alpha or numeric characters.
>>> 

Re: [Crypto] Compile From Source and Openssl3 Support

2023-08-09 Thread Alex Remily
I've been meaning to add openssl3 support to common crypto for a while
now.  Maybe this will give me the push that I need.  I'm on my phone now
without access to the branch, but I recall a docker file that provided a
lot of build information on the main branch.  You may consider having a
look if you haven't already.


Alex

On Wed, Aug 9, 2023, 2:01 PM Daniel Thertell  wrote:

> ya that's totally fine!
> I will continue to try and figure this out.
>
> Thanks,
> Dan Thertell
>
>
> On Wed, Aug 9, 2023 at 4:53 PM Gary Gregory 
> wrote:
>
> > > The branch is a work in progress from a while ago, and it did not work
> > > completely, that much I remember. I can't take the time to look at it
> > > today; I'm looking at other issues in Commons.
> >
> > Gary
> >
> > On Wed, Aug 9, 2023, 4:27 PM Daniel Thertell wrote:
> >
> > > Hey Gary
> > >
> > > I believe I managed to get it to build, however I do have a few questions.
> > >
> > > 1. Why were the make targets for the header files commented out and
> > > pointing to the wrong locations (in the make file)?
> > > 2. After successfully running make, how do I package everything into a JAR
> > > for testing?
> > >
> > > Thanks,
> > > Dan Thertell
> > >
> > > On Wed, Aug 9, 2023 at 2:13 PM Daniel Thertell 
> > > wrote:
> > >
> > > > Hey Gary,
> > > >
> > > > lol ya I also have that note keeping issue!
> > > > By any chance do you know what the version env variable should be? I am
> > > > using 3_0_X right now.
> > > >
> > > > Thanks,
> > > > Dan Thertell
> > > >
> > > > On Wed, Aug 9, 2023 at 2:10 PM Gary Gregory 
> > > > wrote:
> > > >
> > > >> I should have kept notes!
> > > >>
> > > >> Gary
> > > >>
> > > > >> On Wed, Aug 9, 2023, 1:49 PM Daniel Thertell wrote:
> > > > >>
> > > > >> > Hey All,
> > > > >> >
> > > > >> > I am looking to compile Commons Crypto from source and I am wondering if
> > > > >> > there is any documentation for this process? I am trying to build Gary
> > > > >> > Gregory's OpenSSL3 branch but I am encountering the following error. I know
> > > > >> > this isn't the main branch but I am hoping someone will still be able to
> > > > >> > help out. I receive the following error when I run "make linux64"
> > > > >> > (I received a similar error on an M1 when I ran 'make mac64')
> > > > >> >
> > > > >> > *** No rule to make target
> > > > >> > 'target/jni-classes/org_apache_commons_crypto_random_OpenSslCryptoRandomNative.h',
> > > > >> > needed by
> > > > >> > 'target/commons-crypto-3_0_x-Linux-x86_64/OpenSslCryptoRandomNative.o'.
> > > > >> > Stop.
> > > >> >
> > > >> > Any help or ideas would be appreciated!
> > > >> >
> > > >> > Thanks,
> > > >> > Dan Thertell
> > > >> >
> > > >>
> > > >
> > >
> >
>


Re: [Crypto] Compile From Source and Openssl3 Support

2023-08-09 Thread Daniel Thertell
Okay thanks Alex, I haven't looked at the Docker file! I just called make
in the root of the project.
I will be digging into it more tomorrow!

Dan Thertell

On Wed, Aug 9, 2023, 8:21 p.m. Alex Remily  wrote:

> I've been meaning to add openssl3 support to common crypto for a while
> now.  Maybe this will give me the push that I need.  I'm on my phone now
> without access to the branch, but I recall a docker file that provided a
> lot of build information on the main branch.  You may consider having a
> look if you haven't already.
>
>
> Alex
>
> On Wed, Aug 9, 2023, 2:01 PM Daniel Thertell  wrote:
>
> > ya that's totally fine!
> > I will continue to try and figure this out.
> >
> > Thanks,
> > Dan Thertell
> >
> >
> > On Wed, Aug 9, 2023 at 4:53 PM Gary Gregory 
> > wrote:
> >
> > > > The branch is a work in progress from a while ago, and it did not work
> > > > completely, that much I remember. I can't take the time to look at it
> > > > today; I'm looking at other issues in Commons.
> > >
> > > Gary
> > >
> > > > On Wed, Aug 9, 2023, 4:27 PM Daniel Thertell wrote:
> > > >
> > > > > Hey Gary
> > > > >
> > > > > I believe I managed to get it to build, however I do have a few questions.
> > > > >
> > > > > 1. Why were the make targets for the header files commented out and
> > > > > pointing to the wrong locations (in the make file)?
> > > > > 2. After successfully running make, how do I package everything into a JAR
> > > > > for testing?
> > > >
> > > > Thanks,
> > > > Dan Thertell
> > > >
> > > > On Wed, Aug 9, 2023 at 2:13 PM Daniel Thertell 
> > > > wrote:
> > > >
> > > > > Hey Gary,
> > > > >
> > > > > lol ya I also have that note keeping issue!
> > > > > > By any chance do you know what the version env variable should be? I am
> > > > > > using 3_0_X right now.
> > > > >
> > > > > Thanks,
> > > > > Dan Thertell
> > > > >
> > > > > > On Wed, Aug 9, 2023 at 2:10 PM Gary Gregory <garydgreg...@gmail.com>
> > > > > > wrote:
> > > > >
> > > > >> I should have kept notes!
> > > > >>
> > > > >> Gary
> > > > >>
> > > > > >> On Wed, Aug 9, 2023, 1:49 PM Daniel Thertell wrote:
> > > > > >>
> > > > > >> > Hey All,
> > > > > >> >
> > > > > >> > I am looking to compile Commons Crypto from source and I am wondering if
> > > > > >> > there is any documentation for this process? I am trying to build Gary
> > > > > >> > Gregory's OpenSSL3 branch but I am encountering the following error. I know
> > > > > >> > this isn't the main branch but I am hoping someone will still be able to
> > > > > >> > help out. I receive the following error when I run "make linux64"
> > > > > >> > (I received a similar error on an M1 when I ran 'make mac64')
> > > > > >> >
> > > > > >> > *** No rule to make target
> > > > > >> > 'target/jni-classes/org_apache_commons_crypto_random_OpenSslCryptoRandomNative.h',
> > > > > >> > needed by
> > > > > >> > 'target/commons-crypto-3_0_x-Linux-x86_64/OpenSslCryptoRandomNative.o'.
> > > > > >> > Stop.
> > > > >> >
> > > > >> > Any help or ideas would be appreciated!
> > > > >> >
> > > > >> > Thanks,
> > > > >> > Dan Thertell
> > > > >> >
> > > > >>
> > > > >
> > > >
> > >
> >
>


Re: [commons-text] Additional CaseUtils type functionality that can handle snake, kebab, camel, pascal, and others

2023-08-09 Thread Elliotte Rusty Harold
On Wed, Aug 9, 2023 at 11:50 PM Daniel Watson  wrote:

> my-component-1
>
> Is a valid kebab cased string, with tokens my,component,1
>
> However this cannot be formatted in camel case or Pascal case, because they
> are delimited by alpha characters.
>

Not necessarily so. The last implementation I worked on (Python) in
fact did handle this case. This is why it's important to really lock
down the precise definition of these formats before worrying about
which exceptions to throw when. I know from experience that it is
possible to write an algorithm for snake case <--> camel case
conversions that converts any string with no exceptions because I've
done it (not in open source, unfortunately), but that might depend on
the definition of the format.

-- 
Elliotte Rusty Harold
elh...@ibiblio.org

-
To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
For additional commands, e-mail: dev-h...@commons.apache.org



Re: [commons-text] Additional CaseUtils type functionality that can handle snake, kebab, camel, pascal, and others

2023-08-09 Thread Elliotte Rusty Harold
On Wed, Aug 9, 2023 at 11:36 PM Daniel Watson  wrote:
>
> Meant to add...
>
> The reason I would favor exceptions is that the underlying implementation
> can be easily customized. If the user needs to allow non alphanumeric
> characters there is a boolean flag in the underlying abstract class
> (AbstractConfigurableCase) that will simply turn that validation off.

This is another point, but customizability is a bug, not a feature. I
don't want to guess what the method might be doing based on what flag
was set where. I want camel case to mean one thing and one thing only.
Ditto snake case, pascal case, and any other formats. Possibly there's
a reason to add additional subclasses, but the
CamelCase/SnakeCase/KebabCase classes should not emit different
strings depending on how they're configured. The public API should be
a pure function, not an object.
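
A minimal sketch of what a stateless, function-style entry point could look like. The class name CaseFunctions, the method snakeToCamel, and the conversion rules (empty tokens dropped, everything lowercased before capitalizing) are assumptions made for illustration; the thread has not settled on a definition of either format.

```java
import java.util.Locale;

// Hypothetical utility class: no instance state, no configuration flags.
final class CaseFunctions {
    private CaseFunctions() {}

    // Converts snake_case to camelCase under the assumed rules described above.
    static String snakeToCamel(String snake) {
        StringBuilder out = new StringBuilder();
        for (String token : snake.split("_")) {
            if (token.isEmpty()) {
                continue; // assumption: empty tokens are silently dropped
            }
            String lower = token.toLowerCase(Locale.ROOT);
            if (out.length() == 0) {
                out.append(lower); // first token stays fully lowercase
            } else {
                out.append(Character.toUpperCase(lower.charAt(0)));
                out.append(lower, 1, lower.length());
            }
        }
        return out.toString();
    }
}
```

Because the method has no state, the same input always produces the same output, e.g. CaseFunctions.snakeToCamel("my_component_name") returns "myComponentName".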

-- 
Elliotte Rusty Harold
elh...@ibiblio.org




Re: [commons-text] Additional CaseUtils type functionality that can handle snake, kebab, camel, pascal, and others

2023-08-09 Thread Daniel Watson
I would think it's possible to hide that "configuration" from the user such
that the implementation can only be reconfigured via extension. But I'm not
in love with the configurable base class either way. It was convenient to
have the common functionality in one place, but it's not a big deal to
handle that differently.

The tradeoff in having the Cases be pure functions is that it makes it more
difficult for a user to extend them with additional functionality. And to
me the need for extension is apparent even when just looking at the 4 basic
cases. Two of them are character delimited, and two of them are uppercase
delimited. That is already two bits of shared functionality in just the 4
most basic cases.

Back to the exception topic, I don't think the tokens "my" "component" and
"1" can be formatted in PascalCase in a way that they could be parsed back
out into 3 tokens. So the question is less about whether it's possible to
format them and more about whether the API should format output that cannot
be parsed back into the same input. I think it makes sense to enforce that
consistency, or at the very least allow the user to enable it?
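
One way to make that consistency check opt-in is a wrapper that formats, parses the result back, and fails if the tokens differ. This is only a sketch under stated assumptions: RoundTrip, formatStrict, and the kebab demo functions are hypothetical names, and tokens are compared ignoring letter case because format() is meant to accept tokens that are not yet in the target casing.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

// Hypothetical helper that enforces the format()/parse() round trip.
final class RoundTrip {
    private RoundTrip() {}

    static String formatStrict(Function<List<String>, String> format,
                               Function<String, List<String>> parse,
                               List<String> tokens) {
        String formatted = format.apply(tokens);
        List<String> parsed = parse.apply(formatted);
        if (!sameTokens(tokens, parsed)) {
            throw new IllegalArgumentException(
                "Tokens " + tokens + " format to \"" + formatted
                + "\", which parses back as " + parsed);
        }
        return formatted;
    }

    // Same number of tokens, and each pair equal ignoring letter case.
    private static boolean sameTokens(List<String> a, List<String> b) {
        if (a.size() != b.size()) {
            return false;
        }
        for (int i = 0; i < a.size(); i++) {
            if (!a.get(i).equalsIgnoreCase(b.get(i))) {
                return false;
            }
        }
        return true;
    }

    // Demo pair for a hyphen-delimited case, with no validation of its own.
    static String kebabFormat(List<String> tokens) {
        return String.join("-", tokens);
    }

    static List<String> kebabParse(String s) {
        return Arrays.asList(s.split("-", -1));
    }
}
```

With the kebab pair, the tokens my, component, 1 pass the check, while a token that itself contains a hyphen is rejected. Plugging in an uppercase-delimited pair would reject my, component, 1 in the same way, since the numeric token cannot be recovered.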




On Wed, Aug 9, 2023, 9:14 PM Elliotte Rusty Harold 
wrote:

> On Wed, Aug 9, 2023 at 11:36 PM Daniel Watson 
> wrote:
> >
> > Meant to add...
> >
> > The reason I would favor exceptions is that the underlying implementation
> > can be easily customized. If the user needs to allow non alphanumeric
> > characters there is a boolean flag in the underlying abstract class
> > (AbstractConfigurableCase) that will simply turn that validation off.
>
> This is another point, but customizability is a bug, not a feature. I
> don't want to guess what the method might be doing based on what flag
> was set where. I want camel case to mean one thing and one thing only.
> Ditto snake case, pascal case, and any other formats. Possibly there's
> a reason to add additional subclasses, but the
> CamelCase/SnakeCase/KebabCase classes should not emit different
> strings depending on how they're configured. The public API should be
> a pure function, not an object.
>
> --
> Elliotte Rusty Harold
> elh...@ibiblio.org
>
>
>