Re: [DISCUSS] Support of time zone mappings in ParseDateFieldUpdateProcessorFactory

Kevin Risden Tue, 05 Nov 2024 15:45:44 -0800

I think this is the same as https://issues.apache.org/jira/browse/SOLR-17379
which has more details.


Kevin Risden


On Tue, Nov 5, 2024 at 5:16 PM Christos Malliaridis <[email protected]>
wrote:

> Looking into the recent Jenkins jobs, one of the failing tests is
>
> java.time.format.DateTimeParseException: Text 'Thu Nov 13 04:35:51 AKST
> 2008' could not be parsed at index 20
>
> The root cause of this failing test seems to be related to JDK 23 (and
> Windows?) and has some history from JDK 8/9 according to our tests. I
> believe this is caused by some changes in the newer JDK that no longer
> parse time zones as expected.
>
> The documentation of our ParseDateFieldUpdateProcessorFactory says:
>
> <p>A default time zone name or offset may optionally be specified for those
> dates that don't
> include an explicit zone/offset. NOTE: three-letter zone designations like
> "EST" are not
> parseable (with the single exception of "UTC"), because they are ambiguous.
> If no default time
> zone is specified, UTC will be used. See <a
> href="http://en.wikipedia.org/wiki/List_of_tz_database_time_zones"
> >Wikipedia's list of TZ
> database time zone names</a>.
>
>
> So officially, if I interpret it correctly, we do not support time zone
> abbreviations, yet we still have tests that check the JDK behavior for
> unsupported abbreviations like AKST, and we also support "z" in the
> date format pattern for "UTC".
>
> To address this "limitation", instead of removing the tests and the support
> for time zones, I would like to propose a new feature, allowing the users
> to provide time zone mappings to the
> solr.ParseDateFieldUpdateProcessorFactory processor.
>
> A configuration could look something like this:
>
>   <updateRequestProcessorChain
> name="parse-date-patterns-timeZoneCompat-config">
>     <processor class="solr.ParseDateFieldUpdateProcessorFactory">
>       <arr name="format">
>         <str>yyyy-MM-dd['T'[HH:mm[:ss[.SSS]][z</str>
>         <str>yyyy-MM-dd['T'[HH:mm[:ss[,SSS]][z</str>
>         <str>yyyy-MM-dd HH:mm[:ss[.SSS]][z</str>
>         <str>yyyy-MM-dd HH:mm[:ss[,SSS]][z</str>
>         <str>[EEE, ]dd MMM yyyy HH:mm[:ss] z</str>
>         <str>EEEE, dd-MMM-yy HH:mm:ss z</str>
>         <str>EEE MMM ppd HH:mm:ss [z ]yyyy</str>
>       </arr>
>       <arr name="zoneMappings">
>         <!-- Alaska Standard Time -->
>         <zoneMapping abbreviation="AKST" offset="-09:00"/>
>         <zoneMapping abbreviation="AKDT" offset="-08:00"/>
>         <!-- Irish Standard Time -->
>         <zoneMapping abbreviation="IST" offset="+01:00"/>
>         <!-- Japan Standard Time -->
>         <zoneMapping abbreviation="JST" offset="+09:00"/>
>       </arr>
>     </processor>
>     <processor class="solr.RunUpdateProcessorFactory" />
>   </updateRequestProcessorChain>
>
> The solution would extend the functionality with an optional parameter
> "zoneMappings" that accepts a list of key-value pairs, each mapping an
> abbreviation to its equivalent UTC offset. These values can then be used to
> replace occurrences in the datetime string before it is passed to
> DateTimeFormatter, so that otherwise unsupported time zone abbreviations
> can be parsed.
>
> This way users can be explicit about how a time zone abbreviation is
> interpreted and provide their own mappings directly in the processor
> configuration. As a current workaround, I believe users have to add
> another processor that maps the time zones before the datetime is handled
> by the default processor. If that should be the recommended approach,
> support for time zone abbreviations should be dropped completely instead
> (probably by simply removing the failing tests related to them).
>
> Since I have not actively worked on processor implementations, I am not
> sure if this proposal makes sense. What are your thoughts?
>
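
For illustration, the replacement step described in the quoted proposal
could be sketched roughly as below. This is only a sketch under stated
assumptions: the class ZoneMappingSketch and its methods are hypothetical
and not part of Solr, the mappings are passed as a plain Map rather than
read from the proposed "zoneMappings" configuration, and the pattern uses
the offset letter "xxx" in place of "z" so that the substituted offset is
parsed explicitly.

```java
import java.time.Instant;
import java.time.OffsetDateTime;
import java.time.format.DateTimeFormatter;
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not an existing Solr class.
class ZoneMappingSketch {

  // Replaces each configured abbreviation (matched as a whole word)
  // with its UTC offset, e.g. "AKST" -> "-09:00".
  static String replaceAbbreviations(String input, Map<String, String> zoneMappings) {
    String result = input;
    for (Map.Entry<String, String> mapping : zoneMappings.entrySet()) {
      Pattern abbreviation = Pattern.compile("\\b" + Pattern.quote(mapping.getKey()) + "\\b");
      result = abbreviation.matcher(result)
          .replaceAll(Matcher.quoteReplacement(mapping.getValue()));
    }
    return result;
  }

  // Applies the mappings first, then parses with an explicit offset pattern.
  // The pattern is an assumption chosen to match the failing test input.
  static Instant parse(String input, Map<String, String> zoneMappings) {
    DateTimeFormatter formatter =
        DateTimeFormatter.ofPattern("EEE MMM d HH:mm:ss xxx yyyy", Locale.ENGLISH);
    String normalized = replaceAbbreviations(input, zoneMappings);
    return OffsetDateTime.parse(normalized, formatter).toInstant();
  }

  public static void main(String[] args) {
    Map<String, String> mappings = new LinkedHashMap<>();
    mappings.put("AKST", "-09:00");
    mappings.put("JST", "+09:00");
    System.out.println(parse("Thu Nov 13 04:35:51 AKST 2008", mappings));
  }
}
```

A whole-word match keeps an abbreviation from being replaced inside a
longer token, but it would still collide with any other word in the input
that happens to equal a configured abbreviation, so a real implementation
would probably need to restrict the replacement to the zone position.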
