Hey all. I have a situation that seems pretty rough. Currently our data
contains a lot of sentences like this:

elements comprise the "stuff" of the tax. 3 Reg. § 1.901-2(a)(2). 4 Only
non-Saudis are subject to the
<https://heinonline.org/HOL/SearchVolumeSOLR?input=(((%223%20Regulation%201%22%20OR%20%223%20Regulation%201%22%20OR%20%223%20Reg.%201%22)%20AND%20NOT%20id:hein.journals/rcatorbg3.14))&div=13&handle=hein.journals/taxlr53&collection=journals>
By default the word delimiter filter (WDF) treats all punctuation as a
space. So when you search for:
3 Reg. 1, your results can include 3 Reg. § 1.901
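
To make the behaviour concrete, here is a rough Python sketch of what I understand to be happening. It is a simplified model, not Solr's actual implementation: the `tokenize` and `phrase_matches` helpers are hypothetical, and I am assuming that every run of non-alphanumeric characters acts as a separator and that phrase matching only compares consecutive tokens.

```python
import re

def tokenize(text):
    # Simplified model: any run of non-alphanumeric characters (spaces,
    # periods, "§", parentheses, hyphens) is treated as a separator.
    return re.findall(r"[a-z0-9]+", text.lower())

def phrase_matches(query, document):
    # True if the query tokens appear as a consecutive run in the document.
    q = tokenize(query)
    d = tokenize(document)
    return any(d[i:i + len(q)] == q for i in range(len(d) - len(q) + 1))

print(tokenize("3 Reg. § 1.901-2(a)(2)."))
# ['3', 'reg', '1', '901', '2', 'a', '2']
print(phrase_matches("3 Reg. 1", "3 Reg. § 1.901-2(a)(2)."))
# True
```

Because the § disappears during tokenization, the query tokens 3, reg, 1 line up with the start of the citation.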

I have experimented with the WDF and added § => ALPHA. This works, and
the character is treated as a letter. However, for some queries I still
need searches such as

Servitudes 2.10

to return results with:


Servitudes § 2.10
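
The conflict between the two requirements can be sketched in Python as follows. This is only an illustrative model under my own assumptions, not how Solr is implemented: `CHAR_TYPES`, `tokenize`, and `contains_phrase` are hypothetical names, with `CHAR_TYPES` standing in for the § => ALPHA mapping.

```python
import re

# Hypothetical stand-in for the "§ => ALPHA" type mapping.
CHAR_TYPES = {"§": "ALPHA"}

def tokenize(text, char_types=None):
    # Characters mapped to ALPHA are kept as token characters;
    # all other non-alphanumeric characters act as separators.
    extra = "".join(c for c, t in (char_types or {}).items() if t == "ALPHA")
    pattern = "[a-z0-9" + re.escape(extra) + "]+"
    return re.findall(pattern, text.lower())

def contains_phrase(query, document, char_types=None):
    # True if the query tokens appear as a consecutive run in the document.
    q = tokenize(query, char_types)
    d = tokenize(document, char_types)
    return any(d[i:i + len(q)] == q for i in range(len(d) - len(q) + 1))

doc = "Servitudes § 2.10"
print(contains_phrase("Servitudes 2.10", doc))              # True
print(contains_phrase("Servitudes 2.10", doc, CHAR_TYPES))  # False
```

With the mapping in place the document becomes servitudes, §, 2, 10, so the § token sits between the words of the query and the phrase no longer lines up.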


At the moment I cannot conceive of a way to do this aside from two
separate text fields, which would effectively double the size of my
index. It currently sits at 300 GB optimized, and 500 GB if left to its
own devices.


Thanks for any help or suggestions
