On 9/23/2015 11:28 AM, Erick Erickson wrote:
> This is totally weird.
>
> Don't only re-index your old docs, find the data directory and
> rm -rf data (with Solr stopped) and re-index.

I pretty much do that. The thing is: I don't have a data directory anywhere! Most of my stuff is in /localapps/dev/EventLog/solr/, but I *do* have a /localapps/dev/EventLog/index/ directory where the main index resides. I'd like to move that into /localapps/dev/EventLog/solr/ so that I can keep all Solr-related files under one parent dir, but I can't find where the configuration for that is...

Perhaps I should also share what start command I'm using (in case it is wrong!):

/localapps/dev/solr-5.3.0/bin/solr start -s /localapps/dev/EventLog

> re: the analysis page Alessandro mentioned.
> Go to the Solr admin UI (http://localhost:8983/solr). You'll
> see a drop-down on the left that lets you select a core,
> select the appropriate one.
>
> Now you'll see a bunch of new choices. The "analysis" section
> is what Alessandro is referencing. That shows you _exactly_ what
> effects your analysis chain has at index and query time.
>
> On the same page, you'll find "schema browser". Take a look at
> your logtext field and hit the "load term info" button. You should
> see a bunch of single-word tokens listed. If you see really long ones,
> then your index is hosed and you should start by blowing away
> the data directory....

I wish I could show a screen capture! But according to your symptoms, my index is hosed: I see very few single-word tokens and lots of really long ones. I have no data directory to blow away, though. I've blown away /localapps/dev/EventLog/index/ before, but that had no effect on the problem.

Am I indexing improperly perhaps? I'm using /dataimport. Here is my data-config.xml, which hasn't been giving me any obvious trouble. Import seems successful. And I can get correct search results so long as I wrap my search text in asterisks...

<?xml version="1.0"?>
<dataConfig>
    <dataSource user="awips" url="jdbc:postgresql://dx1f/OHRFC" driver="org.postgresql.Driver"/>
    <document>
        <entity deltaQuery="SELECT posttime AS id FROM eventlogtext WHERE lastmodtime > '${dataimporter.last_index_time}';"
                query="SELECT posttime AS id, username, logtext, category FROM eventlogtext;"
                name="eventlogtext">
            <entity query="SELECT catname FROM categorytypes WHERE catid='${eventlogtext.category}';"
                    name="categorytypes">
            </entity>
        </entity>
    </document>
</dataConfig>

> Because this symptom is totally explained by searching on a "string"
> rather than a "text" type. But your definition is clearly a tokenized text
> type so I'm mystified.
>
> The ELall field is a red herring. The debug output shows you're searching
> on the logtext field, this line is the relevant one:
> "parsedquery_toString":"logtext:deeper",

Should I just get rid of "ELall"? I only created it with the intent to be able to search on "fenbers" and get hits if "fenbers" occurred in either place, the logtext field or the username field.
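
For context, a catch-all field like "ELall" is typically populated with copyField directives in schema.xml. The fragment below is only a sketch of that general pattern: the "text_general" type name and the indexed/stored/multiValued settings are assumptions for illustration, not taken from the actual schema.

```xml
<!-- Sketch only: the field type and attribute values are assumed, not confirmed -->
<!-- multiValued is needed because two source fields are copied into one destination -->
<field name="ELall" type="text_general" indexed="true" stored="false" multiValued="true"/>

<!-- Copy both searchable fields into the catch-all at index time -->
<copyField source="logtext" dest="ELall"/>
<copyField source="username" dest="ELall"/>
```

With a setup along these lines, a query such as ELall:fenbers would match documents where "fenbers" appears in either logtext or username, provided the destination field uses a tokenized text type.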

thanks,
Mark
