Hi all

I have to index the srt (subtitle) files that belong to videos, so that
users can find not only the video but also the time at which their search
term occurs in it. For the sake of clarity, here is an example of this kind
of file:

1
00:00:08,580 --> 00:00:12,880
Welcome back, and in this video we're
going to continue where we left off

2
00:00:12,880 --> 00:00:14,160
in the previous video, and talk a

3
00:00:14,160 --> 00:00:16,840
little bit more about the linear
programming problem.

The easiest approach would be to index as many documents as there are
counters (numbered subtitle entries) in the file, which would be three in
the example above. But then there would be billions of tiny Solr documents
in the index, and all the documents belonging to the same srt file would
have a lot of fields with the same value (the ID, the title of the video,
etc.), which means a lot of redundant data. I know I could use the new JOIN
feature to get around this, but I'd like to get some input first from such
an active community.
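
To make the one-document-per-counter approach concrete, here is a rough
Python sketch of what I mean. It only parses the srt text and builds plain
dictionaries; it does not talk to Solr at all. The field names (video_id,
title, start_ms, end_ms, text) and the "<video id>_<counter>" ID scheme are
placeholders I made up for illustration, not an existing schema.

```python
import re

# Matches an srt timing line such as "00:00:08,580 --> 00:00:12,880".
TIMING = re.compile(
    r"(\d{2}):(\d{2}):(\d{2}),(\d{3}) --> (\d{2}):(\d{2}):(\d{2}),(\d{3})"
)


def to_ms(hours, minutes, seconds, millis):
    """Convert the four parts of an srt timestamp to milliseconds."""
    return ((int(hours) * 60 + int(minutes)) * 60 + int(seconds)) * 1000 + int(millis)


def parse_srt(text):
    """Split srt text into a list of cues (counter, start, end, text)."""
    cues = []
    # Cues are separated by one or more blank lines.
    for block in re.split(r"\n\s*\n", text.strip()):
        lines = [ln.strip() for ln in block.splitlines() if ln.strip()]
        if len(lines) < 3 or not lines[0].isdigit():
            continue  # not a complete cue, skip it
        match = TIMING.match(lines[1])
        if not match:
            continue  # malformed timing line, skip it
        parts = match.groups()
        cues.append({
            "counter": int(lines[0]),
            "start_ms": to_ms(*parts[:4]),
            "end_ms": to_ms(*parts[4:]),
            "text": " ".join(lines[2:]),
        })
    return cues


def cue_documents(video_id, title, cues):
    """Build one document per cue; field names are hypothetical."""
    return [
        {
            "id": f"{video_id}_{cue['counter']}",
            "video_id": video_id,  # repeated in every document
            "title": title,        # repeated in every document
            "start_ms": cue["start_ms"],
            "end_ms": cue["end_ms"],
            "text": cue["text"],
        }
        for cue in cues
    ]
```

For the example file above this produces three documents, each carrying its
own copy of video_id and title, which is exactly the redundancy I am worried
about.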

Thanks in advance
