Hey,
I have to work with large files (~100GB), and to make my life easier I write out the file offsets of the newlines to a separate file while generating the corpus. Is there a way to use this list to build the index? This is for a CLI app, and rebuilding the index on every run is very painful.
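
To make the setup concrete, here is a rough sketch of the kind of sidecar file I mean. It is not my actual code. It assumes the offsets are stored as one little-endian uint64 per line start, and the names `write_corpus` and `read_line` are hypothetical, purely for illustration.

```python
import struct

# Assumed sidecar format: one little-endian uint64 per line start.
OFFSET = struct.Struct("<Q")


def write_corpus(lines, corpus_path, offsets_path):
    """Write each line to the corpus and record the byte offset where it starts."""
    with open(corpus_path, "wb") as corpus, open(offsets_path, "wb") as offsets:
        for line in lines:
            offsets.write(OFFSET.pack(corpus.tell()))
            corpus.write(line.encode("utf-8") + b"\n")


def read_line(corpus_path, offsets_path, n):
    """Return line n (0-based) by seeking, without scanning the corpus."""
    if n < 0:
        raise IndexError(n)
    with open(offsets_path, "rb") as offsets:
        # Fixed-width records, so entry n lives at byte n * 8.
        offsets.seek(n * OFFSET.size)
        raw = offsets.read(OFFSET.size)
        if len(raw) < OFFSET.size:
            raise IndexError(n)
        (start,) = OFFSET.unpack(raw)
    with open(corpus_path, "rb") as corpus:
        corpus.seek(start)
        return corpus.readline().rstrip(b"\n").decode("utf-8")
```

With something like this, looking up a line only needs two seeks, so ideally the index could be loaded from the offsets file instead of being rebuilt by scanning the whole corpus.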