Marginalia Search now properly supports phrase matching. This not only permits a more robust implementation of quoted search queries, but also helps promote results where the search terms occur in the document exactly in the same order as they do in the query. This is a write-up about implementing this change. This is going to be a relatively long post, as it represents about 4 months of work. I’m also happy and grateful to announce that the nlnet people reached out after the run of the gra...| www.marginalia.nu
The project has been haunted by a mysterious bug since sometime February. It relates to the code that constructs the index, particularly the code that merges partial indices. In short the search engine constucts the reverse index through successive merging of smaller indices, which reduces the overall memory requirement. You can conceptualize the revese index itself as two files, one with offset pointers into another file, which has sorted numbers. This code runs after each partition finishes...| www.marginalia.nu
Been working on improving Marginalia Search query parsing and understanding. This is going to be a pretty long update, as it’s a few months’ work. Apart from cleaning up the somewhat messy query parsing code, a problem I’m trying to address is that the search engine is currently only good at dealing with fairly focused queries, they don’t need to be short, but if you try to qualify a search that is too broad by adding more terms, it often doesn’t produce anything useful.| www.marginalia.nu