I know it's dangerous to ask for general answers about performance, since the answer is usually "it depends". But I am going to try anyway :) My question: as a general rule of thumb, is it better to store a list of values in an array field, so that the query only has to issue a single matching term? Or to store a single value per document and pass an array of generated terms in the query?
My example use case is this. I am trying to find contacts by name and email. Emails usually fall into several common patterns (first.last@domain, first_last@domain, firstinitial_last@domain, etc.), so I want to be able to search against all of those possible combinations when trying to find a contact in our index. The queries are all term filters, with no wildcards or the like. The fields are all not_analyzed, so what I am looking for is basically an exact term match. So I can either store the extra possible combinations in the document and have the query pass in only one term (since the stored field is an array), or I can pass the multiple combinations as a terms array in the query and search against the single email we have stored in the index.
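To make the two options concrete, here is a rough sketch in Python that builds the document and query bodies as plain dicts. The field names (`email`, `email_variants`), the helper names, and the list of patterns are made up for illustration, and I am assuming a version of ES that accepts `bool` with a `filter` clause; on older versions the same `term`/`terms` filters would sit inside a `filtered` query instead.

```python
def email_variants(first, last, domain):
    """Generate common address patterns (illustrative list, not exhaustive)."""
    first, last = first.lower(), last.lower()
    return [
        f"{first}.{last}@{domain}",     # first.last@domain
        f"{first}_{last}@{domain}",     # first_last@domain
        f"{first[0]}_{last}@{domain}",  # firstinitial_last@domain
    ]


def doc_with_variants(first, last, domain):
    """Option A, index side: store every variant in one array field."""
    return {
        "name": f"{first} {last}",
        "email_variants": email_variants(first, last, domain),
    }


def query_single_term(email):
    """Option A, query side: one exact term matched against the array field."""
    return {"query": {"bool": {"filter": [{"term": {"email_variants": email}}]}}}


def query_many_terms(first, last, domain):
    """Option B, query side: many candidate terms against the single stored email."""
    candidates = email_variants(first, last, domain)
    return {"query": {"bool": {"filter": [{"terms": {"email": candidates}}]}}}
```

Option A moves the cost to index time (larger documents, more terms in the index), while option B keeps the index small and moves the cost to query time (more terms to look up per query).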
I know there is never a perfect answer, but even a general rule-of-thumb response from someone with deep internal knowledge of Lucene/ES would be appreciated.