How to migrate a field type


John Chang
I have dates mapped as strings, which was a mistake, and I need them mapped as longs so as to sort on them successfully (as I've been advised in previous threads - thanks!).
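One common reason string-typed numbers sort badly is that strings compare character by character rather than by magnitude. The following is a minimal sketch in plain Python, not Elasticsearch itself; the sample values and the helper name `sort_both_ways` are made up for illustration, and how a real index orders a string field also depends on how that field is analyzed.

```python
def sort_both_ways(values):
    """Return (lexicographic order of the string forms, numeric order)."""
    as_strings = [str(v) for v in values]
    # Strings are compared character by character, so "10" sorts before "9".
    lexicographic = sorted(as_strings)
    # Integers are compared by magnitude.
    numeric = sorted(values)
    return lexicographic, numeric


lexicographic, numeric = sort_both_ways([9, 10, 100, 25])
print(lexicographic)  # ['10', '100', '25', '9']
print(numeric)        # [9, 10, 25, 100]
```

When every value has the same number of digits, the two orderings agree, so this effect only shows up when lengths differ.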

How can I handle this migration?  A few options occur to me:

1) Index into a completely new index.  When it is done, switch over my queries.
2) Create a new field.  The old one (the string) is called "receivedDate".  I could create "receivedDateAsLong" and then update all the documents to fill in that field.  Once done, sort on that field.
3) Do something with multi fields.  In experiments, I found that I can take a non-multi field and turn it into a multi.  However, I do wonder how I'd tell the sort *which* instance of the multi field to sort on.
4) Something else I'm not thinking of that you might suggest.

Or, maybe none of these are the right idea.  In my ideal world, I'd be left with 1 non-multi field with a long (the string field going away) and I would not need to rebuild an entirely new index.  I don't know if that will be possible.

Also, if possible, I'd like to index the receivedDate field without having to get all the other fields.  (I don't have the source anymore; I disabled it, and the original data for all the other fields is a pain for me to get at again).  Again, this may not be possible, but it is a definite preference.

Thanks.

Re: How to migrate a field type

kimchy
Administrator
Your simplest and cleanest solution going forward is to completely reindex the data using a date mapping (not a string mapping). Even if you add the new date mapping as an additional mapping, you will still need to reindex all the data (there is no option to update just one field in a doc), so the more efficient approach is to just index everything into a new index.
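A minimal sketch of the per-document conversion such a reindex might involve, assuming each document is available as a Python dict and that "receivedDate" holds a long rendered as a digit string. `NEW_MAPPING` and `convert_doc` are hypothetical names, and the mapping fragment is only an assumption about what a date mapping could look like; the exact layout depends on the Elasticsearch version.

```python
# Hypothetical mapping fragment declaring receivedDate as a date field.
# The surrounding structure (type name, index settings) varies by version.
NEW_MAPPING = {"properties": {"receivedDate": {"type": "date"}}}


def convert_doc(doc: dict) -> dict:
    """Return a copy of doc with receivedDate converted from a digit string to an int."""
    converted = dict(doc)  # shallow copy, so the caller's dict is not modified
    raw = converted.get("receivedDate")
    if isinstance(raw, str):
        # int() raises ValueError if the string is not a valid integer,
        # which surfaces bad data instead of silently indexing it.
        converted["receivedDate"] = int(raw.strip())
    return converted


old_doc = {"receivedDate": "1292638200000", "subject": "hello"}
new_doc = convert_doc(old_doc)
```

The converted document would then be indexed into the new index with whatever client is in use; that step is left out here because it depends on the client and version.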

In reply to John Chang's message of Sat, Dec 18, 2010 at 2:10 AM.



Re: How to migrate a field type

John Chang
Thanks for your help above.  I am having trouble getting back at the original documents to reindex the data (a long and unfortunate story).  So for now, I am trying to come up with a work-around until I can get back to the original data to reindex.  My dates are currently indexed as:
1) Long value, but as type string.  (foolish, I know)
2) Long cast to a float, as type float.  (another mistake)

I'm very sorry to have to ask for help on a hacky work-around based on mistakenly indexed data, but I'm not going to be able to get out of that situation quickly.

When I try to sort on the long-stored-as-string, it's unreliable.  If my requested result set size is a relatively high percentage of the matching docs (say, 35 of 100 matching docs), it works fine.  If I want a low percentage of matching docs (say, 35 of 1000 matching docs), they often don't come back sorted right (they seem to come back sorted in the order indexed, not in the order of the long-stored-as-string value).  Also, BTW, I can't get it to sort ASCENDING, but that is not so bad, as I want DESCENDING.

When I try to sort on the long cast to a float (stored as type float), if my query term matches a large number of documents (thousands), then it is again unreliable.  Sometimes it works fine (generally the first query after restarting the client).  In subsequent queries, I tend to be missing the most recent month or so, but the results from before that do come back and are sorted correctly.

Thanks,
John

Re: How to migrate a field type

John Chang
I should add: When I try to sort on the long cast to a float (stored as type float), if my query term matches a small number of documents, it works fine.

Re: How to migrate a field type

kimchy
Administrator
I am not sure I understand, then: is it a problem in ES and how it sorts (based on the actual types mapped), or is the problem the long losing resolution as a float, and the long being represented as a string?
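To illustrate the resolution question: a 32-bit float has a 24-bit significand, so a value as large as a millisecond timestamp cannot be stored exactly. The following is a minimal sketch in plain Python, assuming the float type is 32-bit single precision and the dates are epoch milliseconds (the thread states neither explicitly); `to_float32` and the sample timestamp are hypothetical.

```python
import struct


def to_float32(value: int) -> int:
    """Round-trip an integer through a 32-bit float and return the result as an int."""
    packed = struct.pack("<f", float(value))  # narrows to single precision
    return int(struct.unpack("<f", packed)[0])


stamp = 1292638200000             # an example millisecond timestamp
one_second_later = stamp + 1000

stored = to_float32(stamp)
stored_later = to_float32(one_second_later)

# The stored value differs from the original, and two timestamps one second
# apart collapse to the same float, so their relative order is lost.
print(stored != stamp)         # True
print(stored == stored_later)  # True
```

At this magnitude adjacent 32-bit floats are 131072 apart, which is roughly two minutes in milliseconds, so documents received within that window can become indistinguishable to a sort on the float field. This shows only the precision loss and does not by itself account for missing results.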

You can try using the scrolling search API to paginate through the docs and reindex them, though it gets expensive quickly as you paginate further. If you can chunk the search requests into separate ones yourself (for example, based on time, or something else), then you can run a search for each chunk and reindex the data.
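The chunking idea can be sketched roughly as follows. This is only an outline in plain Python under stated assumptions: `time_chunks` and `reindex_in_chunks` are hypothetical helpers, and `search_chunk` and `index_doc` stand in for whatever client calls actually run the search for one time window and index one document; no real Elasticsearch API is shown.

```python
from typing import Callable, Iterable, Iterator, Tuple


def time_chunks(start_ms: int, end_ms: int, step_ms: int) -> Iterator[Tuple[int, int]]:
    """Yield half-open [lo, hi) windows covering start_ms up to end_ms."""
    if step_ms <= 0:
        raise ValueError("step_ms must be positive")
    lo = start_ms
    while lo < end_ms:
        hi = min(lo + step_ms, end_ms)
        yield lo, hi
        lo = hi


def reindex_in_chunks(
    start_ms: int,
    end_ms: int,
    step_ms: int,
    search_chunk: Callable[[int, int], Iterable[dict]],  # hypothetical: search one window
    index_doc: Callable[[dict], None],                   # hypothetical: index one document
) -> int:
    """Search each window separately and reindex its documents; return the count."""
    count = 0
    for lo, hi in time_chunks(start_ms, end_ms, step_ms):
        for doc in search_chunk(lo, hi):
            index_doc(doc)
            count += 1
    return count


# Usage with in-memory stand-ins for the search and index calls.
old_docs = [{"receivedDate": t} for t in (1, 5, 9)]
new_docs = []
total = reindex_in_chunks(
    0, 10, 4,
    lambda lo, hi: [d for d in old_docs if lo <= d["receivedDate"] < hi],
    new_docs.append,
)
```

Half-open windows keep a document that falls exactly on a boundary from being reindexed twice, and a smaller step keeps each individual search shallow.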

In reply to John Chang's message of Thu, Jan 6, 2011 at 8:16 PM.