Restrict a field from going to _source field or easily print out all stored fields?


ppearcy
Hello,
  I like the concept of the _source field. However, in some cases I
have large content fields of arbitrary length. I have no need to
return these content fields in search results, and was wondering
whether it is possible to selectively disable what goes into the
_source field. I understand that this removes the usefulness of
_source for the re-index case.

If this isn't possible, could a * or _all value be added to the
fields parameter on the search, as a convenience to show all the stored
fields? When debugging different indexes with different fields, it is
painful to have to manually list every field you want returned. I
want to be able to get a snapshot of everything I have.

As always, thanks a ton for the help.

Best Regards,
Paul
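
To make the request above concrete, here is a minimal sketch in Python of the two search bodies being compared. It only builds the JSON; it does not talk to a server. The field names (title, author, published) and the helper search_body are made up for illustration, and the "*" form is the convenience being asked for, not something this sketch assumes the server already accepts.

```python
import json


def search_body(query, fields):
    """Build a search request body that asks for specific stored fields."""
    return {"query": query, "fields": list(fields)}


# Today: every stored field has to be listed by hand.
# The field names here are hypothetical.
explicit = search_body({"match_all": {}}, ["title", "author", "published"])

# Requested convenience: one wildcard entry standing in for all stored fields.
wildcard = search_body({"match_all": {}}, ["*"])

print(json.dumps(explicit))
print(json.dumps(wildcard))
```

The explicit list has to be kept in sync with each index's mapping, which is the painful part when debugging several indexes with different fields.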

Re: Restrict a field from going to _source field or easily print out all stored fields?

ppearcy
Actually, in retrospect, my preferred approach would be to keep the
_source field as is, but be able to suppress certain fields within it
from being returned.

I could probably hack out a patch for this, if there is interest.

Thanks,
Paul


Re: Restrict a field from going to _source field or easily print out all stored fields?

kimchy
Administrator
Hi,

   Both this and the previous email are valid requests. I think the ability to return all the stored fields is a valid one, and you can open an issue for this. But the more stored fields you have, the slower the "fetch" process will be.

   Regarding the source field: when indexing, it is never actually converted to an in-memory representation. It is pull-parsed directly into the index structure and stored as bytes. Removing some fields from it requires either moving it to an in-memory representation and munging it, or smartly recreating it while it is being pull-parsed.

   Returning part of the source field has the same problem. Since it is stored as a byte array, it is just fetched and returned. To extract part of the data from it, it needs to be parsed and munged for each fetch. Not ideal either.

-shay.banon
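
The trade-off described above can be sketched in a few lines of Python. This is only an illustration of the idea, not how Elasticsearch implements it: the document, the field names, and the helpers fetch_whole and fetch_without are all hypothetical, and a plain JSON round trip stands in for "parse and munge".

```python
import json

# Stored form: the source is kept as raw bytes, exactly as it was sent.
# The document and its field names are made up for this example.
stored_source = json.dumps(
    {"title": "A post", "author": "paul", "content": "x" * 10000}
).encode("utf-8")


def fetch_whole(source_bytes: bytes) -> bytes:
    """Cheap path: hand back the stored bytes untouched, no parsing."""
    return source_bytes


def fetch_without(source_bytes: bytes, excluded: set) -> bytes:
    """Costly path: parse, drop top-level keys, re-serialize on every fetch."""
    doc = json.loads(source_bytes)
    trimmed = {key: value for key, value in doc.items() if key not in excluded}
    return json.dumps(trimmed).encode("utf-8")


whole = fetch_whole(stored_source)
small = fetch_without(stored_source, {"content"})
print(len(whole), len(small))
```

The second function does a full parse and re-serialize per hit, which is the per-fetch cost being warned about. The same work would be needed at index time to keep a field out of the stored source in the first place.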


Re: Restrict a field from going to _source field or easily print out all stored fields?

ppearcy
Sweet, thanks.

http://github.com/elasticsearch/elasticsearch/issues/issue/296


Re: Restrict a field from going to _source field or easily print out all stored fields?

kimchy
Administrator
Pushed to master.
