"Incompatible encoding" when using Logstash to ship JSON files to Elasticsearch

Vagif Abilov
Hello,

We have been successfully using Logstash to parse our JSON log data and import it into our Elasticsearch database, but recently we have had failures on some machines. Here's the error Logstash displays:

Registering file input {:path=>["D:/Octopus/Applications/prod-ndoa/Bridge.Web/logs/BridgeSoap.*.txt"], :level=>:info}
Pipeline started {:level=>:info}
A plugin had an unrecoverable error. Will restart this plugin.
  Plugin: <LogStash::Inputs::File add_field=>{"_environment"=>"prod-ndoa", "_application"=>"bridge_rest"}, path=>["D:/Octopus/Applications/prod-ndoa/Bridge.Rest.Host/logs/BridgeRest.*.txt"], sincedb_path=>"D:/Octopus/Applications/prod-ndoa/Bridge.Rest.Host/logs/sincedb", tags=>["bridge_rest"], start_position=>"end">
  Error: incompatible encodings: Windows-1252 and UTF-8 {:level=>:error}

The input file is a set of JSON documents in UTF-8 encoding with a BOM. If I edit the file and remove the BOM characters, the import works fine.

And here's the input file configuration:

input {
  file {
    path => "D:/Octopus/Applications/prod-ndoa/Bridge.Web/logs/BridgeSoap.*.txt"
    sincedb_path => "D:/Octopus/Applications/prod-ndoa/Bridge.Web/logs/sincedb"
    codec => json
    start_position => "end"
  }
}

If I remove the "json" codec, it doesn't fail, but the output is of course wrong because it treats the JSON documents as plain text.

The strangest thing is that it works properly on other machines (same Logstash version, 1.4.2).

Does anyone have an idea why this might happen?

Thanks in advance

Vagif Abilov

Re: "Incompatible encoding" when using Logstash to ship JSON files to Elasticsearch

Aaron Mildenstein
Thank you for bringing this to our attention. Can you please create an issue at https://github.com/logstash-plugins/logstash-codec-json ?

Thanks!

Re: "Incompatible encoding" when using Logstash to ship JSON files to Elasticsearch

Vagif Abilov
Thank you Aaron, done; I've created the issue. But I'd still like to find out if there's a workaround for this problem. What's really strange is that the same Logstash installation works with similar JSON files on other machines.

Vagif

Re: "Incompatible encoding" when using Logstash to ship JSON files to Elasticsearch

InquiringMind
We use the HTTP protocol from Logstash to send to Elasticsearch, and therefore we have never had this issue.

There is a version of ES bundled with Logstash, and if it doesn't match the version of ES you are using to store the logs, then you may see problems if you don't use the HTTP protocol.
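
For reference, the relevant output setting looks something like this (just a sketch against the Logstash 1.4 elasticsearch output; the host name is a placeholder):

output {
  elasticsearch {
    host => "es.example.com"   # placeholder; point at your own Elasticsearch node
    protocol => "http"         # HTTP API instead of the embedded node client
  }
}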

Brian

Re: "Incompatible encoding" when using Logstash to ship JSON files to Elasticsearch

Aaron Mildenstein
Actually this has nothing to do with the Elasticsearch output plugin being http vs. node.

Jordan has already confirmed the issue with BOM: https://github.com/logstash-plugins/logstash-codec-json/issues/1#issuecomment-66532688
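
Until the codec is fixed, one possible workaround (an untested sketch, using the plain codec plus the json filter instead of the json codec) is to move the JSON parsing out of the input. The BOM may still make the first line fail to parse, but the input shouldn't crash:

input {
  file {
    path => "D:/Octopus/Applications/prod-ndoa/Bridge.Web/logs/BridgeSoap.*.txt"
    sincedb_path => "D:/Octopus/Applications/prod-ndoa/Bridge.Web/logs/sincedb"
    codec => plain { charset => "UTF-8" }   # read raw lines; no JSON decoding in the input
    start_position => "end"
  }
}

filter {
  json {
    source => "message"   # parse each raw line as JSON in the filter stage instead
  }
}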

Re: "Incompatible encoding" when using Logstash to ship JSON files to Elasticsearch

Vagif Abilov

Wow, that was quick! Now I need to find out what piece of our code puts BOMs into a UTF-8 file.
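
In the meantime, a quick scan along these lines (just a sketch; the glob pattern matches our directory layout) should at least tell me which log files actually carry a BOM:

import glob

# The UTF-8 BOM is the three-byte sequence EF BB BF at the start of a file.
BOM = b"\xef\xbb\xbf"

for path in glob.glob(r"D:/Octopus/Applications/prod-ndoa/*/logs/*.txt"):
    with open(path, "rb") as f:
        if f.read(3) == BOM:
            print("BOM found:", path)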

Thanks a million!

Vagif
