I would reduce the bulk size to 10,000 and change the number of shards to 2, since it is a single-node cluster and you are indexing 20M documents.
Each shard can hold roughly 2 billion documents, so 20M is not much. The more shards you have, the more resources ES consumes, so 2 shards is more than enough; 50 is way too many.
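Since the shard count is fixed at index creation time, you would set it when creating the index. A minimal sketch (the index name `myindex` is a placeholder; replicas are set to 0 here on the assumption that a single node has nowhere to place them anyway):

```shell
# Create the index with 2 primary shards and no replicas
# (adjust host/port and index name to your setup)
curl -XPUT 'http://localhost:9200/myindex' -H 'Content-Type: application/json' -d '
{
  "settings": {
    "number_of_shards": 2,
    "number_of_replicas": 0
  }
}'
```

Note that `number_of_shards` cannot be changed on an existing index; you have to reindex to change it.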
If I were you, without knowing much about the data or the number of cores, I would do the following:
- index the same set with 1 shard and a bulk size of 10,000, and take a measurement
- index the same set with 1 shard and a bulk size of 20,000, and take a measurement
- index the same set with 1 shard and a bulk size of 30,000, and take a measurement
then repeat with 2 shards... you'll find what is acceptable for your hardware and data set.
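If you want to script those runs, the batching part is easy to sketch. This is a minimal example, not your indexing code: the helper names `bulk_body` and `batches` are mine, but the NDJSON body format (an action line followed by a source line per document) is what the `_bulk` endpoint expects. You would POST each body to `/_bulk` and time each run:

```python
import json

def bulk_body(docs, index_name):
    """Build an NDJSON _bulk request body for one batch of documents:
    one action line ({"index": ...}) followed by one source line per doc."""
    lines = []
    for doc in docs:
        lines.append(json.dumps({"index": {"_index": index_name}}))
        lines.append(json.dumps(doc))
    return "\n".join(lines) + "\n"  # _bulk bodies must end with a newline

def batches(docs, bulk_size):
    """Yield successive slices of at most `bulk_size` documents."""
    for i in range(0, len(docs), bulk_size):
        yield docs[i:i + bulk_size]

# Example: 25 dummy docs split into batches of 10 -> batch sizes 10, 10, 5
docs = [{"id": n} for n in range(25)]
sizes = [len(b) for b in batches(docs, 10)]
print(sizes)  # [10, 10, 5]
```

Wrap the POST of each batch in a timer, sum over all batches, and you have one measurement per (shard count, bulk size) combination.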