Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add datanode resource limits #69

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

jniebuhr
Copy link

It seems like the limits were removed so the data nodes won't be limited too much so I've added a new resources section for the data nodes.

@stevesloka
Copy link
Member

stevesloka commented Jul 13, 2017

As you can see I already had this code in there, just commented out. I found with limits on memory when you set them and the pod exceeds, then it gets killed. I think the underlying container needs some work to limit the java heap to not exceed a limit so that the pod doesn't.

In my testing, the data nodes would exceed the memory limit pretty regularly and cause the cluster to be restarting always.

Here's the docs:

If a container exceeds its RAM limit, it is terminated. If a container exceeds its CPU limit, it becomes a candidate for having its CPU use throttled.

Do you have any thoughts? Does your testing show the same?

@jniebuhr
Copy link
Author

In my example, there was an environment variable ES_JAVA_OPTS set on the stateful set. That contains -Xmx1024M -Xms1024M for me. So I've set my container limits accordingly. -Xmx1024 should tell the JVM to only have 1Gi of Heap, there will be some other memory depending on how many cores you have so it should be around 1.5Gi max.

@jniebuhr
Copy link
Author

oh yeah, it's hardcoded for the data nodes

@jniebuhr
Copy link
Author

we should add something you can set that too

@stevesloka
Copy link
Member

Would you mind coding it so that the limits are optional? Maybe if you don't supply value for the limit then it doesn't get set?

@jniebuhr
Copy link
Author

I'll change it

@jrnt30
Copy link

jrnt30 commented Nov 3, 2017

@stevesloka This actually brings up an initial observation I had while looking at the project.

What are your thoughts on making:

  • distinct node type's request/limits customizable but defaulted to something sane
  • setting the request/limit be the same so you get the "Guaranteed" QOS for K8
  • JVM settings dynamically tuned off of those defaulted as a % of request/limit

@jrnt30
Copy link

jrnt30 commented Nov 8, 2017

On additional thing to note here, when you add in a "Limit" to a container, this also includes the filesystem cache and the CGroup will cap it. The implication of this is you will probably want a lower ratio of JVM Heap / Limit than you would expect (ex. maybe 50% of limit goes to the Limit) to provide ES ample headroom for OffHeap storage via the Filesystem Cache.

We ran into a similar issue with our custom grown ES deployment recently and figured I'd throw that out here.

@chancez
Copy link

chancez commented Nov 22, 2017

It's pretty easy to write a entrypoint.sh which calculates the java heap size based on the pod's memory request/limits. I've found that making the JVM heapsize 50% of the memory limit/request tends to work fairly well. The pod's entrypoint can get this information as environment variables via the downward API.

@jrnt30
Copy link

jrnt30 commented Nov 22, 2017

Thanks for the feedback, however not setting a limit can result in a negative impact on other pods running in the cluster. We found that without limiting this, there can be thrashing among the multiple pods on the same node and leads to non-deterministic performance .

@chancez
Copy link

chancez commented Nov 22, 2017

I've opened pires/docker-elasticsearch-kubernetes#51 which should help with some of this. Once merged I'll open a PR to this repo that passes everything necessary for the containers to autodetect their memory limit/requests and set their JVM heap size appropriately.

This PR is still needed I think however so that you can control memory limits on a per-component basis.

@chancez
Copy link

chancez commented Nov 22, 2017

I opened #120 which is part of the auto-setting the JVM heap size. Still requires the image to have the entrypoint script that I've opened a PR for however.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants