[MAPREDUCE-516] Fix the 'cluster drain' problem in th. A much better model is for the scheduler to pick specific TaskTrackers and reserve slots on them while accounting for the same against the HighRAMJob and its queue.
This would mean that once there are reserved slot(s) for each task of the HighRAMJob, the other slots in the cluster can be handed out to other jobs/queues in the cluster. Once the accounting for reserved slots is fixed, it would automatically ensure that a HighRAMJob can only reserve slots up to the quota of the queue it belongs to. Hence the next enhancement is to pick specific slots and hold them rather than hold slots on every TaskTracker.
Picking slots for High RAM Jobs
The key to better support for HighRAMJobs is to reserve slots on specific TaskTrackers.
Locality of input for the specific map-task of the job
Minimize expected delay time until the slot is freed on a specific !
For the first cut, I'd propose we consider only locality and not expected time.
Accounting for Reserved Slots
Metering
Notes on Implementation and Challenges.
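The proposal above (reserve slots on one specific TaskTracker, chosen by input locality only, and charge the reservation to the job's queue) can be sketched roughly as follows. This is an illustrative Python model, not Hadoop scheduler code: the `TaskTracker` and `Queue` classes, their fields, and `reserve_for_task` are all hypothetical names invented for the example.

```python
from dataclasses import dataclass, field


@dataclass
class TaskTracker:
    # Hypothetical model of a TaskTracker; not the Hadoop class.
    name: str
    total_slots: int
    reserved_slots: int = 0
    local_blocks: set = field(default_factory=set)  # input blocks stored on this node

    def unreserved_slots(self) -> int:
        return self.total_slots - self.reserved_slots


@dataclass
class Queue:
    # Hypothetical model of a scheduler queue with a slot quota.
    name: str
    slot_quota: int
    reserved_slots: int = 0  # reservations are charged against the quota


def reserve_for_task(trackers, queue, input_block, slots_per_task):
    """Reserve slots on one tracker for one high-RAM task, or return None."""
    # Refuse any reservation that would push the queue past its quota.
    if queue.reserved_slots + slots_per_task > queue.slot_quota:
        return None
    candidates = [t for t in trackers if t.unreserved_slots() >= slots_per_task]
    if not candidates:
        return None
    # First cut: locality only. Prefer a tracker that holds the input block.
    # sorted() is stable, so ties keep the original tracker order.
    chosen = sorted(candidates, key=lambda t: input_block not in t.local_blocks)[0]
    chosen.reserved_slots += slots_per_task
    queue.reserved_slots += slots_per_task
    return chosen
```

Because the reservation is held on a single tracker and counted against the queue, every other slot in the modelled cluster stays available to other jobs, which is the point of the change.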
Configuration Parameters: What can you just ignore? » Cloudera H. Configuring a Hadoop cluster is something akin to voodoo.
There are a large number of variables in hadoop-default.xml that you can override in hadoop-site.xml. Some specify file paths on your system, but others adjust levers and knobs deep inside Hadoop’s guts. Unfortunately, there’s little or no documentation on how to set them well. Is there a single optimal configuration? Are there some settings that can just be “set to 11?” At Cloudera, we’re working hard to make Hadoop easier to use and to make configuration less painful.
The rest of this post discusses why it’s a bad idea to just set all the limits as high as they’ll go, and gives you some pointers to get started on finding a happy medium.
Why can’t you just set all the limits to 1,000,000?
Increasing most settings has a direct impact on memory consumption. That having been said, here’s a list of some things that can be cranked up higher than the defaults by a fair margin:
File descriptor limits
fs.epoll.max_user_instances = 4096.
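Before raising either limit, it can help to see what a machine currently allows. The sketch below is one way to check from Python on Linux, and is an illustration rather than anything from the post: the helper names are hypothetical, the 4096 target is simply the value quoted above, and the `fs.epoll.max_user_instances` key may not exist on every kernel, so the reader returns `None` when it is missing.

```python
import os
import resource


def nofile_limits():
    """Return (soft, hard) open-file limits for the current process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def read_sysctl(name):
    """Read a sysctl value from /proc/sys, or return None if it is absent."""
    # Sysctl names map to paths: fs.epoll.x -> /proc/sys/fs/epoll/x
    path = os.path.join("/proc/sys", *name.split("."))
    try:
        with open(path) as f:
            return f.read().strip()
    except OSError:
        return None


def below_target(current, target):
    """True if a limit is finite and lower than the target."""
    return current != resource.RLIM_INFINITY and current < target


if __name__ == "__main__":
    soft, hard = nofile_limits()
    print("open files: soft =", soft, "hard =", hard)
    print("soft limit below 4096:", below_target(soft, 4096))
    print("fs.epoll.max_user_instances =",
          read_sysctl("fs.epoll.max_user_instances"))
```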
Using Apache with ZServer (NOT Zope.cgi) Created by .
Last modified on 2003/08/05. New: a complete solution using SiteAccess! The existing How-To's for combining Zope with the popular Apache server mostly require you to use the CGI interface to Zope, instead of the much faster, multithreaded ZServer. This incurs a big penalty in startup overhead for each run of the CGI, and it forces single threading of the Zope database. On the other hand, running ZServer alone isn't always practical! Fortunately, it's easy to embed ZServer behind Apache, using the ProxyPass and ProxyPassReverse configuration directives from Apache's mod_proxy. What does ProxyPass do? Apache's ProxyPass directive implements what is known as an "inverse proxy," where one server sits in front of another and "hides" the back server's content by accepting requests, massaging the URL, passing it to the back server, waiting for a response, and handing it back to the client.
There are actually two directives: ProxyPass /visiblepath ProxyPassReverse /visiblepath -- anser.
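To make the two roles concrete, here is a small Python model of the URL handling described above: one function maps an incoming path to the back server (the ProxyPass role), and the other rewrites a redirect from the back server so that it points at the visible path again (the ProxyPassReverse role). It is a simplified sketch, not Apache's implementation. The function names are hypothetical, and the back-server address `http://localhost:8080/` used in the usage lines is an assumed placeholder, since the original directive arguments are not shown.

```python
def to_backend(path, visible_prefix, backend_base):
    """Map an incoming request path to a back-server URL (ProxyPass role)."""
    # Simplified plain prefix match; a non-matching path is not proxied.
    if not path.startswith(visible_prefix):
        return None
    rest = path[len(visible_prefix):].lstrip("/")
    return backend_base.rstrip("/") + "/" + rest


def to_visible(location, visible_prefix, backend_base):
    """Rewrite a back-server redirect to the public path (ProxyPassReverse role)."""
    base = backend_base.rstrip("/")
    if not location.startswith(base):
        return location  # redirect points elsewhere; leave it untouched
    rest = location[len(base):].lstrip("/")
    return visible_prefix.rstrip("/") + "/" + rest


if __name__ == "__main__":
    backend = "http://localhost:8080/"  # assumed ZServer address, for illustration
    print(to_backend("/visiblepath/folder/doc", "/visiblepath", backend))
    print(to_visible("http://localhost:8080/folder/", "/visiblepath", backend))
```

Without the second step, a redirect issued by the back server would expose its own address to the client, which is why the two directives are used as a pair.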