zhanglong created AURORA-1109:
---------------------------------

             Summary: Add mesos role feature
                 Key: AURORA-1109
                 URL: https://issues.apache.org/jira/browse/AURORA-1109
             Project: Aurora
          Issue Type: Story
          Components: Scheduler
            Reporter: zhanglong


Problems

We are from the eBay platform team. Previously, we used Marathon to launch 
Jenkins master instances on dedicated VMs and to receive resource offers from 
those same dedicated VMs. For details, please refer to
http://www.ebaytechblog.com/2014/04/04/delivering-ebays-ci-solution-with-apache-mesos-part-i/#.VNQUuC6_SPU

We have found Aurora to be more stable and powerful, so we are moving from 
Marathon to Aurora. During the move, we found that Aurora currently has no 
support for Mesos roles. We need Mesos roles to solve the problem described in 
the section "Frameworks stopped receiving offers after a while" of the URL above.

Here is a snippet of the problem description:

The problem we noticed occurred after we used Marathon to create the initial 
set of CI masters. As those CI masters started registering themselves as 
frameworks, Marathon stopped receiving any offers from Mesos; essentially, no 
new CI masters could be launched. Let’s start with Marathon. In the DRF 
model, it was 
unfair to treat Marathon in the same bucket/role alongside hundreds of 
connected Jenkins frameworks. After launching all these Jenkins frameworks, 
Marathon had a large resource share and Mesos would aggressively offer 
resources to frameworks that were using little or no resources. Marathon was 
placed last in priority and got starved out.

We decided to define a dedicated Mesos role for Marathon and to have all of the 
Mesos slaves that were reserved for Jenkins master instances support that Mesos 
role. Jenkins frameworks were left with the default role “*”. This solved the 
problem – Mesos offered resources per role and hence Marathon never got starved 
out. A framework with a special role will get resource offers from both slaves 
supporting that special role and also from the default role “*”. However, 
since we were using placement constraints, Marathon accepted resource offers 
only from slaves that supported both the role and the placement constraints.
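
To make the starvation and the role-based fix concrete, here is a deliberately 
simplified toy model in Python. It is not the Mesos allocator: it assumes a 
single share number per framework, and the role name "ci-masters" and all 
share values are made up for illustration.

```python
DEFAULT_ROLE = "*"  # the Mesos default role


def offer_order(resource_role, frameworks):
    """Return the frameworks eligible for a resource, lowest share first.

    frameworks maps name -> (role, share). Toy rule: a resource reserved for
    a role is offered only to frameworks registered with that role, while a
    default-role resource can be offered to every framework.
    """
    eligible = [
        (share, name)
        for name, (role, share) in frameworks.items()
        if resource_role == DEFAULT_ROLE or role == resource_role
    ]
    return [name for _, name in sorted(eligible)]


# Everything in the default role: Marathon's large share puts it last.
single_role = {
    "marathon": ("*", 0.60),
    "jenkins-1": ("*", 0.00),
    "jenkins-2": ("*", 0.05),
}
print(offer_order("*", single_role))
# ['jenkins-1', 'jenkins-2', 'marathon']

# Marathon has a dedicated (hypothetical) role and slaves reserve resources for it.
with_role = {
    "marathon": ("ci-masters", 0.60),
    "jenkins-1": ("*", 0.00),
    "jenkins-2": ("*", 0.05),
}
print(offer_order("ci-masters", with_role))
# ['marathon']
```

In this model Marathon no longer competes with the Jenkins frameworks for the 
reserved resources, which matches the behaviour described in the quoted text.
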

Solution

So we added a role feature to the source code to solve the problem in the same 
way: when accepting a resource offer, Aurora sends the needed resources back to 
Mesos with the Mesos role given in the resource offer.
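
The idea can be sketched roughly as follows. This is an illustration only, not 
the actual Aurora change: the dictionary layout is a stand-in loosely modelled 
on a Mesos resource (name, role, scalar value), and the function name, the 
role "ci-masters" and the numbers are hypothetical.

```python
def resources_for_task(offer, needs):
    """Pick the resources a task needs out of an offer, keeping the role.

    offer: list of {"name": str, "role": str, "value": float}
    needs: {resource name: amount required}
    Each returned resource carries the role of the offered resource it was
    taken from, so the reply to Mesos is tagged with the same role.
    """
    remaining = dict(needs)
    chosen = []
    for res in offer:
        want = remaining.get(res["name"], 0.0)
        if want <= 0:
            continue  # nothing (more) needed of this resource
        take = min(want, res["value"])
        chosen.append({"name": res["name"], "role": res["role"], "value": take})
        remaining[res["name"]] = want - take
    if any(left > 1e-9 for left in remaining.values()):
        raise ValueError("offer does not cover the task's needs")
    return chosen


# Hypothetical offer from a slave that reserves resources for "ci-masters".
offer = [
    {"name": "cpus", "role": "ci-masters", "value": 4.0},
    {"name": "mem", "role": "ci-masters", "value": 8192.0},
]
needs = {"cpus": 1.0, "mem": 1024.0}

for res in resources_for_task(offer, needs):
    print(res["name"], res["role"], res["value"])
```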

How to configure the Mesos role:
1. Add the command-line option --mesos_role=${Mesos role name} when starting 
   the Aurora scheduler.

We changed the test cases to match the code change. Every changed test case 
passes.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
