pub struct TenantScheduler {
    tenants: HashMap<String, Tenant>,
    pid_tenant_map: HashMap<u32, String>,
    total_gpus: u32,
}
Multi-tenant scheduler that enforces GPU-proportional CPU allocation.
If Job A has 2 GPUs and Job B has 6 GPUs, Job B gets 3x the CPU scheduling weight for data loading. This prevents one tenant’s data loading from starving another’s NCCL collectives.
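The arithmetic behind the example can be sketched as follows. This is a minimal illustration, assuming a tenant's weight is simply its fraction of all registered GPUs; `gpu_share` is a hypothetical helper, not part of this crate's API.

```rust
/// Fraction of all registered GPUs held by one tenant.
/// Hypothetical helper used only for illustration.
/// Returns 0.0 when no GPUs are registered, to avoid dividing by zero.
fn gpu_share(tenant_gpus: u32, total_gpus: u32) -> f32 {
    if total_gpus == 0 {
        0.0
    } else {
        tenant_gpus as f32 / total_gpus as f32
    }
}
```

With 8 GPUs in total, `gpu_share(2, 8)` is 0.25 and `gpu_share(6, 8)` is 0.75, so Job B's weight is 3x Job A's.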
Fields
tenants: HashMap<String, Tenant>
pid_tenant_map: HashMap<u32, String>
Maps pid -> tenant_id for quick lookup.
total_gpus: u32
Implementations
impl TenantScheduler
pub fn new() -> Self
pub fn register_tenant(&mut self, tenant: Tenant)
Register a new tenant.
pub fn unregister_tenant(&mut self, tenant_id: &str)
Remove a tenant.
pub fn assign_pid(&mut self, pid: u32, tenant_id: &str)
Associate a process with a tenant.
pub fn cpu_weight_for_pid(&self, pid: u32) -> f32
Get the CPU weight for a given pid. Weight is proportional to the tenant’s GPU share.
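One way the two maps could cooperate for this lookup is sketched below. It is not the crate's implementation: `SchedulerSketch`, the `gpu_count` field on `Tenant`, the simplified `register_tenant` signature, and the 0.0 fallback for unknown pids are all assumptions made for illustration.

```rust
use std::collections::HashMap;

// Hypothetical stand-in for the crate's `Tenant`; only a GPU count is assumed.
struct Tenant {
    gpu_count: u32,
}

// Mirrors the three fields of `TenantScheduler` shown above.
struct SchedulerSketch {
    tenants: HashMap<String, Tenant>,
    pid_tenant_map: HashMap<u32, String>,
    total_gpus: u32,
}

impl SchedulerSketch {
    fn new() -> Self {
        SchedulerSketch {
            tenants: HashMap::new(),
            pid_tenant_map: HashMap::new(),
            total_gpus: 0,
        }
    }

    // Simplified signature (assumption): takes an id and a GPU count directly.
    fn register_tenant(&mut self, id: &str, gpu_count: u32) {
        // When an id is re-registered, drop its old GPU count from the total first.
        if let Some(old) = self.tenants.insert(id.to_string(), Tenant { gpu_count }) {
            self.total_gpus -= old.gpu_count;
        }
        self.total_gpus += gpu_count;
    }

    fn assign_pid(&mut self, pid: u32, tenant_id: &str) {
        self.pid_tenant_map.insert(pid, tenant_id.to_string());
    }

    // pid -> tenant_id -> tenant, then the tenant's GPUs over the total.
    fn cpu_weight_for_pid(&self, pid: u32) -> f32 {
        let tenant = self
            .pid_tenant_map
            .get(&pid)
            .and_then(|id| self.tenants.get(id));
        match tenant {
            Some(t) if self.total_gpus > 0 => t.gpu_count as f32 / self.total_gpus as f32,
            // Assumption: unknown pids (or an empty cluster) get weight 0.0.
            _ => 0.0,
        }
    }
}
```

Keeping `total_gpus` up to date at registration time means each lookup costs two hash-map reads and one division, with no need to iterate over all tenants.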
pub fn effective_priority(&self, pid: u32, phase_priority: i32) -> i32
Compute effective priority for a pid, combining phase priority with tenant weight.
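The documentation does not state how the two inputs are combined. As a purely illustrative sketch, one plausible rule scales the phase priority by the tenant weight and rounds to the nearest integer; `effective_priority_sketch` is a hypothetical name, and the rule assumes that a larger value means a more urgent task.

```rust
/// Hypothetical combination rule (an assumption, not the crate's formula):
/// scale the phase priority by the tenant's weight, then round to the
/// nearest integer.
fn effective_priority_sketch(phase_priority: i32, tenant_weight: f32) -> i32 {
    (phase_priority as f32 * tenant_weight).round() as i32
}
```

Under this rule, a phase priority of 100 becomes 25 for a tenant with weight 0.25 and 75 for a tenant with weight 0.75, preserving the 3x ratio between the two jobs.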
pub fn tenant_count(&self) -> usize
pub fn get_tenant_for_pid(&self, pid: u32) -> Option<&Tenant>
Auto Trait Implementations
impl Freeze for TenantScheduler
impl RefUnwindSafe for TenantScheduler
impl Send for TenantScheduler
impl Sync for TenantScheduler
impl Unpin for TenantScheduler
impl UnwindSafe for TenantScheduler
Blanket Implementations
impl<T> BorrowMut<T> for T
where
    T: ?Sized,
fn borrow_mut(&mut self) -> &mut T
Mutably borrows from an owned value.