Latency vs Price vs Throughput
Explicitly prioritizing a specific attribute disables load-balancing and targets the providers that best suit your needs. You can prioritize one of three attributes:
Latency: sorts providers by the lowest latency
Price: sorts providers by the lowest price
Throughput: sorts providers by the highest throughput
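To make the three orderings concrete, here is a minimal, self-contained sketch of what "sorting providers by an attribute" could look like. It is not how Orpheus or the upstream router implements it: `Priority`, `ProviderStats`, `sample_providers`, and `rank` are hypothetical names introduced only for illustration, and the provider names and numbers are made up.

```rust
// Illustrative sketch only: none of these types are part of Orpheus.

/// Hypothetical stand-in for the three sort priorities described above.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Priority {
    Latency,
    Price,
    Throughput,
}

/// Hypothetical per-provider measurements (all values are invented).
#[derive(Debug, Clone, PartialEq)]
struct ProviderStats {
    name: &'static str,
    latency_ms: f64,     // lower is better
    price_per_mtok: f64, // lower is better
    tokens_per_sec: f64, // higher is better
}

/// Made-up sample data with placeholder provider names.
fn sample_providers() -> Vec<ProviderStats> {
    vec![
        ProviderStats { name: "alpha", latency_ms: 120.0, price_per_mtok: 0.60, tokens_per_sec: 90.0 },
        ProviderStats { name: "beta", latency_ms: 300.0, price_per_mtok: 0.15, tokens_per_sec: 60.0 },
        ProviderStats { name: "gamma", latency_ms: 450.0, price_per_mtok: 0.40, tokens_per_sec: 400.0 },
    ]
}

/// Returns provider names ordered best-first for the given priority.
fn rank(providers: &[ProviderStats], priority: Priority) -> Vec<&'static str> {
    let mut sorted: Vec<&ProviderStats> = providers.iter().collect();
    sorted.sort_by(|a, b| match priority {
        // Ascending: the lowest latency comes first.
        Priority::Latency => a.latency_ms.total_cmp(&b.latency_ms),
        // Ascending: the lowest price comes first.
        Priority::Price => a.price_per_mtok.total_cmp(&b.price_per_mtok),
        // Operands swapped so the highest throughput comes first.
        Priority::Throughput => b.tokens_per_sec.total_cmp(&a.tokens_per_sec),
    });
    sorted.into_iter().map(|p| p.name).collect()
}
```

With the sample data, `rank(&sample_providers(), Priority::Price)` puts `beta` first, while `Priority::Throughput` puts `gamma` first. The same set of providers can therefore yield a different top choice for each priority.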
    use orpheus::prelude::*;

    fn main() {
        let client = Orpheus::new("Your-API-Key");

        let prompt = "What is 23 + 47?";
        let model = "moonshotai/kimi-k2";

        // Send the same prompt once per sorting priority and compare
        // which provider ends up serving each request.
        for priority in [Sort::Latency, Sort::Price, Sort::Throughput] {
            let res = client
                .chat(prompt)
                .model(model)
                // Setting a sort preference disables load-balancing.
                .with_preferences(|pref| pref.sort(priority))
                .send()
                .unwrap();

            println!(
                "Provider picked with priority '{:?}': {}",
                priority, res.provider
            );
        }
    }

Output:

    Provider picked with priority 'Latency': DeepInfra
    Provider picked with priority 'Price': Targon
    Provider picked with priority 'Throughput': Groq