This is a browser-only client. The remote vLLM server must allow CORS. Any API key entered here is exposed to the browser and should only be used for trusted endpoints.
These are estimates for the server powering BartlebyGPT, which is currently deployed on an RTX A4000 in TierPoint's Spokane, WA data center. The server can serve up to 15 simultaneous conversations, so these figures represent the cost of serving all users at once.
Watts is the estimated total facility draw for this server — GPU, CPU, RAM, storage, fans, and the data center's power distribution and cooling overhead (PUE ~1.35). Idle reflects background system load; active reflects full GPU utilization during inference.
gCO₂/Whr is the grid carbon intensity — grams of CO₂ emitted per watt-hour of electricity consumed. Multiplied by wattage, it gives the direct carbon cost of the electricity per hour. Avista Power serves Spokane with a ~65% renewable power mix, for an estimated emissions factor of ~0.3 g/Wh. This does not include emissions from manufacturing or end-of-life disposal of the hardware, or private jets to lobby politicians.
$/hr is the hourly cost of renting this GPU from the data center. This cost accrues whenever the server is running, whether idle or active.
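A minimal sketch of how these parameters combine, under the figures above. The grid intensity (~0.3 g/Wh) and the 15-conversation capacity come from this page; the wattage and rental rate in the example are placeholders, not the server's actual values:

```python
# Illustrative calculation of hourly emissions and per-conversation cost.
# Only the grid intensity and conversation cap come from the text above;
# the example watts and $/hr are placeholders.

GRID_INTENSITY_G_PER_WH = 0.3   # Avista grid, ~65% renewable (estimate)
MAX_CONVERSATIONS = 15          # simultaneous conversations this server handles


def hourly_emissions_g(facility_watts: float) -> float:
    """Facility draw (already includes PUE ~1.35) x grid intensity.

    watts = Wh consumed per hour, so Wh/hr * g/Wh = gCO2/hr.
    """
    return facility_watts * GRID_INTENSITY_G_PER_WH


def cost_per_conversation_hr(rental_usd_per_hr: float, active: int) -> float:
    """Rental cost split across the conversations currently being served."""
    return rental_usd_per_hr / max(1, min(active, MAX_CONVERSATIONS))


# Placeholder values: a 500 W facility draw and a $0.50/hr rental rate.
print(hourly_emissions_g(500))             # 150.0 gCO2/hr
print(cost_per_conversation_hr(0.50, 15))  # ~$0.033 per conversation-hour
```

At full utilization the rental cost per conversation is simply $/hr divided by 15; at lower utilization the same fixed cost is spread across fewer users, which is why the server's idle hours still count.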
All three parameters are adjustable under Advanced.