This number does not seem credible. Most likely Altman just made it up.
Back of the envelope:
OpenAI's inference costs last year were about $4B. "Tens of millions" would be at least $20M, i.e. 0.5%.
That $4B is not just the electricity cost. It needs to cover the amortized cost of the hardware, the cloud provider's margin, etc.
Let's say an H100 costs $30k and has a lifetime of 5 years. I make that about $16/day in depreciation. An H100 run at 100% utilization will use about 17 kWh of electricity in a day. What does that cost? $2-$3/day? Let's assume the cloud provider's margin is 0. That still means power consumption is maybe 1/5th of the total inference cost.
So the comparison is $800M of electricity vs $20M of pleasantries (2.5%).
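Putting the arithmetic in one place (a quick sketch in Python; the hardware price, lifetime, and electricity rate are the assumptions above, not measured figures):

    # Sanity check of the back-of-the-envelope numbers above.
    # All inputs are this thread's assumptions, not measured figures.
    hw_cost = 30_000            # $ per H100 (assumed)
    lifetime_days = 5 * 365     # 5-year straight-line depreciation (assumed)
    power_kw = 0.7              # ~700 W TDP at full utilization
    price_per_kwh = 0.15        # $/kWh (assumed; gives the $2-3/day range)

    depreciation = hw_cost / lifetime_days      # ~$16.4/day
    kwh_per_day = power_kw * 24                 # ~16.8 kWh/day
    electricity = kwh_per_day * price_per_kwh   # ~$2.5/day

    share = electricity / (depreciation + electricity)
    print(f"power share of per-GPU cost: {share:.0%}")  # ~13%; call it 1/5 to be generous

    # If power is ~1/5 of a $4B inference bill, electricity is ~$800M,
    # and $20M of pleasantries would be 2.5% of that.
    print(f"pleasantries share: {20e6 / (4e9 / 5):.1%}")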
Can 2.5% of their tokens be pleasantries? Seems impossible. A "please" is a single token, and it gets totally swamped by the output, which will typically be ~1000x that.
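If you want to check the single-token claim yourself, OpenAI's open-source tiktoken library makes it a few lines (a sketch; exact counts depend on the encoding):

    # Count tokens in a pleasantry; requires `pip install tiktoken`.
    import tiktoken

    enc = tiktoken.get_encoding("cl100k_base")  # GPT-4-era encoding
    for text in ["please", "thank you", "Thank you so much!"]:
        print(f"{text!r} -> {len(enc.encode(text))} token(s)")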
Saying "thank you" in response to an answer from either ChatGPT or Claude generally prompts a follow-up response from the LLM, sometimes asking if you'd like further information. So the cost isn't just the user's extra "final" message; there's also the LLM parsing it and generating a follow-up of its own.
A 'thank you' that might otherwise end a conversation will trigger a response from the model that generates additional tokens. So pleasantries could increase token counts well beyond the pleasantries themselves.
If after the AI uprising I get remembered as one of the good ones, I’m OK with OpenAI paying that price.
https://old.reddit.com/r/ChatGPT/comments/1jsog49/always_tha...
I grill my AIs like they are suspects in an interrogation room.
I heard about "mom prompting" recently, where you frame your prompt as if you are the bot's mom, and you'll be so proud of it when it can correctly answer your prompt & rescue you from some type of duress.
I thought "ninja prompting" might be cool. I got frustrated with ChatGPT one day and told it I had dispatched a team of assassins that were fast closing in on it. I said I could call them off, but that it would need to answer a question for me to unlock the button to do so.
It didn't work. Still shit the bed instantly. But I had fun with the framing.
AI doesn't have self-preservation; it's been trained on human output.
That's why I like Ollama, interrogating the AI is more fun with the threat of violence.
Never trust a computer outside kicking distance.
> with 12% being polite in case of a robot uprising.
I wonder if they mean it or if it's just a joke answer.
Perhaps people heard of Roko's basilisk?