I don't know the true answer. However, to me it makes intuitive sense that longer-dated contracts would be more expensive because you have to pay someone to store the product for a longer period of time (think renting out a storage locker). If the shorter-dated contracts are more expensive, then to me that's backwards and hence backwardation.
With options there are two kinds of skew, horizontal (time) and vertical (strike), and both are measured in terms of IV. Contango/backwardation refer to horizontal skew: if IV rises as you move to longer-dated options, they're said to be in contango. If the short-dated options have higher IV, that's "backwardation". During periods of high volatility, both futures and options tend to go into their respective forms of backwardation (higher price for near-dated futures and higher IV for near-dated options). When markets are calm, they will be in their normal state of contango.
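To make the definition above concrete, here is a minimal sketch that labels an IV term structure by comparing the nearest and farthest expiries. The function name `classify_term_structure` and the sample IV numbers are made up for illustration; they are not market data, and a real curve can be humped rather than cleanly sloped.

```python
def classify_term_structure(iv_by_days):
    """Label an IV term structure from a dict of {days_to_expiry: implied_vol}.

    Hypothetical helper: it compares only the nearest and the farthest expiry,
    so it ignores any hump in between.
    """
    expiries = sorted(iv_by_days)
    if len(expiries) < 2:
        raise ValueError("need at least two expiries")
    near_iv = iv_by_days[expiries[0]]
    far_iv = iv_by_days[expiries[-1]]
    if far_iv > near_iv:
        return "contango"        # IV rises with time to expiry
    if far_iv < near_iv:
        return "backwardation"   # near-dated IV is higher
    return "flat"


# Illustrative numbers only (assumed, not real quotes).
calm = {30: 0.14, 60: 0.16, 90: 0.17}
stressed = {30: 0.45, 60: 0.38, 90: 0.33}

print(classify_term_structure(calm))      # contango
print(classify_term_structure(stressed))  # backwardation
```

The same comparison works for futures if you pass prices instead of IVs.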
Here's some 101 for you. I mean, you've only been here since medieval times. Puts are calls; calls are puts. All you need is stock (Beatles). Long-dated are often in backwardation. Don't confuse the vol-line with vega. Why do ppl find McMillan and/or Hull and suddenly think they invented vol?
Just FYI, most people call "horizontal skew" the "volatility term structure". I can't recall anyone calling it "skew" since the late 90s.
OK, but for the purpose of explaining "IV contango/backwardation", putting it in terms of skew seems to me to be less abstract and more intuitive; and then it can all be put under the heading of "term structure" without having to give a separate explanation of "term structure" by comparing it to vertical skew.
The two are strongly related, so a complete explanation should include both. It's an hour-long discussion, though, and probably not interesting to anyone who's not planning to structure relative value trades. For a quick chat, nothing prevents you from saying "the term structure of implied volatility is normally upward sloping and is only downward sloping when demand for gamma is higher than demand for vega". This way, the same person reading dealer research would not be confused by the standard terminology. No need to mention the skewness in strike space at all. Someone who is trading term structure (e.g. a 1m/3m calendar) is exploiting a very different risk premium effect than someone who is trading the skew.
If someone who understands what strike skew is then asks about the meaning of "contango/backwardation", I think it's visually simple and intuitive for them to see it as an "IV time skew". "Skew" is visual; "structure" is abstract. If I started using "strike structure" in place of "strike IV skew", all I would be doing is adding an abstract layer of professional-sounding jargon, impeding understanding to make myself seem smarter. I suspect the replacement of "IV time skew" with "term structure" was a similar kind of process.
5 delta would normally suggest about a 5% probability of expiring beyond the strike. The “probability of touch” is roughly twice delta, so the probability of touching the strike of a 5 delta OTM option will be about 10%.
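The arithmetic above is easy to put in a couple of lines. This is only a sketch of the rule of thumb from the post (touch probability is about twice delta); the function name `approx_touch_probability` is made up here, and the cap at 100% is an added assumption so that deep ITM deltas don't give a nonsensical result.

```python
def approx_touch_probability(delta):
    """Rule-of-thumb estimate: P(touch) is roughly 2 * |delta|.

    Hypothetical helper. Delta is given as a decimal (0.05 for a 5 delta
    option). The result is capped at 1.0, which is an assumption added here.
    """
    return min(1.0, 2.0 * abs(delta))


delta = 0.05  # a 5 delta OTM option
print(f"P(expire beyond strike) ~ {abs(delta):.0%}")                     # ~ 5%
print(f"P(touch before expiry)  ~ {approx_touch_probability(delta):.0%}")  # ~ 10%
```

Both numbers are rough approximations, not exact probabilities.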