Hi fellows, suppose I have a collection of irregularly spaced time series (tick data with time_obs(i+1) - time_obs(i) ranging from 2 ms to 20 ms). I want to calculate the 1-minute standard deviation of returns. The question is: should the number of observations be the same in every 1-minute sample? For example, the first minute contains 50 observations and the second minute has 100 observations. After calculating the standard deviation for both, is it legitimate to claim that price volatility in the first minute is higher than in the second because std1 > std2? Thanks in advance.
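To make the question concrete, here is a minimal sketch of the calculation being asked about, using only the Python standard library. The function name per_minute_volatility and the (timestamp_seconds, price) layout are assumptions for illustration, not anything from the original data. It reports the number of returns in each minute next to the standard deviation, plus the square root of the sum of squared returns as one common alternative. The standard deviation of tick returns measures dispersion per tick, so two minutes with the same per-tick value but different tick counts have not necessarily moved the same amount over the whole minute.

```python
import math
from collections import defaultdict
from statistics import stdev


def per_minute_volatility(ticks):
    """ticks: iterable of (timestamp_seconds, price), sorted by time.

    Returns {minute_index: (n_returns, std_of_returns, realized_vol)}.
    """
    buckets = defaultdict(list)
    prev_price = None
    for t, price in ticks:
        if prev_price is not None:
            # Assign each tick-to-tick log return to the minute in which it ends.
            buckets[int(t // 60)].append(math.log(price / prev_price))
        prev_price = price

    result = {}
    for minute, rets in sorted(buckets.items()):
        n = len(rets)
        # Sample standard deviation of per-tick returns (needs at least 2 returns).
        sd = stdev(rets) if n > 1 else float("nan")
        # Square root of the sum of squared returns; grows with the tick count.
        rv = math.sqrt(sum(r * r for r in rets))
        result[minute] = (n, sd, rv)
    return result
```

Comparing std1 and std2 directly compares per-tick dispersion. Whether that is the quantity of interest, or whether the per-minute total is, depends on what "volatility in the first minute" is meant to say.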

Use fill data: repeat the exact same last bar until new data replaces it. The purpose is to fill every data slot, so you make the data up. But you don't want the made-up data to influence anything, so you make it the same as the last bar. Logical?
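A minimal sketch of that fill, assuming integer millisecond timestamps and a hypothetical helper named forward_fill (both are illustrative choices, not from the thread). It carries the last observed price forward onto a regular grid.

```python
def forward_fill(ticks, start, end, step):
    """Sample the last observed price at start, start + step, ... (< end).

    ticks: list of (timestamp_ms, price), sorted by time.
    Grid points that fall before the first tick are skipped.
    """
    grid = []
    i = 0
    last = None
    for t in range(start, end, step):
        # Consume every tick that arrived at or before this grid point.
        while i < len(ticks) and ticks[i][0] <= t:
            last = ticks[i][1]
            i += 1
        if last is not None:
            grid.append((t, last))
    return grid


# Irregular ticks (ms, price) sampled onto a 5 ms grid.
ticks = [(0, 100.0), (2, 100.5), (20, 100.2), (25, 100.4)]
grid = forward_fill(ticks, start=0, end=30, step=5)
prices = [p for _, p in grid]
# Repeated prices become zero returns on the regular grid.
zero_returns = sum(1 for a, b in zip(prices, prices[1:]) if a == b)
```

Note that the repeated prices are not neutral for a standard deviation of returns: each filled slot contributes a return of exactly zero, which pulls the estimate down in quiet stretches. If two ticks land between the same pair of grid points, only the later one survives.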

You should not be using time series methods at all. Use the point process paradigm, where points are described by their inter-event durations.
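As a rough illustration of what "described by their inter-event durations" means in practice, the sketch below turns a list of tick timestamps into durations and a simple average event rate. The function names and the sample timestamps are assumptions for illustration; a real point-process model would go well beyond this.

```python
from statistics import mean


def inter_event_durations(timestamps):
    """Gaps between consecutive events, in the same unit as the timestamps."""
    return [b - a for a, b in zip(timestamps, timestamps[1:])]


def event_rate(timestamps):
    """Average number of events per unit time, taken as 1 / mean duration."""
    return 1.0 / mean(inter_event_durations(timestamps))


# Hypothetical tick arrival times in milliseconds.
timestamps_ms = [0, 2, 22, 27, 40]
durations = inter_event_durations(timestamps_ms)
rate_per_ms = event_rate(timestamps_ms)
```

The durations themselves become part of the data here, instead of being discarded by bucketing the ticks into fixed intervals.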

You might be a good trader, Mark, but that's not good advice. The term for this in statistics is censoring.

Let's say you have a series that updates every minute and a series that updates every five minutes. To avoid the gap, use the synthetic data fill method I described. There is nothing "impacting" the price to censor it.

I would be very concerned about a feed that updated at a fixed frequency (see https://vixra.org/abs/1211.0094). The problem is this: 1) any selection of frequency throws away information

Well, that is true, but when you are trying to decipher data, smoothed is sometimes better than raw for observation.