STATISTICAL INFERENCE TECHNIQUES APPLIED TO LARGE SAMPLES

Authors

  • Hugo Vinícius Ferreira Azevedo
  • Eduardo Toledo de Lima Junior

Keywords:

Large Samples, Statistical Inference, Resampling, Kolmogorov-Smirnov

Abstract

The ongoing evolution of structural materials and analysis models demands a proper understanding of the safety levels adopted in design practice. The uncertainties inherent to structural engineering problems can be evaluated from the statistical description of their design variables – dimensional, mechanical, and loading – and incorporated into the analysis through structural reliability models. Statistical characterization is a crucial part of this process, carried out by inference techniques such as goodness-of-fit (GoF) tests, which verify whether sample data fit a theoretical distribution model at a specified significance level. GoF tests can be very sensitive to large samples – on the order of thousands of data points – becoming unsuitable for this kind of analysis. Alternative techniques can be applied to handle large datasets. This is the case of the subsampling principle, which involves the random, unbiased withdrawal of subsamples from the original dataset, so that one part of the sample is used to parameterize the model and another part is used for the GoF test, in varying proportions to be studied. In addition, the AIC and BIC (Akaike and Bayesian Information Criteria) values can be used as preliminary indicators of the congruence between a data sample and a theoretical distribution. The analysis of two different samples is proposed in order to apply some of these inference techniques, implemented in the Python language. This work is expected to contribute to the characterization of large samples for studies in data science and reliability analysis applied to engineering.
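The subsampling principle described in the abstract could be sketched as follows. This is a minimal illustration, not the authors' implementation: the distribution family (normal), the 30% fitting proportion, the 500-point test subsample, and the synthetic dataset are all assumptions for demonstration purposes, using SciPy's standard fitting and Kolmogorov-Smirnov routines.

```python
# Sketch of the subsampling idea: fit a candidate distribution on one
# random subsample and run the Kolmogorov-Smirnov test on a disjoint
# subsample, so the GoF test is not applied to the full large dataset.
# Proportions, sample sizes, and the normal model are illustrative.
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
data = rng.normal(loc=10.0, scale=2.0, size=50_000)  # synthetic "large sample"

# Random, unbiased split: 30% for parameter estimation and a small
# disjoint subsample (here 500 points) for the GoF test.
idx = rng.permutation(data.size)
n_fit = int(0.3 * data.size)
fit_part = data[idx[:n_fit]]
test_part = data[idx[n_fit:][:500]]

# Parameterize the candidate model on the fitting subsample.
mu, sigma = stats.norm.fit(fit_part)

# Kolmogorov-Smirnov test on the held-out subsample.
ks_stat, p_value = stats.kstest(test_part, "norm", args=(mu, sigma))

# AIC and BIC as preliminary indicators of congruence
# (k = 2 estimated parameters for the normal model).
loglik = stats.norm.logpdf(test_part, mu, sigma).sum()
k, n = 2, test_part.size
aic = 2 * k - 2 * loglik
bic = k * np.log(n) - 2 * loglik

print(f"KS = {ks_stat:.4f}, p = {p_value:.3f}")
print(f"AIC = {aic:.1f}, BIC = {bic:.1f}")
```

Varying the fitting/testing proportions, as the abstract proposes, amounts to changing the split sizes above and observing how the test statistic and p-value respond.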

Published

2024-08-26

Section

Articles