SAS supports more than 25 common probability distributions for the PDF, CDF, QUANTILE, and RAND functions. If you need a less-common distribution, you can implement new distributions by using Base SAS (specifically, PROC FCMP) or the SAS/IML language. On the SAS Support Communities, a SAS programmer asked how to implement the generalized extreme value (GEV) distribution in SAS. This article shows how to use PROC FCMP and PROC IML to implement functions for working with the GEV distribution. T...| The DO Loop
A previous article discusses Cohen's d statistic and how to compute it in SAS. For a two-sample independent design, Cohen's d estimates the standardized mean difference (SMD). Because Cohen's d is a biased statistic, the previous article also computes Hedges' g, which is an unbiased estimate of the SMD. Lastly, the article discusses how to estimate the standard error of the statistic. Today's article extends the analysis by showing how to compute a confidence interval (CI) for Cohen's d (and ...| The DO Loop
What is Cohen's d statistic and how is it used?| The DO Loop
The recent releases of SAS 9.| The DO Loop
Gambling games that use dice, such as the game of "craps," are often used to demonstrate the laws of probability.| The DO Loop
A SAS programmer wanted to simulate samples from a family of Beta(a,b) distributions for a simulation study.| The DO Loop
The metalog family of distributions (Keelin, Decision Analysis, 2016) is a flexible family that can model a wide range of continuous univariate data distributions when the data-generating mechanism is unknown.| The DO Loop
The Johnson system (Johnson, 1949) contains a family of four distributions: the normal distribution, the lognormal distribution, the SB distribution, and the SU distribution.| The DO Loop
This article uses simulation to demonstrate the fact that any continuous distribution can be transformed into the uniform distribution on (0,1).| The DO Loop
In SAS, the aspect ratio of a graph is the physical height of the graph divided by the physical width.| The DO Loop
Most introductory statistics courses introduce the bar chart as a way to visualize the frequency (counts) for a categorical variable.| The DO Loop
Parameters in SAS procedures are specified as a list of values that you manually type into the procedure syntax.| The DO Loop
SAS provides procedures to fit common probability distributions to sample data. You can use PROC UNIVARIATE in Base SAS or PROC SEVERITY in SAS/ETS software to estimate the distribution parameters for approximately 20 common distributions, including normal, lognormal, beta, gamma, and Weibull. Since there are infinitely many distributions, you may eventually need to fit a distribution that SAS does not natively support. There are three often-used methods for fitting the parameters of a distri...| The DO Loop
Many common probability distributions contain terms that increase or decrease quickly, such as the exponential function and factorials. The numerical evaluation of these quantities can result in numerical overflow (or underflow). This is why we often work on the logarithmic scale: on the log-scale, the numerical computations for equations such as the log-likelihood function are more stable. This article demonstrates a different trick that is useful for computing a sum of exponential terms. I ...| The DO Loop
Suppose you measure data weekly.| The DO Loop
Dating can be a challenge. No, I'm not talking about the process of finding a soulmate. I'm talking about managing days, weeks, months, and years in statistical analyses and reports! One challenge is how to number the weeks of the year. Because there are seven days in a week, 52 weeks equals 7*52 = 364 days of the year. Of course, there are 365 (or 366) days in a year, so the "first week of the year" does not always start on New Year's Day. But when does it start? What date is the first day o...| The DO Loop
I follow several data visualization experts on social media.| The DO Loop
When you pass a matrix as a parameter (argument) to a SAS/IML module, the SAS/IML language does not create a copy of the matrix.| The DO Loop
While researching a topic on effect sizes, I learned about a SAS function that is related to noncentrality parameters. I previously wrote an article about the noncentral t distribution, which is one of several well-known distributions that contains an optional noncentrality parameter. I mentioned that the PDF, CDF, and QUANTILE functions in SAS support an optional noncentrality parameter for the t, chi-square, and F distributions. However, I did not know until recently that SAS also provides ...| The DO Loop
A previous article discusses a "Catch-22" paradox for fitting nonlinear regression models: You can't estimate the parameters until you fit the model, but you can't fit the model until you provide an initial guess for the parameters! If your initial guess for the parameters is not good enough, the nonlinear optimization algorithm that tries to maximize the log-likelihood might not converge. The previous article shows how to specify a grid of initial parameter values in PROC NLIN and PROC NLMIXE...| The DO Loop
I have previously written about the moment-ratio diagram as a graphical tool for modeling univariate distributions and also as a tool for examining the distribution of the skewness and kurtosis statistics for distributions.| The DO Loop
This article shows how to classify a set of high-dimensional data into orthants.| The DO Loop
An article by David Corliss in Amstat News (Corliss D.| The DO Loop
SAS software provides many run-time functions that you can call from your SAS/IML or DATA step programs.| The DO Loop
Newton's method was in the news this week.| The DO Loop
"I think that my data are exponentially distributed, but how can I check?| The DO Loop
Imagine an animal that is searching for food in a vast environment where food is scarce.| The DO Loop
The tail of a probability distribution is an important notion in probability and statistics, but did you know that there is not a rigorous definition for the "tail"?| The DO Loop
In my previous post, I showed how to approximate a cumulative distribution function (CDF) by evaluating only the probability density function.| The DO Loop
A frequent topic on SAS discussion forums is how to check the assumptions of an ordinary least squares linear regression model.| The DO Loop
Since the late 1990s, SAS has supplied macros for basic bootstrap and jackknife analyses.| The DO Loop
In my article "Simulation in SAS: The slow way or the BY way," I showed how to use BY-group processing rather than a macro loop in order to efficiently analyze simulated data with SAS.| The DO Loop
You've probably heard of a random walk, but have you heard about the drunkard's walk?| The DO Loop
In SAS, the INPUT and PUT functions are powerful functions that enable you to convert data from character type to numeric type and vice versa.| The DO Loop
Over the past few years, and especially since I posted my article on eight tips to make your simulation run faster, I have received many emails (often with attached SAS programs) from SAS users who ask for advice about how to speed up their simulation code.| The DO Loop
The Johnson system (Johnson, 1949) contains a family of four distributions: the normal distribution, the lognormal distribution, the SB distribution (which models bounded distributions), and the SU distribution (which models unbounded distributions).| The DO Loop
From the early days of probability and statistics, researchers have tried to organize and categorize parametric probability distributions.| The DO Loop
This article describes best practices and techniques that every data analyst should know before bootstrapping in SAS.| The DO Loop
If you want to bootstrap the parameters in a statistical regression model, you have two primary choices.| The DO Loop
A common question is "how do I compute a bootstrap confidence interval in SAS?| The DO Loop
Many statistical tests use a CUSUM statistic as part of the test.| The DO Loop
To a statistician, the DIF function (which was introduced in SAS/IML 9.| The DO Loop
Graphs enable you to visualize how the predicted values for a regression model depend on the model effects.| The DO Loop
Isotonic regression (also called monotonic regression) is a type of regression model that assumes that the response variable is a monotonic function of the explanatory variable(s).| The DO Loop
Just like the SAS DATA step, the SAS IML language supports both functions and subroutines.| The DO Loop
As announced and demonstrated at SAS Innovate 2024, SAS plans to include a generative AI assistant called SAS Viya Copilot in the forthcoming SAS Viya Workbench.| The DO Loop
One of the most exciting features of SAS Viya Workbench is that the code editor includes a generative AI component called SAS Viya Copilot.| The DO Loop
A previous article discusses a formula for a confidence interval for R-square in a linear regression model (Olkin and Finn (1995) "Correlations redux", Psychological Bulletin). The formula is useful for large data sets, but it should be used with caution for small samples.| The DO Loop
A SAS analyst ran a linear regression model and obtained an R-square statistic for the fit.| The DO Loop
This article discusses how to scale a probability density curve so that it fits appropriately on a histogram, as shown in the graph to the right.| The DO Loop
A SAS analyst read my previous article about visualizing the predicted values for a regression model that uses spline effects.| The DO Loop
After writing a program that simulates data, it is important to check that the statistical properties of the simulated (synthetic) data match the properties of the model.| The DO Loop
A SAS statistical programmer recently asked a theoretical question about statistics.| The DO Loop
At a recent conference in Las Vegas, a presenter simulated the sum of two dice and used it to simulate the game of craps.| The DO Loop
Years ago, I wrote an article that showed how to visualize patterns of missing data.| The DO Loop
The moment-ratio diagram is a tool that is useful when choosing a distribution that models a sample of univariate data.| The DO Loop