**16,262 regression questions.**

Software packages that calculate regressions sometimes also return p-values. I want to understand how to calculate this p-value by hand.
Here's what I think I understand:
I want to calculate the ...
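
For reference, a sketch of the standard by-hand calculation for a single OLS coefficient. It assumes the usual linear model with normally distributed errors, $n$ observations, and $p$ estimated coefficients including the intercept. The notation is an assumption for illustration, not taken from the question:

$$
t_j = \frac{\hat\beta_j}{\widehat{\mathrm{SE}}(\hat\beta_j)}, \qquad
\widehat{\mathrm{SE}}(\hat\beta_j) = \sqrt{\hat\sigma^2 \left[(X^\top X)^{-1}\right]_{jj}}, \qquad
\hat\sigma^2 = \frac{\mathrm{RSS}}{n-p}
$$

$$
p\text{-value} = 2\left(1 - F_{t_{n-p}}\!\left(|t_j|\right)\right)
$$

Here $F_{t_{n-p}}$ is the CDF of the $t$ distribution with $n-p$ degrees of freedom, and the p-value is two-sided for the null hypothesis $\beta_j = 0$.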

I fitted a multiple linear regression model which included 5 explanatory variables ("model 1").
After this I realised that maybe I was omitting several interesting variables, so I fitted another model ...

Question
I always used a paired t-test or a Wilcoxon signed-rank test (of course depending on the dataset) to check whether two methods (on average) yielded the same results. After learning more about ...

I have one independent variable which has values less than 1, and I want to see how the odds of having a disease change when the independent variable increases by 0.1. I ran a binary logistic regression ...
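
One way to read such a coefficient, sketched in Python: under a logistic model the log-odds are linear in the predictor, so a 0.1-unit increase multiplies the odds by $e^{0.1\beta}$. The coefficient value `beta_hat` below is hypothetical and not taken from the question.

```python
import math


def odds_ratio(beta: float, delta: float) -> float:
    """Multiplicative change in the odds when the predictor rises by `delta`."""
    return math.exp(beta * delta)


# Hypothetical fitted coefficient (change in log-odds per one-unit increase).
beta_hat = 2.0

or_per_unit = odds_ratio(beta_hat, 1.0)
or_per_tenth = odds_ratio(beta_hat, 0.1)

print(f"odds ratio per 1.0 increase: {or_per_unit:.3f}")
print(f"odds ratio per 0.1 increase: {or_per_tenth:.3f}")
```

Rescaling the predictor (multiplying it by 10 before fitting) would give the same per-0.1 odds ratio directly from the fitted coefficient.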

I have data on events occurring randomly. Variables include Unique ID, type, Date of Event, End date, etc. I have the dates on which the events occurred over 6 years. I need to predict the next ...

I have a multivariate linear model (y=x1+x2) which gives me the following results when using R's plot() function:
I can clearly see that the Normality and ...

I am working on an analysis of a simple linear regression and I don't know what to do. This is my graph:
The p-value is <0.0001, but the data are clearly not linear and the $R^2$ value is really small. ...

I have a query related to AIC values. I am getting very high AIC values, ranging from 4300 to 4600, while selecting a multiple regression model. Is it possible to get such high AIC values?

I am solving a MILP problem in my current master's thesis and I got stuck on an issue. Originally, I had a set of data points that I was using to build an OLS regression model of second order.
...

I have a case where only 2.7% of the rows have a value for a particular column.
What steps could I take to utilize it? Any specific methods?
Also, any algorithms or techniques for utilizing ...

I have a normally distributed continuous variable referring to an observed human behavior, and I'm interested in measuring or rather analyzing the extreme of this behavior, namely, the top 10% of the ...

I have a 1-10 Likert scale as my DV and five variables as my IV. My IVs include both continuous and categorical variables.
I am wondering which independent variables have a major effect on my ...

Seemingly reputable sources claim that the dependent variable must be normally distributed:
Model assumptions: $Y$ is normally distributed, errors are normally
distributed, $e_i \sim N(0,\sigma^2)$ ...

If I have a continuous Dependent Variable and two Independent Variables, where one is categorical with three levels and the other is continuous, what assumptions do I need to check for multiple ...

Consider two classes of data for which we have learned the SVM parameters in terms of Lagrange multipliers. There are many techniques for learning these parameters: quadratic programming or ...

I'm raising a recurrent issue with different results in R lm vs. Python statsmodels for simple linear regression, because my case suggests a problem with statsmodels. I'm exploring the relationship ...

Marketing campaign effectiveness example.
I am trying to model App installs as a function of the following variables
Click through rate
Ad Spend - CPM (Display) and CPC (Paid Search)
Media Type -...

While working on a big data set made of 10-minutes-points of information - i.e. 144 points per day, 1008 per week and ...

I have a project that I am working on and would like a pointer to help me move forward in the correct direction. I have a very large dataset, about 5 million lines, and I am performing missing data ...

A previous study had participants make a dichotomous decision, and it was analyzed with logistic regression.
A follow-up study is measuring the same construct, but having participants make a ...

Here is a question in the book of Freedman p.189.
The answers provided are
(a) (i) & (ii)
(b) not used
(c) something wrong
My biggest confusion is the answer for (c), what is wrong with this ...

I have performed (backwards elimination) stepwise regression using some fMRI data predictors to model spectroscopy data as a DV. This has resulted in some interesting models.
I now have some ...

I am trying to understand the mechanism behind the lasso. However, I want to gain some intuition about what happens if we don't standardize our data. I found many posts, but none was ...

Multiple linear regression: what should I do if my explanatory variables are highly correlated and I want to fit a model? The explanatory variables are measured on ordinal scales with 4 and 10 points.

I am working on a forecasting problem, using a regression model such as gradient boosting to predict the number of shoes sold weekly.
I am using the historical data only from last year to predict the sales ...

I've found for my econometrics exams that if I forget the scalar notation, I can often save myself by remembering the matrix notation and working backwards. However, the following confused me.
Given ...

I have a dataset of SO4 concentration (annually) from 1986-2016 across 35 water catchment stations. The dataset looks like this:
I first plotted the data and found out that the trend pattern is ...

How should I judge and interpret the result of my logistic model knowing that I get:
...

I would like to pose this question in two parts. Both deal with a generalized linear model, but the first deals with model selection and the other deals with regularization.
Background: I utilize ...

Can linear regression be used when both the dependent and independent variables are categorical?
I am looking at word-frequency distribution among a series of texts, and want to show that there is a ...

I have to run a linear regression analysis with an interaction effect of two categorical variables:
Modality (audio, visual and audio-visual)
Repetition (1x, 2x and 4x)
I have already dummified the ...

Byars et al.'s paper "Natural selection in a contemporary human population" includes a multiple linear regression of number of children (LRS, lifetime reproductive success) on several variables (...

I have a time series with a sample size of 20. Is there a way I can find the inflection point using R? Also, what is the statistical theory behind the test?

For each horse in a race, I am trying to model a response variable called rating which follows a normal distribution.
I need a mean and a variance for ...

Various recent efforts of mine on modelling some data through logistic regression have been... not successful. While there is still more data to look at, I've been wanting to explore nonlinear ...

I'm reading about best subset selection in the Elements of statistical learning book.
If I have 3 predictors $x_1,x_2,x_3$, I create $2^3=8$ subsets:
Subset with no predictors
subset with predictor $...
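
The enumeration described above can be sketched in Python as follows. The predictor names are placeholders, and the sketch only lists the candidate subsets; it does not fit or compare any models.

```python
from itertools import combinations

# Placeholder names for the three predictors.
predictors = ["x1", "x2", "x3"]

# Every subset of size 0..p, starting with the empty (intercept-only) model.
subsets = [
    combo
    for size in range(len(predictors) + 1)
    for combo in combinations(predictors, size)
]

for combo in subsets:
    print(combo if combo else "(no predictors)")

print(f"number of candidate models: {len(subsets)}")
```

With $p$ predictors the list has $2^p$ entries, which is why exhaustive best subset search becomes expensive quickly as $p$ grows.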

What are the relation and differences between time series and linear regression?
I have a strong grasp of linear regression, and a beginner's grasp on time series analysis; I know the Box-Jenkins ...

I have several time series of two variables over the course of one year (approx. 2.5k observations). I hypothesize one variable (x) acts as a potential predictor for the other variable (y). I looked ...

I would like to build a regression model to predict an outcome variable, y. Let ymin, ymax be the smallest and largest observed values of y in the dataset. Let ymean be the mean observed value. The ...

If your modeling problem is that you have too many features, a solution to this problem is LASSO regularization. By forcing some feature coefficients to be zero, you remove them, thus reducing the ...
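
A minimal Python sketch of how the lasso sets coefficients exactly to zero. It assumes the special case of orthonormal predictors ($X^\top X = I$) and the objective $\tfrac{1}{2}\lVert y - X\beta\rVert^2 + \lambda\lVert\beta\rVert_1$, where the lasso solution is the soft-thresholded OLS estimate. The coefficient values and `lam` are hypothetical.

```python
def soft_threshold(b: float, lam: float) -> float:
    """Shrink b toward zero by lam; return exactly 0.0 when |b| <= lam."""
    if b > lam:
        return b - lam
    if b < -lam:
        return b + lam
    return 0.0


# Hypothetical OLS coefficients under an orthonormal design.
ols_coefs = [3.0, -0.4, 0.9, -2.5]
lam = 1.0  # hypothetical penalty strength

lasso_coefs = [soft_threshold(b, lam) for b in ols_coefs]
selected = [i for i, b in enumerate(lasso_coefs) if b != 0.0]

print("lasso coefficients:", lasso_coefs)
print("indices of retained features:", selected)
```

Coefficients whose OLS magnitude is at most `lam` are dropped, and the surviving ones are shrunk by `lam`. With correlated predictors there is no such closed form, and the solution is usually found iteratively.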

Suppose we have three explanatory variables like $x_1,x_2,x_3$ and three response variables like $y_1, y_2, y_3$, we know that y should be a function of x, such that
$$
(y_1,y_2,y_3) = f(x_1,x_2,x_3)
$$ ...

I'm trying to convince my boss that we should consider using machine learning in our field (oncology). We study brain tumours, roughly 90% die within a few years. I wanted to compare the performance ...

I read that these are the conditions for using the multiple regression model:
the residuals of the model are nearly normal,
the variability of the residuals is nearly constant
the residuals are ...

I've looked at prior posts as well as the lme4 documentation in R but can't seem to find a solution to my problem.
I am trying to model how an intervention (tutoring), impacts examination pass rates ...

I'm working on a regression model using Latent Dirichlet Allocation (LDA). Using daily news data, I'm using a GARCH-model to see if different topics found using LDA indeed are significant in the ...

Since the median is often a better central descriptor of skewed distributions, I'd like to know if there is an equivalent method to a linear model for explaining variability of one variable through a ...

I would like to check if the slope coefficients retrieved from two separate regression models are significantly different. Both models have the same independent variables. The dependent variable (DV) ...

Title says it all. I understand that the Least-Squares and Maximum-Likelihood will give the same result for regression coefficients if the model's errors are normally distributed. But, what happens if ...
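
For context, the equivalence under normal errors can be written out as follows, assuming independent errors $e_i \sim N(0,\sigma^2)$ and the model $y_i = x_i^\top\beta + e_i$ (the notation is an assumption for illustration):

$$
\ell(\beta,\sigma^2) = -\frac{n}{2}\log\left(2\pi\sigma^2\right) - \frac{1}{2\sigma^2}\sum_{i=1}^{n}\left(y_i - x_i^\top\beta\right)^2
$$

For any fixed $\sigma^2$, only the last term depends on $\beta$, so maximizing $\ell$ over $\beta$ is the same as minimizing $\sum_{i=1}^{n}(y_i - x_i^\top\beta)^2$. Under a different error distribution the log-likelihood has a different form, and the two estimators need not coincide.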

I have a variable (call it V) that I want to create a linear model with.
The problem is that I want to break up this continuous variable V into 2 continuous variables (such as V1 for V<0, and V2 ...

- r
- logistic
- multiple-regression
- machine-learning
- time-series
- linear-model
- least-squares
- correlation
- categorical-data
- generalized-linear-model
- self-study
- regression-coefficients
- interaction
- residuals
- linear
- statistical-significance
- econometrics
- hypothesis-testing
- predictive-models
- anova
- interpretation
- modeling
- classification
- data-transformation
- mathematical-statistics
