How to obtain Confidence Intervals for a LASSO regression?
I'm very new to R. I have this code for a LASSO regression:
    library(glmnet)
    # Predictors as a matrix, outcome as a vector (read.csv2 expects ';'-separated files)
    X <- as.matrix(read.csv2("DB_LASSO_ERP.csv"))
    y <- read.csv2("OUTCOME_LASSO_ERP.csv", header = FALSE)$V1
    # Full LASSO path (alpha = 1) for a binary outcome
    fit <- glmnet(x = X, y = y, family = "binomial", alpha = 1)
    # Choose the penalty by cross-validation
    crossval <- cv.glmnet(x = X, y = y, family = "binomial")
    penalty <- crossval$lambda.min
    # Refit at the selected penalty
    fit1 <- glmnet(x = X, y = y, family = "binomial", alpha = 1, lambda = penalty)
I want to obtain confidence intervals for these coefficients. How can I do that? Could you help me with the script, please? I have very little experience with R.
Thanks!
regression confidence-interval lasso glmnet
asked 7 hours ago by Alfonso (new contributor)
The answer here suggests that there is no consensus on how to calculate the standard errors of LASSO. Since you need the standard errors for a confidence interval, you have to be very careful.
– V. Aslanyan, 6 hours ago
The link provided by @V.Aslanyan is quite useful, but note that the initial discussion on that page (from 2014) pre-dated much subsequent work on this topic.
– EdM, 5 hours ago
1 Answer
Please think very carefully about why you want confidence intervals for the LASSO coefficients and how you will interpret them. This is not an easy problem.
The predictors chosen by LASSO (as for any feature-selection method) can be highly dependent on the data sample at hand. You can examine this in your own data by repeating your LASSO model-building procedure on multiple bootstrap samples of the data. If you have predictors that are correlated with each other, the specific predictors chosen by LASSO are likely to differ among models based on the different bootstrap samples. So what do you mean by a confidence interval for a coefficient for a predictor, say predictor $x_1$, if $x_1$ wouldn't even have been chosen by LASSO if you had worked with a different sample from the same population?
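The bootstrap check described above can be sketched as follows. This is a minimal illustration and not part of the original answer: the simulated `X` and `y` are stand-ins for the poster's data (an assumption), and the number of resamples `B` is kept small only for speed. It refits the cross-validated LASSO on each bootstrap sample and records how often each predictor receives a non-zero coefficient.

```r
library(glmnet)

set.seed(1)
# Simulated stand-in for the real data (assumption): 200 rows, 10 predictors,
# with x2 strongly correlated with x1 and a binary outcome driven by x1 and x3.
n <- 200
p <- 10
X <- matrix(rnorm(n * p), n, p, dimnames = list(NULL, paste0("x", 1:p)))
X[, 2] <- X[, 1] + rnorm(n, sd = 0.3)
y <- rbinom(n, 1, plogis(X[, 1] - X[, 3]))

B <- 30  # number of bootstrap resamples (small, for illustration only)
selected <- matrix(FALSE, B, p, dimnames = list(NULL, colnames(X)))
for (b in seq_len(B)) {
  idx <- sample(n, replace = TRUE)                      # resample rows with replacement
  cv <- cv.glmnet(X[idx, ], y[idx], family = "binomial", alpha = 1)
  beta <- as.matrix(coef(cv, s = "lambda.min"))[-1, 1]  # drop the intercept
  selected[b, ] <- beta != 0                            # which predictors were kept
}

# Proportion of bootstrap fits in which each predictor was selected
selection_freq <- colMeans(selected)
print(round(selection_freq, 2))
```

Predictors whose selection frequency is well below 1 are the ones for which a single-sample confidence interval is hardest to interpret; with correlated predictors such as `x1` and `x2` here, the selection may switch between them across resamples.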
Despite this instability in feature selection, LASSO-based models can be useful for prediction. The selection of one from among several correlated predictors might be somewhat arbitrary, but the one selected serves as a rough proxy for the others and thus can lead to valid predictions. The quality of predictions from a LASSO model is typically of more interest than are confidence intervals for the coefficients. You can test the performance of your LASSO approach by seeing how well the models based on multiple bootstrapped samples work on the full original data set.
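One way to sketch that performance check, under stated assumptions: the data are simulated stand-ins for the real `X` and `y`, the helper `auc()` is a hypothetical name introduced here, and the area under the ROC curve is just one possible performance measure (the answer does not prescribe one). Each model is built on a bootstrap sample and then scored on the full original data set.

```r
library(glmnet)

set.seed(2)
# Simulated stand-in for the real data (assumption)
n <- 200
p <- 10
X <- matrix(rnorm(n * p), n, p, dimnames = list(NULL, paste0("x", 1:p)))
X[, 2] <- X[, 1] + rnorm(n, sd = 0.3)
y <- rbinom(n, 1, plogis(X[, 1] - X[, 3]))

# Hypothetical helper: AUC from the rank-sum (Mann-Whitney) identity
auc <- function(prob, y) {
  r <- rank(prob)
  n1 <- sum(y == 1)
  n0 <- sum(y == 0)
  (sum(r[y == 1]) - n1 * (n1 + 1) / 2) / (n1 * n0)
}

B <- 30  # number of bootstrap resamples (small, for illustration only)
auc_full <- numeric(B)
for (b in seq_len(B)) {
  idx <- sample(n, replace = TRUE)
  # Repeat the whole model-building procedure, including the choice of lambda
  cv <- cv.glmnet(X[idx, ], y[idx], family = "binomial", alpha = 1)
  # Score the bootstrap-built model on the full original data
  prob <- predict(cv, newx = X, s = "lambda.min", type = "response")[, 1]
  auc_full[b] <- auc(prob, y)
}

print(summary(auc_full))
```

The spread of `auc_full` gives a rough sense of how stable predictive performance is even when the selected predictors change from one resample to the next.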
That said, there is recent work on principled ways to obtain confidence intervals and on related issues in inference after LASSO. This page and its links are a good place to start. The issues are discussed in more detail in Section 6.3 of Statistical Learning with Sparsity. There is also an R package, selectiveInference, that implements these methods. But these are based on specific assumptions that might not hold in your data. If you do choose to use this approach, make sure to understand the conditions under which the approach is valid and exactly what those confidence intervals really mean. That statistical issue, rather than the R coding issue, is what is crucial here.
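For orientation only, here is a hedged sketch of what a call to selectiveInference might look like for a logistic LASSO. It is not part of the original answer. It follows the pattern of the logistic example in the package documentation for `fixedLassoInf` as best I recall it, so check the current help page before relying on it. The data are simulated, and the penalty `lambda = 0.8` is an arbitrary fixed value: the resulting intervals are conditional on a fixed penalty and do not account for choosing it by cross-validation.

```r
library(glmnet)
library(selectiveInference)

set.seed(43)
# Simulated data (assumption): 50 rows, 10 standardized predictors,
# binary outcome driven by the first two predictors.
n <- 50
p <- 10
x <- scale(matrix(rnorm(n * p), n, p), TRUE, TRUE)
beta_true <- c(3, 2, rep(0, p - 2))
latent <- drop(x %*% beta_true + rnorm(n))
y <- as.numeric(latent > mean(latent))

# Fit the LASSO path without internal standardization
gfit <- glmnet(x, y, standardize = FALSE, family = "binomial")

# Fixed penalty on the scale expected by fixedLassoInf; glmnet's objective
# carries a 1/n factor, hence s = lambda / n when extracting coefficients.
lambda <- 0.8
beta_hat <- coef(gfit, x = x, y = y, s = lambda / n, exact = TRUE)  # includes intercept

# Selective p-values and intervals for the predictors in the active set
out <- fixedLassoInf(x, y, beta_hat, lambda, family = "binomial")
print(out)
```

Only the predictors selected at this penalty receive intervals (reported in `out$ci` for the variables in `out$vars`), and those intervals are conditional on that selection, which is exactly the interpretive issue raised above.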
answered 6 hours ago (edited 5 hours ago) by EdM