How to explain the utility of binomial logistic regression when the predictors are purely categorical

The resources that I have seen feature graphs such as the following:

[image: plot with an S-shaped logistic curve over a continuous predictor x]

This is fine if the predictor $x$ is continuous, but if the predictor is categorical and has just a few levels, it's not clear to me how to justify the logistic model / curve.

I have seen this post; this is not a question about whether or not binary logistic regression can be carried out using categorical predictors.

What I'm interested in is how to explain the use of the logistic curve in this model, as there doesn't seem to be a clear way like there is for a continuous predictor.

edit

data

data that has been used for this simulation

library(vcd)
set.seed(2019)
n = 1000
y = rbinom(2*n, 1, 0.6)
x = rbinom(2*n, 1, 0.6)
df = data.frame(x, y)  # assumed: the post does not show how df was built

crosstabulation

> table(df)
   y
x     0   1
  0 293 523
  1 461 723

> prop.table(table(df))
   y
x        0      1
  0 0.1465 0.2615
  1 0.2305 0.3615

mosaic plot

[image: mosaic plot of x against y]
















machine-learning logistic binary-data logistic-curve

edited 49 mins ago
asked 3 hours ago by baxx

2 Answers

First, you could make a graph like that with a categorical x. It's true that the curve would not make much sense, but... so? You could say similar things about curves used in evaluating linear regression.

Second, you can look at crosstabulations; this is especially useful for comparing the DV to a single categorical IV (which is what your plot above does for a continuous IV). A more graphical way to look at this is a mosaic plot.

Third, it gets more interesting when you look at multiple IVs. A mosaic plot can handle two IVs pretty easily, but it gets messy with more. If there are not a great many variables or levels, you can get the predicted probability for every combination.
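As a rough sketch of the third point: assuming a hypothetical data frame `d` with a binary response `y` and two categorical predictors `a` and `b` (none of these names or values come from the question), the predicted probability for every combination of levels could be tabulated with `expand.grid()` and `predict()`:

```r
# Hypothetical simulated data: binary y, two categorical predictors a and b
set.seed(1)
d <- data.frame(
  a = factor(sample(c("a1", "a2"), 500, replace = TRUE)),
  b = factor(sample(c("b1", "b2", "b3"), 500, replace = TRUE))
)
d$y <- rbinom(500, 1, ifelse(d$a == "a2", 0.7, 0.4))

# Main-effects logistic regression
fit <- glm(y ~ a + b, data = d, family = binomial)

# One row per combination of levels, with its predicted probability
grid <- expand.grid(a = levels(d$a), b = levels(d$b))
grid$p_hat <- predict(fit, newdata = grid, type = "response")
grid
```

With 2 x 3 levels this gives six rows, which is still easy to read as a table; the approach gets unwieldy as the number of variables or levels grows.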












answered 3 hours ago by Peter Flom

• Thanks, please see the edit that I have made to this post. It seems that you're suggesting (please correct me if I'm wrong) that there's not really much to say with respect to the use of the logistic curve in a setup such as the one in this post, other than that it happens to have properties which enable us to map odds -> probabilities (which I'm not suggesting is useless). I was just wondering whether there was a way to demonstrate the utility of the curve in a similar manner to its use when the predictor is continuous. – baxx, 49 mins ago

• @baxx: Please also see my answer, in addition to Peter's answer. – Isabella Ghement, 24 mins ago












          $begingroup$

          A binary logistic regression model with a continuous predictor variable x has the form:



          log(odds that y = 1) = beta0 + beta1 * x (A)


          According to this model, the continuous predictor variable x has a linear effect on the log odds that the binary response variable y is equal to 1 (rather than 0).



          One can easily show this model to be equivalent to the following model:



           (probability that y = 1) = exp(beta0 + beta1 * x)/[1 + exp(beta0 + beta1 * x)] (B)


          In the equivalent model, the continuous predictor x has a nonlinear effect on the probability that y is equal to 1.
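The step from (A) to (B) is short. Writing p for the probability that y = 1 (a symbol introduced here only for brevity), it amounts to solving the log-odds equation for p:

```latex
\begin{aligned}
\log\frac{p}{1-p} &= \beta_0 + \beta_1 x \\
\frac{p}{1-p} &= e^{\beta_0 + \beta_1 x} \\
p &= (1-p)\, e^{\beta_0 + \beta_1 x} \\
p \left(1 + e^{\beta_0 + \beta_1 x}\right) &= e^{\beta_0 + \beta_1 x} \\
p &= \frac{e^{\beta_0 + \beta_1 x}}{1 + e^{\beta_0 + \beta_1 x}}
\end{aligned}
```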



          In the plot that you shared, the S-shaped blue curve is obtained by plotting the right-hand side of equation (B) above as a function of x; it shows how the probability that y = 1 increases (nonlinearly) as the values of x increase.



          If your x variable were a categorical predictor with, say, 2 categories, then it would be coded via a dummy variable x in your model, such that x = 0 for the first (or reference) category and x = 1 for the second (or non-reference) category. In that case, your binary logistic regression model would still be expressed as in equation (A). However, since x is a dummy variable, the model simplifies to:



          log(odds that y = 1) = beta0 for the reference category of x (C1) 


          and



          log(odds that y = 1) = beta0 + beta1 for the non-reference category of x (C2)


          The equations (C1) and (C2) can be further manipulated and re-expressed as:



          (probability that y = 1) = exp(beta0)/[1 + exp(beta0)] for the reference category of x (D1)


          and



          (probability that y = 1) = exp(beta0 + beta1)/[1 + exp(beta0 + beta1)] for the non-reference category of x (D2)
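To make (D1) and (D2) concrete, here is a small sketch that uses the counts from the crosstab in the question. The object names (`tab`, `p_ref`, `p_nonref`, `p_obs`) are made up for this illustration, and the grouped-data form of `glm()` is one of several equivalent ways to fit the model:

```r
# Counts copied from the crosstab in the question (rows: x, columns: y)
tab <- data.frame(
  x  = c(0, 1),
  y0 = c(293, 461),  # observations with y = 0
  y1 = c(523, 723)   # observations with y = 1
)

# Grouped-data logistic regression: response is cbind(successes, failures)
fit <- glm(cbind(y1, y0) ~ x, data = tab, family = binomial)
beta0 <- unname(coef(fit)[1])
beta1 <- unname(coef(fit)[2])

# Equations (D1) and (D2); plogis(z) is exp(z) / (1 + exp(z))
p_ref    <- plogis(beta0)          # x = 0
p_nonref <- plogis(beta0 + beta1)  # x = 1

# Observed proportions of y = 1 in each row of the crosstab
p_obs <- tab$y1 / (tab$y0 + tab$y1)

round(c(p_ref, p_nonref), 4)
round(p_obs, 4)
```

With a single two-level predictor the model has as many parameters as there are categories, so the two fitted probabilities coincide with the row proportions 523/816 (about 0.641) and 723/1184 (about 0.611).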


          So what is the utility of the binary logistic regression when x is a dummy variable? The model allows you to estimate two different probabilities that y = 1: one for x = 0 (as per equation (D1)) and one for x = 1 (as per equation (D2)).
          You could create a plot to visualize these two probabilities as a function of x and superimpose the observed values of y for x = 0 (i.e., a whole bunch of zeroes and ones sitting on top of x = 0) and for x = 1 (i.e., a whole bunch of zeroes and ones sitting on top of x = 1). The plot would look like this:



       ^
       |
y = 1  |    1         1
       |
       |              *
       |
       |    *
       |
y = 0  |    0         0
       |
       |------------------>
          x = 0     x = 1
              x-axis


          In this plot, you can see the zero values (i.e., y = 0) stacked atop x = 0 and x = 1, as well as the one values (i.e., y = 1) stacked atop x = 0 and x = 1. The * symbols denote the estimated values of the probability that y = 1. There are no more curves in this plot as you are just estimating two distinct probabilities. If you wanted to, you could connect these estimated probabilities with a straight line to indicate whether the estimated probability that y = 1 increases or decreases when you move from x = 0 to x = 1. Of course, you could also jitter the zeroes and ones shown in the plot to avoid plotting them right on top of each other.
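A plot along these lines could be sketched in base R as follows. The data are simulated purely for illustration (the sample size and the probabilities 0.3 and 0.7 are arbitrary assumptions), and the jitter amounts are a matter of taste:

```r
# Hypothetical simulated data with a two-level predictor coded 0/1
set.seed(1)
x <- rbinom(200, 1, 0.5)
y <- rbinom(200, 1, ifelse(x == 1, 0.7, 0.3))

fit   <- glm(y ~ x, family = binomial)
p_hat <- predict(fit, newdata = data.frame(x = c(0, 1)), type = "response")

# Jittered observations, with the x-axis labelled by category
plot(jitter(x, amount = 0.08), jitter(y, amount = 0.03),
     xaxt = "n", xlim = c(-0.5, 1.5), pch = 1, col = "grey50",
     xlab = "x", ylab = "y and estimated P(y = 1)")
axis(1, at = c(0, 1), labels = c("x = 0", "x = 1"))

# Estimated probabilities as stars, joined by a straight line
points(c(0, 1), p_hat, pch = 8, cex = 2)
lines(c(0, 1), p_hat, lty = 2)
```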



          If your x variable has k categories, where k > 2, then your model would include k - 1 dummy variables and could be written to make it clear that it estimates k distinct probabilities that y = 1 (one for each category of x). You could visualize the estimated probabilities by extending the plot I showed above to incorporate k categories for x. For example, if k = 3, the plot would look like this:



       ^
       |
y = 1  |     1           1           1
       |                             *
       |                 *
       |
       |     *
       |
y = 0  |     0           0           0
       |
       |---------------------------------->
          x = 1st     x = 2nd     x = 3rd
                      x-axis


          where 1st, 2nd and 3rd refer to the first, second and third category of the categorical predictor variable x.
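For the k = 3 case, a sketch with simulated data shows both the k - 1 dummy variables and the k estimated probabilities. The category labels follow the plot above; the sample size and the true probabilities are arbitrary assumptions for illustration:

```r
# Hypothetical simulated data with a three-level categorical predictor
set.seed(1)
x <- factor(sample(c("1st", "2nd", "3rd"), 300, replace = TRUE))
p_true <- c("1st" = 0.3, "2nd" = 0.6, "3rd" = 0.8)
y <- rbinom(300, 1, p_true[as.character(x)])

# R builds k - 1 = 2 dummy variables; "1st" is the reference category
head(model.matrix(~ x))

fit <- glm(y ~ x, family = binomial)

# One estimated probability per category
new_x <- data.frame(x = factor(levels(x), levels = levels(x)))
p_hat <- predict(fit, newdata = new_x, type = "response")
names(p_hat) <- levels(x)
p_hat

# They match the observed proportion of y = 1 within each category
tapply(y, x, mean)
```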



          Note that the effects package in R will create plots similar to what I suggested here, except that the plots will NOT show the observed values of y corresponding to each category of x and will display uncertainty intervals around the plotted (estimated) probabilities. Simply use these commands:



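          # install.packages("effects")  # run once if the package is not installed
          library(effects)

          # `data` stands for your own data frame with a binary y and a factor x
          model <- glm(y ~ x, data = data, family = "binomial")

          # estimated probabilities per category, with uncertainty intervals
          plot(allEffects(model))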
















          $endgroup$













            Your Answer





            StackExchange.ifUsing("editor", function ()
            return StackExchange.using("mathjaxEditing", function ()
            StackExchange.MarkdownEditor.creationCallbacks.add(function (editor, postfix)
            StackExchange.mathjaxEditing.prepareWmdForMathJax(editor, postfix, [["$", "$"], ["\\(","\\)"]]);
            );
            );
            , "mathjax-editing");

            StackExchange.ready(function()
            var channelOptions =
            tags: "".split(" "),
            id: "65"
            ;
            initTagRenderer("".split(" "), "".split(" "), channelOptions);

            StackExchange.using("externalEditor", function()
            // Have to fire editor after snippets, if snippets enabled
            if (StackExchange.settings.snippets.snippetsEnabled)
            StackExchange.using("snippets", function()
            createEditor();
            );

            else
            createEditor();

            );

            function createEditor()
            StackExchange.prepareEditor(
            heartbeatType: 'answer',
            autoActivateHeartbeat: false,
            convertImagesToLinks: false,
            noModals: true,
            showLowRepImageUploadWarning: true,
            reputationToPostImages: null,
            bindNavPrevention: true,
            postfix: "",
            imageUploader:
            brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
            contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
            allowUrls: true
            ,
            onDemand: true,
            discardSelector: ".discard-answer"
            ,immediatelyShowMarkdownHelp:true
            );



            );













            draft saved

            draft discarded


















            StackExchange.ready(
            function ()
            StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstats.stackexchange.com%2fquestions%2f400452%2fhow-to-explain-the-utility-of-binomial-logistic-regression-when-the-predictors-a%23new-answer', 'question_page');

            );

            Post as a guest















            Required, but never shown

























            2 Answers
            2






            active

            oldest

            votes








            2 Answers
            2






            active

            oldest

            votes









            active

            oldest

            votes






            active

            oldest

            votes









            1












            $begingroup$

            First, you could make a graph like that with a categorical x. It's true that that curve would not make much sense, but ...so? You could say similar things about curves used in evaluating linear regression.



            Second, you can look at crosstabulations, this is especially useful for comparing the DV to a single categorical IV (which is what your plot above does, for a continuous IV). A more graphical way to look at this is a mosaic plot.



            Third, it gets more interesting when you look at multiple IVs. A mosaic plot can handle two IVs pretty easily, but they get messy with more. If there are not a great many variables or levels, you can get the predicted probablity for every combination.






            share|cite|improve this answer









            $endgroup$












            • $begingroup$
              thanks, please see the edit that I have made to this post. It seems that you're suggesting (please correct me if i'm wrong) that there's not really much to say with respect to the use of the logistic curve in the case of such a set up as this post. Other than it happens to have properties which enable us to map odds -> probabilities (which I'm not suggesting is useless). I was just wondering whether there was a way to demonstrate the utility of the curve in a similar manner to the use when the predictor is continuous.
              $endgroup$
              – baxx
              49 mins ago











            • $begingroup$
              @baxx: Please also see my answer, in addition to Peter's answer.
              $endgroup$
              – Isabella Ghement
              24 mins ago















            1












            $begingroup$

            First, you could make a graph like that with a categorical x. It's true that that curve would not make much sense, but ...so? You could say similar things about curves used in evaluating linear regression.



            Second, you can look at crosstabulations, this is especially useful for comparing the DV to a single categorical IV (which is what your plot above does, for a continuous IV). A more graphical way to look at this is a mosaic plot.



            Third, it gets more interesting when you look at multiple IVs. A mosaic plot can handle two IVs pretty easily, but they get messy with more. If there are not a great many variables or levels, you can get the predicted probablity for every combination.






            share|cite|improve this answer









            $endgroup$












            • $begingroup$
              thanks, please see the edit that I have made to this post. It seems that you're suggesting (please correct me if i'm wrong) that there's not really much to say with respect to the use of the logistic curve in the case of such a set up as this post. Other than it happens to have properties which enable us to map odds -> probabilities (which I'm not suggesting is useless). I was just wondering whether there was a way to demonstrate the utility of the curve in a similar manner to the use when the predictor is continuous.
              $endgroup$
              – baxx
              49 mins ago











            • $begingroup$
              @baxx: Please also see my answer, in addition to Peter's answer.
              $endgroup$
              – Isabella Ghement
              24 mins ago













            1












            1








            1





            $begingroup$

            First, you could make a graph like that with a categorical x. It's true that that curve would not make much sense, but ...so? You could say similar things about curves used in evaluating linear regression.



            Second, you can look at crosstabulations, this is especially useful for comparing the DV to a single categorical IV (which is what your plot above does, for a continuous IV). A more graphical way to look at this is a mosaic plot.



            Third, it gets more interesting when you look at multiple IVs. A mosaic plot can handle two IVs pretty easily, but they get messy with more. If there are not a great many variables or levels, you can get the predicted probablity for every combination.






            share|cite|improve this answer









            $endgroup$



            First, you could make a graph like that with a categorical x. It's true that that curve would not make much sense, but ...so? You could say similar things about curves used in evaluating linear regression.



            Second, you can look at crosstabulations, this is especially useful for comparing the DV to a single categorical IV (which is what your plot above does, for a continuous IV). A more graphical way to look at this is a mosaic plot.



            Third, it gets more interesting when you look at multiple IVs. A mosaic plot can handle two IVs pretty easily, but they get messy with more. If there are not a great many variables or levels, you can get the predicted probablity for every combination.







            share|cite|improve this answer












            share|cite|improve this answer



            share|cite|improve this answer










            answered 3 hours ago









            Peter FlomPeter Flom

            76.7k11109214




            76.7k11109214











            • $begingroup$
              thanks, please see the edit that I have made to this post. It seems that you're suggesting (please correct me if i'm wrong) that there's not really much to say with respect to the use of the logistic curve in the case of such a set up as this post. Other than it happens to have properties which enable us to map odds -> probabilities (which I'm not suggesting is useless). I was just wondering whether there was a way to demonstrate the utility of the curve in a similar manner to the use when the predictor is continuous.
              $endgroup$
              – baxx
              49 mins ago











            • $begingroup$
              @baxx: Please also see my answer, in addition to Peter's answer.
              $endgroup$
              – Isabella Ghement
              24 mins ago
















            • $begingroup$
              thanks, please see the edit that I have made to this post. It seems that you're suggesting (please correct me if i'm wrong) that there's not really much to say with respect to the use of the logistic curve in the case of such a set up as this post. Other than it happens to have properties which enable us to map odds -> probabilities (which I'm not suggesting is useless). I was just wondering whether there was a way to demonstrate the utility of the curve in a similar manner to the use when the predictor is continuous.
              $endgroup$
              – baxx
              49 mins ago











            • $begingroup$
              @baxx: Please also see my answer, in addition to Peter's answer.
              $endgroup$
              – Isabella Ghement
              24 mins ago















            $begingroup$
            thanks, please see the edit that I have made to this post. It seems that you're suggesting (please correct me if i'm wrong) that there's not really much to say with respect to the use of the logistic curve in the case of such a set up as this post. Other than it happens to have properties which enable us to map odds -> probabilities (which I'm not suggesting is useless). I was just wondering whether there was a way to demonstrate the utility of the curve in a similar manner to the use when the predictor is continuous.
            $endgroup$
            – baxx
            49 mins ago





            $begingroup$
            thanks, please see the edit that I have made to this post. It seems that you're suggesting (please correct me if i'm wrong) that there's not really much to say with respect to the use of the logistic curve in the case of such a set up as this post. Other than it happens to have properties which enable us to map odds -> probabilities (which I'm not suggesting is useless). I was just wondering whether there was a way to demonstrate the utility of the curve in a similar manner to the use when the predictor is continuous.
            $endgroup$
            – baxx
            49 mins ago













            $begingroup$
            @baxx: Please also see my answer, in addition to Peter's answer.
            $endgroup$
            – Isabella Ghement
            24 mins ago




            $begingroup$
            @baxx: Please also see my answer, in addition to Peter's answer.
            $endgroup$
            – Isabella Ghement
            24 mins ago













            1












            $begingroup$

            A binary logistic regression model with continuous predictor variable x has the form:



            log(odds that y = 1) = beta0 + beta1 * x (A)


            According to this model, the continuous predictor variable x has a linear effect on the log odds that the binary response variable y is equal to 1 (rather than 0).



            One can easily show this model to be equivalent to the following model:



             (probability that y = 1) = exp(beta0 + beta1 * x)/[1 + exp(beta0 + beta1 * x)] (B)


            In the equivalent model, the continuous predictor x has a nonlinear effect on the probability that y is equal to 1.



            In the plot that you shared, the S-shaped blue curve is obtained by plotting the right hand side of equation (B) above as a function of x and shows how the probability that y = 1 increases (nonlinearly) as the values of x increase.



            If your x variable were a categorical predictor with, say, 2 categories, then it would be coded via a dummy variable x in your model, such that x = 0 for the first (or reference) category and x = 1 for the second (or non-reference) category. In that case, your binary logistic regression model would still be expressed as in equation (B). However, since x is a dummy variable, the model would be simplified as:



            log(odds that y = 1) = beta0 for the reference category of x (C1) 


            and



            log(odds that y = 1) = beta0 + beta1 for the non-reference category of x (C2)


            The equations (C1) and (C2) can be further manipulated and re-expressed as:



            (probability that y = 1) = exp(beta0)/[1 + exp(beta0)] for the reference category of x (D1)


            and



            (probability that y = 1) = exp(beta0 + beta1)/[1 + exp(beta0 + beta1)] for the non-reference category of x (D2)


            So what is the utility of the binary logistic regression when x is a dummy variable? The model allows you to estimate two different probabilities that y = 1: one for x = 0 (as per equation (D1)) and one for x = 1 (as per equation (D2)).
            You could create a plot to visualize these two probabilities as a function of x and superimpose the observed values of y for x = 0 (i.e., a whole bunch of zeroes and ones sitting on top of x = 0) and for x = 1 (i.e., a whole bunch of zeroes and ones sitting on top of x = 1). The plot would look like this:



             ^
            |
            y = 1 | 1 1
            |
            | *
            |
            | *
            |
            y = 0 | 0 0
            |
            |------------------>
            x = 0 x = 1
            x-axis


            In this plot, you can see the zero values (i.e., y = 0) stacked atop x = 0 and x = 1, as well as the one values (i.e., y = 1) stacked atop x = 0 and x = 1. The * symbols denote the estimated values of the probability that y = 1. There are no more curves in this plot as you are just estimating two distinct probabilities. If you wanted to, you could connect these estimated probabilities with a straight line to indicate whether the estimated probability that y = 1 increases or decreases when you move from x = 0 to x = 1. Of course, you could also jitter the zeroes and ones shown in the plot to avoid plotting them right on top of each other.



            If your x variable has k categories, where k > 2, then your model would include k - 1 dummy variables and could be written to make it clear that it estimates k distinct probabilities that y = 1 (one for each category of x). You could visualize the estimated probabilities by extending the plot I showed above to incorporate k categories for x. For example, if k = 3,



            this:



             ^
            |
            y = 1| 1 1 1
            | *
            | *
            |
            | *
            |
            y = 0| 0 0 0
            |
            |---------------------------------->
            x = 1st x = 2nd x = 3rd
            x-axis


            where 1st, 2nd and 3rd refer to the first, second and third category of the categorical predictor variable x.



            Note that the effects package in R will create plots similar to what I suggested here, except that the plots will NOT show the observed values of y corresponding to each category of x and will display uncertainty intervals around the plotted (estimated) probabilities. Simply use these commands:



            install.packages("effects")
            library(effects)

            model <- glm(y ~ x, data = data, family = "binomial")

            plot(allEffects(model))





            share|cite|improve this answer











            $endgroup$

















































                edited 12 mins ago

























                answered 25 mins ago









                Isabella Ghement


























