How to figure out whether the data is sample data or population data apart from the client's information?


Apart from what the client tells me, what ways are there to figure out whether a data set is sample data or population data?







Tags: sample, population






asked 1 hour ago by Akarsh; edited 28 mins ago by Richard Hardy




















3 Answers






I think there is no way to know just by looking at the data.

In general, both the population and the sample may be small or large, so in some situations the sample may cover almost the whole population. Imagine collecting 90%, 95%, 99% and 100% of the population: I would not expect the results to change in any fundamental way once you reach 100% (that is, the population itself).

But maybe you know something about the population? If it consists of all customers of the company and you know how many customers they have per month, you may be able to estimate how big the population is.

My question would be why you want to know this, and why you don't know it already. Usually one should know something about the data one is supposed to analyse. Keep in mind that inferential statistics tries to draw conclusions about the population from the information in the sample. So if you have the population data, there is no need for inferential statistics (significance tests, confidence intervals, ...); the descriptive statistics already describe the population. This is information the analyst should have.
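As a rough illustration of the 90%, 95%, 99% and 100% thought experiment, here is a minimal Python sketch. It relies on assumptions that are not part of the answer: the population is synthetic (a hypothetical monthly spend for 1,000 customers), sampling is simple random sampling without replacement, and the standard error uses the usual finite population correction. Under those assumptions the estimated mean barely moves as the sampling fraction grows, while the standard error shrinks to exactly zero at 100%, which is one way to see why inference becomes unnecessary once you hold the whole population.

```python
import math
import random
import statistics


def fpc_standard_error(sample, population_size):
    """Standard error of the sample mean with the finite population correction.

    Assumes simple random sampling without replacement from a finite
    population whose size is known.
    """
    n = len(sample)
    s = statistics.stdev(sample)  # sample standard deviation (n - 1 denominator)
    fpc = math.sqrt((population_size - n) / (population_size - 1))
    return s / math.sqrt(n) * fpc


random.seed(0)

# Hypothetical population: monthly spend of 1,000 customers (synthetic data).
population = [random.gauss(100, 15) for _ in range(1000)]
population_mean = statistics.fmean(population)

results = {}
for pct in (90, 95, 99, 100):
    n = len(population) * pct // 100
    sample = random.sample(population, n)  # drawn without replacement
    mean = statistics.fmean(sample)
    se = fpc_standard_error(sample, len(population))
    results[pct] = (mean, se)
    print(f"{pct:>3}% of population: mean = {mean:.3f}, standard error = {se:.4f}")

print(f"population mean = {population_mean:.3f}")
```

Note that the code cannot tell you which row applies to your data: the population size is an input, so it still has to come from the client or from the documentation of how the data were collected.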







answered 37 mins ago by stats.and.r; edited 31 mins ago























A sample is just a subset of the population. If the sample is representative (which it should be), the only main difference between sample and population is their size.

However, for any real-life analysis it is very important to know where the data come from, and the collection process needs to be well documented. Not even knowing whether the data are a sample is a serious red flag.







answered 38 mins ago by Pere





















There is no way: the "population" of interest is part of the specification of the problem.

Statistical problems involving inference to a "population" require specification of the group of interest, about which we are making an inference. Only a proper specification of the problem (in this case, a briefing from the client) can give you this. Of course, there may be situations where the client does not know how to specify their problem in a well-posed way; in that case, part of the statistician's responsibility is to elicit contextual information that helps the client formulate a well-posed problem. In some cases, the source of existing sample data may also suggest a natural "population" about which we can make a valid inference. (Generally, a random sample allows us to make an inference about characteristics of the corresponding sampling frame, which may be close to some population of common interest.) Sample data cannot formulate your statistical problem for you. The problem must arise from some objective or context.

Whether data is "sample data" or "population data" also depends on context and on the specification of the group of interest. For example, suppose we consider data on the driving record (demerit points, fines, years with license, etc.) of a random sample of people with driver's licenses registered in a particular State. That data would be "sample data" from the sampling frame from which it was drawn (i.e., all people who hold a driver's license registered in that State), and the data on all people with a driver's license registered in that State would be the "population". However, that "population" can also be regarded as (non-randomised) "sample data" from the larger class of all people with driver's licenses registered anywhere in the country, which can in turn be regarded as (non-randomised) "sample data" from the larger class of all people with driver's licenses registered anywhere in the world.

All of this goes back to a fundamental aspect of sampling problems. In any such problem, there must be a specified "population" of interest, about which we wish to make an inference, and there must be "sample data" that bears somehow on that inference. (Ideally, we would like the sample data to be a random sample from a sampling frame that is close to the population of interest.)
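The driver's license example can be sketched in Python. Everything in the sketch is an assumption made for illustration: the three states, their names, and the demerit-point values are synthetic, and the states are deliberately made different so the effect is visible. It shows that a random sample drawn from one State's sampling frame tracks that State's mean well, while the same numbers, treated as a (non-randomised) sample from the whole country, say little about the country's mean.

```python
import random
import statistics

random.seed(1)

# Hypothetical demerit-point records for three states (synthetic data).
frames = {
    "state_a": [random.randint(0, 3) for _ in range(1000)],
    "state_b": [random.randint(4, 8) for _ in range(1000)],
    "state_c": [random.randint(6, 12) for _ in range(1000)],
}
country = [points for records in frames.values() for points in records]

# Simple random sample drawn from the state_a sampling frame only.
sample = random.sample(frames["state_a"], 200)

sample_mean = statistics.fmean(sample)
state_mean = statistics.fmean(frames["state_a"])
country_mean = statistics.fmean(country)

print(f"sample mean  = {sample_mean:.2f}")
print(f"state_a mean = {state_mean:.2f}  (the sampling frame)")
print(f"country mean = {country_mean:.2f}  (a larger population of interest)")
```

The same 200 records are a good sample or a poor one depending only on which population the problem specifies, and that choice is not recorded anywhere in the data.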







answered 7 mins ago by Ben


























