
Month: October 2018

Boole's inequality and calculation of the union probability of n arbitrary events. Explain in a simple way the concept of the sampling distribution of the mean (or of any other statistic computable on the sample, such as the standard deviation (sigma), the mode, or the median).

Boole's inequality and calculation of the union probability of n arbitrary events

Boole's inequality, or union bound, says that for every finite or countable collection of events, the probability that at least one of the events happens is no greater than the sum of the probabilities of the individual events. Formally, if we have a finite or countable set of events A1, A2, A3, …, An, we say that:

P(A1 U A2 U … U An) <= P(A1) + P(A2) + … + P(An)

It is easily demonstrated for n = 2 events. If, for example, we have two arbitrary events A and B, the inclusion-exclusion principle gives:

P(A U B) = P(A) + P(B) - P(A ∩ B)

By the first axiom of probability P(A ∩ B) >= 0, and since it is subtracted, it follows that P(A U B) <= P(A) + P(B). If we now consider the event C = A U B and another arbitrary event D, we can repeat the same argument, which extends the result by induction from n to n + 1 events. Therefore the general formula above holds, and Boole's inequality can be applied to bound the union probability of n arbitrary events.

Explain in a simple way the concept of the sampling distribution of the mean (or of any other statistic computable on the sample, such as the standard deviation (sigma), the mode, or the median)

The sampling distribution of the mean is the distribution of the sample means obtained by drawing repeated samples of the same size from a population. The mean of the sampling distribution of the mean is the mean of the population from which the scores were sampled. Therefore, if a population has a mean μ (unknown), then the mean of the sampling distribution of the mean is also μ. The symbol μM (calculated) is used to refer to the mean of the sampling distribution of the mean. Therefore, the formula for the mean of the sampling distribution of the mean can be written as:

μM = μ

It is important to keep in mind that every statistic, not just the mean, has a sampling distribution.
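A short simulation can make the relation μM = μ concrete. This is a minimal sketch (the function name and the toy population are mine, not from the post): we draw many samples of the same size from a fixed population, compute the mean of each sample, and check that the mean of those sample means is close to the population mean μ.

```python
import random
from statistics import mean

def sampling_distribution_of_mean(population, sample_size, n_samples, seed=42):
    """Draw n_samples samples of size sample_size and return the list of sample means."""
    rng = random.Random(seed)
    return [mean(rng.sample(population, sample_size)) for _ in range(n_samples)]

population = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]  # toy population
mu = mean(population)                          # population mean: 5.5

sample_means = sampling_distribution_of_mean(population, sample_size=4, n_samples=5000)
mu_M = mean(sample_means)                      # mean of the sampling distribution

print(mu, round(mu_M, 2))  # mu_M should be close to mu = 5.5
```

With more samples, mu_M converges to mu, which is exactly the statement μM = μ above.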

Concept and definition of the mean. Relationship between frequency and mean. The Markov inequality.

Concept and definition of the mean

In statistics, we define the mean as a single numeric value that synthetically describes a set of data. There are three types of mean: the arithmetic, the harmonic, and the geometric mean. In statistics, we usually take the arithmetic mean. If the mean is computed using the whole population it is called the population mean, but if the values used are only a subset of the population the result is called the sample mean. The equation to compute the arithmetic mean is the following:

A := (1/n) * sum(from i = 1 to n) of a_i

This kind of approach is called an ex post computation because we need to collect the whole sample before computing. This is not always possible, so in some cases we must use the expected mean, or expected value, a measure of central tendency where each value is weighted by its probability of occurring and the weighted values are then summed. The expected mean is an ex ante calculation (sometimes referred to as a weighted mean where the probabilities are the weights).

The mean can be used to extract results and interpretations from the data. For example, the distances between the values and the mean m always balance out: any value v above m must be balanced by values v' below m at the same total distance. In formula, the signed deviations from the mean sum to zero:

sum(from i = 1 to n) of (a_i - m) = 0

The statistical mean is popular because it includes every item in the data set and can easily be combined with other statistical measurements. However, its major disadvantage is that it can be affected by extreme values in the data set and therefore be biased. For example, mean income is typically skewed upwards by a small number of people with very large incomes, so that the majority have an income lower than the mean.
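As a small illustration of the ex post mean, the ex ante (probability-weighted) expected value, and the fact that signed deviations from the mean cancel out, here is a minimal sketch (function names are mine):

```python
def arithmetic_mean(values):
    """Ex post mean: collect all the data first, then average."""
    return sum(values) / len(values)

def expected_value(outcomes, probabilities):
    """Ex ante mean: weight each outcome by its probability and sum."""
    assert abs(sum(probabilities) - 1.0) < 1e-9, "probabilities must sum to 1"
    return sum(x * p for x, p in zip(outcomes, probabilities))

data = [2, 4, 4, 10]
m = arithmetic_mean(data)                # 5.0
deviations = sum(x - m for x in data)    # signed deviations cancel out: 0.0

# Expected value of a fair six-sided die: (1+2+3+4+5+6)/6 = 3.5
die_ev = expected_value([1, 2, 3, 4, 5, 6], [1/6] * 6)
```

The die example also shows how the expected value can be computed before collecting any data, knowing only the probabilities.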
To avoid these problems one possible solution is Knuth's online algorithm:

```python
# Python implementation of Knuth's online mean algorithm
def mean_knuth(data):
    n = 0
    mean = 0.0
    for x in data:
        n = n + 1
        delta = x - mean
        mean = mean + delta / n
    return mean
```

This algorithm is less subject to the information loss caused by floating-point cancellation and rounding, but it can be less efficient than the naive implementation because of the division inside the loop.

The relationship between frequency and mean

To…
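The post is cut off here, but the standard relationship it names is that the arithmetic mean can be computed directly from the frequency table of the distinct values: mean = sum(value * absolute frequency) / total count, i.e. the sum of each value times its relative frequency. A minimal sketch of this (the function name is mine):

```python
from collections import Counter

def mean_from_frequencies(data):
    """Arithmetic mean computed from the frequency table:
    mean = sum(value * absolute_frequency) / total_count."""
    freq = Counter(data)   # value -> absolute frequency
    n = sum(freq.values())
    return sum(value * count for value, count in freq.items()) / n

data = [1, 1, 2, 2, 2, 5]
# mean_from_frequencies(data) gives the same result as sum(data) / len(data)
```

This is why relative frequencies play the same role for observed data that probabilities play in the expected value.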

Conditional frequency

Request

Understand and discuss the notion of conditional frequency.

Insight 2: Research the various approaches to defining the concept of probability, the 'Kolmogorov axioms' and the relationship with the notion of (relative) frequency.

Exercise: take a CSV file and read all the lines. Split each line by using the string split method and (if you can) load the data as properties of suitable objects in a list of objects (for example, Student objects).

Execution

To introduce the concept of conditional frequency, we must first discuss the concept of conditional probability. For example, if I have two fair dice and I want to describe all the possible results, I can use this notation:

S = {(i, j) : i = 1,2,3,4,5,6 , j = 1,2,3,4,5,6}

where i is the result of the first die and j is the result of the second one. Since the dice are fair, every combination (i, j) has probability 1/36. Suppose now that the first die shows 3, so we fix i = 3 and ask ourselves: what is the probability that the sum of the results is 8? With i fixed at 3, there are 6 possible results: (3,1), (3,2), (3,3), (3,4), (3,5), (3,6). This means that if i = 3 each of these events has probability 1/6, while the other 30 initial events have probability 0. Then we can conclude that the probability that i + j = 8 when i = 3 is just 1/6.

If we now call E the event "i + j = 8" and F the event "i = 3", then what we have just calculated is the conditional probability of E given F, and we denote it P(E|F). We can also say that if F has occurred, then for E to occur the outcome must belong to E AND F (where AND is the intersection). Since F has actually happened, it becomes the new set of possible outcomes.
So the conditional probability is the ratio of the probability of E AND F to the probability of F:

P(E|F) = P(E AND F) / P(F)

Given these definitions, the conditional frequency is the number of occurrences of the event E among the occurrences of the event F, and it has the same formula as the conditional probability (with relative frequencies in place of probabilities).

Insights

Research the various approaches to defining the concept of probability, the 'Kolmogorov axioms' and…
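The dice example above can be checked by brute-force enumeration over the 36 equally likely outcomes. A minimal sketch (the function name is mine, not from the post):

```python
from fractions import Fraction

def conditional_probability(event_e, event_f, sample_space):
    """P(E|F) = P(E AND F) / P(F), assuming equally likely outcomes."""
    f_outcomes = [s for s in sample_space if event_f(s)]
    e_and_f = [s for s in f_outcomes if event_e(s)]
    return Fraction(len(e_and_f), len(f_outcomes))

# Sample space of two dice: 36 equally likely pairs (i, j)
space = [(i, j) for i in range(1, 7) for j in range(1, 7)]

E = lambda s: s[0] + s[1] == 8   # "the sum is 8"
F = lambda s: s[0] == 3          # "the first die shows 3"

print(conditional_probability(E, F, space))  # 1/6, as computed in the text
```

Only the pair (3, 5) lies in E AND F, while F contains 6 pairs, hence 1/6.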

Statistics and range applications

Request

Definition of statistics and range of applications. Basic notions and definitions: population, statistical unit, attributes, observations, dataset. The concept of scale (or levels) of measurement.

Execution

Statistics is the science of drawing conclusions from experimental data. A typical statistical scenario is when we want to study a very large set of items, each with some associated measurable values, called attributes. This enormous set is called the "population", and each of its elements is a statistical unit. The statistical approach to these problems consists in selecting a reduced subset of the population, called a sample (in Italian, "campione"). From this sample, through observations that record the pieces of information we want to study, we obtain a dataset, a set of measured values, and we can study it to draw valid conclusions about the whole population.

An implicit hypothesis we have to make is that there is a probability distribution over the population, so that our data are independent values drawn from that distribution. By the result of the CLT (Central Limit Theorem), one of the main difficult points of statistical research is the size of the sample: the larger the sample, the more accurate and reliable the study. The attributes themselves are classified by their scale (or level) of measurement: nominal, ordinal, interval, or ratio.

Insight

Most used programming languages within VS.NET; main similarities, differences, and comparison of languages; online translator.

C# (the main .NET language) is built for the .NET core and infrastructure, but it can easily be ported almost everywhere; its syntax is based on the C family, so it looks a lot like C and C++, and it is object-oriented. Here it is possible to find some comparisons: "Java vs .NET explained with cats", "Python vs .NET", "NodeJS vs .NET vs Spring". Here is the online translator from VB.NET to C#.

Application

Discuss possible differences between VB.NET and C#: VB.NET does not delimit blocks with curly braces, so correct indentation and block-ending keywords are important. The two languages have different but similar syntax.
Both need the .NET runtime to run and both are compiled to the same intermediate language: C# with the csc compiler, VB.NET with vbc.