Sample size and time the sample is taken can also have drastic consequences on the outcome of the study.
I do a number of studies against the North Carolina public offender database. Take something like prostitution. When you plot the crimes you find bands forming: arrests plummet during the winter months and sharply rise when temperatures are warmest. A summer-only study would give a false positive for the winter months... and vice versa. So the whole of the year needs to be taken into consideration to show that there is a decline and to generate a hypothesis as to why it might be occurring. Compare it against a state like Florida and you don't get that banding. And looking at a state like Nevada you need to look at counties, since in some of them it's legal.

Polling prostitutes serving prison sentences and looking at the states they're from, you might conclude that Florida has the larger prostitution problem because they're active all year round. But what might be missed is that Nevada has a larger number of escorts and prostitutes, and that North Carolina generates another problem altogether in the off season: a spurious increase in sexual assault in the winter months when the prostitutes aren't available. So if you merely studied the North Carolina data and promoted it for the whole country you'd have a seriously incorrect skew and an unobjective analysis. The same goes for studies done solely in Minnesota being applied to the world... it quickly becomes apparent why the single-state study process is a bad practice.
It's like looking at speeding tickets in the state over specific weeks and time periods. As a whole, the first and last Sunday of each month produce the highest ticket counts. This would suggest to some that there's a quota. Slice the break-in data finer and you get a funky stat like no B&E having occurred in the state on a Tuesday between 2 and 3 a.m. I don't remember if these are all exactly correct, it's been a while since I looked at my data, but these are examples of things that tip me off in the back of my mind. I'm working with some 2.7 million records. Seems like a lot, but when you start splitting by name, date of birth, and crime by year... and hoping for an n=5 for a given day of the year... your numbers get really small really fast. The more comparisons you're checking, the vastly greater the sample size needs to be to meet the minimum requirements.
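The arithmetic behind that thinning-out is easy to sketch. Here's a minimal back-of-the-envelope calculation in Python; the counts for distinct names, birth years, and offense types are assumptions for illustration, not figures from the actual database:

```python
# Hypothetical illustration: even a "large" dataset thins out fast
# once you cross-tabulate it. Start from 2.7 million records split
# by name, birth year, offense type, and day of the year.
total_records = 2_700_000
distinct_names = 500_000   # assumed value for illustration
birth_years = 60           # assumed span of birth years
offense_types = 50         # assumed number of offense codes
days_per_year = 365

# Number of possible cross-tabulation cells, and the average
# number of records that would land in each one.
cells = distinct_names * birth_years * offense_types * days_per_year
avg_per_cell = total_records / cells
print(f"{cells:,} possible cells, ~{avg_per_cell:.6f} records each")
```

With these assumed counts the average cell holds a tiny fraction of one record, nowhere near the n=5 hoped for, so almost every day-level subgroup is empty or a fluke.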
In the case of these tests using N=30 and grouping all violent crimes, or all classes of a mental illness in a given schedule, you get an even smaller subset once you account for the test covering 10 segments: you're left with a nominal N=3 per subject segment.
They claim high accuracy. I remember reading through the collected works of C. G. Jung in his early type indication studies, where his N=2 for Introverted... arguably it may have been N=1.
Some FDA drug testing can be as little as N=10. That meets the double-blind minimum of 5/5 classed as the minimum needed for an accurate statistical measure. This to me is a joke... that 10 people can represent 7 billion.
Given that the MMPI-2 can be used to gauge viability for a security clearance, this is an interesting article on malignant attempts to fake out the MMPI and MMPI-2... using a semi-decent sample size in comparison:
http://maamodt.asp.radford.edu/Research%20-%20Forensic/97%2012-2-Aamodt-42-47.pdf

There have been a lot of studies conducted on the conclusiveness of these tests, and they all basically conclude that a lot more research needs to be done before serious conclusions can be made as to the validity of the tests.
Just for kicks I checked the 1979 mortality data from the CDC and they still haven't corrected it. On August 10, 2010 I sent them a fix, and it's still not done. That's a good case of a larger sample size generating a problem: if you looked at all the records in that file, because of an error in the data you'd get 20k worth of babies dying in mining accidents. This illustrates another important problem: a lot of the data used in these studies is inaccurate or incorrect.
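Errors like that are exactly the kind you can catch with a cheap sanity check before running any analysis. Here's a minimal sketch; the field names, cutoff age, and sample records are all hypothetical, not the CDC file's actual layout:

```python
# Hypothetical sanity check: flag records where the age and the
# cause-of-death category are an impossible combination, e.g.
# infants listed under occupational causes like mining accidents.
MIN_WORKING_AGE = 14                  # assumed cutoff for occupational causes
occupational_causes = {"mining accident"}

def suspicious(record):
    """True if the cause is occupational but the age makes that impossible."""
    return (record["cause"] in occupational_causes
            and record["age"] < MIN_WORKING_AGE)

# Toy records standing in for the real file (made-up data).
records = [
    {"age": 0,  "cause": "mining accident"},  # coding error -> flagged
    {"age": 34, "cause": "mining accident"},  # plausible
    {"age": 0,  "cause": "congenital"},       # plausible
]

flagged = [r for r in records if suspicious(r)]
print(f"{len(flagged)} suspicious record(s) out of {len(records)}")
```

Running a pass like this over the whole file would surface the bad records for review instead of silently feeding them into the study's totals.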
I wish there were something more conclusive, but right now there's not, and I don't think there will be. If you want stat testing, I'd argue banking corporations have compiled the largest amount of data over the years. I know the MBTI was compiled largely based on the banking surveys, or at least followed the model of the banking surveys. That doesn't mean the MBTI, or the Keirsey Temperament Sorter, or any of these other tests are accurate. Again, study upon study, I conclude the same thing... you've got as good a chance of having your palm read to determine what's going on as you do with these tests sometimes. It's pseudo-science at its finest.
Need serious ZZZ's... Be well everyone... Interesting discussion.
just me... trying to be... something more than I was yesterday. be well everyone.