Realm Of Randomness

December 8, 2007

Are birthdays evenly distributed across the year? *

Filed under: Academic,Freakonomics,Musing — Randomizer @ 12:46 am

[* Two updates appended at the bottom]

It started off as a wasteful exercise by Sharath, who tracked birthdays on Orkut for two weeks just to see whether he knew at least one person born on each day of the year. Something I noticed in his sample, though, was that he had about 2 birthdays per day … which means he should have roughly 365 × 2 = 730 friends on his Orkut list. Well, guess what, he has 736 friends on his list! That suggests birthdays are, at least roughly, evenly distributed across the year.
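As a rough aside, the two numbers hiding in that paragraph can be checked in a few lines. The sketch below assumes birthdays are independent and uniform over 365 days (leap days ignored), which is exactly the assumption this post goes on to question. The function names are illustrative only.

```python
DAYS = 365

def expected_uncovered_days(friends, days=DAYS):
    """Expected number of days on which none of `friends` has a birthday,
    assuming each birthday is independent and uniform over `days`."""
    return days * (1 - 1 / days) ** friends

def expected_friends_to_cover_all(days=DAYS):
    """Coupon-collector expectation: days * (1 + 1/2 + ... + 1/days)."""
    return days * sum(1 / k for k in range(1, days + 1))

if __name__ == "__main__":
    print("friends implied by 2 birthdays/day:", 2 * DAYS)
    print("expected uncovered days with 736 friends:",
          round(expected_uncovered_days(736), 1))
    print("expected friends needed to cover every day:",
          round(expected_friends_to_cover_all()))
```

Under those assumptions, 736 friends would still be expected to leave a few dozen days of the year uncovered, so knowing someone born on every single day would take a considerably longer friends list.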

However, in my personal observation, I’ve seen more people born in the latter half of the year than in the first half. There always seemed to be a ton of birthdays around Oct/Nov rather than, say, April/May. He then went on to plot the distribution of birthdays in his class of ’97 and found 28 birthdays in the first half as opposed to 36 in the second – a small victory for my observation. I couldn’t wait to try this on more data, so I sampled my own class of ’98 – fairly simple, as we have a database on our Yahoo group – and these were the results:

births.jpg

The stats for my class are: 34 born in the first half, and 33 born in the second half – an almost even distribution. Well, so much for my lead :( . But I found an interesting article on the monthly distribution of births in rural India – and guess what ? There is a very clear bias for births in the second half. Which means – high rates of conception in December/January. The paper also refers to a similar increase in conception rates in the United States, attributed to the Christmas (Holiday?) season :) .
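Before reading much into either class, it helps to ask how often a fair coin would give a split at least this lopsided. Here is a minimal sketch of an exact two-sided binomial test. It assumes each person is equally likely to fall in either half (ignoring that the halves have 181 and 184 days), and the function name is made up for illustration.

```python
from math import comb

def two_sided_p(first_half, second_half):
    """Exact two-sided binomial p-value for the hypothesis that a birthday
    is equally likely to fall in either half of the year."""
    n = first_half + second_half
    k = min(first_half, second_half)
    # Number of outcomes at least as lopsided as the observed one, on one side.
    tail = sum(comb(n, i) for i in range(k + 1))
    # Double for the other side; cap at 1 for near-even splits.
    return min(1.0, 2 * tail / 2 ** n)

if __name__ == "__main__":
    print("class of '97 (28 vs 36):", round(two_sided_p(28, 36), 3))
    print("class of '98 (34 vs 33):", round(two_sided_p(34, 33), 3))
```

Both p-values come out well above the usual 0.05 threshold, so neither class is large enough to support or refute a second-half bias on its own.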

So what are the factors you think contribute to an uneven distribution ? Off the top of my head – I’d say

  • Astrological ‘luck’ periods – especially for couples in India
  • Admission to Kindergarten – I’d assume that more people would like their kids to *just make* the eligibility criteria. For instance, if the cut-off for admission to Kindergarten is ‘Those born in 95’, I’d think that people would think it is an advantage to be born in the latter half of ’95, say Nov-Dec, as they are almost a year younger than those born in Jan, yet they are on the same ‘level’ academically.
  • When you get married also influences birth – at least it used to :) .

What are the other reasons you think there are for an uneven distribution of births across a year? Have you noticed a trend amongst your friends?

On another note, I can probably explain why I *feel* more of my friends are born in the latter half of the year – it is simply because I was born in October, and it is easier for me to remember those born in Sep/Oct/Nov than those born in the earlier half of the year :). I’m not sure what bias to call it. Maybe you can help me with that too.

Update 1: A regular visitor and commenter at this site, Luciferratic, gives us some data from a much larger dataset. The chart is astonishing. There is either something wrong with the dataset, or May is the most romantic time of year in the Middle-East ! :)

middleeast.jpg

Update 2: Our instincts are right … there is a reason there are that many January-borns in the data set … Wanna brainstorm on this for a while ? So what you know as of now is : The dataset is huge, it has data of people of all ages, and they are mostly middle-eastern. Now: What possible reason can you think of for this spike in the dataset for January?

Spend some time on this… the solution is elusive but I’m sure someone should be able to figure this out!

.

.

.

.

Luciferratic answers this question (along with a bunch of unnecessary apologies!) in his comment here.

Taking January out of the graph, as we know its data is likely to be skewed, the graph is quite even, as follows. There still seems to be a bias towards the latter half though (Note: I have not corrected the percentage values in the graph, merely removed the January bar):

correctedbirths.jpg
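Since the percentages in the graph were not corrected after dropping January, here is a small sketch of how the remaining eleven months could be rescaled to sum to 100%. The figures are the ones from Luciferratic's first comment below. The function names are illustrative, and comparing per-month averages (five months against six) is just one simple way to look at the two halves.

```python
# Month of birth -> % of customers, as posted in Luciferratic's first comment.
SHARES = {
    "Jan": 23.6, "Feb": 6.81, "Mar": 7.09, "Apr": 6.38,
    "May": 5.81, "Jun": 6.06, "Jul": 6.83, "Aug": 7.55,
    "Sep": 6.80, "Oct": 7.23, "Nov": 7.36, "Dec": 8.49,
}

def drop_and_renormalise(shares, dropped="Jan"):
    """Remove one month and rescale the rest so they sum to 100%."""
    kept = {m: v for m, v in shares.items() if m != dropped}
    total = sum(kept.values())
    return {m: 100 * v / total for m, v in kept.items()}

def mean_share(shares, months):
    """Average share per month over the given months."""
    return sum(shares[m] for m in months) / len(months)

if __name__ == "__main__":
    rescaled = drop_and_renormalise(SHARES)
    for month, pct in rescaled.items():
        print(f"{month}: {pct:.2f}%")
    first = ["Feb", "Mar", "Apr", "May", "Jun"]          # January excluded
    second = ["Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]
    print("mean share Feb-Jun:", round(mean_share(rescaled, first), 2))
    print("mean share Jul-Dec:", round(mean_share(rescaled, second), 2))
```

On these figures the Jul-Dec months still average a larger share per month than Feb-Jun, which matches the lean towards the latter half noted above.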


23 Comments »

  1. […] Update: therandomizer with more data. […]

    Pingback by Birthday distribution « Epistles — December 8, 2007 @ 6:03 am | Reply

  2. …. Are u writing some kind of thesis on this or wat ??? [;)]

    Comment by :) — December 9, 2007 @ 10:30 pm | Reply

  3. @ :) : If you provide me some funding, I might :). ( And I don’t mean just your 2 cents on this )

    Comment by Randomizer — December 10, 2007 @ 1:04 am | Reply

  4. that’s what i was gonna say… someone’s already desperate for his next *fix* of thesis research..

    Comment by Anonymous — December 10, 2007 @ 3:29 am | Reply

  5. and that smart comment above is from muah

    Comment by caffeinator — December 10, 2007 @ 3:30 am | Reply

  6. Props to Sharath – I loved this post of yours and here’s why –

    1. It’s very demographic economics related and while it might be dull for a lot of people, there are some amazing inferences that can be established from such data.

    2. I took some time out to run such a query on my bank’s customer database (won’t tell you the sample size, but I think it’ll suffice to say it’s much much larger than yours and Sharath’s put together) and found out the following…

    Mth of birth    % of customers    Mth of conception
    Jan             23.6%             May
    Feb             6.81%             Jun
    Mar             7.09%             Jul
    Apr             6.38%             Aug
    May             5.81%             Sep
    Jun             6.06%             Oct
    Jul             6.83%             Nov
    Aug             7.55%             Dec
    Sep             6.80%             Jan
    Oct             7.23%             Feb
    Nov             7.36%             Mar
    Dec             8.49%             Apr

    Obviously the month of conception is just an assumption of being 9 mths prior to the birth (we do not capture such data ;))
    Talk about skewed – Looks like all everyone does in May is make babies!!! But that might just be what people in this part of the world do at that time, but I can’t figure out what’s so special about May.
    I believe there are a number of factors that influence the birth rate distribution across the year.
    – During my bachelor’s degree in India Economics 101 – we learnt about seasonal influences on birth rates. Historically this might have been a huge factor as people didn’t have heaters or warm fluffy duvets and I believe this would still apply to rural areas. You will see that birth rates spike 9 months after the winter period.
    – Another factor mentioned in the economics texts was ‘Entertainment’ – a lack of entertainment would have seen conception rates shoot up – monsoon periods when there were rampant blackouts due to power lines being down resulted in rural classes having nothing better to do besides marinating the leather rod.
    – Another side of the seasonal change aspect is the geographical location itself. The northern and southern hemispheres have seasonal changes at different times of the year. The seasons are offset by around half a year, so you would expect the seasonal spike in the south to come about six months later.
    – You mentioned astrology, but I’d like to say it’s more about the predominant religion in the region. For instance, Hindus don’t put the you know what in the you know where during certain times of the year. The same goes for Muslims – they don’t do the horizontal monster mash during Ramadan and I’m guessing Christians that abstain during Lent might not be getting any either. So depending on where you live you’d see birth rates slump 9 months after these periods.

    I guess if you live in a cosmopolitan area such as the US where there are immigrants from each and every walk of life, you are bound to have a more even distribution as compared to a rural location in India. This is more likely the reason why your class room data was evenly spread. The same goes for Sharath’s Orkut data as he probably has those 736 friends from all over the world and not just one city/town.

    Comment by Luciferratic — December 10, 2007 @ 7:12 am | Reply

  7. oh @:) ,,, that was me.. put the smiley in the wrong place. And why would anybody fund you to do a research on buddays man :)) … Imagine if somebody actually funds u to do that. Freaking CT

    Comment by salman — December 10, 2007 @ 1:45 pm | Reply

  8. @Luciferratic – Thanks ! :) I’ve appended your data to the original post. There is something definitely fishy going on though, for no amount of googling could give me any other data that pointed to a spike of births in January for people from the middle-east. Maybe you just discovered the secret formula ! All you need to do is go after the January-borns, and more likely than not, they will become your customer ;).

    Jokes apart, I’m disappointed that this exercise is a standard Econ 101 lesson :( .

    The Horizontal Monster Mash ? Is that what the kids are calling it these days ? ;)

    Comment by Randomizer — December 11, 2007 @ 12:16 am | Reply

  9. @Caffeinator : I’m still an official grad student, till this Friday :). So it’s all fair game.

    @Sallu : Fund me man ! :) Anyways, the pounds are kicking the dollar’s arse – so it won’t cost you much ! ;)

    Comment by Randomizer — December 11, 2007 @ 12:20 am | Reply

  10. […] December 10, 2007 Posted by Sharath Rao in statistics. trackback Luciferratic, a commenter on therandomizer’s  blog ( am seriously getting pissed with these pseudonames ) has access to a much larger dataset […]

    Pingback by Birthday distribution - Middle-east edition « Epistles — December 11, 2007 @ 2:56 am | Reply

  11. Randomizer – I have to agree with you, there definitely is something wrong as the birth rates cannot be skewed to this extent and I have been asking around. This is the first time anyone has tried to look at the portfolio on the basis of month of birth as it really has no relation to the risk profile of a customer.

    So, I asked around about possible reasons why the data could be so skewed. Apparently the data is correct, it’s the birth DATES that are wrong. And I DO apologize about this as I should have checked before posting such a reckless response.

    The country I reside in has had a national identification system in place only since 1950. Prior to that there was no official document used to identify birth dates! When the National ID came into play in 1950, most people didn’t know which month they were born in, and they were randomly assigned dates in January, more often than not the 1st of January! Yes I know it sounds stupid, but prior to 1950, people here were familiar with the Islamic Calendar (Hijri Calendar) and were not used to the Gregorian Calendar. A market norm is to use the date of birth as it appears on the National ID, and the result, as you have already seen, is that the data is skewed towards January for customers who were born before 1950.

    I apologize to you and to all of your readers for such a big mistake on my part. I also apologize to Sharath as I know he’s posted the same data on his blog as well.

    I ran the data again, but this time excluded anyone that was born prior to 1950 and while I understand all of your skepticism, the previous findings still hold, but with a far more reasonable spike in December and January.

    The revised findings are as follows:

    Jan> 10.94%
    Feb> 8.00%
    Mar> 8.31%
    Apr> 7.44%
    May> 6.74%
    Jun> 6.99%
    Jul> 7.89%
    Aug> 8.89%
    Sep> 7.96%
    Oct> 8.46%
    Nov> 8.49%
    Dec> 9.89%

    Once again, I do apologize for the goof up and I’m going to abstain from the blog world for a while – until all of you forget me and what I did.

    Comment by Luciferratic — December 11, 2007 @ 8:21 am | Reply

  12. @Luciferratic – That’s amazing, right ? :) … A trivial exercise such as this one and we ended up learning about the census system in the Middle East in the 1950’s ! I would have never thought of that, no matter how much I brainstormed.

    I shall revise the data, and no, there’s no need for any abstaining of any sort ! :)

    Comment by Randomizer — December 11, 2007 @ 1:27 pm | Reply

  13. @Luciferratic – Appended the new info learnt – I must say, this is a great brain-teaser :)

    Comment by Randomizer — December 12, 2007 @ 1:25 pm | Reply

  14. Looks like everyone is talking about B’days and their distribution. Have a look at this recent article at Freakonomics

    http://freakonomics.blogs.nytimes.com/2007/12/12/whats-the-significance-of-your-sign-a-guest-post/#more-2152

    Comment by Luciferratic — December 13, 2007 @ 7:02 am | Reply

  15. Nice article, that ! No doubt he’ll be able to statistically show that astrology is indeed bull … that was a really simple and smart way to go about doing it. :)

    Comment by Randomizer — December 17, 2007 @ 2:31 am | Reply

  16. […] something wrong with the dataset, or May is the most romantic time of year in the Middle-East ! Are birthdays evenly distributed across the year? * Realm Of Randomness __________________ Come warm yourself beside the fire and hear of days of […]

    Pingback by Are more people born during specific times of the year? - Science Forums — July 13, 2008 @ 1:29 am | Reply

  17. I live in the Southern hemisphere, and noticed a peak in birthdays in June and July, which ties in nicely with the increase (even if corrections are done for assigned birthdays) in December and January in the Middle East.

    Comment by mynah — July 13, 2008 @ 2:27 am | Reply

  18. I have friends from Somalia who simply do not know their birthdates (although they do know what day of the week each was born). So they put “January 1” on all forms as their birthday. So I wasn’t surprised at all to see that spike in January–it simply reflects a lack of knowledge of that date.

    I was curious about a daily birth rate chart, to see if things like Valentine’s Day had a corresponding spike distributed 9 months later.

    Comment by Ranbo — May 18, 2009 @ 5:36 pm | Reply

  19. Hack again?!

    Comment by iracruz — October 28, 2010 @ 5:03 am | Reply

  20. […] this blog post and it’s pretty interesting.  Birthdays are not randomly distributed.  As a matter of fact, […]

    Pingback by Are Birthday’s Randomly Distributed? | Points and Figures — May 8, 2011 @ 6:20 pm | Reply

  21. Does anybody know about any empirical/statistical test of numerology?

    Comment by terdon — April 28, 2013 @ 10:19 am | Reply

  22. The population that makes up the difference between the norm and the January figure doesn’t do birthdays. They often use 1 Jan by default.

    Comment by Ron — May 9, 2013 @ 8:39 pm | Reply

  23. Excellent weblog here! Also your web site lots up fast!

    What host are you the usage of? Can I am getting your associate hyperlink to
    your host? I want my site loaded up as quickly as yours lol

    Comment by Drone Shadow strike hack — September 26, 2015 @ 1:48 pm | Reply

