A study of production and consumption of information graphics in regional newspapers in India

About this project
This work involved collecting infographics data for the month of December 2017 from the Delhi editions of the top 4 Hindi newspapers and the top 4 English newspapers in India. The data is organised by newspaper and by date, and each infographic is described by parameters such as the content it depicts, the location of the news item in the newspaper, its treatment, and the area it covers on a page. We propose a tentative data model for systematically collecting this otherwise unusable infographics data in a database, ready for analysis and for deriving insights. The model can be used to gather similar data from newspapers printed in India in other regional languages (Bangla, Tamil, Malayalam, etc.).
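As a minimal sketch of what one record in such a data model might look like, the Python below encodes the parameters listed above as a single typed row. The class name, the field names, and the example values are all assumptions made for illustration; they are not the schema used in the project.

```python
from dataclasses import dataclass, asdict
from datetime import date


@dataclass(frozen=True)
class InfographicRecord:
    """One infographic observed in one newspaper issue (hypothetical schema)."""
    newspaper: str         # e.g. "Times of India"
    language: str          # "Hindi" or "English"
    published: date        # issue date
    content: str           # what the graphic depicts, e.g. "sports"
    page: int              # location of the news item in the paper
    treatment: str         # visual form, e.g. "bar chart"
    area_fraction: float   # share of the page covered, in (0, 1]

    def __post_init__(self):
        # Reject areas that cannot be a fraction of a single page.
        if not 0.0 < self.area_fraction <= 1.0:
            raise ValueError("area_fraction must be in (0, 1]")


# Illustrative values only, not taken from the collected data.
example = InfographicRecord(
    newspaper="Times of India",
    language="English",
    published=date(2017, 12, 1),
    content="sports",
    page=18,
    treatment="bar chart",
    area_fraction=0.25,
)

# A flat dict like this maps directly onto one database row.
row = asdict(example)
```

Keeping every parameter as a separate column is what would later allow the records to be filtered and counted by newspaper, date, content, or treatment.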

What does this data tell us in the present?
We extracted some basic statistics from the data collected for each newspaper, which can be accessed from the navigation pane at the top. These statistics describe the different variables, their types, and how frequently each value occurs in a newspaper (for example, "what percentage of visualisations in Times of India were sports-based" or "what percentage of visualisations in Dainik Bhaskar were bar charts"). The Pearson and Spearman correlation graphs at the end of each page show how the different variables are correlated. These statistics, however, do not tell us how the variables depend on one another. Future work includes demonstrating this interdependence, which would allow us to analyse richer details such as "how many sports visualisations in Times of India were bar charts". The collected data can be analysed here. To view the repository with links to the collected spreads, see the data on Drive.
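The difference between the single-variable percentages reported now and the combined questions left for future work can be sketched as follows. The toy records, the tuple layout, and the function names `share` and `cross_tab` are assumptions for illustration only; the rows are invented and are not drawn from the collected data.

```python
from collections import Counter

# Toy rows of (newspaper, content, treatment). Invented for illustration.
records = [
    ("Times of India", "sports", "bar chart"),
    ("Times of India", "sports", "map"),
    ("Times of India", "politics", "bar chart"),
    ("Dainik Bhaskar", "business", "bar chart"),
]


def share(rows, newspaper, index, value):
    """Percentage of a newspaper's rows whose field at `index` equals `value`."""
    own = [r for r in rows if r[0] == newspaper]
    if not own:
        return 0.0
    hits = sum(1 for r in own if r[index] == value)
    return 100.0 * hits / len(own)


def cross_tab(rows, newspaper):
    """Count each (content, treatment) pair for one newspaper."""
    return Counter((c, t) for n, c, t in rows if n == newspaper)


# Single-variable statistic: share of sports-based visualisations.
sports_share = share(records, "Times of India", 1, "sports")

# Two-variable statistic: how many sports visualisations were bar charts.
joint = cross_tab(records, "Times of India")
sports_bar_charts = joint[("sports", "bar chart")]
```

The first function answers questions about one variable at a time, while the cross-tabulation counts pairs of values, which is the kind of detail the per-variable frequencies alone cannot provide.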

See the Report PDF
See the Presentation PDF
See the data on Drive