Students' Emerging Reasoning about Data Tables of Large-Scale Data

This study investigated the reasoning of thirty-two Year 9 secondary school students (15-year-olds) about data tables of large-scale data. Eight groups of four students, drawn from six classes, participated in a workshop that examined the components of population change for EU and candidate countries, namely natural increase of population, net overseas migration for Europe and their country, and total population growth. Students investigated trends in real data displayed in tables and responded to a set of reflective questions. Analysis of the reasoning used by the students revealed four levels of data-table comprehension (reading the data, reading within the data, reading beyond the data, and reading behind the data), similar to the levels described for students working with smaller data sets.


Introduction
Developments enabled by novel technologies have completely altered the ways in which citizens can access data. Citizens now have access to an enormous amount of numerical information, and the data revolution is giving rise to emerging data sources that provide new sorts of evidence used to influence public opinion. Three emerging trends have shaped this revolution in our increasingly data-driven society: 1) the increasing use of large-scale databases within the open data movement, 2) the growing use of big data, and 3) novel ways of visualising data.
The three emerging trends that characterise the revolution of our increasingly data-driven society offer considerable promise in enhancing people's understanding of complex scientific and societal issues, such as political and organizational change and immigration. The data revolution is also having a profound impact on the teaching of statistics.
The statistics community faces not merely the challenge of educating students to become competent explorers of large-scale authentic data on a huge variety of socially important topics; it faces the challenge of educating an entire population about difficult statistical tasks, including interpreting multivariate data sets and drawing conclusions about samples of large-scale data.
My purpose in this article is to report an empirical study that identifies more clearly the patterns of reasoning used by students interacting with large-scale data tables. The article brings together key ideas from various perspectives, going beyond several earlier reviews of the literature, to identify critical factors that influence secondary school students' reasoning about large-scale data organised in tables and to enable an in-depth analysis of key aspects of participants' reasoning. I also provide some recommendations for instruction and future research.

Theoretical Framework
The expanding use of large-scale data for prediction and decision-making in almost all domains of life makes it a priority for school mathematics curricula worldwide to help students develop their understanding of key statistical ideas prior to entering college. This includes understanding of data tables, which is a core aspect of statistics, essential to conducting meaningful data analysis. Tables, graphs, and other data displays are used broadly in the media to present, disseminate, and explain information, so students need to be able to read and interpret them in meaningful ways.
A number of research studies have shown that learners have particular difficulty in drawing inferences from tables and graphs in order to interpret the data and make predictions (e.g., Bright and Friel, 1998; Estepa, Batanero, and Sanchez, 1999; Pereira-Mendoza and Mellor, 1991; Sharma, 1997). For example, when interpreting data, one usually compares different data sets presented in graphs and tables in order to make predictions about an unknown case, to generalize to a population, or to discern a trend.
There appears to be little research on learners' comprehension of tables, despite the pervasive use of data tables in statistical data analysis and in statistics textbooks. The limited existing literature in statistics education addresses table learning in children (Brizuela and Lara-Roth, 2002; Ben-Zvi and Sharett-Amir, 2005; Marti, 2009; Brizuela and Alvarado, 2010; Gabucio et al., 2010; Marti et al., 2010). Brizuela and Lara-Roth (2002) showed that 7-year-old students who had not received direct instruction in the use and configuration of tables could use information from a table to work on a problem. The tables in Brizuela and Lara-Roth's study were produced by the primary students without any specific structure being imposed on them.
Estrella and Mena (2014) investigated primary-school children's comprehension of statistical frequency tables, when the students produced more tables while trying to analyse some data. They identified different levels of conceptualization of tables in these students, such as text lists with and without counting, tables with icons with and without counting, tables with text with and without counting, and tables with text without individual counts but with marginal totals. These conceptualizations allowed Estrella and Mena (2014) to explore how students register data in a table, count in a table, and list the elements belonging to a class, using partitioning, equivalence relations, and counting to order data, to place data in rows, columns, and cells, and to use written language to label headings.
Kemp and Kissane (2010) described a five-step framework to help both teachers and students in primary, secondary, and tertiary mathematics education interpret data in the form of tables or graphs. The framework provides a progression from simple numerical reading of a table to the more complex interpretations of tables and graphs required for a better understanding of data in their context.
Another classification of patterns of reading and comprehending graphs was developed by Curcio (1987), who assessed fourth and seventh grade students' interpretations of school graphs. From the analysis of the students' responses, Curcio developed a framework for assessing and building learners' graphical comprehension that has three levels: reading the data, reading between the data (called reading within the data in the present study), and reading beyond the data. These three related components of graph comprehension comprise the classical Curcio schema:
1. "Reading the data," which involves answering explicit questions for which the obvious answer is right there in the graph (e.g., What is the least preferable means of transportation of students who travelled to school on Mondays?).
2. "Reading between the data," which involves finding relationships in the data presented in a graph by making comparisons (e.g., Is the number of students who travel to school by car on Mondays the same as the number of students who travel to school by bus?).
3. "Reading beyond the data," which involves extrapolating, predicting, or inferring from the representation to answer implicit questions (e.g., If we ask the students of ten schools how they get to school on Mondays, how many students who travel by bus might we expect to find?).
This framework has helped statistics educators in building instructional strategies for facilitating student understanding of different graphical representations.
Although Curcio's framework (1987) has undoubtedly made a very important contribution to understanding the processes involved in the interpretation of graphical representations, it has been criticized in recent years for limiting its investigation to the kinds of graphs used in school contexts (e.g., simplistic tables of limited purpose in real life; cf. Monteiro & Ainley, 2007), and hence for restricting the range of situations to which the interpretation of data tables and graphs is applied (Sharma, 2013). According to Sharma (2013), Curcio did not investigate how students evaluated and critically commented on information displayed in tables and graphs.
Additionally, the questions in Curcio's research did not give students any opportunity to explain their choices, and hence offered no insight into students' thinking.
A more elaborate framework given by Friel, Curcio, and Bright (2001) developed the original Curcio framework, splitting each stage into two parts to describe more precisely the behaviours associated with graph comprehension:
1) recognizing the components of graphs, the interrelationships of these components, and the impact of these components on the graphical presentation of information (reading the data);
2) speaking the language of particular graphs when reasoning about the information displayed in graphs (reading the data);
3) understanding the relationships among graphs (reading within the data);
4) making sense of a graph (reading within the data);
5) interpreting information in a graph (reading beyond the data);
6) recognizing if the graph is appropriate (reading beyond the data).
Monteiro and Ainley (2007) argue that familiarity with the above components is not sufficient to ensure understanding of specific graphs. They claim that context may be the key factor in the comprehension of graphs. According to Monteiro and Ainley, data displays used for analytical purposes are predominantly tools for the detection of important or unusual features in the data. Graphs used for communication, on the other hand, are defined as pictures intended to convey information about numbers and relationships among numbers. The authors add that the kinds of "school graphs" used in Friel et al.'s study (for example, displaying information about "the number of letters in students' names" or "how many raisins are in various boxes") have limited purpose in terms of analysing or communicating information that relates to interesting problems. Moreover, Monteiro and Ainley state that in the specific examples used by Friel et al., the term looking beyond the data does not imply a need to look critically at the data and ask worrying questions (see Gal, 2004). Indeed, we might look beyond the data (extrapolating, predicting, or inferring from the representation) without being prompted to question the main idea presented in the data display.
In short, Monteiro and Ainley (2007) criticized the Friel, Curcio, and Bright (2001) framework on the grounds that learners' familiarity with its components is not sufficient to ensure understanding of data displays, and that context might be the most important factor in graph comprehension. Recognizing the important role of context in statistical analysis, Shaughnessy (2007) added a fourth level beyond Curcio's three levels of graph comprehension: "reading behind the data or graph," which emphasizes the need to interpret data displays based on the context and situation underlying the construction of the graph. But Shaughnessy's work, too, was done with simple data tables like those found in textbooks; in this study we examine whether similar patterns of reasoning hold with more complex data tables.

Methodology
This study uses qualitative analysis of data collected during a workshop to examine students' reasoning about large-scale data.

Sample
The research study involves two schools in Cyprus. Eight groups, each of four Year 9 students, were drawn from six different classes in the same school (N = 32) to participate in a workshop that examined the components of population change in the EU and member countries. The mathematics teachers of each class selected students who came from the same class, so interpersonal relationships had already been established prior to the research. The teachers were asked to choose articulate students who would have no difficulty in setting up a friendly group. Each group was selected by the teacher so as to include two girls and two boys who, in the teacher's assessment, were of "middle" attainment in mathematics. The researcher spent one 100-minute double period with each group. Scripts and rough working sheets were collected from each group, and the written reflections of each group were included in the data. The researcher also made field notes during and immediately after students' engagement with the workshop.

Instrument
A statistics-learning situation was implemented with paper and pencil during a workshop designed to provide opportunities for students to engage in investigating real data published by EUROSTAT, the statistical office of the European Union situated in Luxembourg.
The students were provided with the workshop sheet (Appendix A) and were asked to examine five data tables taken from the EUROSTAT website, representing number of live births (Appendix B), crude rates of population change (Appendix C), immigration rates (Appendix D), emigration rates (Appendix E), and population by citizenship-Foreigners (Appendix F). The students did not have direct access to computers or mobile devices, so I presented the five tables to them on paper.
The workshop called for participants to examine the components of population change in the EU and candidate countries, namely natural increase of population, net overseas migration for Europe and their country, and total population growth.
The workshop included reflective questions designed to prompt students to pause and reflect on the data tables, seeking interesting aspects of the tables such as possible reasons for the higher rate of natural increase in some of these countries, or variations in growth rates among different countries, and to discuss their observations. The questions provided opportunities to query students about the reasons underlying their reasoning, and thus to gain insight into students' ways of thinking. In particular, the participants in the workshop were asked to examine the data tables and complete the following tasks in the order presented:
1) Compare the indicators of population change in Cyprus: number of live births [1], crude rate of population change [2], immigration [3], emigration [4], and population by citizenship-Foreigners [5].
2) Discuss and explain their observations regarding growth in Cyprus from 2001 to 2012, including possible social, historical, environmental, economic, and political factors that might have caused this change. Identify and justify the dominant factors.
3) Identify the European countries that have a net loss of migrants, explain why these countries may be experiencing that loss, and identify and reason about the dominant social, historical, environmental, economic, and political factors that might have influenced the change of the population.
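Task 3 turns on net migration, that is, immigration minus emigration: a country has a net loss of migrants when more people leave than arrive. A minimal sketch of that check is given below. The country labels and figures are hypothetical placeholders, not Eurostat values, and the function names are introduced only for illustration.

```python
def net_migration(immigration: int, emigration: int) -> int:
    """Net migration for one country and year: arrivals minus departures."""
    return immigration - emigration


def countries_with_net_loss(flows: dict[str, tuple[int, int]]) -> list[str]:
    """Return the countries whose emigration exceeds their immigration."""
    return sorted(
        country
        for country, (immigration, emigration) in flows.items()
        if net_migration(immigration, emigration) < 0
    )


# Hypothetical (immigration, emigration) pairs, NOT taken from the Eurostat tables.
flows = {
    "Country A": (12_000, 9_500),
    "Country B": (4_000, 7_200),
    "Country C": (30_000, 30_000),
}

print(countries_with_net_loss(flows))
```

Only the sign of the difference matters for this task; explaining why a country shows a net loss still requires the contextual reasoning that the task asks of the students.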
In this paper, we report on students' written answers to question three, supplemented by observations of the students working in groups and discussions with the students to clarify their reasoning. During the interviews, students presented and discussed the conclusions they had drawn regarding the tasks. In this way we obtained further clarification of the nature and type of reasoning they used and the difficulties that emerged during the reflective activity. As mentioned earlier, the purpose of this paper is to elaborate more precisely the nature of students' reasoning about large-scale data presented in tables, to enable an in-depth analysis of key aspects of participants' reasoning.
[1] Eurostat defines the number of live births as the number of births of children that showed any sign of life (total births minus stillbirths).
[2] Eurostat defines the crude rate of population change as the ratio of the population change during the year to the average population in that year. The value is expressed per 1000 inhabitants. Population change is the difference between the population sizes on 1 January of two consecutive years.
[3] Eurostat defines immigration as the total number of long-term immigrants into the reporting country during the reference year.
[4] Eurostat defines emigration as the total number of long-term emigrants from the reporting country during the reference year.
[5] Eurostat defines population by citizenship-Foreigners as the total number of foreigners residing in the country, including citizens of other EU Member States and non-EU citizens, usually resident in the reporting country.
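The definition of the crude rate of population change given in the notes above can be written out as a short calculation. The sketch below is illustrative only: it assumes that the average population is approximated by the mean of the two 1 January populations (the notes do not say how the average is computed), and the function name and the figures are hypothetical.

```python
def crude_rate_of_change(pop_start: float, pop_end: float) -> float:
    """Crude rate of population change, per 1000 inhabitants.

    pop_start and pop_end are the populations on 1 January of two
    consecutive years.
    """
    change = pop_end - pop_start
    # Assumption: average population = mean of the two 1 January counts.
    average = (pop_start + pop_end) / 2
    return 1000 * change / average


# Hypothetical populations, not Eurostat figures.
growing = crude_rate_of_change(1_000_000, 1_010_000)
shrinking = crude_rate_of_change(1_000_000, 990_000)
print(round(growing, 2), round(shrinking, 2))
```

Because the rate is expressed per 1000 inhabitants, a positive value indicates that the population grew over the year and a negative value indicates that it shrank.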

Analysis
Participants' reflective responses to questions, in conjunction with the working sheets from each group and researcher's field notes, were analysed at the macro level to identify episodes of students' reasoning while examining the data tables.
Each episode was coded based on common elements of participants' reasoning and then subjected to microanalysis to see whether the reasoning had shared characteristics. Finally, the analysis identified typical instances of students' reflective activity in the workshop that capture each category of reasoning and the competencies that underpin it.

Results
The results are presented according to the four organizational categories defined by Shaughnessy (2007), which guided this study. For each category, an episode from the data that is representative of the category is presented.

First category: Reading the data
Students whose responses fell into this category appeared to be chiefly confined to reading the data in order to report the variations of change seen in the rows of the tables of emigration versus time and number of live births versus time. They did not seem to understand the deep structure of the data in their totality, for example by making comparisons among the countries that they had chosen.

Second category: Reading within the data
This type of interaction occurs when students are interpolating and finding relationships in the data while reasoning about the information displayed in tables. A second group of students read the data displayed in columns of the tables (vertical reading), comparing the variations in population change among different countries from 2001 to 2012:
Group 2: In 2001, Liechtenstein had the highest crude rate of change (19.9%), then Ireland followed with 17.3%, Turkey (13.8%), Spain (13.7%), and Cyprus (11.4%). Luxembourg also had the same crude rate (11.4%) as Cyprus. When we looked at the data for 2012, the crude rate of population change of Liechtenstein had decreased to 9.9%, while Ireland's crude rate had decreased to 1.8%.
This group of students observed the data in columns (vertical reading) to identify the country that had the highest value for a certain year. They usually combined the vertical reading with reading the data in rows (horizontal reading) in order to compare data between different countries (vertical reading) and within a country for different years (horizontal reading). Although they made correct comparisons within the data tables, the students did not make any reference to the contextual factors that affected the reported population change.
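The distinction between vertical and horizontal reading can be made concrete with a small sketch. It uses only the values quoted by Group 2; all other cells of the table are omitted, and the function names are hypothetical, introduced here purely for illustration.

```python
# Crude rates of population change as quoted by Group 2.
# Rows are countries, columns are years; unquoted cells are left out.
table = {
    "Liechtenstein": {2001: 19.9, 2012: 9.9},
    "Ireland": {2001: 17.3, 2012: 1.8},
    "Turkey": {2001: 13.8},
    "Spain": {2001: 13.7},
    "Cyprus": {2001: 11.4},
    "Luxembourg": {2001: 11.4},
}


def read_column(table, year):
    """Vertical reading: rank the countries by their value in one year."""
    column = {country: row[year] for country, row in table.items() if year in row}
    return sorted(column.items(), key=lambda item: item[1], reverse=True)


def read_row(table, country, start, end):
    """Horizontal reading: change within one country between two years."""
    row = table[country]
    return round(row[end] - row[start], 1)


ranking_2001 = read_column(table, 2001)
print(ranking_2001[0])                          # highest value in the 2001 column
print(read_row(table, "Ireland", 2001, 2012))   # change along Ireland's row
```

Both functions stay inside the table: neither draws on contextual knowledge, which is what separates this category from the two that follow.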
Third category: Reading beyond the data
This category is concerned with extrapolating, predicting, or inferring from the data table to answer implicit questions. Some groups of students attended to different variables of the data and seemed able to integrate the information provided by those variables:
Group 4: In general, almost all the countries of Europe have been affected by the economic crisis. When we look at the table of the crude rate of population change, we observe that there is a decrease of the crude rate in almost all of the European countries. However, the crude rate of some countries decreased a lot. For example, we observe that after 2008 the crude rate of population in Greece decreased from 3.4 (in 2007) to -5.5 (in 2012). Similarly, Portugal's crude rate decreased from 2.0 (in 2008) to -5.2 (in 2012). UK's crude rate decreased from 8.1 (in 2008) to 6.3 (in 2012). However, in Romania, we observe an increase in the crude rate from -23.7 (in 2007) to -1.9 (in 2012).
In this category of reasoning, students seemed to pay attention to the entire distribution of data and then focused particularly on individual cases that exhibit distinctive variability in the measurement, providing appropriate qualitative inferences about the possible meaning of the data within their context. The students acknowledged, however, that the many factors affecting population change meant that they were not fully able to explain the observed changes in the variables presented in the data table. Other students engaged critically in a familiar context when they observed the data tables of immigration.
These students seemed to appreciate variation and to interpret qualitatively the existence of variation in context. They demonstrated awareness of relevant features of the table, and their reading of these features drew on both the data and the context. When using this type of reasoning, the students appeared able to focus on interpreting the data, and they based their answers exclusively on the different variables in the data tables.
Fourth category: Reading behind the data
In this type of reasoning, the students seemed to move beyond the data and attempted to give an answer that drew upon prior knowledge about issues directly related to the data presented in the tables. In such situations, students' reasoning related prior knowledge to components of open data tables, which allowed more complex inferences:
Group 5: Knowing that the economic crisis has been very intense for the following countries: Greece, Cyprus, Ireland, Italy, Portugal, Romania and Spain, when we observe the data table of emigration versus time, we understand that Spain was the first of these countries that began to suffer from the economic crisis, in 2003, when its citizens began to leave the country in an attempt to find work in other countries. Afterwards, the data tell us that the economic crisis affected Portugal in 2004, Ireland in 2006, and Italy in 2008. Later on, in 2009, the financial crisis influenced Greece. In 2010 Cyprus joined the group of the EU countries affected by the economic crisis. We can understand when one country has been affected by the EU economic crisis from the increase we observe in the Emigration data table, since the citizens of a country leave their country during an economic crisis with the intent to settle permanently in another country. However, this trend is not observed in Romania, because the emigration there is decreasing instead. We can observe a data table of another indicator for Romania to be able to tell when Romania was affected by the crisis. The data table of immigration vs. time for Romania shows that the number of immigrants decreased from 2008 to 2009, increased slightly in 2010, decreased slightly in 2011 and increased substantially in 2012. The data do not provide us with adequate evidence to deduce the effect of the financial crisis on Romania. We need to look at the data table of another indicator. The data from the table of the number of live births shows an increase from 2001 to 2012, so it is not clear when Romania was influenced by a rise in poverty. We should look at other data tables that can provide us with appropriate evidence to be able to draw any reliable conclusions. For example, we need to look at the table for the crude rate of population change for Romania. We look at it and we observe that the crude rate of population is increasing from 2007 (-23.7) to 2012 (-1.9). We do not have enough evidence to deduce whether Romania was one of the five countries of the EU that was most affected by the economic crisis.
Students in this group attempted to give an answer that integrates prior knowledge of issues directly related to the data tables. They seemed to understand the purpose of the data, and of the inferences made. These students used the relevant features of the data and background contextual knowledge, and utilized different tables of data of the same variable to appropriately answer a question. They acknowledged that the quantitative data included in a single table might not show a particular trend in the data, so that examination of other variables is required to get a more complete picture of the situation at hand.

Conclusions
The open data movement has provided unprecedented access to authentic, large-scale data sets on a wide range of socially important topics. Competent use of large-scale data predominantly requires comprehension of tables and other visual representations of statistical data, since these are routinely used in daily life and in the workplace to communicate information. Thus, statistics instruction at the school level should give more emphasis to enhancing students' comprehension and interpretation of large-scale data displayed in tables and graphs.
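In this paper, I investigated the emerging reasoning about data tables of a group of Year 9 secondary school students (15-year-olds) in Cyprus. The analysis revealed four levels of data-table comprehension.
Level 1 (reading the data) is confined to reading individual values in order to report the variations seen in the rows of a table, without making comparisons among countries.
Level 2, an intermediate reading (reading within the data), is focused on making comparisons of data between different countries and within a country for different years. Students at this level attend to one or more relevant aspects of the data but have difficulty in integrating those aspects into their context.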
Level 3, an overall reading (reading beyond the data), is characterised by interpreting the numerical values of the data and attempting to contextualise the data by providing qualitative interpretations of what might have affected the variation in data values. Additionally, students reading beyond the data begin to gain an awareness of how a few of the possible social, historical, environmental, economic and political factors might have caused similarities and/or differences in the data. The students of this group appeared to be aware that many complicated questions about data might be answered by examining data tables of different variables.
Level 4, an advanced reading (reading behind the data), is characterised by attempting to give an answer that takes into account prior knowledge about a question that is directly related to the data tables. In such situations, students' reasoning related to comprehension of the components of open data tables is characterised by inference from the data to develop answers to questions (e.g., we are aware of Europe's economic crisis, but we do not know the number of the countries that have been very badly affected by the economic crisis. Can you tell from the data tables of the given indicators of population change (number of live births, crude rate of population change, immigration, emigration, population by citizenship-Foreigners) which are these countries?).
Concurring with the findings of previous studies (Sharma, 2013), findings from the current study indicate that students' reasoning about large-scale data changes over time, through a natural developmental process, from reading data to interpreting data with respect to the data's context.
Furthermore, Sharma (2013) argues that a "number of research studies from different theoretical perspectives seem to show that students are particularly weak in drawing inferences and predicting from tables and graphs (e.g., Bright and Friel, 1998; Curcio, 1987; Estepa, Batanero, and Sanchez, 1999; Pereira-Mendoza and Mellor, 1991; Sharma, 1997)" (p. 52). This could be the result of the instructional neglect of concepts related to the interpretation of tabular representations in context. Student encounters with data tables in the mathematics classroom are restricted to some "school tables", which do not support developing understanding of complex and challenging tabular representations of authentic data such as those presented to students in the current study. The analysis of the results of this study does not suggest shortcomings of the participants in any meaningful way; it shows that some students reason in simpler patterns than others, but not in any way that allows us to generalize about the overall performance of students. The study is not set up to assess ability; it is set up to characterize patterns of reasoning.
Although the study has provided some valuable insights into students' conceptions of data tables, very little is still known about this important aspect of statistical reasoning. More research needs to be carried out to investigate and support comprehension of tables by students of different age groups and educational and cultural backgrounds. As the research literature tells us very little about how comprehension of data tables develops, a possible direction for future research is to find ways to scaffold students' learning in terms of reading and understanding tables, and connecting them with other numerical and graphical representations of data.
Another possible research direction is to study how contextual knowledge affects comprehension of data tables, and to find ways to help students relate information displayed in a table to the context of the situation. This is essential since, as shown in the current study, students' comprehension of tables and other data representations is reliant not only upon their understanding of the features of the visual display under study, but also on their prior knowledge of the context from which the presented data is drawn, as well as on their ability to utilize this contextual knowledge to make sense of the situation displayed in the table or chart.

Limitations of the Study
The study discussed in this paper involved a relatively small sample of students. I have reported the reasoning of only a few groups of students, chosen as the clearest illustrations of the emerging ideas. Even if it had been possible to analyse all of the data, and even if the examples presented are representative of the sample, the findings must be regarded as tentative because the sample was small and the results cannot be generalised. Nevertheless, the study opens up opportunities for future research at a macro level on students' reasoning about large-scale data displayed in tables.
Future research on students' emerging reasoning about large-scale data could build on this study. Implications for teaching and research are outlined below.

Implications for Teaching
The data revolution provides challenges and opportunities for statistics educators to educate an entire population and to create instructional materials for curricula that devote particular attention to engaging students with a broader variety of novel techniques that encourage the comprehension and interpretation of large-scale data displayed in tables and graphs. Comprehension of large-scale open data sets presented as two-dimensional tables (with both rows and columns) can be developed when students answer questions dealing with several variables. The exploration and interpretation of large-scale multivariate data sets (Ridgway, Nicholson, & McCusker, 2013) is very challenging. A wide range of visualisation tools (e.g., Gapminder) may help students to simplify multidimensional data sets and thus to interpret complex data sets. Prodromou (2013) argues that what is to be communicated to the student is not just the technique of partitioning the complex data set as a building block of process, but also the value of the final partitioning of the dimensions of a data set, when identified and explicitly labeled. Partitioning leads to a focus on a part or segment of the data. When this segment is rendered relatively homogeneous with respect to some features of the complete data, its internal complexity is reduced. Thus the selected data segment's own particular internal data patterns are more likely to emerge in any data summarising activity.
For statistics educators who teach big data, traditional methods and techniques for analysis of data cannot be applied, and novel methods must be developed through collaboration of statistics educators with computer scientists.
Fundamental ideas, such as data quality, the principles of measurement, and drawing inferences in the face of uncertainty, require particular attention. In addition, the habit of thinking from samples to inferences about populations was a function of a "small data" environment. At their core, big data and open data are about making generalisations and predictions, similar to making statistical inferences. For example, large-scale data may be used to predict consumers' future purchases based on their interactions at different sites on the internet, or even the performance of a stock market.
To provide effective instruction, teachers need to increase their knowledge of the three emerging trends that have impacted this revolution in our increasingly data-driven society and of how to teach these new trends.
Because the emphasis on large-scale data and data analysis is recent, these concerns have only recently become a necessity in the secondary school mathematics curriculum. Consequently, teachers may not have had adequate opportunities to learn about large-scale data. More visualisation tools need to be developed to fill this gap. But beyond the materials, thought should be given to how professional-development experiences can be structured so that teachers learn not only how to better interpret large-scale data displayed in tables and graphs, but also how to help students develop similar skills. In order to take into account the full complexity of data, we have to change the way we think about controlling and handling data. This view calls for another change to the constructs of statistical literacy (Gal, 2002) and the introduction of new constructs and principles needed for the revolution of data.
In order to immerse students in this new culture of data, it seems important to give students many opportunities to construct their own meaningful data visualizations that highlight emerging important aspects of data and promote their reasoning about covariation between multiple variables while using the cycle of inquiry and visual analysis (Prodromou, 2014). In particular, I think it will be helpful to encourage students to revisit their specific kinds of inferences while inventing and revising their visual representations of data. In this way they will be able to attend to the changing role of variables from one data visualisation to the next.

Implications for Future Research
The revolution of large-scale data challenges people to become better informed about the ways in which they can harness vast bodies of data rather than small data sets, and simultaneously harness the technology. This attention to graphical developments increases the need to research the psychological aspects of data visualisation. These understandings will provide us with feedback about how students reason when using graphical displays, what aspects of formal inference are needed given current visualisation tools, and which methods foster students' ability to understand conventional formal conceptions and characteristics of large-scale data sets. Ideally, there should be further progress in the formal theory of data visualisation. Nevertheless, the current growth of the field already raises the challenge of integrating data visualisation into statistics education so that students are enabled to become competent citizens in the large-scale, big data era.
One crucial issue related to this process that was outside the scope of this study was the question of how people can use visualization to recognize biased or otherwise distorted data. One major concern for our society is the potential misuse of what might be called “big data,” which, in contrast to the open data provided by governments and researchers, is proprietary and used for the profit of large corporations. The scope of the study only allowed for consideration of the students' experiences with large-scale open data, and not with big data. Further study could build on the foundation provided here to examine students' interactions with big data. In pursuing that research, there is an important role for visualization technologies that were not incorporated into this study. The reasoning about big data used by experts is different from common reasoning because of the inherent complexity of the data, and supporting dynamic visualisations of data are required. As Prodromou (2014) showed, 14- to 16-year-old students interpreted representations of multivariate data generated by a dynamic visualisation tool while they constructed their own meaningful data visualizations that highlighted emerging important aspects of data. Such a use of visualisation tools promoted students' articulations of the diverse inferences from data visualisations and reasoning about covariation between multiple variables while using the cycle of inquiry and visual analysis. In that study, students revisited their specific inferences while using complex data visualisation tools, inventing and revising their visual representations of data. Once they obtained some necessary insight, they readily made an informed decision.
Using the data in Identify: (a) the EU and candidate countries with a considerable decrease in the total number of long-term immigrants into the country; (b) the EU and candidate countries with a considerable increase in the total number of long-term immigrants into the country;

1.2.1
Provide possible explanations as to why the countries you identified in part (a) are experiencing a loss of migrants, whereas the countries identified in part (b) are experiencing high increases in immigration.

APPENDIX D
Eurostat - Tables, Graphs and Maps Interactive (TGM) Table printer Preview

First category: Reading data
This first category is concerned with how students engage with tables and how they recognise the components of the table (e.g., the raw data) and the interrelations among these components and then use this information to answer explicit questions. After studying the data-tables of Emigration versus time, and Number of live births versus time, students commented:
Group 3: We observe that from 2001 to 2012, emigration in Italy has increased from 56077 to 106216, while in Cyprus immigration has increased from 13909 to 18105. In Italy, it has doubled (from 56077 to 106216), but in Cyprus the increase was less than 25%. The number of live births in Italy has remained the same from 2001 to 2012, while in Cyprus, it has increased by approximately 2000. The crude rate of population change in Cyprus is 11.4% in 2001, 11.5% in 2002, 12.8% in 2003, 14.0% in 2004, 14.8% in 2005, 18.5% in 2006, and it increases substantially to 24% in 2007. The change in the crude rate in 2008 is 26.2%, 17.5% in 2009, and then it decreases to 24.8% in 2010, increases to 26.2% in 2011 and then it dramatically decreased to 4.5% in 2012.
1: The countries where the immigration is decreasing during the last few years are Estonia (from 3709 in 2011 to 2629 in 2012), Italy (from 558019 in 2007 to 350772 in 2012), Cyprus (20206 in 2010 to 17476 in 2012) and Greece (from 119079 in 2010 to 110139 in 2012). We cannot claim that all these countries experience financial crisis. For example, we observe a pattern of immigration in Estonia; immigration goes up one year and then down the next year. Similarly, in Cyprus. On the contrary, in Italy immigration decreases steadily from 2007 to 2012, and similarly in Greece it decreases from 2010 to 2012. So, the immigration (the people who go to work in a foreign country). . . The countries that have not been affected by the economic crisis and where the immigration is increasing after 2010 (including 2010) are Belgium, France, and Austria because they have money to pay people, so people immigrate to these countries.
The findings not only provide empirical confirmation of the four-part framework of Shaughnessy in this research on large-scale data-table comprehension, they also help establish a theoretical framework that can address different levels of large-scale data-table comprehension. This extension of the framework described by Shaughnessy to a novel context supports the emergence of four levels of large-scale data-table comprehension. It also shows that, when drawing conclusions about samples of large-scale data at level 3 and level 4 of the framework, the comprehension, comparison, and interpretation of different variables of the multivariate data sets are central. Level 1, an elementary reading (reading data), is characterised by simply reading the data either horizontally or vertically, following the rows and columns of a two-dimensional table to answer specific questions for which the obvious answer is in the data-table, without making any judgements with regard to comparing any variations in growth rates among different countries.
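The contrast between a Level 1 reading and a comparison within the data can be illustrated with a small calculation. The following is a minimal sketch in Python and is not part of the study's materials. It uses only the immigration figures quoted in the student response above, and the names `immigration` and `percent_change` are hypothetical, introduced here for illustration.

```python
# Illustrative sketch only; not part of the study's materials.
# Values are the immigration figures quoted in the student response above.
# The names `immigration` and `percent_change` are hypothetical.

# country -> (start_year, start_value, end_year, end_value)
immigration = {
    "Estonia": (2011, 3709, 2012, 2629),
    "Italy": (2007, 558019, 2012, 350772),
    "Cyprus": (2010, 20206, 2012, 17476),
    "Greece": (2010, 119079, 2012, 110139),
}


def percent_change(start, end):
    """Relative change from start to end, in percent."""
    return (end - start) / start * 100


# Level 1 (reading the data): look up a single cell of the table.
italy_2012 = immigration["Italy"][3]

# Comparing within the data: relate cells to one another.
changes = {
    country: percent_change(start, end)
    for country, (_, start, _, end) in immigration.items()
}
largest_decline = min(changes, key=changes.get)

for country, change in changes.items():
    print(f"{country}: {change:.1f}%")
print("Largest relative decline:", largest_decline)
```

Because the students quoted different start years for each country, the percentages describe periods of different lengths and should not be read as comparable annual rates.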

Table 1:
Create one or more graphs that compare the following indicators of population change in your country: number of live births, Immigration, Emigration, population by citizenship-foreigners. After studying the graph(s) and the table, write at least 10 lines explaining your observations regarding growth change in your country from 2001-2012, discussing possible social, historical, environmental, economic and political factors that might have caused this change. Identify and justify the dominant factors. Share the data and your graphs with students from another country and compare your data and graphs, explaining (by writing at least 10 lines) your observations regarding similarities and/or differences in growth change between your countries from 2001-2012. Discuss and compare the possible social, historical, environmental, economic and political factors that might have caused these similarities and/or differences. Identify and justify the dominant factors.