Listening to Young Children's Voices: The Evaluation of a Coding System

Listening to young children’s voices is an issue of increasing relevance for many researchers in the field of early childhood research. At the same time, teachers and researchers face the challenge of providing children with possibilities to express their notions, and of finding ways to comprehend children’s voices. In our research we aim to provide a method for listening to and analyzing young children’s voices on educational issues. In this article we describe a new step in our research, in which we deal with the issues of validity and reliability in evaluating our coding system: is our coding system for analyzing young children’s voices valid and reliable?


Introduction
Listening to children's voices is becoming increasingly relevant for many researchers and practitioners in the field of early childhood. In addition to its practical importance, it is often related to the UN Convention on the Rights of the Child (1989). This convention advocates the rights of children to be heard as active citizens in all matters concerning them (e.g. Clark, Kjørholt & Moss, 2005; Formosinho & Araújo, 2006). If we want to do justice to children's perspectives in today's society, it is essential to listen to their voices. Researchers have to deal with many challenges and struggles in offering children possibilities to involve their perspectives in early childhood practices (Pascal & Bertram, 2009). The position ascribed to young children in society depends strongly on the prejudices and images present in society about children. In research the idea is put forward that the child we meet in our society, and hence in education as well, is in fact a construction based upon theory and prejudices (Engel, 2005; Komulainen, 2007). Research has revealed that teachers in particular have strong images about "the" child (see for instance Seifert, 2000).
In our research program we concentrate on the problem of how to relate properly to children in educational situations, and we raised the question whether it is possible at all to identify young children's own voices. Through qualitative studies we wanted to provide a scientific contribution to clarify this issue. First, by developing a conceptual framework which describes the elements of young children's voices and secondly, by building a valid and reliable coding system, appropriate for qualitatively analyzing their voices.
In the previous chapter we described the construction of a method for researching the attribution of meaning to educational issues by children, aged 5-6, in school. We explored the concept of young children's voices, and formulated indicators for the construct of voice and attribution of meaning. We conducted five case studies, and we took the first steps in developing a coding system for analyzing elements of young children's voices. In this chapter we describe a new step in our research, in which we deal with the issues of validity and reliability of our coding system.

Theoretical background
For a qualitative analysis of young children's voices, we first have to define the construct of voice. In this we follow Bakhtin, who states that any word uttered by an individual is essentially inter-individual. An utterance can never be attributed to a single speaker, as there is always a (real or virtual) listener involved. So the word of a speaker is always half someone else's, according to Bakhtin (1981; see also Wertsch, 1991).
In our research on young children's voices, we focus on young children's attributions of meaning in situations and events in school. In the speaking and acting of children, in interactions with peers and adults, we can see and hear attributions of meaning. We focus on individual children, as we see each individual as a speaking personality, using language as a way to express himself. So listening to individual children requires a method to gain insight in these children's notions and opinions. At the same time, those children could never be isolated when we want to study them in an ecologically valid way. Hence, due attention is given in our research settings to the children's real life contexts, in which teachers, peers and parents/caregivers are included as important others.
Schematically, we summarize our conceptual framework as follows:

Context
We ground the theoretical framework of our research in the cultural-historical activity theory. People's opinions are always influenced by social, cultural, biographical, and historical determinants (e.g. Bourdieu, 1991). Those opinions, expressed or voiced by an individual, are influenced by these determinants, as well as by actual context-related interactions. The acquisition of opinions occurs in interaction with others, in a dialogical process, in which the voices of others resound as well (Bakhtin, 1981; Wertsch, 2002). Komulainen (2007, p. 13) states that children's voices are to be understood as "multidimensional social constructions that are subject to change. At the same time 'voices' manifest discourses, practices, and contexts in which they occur." Children's perspectives and images originate from historically developed local contexts, like the classroom or the playground, in which others are also involved. In interpreting children's expressions, this specific context needs to be taken into account (Christopher & Bickhard, 2007; Daniels & Edwards, 2009). In our research the context is the school context, in which peers and teachers are present too. In this specific context, significant others, like parents or caregivers, are relevant as well. Children's opinions and their ways of expressing them are influenced by children's context-related interactions, but at the same time by social, cultural, biographical, and historical determinants (e.g. Bronfenbrenner, 1979; Meadows, 2010).
In our research we focus on voices, as manifested in expressions and attribution of meaning by young children, in the school context. We define attribution of meaning in this research as the way in which a child expresses his conceptions and values on three aspects he encounters in the daily practice of his educational setting: the activities, the organization in and around the classroom, and the roles of his teacher in the school context. Besides the verbal and non-verbal aspects of these voices or expressions, we also look for underlying elements like thinking, feeling, and wanting. These are elements of the subject's personality and play their part in the acting person (González Rey, 2008). González Rey (2008) refers to thinking and feeling as categories of the acting personality uniting intellect and affect. Thinking and feeling can be considered as aspects of conation, a dimension of mental processes, having to do with striving and wanting (Reber & Reber, 2001). Not only the content of what people tell one another counts, but how people interact with one another is important as well, especially when it comes to feelings and motives of people (Daniels & Edwards, 2009).
All the related elements in our construct of voice and attribution of meaning, as represented in Figure 1, are part of our data gathering. This conceptual framework is the foundation of our coding system for data collection and analysis. All elements in our construct of voice have a theoretical basis in cultural-historical activity theory.

Research method
Our research comprises five case studies. In each case study we listened to children, aged 5-6, in school in several settings, and we studied the dynamics of the specific school contexts the children are involved in. Conducting more case studies means gathering more data, which enables us to articulate the issues of validity and reliability in an accountable way. Using more settings in a case study may lead to more supportive or supplementary findings by triangulating the data (Yin, 2009). According to Yin (2009, p. 116), data triangulation contributes to the strength of construct validity as well, by providing several sources of evidence for the same researched phenomenon. We decided to order our case studies sequentially. Each new case study is built on and elaborates the outcomes of a previous case study. This results in a so-called multiple case study with a qualitative-interpretative approach in a flexible design, using multiple sources of evidence (Robson, 2002).
First we formulated sensitizing concepts with respect to major elements of the school context. In this context we considered the following concepts as our main domains for analysis: school activities, classroom organization, and teacher's roles. Secondly, we analyzed the data, collected in the different cases, in a process of open coding, focusing on emerging concepts, and looking for the relationships among them (Glaser & Strauss, 1967; Strauss & Corbin, 1998). We attributed subcategories to the categories, drawing from our empirical observations. Children's expressions (like commenting, adopting, narrating, et cetera) were considered as properties or dimensions of the subcategories. Each category, subcategory, and property has its own written definition. Coding children's expressions is consistently based on a coherent unit of expressions from the transcribed observations of the focal children in the case studies (Miles & Huberman, 1984).
We considered the coding process as completed after analyzing five case studies, as we were unable to add new properties to our defined subcategories, and so saturation had occurred (Glaser & Strauss, 1967; Strauss & Corbin, 1998).

Data collection
Based on the outcomes of a previous exploratory study, we planned a series of case studies with different children in different school contexts, looking for comparable as well as complementary findings (see also chapter 2). The child in our first, exploratory case study was Tom.
Tom is 6.5 and attends a Roman-Catholic primary school in a little village in the south of the Netherlands (Limburg).
Tom is a bit older than most of the children in his class but he is small and looks a bit younger. Tom has an older brother and sister at the same school and a little brother at home. His father and his mother both work part-time.
Tom's school bases its educational philosophy on basic development, which means a specific form of developmental education based on Vygotskian theory (see van Oers, 2009). Tom's class has a teacher with many years of experience in educating young children, called Tessa.
After the first case study we decided to have more than one child involved at the same time, so we would be able to get more detailed insight in the way conversations with others might influence a child's expressions. Irfan and Margareta were the children in our next case studies.
Irfan is 6.0 and attends a primary school in Amsterdam. Irfan has an older sister at the same school and a baby brother at home. His parents have a Moroccan background. His father works for a transport organization and his mother is a staff member at a public institution. Margareta is 5.6 and attends the same school as Irfan. She is the only child of a Turkish father and a Dutch mother. Her father runs a local business. Her mother has an academic background.
The school of Irfan and Margareta has a mixed population. Many children have (grand)parents who were born outside the Netherlands. Each class with young children has, besides a teacher, also a part-time teacher-assistant.
Much attention is paid to language stimulation and independent learning (weekly tasks). Irfan's and Margareta's class has two part-time teachers. One highly experienced teacher, Jona, and a teacher, Mandy, who is recently qualified. Ayla is the teacher-assistant.
Finally we added another two case studies, Lennart and Bernadette, to our research.
Lennart is 6.6 and attends a Roman-Catholic primary school in Amstelveen (a suburban city near Amsterdam).
Lennart is a bit older than the other children in his class, but he is quite small and looks younger. He has two younger brothers, who do not attend school yet. His father and his mother both have an academic background.
They both work part-time. Bernadette is 5.7 and attends the same school as Lennart. Bernadette is quite young, compared to the other children in her class, but she is tall, and looks older. She has a half-sister, aged 15 (her father was married before), who is living with her own mother. Her father runs a local business and her mother has an academic background. They both work full-time.
The primary school of Lennart and Bernadette has eight classes for young children (aged 4-6) and there are also equivalent classes for the older children. There are two school buildings on two locations. The classes for the children aged 10-12 are accommodated in another street, near the main building. Bernadette's class has two part-time teachers, both with many years of experience in educating young children, Cecile and Magda. During the research Magda was present on the last day.
In the exploratory study (Tom) we used three different settings for observations, to achieve data triangulation:
• regular classroom and school activities;
• playing school in a play area; and
• a semi-structured interview about school notions.
By regular classroom and school activities, we refer to the current classroom projects, consisting of learning contents and educational activities. Playing school in a play area was an arranged activity, offering children the opportunity for role-play. In a semi-structured interview (Appendix B.1) the children responded to questions like: If it were up to you, what would your school look like? What would you prefer to do, if you had free choice of activity?
For the purpose of strengthening the reliability of our outcomes in subsequent case studies, we decided to add another two settings for observations in the next four case studies (Irfan, Margareta, Lennart, and Bernadette):
• taking pictures in school and discussing them; and
• talking about feelings in and on school.
We provided the children in our research with a single-use photo camera. Cameras offer children the possibility to respond in non-verbal ways to questions like: Can you show me what you think is important here in and around school? In this way we asked explicitly for the children's opinions on the subject "school". The answers, consisting of series of photographs, were used later on to discuss their expressions (both verbal and non-verbal): which pictures they liked best, which pictures represented a story, which pictures showed what they did not like at school, et cetera (Clark, 2007). We also explicitly invited the children to respond to questions about their feelings in school. The questions, offered to them as propositions, were answered by the children by selecting the picture (an emoticon) that represented their feelings best. These questions (for example: How do you feel when the teacher is helping you to perform a difficult task?) are partly based on a pictorial scale of perceived competence and acceptance (Lewis & Lindsay, 2000), and a social-emotional task of affective labeling (Formosinho & Araújo, 2006).
All observations of playing school in a play area, discussing pictures, talking about feelings, and a semi-structured interview, were videotaped.

Data analysis
All the observations of school activities and the videotapes were transcribed verbatim. Kwalitan (www.kwalitan.nl), a computer program, was used for the systematic comparative, qualitative data analysis. This computer program is a tool, supporting researchers in entering, archiving and exploring data (e.g. looking for certain words), structuring documents (e.g. segmentation), organizing data (e.g. overviews of codes with frequencies), selecting extracts in documents, and describing the process of data analysis (e.g. in memos).
Based on the data of the exploratory study, we started to build a coding system in Kwalitan, following the basic assumptions of the grounded theory approach (Glaser & Strauss, 1967; Strauss & Corbin, 1998; see also chapter 2). We defined our sensitizing concepts and labeled them as the three main categories in our coding system: school activities, classroom organization and teacher's roles. A fourth category ("relations") was needed for describing the relations among focal children, peers and adults other than the teacher (see Appendix C.1, Coding System 1).
After labeling the categories and subcategories of our coding system, we defined properties as parts of the subcategories to code elements of young children's acting. These codes are partly derived from the contexts of the children involved (in vivo codes) and partly from the studied literature (constructed codes). Those constructed codes are based on indicators we have formulated as possible manifestations of young children's voices within the school context (see also chapter 2):
• expressing feelings and choices;
• sharing ideas about competences and needs;
• showing knowledge by pointing out, investigating, confirming, opposing; and
• intending to gain something related to others.

Validity
In this phase of our research we have been scaffolding our coding system, and in particular we wanted to pay attention to ecological and construct validity. To achieve ecological validity, we focused on young children's attribution of meaning in situations and events in school that make sense to the children. We took care that the children were observed in their daily school context and were engaged in different naturalistic settings, that is, daily classroom activities and (outside) play. To reach construct validity, we focused in our research on theoretically formulated constructs of young children's voices and attribution of meaning. Moreover, we used multiple sources of evidence: playing school in the play area, taking pictures and discussing them, talking about feelings in and on school, and an interview about school notions. To strengthen construct validity further, we used these multiple sources of evidence during data collection for establishing a conceptually consistent chain of evidence (Yin, 2009).

Reliability
We also strengthened our chain of evidence by inviting two independent coders to go through the same analyzing and coding processes (Yin, 2009), with the help of the coding system, including definitions of main theoretical constructs. By pattern matching (comparing the outcomes of data analysis in the different case studies) we will look for convergence between the constructs voice and attribution of meaning in our case studies (Trochim, 2011). Both coders can be considered experts in the field, as they were teacher-trainers in early childhood at a university of applied sciences.
Memos with definitions of the categories, subcategories and properties, and a written coding instruction were at the disposal of the coders. First, the two coders watched the videotapes to get acquainted with two children in their school context in different settings. Then there was a meeting in which the structure of the coding system and the written definitions were explained, and questions could be asked. Finally, examples of written observations from other case studies were presented to practice the coding procedure. We compared the outcomes of the coders with the results of the researcher's coding processes, looking for similar and rival interpretations in coding on the three levels of our coding system, described as categories, subcategories, and properties (inter-coder reliability). These results were needed to strengthen the consistency of the coding system.
To ensure reliability we also created a case study database, consisting of the data and a case-study protocol to be used for the analysis of the case studies. This protocol, consisting of notes, documents, tabular materials, et cetera, was discussed with the peer researchers every six weeks (peer debriefing).

Results
As to the issue of ecological validity, we took care that children were indeed observed in their everyday contexts and were engaged in different naturalistic settings. As for construct validation we used those different naturalistic settings as multiple sources of evidence for data triangulation. We also maintained a conceptually consistent chain of evidence during the whole process of data collection and analysis, with the help of theory-based categories and definitions that were available to the coders.
To establish inter-coder reliability, two coders separately analyzed the videotaped observations of the children in two case studies: playing school in the play area, talking about feelings, and the semi-structured interview. The researchers' theory-based coding system maximizes the chances that the coders indeed focused on phenomena that theoretically relate to the notion of voice. Comparing the results of these data analyses, we were looking for similar and rival interpretations in coding on the three levels of our coding system, described as categories, subcategories, and properties.
For the definition of reliability we follow Miles and Huberman (1984, p. 63): the total number of similarities divided by the total number of similarities and differences in coding. A first data analysis by several observers, independently using the same coding system, should generate about 70% inter-coder reliability, according to Miles and Huberman (1984). We decided we would accept 70% of overall agreement in coding among the researcher and the coders, as a result of this first analysis.
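The agreement measure described above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the function name `percent_agreement`, the example code lists, and the restriction to one code per coded unit are ours and are not part of the coding procedure reported here (in which more than one code could be assigned to a single expression).

```python
def percent_agreement(codes_a, codes_b):
    """Similarities divided by similarities plus differences in coding."""
    if len(codes_a) != len(codes_b):
        raise ValueError("both coders must code the same units")
    if not codes_a:
        raise ValueError("no coded units to compare")
    # A similarity is a unit to which both coders assigned the same code.
    similarities = sum(1 for a, b in zip(codes_a, codes_b) if a == b)
    differences = len(codes_a) - similarities
    return similarities / (similarities + differences)


# Hypothetical codes for five units (illustrative only, not study data).
researcher = ["school activities", "relations", "teacher's roles",
              "classroom organization", "school activities"]
coder_b = ["school activities", "school activities", "teacher's roles",
           "classroom organization", "relations"]

agreement = percent_agreement(researcher, coder_b)
meets_first_standard = agreement >= 0.70  # the 70% standard for a first analysis
print(f"agreement: {agreement:.0%}, meets 70% standard: {meets_first_standard}")
```

In this invented example three of the five units are coded identically, so the agreement is 3 / (3 + 2) = 60%, which would fall short of the 70% standard.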
In Table 1 (the left side: before re-adjustments) the results of the comparison of the first round of coding processes among the researcher (A) and the two coders (B and C) are presented. Looking at the results on the left side of Table 1, we see that at first we could not meet our formulated standard of an overall agreement of 70%. Based on these outcomes, we had to reconsider our coding system, particularly on the levels of subcategories and properties, which showed the lowest percentages of agreement.
To improve our coding system, we first made a qualitative analysis of the found similar and rival interpretations in coding among the researcher and the two coders.
On the first level of coding (categories) we found that the coders faced difficulties in deciding which category was the most appropriate in coding expressions of the focal children, despite the instruction that more than one code could be assigned to a single expression of the children. Especially category 4 (Relations) caused entanglement, as there were almost always others (such as peers) involved. The coders found it difficult to decide when they should, or should not, assign codes (also) to this category (see Table 2, lines 1, 3, and 7). Another difficulty occurred in assigning codes to category 3 (Teacher's roles). Codes were attributed only when the teacher was physically present and intervening in the situations the focal children were involved in. Despite the instructions, the coders were uncertain about attributing category 3 codes when the children were referring to the teacher, but the teacher was not present at the time.
On the level of subcategories we faced a similar kind of coding difficulties. Knowledge and skills (subcategory 01) is nearly always related to certain behavior of the child (subcategory 02: attitude). Difficulties in choosing the appropriate subcategory in category 2 (Classroom organization) were even more obvious. Without knowledge of the specific school context, external coders cannot feasibly distinguish whether "rules" (subcategory 03) or "routines" (subcategory 04) are applicable (see Table 3).
Table 3 Child Bernadette (Context: Playing School in the Play Area)
Bernadette is playing with Lennart, Jan, and Elza outside the classroom in an area, which is furnished with a table and chairs and school material such as books, paper, pencils, scissors, and glue. The children have decided what they needed to play school in that area, and together with the teacher they have brought in what they wanted to play with within that specific area. Bernadette has been busy making a drawing and asked Lennart what to do next, but Lennart walked away in the direction of the classroom.
On the level of properties we found that some properties were related too closely, for example, commenting and judging (subcategory 01), preferring and choosing in subcategory 02 (see Table 4, line 6), and accepting and adopting (subcategory 03). Based on the results of this qualitative analysis we took the following measures.
We created the possibility to add to all codes a relational component: P (for peers), F (for family), O (for others, including the researcher), or a combination of P, F, and O. As a consequence we removed the separate category "Relations" (see Appendix C.1, Coding System 2).
We maintained the other three main categories, but redefined some subcategories. Category 1 (School activities) was transformed into attitude towards school activities, as attitude is always involved in the opinions children have about school (activities). Here we followed Vyverman and Vettenburg (2010), who advocate that affective, cognitive, as well as behavioral components are to be distinguished in using the concept attitude or opinion, referring to children. These three components became our subcategories. Affect (subcategory 01) refers to the feelings and preferences children show. Cognition (subcategory 02) refers to the (intellectual) views and information children have. Finally, behavior (subcategory 03) refers to how children actually perform. We also decided to create two new subcategories for category 2 (Classroom organization). Children are accepting and following rules and routines (subcategory 04: adoption), or they re-adjust rules and routines (subcategory 05: modification).
Table 4 Child Lennart (Context: Semi-Structured Interview on Notions About School)
During the semi-structured interview with Lennart, Bernadette, and Jan, the children are allowed to work on some activity, such as making a drawing. The interview took place in the play area where the children played school. Lennart sees the letter box, which is also put in the play area to play school.
We added characteristics to codes in category 3 (Teacher's roles) referring to the kind of teacher's involvement in children's activities: i (child -teacher interaction), r (child taking the role of a teacher), and a (child expressing himself about the teacher, without the teacher being around).
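The re-adjusted structure of a code (category, subcategory, property, an optional relational component, and for category 3 a teacher-involvement characteristic) can be pictured as a small data structure. The sketch below is only one possible representation: the class name `Code`, its field names, and the printed label format are hypothetical and do not reproduce the notation used in Kwalitan or in the appendices.

```python
from dataclasses import dataclass

# Relational components and teacher-involvement characteristics as described
# in the text; the dictionary names themselves are illustrative.
RELATIONAL = {"P": "peers", "F": "family", "O": "others, including the researcher"}
TEACHER_INVOLVEMENT = {
    "i": "child-teacher interaction",
    "r": "child taking the role of a teacher",
    "a": "child expressing himself about the teacher, teacher not around",
}


@dataclass(frozen=True)
class Code:
    category: int                        # 1, 2 or 3 after the re-adjustments
    subcategory: str                     # e.g. "01"
    prop: str                            # property, e.g. "preferring"
    relational: frozenset = frozenset()  # any combination of P, F, O
    teacher: str = ""                    # i, r or a; category 3 only

    def __post_init__(self):
        if self.category not in (1, 2, 3):
            raise ValueError("category must be 1, 2 or 3")
        if set(self.relational) - set(RELATIONAL):
            raise ValueError("relational component must combine P, F and O")
        if self.teacher and self.teacher not in TEACHER_INVOLVEMENT:
            raise ValueError("teacher characteristic must be i, r or a")
        if self.teacher and self.category != 3:
            raise ValueError("teacher characteristic applies to category 3 only")

    def label(self):
        """Hypothetical display format, not the notation of the appendices."""
        text = f"{self.category}.{self.subcategory} {self.prop}"
        if self.teacher:
            text += f" ({self.teacher})"
        if self.relational:
            text += " [" + "".join(sorted(self.relational)) + "]"
        return text


# Example: a preference (subcategory 01, affect) expressed with a peer involved.
example = Code(category=1, subcategory="01", prop="preferring",
               relational=frozenset("P"))
print(example.label())
```

Attaching the relational component to each code, rather than keeping a separate category, means a coder no longer has to choose between a content category and "Relations" for the same expression.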
On the level of properties, we decided to reduce or combine those properties which caused confusion among the coders, because they were related too closely. As a result of the re-adjustments in the coding system, we have rewritten our memos with all the definitions of the categories, subcategories, and the properties (see Appendix C.4). We made a new instruction for the coders, in which we drew special attention to the intended hierarchy of the coding system (see Appendix C.3).
We decided to recode the three units from the data collection of the two case studies which showed the lowest agreement percentages in the first coding process: play in the play area by Bernadette, talking about feelings by Lennart, and the interview with Lennart. Following Miles and Huberman (1984, p. 63), we now decided to accept an overall agreement of 80%, as a result of a second round of data analysis and coding.
We show the results of the recoding process in percentages on inter-coder reliability in Table 1 (after re-adjustments, on the right side).
Looking at the overall results in Table 1, we see that we met our formulated standard of an overall 80% agreement on all the recoded units. First on play in the play area by Bernadette (89%, was 55%), secondly on talking about feelings by Lennart (83%, was 60%), and finally on the interview with Lennart (83%, was 51%).
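As a worked check of these figures against the two standards, the following sketch compares the reported percentages before and after the re-adjustments. The percentages are those given in the text; the variable names and the dictionary layout are illustrative assumptions.

```python
FIRST_ROUND_STANDARD = 70   # percent, accepted for the first analysis
SECOND_ROUND_STANDARD = 80  # percent, accepted for the second round

# (before, after) overall agreement percentages as reported in the text.
units = {
    "play in the play area (Bernadette)": (55, 89),
    "talking about feelings (Lennart)": (60, 83),
    "interview (Lennart)": (51, 83),
}

# Gain in percentage points per recoded unit.
gains = {unit: after - before for unit, (before, after) in units.items()}

below_first = all(before < FIRST_ROUND_STANDARD for before, _ in units.values())
meet_second = all(after >= SECOND_ROUND_STANDARD for _, after in units.values())

for unit, (before, after) in units.items():
    print(f"{unit}: {before}% -> {after}% (+{gains[unit]} points)")
print("all below 70% before:", below_first)
print("all at or above 80% after:", meet_second)
```

Each of the three units started below the first standard and ended above the second, with gains of 34, 23, and 32 percentage points respectively.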
The next step in our research was to look into the content of the results of the recoding process. Is our coding system appropriate to analyze elements of young children's voices and link them to the indicators, derived from the studied literature, we have formulated before?
In Table 5 we see an element of underlying expressions by Lennart, labeled preferring as a property of subcategory 01 (affect). Lennart is referring to himself and his friend Jan, speaking in a personal way: "we would like" (line 2). We see the same kind of expression in Table 2, when Lennart is talking about going to school, including his peers Bernadette and Jan, by saying: "we all like school" (line 7). He is referring to himself and what he likes in Table 4: "I like coloring a car" (line 6). At the same time, Lennart expresses himself in Table 5 about his choices, what he wants to achieve and the importance of the collaboration with peer Jan: "we would like to do it alone with us" (lines 2 and 6).

Discussion and conclusion
In our research we had to deal with the issues of validity and reliability of a coding system for analyzing young children's voices. We formulated the following question: is our coding system for analyzing young children's voices valid and reliable? On the basis of available data, we may conclude that we have been able to confirm the validity and reliability of the coding system. As for ecological validity, we observed children in their real school life context. As for construct validation we used multiple sources of evidence for data triangulation, and we maintained a conceptually consistent chain of evidence. The researchers also reviewed drafts of the case study reports on a regular basis (peer debriefing). The chain of evidence allowed two independent coders to go systematically through the same analyzing and coding processes (Yin, 2009). With an 80% agreement on coding among the researcher and two independent coders (see Table 1), we consider our coding system sufficiently reliable to analyze young children's voices in more detail in the next step of our research project. An important issue in researching the construct of voices, certainly with young children, is the role of the researcher and the potential bias it introduces. Most of the time during the research, the researcher remained a marginal observer, registering the ways the children acted during all the occurring daily activities in school. However, the different roles of the researcher are, in fact, inseparable from the participating children in the research context (Holland, Renold, Ross & Hillman, 2010). There is no single or simple solution to this problem of potential bias. The only option is to use reflexive techniques, to explore the dynamics of the relationships between researcher and the ones involved in the research, according to Holland et al. (2010).
By arranging peer debriefing on a regular basis, cooperating with independent coders, presenting at adequate forums to develop and maintain a chain of evidence, and publishing in peer-reviewed journals, we dealt with this methodological issue in the best possible way.
In the next phase of our research we will use the results of the coding processes to analyze the contents of the children's voices in our five case-studies. What do the children in our case studies have to say about their school contexts? What are their notions and their opinions? The outcomes of these analyses will then be used to make an overall comparative analysis on the content of the children's voices in these five case studies. For now we can conclude that we can be confident that the coding system yields data that permit reliable and valid conclusions.