Results        

A study of how specific principal behaviors affect teacher and student performance

 



Results in this chapter are organized around the research questions of this study.

1.      How will the treatment of principal-teacher interactions affect teachers’ instructional practices?

2.      How will changes in teachers’ instructional practices, initiated by the set of principal-teacher interactions, affect student performance?

3.      How will changes in principal-teacher interactions affect the frequency and focus of teacher conversations with principals, students, and other teachers?

The research design used in this study was quasi-experimental with multiple quantitative analytic techniques. Three specific time frames were associated with this research in relation to the measures and principal-teacher interactions:

1.      Prior to the pilot year (prior to fall 2007)

2.      Pilot year (2007-2008 school year)

3.      Year of full implementation (2008-2009 school year)

Two principal-teacher interactions, snapshots and data reviews, were implemented during the pilot year; the full set of four principal-teacher interactions (one-on-one summer meetings, snapshots, data reviews, and teacher self-assessment) was implemented in the year of full implementation. Classroom grade distributions and student discipline referral data were collected for all three time frames. Teacher and student survey data were collected for two of these time frames: the pilot year and the year of full implementation. Principal-completed and teacher-completed QIR data were collected only in the year of full implementation.

Research Question One: How will the treatment of principal-teacher interactions affect teachers’ instructional practices?

QIR data were used in a single-group pretest-posttest research design to explore any effect the introduction of a set of principal-teacher interactions had on the quality of teacher instructional practices, as defined by the Quality Instruction Rubric (QIR), during the year of full implementation (the 2008-2009 school year). Data from the QIRs completed by the principals and the QIRs completed by teachers, for both the pretest and posttest, were analyzed using paired-sample t-tests.
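As a sketch of the analysis just described, the paired-sample t statistic and a repeated-measures Cohen's d can be computed directly. The ratings below are hypothetical, and the d convention used here (mean difference over the SD of the differences) is one common choice for repeated measures, not necessarily the one used in this study.

```python
from math import sqrt
from statistics import mean, stdev

def paired_t_and_d(pre, post):
    # Difference for each teacher: posttest minus pretest rating
    diffs = [b - a for a, b in zip(pre, post)]
    md, sd = mean(diffs), stdev(diffs)   # mean and SD of the differences
    t = md / (sd / sqrt(len(diffs)))     # paired-sample t statistic
    d = md / sd                          # Cohen's d (repeated-measures convention)
    return t, d

# Hypothetical overall QIR scores for eight teachers
pre = [3.1, 3.4, 3.6, 3.2, 3.5, 3.3, 3.7, 3.0]
post = [3.3, 3.5, 3.9, 3.4, 3.6, 3.5, 3.8, 3.2]
t, d = paired_t_and_d(pre, post)
# The p-value comes from the t distribution with n-1 degrees of freedom
# (e.g. scipy.stats.ttest_rel reports it directly).
```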

One assumption of t-tests is that the data are approximately normally distributed. The normality of the QIR score distributions was assessed by computing kurtosis and skewness for each of the four subscales and the overall QIR score for both the pretest and posttest data. The ten kurtosis values ranged from -0.54 to 1.69; the ten skewness values ranged from -0.01 to 1.03. A common interpretation is that absolute values of kurtosis and skewness less than 2 are close enough to normal for most statistical assumption purposes (Minium, King, & Bear, 1993). These results support that the normality assumption was upheld for these data.
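The normality screen described above can be sketched as follows. Note that statistics packages often apply sample-size corrections, so their skewness and kurtosis values can differ slightly from these population-formula versions.

```python
from statistics import mean, pstdev

def skewness(xs):
    # Population skewness: mean of cubed z-scores
    m, s = mean(xs), pstdev(xs)
    return sum(((x - m) / s) ** 3 for x in xs) / len(xs)

def excess_kurtosis(xs):
    # Mean of fourth-power z-scores, minus 3 so a normal distribution scores 0
    m, s = mean(xs), pstdev(xs)
    return sum(((x - m) / s) ** 4 for x in xs) / len(xs) - 3

def roughly_normal(xs, cutoff=2.0):
    # The screening rule used above: both statistics within +/- 2
    return abs(skewness(xs)) < cutoff and abs(excess_kurtosis(xs)) < cutoff
```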

Changes in the Quality of Teacher Instructional Practices During the Year of Full Implementation  

The results of a comparison of pretest and posttest QIR data are presented in Table 15. According to analyses of QIR ratings completed by teachers, the quality of teacher instructional practices improved in the two domains of Planning & Preparation and Learning Environment at a significance level of p<0.01, with a small effect size in each. Analyses of QIR ratings completed by the principals did not detect a change in these same two domains.

Analyses of QIR ratings completed by teachers did not indicate a change in the quality of teacher instructional practices in the two domains of Instruction and Assessment. According to analyses of QIR ratings completed by the principals, the quality of teacher instructional practices improved in these same two domains at a significance level of p<0.001, with a small effect size in each.

Overall, according to analyses of QIR ratings completed by the principals, the quality of teacher instructional practices improved at a significance level of p<0.05, producing a small effect size. Analyses of QIR ratings completed by teachers also indicated an overall improvement at a significance level of p<0.05, with a small effect size. However, as noted in Table 15, the specific domains in which significant changes occurred according to teachers' ratings were exactly the opposite of those indicated by principals' ratings.

Table 15

Comparison of QIR Pre-Post Mean Scores (Standard Deviation) for Year of Full Implementation

                         Pre (SD)      Post (SD)     t      p-value   Effect size (Cohen's d)

TEACHER-COMPLETED
Planning & Preparation   3.56 (0.48)   3.74 (0.38)   2.75   0.008     0.42*
Learning Environment     3.69 (0.46)   3.85 (0.36)   2.75   0.008     0.39*
Instruction              3.51 (0.50)   3.58 (0.48)   0.90   0.374     –
Assessment               3.30 (0.62)   3.39 (0.48)   1.17   0.249     –
Overall                  3.52 (0.51)   3.64 (0.42)   2.23   0.031     0.26*

PRINCIPAL-COMPLETED
Planning & Preparation   3.16 (0.78)   3.20 (0.77)   1.32   0.194     –
Learning Environment     3.26 (0.72)   3.29 (0.70)   0.86   0.395     –
Instruction              2.84 (0.64)   3.09 (0.60)   4.99   <0.001    0.40*
Assessment               2.69 (0.76)   2.98 (0.60)   4.29   <0.001    0.42*
Overall                  2.98 (0.73)   3.14 (0.67)   3.75   <0.001    0.23*

*Indicates a small effect size (0.2 < d < 0.5); **Indicates a medium effect size (0.5 < d < 0.8); ***Indicates a large effect size (d > 0.8). (Cohen, 1988)
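For reference, the footnote's effect-size bands can be expressed as a small helper. How to classify a value falling exactly on a boundary (0.2, 0.5, 0.8) is a judgment call, since the footnote uses strict inequalities; here boundary values go to the larger category.

```python
def cohen_label(d):
    # Thresholds from the table footnote (Cohen, 1988);
    # boundary values are assigned to the larger category
    d = abs(d)
    if d >= 0.8:
        return "large"
    if d >= 0.5:
        return "medium"
    if d >= 0.2:
        return "small"
    return "negligible"
```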

A Comparison of Teachers' and Principals' Perceptions of the Quality of Teacher Instructional Practices

The mean scores of QIR ratings completed by teachers and by the principals, presented in Table 15, appear to differ systematically. Results of a comparison of pretest and posttest QIR ratings completed by the principals to those completed by teachers are presented in Table 16. Teachers rated the quality of their instructional practices higher than did principals in each domain and overall. These differences were significant at a level of p≤0.001 and indicated medium to large effect sizes in each domain and overall.

Table 16

Comparison of Teacher-completed to Principal-completed QIR Mean Scores (Standard Deviation) for Year of Full Implementation

                         Teacher (SD)   Principal (SD)   t      p-value   Effect size (Cohen's d)

PRETEST
Planning & Preparation   3.56 (0.48)    3.16 (0.78)      3.39   0.001     0.62**
Learning Environment     3.69 (0.46)    3.26 (0.72)      3.95   <0.001    0.71**
Instruction              3.51 (0.50)    2.84 (0.64)      6.54   <0.001    1.17***
Assessment               3.30 (0.62)    2.69 (0.76)      4.78   <0.001    0.88***
Overall                  3.52 (0.51)    2.98 (0.73)      5.03   <0.001    0.86***

POSTTEST
Planning & Preparation   3.74 (0.38)    3.20 (0.77)      4.54   <0.001    0.89***
Learning Environment     3.85 (0.36)    3.29 (0.70)      5.42   <0.001    1.01***
Instruction              3.58 (0.48)    3.09 (0.60)      4.65   <0.001    0.90***
Assessment               3.39 (0.48)    2.98 (0.60)      3.95   <0.001    0.75**
Overall                  3.64 (0.42)    3.14 (0.67)      5.18   <0.001    0.89***

*Indicates a small effect size (0.2 < d < 0.5); **Indicates a medium effect size (0.5 < d < 0.8); ***Indicates a large effect size (d > 0.8). (Cohen, 1988)

Analyses of Systematic Differences in Teachers’ Self-Ratings

Teachers with differing levels of instructional quality may have differed systematically in their self-ratings, and the prior analysis of all teachers as one group may mask such differences. Several grouping methods seemed logical for searching for systematic differences in these data.

Teachers' QIR self-ratings were analyzed separately for high, medium, and low performing groups based on the overall posttest QIR ratings completed by the principals. Other options for generating teacher groups included the overall pretest QIR ratings completed by the principals, the overall posttest QIR ratings completed by teachers, and the overall pretest QIR ratings completed by teachers.

Consideration was given to grouping teachers based on the overall pretest QIR ratings completed by the principals. In anticipation of this question, the correlation coefficient between the overall pretest and overall posttest QIR ratings completed by the principals was calculated and found to be 0.873. Such a high pretest-posttest correlation indicates that using either set of data for grouping purposes would result in similar groupings and similar results.
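A Pearson correlation of this kind can be sketched as below. The eight pre/post ratings are hypothetical stand-ins for the study's 50 paired principal ratings, which yielded r = 0.873.

```python
from math import sqrt
from statistics import mean

def pearson_r(xs, ys):
    # Pearson product-moment correlation coefficient
    mx, my = mean(xs), mean(ys)
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = sqrt(sum((x - mx) ** 2 for x in xs)
               * sum((y - my) ** 2 for y in ys))
    return num / den

# Hypothetical pre/post overall principal ratings for eight teachers
pre = [2.4, 2.9, 3.1, 3.4, 3.8, 2.6, 3.0, 3.6]
post = [2.5, 3.0, 3.2, 3.5, 3.9, 2.8, 3.1, 3.7]
r = pearson_r(pre, post)  # close to 1: rankings barely change pre to post
```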

Consideration was also given to grouping teachers based on the overall QIR ratings completed by teachers. However, as established in chapter two, principal ratings of instructional practices are likely to be more valid than teacher self-ratings. As discussed in chapter three, several procedures were implemented during the course of this study, such as field tests, norming, and calibration procedures, to increase the validity and reliability of the QIR ratings completed by the principals. Thus it seemed more logical to group teachers according to overall QIR ratings completed by the principals.

Comparisons Among High, Medium, and Low Performing Teachers According to Posttest QIR Ratings Completed by the Principals.

Teachers’ QIR self-ratings were analyzed separately in groups defined by the depth of quality of their instructional practices as determined by their placement on the QIR. The total sample (N=50) was split into three nearly equal groups based on the overall posttest QIR ratings completed by the principals:

Group One-High Performing Teachers (n= 16)

Group Two-Medium Performing Teachers (n=17)

Group Three-Low Performing Teachers (n=17)

The purpose of splitting the teachers into groups was to obtain as much discrimination between groups as possible; more groups would provide greater discrimination. However, for this comparison means were computed for each group in order to make potentially generalizable claims, and separating the original sample of 50 teachers into more than three groups would likely have produced sample sizes too small for this purpose.

The results of an ANOVA of the overall posttest QIR ratings completed by the principals for these three groups indicated that the ratings of each group were statistically different at a significance level of p<0.0001. The results of ANOVAs of the overall pretest and overall posttest QIR ratings completed by teachers indicated that high, medium, and low performing teachers' self-ratings were equivalent.
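A one-way ANOVA F statistic of the kind reported above can be sketched as follows. The three groups of ratings are hypothetical, and a p-value would come from the F distribution with (k-1, n-k) degrees of freedom.

```python
from statistics import mean

def one_way_f(*groups):
    # One-way ANOVA: F = between-groups mean square / within-groups mean square
    grand = mean([x for g in groups for x in g])
    k = len(groups)                      # number of groups
    n = sum(len(g) for g in groups)      # total observations
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))

# Hypothetical overall posttest principal ratings for three teacher groups
high = [3.9, 3.8, 3.7, 3.9, 3.8]
medium = [3.2, 3.3, 3.1, 3.2, 3.3]
low = [2.5, 2.4, 2.6, 2.5, 2.4]
f_stat = one_way_f(high, medium, low)  # a large F means the groups clearly differ
```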

Table 17 reports results of a comparison of QIR ratings completed by the principals and QIR ratings completed by teachers for high, medium, and low performing teachers. High performing teachers' ratings of the quality of their instructional practices were statistically equivalent to the principals' ratings. By contrast, medium performing teachers' ratings were significantly higher than the principals' ratings in each domain and overall, with medium to large effect sizes. Likewise, low performing teachers' ratings were significantly higher than the principals' ratings in each domain and overall, and these effect sizes were consistently larger than those for medium performing teachers.

 

Table 17

Comparison of Teacher-completed to Principal-completed QIR Mean Scores (Standard Deviation) for High, Medium, and Low Performing Teachers

                         Teacher (SD)   Principal (SD)   p-value   Effect size (Cohen's d)

PRETEST-HIGH PERFORMING TEACHERS
Planning & Preparation   3.60 (0.56)    3.83 (0.61)      0.333     –
Learning Environment     3.68 (0.46)    3.79 (0.64)      0.583     –
Instruction              3.60 (0.53)    3.34 (0.55)      0.168     –
Assessment               3.34 (0.71)    3.28 (0.66)      0.775     –
Overall                  3.56 (0.50)    3.56 (0.57)      0.987     –

PRETEST-MEDIUM PERFORMING TEACHERS
Planning & Preparation   3.66 (0.51)    3.28 (0.47)      0.005     0.77**
Learning Environment     3.77 (0.44)    3.47 (0.35)      0.023     0.75**
Instruction              3.53 (0.55)    2.99 (0.23)      0.001     1.28***
Assessment               3.40 (0.61)    2.85 (0.48)      0.013     1.00***
Overall                  3.59 (0.47)    3.15 (0.31)      0.002     1.11***

PRETEST-LOW PERFORMING TEACHERS
Planning & Preparation   3.43 (0.35)    2.41 (0.45)      <0.001    2.52***
Learning Environment     3.64 (0.50)    2.59 (0.49)      <0.001    2.13***
Instruction              3.42 (0.42)    2.23 (0.45)      <0.001    2.75***
Assessment               3.15 (0.54)    2.02 (0.44)      <0.001    2.30***
Overall                  3.41 (0.38)    2.31 (0.41)      <0.001    2.80***

POSTTEST-HIGH PERFORMING TEACHERS
Planning & Preparation   3.85 (0.40)    4.09 (0.38)      0.072     –
Learning Environment     3.93 (0.37)    4.03 (0.33)      0.384     –
Instruction              3.77 (0.51)    3.71 (0.35)      0.677     –
Assessment               3.48 (0.53)    3.56 (0.50)      0.642     –
Overall                  3.76 (0.39)    3.85 (0.31)      0.423     –

POSTTEST-MEDIUM PERFORMING TEACHERS
Planning & Preparation   3.68 (0.33)    3.26 (0.33)      0.001     1.27***
Learning Environment     3.86 (0.31)    3.38 (0.21)      <0.001    1.81***
Instruction              3.38 (0.35)    3.17 (0.23)      0.039     0.71**
Assessment               3.32 (0.38)    3.01 (0.19)      0.013     1.03***
Overall                  3.56 (0.25)    3.21 (0.14)      <0.001    1.73***

POSTTEST-LOW PERFORMING TEACHERS
Planning & Preparation   3.70 (0.39)    2.45 (0.40)      <0.001    3.15***
Learning Environment     3.78 (0.42)    2.62 (0.56)      <0.001    2.36***
Instruction              3.60 (0.51)    2.52 (0.47)      <0.001    2.22***
Assessment               3.37 (0.52)    2.41 (0.40)      <0.001    2.07***
Overall                  3.61 (0.40)    2.50 (0.37)      <0.001    2.87***

*Indicates a small effect size (0.2 < d < 0.5); **Indicates a medium effect size (0.5 < d < 0.8); ***Indicates a large effect size (d > 0.8). (Cohen, 1988)


 

Changes in the Quality of Teacher Instructional Practices During the Year of Full Implementation for High, Medium, and Low Performing Teachers.

Results of a comparison of pretest and posttest QIR ratings for high performing teachers are presented in Table 18. According to analyses of pretest and posttest QIR ratings completed by teachers, the quality of instructional practices of high performing teachers improved in the domain of Learning Environment at a significance level of p=0.001, indicating a medium effect size. These analyses did not indicate a change in the three domains of Planning & Preparation, Instruction, or Assessment. Overall, teacher-completed ratings indicated the quality of instructional practices of high performing teachers improved at a significance level of p<0.05, producing a small effect size.


Table 18

Comparison of QIR Pre-Post Mean Scores (Standard Deviation) for Year of Full Implementation for High Performing Teachers

                         Pretest (SD)   Posttest (SD)   t      p-value   Effect size (Cohen's d)

TEACHER-COMPLETED
Planning & Preparation   3.60 (0.56)    3.85 (0.40)     1.82   0.088     –
Learning Environment     3.68 (0.46)    3.93 (0.37)     4.05   0.001     0.60**
Instruction              3.60 (0.53)    3.77 (0.51)     1.63   0.124     –
Assessment               3.34 (0.71)    3.48 (0.53)     0.96   0.351     –
Overall                  3.56 (0.50)    3.76 (0.39)     2.18   0.046     0.45*

PRINCIPAL-COMPLETED
Planning & Preparation   3.83 (0.61)    4.09 (0.38)     2.04   0.060     –
Learning Environment     3.79 (0.64)    4.03 (0.33)     2.35   0.033     0.47*
Instruction              3.34 (0.55)    3.71 (0.35)     3.38   0.004     0.80***
Assessment               3.28 (0.66)    3.56 (0.50)     2.42   0.029     0.48*
Overall                  3.56 (0.57)    3.85 (0.31)     3.18   0.006     0.63**

*Indicates a small effect size (0.2 < d < 0.5); **Indicates a medium effect size (0.5 < d < 0.8); ***Indicates a large effect size (d > 0.8). (Cohen, 1988)

Analyses of QIR ratings completed by the principals did not detect a change in the quality of instructional practices of high performing teachers in the domain of Planning & Preparation. According to these analyses, instructional practices of high performing teachers improved in the three domains of Learning Environment, Instruction, and Assessment at a significance level of p<0.05, indicating a small effect size in Learning Environment and Assessment and a large effect size in Instruction. Overall, principal-completed ratings indicated instructional practices of high performing teachers improved at a significance level of p<0.01, producing a medium effect size.

Results of a comparison of pretest and posttest QIR ratings for medium performing teachers are presented in Table 19. According to analyses of pretest and posttest QIR ratings completed by teachers, the quality of instructional practices of medium performing teachers did not change during the year of full implementation. According to analyses of pretest and posttest QIR ratings completed by the principals, the quality of instructional practices of medium performing teachers improved in the domain of Instruction at a significance level of p<0.05, indicating a medium effect size, and did not change in any other domain or overall. Given that only one of the ten possible indicators of change (Instruction, principal-completed) indicated improvement, it appears the quality of instructional practices of medium performing teachers was affected considerably less by the set of principal-teacher interactions during the year of full implementation than that of other teachers.


Table 19

Comparison of QIR Pre-Post Mean Scores (Standard Deviation) for Year of Full Implementation for Medium Performing Teachers

                         Pretest (SD)   Posttest (SD)   t      p-value   Effect size (Cohen's d)

TEACHER-COMPLETED
Planning & Preparation   3.66 (0.51)    3.68 (0.33)     0.19   0.852     –
Learning Environment     3.77 (0.44)    3.86 (0.31)     0.75   0.462     –
Instruction              3.53 (0.55)    3.38 (0.35)     1.08   0.296     –
Assessment               3.40 (0.61)    3.32 (0.38)     0.50   0.626     –
Overall                  3.59 (0.47)    3.56 (0.25)     0.26   0.795     –

PRINCIPAL-COMPLETED
Planning & Preparation   3.28 (0.47)    3.26 (0.33)     0.18   0.862     –
Learning Environment     3.47 (0.35)    3.38 (0.21)     0.87   0.396     –
Instruction              2.99 (0.23)    3.17 (0.23)     2.29   0.036     0.78**
Assessment               2.85 (0.48)    3.01 (0.19)     1.36   0.194     –
Overall                  3.15 (0.31)    3.21 (0.14)     0.69   0.500     –

*Indicates a small effect size (0.2 < d < 0.5); **Indicates a medium effect size (0.5 < d < 0.8); ***Indicates a large effect size (d > 0.8). (Cohen, 1988)

Results of a comparison of pretest and posttest QIR ratings for low performing teachers are presented in Table 20. According to analyses of pretest and posttest QIR ratings completed by teachers, the quality of instructional practices of low performing teachers improved in the domain of Planning & Preparation at a significance level of p<0.05, indicating a medium effect size. These analyses did not indicate a change in the three domains of Learning Environment, Instruction, or Assessment. Overall, teacher-completed ratings indicated the quality of instructional practices of low performing teachers improved at a significance level of p<0.05, producing a medium effect size.

Table 20

Comparison of QIR Pre-Post Mean Scores (Standard Deviation) for Year of Full Implementation for Low Performing Teachers

                         Pretest (SD)   Posttest (SD)   t      p-value   Effect size (Cohen's d)

TEACHER-COMPLETED
Planning & Preparation   3.43 (0.35)    3.70 (0.39)     2.83   0.012     0.72**
Learning Environment     3.64 (0.50)    3.78 (0.42)     1.34   0.200     –
Instruction              3.42 (0.42)    3.60 (0.51)     1.50   0.152     –
Assessment               3.15 (0.54)    3.37 (0.52)     1.74   0.101     –
Overall                  3.41 (0.38)    3.61 (0.40)     2.35   0.032     0.51**

PRINCIPAL-COMPLETED
Planning & Preparation   2.41 (0.45)    2.45 (0.40)     0.35   0.728     –
Learning Environment     2.59 (0.49)    2.62 (0.56)     0.26   0.797     –
Instruction              2.23 (0.45)    2.52 (0.47)     2.88   0.011     0.63**
Assessment               2.02 (0.44)    2.41 (0.40)     3.86   0.001     0.94***
Overall                  2.31 (0.41)    2.50 (0.37)     2.84   0.012     0.49*

*Indicates a small effect size (0.2 < d < 0.5); **Indicates a medium effect size (0.5 < d < 0.8); ***Indicates a large effect size (d > 0.8). (Cohen, 1988)

Analyses of pretest and posttest QIR ratings completed by the principals did not detect a change in the quality of instructional practices of low performing teachers in the two domains of Planning & Preparation and Learning Environment. According to these analyses, the quality of instructional practices of low performing teachers improved in the two domains of Instruction and Assessment, at significance levels of p<0.05 and p=0.001 respectively, indicating a medium and a large effect size. Overall, principal-completed ratings indicated the quality of instructional practices of low performing teachers improved at a significance level of p<0.05, producing a small effect size.

Research Question Two: How will changes in teachers’ instructional practices, initiated by the set of principal-teacher interactions, affect student performance?

Classroom grade distributions and student discipline referrals were used in a single-group, cross-sectional interrupted time-series research design to explore any effect that changes in teacher instructional practices, initiated by the set of principal-teacher interactions, may have had on student performance during the pilot year and the year of full implementation. Classroom grade distribution and student discipline referral data from the four years prior to the pilot year were analyzed using linear regression to predict expected levels of student performance during the pilot year (2007-2008) and the year of full implementation (2008-2009). Actual levels of student performance, operationalized as grade distributions and discipline referrals, for those two years were then compared to the levels predicted by the regression analysis.
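The prediction step can be sketched with an ordinary least-squares fit to the four pre-treatment years. The percentages below are the percentage-of-As column from Table 21; the resulting 2008-2009 gap of roughly 3.7 points is close to the 3.71% reported in Table 22 (exact agreement depends on how years are coded and rounded).

```python
def fit_line(xs, ys):
    # Ordinary least-squares slope and intercept for a short annual series
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return slope, my - slope * mx

# Percentage of As for the four pre-treatment years (Table 21)
years = [2004, 2005, 2006, 2007]      # year in which each school year ends
pct_a = [31.35, 32.85, 32.85, 35.40]
slope, intercept = fit_line(years, pct_a)

# Expected value for 2008-2009, then the gap: actual minus expected
predicted_2009 = slope * 2009 + intercept
gap = 41.05 - predicted_2009
```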

Classroom Grade Distributions

Classroom grade distributions are presented in Table 21 for the four years prior to the pilot year (pre-treatment), the pilot year, and the year of full implementation.

Table 21

Actual Classroom Grade Distributions for all Students (n=approximately 1400)

School Year   % As     % Bs     % Cs     % Ds     % Fs
2003-2004     31.35    27.85    21.65    10.90    8.05
2004-2005     32.85    26.80    21.00    9.20     9.80
2005-2006     32.85    28.10    20.50    9.50     8.55
2006-2007     35.40    28.50    18.80    8.30     8.35
2007-2008¹    36.30    28.37    18.80    9.63     6.93
2008-2009²    41.05    27.73    17.04    8.28     5.57

¹Two of this study's principal-teacher interactions (snapshots and data reviews) were in place for this school year.
²This study's full treatment (the set of four principal-teacher interactions) was in place for this school year.

Using grade distribution data from the four years prior to the pilot year, expected grade distributions were calculated using linear regression. Figure 6 depicts the actual distributions of As, Bs, Cs, Ds, and Fs for school years 2003-2004 through 2008-2009. A line of best fit for each grade category has been placed on the graph based on data collected in the years prior to the pilot year; dashed lines indicate expected levels according to pretreatment data. The symbol delta (Δ) indicates the difference between expected and actual values for each grade category in the pilot year and the year of full implementation, with the differences shown in parentheses on Figure 6. Table 22 reports these differences.

Table 22

Gap between Actual and Projected Classroom Grade Distributions for all Students (n=approximately 1400)

                                          As       Bs       Cs       Ds      Fs
Gap (Δ1) in Pilot Year                    0.17%    -0.28%   0.60%    2.03%   -1.65%
Gap (Δ2) in Year of Full Implementation   3.71%    -1.25%   -0.25%   1.43%   -2.97%

Percentages reported are the differences from the projected values based on linear regression of pretreatment data (school years 2003-2004 through 2006-2007).

The percentages of As and Fs showed the largest-magnitude differences from expected values. The higher than expected percentage of Ds may reflect a portion of would-be Fs becoming Ds, and the higher than expected percentage of As may reflect a portion of Bs becoming As.

Figure 6. Predicted and Actual Classroom Grade Distributions for all Students. Dashed lines represent predicted values based on pre-treatment data from school years 2003-2004 through 2006-2007. Differences between expected and actual values are represented within parentheses.

 Classroom Discipline Referrals

The number of reported classroom discipline referrals, aggressive discipline referrals (aggressive to school employee, defiance, failure to comply with discipline, fights, harassment, profanity, disorderly conduct, and repeated violations), and discipline referrals for several disaggregated groups are presented in Table 23 for the four years prior to the pilot year (pre-treatment), the pilot year, and the year of full implementation.

Table 23

Discipline Referrals for School Years 2003-2004 through 2008-2009 (n=approximately 1400)

School Year   Total   Aggressive   Male   Female   Fresh   Soph   Jr    Sr
2003-2004     1792    666          1148   644      513     475    449   355
2004-2005     1756    709          1042   714      522     510    374   350
2005-2006     1708    740          1116   591      473     458    390   389
2006-2007     1997    1039         1433   564      727     513    366   391
2007-2008¹    1712    720          1158   554      552     455    313   392
2008-2009²    1255    481          735    520      493     369    229   164

¹Two of this study's principal-teacher interactions (snapshots and data reviews) were in place for this school year.
²This study's full treatment (the set of four principal-teacher interactions) was in place for this school year.

Using discipline referral data from the four years prior to the pilot year, expected levels of discipline referrals were calculated using linear regression. The differences between expected and actual levels of discipline referrals, in each category, for the pilot year and the year of full implementation are presented in Table 24 and indicated within parentheses on Figure 7.


Table 24

Differences of Discipline Referrals from Projected Frequencies (n=approximately 1400)

                                          TD     AD     M      Fe    Fr     So     Jr    Sr
Gap (Δ1) in Pilot Year                    -243   -387   -259   17    -155   -50    -23   -16
Gap (Δ2) in Year of Full Implementation   -757   -759   -775   19    -273   -142   -84   -259

TD = Total Discipline; AD = Aggressive Discipline; M = Male; Fe = Female; Fr = Freshman; So = Sophomore; Jr = Junior; Sr = Senior. Numbers reported are the differences in frequencies from the projected values based on linear regression of pretreatment data (school years 2003-2004 through 2006-2007).

Figures 7, 8, and 9 depict the actual discipline referral frequencies for school years 2003-2004 through 2008-2009. A line of best fit for each referral category has been placed on each graph based on data collected in the years prior to the pilot year, and dashed lines indicate the expected frequency of discipline referrals according to pretreatment data. Δ values on each graph indicate the difference between expected and actual frequencies in the pilot year and the year of full implementation for each category, with the differences shown in parentheses on each figure.

As indicated on Figure 7, the actual frequency of total discipline referrals was 12% lower than expected in the pilot year and 38% lower than expected in the year of full implementation. Additionally, the actual frequency of aggressive discipline referrals was 35% lower than expected in the pilot year and 61% lower than expected in the year of full implementation. This pattern suggests that essentially all of the difference between actual and expected discipline referrals is due to aggressive discipline referrals being much lower than expected.


Figure 7. Total Discipline and Aggressive Discipline for all Students. Dashed lines represent predicted values based on pre-treatment data from school years 2003-2004 through 2006-2007. Differences between expected and actual values are represented within parentheses.

As indicated on Figure 8, the actual frequency of male discipline referrals was 18% lower than expected in the pilot year and 51% lower than expected in the year of full implementation. However, the actual frequency of female discipline referrals for both the pilot year and the year of full implementation was essentially equivalent to the expected value, at 3% and 4% higher respectively.

Figure 8. Total Discipline by Gender. Dashed lines represent predicted values based on pre-treatment data from school years 2003-2004 through 2006-2007. Differences between expected and actual values are represented within parentheses.

Figure 9 presents discipline referrals by grade level for school years 2003-2004 through 2008-2009. The actual frequency of freshman discipline referrals was 22% lower than expected in the pilot year and 36% lower than expected in the year of full implementation. The actual frequency of sophomore discipline referrals was 10% lower than expected in the pilot year and 28% lower than expected in the year of full implementation. The actual frequency of junior discipline referrals was only 7% lower than expected in the pilot year, but 27% lower than expected in the year of full implementation. The actual frequency of senior discipline referrals was essentially the same as expected in the pilot year (4% lower than expected), but 61% lower than expected in the year of full implementation.


 

Figure 9. Total Discipline for Freshmen, Sophomores, Juniors, and Seniors. Dashed lines represent predicted values based on pre-treatment data from school years 2003-2004 through 2006-2007. Differences between expected and actual values are represented within parentheses.


Classroom Grade Distributions and Student Discipline Referrals for High, Medium, and Low Performing Teachers

Analyses of QIR results indicated that ratings of teacher instructional practices completed by teachers and the principal diverged for high, medium, and low performing teachers. It was therefore of interest to investigate whether student outcomes differed across these three teacher groups. Mean classroom grade distributions and student discipline referrals, disaggregated by high, medium, and low performing teachers for 2006-2007 through 2008-2009, are reported in Table 25. The comparison indicated no statistically significant differences in classroom grade distributions or student discipline referrals among high, medium, and low performing teachers from 2006-2007 through 2008-2009.

Table 25

Comparison of Classroom Grade Distributions and Discipline Referral Mean Scores (Standard Deviation) for High, Medium, and Low Performing Teachers for 2006-2007 through 2008-2009 (n=approximately 1400)

 

                         High Performing Teachers   Medium Performing Teachers   Low Performing Teachers
                         06-07    07-08    08-09    06-07    07-08    08-09      06-07    07-08    08-09

% of A                     25%      30%      35%      36%      41%      44%        33%      30%      37%
(SD)                     (12%)    (11%)    (12%)    (17%)    (17%)    (19%)      (18%)     (9%)    (15%)
% of B                     33%      31%      31%      26%      24%      25%        26%      30%      28%
(SD)                      (8%)     (7%)     (7%)    (11%)     (8%)     (8%)      (10%)     (6%)     (6%)
% of C                     22%      22%      19%      18%      18%      17%        19%      20%      18%
(SD)                      (4%)     (6%)     (6%)     (8%)     (7%)     (6%)       (9%)     (5%)     (6%)
% of D                      9%      11%       9%       7%       9%       8%         9%      11%       9%
(SD)                      (5%)     (4%)     (5%)     (4%)     (4%)     (5%)       (6%)     (6%)     (6%)
% of F                     11%       6%       6%      12%       8%       5%        12%       9%       8%
(SD)                      (8%)     (3%)     (3%)     (9%)     (5%)     (4%)       (9%)     (5%)     (5%)
Discipline Infractions      12       17        9       15        9       11         15       10       12
(SD)                      (12)     (27)      (9)     (11)      (9)     (16)       (13)     (13)      (9)
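The group comparison above can be illustrated with a one-way ANOVA on per-teacher values. This sketch is illustrative only: the chapter does not name the specific test used, and the per-teacher data below are simulated from the 2008-2009 "% of A" group means and standard deviations in Table 25, with assumed (hypothetical) group sizes.

```python
import numpy as np
from scipy.stats import f_oneway

rng = np.random.default_rng(42)

# Simulated per-teacher "% of A" values for 2008-2009, drawn from the group
# means/SDs reported in Table 25; the group sizes are assumptions, not
# figures from the study.
high = rng.normal(35, 12, size=15)
medium = rng.normal(44, 19, size=20)
low = rng.normal(37, 15, size=12)

# One-way ANOVA across the three teacher performance groups.
f_stat, p_value = f_oneway(high, medium, low)
print(f"F = {f_stat:.2f}, p = {p_value:.3f}")
# A p-value at or above 0.05 would be consistent with the chapter's finding
# of no statistically significant differences among the groups.
```

With real per-teacher data, the same call would be made once per outcome (each grade band and the discipline-infraction counts).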

 

Research Question Three: How will changes in principal-teacher interactions affect the frequency and focus of teacher conversations with principals, students, and other teachers?

         Teacher and student survey data were used in a single-group pretest-midtest-posttest design to explore any effect that changes in principal-teacher interactions, coupled with changes in instructional practices, had on the frequency and focus of teacher conversations with principals, students, and other teachers during the pilot year and the year of full implementation. Responses were compared using chi-square tests: from spring 2007 (prior to the introduction of the set of principal-teacher interactions) to spring 2008 (end of the pilot year), and from spring 2008 (before the year of full implementation) to spring 2009 (after the year of full implementation). Several questions on the teacher and student surveys are conceptually related to research question three, the frequency and focus of teacher conversations. However, analysis using Cronbach's alpha, as reported in chapter three, indicated a lack of internal consistency among responses to conceptually similar questions, so each question on the teacher and student surveys was analyzed individually.

Although chi-square is an acceptable tool for comparing the distributions of the survey data in this study, two of its assumptions were occasionally violated. The first, violated by some survey data, is that no cell contains a frequency count of zero. The second, also violated by some survey data, is that no more than 20% of cells contain frequency counts below five. These are guidelines rather than strict rules, however, and researchers support chi-square analyses even when they are violated (Levin, 1999).
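As a concrete sketch of the test described above (using SciPy's `chi2_contingency`, which is not the study's stated toolkit but implements the same chi-square test of independence), the first teacher survey question in Table 26 can be analyzed and checked against both assumptions; the computed statistic matches the tabled value of 37.95:

```python
from scipy.stats import chi2_contingency

# Observed counts from Table 26, "How many times per day do you speak to
# another teacher?", spring 2007 (first column) vs. spring 2008 (second).
observed = [
    [5, 35],   # 8 or more times
    [53, 38],  # 2-4 or 5-7
    [13, 0],   # None or One  <- observed zero, violating the first guideline
]

chi2, p, df, expected = chi2_contingency(observed)
print(f"chi2 = {chi2:.2f}, df = {df}")  # chi2 = 37.95, df = 2
print("p < 0.001:", p < 0.001)          # consistent with the *** marker

# The second guideline concerns small counts: flag cells whose expected
# frequencies fall below five.
small_cells = int((expected < 5).sum())
print("cells with expected count < 5:", small_cells)  # 0 for this question
```

Note that the guideline violations involve observed counts; the expected counts for this question all exceed five, which is why the test remains defensible here.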

Teacher Survey Data

Results of teacher surveys are presented in Table 26 for spring 2007 (pretest), spring 2008 (posttest for pilot year/pretest for year of full implementation), and spring 2009 (posttest for year of full implementation).

Table 26

Teacher Survey of the Frequency and Focus of Teacher Conversations

Note. Values are frequency counts of teacher responses for spring 2007, spring 2008, and spring 2009, listed in that order. χ² values (df = 2) compare spring 2007 with spring 2008 (07/08) and spring 2008 with spring 2009 (08/09).

Teacher-Teacher Conversations

How many times per day do you speak to another teacher?
    8 or more times            5    35    33
    2-4 or 5-7                53    38    38
    None or One               13     0     0
    χ² 07/08 = 37.95***;  χ² 08/09 = 0.03

How often do you discuss curriculum issues with other teachers?
    Daily or Weekly           32    65    54
    Monthly                   25     4    14
    Never or Annually         14     4     3
    χ² 07/08 = 31.97***;  χ² 08/09 = 6.69*

How often do you discuss discipline issues with other teachers?
    Daily or Weekly           30    55    61
    Monthly                    8    14     7
    Never or Annually         33     4     3
    χ² 07/08 = 31.70***;  χ² 08/09 = 2.76

Principal-Teacher Conversations

How often do you discuss curriculum issues with a principal?
    Weekly or Daily           14    16     8
    Monthly                   21    24    24
    Never or Annually         36    33    39
    χ² 07/08 = 0.44;  χ² 08/09 = 3.14

How often do you discuss discipline issues with a principal?
    Weekly or Daily           25    28    22
    Monthly                   19    30    22
    Never or Annually         27    15    27
    χ² 07/08 = 6.04*;  χ² 08/09 = 5.35

How often do you discuss teaching strategies with a principal?
    Weekly or Daily            8    10     3
    Monthly                   19    20    27
    Never or Annually         44    43    41
    χ² 07/08 = 0.23;  χ² 08/09 = 4.83

Frequency & Length of Classroom Visits

How often did a principal visit your classroom last year?
    8 or more                  0    30    42
    5 to 7                     0    30    19
    2 to 4                    51    11    10
    None or Once              20     2     0
    χ² 07/08 = 100.53***;  χ² 08/09 = 6.49

What was the average length of the principal's visits to your classroom (not counting official observations)?
    30 to 60 minutes or 60 minutes or more        1     0     2
    10 to 30 minutes                              0    29    17
    I was not visited or Less than 10 minutes    70    44    52
    χ² 07/08 = 35.91***;  χ² 08/09 = 5.77

*p<0.05, **p<0.01, ***p<0.001

 

            The first three teacher survey questions in Table 26 relate specifically to the frequency and focus of teacher-teacher conversations. Data analyses indicated a significant difference in the distribution of teachers' responses to questions about conversations with other teachers before and after the pilot year. There were few significant differences in the distribution of teachers' responses to the same questions from the pilot year to the year of full implementation; the one exception was how often teachers discussed curriculum issues with other teachers. These results indicated that teachers perceived an increase in the frequency of teacher-teacher conversations related to curriculum and discipline, as well as overall, during the pilot year, and that this level of teacher-teacher conversation was sustained during the year of full implementation.

            The next three teacher survey questions in Table 26 relate specifically to the frequency and focus of principal-teacher conversations. With the exception of teachers' responses to how often they discussed discipline issues with a principal, which differed significantly from before to after the pilot year, data analyses indicated no significant differences in the distributions of teachers' responses to questions concerning the frequency and focus of principal-teacher conversations during either the pilot year or the year of full implementation. These results indicated that teachers generally did not perceive a change in the frequency of principal-teacher conversations related to curriculum, discipline, or teaching strategies during the pilot year or the year of full implementation.

            The last two teacher survey questions in Table 26 relate to the frequency and length of principal classroom snapshots. Data analyses indicated a significant difference in the distribution of teachers' responses to questions about the frequency and length of principal classroom visits before and after the pilot year, but no significant differences from the pilot year to the year of full implementation. These results indicated that teachers perceived an increase in the frequency and duration of principal classroom visits during the pilot year and that this change was sustained during the year of full implementation.

Student Survey Data

Results of student surveys are presented in Table 27 for spring 2007 (pretest), spring 2008 (posttest for pilot year/pretest for year of full implementation), and spring 2009 (posttest for year of full implementation).

Table 27

Student Survey of the Frequency and Focus of Teacher-Student Conversations

Note. Values are frequency counts of student responses for spring 2007, spring 2008, and spring 2009, listed in that order. χ² values (df = 2) compare spring 2007 with spring 2008 (07/08) and spring 2008 with spring 2009 (08/09).

Frequency of Teacher-Student Conversations

How many times per day do you speak to a teacher?
    8 or more times          124   128   100
    2-4 or 5-7                82    81   108
    None or One               10     7     8
    χ² 07/08 = 0.60;  χ² 08/09 = 7.36*

Focus of Teacher-Student Conversations

How often do you discuss personal issues with teachers?
    Daily or Weekly           47    54    48
    Monthly                   27    33    20
    Annually or Never        142   129   148
    χ² 07/08 = 1.71;  χ² 08/09 = 4.84

How often do you discuss discipline issues with your teachers?
    Daily or Weekly           34    36    30
    Monthly                    9    20    13
    Annually or Never        173   160   173
    χ² 07/08 = 4.74;  χ² 08/09 = 2.54

How often do you discuss learning strategies with a teacher (how to study, test-taking strategies, learning styles)?
    Daily or Weekly           57    60    49
    Monthly                   55    57    42
    Annually or Never        104    99   125
    χ² 07/08 = 0.24;  χ² 08/09 = 6.40*

How often does a teacher in this building motivate and inspire you?
    Daily                     49    58    63
    Weekly or Monthly         89    96    98
    Annually or Never         78    62    55
    χ² 07/08 = 2.85;  χ² 08/09 = 0.65

Do your teachers discuss class performance with you/the class (i.e., class average, test averages, etc.)?
    Daily or Weekly          120   131   112
    Monthly                   71    70    76
    Yearly or Never           25    15    28
    χ² 07/08 = 2.99;  χ² 08/09 = 5.66

*p<0.05, **p<0.01, ***p<0.001

 The first student survey question in Table 27 relates specifically to the daily frequency of teacher-student conversations. Data analyses indicated that students did not perceive a change in the frequency of teacher-student conversations during the pilot year. Analyses of student responses indicated that students did perceive a decrease in the daily frequency of teacher-student conversations, significant at the p<0.05 level, during the year of full implementation.

            The next five student survey questions in Table 27 relate to the focus of teacher-student conversations. Data analyses indicated that, according to students, there were essentially no perceived differences in the frequency and focus of teacher-student conversations related to personal issues, discipline issues, learning strategies, motivation, or class performance during the pilot year or the year of full implementation. The one exception was a significant difference (p<0.05) in the distribution of student responses to how often students discuss learning strategies with their teachers during the year of full implementation, indicating a slight decrease.

            A slight change in the frequency of teacher-student conversations was indicated in the year of full implementation: fewer students reported talking to a teacher eight or more times a day, while more reported talking to a teacher two to seven times per day. This difference was likely trivial, especially since responses to the other survey questions failed to indicate significant shifts in the focus of teacher-student conversations. It would be logical to expect that, because the data clearly indicated an improvement in the quality of teacher instructional practices, the frequency and focus of teacher-student conversations would also have improved; however, no such improvement was indicated.



 This research project is sponsored by the University of Louisville, the Kenton County School District, and the head researchers (Kim Banta and Brennon Sapp).
For problems or questions regarding this Web site contact bsapp@bsapp.com or kim.banta@kenton.kyschools.us
Last updated: 07/09/10.