MAT 5335 Project 9
due Friday 31 March

Part 1 ANOVA

Consider the 2008 data on airline on-time performance, http://stat-computing.org/dataexpo/2009/
You can download the data yourself; it is also in the virtual desktop.

1. We wish to test the null hypothesis that the mean departure delays for the different carriers are all the same, where Origin="ORD", against the alternative hypothesis that at least one of the means differs.
   a. First make a box plot of UniqueCarriers from ORD. Note that not all airlines fly out of ORD. Label your axes, and add some color so that we can see the mean and the median.
   b. Conduct an ANOVA F-test at the 5% significance level.
   c. State your conclusion about the result.

2. a. Determine the mean departure delays for the different days of the week.
   b. Determine the mean departure delays for the different airline carriers.
   c. Is there a relationship between the departure delays and the airline carrier or day of the week? Be sure to treat the days of the week as factors.

Part 2

Use the Salaries table and regexp to answer the following questions.
Note: In general, MySQL is case insensitive and SQLite is case sensitive; hence, collate nocase may be helpful.

1. In salaries, select Names strictly having double consecutive vowels, i.e., at least one occurrence of aa, ee, ii, oo, uu, in upper case, lower case, or any combination, and no occurrence of aaa, eee, etc.
2. Return Names having an occurrence of Q not followed by U or O, yet the string can begin or end with q.
3. Return Names with a middle initial, for example, Gerald R. Aase.
4. (optional) Return Names with no middle initial.
5. Return Names with a Jr., or Jr, at the end of the string.
6. Return Names with third and fourth entry containing NA.
7. Return Name and Position having Assistant or Associate Professor but not just Professor.
8. What do the following expressions return? Explain and provide some examples.
      b[^b]*b
      ^[^ZX]
9. Return Names with first name beginning with G, next character an e or a, followed by an r. For example, Gerald R. Aase.
10. Consider now the itcont file. See project 6 problem 10. The dates are numeric, numbers of length 7 or 8. For example, 3012015 and 10012015. Using regexp, find the amount donated to each candidate, from (including) Oct 19 to Nov 7. This problem might be a bit messy, but it can be done using only regexp.
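
Regarding the Part 2 note on case sensitivity: a minimal sketch, assuming you are working in SQLite through Python's sqlite3 module, of how the REGEXP operator can be made available and how case affects a match. The table name demo, the helper names_matching, and every row except the Gerald R. Aase example are hypothetical and are not part of the Salaries data. The inline (?i) flag is one way to ignore case with a Python-backed regexp function; it is shown as an illustration, not as the required approach.

```python
import re
import sqlite3


def regexp(pattern, value):
    # SQLite evaluates "X REGEXP Y" as regexp(Y, X): the pattern arrives first.
    if value is None:
        return 0
    return int(re.search(pattern, value) is not None)


conn = sqlite3.connect(":memory:")
# Register the Python function so the REGEXP operator works in queries.
conn.create_function("REGEXP", 2, regexp)

# Hypothetical demo table; only the first name comes from the assignment text.
conn.execute("CREATE TABLE demo (Name TEXT)")
conn.executemany(
    "INSERT INTO demo VALUES (?)",
    [("Gerald R. Aase",), ("quentin example",), ("Bob Smith",)],
)


def names_matching(pattern):
    # Hypothetical helper: return the Names that match the given pattern.
    rows = conn.execute(
        "SELECT Name FROM demo WHERE Name REGEXP ? ORDER BY Name", (pattern,)
    )
    return [row[0] for row in rows]


# Python's re module is case sensitive by default, so "^g" misses "Gerald".
case_sensitive = names_matching("^g")
# The inline (?i) flag makes the same pattern ignore case.
case_insensitive = names_matching("(?i)^g")

print(case_sensitive)
print(case_insensitive)
```

The same patterns can be tried directly with re.search before they are placed in a query, which makes it easier to check edge cases one name at a time.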