Somehow, despite all the conversations around education in the US, the education system still sucks. I went to one of the highest-funded (by amount spent per child) public schools in my state, and as far as I am aware it was far behind in terms of curriculum strength compared to what my parents were taught in the Soviet Union at the same age.
I mean, we didn't read a classic American author until 6th or 7th grade! And if I recall correctly, there were still M&M's in math class in grade 4!
The US may have an education problem, but somehow the Soviet Union and China did fine years ago without all the ed-tech snake oil.
Citation?
Education is a complex matter. There are many people with OPINIONS on what the best way to teach is. These ideas are in conflict, and only rarely does anyone study what really works. (Rarely compared to the number of opinions, that is - there could be a lot of studies that nobody knows about when they state their opinion.)
Humans have a limited lifetime: you cannot teach all possible useful knowledge/skills in a lifetime. I limited this to useful; there are a lot of useless things that are fun to know anyway, though those who are interested need time to learn them for fun. I didn't define useful either: is Music/French/Algebra/Sports... useful? (I can make the argument either way for any subject.)
Why is reading a classic American author important? Reading is important in an abstract sense, but if you can understand written instructions it doesn't matter what you happened to read to get that skill.
Likewise, what is wrong with using M&Ms for learning math? A concrete example helps learning. (To be clear, this is an opinion I was ranting against in the first paragraph - I don't know if I agree with it, but I understand it enough to repeat it.)
One constant in US popular culture is that our education system sucks compared to X. We have done well over the years despite that (or maybe because of it?).
> I went to one of the highest-funded (by amount spent per child) public schools in my state, and as far as I am aware it was far behind in terms of curriculum strength compared to what my parents were taught in the Soviet Union at the same age.
This is because just pumping money into failing schools does not magically turn them around. There is little correlation between per capita secondary education spending and student outcomes.
Funding works strangely in US public education. Schools in any given district seem to have a "hull speed" when it comes to money.
Once a certain amount of money is actually reaching the classroom, adding more dollars will simply see most of the additional funds absorbed by hiring more administrators, prestige projects like sports facilities, "classroom technology" projects, etc.
To detect this limit, simply check the level at which teachers begin paying for school supplies for their students out of their own pockets, then back it off by about 10%.
In the Northeast US, you'll generally see the best performing districts have a lower amount spent per child than the underperforming districts.
The underperforming districts will have higher property taxes (as a result of the higher education cost). This generally leads to parents seeking to move to a different school district for financial and educational reasons.
In education, at least, more money does not equate to better students; instead, it equates to more mismanagement.
I question whether students are ready to read classic authors before middle school at the earliest. Perhaps one can read Huckleberry Finn as an adventure story before that, but is that more than surface familiarity? Don't know about M&Ms, though.
Anyway, as I say again and again: there isn't one US education system. Within the District of Columbia, a populous but geographically small area, there are, practically if not legally speaking, at least six or seven: magnet public schools; prosperous public schools; shaky-to-desperate public schools; parochial schools; private schools; charter schools. And within the parochial, private, and charter school worlds there are considerable differences.
I see a lot of Russians and Chinese emigrate to America to bring up their children. I don't see any sending their children back to get the "superior" education there.
To be honest, I like that this article tries to perform simple analyses, but find their rationale pretty confusing.
This kind of data is commonly modeled using item response theory (IRT). I suspect that even in data generated by a unidimensional IRT model (which they are arguing against), you might get the results they report, depending on the level of measurement error in the model.
Measurement error is the key here, but it is not considered in the article. That, plus setting an unjustified margin of 20% around the average, is very strange. An analogous situation would be criticizing a simple regression by looking at how many points fall X units above/below the fitted line, without explaining your choice of X.
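For what it's worth, the claim that even a unidimensional model can produce the article's pattern is easy to sketch. Below is a toy simulation of a 1-PL (Rasch) IRT model with my own made-up parameters (not the article's data or code): every student has a single latent ability, yet students with identical total scores still answer different items correctly.

```python
import math
import random

random.seed(0)

def rasch_response(theta, b):
    """One Bernoulli response under a 1-PL (Rasch) model."""
    p = 1.0 / (1.0 + math.exp(-(theta - b)))
    return 1 if random.random() < p else 0

# Hypothetical setup: 1000 students, 20 items, ONE ability dimension.
difficulties = [random.gauss(0, 1) for _ in range(20)]
students = []
for _ in range(1000):
    theta = random.gauss(0, 1)
    students.append([rasch_response(theta, b) for b in difficulties])

# Group students by total score, then compare item-level patterns
# (Hamming distance) within each group.
by_score = {}
for pattern in students:
    by_score.setdefault(sum(pattern), []).append(pattern)

for score, group in sorted(by_score.items()):
    if len(group) < 2:
        continue
    a, b = group[0], group[1]
    diff = sum(x != y for x, y in zip(a, b))
    print(f"score {score:2d}: first two patterns differ on {diff} of 20 items")
```

Mid-range scores show plenty of disagreement between same-score students, which is exactly why measurement error needs to be ruled out before reading multidimensionality into the data.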
Totally agree that this is not a fully rigorous analysis, and we do want to dig deeper and try to extend some IRT models to these types of questions.
The main point of this post is to highlight that the most common metric of student performance may not be that useful. Most of the time, students will get their score, the average score, and sometimes a standard deviation as well. As jimhefferon mentioned in a response to a different comment, the conventional wisdom is that two students with the same grade know roughly the same stuff, and that seems not to be true.
We're hoping to build some tools here to help instructors give students a better experience by helping them cater to the different groups that are present.
disclaimer: I'm one of the founders of Gradescope.
You brought a smile to my face. I came here to post this same point.
The piece is making a basic mistake in measurement: assuming that all variability is meaningful variability.
There are ways of making the argument they're trying to make, but they're not doing that.
Also, sometimes a single overall score is useful. A better analogy than the cockpit analogy they use is clothing sizing. Yes, tailored shirts, based on detailed measurements of all your body parts, fit awesome, but for many people, small, medium, large, x-large, and so forth suffice.
I think there's a lesson here about reinventing the wheel.
I appreciate the goals of the company and wish them the best, but they need a psychometrician or assessment psychologist on board.
Does this article say anything more profound than, "If you roll 10 dice, you'll expect a score of 35, but any pair of rolls that sum to 35 is unlikely to be similar"?
All the worst students will be very similar and all the best students will be very similar because the number of available states is low. Average students are all unique in their average-ness.
Am I missing some subtle statistical understanding that the toy example doesn't capture?
I think the article's contention is that on-the-ground teachers expect that two people coming out of a high school Algebra II with C+'s are similar. (Certainly that is my working hypothesis.) The article argues that it ain't so.
>Out of 4,063 pilots, not a single one fell within the average 30 percent on all 10 dimensions.
I wondered about a very similar problem some weeks ago. I was bothered by the terms "ectomorph" and "mesomorph" because they seemed useless once you considered height: the vast majority of "ectomorphs" seemed to be taller than average, while the vast majority of "mesomorphs" seemed to be of average height, so there's no point to these words. And so I wondered how shoulder width would change given height (which seems to have some kind of "diminishing returns"), and how the average measures would relate to actual average build. I mean, is the "average guy" really the guy with the average height and average shoulders? Because it's not as if the scale had just changed, like doubling the size of a cube; there seems to be some deformation going on as well.
Anyway, I didn't get past the wondering phase at the time. But I think it's too important a problem to be casually thrown in as part of a pitch. I don't see an immediate reason why the average tuple should be the tuple of all averages, because some of the variables might be "dislocated" and thus not coincide with the averages of other variables. Some guy might be very close to average height yet still somewhere in the left tail when it comes to body mass, shoulder width, or any other measure. So there might be a typical student, but I don't think this is the way to find him.
As you say, they definitely aren't uncorrelated dimensions - otherwise we would have seen ~50 pilots within one stdev for all 10 dimensions. So this simplified metaphor really isn't telling us anything about how statistics apply to students.
There is an analogy to clustering (an unsupervised learning technique) here.
Take the simple case of 2 dimensions (each observation is plotted in 2D space) with possible values of 0-10. Let's say the extreme (far from average) space is within 5% of the border. The total extreme area is (10x10)-(9x9) = 19 (i.e. 19%). Now add a 3rd dimension. The extreme "volume" in 3d space is now (10x10x10)-(9x9x9) = 271 (i.e. 27%). You can see where this is trending. Add enough dimensions, and every observation is now "extreme." They become so far apart that each observation almost deserves its own cluster, and you lose any idea of similarity.
Back to this particular article: when you _add_ (or average) all of the dimensions -- like you do on an exam -- suddenly they are close again.
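The area/volume arithmetic above generalizes to 1 - 0.9^n for n dimensions (the inner "non-extreme" cube keeps 90% of each side). A quick check, reproducing the 19% and 27% figures:

```python
# "Extreme" = within 5% of the border on any axis, i.e. outside the
# inner cube whose side is 9 out of 10 (90% of the full side).
for n in (2, 3, 5, 10, 20, 50):
    extreme = 1 - 0.9 ** n
    print(f"{n:2d} dimensions: {extreme:6.1%} of the space is 'extreme'")
```

By 50 dimensions, over 99% of the space is "extreme", which is the curse-of-dimensionality point being made: with enough dimensions, essentially every observation lands near a border somewhere.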
Here's another look: if you have variables X_1, ..., X_n that are independent and normally distributed, and you want someone to be within 1 standard deviation of the mean in EACH dimension, the probability of that happening is about 0.68^n, which becomes really small for even a moderate n.
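Concretely (using 0.6827 for the one-dimensional probability):

```python
p = 0.6827  # P(|X - mu| < sigma) for a single normal variable
for n in (1, 2, 5, 10):
    print(f"n={n:2d}: P(all n dimensions within 1 sd) = {p ** n:.4f}")
```

At n = 10 this is already down to about 2%, so even under full independence only a small minority of people would be "average" on every dimension at once.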
According to the article, the average person doesn't exist, either. I don't know many people that are 13% fluent in Mandarin, 13% fluent in English, 9% fluent in Hindi... At the same time, having ~2 hands and ~10 fingers seems about right. Some metrics work with averages, some don't.
This question of "what skills are students missing?" reminds me of the new teaching methods they were trying out as I started high school. The new teaching program centered around objectives. The idea was that each objective was a skill that the student needed to learn, but the upshot was that you had to score more than 70% on every single quiz to pass the class, and that you could retake every quiz you failed, repeatedly.
The implementation varied between classes - in my World History class, there were a large number of objectives, and each objective was met by a small quiz that tested ~one skill. (There were a lot of retaken quizzes in that class.) In Biology, there were about 10 objectives for the entire semester, so you could still pass while missing a few small skills, as long as those missing skills were spread out among different units.
My high school used that "objectives" system less and less as I moved up the grades - I assume that most teachers got tired of it pretty quickly and just decided to make their usual teaching material "look like objectives" rather than rebuild their curriculum in later years.
I don't like the way this headline is written to match the article. All they showed is that students with similar average scores over multiple questions differed in their scores on individual questions. That is kind of obvious.
This makes me wonder: what is the "best" way to teach computer science to students? Universities are not trade schools (nor should they be), but it seems apparent that CS graduates in general are unprepared when entering the workforce. The other extreme (bootcamps) seems to produce graduates who are more "industry ready", but only at a superficial level; these graduates seem to lack rigor/theory. I wonder if there is a more optimal path for training students.
This is a topic I've spent some time researching and talking with professional educators about. The general consensus is that we have to accept that the skills needed for the majority of professional work (the CRUD apps, the web dev, the infrastructure, etc.) are almost completely disparate from the topics under the umbrella of "computer science" (which is really more a subset of mathematics). The sooner we treat the skill of programming more like writing (you need to be able to write to do all manner of jobs, but very few people go to school for it), the sooner we'll produce students of all disciplines who will excel in the jobs that are most numerous.
Jobs that actually need a strong foundation in CS theory are very rare, and will continue to be. The fantasy that you need a computer scientist to manage your CRUD app results in many people being incredibly overqualified for their positions, and it is, in my opinion, one of the major reasons there's so much mental illness in the technology space.
Almost certainly an apprenticeship of some sort. You wouldn't hire a mechanical engineer and expect him to do a mechanic's job straight out of university, which is in essence what you're asking people with CS degrees to do.
A CS degree with at least 2 summer internships building real software ticks both boxes.
Internships and rigorous senior projects can help prepare students for real jobs. Having senior projects where you need to spend time documenting your work, scheduling tasks, etc can greatly help students get ready for when they start working at places where this is required from a management standpoint. My senior design project involved scheduling our project milestones, weekly meetings with the professor, daily meetings among group members, documenting the features, and even putting in purchase requests and justifications for hardware. That project plus my internships (once as a help desk tech, once in an intern group project, once working on a contract, and once working on an internal project) made me feel pretty confident to start my actual job and jump into the whole process again.
The "dream" scenario would be to leave training and mentoring to the actual companies. I understand a 5-person startup does not have the bandwidth to teach, but larger companies do - yet most reject smart graduates because they don't happen to know Framework X.
I think it is up to the university to teach the students more theoretical topics and up to the students to either learn more technical topics on their own, or learn them quickly during the start of entering the workplace.
Let's not confuse "programming" with "computer science." I am a pretty good programmer. I am in no way a computer scientist. Asking what the "best" way to teach computer science is is akin to asking what the "best" way to teach biology or physics or history is.
A Computer Science degree does not, and should not, be the sole qualifier for whether or not you want to be a programmer.
Many strong graduates wind up in roles at major companies - Google, Facebook, Amazon, Microsoft, etc - where they are working with teams to implement things that do require research, rigor, etc. Their value as a contributor is wrapped up in theory, the code is just an implementation detail.
Bootcampers, meanwhile, often find themselves at younger companies that are more focused on shipping features and stamping out bugs - areas where the ability to write and ship code quickly is a priority. The differences between a b-tree and a red-black tree will be moot to them unless they're interviewing; going beyond binary search, hashmap, and bloom filter sees diminishing returns on investment in the near term for most small companies.
Working with real code in class. Finding out what it takes to jump into a repository and start making incremental changes based on need. This could be an apprenticeship sort of thing, or the professor finding some good OSS to have students look at. This should be done in tandem with theory.
If my memory and understanding are correct, the way that Mathematics is graded at Cambridge is interesting here.
Questions are scored alpha for a completely correct solution, beta if the examinee demonstrated that they knew what they were doing but maybe made some small mistake, and gamma for a reasonable effort. The bare minimum pass mark is one alpha.
That sounds very interesting, but I'm curious about the pass mark criteria. Is there some larger number of betas and gammas that can also pass? Otherwise, the beta and gamma scoring gives some nice measures for teachers/students to use, but if passing ultimately relies only on alphas, it's a lot like any other math scoring.
It becomes a little like companies saying they value x & y, but take action only aligned to z.
When I read that the data was collected from 1500 CS finals, my immediate guess was that the class was CS61A.
---
I suspect that the shape of the distribution has to depend on the subjectiveness of the test and of the grading: whether the questions are ones where you either know it or you don't, and how much partial credit graders are willing to give.
Given a large enough sample size, I'm sure you'll find such a student. Additionally, you will have plenty of students who beat the average and plenty who fall below it. Performance below or above average matters because student performance is ranked, while cockpit dimensions are not.