Teaching Software Engineering
I firmly believe that in the near future, the ability to read and write code will be viewed as a component of basic literacy. It will be hard to believe anyone could get by without it, much like reading and writing your spoken language is today, even though we know that just a few hundred years ago that was a rare skill reserved for the upper classes. If we are to achieve universal computer literacy, we will need new ways to teach, beyond the rigorous academic approaches of computer science departments.
A lot of good work is already happening in this direction: STEM education is innovating rapidly, building fun ways to code all the way down to kindergarten. However, I'm interested in teaching a group between the children and the university students: tradespeople. We need an educated workforce where anyone working with a computer can write, modify, or fix a script that makes their job easier. And just as not every literate person needs to be a poet, not everyone needs to know the mathematical underpinnings of computer science like assembly, compilers, and algorithms in order to be effective software developers.
I argued in an earlier post that most professional software developers are much more writers than they are mathematicians, scientists, or even engineers. Certainly there is a critical subset who do serious computer science, but as software expands to higher and higher levels, it is the writing of tools, user interfaces, and automation that is needed ubiquitously. Even if AI automates much of this code writing, users will still need enough fluency to proofread, edit, and fix the code produced.
Currently our education system from kindergarten through university is based on in-person learning with student-to-teacher ratios between about 10 and 100. The difficulty we face in trying to bring software literacy to every student is that most of our teachers are currently illiterate themselves, and there is such demand for software engineers that there aren't nearly enough of them available to make a dent in the huge field of education. Massive open online courses (MOOCs) have tried to fill the gap to some extent, along with self-study courses where recorded lectures and notes are available online. However, these courses still face the challenges of motivating students, as well as answering their questions and grading their work at scale.
Taking a step back, the purpose of education is to create a skilled workforce to keep our economy vibrant. We use accreditation to validate schools are teaching what is needed, and tests and grades to give employers a hint about how someone will perform as an employee. But how well do tests about math and algorithms represent what employers actually need? I would argue the most important skills of a software engineer are working well with others, giving and taking critical feedback, quickly reading and understanding code and documentation, accurately estimating the length of tasks, and good time management. But how do we teach and grade soft skills such as these?
I believe the answer lies in code review. Code review is the nearly universal process in the tech industry of ensuring software quality through independent double checking, but it is also the primary method of continuing education in software companies. Having no formal education in computer science myself, I learned virtually everything I know about software engineering through a combination of trial-and-error and code review. Unlike in publishing, it is an egalitarian process, as everyone is both author and editor; both giver and receiver of constructive criticism. And it is not just about learning to write better code, but also to give and receive feedback better - keeping the process collaborative and motivating for all.
Imagine a project-based online class with weekly instruction and assignments. The instruction portion is like a MOOC: recorded lectures, course notes, etc. However, the majority of student time is dedicated to the weekly coding project, in some high-level language like JavaScript or Python. Each student gets their code reviewed by two random classmates, and likewise each student reviews two of their classmates' projects, but each project comes in two flavors such that no one reviews the same problem they are working on themselves. The reviews are not one-and-done, but the traditional back-and-forth over the days of the week to improve the code quality.
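One way the pairing could work is sketched below. This is only an illustration under my own assumptions: the class size is even, the two flavors are simply called "A" and "B", and the function name `assign_reviews` is made up for this example. Students are shuffled, split in half, and each one reviews two students from the other half, so every student also ends up with exactly two reviewers.

```python
import random


def assign_reviews(students, seed=None):
    """Split a class into two project flavors and pair up reviewers.

    Returns (flavor, reviews) where flavor maps student -> "A" or "B" and
    reviews maps reviewer -> [two reviewees working on the other flavor].
    Assumption for this sketch: an even class of at least four students.
    """
    if len(students) < 4 or len(students) % 2:
        raise ValueError("this sketch assumes an even class of at least four")

    rng = random.Random(seed)
    shuffled = list(students)
    rng.shuffle(shuffled)  # the shuffle is what makes the pairings random

    half = len(shuffled) // 2
    flavor_a, flavor_b = shuffled[:half], shuffled[half:]
    flavor = {s: "A" for s in flavor_a}
    flavor.update({s: "B" for s in flavor_b})

    reviews = {}
    for i in range(half):
        nxt = (i + 1) % half  # neighbor in the other half, wrapping around
        reviews[flavor_a[i]] = [flavor_b[i], flavor_b[nxt]]
        reviews[flavor_b[i]] = [flavor_a[i], flavor_a[nxt]]
    return flavor, reviews
```

Calling `assign_reviews(roster, seed=week_number)` once per week would give fresh pairings each week. A real system would also need to avoid repeating a pairing across weeks, which this sketch does not attempt.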
Instead of tests or grading, at the end of each week the students complete a simple binary ranking: of the two students they reviewed, which did the better job? This is a fundamentally qualitative judgement and each student will have a unique perspective, but they will be encouraged to think in terms of "would I want to be on this person's team?" Productivity, time management, communication, response to feedback, and politeness all come into play. Likewise, each student ranks their two reviewers in the same way, but productivity is replaced by the helpfulness of the review process.
The individual rankings stay secret, but they are combined to form an overall class ranking which determines each student's percentile grade. This ranking gets more accurate as more data comes in, and crucially, the ranks from different kinds of projects can be combined equally. This means that while the program can be split into classes on different coding subjects, only a single overall percentile is produced, which a student can track over time. The algorithm that computes the ranking can also choose the best way to pair up students in order to improve the ranking accuracy.
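I have not settled on a specific ranking algorithm, but as a rough sketch, the pairwise "who did better" votes could be combined with something like a Bradley-Terry model, a standard technique for turning pairwise comparisons into a single scale. Everything here is an assumption for illustration: the function names, the `prior` pseudo-count (a virtual half win and half loss against a phantom student of strength 1, which keeps students with all wins or all losses from running off to infinity or zero), and the iteration count.

```python
from collections import defaultdict


def bradley_terry(comparisons, iterations=200, prior=0.5):
    """Estimate a strength per student from (winner, loser) pairs.

    Uses the usual iterative (minorization-maximization) update, plus a
    small prior against a phantom student of fixed strength 1.0.
    """
    wins = defaultdict(float)
    opponents = defaultdict(list)
    for winner, loser in comparisons:
        wins[winner] += 1.0
        opponents[winner].append(loser)
        opponents[loser].append(winner)

    strength = {s: 1.0 for s in opponents}
    for _ in range(iterations):
        updated = {}
        for s, opps in opponents.items():
            denom = sum(1.0 / (strength[s] + strength[o]) for o in opps)
            denom += 2.0 * prior / (strength[s] + 1.0)  # phantom games
            updated[s] = (wins[s] + prior) / denom
        strength = updated
    return strength


def percentiles(strength):
    """Convert strengths to a 0-100 percentile by rank (lowest = 0)."""
    ordered = sorted(strength, key=strength.get)
    n = len(ordered)
    return {
        s: (100.0 * rank / (n - 1) if n > 1 else 100.0)
        for rank, s in enumerate(ordered)
    }
```

Because the model only sees who beat whom, votes from different projects and different classes can all be fed into the same list of comparisons, which is what lets a single overall percentile come out.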
In addition to the secret binary ranks, each student also gives free-form feedback to each reviewer and reviewee in the form of one thing they did well and one thing they could improve. They answer the same questions as a self-review, which they can then compare to the four pieces of feedback they receive each week. The feedback is nearly anonymous, since the class is large, the students have not met in person, and they will only be paired with a given student once. This exposes them to a large number of communication styles and encourages them to get along with the greatest number of people.
The assigned projects are not intended to be particularly difficult or puzzling - instead they are meant to represent common software engineering tasks. As such, most will be to add tests and functionality to an existing code base, and this inherited code base will be one of the previous projects reviewed by this student. This provides an incentive to be thorough in code review, just like in a company, since good code is easier to extend. Likewise, the learning objective is around writing clear, simple, easy-to-maintain code, rather than on complex algorithms or data structures.
To induct new students into good review practices and to set expectations, the introductory class projects will be code-reviewed entirely by the final class of the graduating seniors (not to imply this is a four-year program; a year or less is probably adequate). This also gives the graduates valuable mentoring experience by helping new students.
Getting the work done reliably on time is still the most important aspect to an employer, and as such should show prominently in the final grade. Having each project last a week allows students to manage their own time and take days off as needed (they will generally have several classes and thus multiple projects during each week). However, failing to complete a project on time has a serious consequence: the final percentile ranking is adjusted so that students are ordered first by their number of missed assignments. Even the lowest-scoring student with fewer missed assignments ranks above every student who has missed more. Some number of projects can be missed for free by giving advance notice, and a smaller number after the fact, representing vacation and sick leave, respectively.
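As a sketch of how that adjustment might be computed, the final ordering can be a two-level sort: first by penalized misses, then by the review-derived score. The record fields, function names, and the allowances of three excused misses with notice and one without are all hypothetical numbers chosen for illustration, not part of the proposal.

```python
from dataclasses import dataclass


@dataclass
class Record:
    name: str
    score: float                # review-derived score, higher is better
    missed_with_notice: int     # "vacation" style misses
    missed_without_notice: int  # "sick leave" style misses


def penalized_misses(record, free_with_notice=3, free_without_notice=1):
    """Count only the misses beyond the (hypothetical) free allowances."""
    return (max(0, record.missed_with_notice - free_with_notice)
            + max(0, record.missed_without_notice - free_without_notice))


def final_percentiles(records):
    """Rank by fewest penalized misses first, then by score.

    Anyone with fewer penalized misses lands above everyone with more,
    regardless of score.
    """
    # Sort worst-first: most penalized misses, then lowest score.
    ordered = sorted(records, key=lambda r: (-penalized_misses(r), r.score))
    n = len(ordered)
    return {
        r.name: (100.0 * rank / (n - 1) if n > 1 else 100.0)
        for rank, r in enumerate(ordered)
    }
```

With this ordering, a student with an excellent score but two unexcused misses still lands below a weak student who delivered everything on time.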
The idea is this educational program - call it trade school or an advanced coding boot camp - could scale to a very large student body as it requires no facilities or traditional faculty; only curriculum designers and data scientists to monitor and improve the program. It produces a much simpler and more meaningful grade - a percentile - than a GPA. Students and employers should be trained that even a 30th percentile is a decent grade, particularly when coding is only one part of a given job. It still demonstrates the student completed all their work on time and can handle a wide variety of communication and critique styles.
So, if I think this is such a great idea, why don't I go do it? As a crusher of dreams, it's only fair I crush my own as well. What drives my hesitation is that the hard part of this endeavor is a business problem, and business is neither my forte nor my interest. Even if this system is much more efficient and therefore cheaper, education always has a significant cost to students in terms of time. That is worthwhile only if the degree has value, which employers decide. How long would it take to convince them? This is why innovation in education is hard - reputation takes time to build.
Likewise, the startup cost is significant - developing the curriculum and course materials all needs to happen before the first student enters the program. Perhaps this could be mitigated by starting inside of an existing organization that already has online course materials like Khan Academy or Stanford University. But here lies the next problem: everyone has ideas, and people like their own ideas a lot better than other people's. Call it "not invented here", but it's just human nature. I'm not excited to fight that fight, so I'm publishing it here on the chance it inspires someone in the right position with the right expertise.
Software literacy can no longer be a skill restricted to a subset of the university-educated. Some say the rise of Artificial Intelligence (AI) that can write code from prompts means we won't need software engineers anymore. This is preposterous, and akin to saying our spelling and grammar correcting AI means we no longer need to be able to read and write. Students should use all available tools, and in so doing, learn to code review the AI as well. When anyone can ask an AI to write a script for them, and then confidently clean it up and verify that it works correctly, we will have a vastly more productive society.
It's true that AI won't replace software engineers. But it does democratize a subset of "programming" by allowing individuals who are untrained in software development to instruct computers to perform (some) tasks that could previously only be accomplished through programming, using nothing more than their native spoken language. It's an interesting development that we as software developers should keep an eye on.
Sort of, but nothing is perfect (AI or not), so any output will still require editing and verification. I don't think anything good will come of someone who can't read code asking an AI to write code for them.
You are 100% correct. Unverified AI code generation is a looooong ways off. But it's interesting that non-programmers can ask AI sites to do things now that previously required some amount of programming by a human to accomplish, and it'll do them in a (sort of) non-code way.
To get very meta, I asked ChatGPT to give me examples of what it can do that previously required programming knowledge. Here is its answer:
Certainly! In the past, tasks that required programming knowledge were often inaccessible to those without coding skills. However, with the advent of AI and natural language processing, you can now accomplish various tasks without writing a single line of code. Here are some examples of what you can ask me to do:
1. Automate Repetitive Tasks:
Previously: Writing custom scripts to automate repetitive tasks.
Now: Ask me to perform repetitive actions, such as renaming files, sorting data, or sending scheduled emails.
2. Data Extraction and Analysis:
Previously: Writing complex code to scrape data from websites or analyze large datasets.
Now: Request specific information from web pages, summarize articles, or analyze data—all through natural language queries.
3. Generate Content:
Previously: Crafting code to create dynamic content, such as personalized emails or reports.
Now: Ask me to generate poems, stories, code snippets, or even song lyrics based on your input.
4. Create Custom Reports:
Previously: Developing custom dashboards or reports using programming languages.
Now: Describe the data you need, and I can create a summary or report for you.
5. Automate Workflows:
Previously: Integrating APIs and writing code to connect different tools.
Now: Ask me to connect services, automate workflows, or perform specific actions across platforms.
Remember, I’m here to assist you with a wide range of tasks, so feel free to ask anything! 🌟
I really like the article except for the underlying premise! I think your proposal could work in lots of places outside programming. However, I don't think that everyone will need to learn coding. I used to, but with technology advancing so fast, coding might become the domain of highly trained specialists only. Like how developers use hardware but (generally) have no real idea how the hardware was designed, "programming" in the traditional sense might be a task that only highly trained individuals do while everyone else just directs an AI for what they want done.
I question your claim that "in the near future, the ability to read and write code will be viewed as a component of basic literacy." I use a computer, but I don't understand how most of the hardware works. I barely understand a fraction of the full software stack! I think AI could very well push the required understanding even higher - into the conceptual "You must be good at being clear what you want" level rather than the "you must be good at writing clear code" level.
This gets to our debate about whether you believe a machine can be 10x smarter than a person. If that's true, then its code will be 10x better than yours and there would be no need to check it or understand it, just as I use library code every single day without checking it, even though its authors aren't even 10x better at coding than me.