I love visiting Tokyo and have been lucky enough to visit dozens of times over many years. The consumer electronics industry has certainly had ups and downs recently, but a constant has been the leading edge consumer and business adoption of new technologies. From PCs in the workplace to broadband at home and smartphones (a subject of many humorous team meetings back pre-bubble when I clearly didn’t get it and was content with the magic of my BB 850!) Japan has always had a leading adoption curve even when not necessarily producing the products used globally.
This visit was about visiting the University of Tokyo and meeting with some entrepreneurs. That, however, doesn’t stop me from spending time observing what CE is being used in the workplace, on the subway, and most importantly for sale in the big stores such as Yodobashi, Bic, and Labi and of course the traditional stalls at Akihabara. The rapid adoption, market size, and proximity to Korea and China often mean many of the products seen are not yet widely available in the US/Europe or are just making their way over. There’s a good chance what is emphasized in the (really) big retail space is often a leading indicator for what will show up at CES in January.
If you’re not familiar with Yodobashi, here’s the flagship store in Akihabara – over 250,000 sq ft and visited by 10′s of millions of people every year. I was once fortunate enough to visit the underground operations center, and as a kid who grew up in Orlando it sure feels a lot like the secret underground tunnels of the Magic Kingdom!
With that in mind here are 10 observations (all on a single page). This is not statistical in any way, just what caught my eye.
- Ishikawa Oku lab. The main focus of the trip was to visit University of Tokyo. Included in that was a wonderful visit with Professor Ishikawa-san and his lab which conducts research on exploring parallel, high-speed, and real-time operations for sensory information processing. What is so amazing about this work is that it has been going on for 20 years starting with very small and very slow digital sensors and now with Moore’s law applied to image capture along with parallel processing amazing things are possible such as can be seen in some of these Youtube videos (with > 5 million views), see http://www.youtube.com/ishikawalab. More about the lab http://www.k2.t.u-tokyo.ac.jp/index-e.html.
- 4K Displays. Upon stepping off the escalator on the video floor, one is confronted with massive numbers of massive 4K displays. Every manufacturer has displays and touts 4K resolution along with their requisite tricks at upscaling. The prices are still relatively high but the selection is much broader than readily seen in the US. Last year 4K was new at CES and it seems reasonable to suspect that the show floor will be all 4K. As a footnote relative to last year, 3D was downplayed significantly. In addition, there are numerous 4K cameras on sale now, more so than the US.
- Digital still. The Fuji X and Leica rangefinder digital cameras are getting a lot of floorspace and it was not uncommon to see tourists snapping photos (for example in Meiji Garden). The point and shoot displays feature far fewer models with an emphasis on attributes that differentiate them from phones such as waterproof or ruggedized. There’s an element of nostalgia, in Japan in particular, driving a renewed popularity in this form factor.
- Nikon Df. This is a “new” DSLR with the same sensor as the D-800/D4 that is packaged in a retro form factor. The Nikon Df is definitely only for collectors but there was a lot of excitement for the availability on November 21. It further emphasized the nostalgia elements of photography as the form factor has so dramatically shifted to mobile phones.
- Apple presence in store. The Apple presence in the main stores was almost overwhelming. Much of the first floor and the strategic main entry of Yodobashi were occupied by the Apple store-within-a-store. There were large crowds and as you often see with fans of products, they are shopping the very products they own and are holding in their hands. There has always been a fairly consistent appreciation of the Apple design aesthetic and overall quality of hardware but the widespread usage did not seem to follow. To be balanced, one would have to take note of the substantial presence of the Nexus 5 in the stores, which was substantially and well-visited.
- PCs. The size of the PC display area, relative to mobile and iOS accessories, definitely increased over the past 7 months since I last visited. There were quite a large number of All-In-One designs (which have always been popular in Japan, yet somehow could never quite leap across the Pacific until Windows 8). There were a lot of very new Ultrabooks running Haswell chips from all the major vendors in the US, Japan, and China. Surface was prominently displayed.
- iPhone popularity. There was a ubiquity of the iPhone that is new. Android had gained a very strong foothold over the national brands that came with the transition to nationwide LTE. Last year there was a large Android footprint through Samsung handsets that was fairly visible on display and in use. While the Android footprint is clearly there, the very fast rise of iPhone, particularly the easily spotted iPhone 5s was impressive. The vast expanse of iPhone accessories for sale nearly everywhere supports the opportunity. A driver for this is that the leading carrier (DoCoMo) is now an iPhone supplier. Returning from town, I saw this article speaking to the rise of iOS in Japan recently, iPhone 5S/C made up 76% of new smartphone sales in Japan this October.
- Samsung Galaxy J. Aside from the Nexus 5, the Android phone being pushed quite a bit was the Samsung Galaxy J. This is a model only in Asia right now. It was quite nice. It sports an ID more iPhone-like (squared edges), available in 5c-like colors, along with the latest QC processor, 5″ HD display, and so on. It is still not running Kitkat of course. For me in the store, it felt better than a Galaxy S. Given the intricacies of the US market, I don’t know if we’ll see this one any time soon. The Galaxy Note can be seen “in the wild” quite often and there seems to be quite a lot of interest based on what devices on display people would stop and interact with.
- Tablets. Tablets were omnipresent. They were signage in stores, menus in restaurants, in use on the subway, and in use at every place where people were sitting down and eating/drinking/talking. While in the US we are used to asking “where are all the Android tablets”, I saw a lot of 7″ Android tablets in use in all of those places. One wouldn’t expect the low-priced import models to be visible but there are many Japan OEMs selling Android tablets that could be spotted. I also saw quite a few iPad Minis in use, particularly among students on the trains.
- Digital video. As with compact digital cameras, there was a rather extreme reduction in the number of dedicated video recorders. That said, GoPro cameras had a lot of retail space and accessories were well placed. For example, there were GoPros connected to all sorts of gear/showing off all sorts of accessories at Tokyu Hand (the world’s most amazing store, imho). Professional HD and UHD cameras are on display in stores which is cool to see, for example Red and Arri. One of the neatest uses of video which is available stateside but I had not seen is the Sony DEV-50 binoculars/camera. It is pricey (USD$2000) but also pretty cool if you’ve got the need for it. They have reasonable sensors, support 3D, and more. The only challenge is stability which make sense given the equivalent focal length, but there is image stabilization which helps quite a bit in most circumstances.
There were many other exciting and interesting products one could see in this most wired and gadget friendly city. One always is on the lookout for that unique gift this holiday season, so I found my stocking-stuffer. Below you can see a very effective EMF shielding baseball hat (note, only 90% effective). As a backup stocking-stuffer, all gloves purchased in Japan appear to be designed with resistive touch screens in mind :-)
PS: Here’s me with some super fun students in a class on Entrepreneurship and Innovation at the University of Tokyo.
Much has been written recently about performance ratings and management at some large and successful companies. Amazon has surfaced as a company implementing OLRs, organization and leadership reviews, which target the least effective 10% of an organization for appropriate action. Yahoo recently implemented QPRs, quarterly performance reviews, which rates people as “misses” or “occasionally misses” among other ratings. And just so we don’t think this is something unique to tech, every year about this time Wall St firms begin the annual bonus process which is filled with any number of legendary dysfunctions given the massive sums of money in play. Even the Air Force has a legendary process for feedback and appraisal.
This essay looks at the challenges of performance review in a large organization. The primary purpose is to help share the realities that designing and implementing a system for such an incredibly sensitive topic is a monumental challenge when viewed in isolation. If you overlay the environment of an organization (stock price, public perception, revenue or profit, local competition for talent, etc.) then any system at all can seem anywhere from tyrannical to fair to kick-ass for some period of time, and then swing the other way when the context changes. For as much as we think of performance management as numeric and thus perfectly quantifiable, it is as much a product of context and social science as the products we design and develop. We want quantitative certainty and simplicity, but context is crucial and fluid, and qualitative. We desire fair, which is a relative term, but insist on the truth, which is absolute.
While there is an endless quest for simplicity, much as with airline tickets, car prices, or tax codes it is naive to believe that simplicity can truly be achieved or maintained over time. The challenge doesn’t change the universally shared goal of simplicity (believe me HR people do not like complex systems any more than everyone else does) but as a practical matter such purity is unattainable. Therefore comparing any system to some ideal (“flat tax”, “fixed pricing”) only serves to widen the gap between desire and implementation and thus increases the frustration and even fear of a system.
If this topic were simple there would not be over 25,000 books listed on Amazon’s US book site for the query “performance review”. Worse, the top selling books are about how to write your review, game the system, impress your boss, or tell employees they are doing well when they really aren’t doing well. You get pretty far down the list before you get to books that actually try to define a system and even then those books are filled with caveats. My own view is that the best book on the topic is Measuring and Managing Performance in Organizations. It is not about the perfect review system but about the traps and pitfalls of just measuring stuff in general. I love this book because it is a reminder of everything you know about measuring, from “measure twice, cut once” to “measure what you can change” to “if you measure something it goes up” and so on.
Notes. I am not an HR professional and don’t get wrapped up in the nuances of terminology between “performance review” or “performance management” or “performance rating”. I recognize the differences and respect them but will tend to intermix the terms for the purposes of discussion. I also recognize that for the most part, people executing such a system generally don’t see the subtle distinctions in these words as much as they might mean something within the academy. I am also not a lawyer, so what I say here may or may not be legally permitted in your place of doing business (geography, company size, sector). Finally, this post is not about any specific company practice past or present and any similarity is unintended coincidence.
This post will say some things that are likely controversial or appear plain wrong to some. I’ll be following this on twitter to see what transpires, @stevesi.
5 Essential Realities
There are several essential realities to performance reviews:
- Performance systems conflate performance and compensation with organizational budgets. No matter how you look at it, one person cannot be evaluated and paid in isolation of budgets. The company as a whole has a budget for how much to pay people (salary, bonus, stock, etc.) No matter what an individual’s compensation is part of a system that ultimately has a budget. The vast majority of mechanical or quantitative effort in the system is not about one person’s performance but about determining how to pay everyone within the budget. While it is desirable to distinguish between professional development and compensation, that will almost certainly get lost once a person sees their compensation or once a manager has to assign a rating. Any suggestion as to how to be more fair, allow for more flexibility, provide more absolute ratings, or otherwise separate performance from compensation must still come up with a way to stick to a budget. The presence of a budget drives the existence of a system. There is always a budget and don’t be fooled by “found money” as that’s just a budget trick.
- In a group of any size there is a distribution of performance. At some point a group of people working towards similar goals will exhibit a distribution of performance. From our earliest days in school we see this with schoolwork. In the workplace there are infinite variables that influence the performance of any individual but the variability exists. In an ideal system one could isolate all the variables from some innate notion of “pure contribution” or “pure skill” in order to evaluate someone. But that can’t be done so the distribution one sees essentially lumps together many performance related variables.
- In a system where you have m labels for performance, people who get all but the most rewarding one believe they are “so close” to the higher one. In school, teachers have letter grades or numeric ranges that break up test scores into “buckets”. In the workplace, performance systems generally implement some notion of grades or ratings and assign distributions to each of those. Much like a forced curve in a physics test, the system says that only a certain percentage of a population can get the highest performance rating and likewise a certain percentage of the team gets the lowest rating. The result is that most everyone in the organization believes they are extremely close to the next rating much like looking at a test and thinking if you could just get that one extra point you’d get the next letter grade. Because of human nature, any such system almost certain follows the corollary that managers are likely to imply or one being managed likely to hear evidence of just how close a call their review score was. There is a corollary: “everyone believes they are above average“.
- Among any set of groups, almost all the groups think their group is delivering more and other groups are delivering less. In a company with many groups, managers generally believe their group as a whole is performing better by relevant measures and thus should not be held to the same distribution or should have a larger budget. Groups tend to believe their work is harder, more strategic, or just more valuable while underestimating those contributions from other groups. Once groups realize that there is a fixed budget, some strive to solve the overall challenges by allowing for higher budgets on some teams. In this way you could either use a different distribution of people (more at the top) or just elevate the compensation for people within a group. Any suggestion to do this would need to also provide guidance as to how groups as a whole are to be measured relative to each other (which sounds an awful lot like how individuals would be measured relative to each other).
- Measurement is not an absolute but is relative. To measure performance it must be measured relative to something. Sales is the “easiest” since if you have a sales quota then your compensation is just a measure of how much you beat the quota. Such simplicity masks the knife fight that is quota settings and the process by which a comp sheet is built out, but it is still a relative measure. Most product have squishier goals such as “finish the product” or “market the product”. The larger the company the more these goals make sense but the less any individual’s day to day actions are directly related (“If I fix this bug will the sale really close?”). Thus in a large company, goal setting becomes increasingly futile as it starts to look like “get my work done” as the interconnection between other people and their work is impossibly hard to codify. Much of the writing about performance reviews focuses on goal setting and the skill in writing goals you can always brag about, unfortunately. All of this has taken a rather dramatic turn with the focus on agility where it is almost the antithesis of well-run to state months in advance what success looks like. As a result, measuring performance relative to peers “doing their work” is far more reliable, but has the downside that the big goals all fall to the top level managers. That’s why for the most part this entire topic is a big company thing—in a startup or a small company, actions translate into sales, marketing, products, and customers all very directly.
10 Real World Attributes
Once you take these realities you realize there will in fact be some sort of system. The goal of the system is to figure out how much to pay people. For all the words about career management, feedback, and so on that is not what anyone really focuses on at the moment they check their “score”. It certainly isn’t what is going on around the table of managers trying to figure out how to fit their team within the rules of the system.
Those that look to the once a year performance rating as the place for either learning how they are doing or for sharing feedback with an employee are simply doing it wrong—there simply shouldn’t be surprises during the process. If there are surprises then that’s a mistake no system can fix. There are no substitutes for concrete, actionable, and ongoing feedback about performance. If you’re not getting that then you need to ask.
At the same time, you can’t expect to have a daily/weekly rating for how you are doing. That’s because your performance is relative to something and that something isn’t determined on a daily basis. Finding that happy place is a challenge for individuals and managers, with the burden to avoid surprises falling to both equally. As much as one expects a manager to communicate well individuals must listen well.
Putting a system in place for allocating compensation is enormously challenging. There’s simply too much at stake for the company, the team, managers, and individuals. Ironically, because so much is at stake that materially impacts the lives of people, it is not unusual for the routine implementation of the process to take months of a given year and for it to occupy far more brain cycles than the actual externally facing work of the organization. Ironically, the more you try to make the process something HR worries about the even more disconnected it becomes from work and the more stress. As a result, performance management occupies a disproportionate amount of time and energy in large organizations.
Because of this, everyone in a company has enough experience to be critical of the system and has ideas how to improve it. Much like when a company does TV advertising and everyone can offer suggestions—simply because we all watch TV and buy stuff—when it comes to performance reviews since we all do work and get reviewed we all can offer insights and perspectives on the system. Designing a system from scratch is rather different than being critical of anecdotes of an existing system.
Given that so much is at stake and everyone has ideas how to improve the system, the actual implementation is enormously complex. While one can attempt to codify a set of rules, one cannot codify the way humans will implement the rules. One can keep iterating, adding more and more rules, more checks and balances, but eventually a process that already takes too much time becomes a crushing burden. Even after all that, statistically a lot of people are not going to be happy.
Therefore the best bet with any system is to define the variables and recognize that choices are being made and that people will be working within a system that by definition is not perfect. One can view this as gaming the system, if one believes the outcome is not tilted towards goodness. Alternatively, one can view this as doing the right thing, if one believes the outcome is tilted in the direction of goodness.
My own experience, is that there are so many complexities it is pointless to attempt to fully codify a system. Rather everyone just goes in with open eyes and a realistic view of the difficulty of the challenge and iterates through the variables throughout the entire dialog. Fixating on any one to the exclusion of others is when ultimately the system breaks down.
The following are ten of the most common attributes that must be considered and balanced when developing a performance review system:
- Determining team size. There is critical mass of “like” employees (job function, seniority, familiarity, responsibility) required to make any system even possible. If you have less than about 100 people no system will really work. At the same time, at about 100 people you are absolutely assured of having a sample size large enough to see a diversity in performance. There is going to be a constant tension between employees who believe the only fair way of evaluation is to have intimate knowledge of their work and a system that needs a lot of data points. In practice, somewhere between 1 and 5 people are likely to have intimate understanding of the work of an individual, but said another way any given manager is likely to have intimate knowledge of between 5 and about 50 people. At some point the system requires every level of management to honestly assess people based on a dialog of imperfect information. Team size also matters because small “rounding” efforts become enormous. Imagine something where you need to find 10% of the population and you have a team of 15 people to do that with. You obviously pick either 1 or 2 people (1 if the 10% is “bad”, 2 if it is “good”). Then imagine this rolls up to 15,000 people. Rather than 1500, you have either 1000 or 2000 people in that 10%. That’s either very depressing or very expensive relative to the budget. Best practice: Implement a system in groups of about 100 in seniority and role.
- Conflating seniority, job function, and projects does not create a peer group. Attempting to define relative contribution of a college new hire and a 10 year industry vet, or a designer and a QA engineer, manager or not, or a front-end v. ops tools are all impossibly difficult. The dysfunction is one where invariably as the process moves up the management chain there will be a bias that builds—the most visible people, the highest paid people or jobs, scarcest talent, the work that is understood and so on will become the things that get airtime in dialogs. There’s nothing inherently evil about this but it can get very tricky very quickly if those dialogs lead to higher ratings/compensation for these dimensions. This can get challenging if these groups are not sized as above and so you’ll find it a necessary balancing act. Best practice: Define peer groups based on seniority and job function within project teams as best you can.
- Measuring against goals. It is entirely possible to base a system of evaluation and compensation on pre-determined goals. Doing so will guarantee two things. First, however much time you think you save on the review process you will spend up front on an elaborate process of goal-setting. Second, in any effort of any complexity there is no way to have goals that are self-contained and so failure to meet goals becomes an exercise in documenting what went wrong. Once everyone realizes their compensation depends on others, the whole work process becomes clouded by constant discussion about accountability, expectation setting, and other efforts not directly related to actually working things out. And worse, management will always have the out of saying “you had the goal so you should have worked it out”. There’s nothing more challenging in the process of evaluation than actually setting goals and all of this is compounded enormously when the endeavor is a creative one where agility, pivots, and learning are part of the overall process. Best practice: let individuals and their manager arrive at goals that foster a sense of mastery of skills and success of the project, while focusing evaluation on the relative (and qualitative) contribution to the broader mission.
- Understanding cross-organization performance. Performance measurement is always relative, but determining performance across multiple organizations in a relative sense requires apples to oranges comparisons, even within similar job functions (i.e. engineering). If one team is winding down a release and another starting, or if one team is on an established business and another on a new business, or if one team has no competitors and another is in an intense battle, or if one team has a lot of sales support and another doesn’t, and so on are all situations which make it non-obvious how to “compare” multiple teams, yet this is what must happen at some level. Compounding this situation is that at some point in evaluation the basis for relative comparison might dramatically change—for example, at one level of management the accomplishment of multiple teams might be looked at through a lens that can be far removed from what members of those teams might be able to impact in their daily work. Best practice: do not pit organizations against each other by competing for rewards and foster cross-group collaboration via planning and execution of shared bets.
- Maintaining a system that both rates and rewards. Systems often have some sort of score or a grade and they also have compensation. Some think this is essential. Some think this is redundant. Some care deeply about one, but only when they are either very happy or very unhappy with the other. A system can be developed where these are perfectly correlated in which case one can claim they are redundant. A system where there is a loose correlation might as well have no correlation because both individuals and managers involved are hearing what they want to hear or saying one thing and doing another. At the same time, we’re all conditioned for a “score” and somehow a bonus of 9.67% doesn’t feel like a score because you don’t know what this means relative (so even though people want to be rated absolutely it doesn’t take long before they want to know where that stands relatively). Best practice: A clear rating that lets individuals know where they stand relative to their peer group along with compensation derived from that with the ability of a manager with the most intimate knowledge of the work to adjust compensation within some rating-defined range.
- Writing a performance appraisal is not making a case. Almost all of the books on Amazon about performance reviews focus on the art of writing reviews. Your performance review is not a trial and one can’t make or break the past year/month/quarter by an exercise in strong or creative writing. This holds for individuals hoping to make their performance shine and importantly for managers hoping to make up for their lack of feedback/action. The worst moments in a performance process are when an employee dissects a managers comments and attempts to refute them or when a manager pulls up a bunch of new examples of things that were not talked about when they were happening. Best practice: Lower the stakes for the document itself and make it clear that it is not the decision-making tool for the process.
- Ranking and calibrating are different. Much has been said about the notion of “stack rank” which often is used as a catch phrase for a process that assigns each member of a group a “one through n” score. This is always always a terrible process. There is simply no way to have the level of accuracy implied by such a system. What would one say to someone trying to explain the difference between being number 63 and number 64 on a 100 person team? The practice of calibrating is one of relative performance between members of peer groups as described above. The size and number of these groups is fixed and when done with adequate population size can with near certainty avoid endless discussion over boundary cases. Best practice: Define performance groups where members of a team fall but do not attempt to rank with more granularity or “proximity” to other groups.
- Encouraging excellent teams. Most managers believe their teams are excellent teams, and uniquely so. Strong performers have been hired to the team. Weak performers are naturally managed out if they someone made it on to the team. Results show this. It becomes increasingly difficult to implement a performance review system because organizations become increasingly strong and effective. This is how it should be. At the same time this cannot possibly be a permanent state (even the sports teams get new players that don’t pan out over the course of a season). In a dynamic system there will be some years where a team is truly excellent and some years where it is not, but you can’t really know that in an absolute sense. In fact, the most ossifying element of performance appraisal is to assume that a given team or given person has reached a point where they are just excellent in an absolute sense and thus the system no longer applies. Whether the team is 100 engineers just crushing it or an executive team firing on all cylinders, it is very tempting to say the system doesn’t apply. But if the system doesn’t apply you don’t really have a system. Perhaps your organization will have a concept of “tenure” or you have a job function primarily compensated by quota based on quantitative measures—those are ways to have different systems. Best practice: Make a system that applies to everyone or have multiple systems and clear rules how membership in different systems is determined.
- Allowing for exceptions creates an exception-based process. When a team adds all of the potential constraints up and attempt to finally close in on performance of individuals, there is a tendency to “feel the pain” of all the rules and to create a model for exceptions. For example, you might have a 10% group but allow for up to 1% exceptions. Doing so will invariably create either the 9% or 11% group depending on if it is better to except up or down. If managers have the option of giving someone a low rating by extra money along with other people getting a high rating with less money, then invariably most people will get this mixed message. All of these exceptions quickly permeate an organization and individuals end up considering getting an exception a normal part of the process. Best practice: If there is going to be a system, then stick to it and don’t encourage exceptions.
- Embracing diversity in all dimensions. Far too often in performance appraisal and rewards, even within peer groups there is potential for the pull of sameness. This can manifest itself through any number of professional characteristics that can be viewed as either style or actual performance traits. One of the earliest stories I heard of this was about a manager that preferred people to set very aggressive goals for adding features. Unfortunately there was no measure for quality of the work. Other members of the team would focus on a combination of features and quality. Members of the latter group felt penalized relative to the person with the high bug count. At the same time, the team tended to be one that got a lot of features done early but had a much longer tail. Depending on when performance reviews got done, the story could be quite different. Perhaps both styles of work are acceptable, but not appreciating the “perceived slow and steady” is a failing of that manager to embrace styles. The same can be said for personal traits such as the always present quiet v. loud, or oral v. written, and so. Best practice: Any strong and sustainable team will be diverse in all dimensions.
Much more could be said about the way performance appraisal and reward can and should work in organizations. Far too much of what is said is negative and assumes a tone dominated by us v. them or worse a view that this is all a very straight forward process that management just consistently gets wrong. Like so many things in business there is no right answer or perfect approach. If there was, then there would be one performance system that everyone would use and it would work all the time. There is not.
Some suggest that the only way to solve this problem is to just have a compensation budget and let some level of management be responsible. That is a manager just determines compensation for each member of a team based on their own criteria. This too is a system—the absence of a system is itself a system. In fact this is not a single system but n systems, one for each manager. Every group will arrive at a way to distribute money and ratings that meets the needs of that team. There will be peanut butter teams, there will be teams that do the “big bonus”, and more. There will even be teams that use the system as a recruiting tool.
As much as any system is maligned, having a system that is visible, has some framework, and a level of cross-organization consistency provides many benefits to the organization as whole. These benefits accrue even with all the challenges that also exist.
To end this post here are three survival tips for everyone, individuals and managers, going through a performance process that seems unfair, opaque, or crazy:
- No one has all the data. Individuals love to remind some level of management that they do not have all the data about a given employee. Managers love to remind people that they see more data points than any one individual. HR loves to remind people that they have competitive salary data for the industry. Executives remind people they have data for a lot of teams. The bottom line is that no one person has a complete picture of the process. This means everyone is operating with imperfect information. But it does not follow that everyone is operating imperfectly.
- Share success, take responsibility. No matter what is happening and in what context, everyone benefits when successes are shared and responsibility is taken. Even with an imperfect system, if you do well be sure to consider how others contributed and thank them as publicly as you can. If you think you are getting a bad deal, don’t push accountability away or point fingers, but look to yourself to make things better.
- Things work out in the end. Since no system is perfect it is tempting to think that one data point of imperfection is a permanent problem. Things will go wrong. We don’t talk about it much, but some people will get a rating and pay much higher than they probably deserve at some point. And yes, some people will have a tough time that they might not really deserve in hindsight. In a knowledge economy, talent wins out over time. No manager will hold one datapoint against a talented person who gracefully recovers from a misstep. It takes discipline and effort to work within a complex and imperfect system—this is actually one of the skills required for anyone over the course of a career. Whether it is project planning, performance management, strategic choices, management processes and more all of these are social science and all subject to context, error rates, and most importantly learning and iteration.
—Steven Sinofsky (@stevesi)
I’ve been surprised at the “feedback” I receive when I talk about products that compete with those made by Microsoft. While I spent a lot of time there, one thing I learned was just how important it is to immerse yourself in competitive products to gain their perspective. It helps in so many ways (see http://blog.learningbyshipping.com/2013/01/14/learning-from-competition/).
Dave Winer (@davewiner) wrote a thoughtful post on How the Times reviews tech today. As I reflected on the post, it seemed worth considering why this challenge might be unique to tech and how it relates to the use of competitive products.
When considering creative works, it takes ~two hours to see a film or slightly more for other productions. Even a day or two for a book. After which you can collect your thoughts and analysis and offer a review. Your collected experience in the art form is relatively easily recalled and put to good use in a thoughtful review.
When talking about technology products, the same approach might hold for casually used services or content consumption services. In considering tools for “intellectual work” as Winer described (loved that phrase), things start to look significantly different.Software tools (for “intellectual work”) are complex because they do complex things. In order to accomplish something you need to first have something to accomplish and then accomplish it. It is akin to reviewing the latest cameras for making films or the latest cookware for making food. While you can shoot a few frames or make a single meal, tools like these require many hours and different tasks. You shouldn’t “try” them as much as “use” them for something that really matters. Only then can you collect your thoughts and analysis.Because tools of depth offer many paths and ways to use them there is an implicit “model” to how they are used. Models take a time to adapt to. A cinematographer that uses film shouldn’t judge a digital camera after a few test frames and maybe not even after the first completed work.
The tools for writing, thinking, creating that exist today present models for usage. Whether it is a smartphone, a tablet, a “word processor”, or a photo editor these devices and accompanying software define models for usage that are sophisticated in how they are approached, the flow of control, and points of entry. They are hard to use because they do hard things.
The fact that many of those that write reviews rely on an existing set of tools, software, devices to for their intellectual pursuits implies that conceptual models they know and love are baked into their perspective. It means tools that come along and present a new way of working or seeing the technology space must first find a way to get a clean perspective.
This of course is not possible. One can’t unlearn something. We all know that reviewers are professionals and just as we expect a journalist covering national policy debates must not let their bias show, tech reviewers must do the same. This implicit “model bias” is much more difficult to overcome because it simply takes longer to see and use a product than it does to learn about and understand (but not necessarily practice) a point of view in a policy debate. The tell-tale sign of “this review composed on the new…” is great, but we also know right after the review the writer has the option of returning to their favorite way of working.
As an example, I recall the tremendous difficulty in the early days of graphical user interface word processors. The incumbent WordPerfect was a character based word processor that was the very definition of a word processor. The one feature that we heard relentlessly was called reveal codes which was a way of essentially seeing the formatting of the document as codes surrounding text (well today we think of that as HTML). Word for Windows was a WYSIWYG word processor in Windows and so you just formatted things directly. If it was bold on screen then it was implicitly surrounded by <B> and </B> (not literally but conceptually those codes).
Reviewers (and customers) time and time again felt Word needed reveal codes. That was the model for usage of a “word processor”. It was an uphill battle to move the overall usage of the product to a new level of abstraction. There were things that were more difficult in Word and many things much easier, but reveal codes was simply a model and not the answer to the challenges. The tech world is seeing this again with the rise of new productivity tools such as Quip, Box Notes, Evernote, and more. They don’t do the same things and they do many things differently. They have different models for usage.
At the business level this is the chasm challenge for new products. But at the reviewer level this is a challenge because it simply takes time to either understand or appreciate a new product. Not every new product, or even most, changes the rules of the predecessor successfully. But some do. The initial reaction to the iPhone’s lack of keyboard or even de-emphasizing voice calls shows how quickly everyone jumped to the then current definition of smartphone as the evaluation criteria.Unfortunately all of this is incompatible with the news cycle for the onslaught of new products or the desire to have a collective judgement by the time the event is over (or even before it starts).This is a difficult proposition. It starts to sound like blaming politicians for not discussing the issues. Or blaming the networks for airing too much reality tv. Isn’t is just as much what peole will click through as it is what reviewers would write about. Would anyone be interested in reading a Samsung review or pulling another ios 7 review after the 8 weeks of usage that the product deserves?
The focus on youth and new users as the baseline for review is simply because they do not have the “baggage” or “legacy” when it comes to appreciating a new product. The disconnect we see in excitement and usage is because new to the category users do not need to spend time mapping their model and just dive in and start to use something for what it was supposed to do. Youth just represents a target audience for early adopters and the fastest path to crossing the chasm.
Here are a few things on my to-do list for how to evaluate a new product. The reason I use things for a long time is because I think in our world with so many different models
- Use defaults. Quite a few times when you first approach a product you want to immediately customize it to make it seem like what you’re familiar with. While many products have customization, stick with the defaults as long as possible. Don’t like where the browser launching button is, leave there anyway. There’s almost always a reason. I find the changes in the default layout of iOS 6 v. 7 interesting enough to see what the shift in priorities means for how you use the product.
- Don’t fight the system. When using a new product, if something seems hard that used to seem easy then take a deep breath and decide it probably isn’t the way the product was meant to do that thing. It might even mean that the thing you’re trying to do isn’t necessarily something you need to do with the new product. In DOS WordPerfect people would use tables to create columns of text. But in Word there was a columns feature and using a table for a newsletter layout was not the best way to do that. Sure there needed to be “Help” to do this, but then again someone had to figure that out in WordPerfect too.
- Don’t jump to doing the complex task you already figured out in the old tool. Often as a torture test, upon first look at a product you might try to do the thing you know is very difficult–that side by side chart, reducing overexposed highlights, or some complex formatting. Your natural tendency will be to use the same model and steps to figure this out. I got used to one complicated way of using levels to reduce underexposed faces in photos and completely missed out on the “fill flash” command in a photo editor.
- Don’t do things the way you are used to. Related to this is tendency to use one device the way you were used to. For example, you might be used to going to the camera app and taking a picture then choosing email. But the new phone “prefers” to be in email and insert an image (new or just taken) into a message. It might seem inconvenient (or even wrong) at first, but over time this difference will go away. This is just like learning gear shift patterns or even the layout of a new grocery store perhaps.
- Don’t assume the designers were dumb and missed the obvious. Often connected to trying to do something the way you are used to is the reality that something might just seem impossible and thus the designers obviously missed something or worse. There is always a (good) chance something is poorly done or missing, but that shouldn’t be the first conclusion.
But most of all, give it time. It often takes 4-8 weeks to really adjust to a new system and the more expert you are the more time it takes. I’ve been using Macs on and off since before the product was released to the public, but even today it has taken me the better part of six months to feel “native”. It took me about 3 months of Android usage before I stopped thinking like an iPhone user. You might say I am wired too much or you might conclude it really does take a long time to appreciate a design for what it is supposed to do. I chuckle at the things that used to frustrate me and think about how silly my concerns were at day 0, day 7, and even day 30–where the volume button was, the charger orientation, the way the PIN worked, going backwards, and more.
LinkedIn engineer Martin Kleppmann wrote a wonderful post detailing the magical and thoughtful engineering behind the new LinkedIn Intro iOS app. I was literally verklepmpt reading the post–thinking about all those nights trying different things until he (and the team) ultimately achieved what he set out to do, what his management hoped he would do, and what folks at LinkedIn felt would be great for LinkedIn customers.
The internet has done what the internet does which is to unleash indignation upon Martin, LinkedIn, and thus the cycle begins. The post was updated with caveats and disclaimers. It is now riding atop of techmeme. Privacy. Security. etc.
Whether those concerns are legitimate or not (after all this is a massive public company based on the trust of a network), the reality is this app points out a longstanding architectural challenge in API design. The rise of modern operating systems (iOS, Android, Windows RT, and more) have inherent advantages over the PC-era operating systems (OS X, Windows, Linux) when it comes to maintaining the integrity as designed of the system overall. Yet we’re not done innovating around this challenge.
I remember my very first exploit. I figured out how to use a disk sector editor on CP/M and modified the operating system to remove the file delete command, ERA. I managed to do this by just nulling out the “ERA” string in what appeared to me to be the command table. I was so proud of myself I (attempted) to show my father my success.
The folks that put the command table there were just solving a problem. It was not an API to CP/M, or was it? The sector editor was really a tool for recovering information from defective floppies, or was it? My goal was to make a floppy with WordStar on it that I could give to my father to use but would be safe from him accidentally deleting a file. My intention was good. I used information and tools available to me in ways that the system architects clearly did not intend. I stood on the top step of a ladder. I used a screwdriver as a pry bar. I used a wrench as a hammer.
The history of the PC architecture is filled with examples of APIs exposed for one purpose put to use for another purpose. In fact, the power of the PC platform is a result of inventors bringing technology to market with one purpose in mind and then seeing it get used for other purposes. Whether hardware or software, unintended uses of extensibility have come to define the flexibility, utility, and durability of the PC architecture. There are so many examples: the first terminate and stay resident programs in MS-DOS, the Z80 softcard for the Apple ][, drawing low voltage power from USB to power a coffee warmer, all the way to that most favorite shell extension in Windows or OS X extension that adds that missing feature from Finder.
These are easily described and high-level uses of extensibility. Your everyday computing experience is literally filled with uses of underlying extensibility that were not foreseen by the original designers. In fact, I would go as far as to say that if computers and software were only allowed to do things that the original designers intended, computing would be particularly boring.
Yet it would also be free of viruses, malware, DLL hell, system rot, and TV commercials promising to make your PC faster.
Take for example, the role of extensibility in email, Outlook even in particular. The original design for Outlook had a wonderful API that enabled one to create an add-in that would automate routine tasks in Outlook. You could for example have a program that would automatically send out a notification email to the appropriate contacts based on some action you would take. You could also receive useful email attachments that could streamline tasks just by opening them (for example, before we all had a PDF reader it was very common to receive an executable that when opened would self-extract a document along with a viewer). These became a huge part of the value of the platform and an important part of the utility of the PC in the workplace at the time.
Then one day in 1999 we all (literally) received email from our friend Melissa. This was a virus that spread by using these same APIs for an obviously terrible usage. What this code did was nothing different than all those add-ins did, but it did it at Internet scale to everyone in an unsuspecting way.
Thus was born the age of “consent” on PCs. When you think about all those messages you see today (“use your location”, “change your default”, “access your address book”) you see the direct descendants of Melissa. A follow on virus professed broad love for all of us, I LOVE YOU. From that came the (perceived) draconian steps of simply disabling much of the extensibility/utility described above.
What else could be done? A ladder is always going to have a top step–some people will step on it. The vast majority will get work done and be fine.
From my perspective, it doesn’t matter how one perceives something on a spectrum from good to “bad”–the challenge is APIs get used for many different things and developers are always going to push the limits of what they do. LinkedIn Intro is not a virus. It is not a tool to invade your privacy. It is simply a clever (ne hack) that uses existing extensibility in new ways. There’s no defense against this. The system was not poorly designed. Even though there was no intent to do what Intro did when those services were designed, there is simply no way to prevent clever uses anymore than you can prevent me from using my screwdriver as a pry bar.
I wanted to offer a modern example that for me sums up the exploitation of APIs and also how challenging this problem is.
On Android an app can add one or more sharing targets. In fact Android APIs were even improved to make it easier in release after release and now it is simply a declarative step of a couple of lines of XML and some code.
As a result, many Play apps add several share targets. I installed a printing app that added 4 different ways to share (Share link, share to Chrome, share email, share over Bluetooth). All of these seemed perfectly legitimate and I’m sure the designers thought they were just making their product easier to use. Obviously, I must want to use the functionality since I went to the Play store, downloaded it and everything. I bet the folks that designed this are quite proud of how many taps they saved for these key scenarios.
After 20 apps, my share list is crazy. Of course sharing with twitter is now a lot of scrolling because the list is alphabetical. Lucky for me the Messages app bubbles up the most recent target to a shortcut in the action bar. But that seems a bit like a kludge.
Then along comes Andmade Share. It is another Play app that lets me customize the share list and remove things. Phew. Except now I am the manager of a sharing list and every time I install an app I have to go and “fix” my share target list.
Ironically, the Andmade app uses almost precisely the same extensibility to manage the sharing list as is used to pollute it. So hypothetically restricting/disabling the ability of apps to add share targets also prevents this utility from working.
The system could also be much more rigorous about what can be added. For example, apps could only add a single share target (Windows 8) or the OS could just not allow apps to add more (essentially iOS). But 99% of uses are legitimate. All are harmless. So even in “modern” times with modern software, the API surface area can be exploited and lead to a degraded user experience even if that experience degrades in a relatively benign way.
Anyone that ever complained about startup programs or shell extensions is just seeing the results of developers using extensibility. Whether it is used or abused is a matter of perspective. Whether is degrades the overall system is dependent on many factors and also on perspective (since every benefit has a potential cost, if you benefit from a feature then you’re ok with the cost).
There will be calls to remove the app from the app store. Sure that can be done. Steps will be taken to close off extensibility mechanisms that got used in ways far off the intended usage patterns. There will be cost and unintended side effects of those actions. Realistically, what was done by LinkedIn (or a myriad of examples) was done with the best of intentions (and a lot of hard work). Realistically, what was done was exploiting the extensibility of the system in a way never considered by the designers (or most users).
This leads to 5 realities of system design:
Everything is an API. Every bit of a system is an API. From the layout of files, to the places settings are stored, to actual published APIs, everything in a system as it is released serves as an interface to people who want to extend, customize, or modify your work. Services don’t escape this because APIs are in a cloud behind REST APIs. For example, reverse engineering packets or scraping HTML is no different — the HTML used by a site can come to be relied on essentially as an API. The Windows registry is just a place to store stuff–the fact that people went in and modified it outside the intended parameters is what caused problems, not the existence of a place to store stuff. Cookies? Just a mechanism.
APIs can’t tell you the full intent. APIs are simply tools. The documentation and examples show you the mainstream or an intended use of an API. But they don’t tell you all the intended uses or even the limits of using an API. As a platform provider, falling back on documentation is fairly impossible considering both the history of software platforms (and most of the success of a platform coming from people using it in a creative ways) and the reality that no one could read all the documentation that would have to explain all the uses of a single API when there are literally tens of thousands of extensibility points (plus all the undocumented ones, see #1).
Once discovered, any clever use of an API will be replicated by many actors for good or not. Once one developer finds a way to get something done by working through the clever mechanism of extensibility, if there’s value to it then others will follow. If one share target is good, then having 5 must be 5 times better. The system through some means will ultimately need to find a way to control the very way extensibility or APIs are used. Whether this is through policy or code is a matter of choice. We haven’t seen the last “Intro” at least until some action is taken for iOS.
Platform providers carry the burden of maintaining APIs over time. Since the vast majority of actors are doing valuable things you maintain an API or extensibility point–that’s what constitutes a platform promise. Some of your APIs are “undocumented” but end up being conventions or just happenstance. When you produce a platform, try as hard as you want to define what is the official platform and what isn’t but your implied promise is ultimately to maintain the integrity of everything overall.
Using extensibility will produce good and bad results, but what is good and bad will depend highly on the context. It might seem easy to judge something broadly on the internet as good or bad. In reality, downloading an app and opt-ing in. What should you really warn about and how? To me this seems remarkably difficult. I am not sure we’re in a better place because every action on my modern device has a potential warning message or a choice from a very long list I need to manage.
We’re not there yet collectively as an industry on balancing the extensibility of platforms and the desire for safety, security, performance, predictability, and more. Modern platforms are a huge step in a better direction.
Let’s be careful collectively about how we move forward when faced with a pattern we’re all familiar with.
28-10-13 Fixed a couple of typos.
We are trying to change a culture of compartmentalized, start-from-scratch style development here. I’m curious if there are any good examples of Enterprise “Open Source” that we can learn from.
—Question from reader with a strong history in engineering management
When starting a new product line or dealing with multiple existing products, there’s always a question about how to share code. Even the most ardent open source developers know the challenges of sharing code—it is easy to pick up a library of “done” code, not so hard to share something that you can snapshot, but remarkably difficult to share code that is also moving at a high velocity like your work.
Developers love to talk about sharing code probably much more than they love to share code in practice. Yet, sharing code happens all the time—everyone uses an OS, web server, programming languages, and more that are all shared code. Where it gets tricky is when the shared code is an integral part of the product you’re developing. That’s when shared code goes from “fastest way to get moving” to “a potential (difficult) constraint” or to “likely a critical path”. Ironically, this is usually more true inside of a single company where one team needs to “depend” on another team for shared code than it is on developers sharing code from outside the company.
Organizationally, sharing code takes on varying degrees of difficulty depending on the “org distance” between developers. For example, two developers working for the same manager don’t even think about “sharing code” as much as they think about “working together”. At the other end of the spectrum, developers on different products with different code bases (perhaps started at different times with early thoughts that the products were unrelated or maybe one code base was acquired) think naturally about shipping their code base and working on their product first and foremost.
This latter case is often viewed as an organizational silo—a team of engineering, testing, product, operations, design, and perhaps even separate marketing or P&L responsibility. This might be the preferred org design (focus on business agility) or it might be because of intrinsic org structures (like geography, history, leadership approach). The larger these types of organizations the more the “needs of the org” tend to trump the “needs of the code”.
Let’s assume everyone is well-meaning and would share code, but it just isn’t happening organically. What are 5 things the team overall can do?
Ship together. The most straight-forward attribute two teams can modify in order to effectively share code is to have a release/ship schedule that is aligned. Sharing code is the most difficult when one team is locked down and the other team is just getting started. Things get progressively easier the closer to aligned each team becomes. Even on very short cycles of 30-60 days, the difference in mindset about what code can change and how can quickly grow to be a share-stopper. Even when creating a new product alongside an existing product, picking a scheduling milestone that is aligned can be remarkably helpful in encouraging sharing rather than a “new product silo” which only digs a future hole that will need to be filled.
Organize together to engineer together. If you’re looking at trying to share code across engineering organizations that have an org distance that involves general management, revenue or P&L, or different products, then there’s an opportunity to use organization approaches to share code. When one engineering manager can look at a shared code challenge across all of his/her responsibilities there more of a chance that an engineering leader will see this as an opportunity rather than a tax/burden. The dialog about efficacy or reality of sharing code does not span managers or importantly disciplines, and the resulting accountability rests within straight-forward engineering functions. This approach has limits (the graph theory of org size as well as the challenges of organizing substantially different products together).
Allocate resources for sharing. A large organization that has enough resources to duplicate code turns out to be the biggest barrier to sharing code. If there’s a desire to share code, especially if this means re-architecting something that works (to replace it with some shared code, presumably with a mutual benefit) then the larger team has a built-in mechanism to avoid the shared code tax. As painful as it sounds, the most straight-forward approach to addressing this challenge is to allocate resources such that a team doesn’t really have the option to just duplicate code. This approach often works best when combined with organizing together, since one engineering manager can simply load balance the projects more effectively. But even across silos, careful attention (and transparency) to how engineering resources are spent will often make this approach attainable.
Establish provider/consumer relationships. Often shared code can look like a “shared code library” that needs to be developed. It is quite common and can be quite effective to form a separate team, a provider, that exists entirely to provide code to other parts of the company, a consumer. The consumer team will tend to look at the provider team as an extension to their team and all can work well. On the other hand, there are almost always multiple consumers (otherwise the code isn’t really shared) and then the challenges of which team to serve and when (and where requirements might come from) all surface. Groups dedicated to being the producers of shared code can work, but they can quickly take on the characteristics of yet another silo in the company. Resource allocation and schedules are often quite challenging with a priori shared code groups.
Avoid the technical buzz-saw. Developers given a goal to share code and a desire to avoid doing so will often resort to a drawn-out analysis phase of the code and/or team. This will be thoughtful and high-integrity. But one person’s approach to being thorough can also look to another as a delay or avoidance tactic. No matter how genuine the analysis might be, the reality is that it can come across as a technical buzz-saw making all but the most idealized code sharing impossible. My own experience has been that simply avoiding this process is best—a bake-off or ongoing suitability-to-task discussion will only drive a wedge between teams. At some level sharing code is a leap of faith that a lot of folks need to take and when it works everyone is happy and if it doesn’t there’s a good chance someone is likely to say “told you so”. Most every bet one makes in engineering has skeptics. Spending some effort to hear out the skeptics is critical. A winners/losers process is almost always a negative for all involved.
The common thread about all of these is that they all seem impossible at first. As with any initiative, there’s a non-zero cost to obtaining goals that require behavior change. If sharing code is important and not happening, there’s a good chance you’re working against some of the existing constraints in the approach. Smart and empowered teams act with the best intentions to balance a seemingly endless set of inbound issues and constraints, and shared code might just be one of those things that doesn’t make the cut.
Keeping in mind that at any given time an engineering organization is probably overloaded and at capacity just getting stuff done, there’s not a lot of room to just overlay new goals.
Sharing code is like sharing any other aspect of a larger team—from best practices in tools, engineering approaches, team management—things don’t happen organically unless there’s a uniform benefit across teams. The role of management is to put in place the right constraints that benefit the overall goals without compromising other goals. This effort requires ongoing monitoring and feedback to make sure the right balance is achieved.
For those interested in some history, this is a Harvard Business School case on the very early Office (paid article) team and the challenges/questions around organizing around a set of related products (hint, this only seems relatively straight-forward in hindsight).
With the latest pivot for Blackberry much has been said about disruption and what it can do to companies. The story, Inside the fall of BlackBerry: How the smartphone inventor failed to adapt, by Sean Silcoff, Jacquie Mcnish and Steve Ladurantaye in The Globe and Mail is a wonderful account.
Disruption has a couple of characteristics that make it fun to talk about. While it is happening even with a chorus of people claiming it is happening, it is actually very difficult to see. After it has happened the chorus of “told you so” grows even louder and more matter of fact. After the fact, everyone has a view of what could have been done to “prevent” disruption. Finally, the description of disruption tends to lose all of the details leading up to the failure as things get characterized at the broad company level or a simple characteristic (keyboard v. touch) when the situation is far more complex. Those nuances are what product folks deal with day to day and where all the learning can be found.
Like many challenges in business, there’s no easy solution and no pattern to follow. The decision moments, technology changes, and business realities are all happening to people that have the same skills and backgrounds as the chorus, but the real-world constraints of actually doing something about them.
The case of Blackberry is interesting because the breadth of disruptive forces is so great. It is not likely that a case like this will be seen again for a while–a case where a company has such an incredible position of strength in technology and business gained over a relatively short time and then essentially erased in a short time.
I loved my Blackberry. The first time I used one was before they were released (because there was integration with Outlook I was lucky enough to be using one some time in 1998–I even read the entire DOJ filing against Microsoft on one while stopped on the tarmac at JFK). Using the original 850 was a moment when you immediately felt propelled into the future. Using one felt like the first time I saw a graphical interface (Alto) or a GPS. Upon using one you just knew our technology lives would be different.
What went wrong is almost exactly the opposite of what went right and that’s what makes this such an interesting story and unbelievably difficult challenge for those involved. Even today I look at what went on and think of how galactic the challenges were for that amazing group of people that transported us all to the future with one product.
When you build a product you make a lot of assumptions about the state of the art of technology, the best business practices, and potential customer usage/behavior. Any new product that is even little bit revolutionary makes these choices at an instinctual level–no matter what news stories you read about research or surveys or whatever, I think we all know that there’s a certain gut feeling that comes into play.
This is especially the case for products that change our collective world view.
Whether made deliberately or not these assumptions play a crucial role in how a product evolves over time. I’ve never seen a new product developed where the folks wrote down a long list of assumptions. I wouldn’t even know where to start–so many of them are not even thought through and represent just an engineer or product manager “state of the art”, “best practice”, or “this is what I know”.
It turns out these assumptions, implicit or explicit, become your competitive advantage and allow you to take the market by storm.
But then along come technology advances, business model changes, or new customer behaviors and seemingly overnight your assumptions are invalidated.
In a relatively simple product (note, no product is simple to the folks making it) these assumptions might all be within the domain. Christensen famously studied the early days of the disk drive industry. To many of us these assumptions are all contained within one system or component and it is hard to see how disruption could take hold. Fast forward and we just assume solid-state storage, yet even this transition as obvious as it is to us, requires a whole new world view for people who engineer spinning disks.
In a complex product like the entirety of the Blackberry experience there are assumptions that cross hardware, software, communications networks, channel relationships, business models and more. When you bring all these together into a single picture one realizes the enormity of what was accomplished.
It is instructive to consider the many assumptions or ingredients of Blackberry success that go beyond the popular “keyboard v. touch”. In thinking about my own experience with the product, the following list just a few things that were essentially revisited by the iPhone from the perspective of the Blackberry device/team:
- Keyboard to touch. The most visible difference and most easily debated is this change. From crackberry thumbs to contests over who could type faster, your keyboard was clearly a major innovation. The move to touch would challenge you in technology, behavior, and more.
- Small (b&w) screens to large color. Closely connected with the shift to touch was a change in perspective that consuming information on a bigger screen would trump the use of the real estate for (arguably) more efficient input. Your whole notion of industrial design, supply chain, OS, and more would be challenged. As an aside, the power consumption of large screens immediately seemed like a non-starter to a team insanely focused on battery life.
- GPRS to 3G then LTE. Your heritage in radios, starting with the pager network, placed a premium on using the lowest power/bandwidth radio and focusing on efficiency therein. The iPhone, while 2G early, quickly turned around a game changing 3G device. You had been almost dragged into using the newer higher powered radios because your focus had been to treat radio usage as a premium resource.
- Minimize bandwidth to assume bandwidth is free. Your focus on reducing bytes over the wire was met with a device that just assumed bytes would be “free” or at least easily purchased. Many of the early comments on the iPhone focused on this but few assumed the way the communications companies would respond to an appetite for bandwidth. Imagine thinking how sloppy the iPhone was with bandwidth usage and how fast the battery would drain. Assuming a specific resource is high cost is often a path to disruption when someone makes a different assumption.
- No general web support v. general web support. Despite demand, the Blackberry avoided offered generalized web browsing support. The partnership with carriers also precluded this given their concern about network responsiveness and capacity. Again, few would have assumed a network buildout that would support mobile browsing the way it does today. The disruptor had the advantage of growing slowly (relatively) compared to flipping a switch on a giant installed base.
- WiFi as “present” to nearly ubiquitous. The physics of WiFi coverage (along with power consumption, chip surface area and more) assumed WiFi would be expensive and hard to find. Even with whole city WiFi projects in early 2000′s people didn’t see WiFi as a big part of the solution. Few thought about the presence of WiFi at home and new usage scenarios or that every urban setting, hotel, airport, and more would have WiFi. Even the carriers built out WiFi to offload traffic and include it for free in their plans. The elegant and seamless integration of WiFi on the iPhone became a quick advantage.
- Device update/mgmt by tethering to off air. Blackberry required tethering for some routine operations and for many the only way to integrate corporate mail was to keep a PC running all the time. The PC was an integral part of the Blackberry experience for many. While the iPhone was tethered for music and videos, the presence of WiFi and march towards PC-free experiences was an early assumption in the architecture that just took time to play out.
- Business to consumer. Your Blackberry was clearly a business device. Through much of the period of high success consumers flocked to devices like the SideKick. While there was some consumer success, you anchored in business scenarios from Exchange and Notes integration to network security. The iPhone comes along and out of the gate is aimed at consumers with a camera, MMS, and more. This disruption hits at the hardware, the software, the service integration, and even how the device is sold at carriers.
- Data center based service to broad set of cloud based services. Your connection to the enterprise was anchored in a server that business operated. This was a significant business upside as well as a key part of the value proposition for business. This server became a source for valuable business information propagated to the Blackberry (rather than use the web). The absence of an iPhone server seemed like a huge opportunity yet in fact it turned into an asset in terms of spreading the device. Instead the iPhone relied on the web (and subsequently apps) to deliver services rather than programmed and curated services.
- Deep channel partnership/revenue sharing to somewhat tense relationship. By most accounts, your Blackberry business was an incredible win-win with telcos around the world. Story after story talked of the amazing partnerships between carriers and Blackberry. At the same time, stories (and blame game) between Apple and AT&T in the US became somewhat legendary. Yet even with this tension, the iPhone was bringing very valuable customers to AT&T and unseating Blackberry customers.
- Ubiquitous channel presence to exclusives. Your global partnership strength was unmatched and yet disrupted. The iPhone launched with single carriers in limited markets, on purpose. Many viewed that as a liability, including Blackberry. Yet in hindsight this only increased the value to the selected partners and created demand from other potential partners (even with the tension).
- Revenue sharing to data plan. One of the main assets that was mostly invisible to consumers was the revenue to Blackberry for each device on the network. This was because Blackberry was running a secure email service as a major anchor of the offering. Most thought no one was going to give up this revenue, including the carrier ability to up-charge for your Blackberry. Few saw a transition to a heavily subsidized business model with high priced data plans purchased by consumers.
These are just a few and any one of these is probably debatable. The point is really the breadth of changes the iPhone introduced to the Blackberry offering and roadmap. Some of these are assumptions about the technology, some about the business model, some about the ecosystem, some about physics even!
Imagine you’ve just changed the world and everything you did to change the world–your entire world view–has been changed by a new product. Now imagine that the new product is not universally applauded and many folks not only say your product is better and more useful, but that the new product is simply inferior.
Put yourself in those shoes…
Disruption happens when a new product comes along and changes the underlying assumptions of the incumbent, as we all know.
Incumbent products and businesses respond by often downplaying the impact of a particular feature or offering. And more often than folks might notice, disruption doesn’t happen so easily. In practice, established businesses and products can withstand a few perturbations to their offering. Products can be rearchitected. Prices can be changed. Features can be added.
What happens though when nearly every assumption is challenged? What you see is a complete redefinition of your entire company. And seeing this happen in real time is both hard to see and even harder to acknowledge. Even in the case of Blackberry there was a time window of perhaps 2 years to respond–is that really enough time to re-engineer everything about your product, company, and business?
One way to look at this case is that disruption rarely happens from a single vector or attribute, even though the chorus might claim X disrupts Y because of price or a single feature, for example. We can see this in the case of something like desktop linux–being lower priced/open source are interesting attributes but it is fair to say that disruption never really happened to the degree that might have been claimed early on.
However, if you look at Linux in the data center the combination of using Linux for proprietary data center architectures and services combined with the benefit of open source/low price brought with it a much more powerful disruptive capability.
One might take away from this case and other examples, that the disruption to watch out for the most would be the one that combined multiple elements of the traditional marketing mix of product, price, place, promotion. When considering these dimensions it is also worth understanding the full breadth of assumptions, both implicit and explicit, in your product and business when defending against disruption. Likewise, if you’re intending to disrupt you want to consider the multiple dimensions of your approach in order to bypass the intrinsic defenses of incumbents.
It is not difficult to talk about disruption in our industry. As product and business leaders it is instructive to dive into a case of disruption and consider not just all the factors that contributed but how would you respond personally. Could you really lead a team through the process of creating a product that literally inverted almost every business and technology assumption that created $80B or so in market cap over a 10 year period?
In The Sun Also Rises, Hemingway wrote:
How did you go bankrupt? Two ways. Gradually, then suddenly.
That is how disruption happens.
Back in the pre-web days, acquiring software was difficult and expensive. Learning a given program (app) was difficult and time consuming. Within this context there was an amazing amount of innovation. At least in part, these constraints also contributed to the oft-cited (though not well-defined) concept of bloatware. Even these constraints do not seem particularly true on today’s modern mobile platforms, we are starting to see a rise in app bloat. It is early and with enough self-policing and through the power of reviews/ratings we might collectively avoid bloatware on our mobile devices.
Product managers have a big responsibility to develop feature lists/themes that make products easier to use, more functional, and better overall–with very finite resources. Focusing these efforts in ways that deliberately deprioritize what could lead to bloatware is an opportunity to break from past industry cycles and do a better job for modern mobile platforms. There are many forms of bloat across user experience, performance, resource usage and more. This post looks at some forms of UX bloat.
This post was motivated by a conversation with a developer considering building an “all in one” app to manage many aspects of the system and files. This interesting post by Benedict Evans, http://ben-evans.com/benedictevans/2013/9/21/atomisation-and-bundling about unbundling capability and Chris Dixon’s thoughtful post on “the internet is for snacking” http://cdixon.org/2013/09/14/the-internet-is-for-snacking/ serve as excellent motivators.
The first apps people used on PCs tended to be anchor apps–that is a given individual would use one app all day, every day. This might have been a word processor or a spreadsheet, commonly. These apps were fairly significant investments to acquire and to gain proficiency.
There was plenty of competition in these anchor apps. This resulted in an explosion in features as apps competed for category leadership. Software Digest used to evaluate all the entries in a category with lists of hundreds of features in a checklist format, for example. This is all innovation goodness modulo whether any given individual valued any given feature. By and large, the ability for a single (difficult to acquire and gain proficiency) product to do so many things for so many people was what determined the top products in any given category.
Two other forms of innovation would also take place as a direct result of this anchor status and need to continue to maintain such status.
First, the fact that people would be in an app all day created an incentive for an ISV to “pull into” the app any supporting functionality so the person would not need to leave the app (and enter the wild world of the OS or another app that might be completely different). This led to an expansion of functionality like file management, for example. This situation also led to a broad amount of duplication of OS capabilities from data access to security and even managing external devices such as printers or storage.
As you can imagine, over time the amount of duplication was significant and the divergence of mechanisms to perform common tasks across different apps and the OS itself became obvious and troublesome. As people use more and more programs this began to strain the overall system and experience in terms of resources and cognitive load.
Second, because software was so hard to use in the early days of these new apps and paradigms there was a great deal of innovation in user experience. Evolving from command line to keyboard shortcuts to graphical interface. Then within the graphical interface from menus to toolbars to palettes to context menus and more. Even within one phase such as early GUI there were many styles of controls and affordances. At each innovation junction the new, presumably easier mechanism was added to the overall user experience.
At an extreme level this just created complete redundancy in various mechanisms. When toolbars were introduced there was a raging debate over whether a toolbar button should always be redundant with a menu command or never be redundant with a menu command. Similarly, the same debate held for context menus (also called shortcut menus, which tells you where that landed). Note a raging debate means that well-meaning people had opposing viewpoints that each asserted were completely true, each unable to definitively prove any point of view. I recall some of the earliest instrumented studies (special versions of apps that packaged up telemetry that could later be downloaded from a PC enlisted in the study) that showed before/after the addition of redundant toolbar buttons, keyboard shortcuts, and shortcut menus. Each time a new affordance was added the existing usage patterns were split again–in other words every new way to access a command was used frequently by some set of people. This provided a great deal of validation for redundancy as a feature. It should be noted that the whole system surrounding a release of a new mechanism further validated the redundant approach–reviews, marketing, newsgroups, enthusiasts, as well as telemetry showed ample support for the added ways of doing tasks.
As you can imagine, over time the UX became arguably bloated and decidedly redundant. Ironically, for complex apps this made it even more difficult to add new features since each brand new feature needed to have several entry points and this toolbars, palettes, menus, keyboard shortcuts, and more were rather overloaded. Command location became an issue. The development of the Office ribbon (see JensenH’s blog for tons of great history – http://blogs.msdn.com/b/jensenh/) started from the principle of flattening the command hierarchy and removing redundant access to commands in order to solve this real-estate problem.
By the time modern mobile apps came on the scene it was starting to look like we would have a new world of much simpler and more streamlined tools.
Mobile apps and the potential for bloat
Mobile app platforms would seem to have the foundation upon which to prevent bloat from taking place if you consider the two drivers of bloat previously discussed. Certainly one could argue that the inherent nature of the platforms intend for apps to be focused in purpose.
First, apps are easy to get and not so expensive. If you have an app that takes photos and you want to do some photo editing, there are a plethora of available photo editing apps. If you want to later tag or manage photos, there are specialized apps that do just that. While there are many choices, the web provides a great way to search for and locate apps and the reviews and ratings provide a ton more guidance than ever before to help you make a choice. The relative safety, security, and isolation of apps reduces the risk of trial.
Second, because the new mobile platforms operate at a higher level of abstraction the time to learn and app is substantially reduced. Where classic apps might feel like you’re debugging a document, new apps building on higher level concepts get more done with fewer gestures, sometimes in a more focused domain (compare old style photo editing to instagram filters, for example). Again the safety afforded the platforms makes it possible to try things out and undo operations (or even whole apps) as well. State of the art software engineering means even destructive operations almost universally provide for undo/redo semantics (something missing from the early days).
Given these two realities, one might hope that modern mobile apps are on a path to stay streamlined.
While there is a ton of great work and the modern focus on design and simplicity abounds in many apps, it is also fair to say that common design patterns are arising that represent the seeds of bloat. Yet the platforms provide capabilities today that can be used effectively by ISVs to put people in control of their apps to avoid redundancy. Are these being used enough? That isn’t clear.
One example that comes to mind is the share to verb that is commonly used. Many apps can be both the source and sink of sharing.
For example, a mail program might be able to attach a photo from within the program. Or the photo viewer might be able to mail a photo.
It seems routine then that there should be an “attach” verb within the mail program along with the share verb from the photo viewer. On most platforms this is the case at least with third party mail programs as well. This seems fast, convenient, efficient.
As you play this out over time the the mail program starts to need more than attach of a photo but potentially lists of data types and objects. As we move away from files or as mobile platforms emphasize security the ability for one app to enumerate data created by another makes this challenging and thus the OS/apps need to implement namespaces or brokers.
The other side of this, share to, becomes an exceedingly long list of potential share targets. It becomes another place for ISVs to work to gain visibility. Some platforms allow ISVs to install multiple share targets per app and so apps show up more than once. On Android there is even a third party app that is quite popular that enables you the ability to offer to manage this list of share targets. Windows provides this natively and apps can only install as a single share to target to avoid this “spamming”.
As an app creator, the question is really how critical it is to provide circular access to your data types? Can you allow the system to provide the right level of access and allow people to use the native paradigms for sharing? This isn’t always possible and the limitations (and controls) can make this impossible, so this is also a call to OS vendors to think through this “cycle” more completely.
In the meantime, limited screen real estate is being dedicated to commands redundant with the OS and OS capabilities might be overloaded with capabilities available elsewhere.
A second example comes from cross-platform app development. This isn’t new and many early GUI apps had this same goal (cross-platform or cross-OS version). When you need to be cross-platform you tend to create your own mechanisms for things that might be available in the platform. This leads to inconsistencies or redundancies, which in turn make it difficult for people to use the models inherent in the platform.
In other words, your single-app view centered around making your app easier by putting everything in the context of your app drives the feature “weight” of your app up and the complexity of the overall system up as well. This creates a situation where everyone is acting in the interest of their app, but in a world of people using many different apps the overall experience degrades.
Whether we’re talking about user/password management, notifications, sounds, permissions/rights, and more the question for you as an ISV is whether your convenience or ease of access, or desire to do things once and work across platforms is making things locally easier at the expense of the overall platform or not?
Every development team deals with finite resources and a business need to get the most bang for the buck. The most critical need for any app is to provide innovative new features in the specific domain of the app–if you’re a photo editing app then providing more editing capabilities seems more innovative than being able to also grab a picture from the camera directly (this is sort of a canonical example of redundancy–do many folks start in the editor when taking a picture, yet almost all the editors enable this because it is not a lot of extra code).
Thinking hard about what you’re using your finite resources to deliver is a job of product management. Prioritizing domain additions over redundancy and bloat can really help to focus. One might also look to reviewers (in the App Stores or outside reviewers) to consider redundancy as not always more convenient but somewhat of potential challenge down the road.
Ironically, just as with the GUI era it is enthusiasts who can often drive features of apps. Enthusiasts love shortcuts and connections along with pulling functionality into their favorite apps. You can see this in reviews and comments on apps. Enthusiasts also tend to have the skills and global view of the platforms to navigate the redundancy without getting lost. So this could also be a case of making sure not to listen too closely to the most engaged…and that’s always tricky.
Designers and product managers looking to measure the innovation across the set of features chosen for a release might consider a few things that don’t necessarily count as innovation for apps on modern mobile platforms:
- Adding more access points to previously existing commands. Commands should have one access point, especially on small screen devices. Multiple access points means that over time you’ll be creating a screen real estate challenge and at some point some people will want everything everywhere, which won’t be possible.
- Making it possible to invoke a command from both inside-out and outside-in. When it comes to connecting apps with each other or apps to data, consider the most fluid and normal path and optimize for that–is it going from data to app or from app to data, is your app usually the source or the sink? It is almost never the case that your app or app data is always the starting point and the finishing point for an operation. Again, filling out this matrix leads to a level of bloat and redundancy across the system and a lack of predictability for customers.
- Duplicating functionality that exists elsewhere for convenience. It is tempting to pull in commonly changed settings or verbs into your app as a point of efficiency. The challenge with this is where does it end? What do you do if something is not longer as common as it once was or if the OS dramatically changes the way some functionality is accessed. Whenever possible, rely on the native platform mechanisms even when trying to be cross-platform.
- Thinking your app is an anchor so it needs to provide access to everything from within your app. Everyone building an app wants their app to be the one used all the time. No one builds an app thinking they are an edge case. This drives apps to have more and more capability that might not be central to the raison d’être for your app. In the modern mobile world, small tools dominate and the platforms are optimized for swiftly moving between tools. Think about how to build your app to be part of an overall orchestra of apps. You might even consider breaking up your app if the tasks themselves are discrete rather than overloading one app.
- Reminding yourself it is your app, but the person’s device. ”Taking over” the device as though your app is the only thing people will use isn’t being fair to people. Just because the OS might let you add entry points or gain visibility does not mean you should take advantage of every opportunity.
These all might be interesting features and many might be low cost ways to lengthen the change log. The question for product managers is whether this was the best use of resources today and whether it builds the experience foundation for your app that scales down the road?
Where do you want your innovation energy to go–your domain or potential bloat?