Blacked-Out Top Sheets & Objective Ski Testing

Blister weighs in on blacking out topsheets for ski tests

On our recent GEAR:30 podcast, Shop Talk: Pulling Back the Curtain on Gear Testing (episode #172), we discussed (once again) some of the best and worst ways to test skis — and many of these worst practices still take place in the ski industry today.

In response to the episode, a number of people wrote in to ask what we thought about blacking out top sheets in the interest of creating a more “objective” ski test.

We addressed this question several years ago, and thought it would be worthwhile to repost and to revisit the topic.

Let us know what you think.

10.09.17

This past Saturday, we got tagged by LINE Skis in a social post (along with SKI, Powder, and Freeskier magazine), about a “blacked-out” top sheet snowboard test that Snowboarder Magazine put together.

The post said this:

“Well done Snowboarder Magazine! Completely even playing field. No favorites, no tuning wars, no advertising dollars from competing brands and no sponsored / influenced testers. This would be a great idea for skiing. Do you agree? SKI Magazine, FREESKIER Magazine, BLISTER, Powder? We think adding something like this is a great idea.”

So since we were asked to weigh in, we did, and we replied that, while this was certainly an interesting experiment being done by Snowboarder Magazine, here at BLISTER, we don’t actually believe that blacking out top sheets would affect the results of our testing at all, and that there were far more important factors when it comes to creating a “completely even playing field.” Such as:

(1) Not accepting advertising dollars from any of the companies whose products are being reviewed — which we don’t.

(2) Not having sponsored skiers reviewing skis — which we don’t.

(3) Testing product over multiple days, and in multiple conditions — which we do.

(4) Finding skilled reviewers for our product testing, not just skilled skiers. (Because being really good at skiing does not mean that you are necessarily really good at teasing out and articulating the nuances of how a given ski performs.)

(For more on how and why we do things the way we do here at BLISTER, you can check this out.)

And so, assuming that the 4 criteria above are in place, do you still think that blacking out top sheets would make for a much more objective test?

Because we could certainly start working to put such a test together.

But … 

The more time you spend thinking about this, the more the logistical issues start to pile up. For example:

(1) It would be incredibly easy for us to identify certain well-known skis just from their shape, even if you blacked out the top sheets. This seems like a really big problem.

(2) For this to be an apples-to-apples test, you’d need to pick a category of skis. It would be pretty useless to review a blacked-out 118mm-wide pow ski vs. a blacked-out 93mm-wide carver vs. a 105mm-wide charger. So you’d have to pick a category. Which is fine, except then it would seem that you would pretty quickly encounter the first problem we listed — some of the shapes would be obvious.

(3) So to counter that, you could have a bunch of companies all make a brand-new ski — e.g., either a touring ski, or a jib ski, or a directional charger. But they’d have to (a) want to do that; (b) have the time and resources to do that; (c) they’d need time to develop prototypes and test the new ski so that it didn’t perform like garbage — so we’d have to wait something like 6-18 months to do this test. And then (d) I guess they’d all then release for sale to the public all of these production skis in the same category?

Maybe they’d want to do this?

But given all of these logistical complexities, we believe that there are far simpler, more straightforward, and more impactful steps that can be taken toward creating a level playing field. And we have gone to great lengths to create such a level playing field — and have turned down a lot of advertising money from the ski and bike and running shoe and apparel companies in order to keep this playing field level. We do this because we believe that this is the right way to do things if the goal is truly to provide the most accurate product information for consumers.

And taking all of this into account, it seems more and more that the ‘blacked-out top sheets’ test makes the most sense if (a) you are taking money from advertisers (and therefore you need to find ways to remove the financial conflict of interest), and / or (b) your reviewers are working for (or are sponsored by) ski companies — so you need to make sure they aren’t being homers for their products, and haters on everyone else’s.

But again, we don’t think you should be doing (a) or (b) in the first place if you’re really trying to create a level playing field.

Furthermore, we train our reviewers to be as accurate as possible when testing a product. Our job is to talk about what a product does or doesn’t do well; identify where it excels and doesn’t excel, performance-wise, compared to its direct competitors; and help people understand which products will — and won’t — work best for them.

Another factor of our testing that helps to weed out any would-be bias is the product-comparison work we do. It’s possible that a tester might come in thinking, “I really tend to like skis from this company.” But then they have to go A/B that ski against several other skis in that category, and report how it compares on groomers, in moguls, on steeps, in powder, in crud, etc.

And then we often have multiple reviewers testing those same products in the same conditions, so we have other reviewers confirming or challenging any potentially undeserved praise or criticism that might come from a reviewer’s personal bias.

Given all of that, does it still sound to you like blacking out the top sheets is a really compelling way to get more accurate / less biased ski assessments? Worth doing? Not worth doing? Thoughts on the numerous logistical issues?

49 comments on “Blacked-Out Top Sheets & Objective Ski Testing”

  1. Agree. I can’t imagine blacking out topsheets would have any impact for the positive unless you’re rating skis with a two-thumbs-up or 1-10 rating scale (which I appreciate that you don’t do). Part of what’s nice about your reviews is how you suss out the reality vs the expectations for any given ski and compare it to skis with a similar skier and objectives in mind. It just doesn’t make sense to blindly compare a jibby park ski to an East Coast carver and compare their crud busting ability. Context is key.

    • After re-reading all of the older comments and all the new ones … I still am not persuaded that blacking out top sheets would have any meaningful impact because (as you say) we don’t do a thumbs-up system or a 1 out of 10 rating – AND … IMPORTANTLY … we aren’t offering ratings after doing only a run or two on the ski.

      And I still maintain that the less time you actually spend on or with a product, the greater the likelihood that you will be biased by things like topsheets. Because you haven’t actually spent enough time to come to considered judgments about performance differences. So you’re left to guess.

      And this is why it is a fatal flaw to do ski tests that include taking only one or two runs on a ski. And I think that the people who are most in favor of blacking-out topsheets are failing to realize this.

  2. I absolutely think that blacking out top sheets is a great idea. Yes, some you would be able to tell or at the very least suspect their provenance, but I think you underestimate the effect of graphical advertisement on even the most objective of judges. Why not add another layer of objectivity?

    • See my comment above.

      Also, time is a finite resource, and this is just one other thing to have to add to our extremely time-intensive process.

      And again, when your only goal is to write product reviews that are as accurate as possible … then accuracy is the only goal. And reviewers that aren’t accurate get cut.

      And given the way we’ve structured Blister financially — not taking any money from gear manufacturers — we have every incentive in the world to be as accurate as possible with what we’re experiencing on snow.

      There is zero financial incentive – in fact, there’s only financial damage – for us to call skis “great” because we like the topsheets.

      Furthermore, have you ever tried to write ~10,000 words about a product where you go into significant detail about how it performs, and other reviewers are reading your work to make sure you aren’t off base … but then it turns out that, all-along, you just really liked the sick topsheet on the ski, so your descriptions of how the ski performed on groomers and in chop and in crud and in powder and how it compares to 10 other skis in the category were thrown off … by those amazing topsheets?

      Again, blacked-out topsheets are more compelling if you aren’t actually spending enough time on the product, so you end up guessing. Or you’re a sponsored athlete or you work for a ski company and you are supposed to objectively go review other companies’ skis.

      But that’s not who we are or what we do.

  3. So the Snowboard blackout test was all boards in one category.

    The shape thing is important. You’re not gonna confuse a Soul 7 with any other ski, or a Moment with any other brand.

    The snowboard test seemed mostly like an excuse to market a video series with a sponsored rider rather than a legit review scenario.

    • Ding ding.

      Or, I might argue, that it was a way to try to put a bandaid on a fundamentally flawed way of testing stuff.

  4. This is probably a good idea, but one of the great things about your reviews is that they are over the long term, though I guess reviewers could ride blacked out skis for the normal relatively long timeframe.
    Additionally almost all skis can be identified by their bases (at least the brand), which I assume you would not be blacking out?
    Still a big fan of Blister reviews either way and I agree it’s even more important for other reviewers who do all their tests in a day or so.

    • Agreed. The less time on the product, the better this blacked-out idea is.

      So … why not just do the far better thing and not cram all your testing into a 2-3 day window?

      Answer: you do a 2-3 day test window because the primary intention is not to provide the most accurate and helpful product reviews. It’s an old model of testing that nobody questioned because it used to be the only model out there.

  5. Blacking out skis is a great idea, if only as a marketing tool for the site. It adds a layer of interest in the story, and definitely would draw more clicks to the reviews. Sure most of the time the reviewers are going to know exactly what ski they are on, but once in a while there will be a surprise and that may add an interesting layer to the writing.

  6. I fully agree with Jonathan plus the bases problem James mentioned.

    I hope Blister will not engage in such testing because that would suggest that they are somewhere on a comparable level to the other magazines mentioned. That would be horrific.

    Have fun

  7. You guys will be able to tell in a split second what ski you have in your hand. Hell, I would probably guess 8/10.
    The only way this would remotely work is if somebody puts the skis down at the top of each run and you step in blind. Most of you would probably still be able to guess what you’re on when you look down.

  8. One thing that we didn’t discuss while writing this post is the goal of the review. At Blister, we try to place the ski accurately on the spectrum — explain who the ski is for and who it isn’t for. That’s a very different idea than other tests that try to tell which ski is “best”.

    I don’t think blacking out would have much of an impact on my testing (ignoring the logistical issues) because I don’t really care if the ski is “good” or not. I just want to make sure that I understand what the ski does best and what kind of skier would excel on it (that’s not to say I don’t have my favorites, obviously).

    However, the mindset totally changes when you are trying to say which one YOU like better, which one is “best”. I think that sort of testing is much more prone to bias.

    • Late to the party, but this.

      I know what I’m looking for in a ski’s performance, and Blister’s reviews tell me if it’s a fit for me, or not so much.

      For instance, how can you say whether a Head Monster 108 or Kore 105 is “better”?!

      You can say which of the two is better at what traits, and the rest is up to the reader being honest with themselves.

  9. I don’t think blanking out top sheets for professional reviewers would be practical or really helpful. Where it would be most beneficial is to black out top sheets for demos or tests by regular skiers, so that they can focus on the feel and performance and not be seduced by branding and looks. They’d also be less susceptible to knowing shapes at a glance. If the ski companies would be willing to let an independent company like Blister do an on-mountain demo with documented feedback over multiple days in multiple locations and varied conditions then we would have usable data.

    • cjtrapp I really like your idea about having demos where laypeople could try out skis with blanked out top sheets, not for the purpose of publishing reviews, but just for people to review skis for themselves without bias (I’m assuming that the vast majority of skiers couldn’t tell what specific model a ski is just by the dimensions and flex, like the Blister folks can).

      • Well at least we now have the Blister Summit, where laypeople *can* try out a number of skis and talk with us (if they want to) about what they’re feeling.

        I’m not sure we’ll do it at this upcoming Blister Summit, but maybe we try a bit of a blacked-out test on a category of skis at a future Summit.

  10. Your statement about not taking money from ski brands for your tests got me thinking. How independent are the reviews in American ski magazines? I get the feeling that the “reviews” are little more than glorified ads. This year, I am getting Powder and Freeskier. Both seem to have similar lists of skis in their buyer’s guides. Can anyone confirm or deny this?

  11. My take is it might be a “fun” one time experiment to try. When you go to S. America for testing, and you have a number of similar skis to test in one day, it would be interesting. It would probably be obvious who makes the ski to all your testers, but some black spray paint would be easy to apply. If you had a guest tester, and he described what kind of ski he liked, and then you gave him a small number of blacked out skis to test, it might be interesting to see if he liked the one you guys describe as being the closest to what he likes. Or something along those lines.

    You guys have the right formula, stick to it. A test by the general public would be different with blacked out skis.

  12. My impression (may be wrong) is that you guys don’t take everyone on a weekend and test a ton of skis at once like Outside, Backcountry, etc. do. This has been said before, but in that setting perhaps it makes sense. For Blister’s style of reviews, not so much.

  13. Blanked out top-sheets for Jonathan Ellsworth and Blister? Leave the blind taste test shit for the winos. Never has any ski that I purchased after a Blister review been anything other than a spot on reflection of the Blister review. Please don’t waste your most valuable resource, time, on such tomfoolery.

  14. For me, blacking out just emphasizes the choking of creativity in ski shape throughout the ski industry. It’s the same shape-ish. In my opinion over the last 5 years the industry has been working its way back to the eighties and nineties where graphics were boring and the big brands said their ski was so good it could stay the same for years. There are a couple folks still innovating…Hoji: always, Moment is doing some cool stuff and a few others.
    So yeah black them out, the graphic will be better than what is being offered by most brands today and they all ski pretty close to the same.
    Wow, I promise I’m not as bitter as that sounds. I love skiing, I just want the innovation that the snowboard industry has at the moment. We don’t need more super side cut skis with cam-rock profiles…it’s been done…bored.

    • I’d say that your wish has been coming true – at least a bit – over the past four years. Skis like the Folsom Spar Turbo, the Black Crows Mirus Cor, the LINE Blade, etc. etc.

    • Ha. Maybe this will be our Blister ‘Crash Course’ video after we make our telemark video and our snowblade video and after we monoski in Alaska.

  15. Blackout is unnecessary here. I believe reviewers are sufficiently independent. Main point is that the reviewers actually ski the skis for more than 20 minutes on the piste of the day and write a detailed review from more than one reviewer’s point of view.

    When publications mix ad spend, sponsorships and 200 words which deliver the skiing equivalent of laterally stiff but vertically compliant then it makes no odds.

    An area for improvement would be for you guys to review some actual piste skis. 68mm Stocklis vs Fischer and Atomic etc. I have no clue what to buy and ski Swiss and Italian artificial ice rinks quite often.

  16. I think this would be interesting to try, just to see if there is any inherent bias (negative or positive) once you know the brand and model. For Blister to do this it would be even more valuable because you have removed many of the other obvious conflicts and you test skis over an extended period with different conditions. What would be really interesting is to blind test say 3 to 5 pairs of skis of a certain category then retest the same skis with the top sheets revealed and see if the conclusions for each ski are similar. I suspect that knowing the brand has some effect on one’s perception of the ski’s performance, similar say to someone you respect as a skier telling you this is their favorite ski before you try it. Since Blister testers are strong, knowledgeable skiers, revealing the composite makeup of each ski prior to testing without knowing the brand might have the same effect. Or maybe this prior knowledge has no effect at all and wouldn’t that be great to know.

  17. I’ve always enjoyed how most Blister reviewers start off reporting what the marketing team said about the ski, ride it, and then report how closely the ski aligns with its advertised purpose. Brilliant!! Kinda hard to do that in a blind test.

  18. Blacking out the top sheets is of limited use. To be honest the only tests I really rely on when choosing skis are Blister’s. Your unique method isn’t as affected by brand bias as other methods. At least that’s what I think. Other testers that test a lot of skis in a short period of time might gain from it since it might remove some brand bias. But on the other hand I don’t trust those tests that much. I see them as pointers. Nothing more.

  19. In my opinion, the testing Blister does today is top notch. I’ve purchased 3 skis solely based on Blister reviews, deep dives and comments/discussions. Every time, the ski performed precisely as Blister said it would.

    On paper, adding blackout topsheets would add another layer of objectivity to what I think are already the most objective ski reviews in the industry. But as already mentioned, bases/shape/rocker and ski feel will be very difficult to hide unless Blister testers intentionally detach themselves from the industry and come into every test not knowing what the relative design patterns of say the Nordica Enforcer and Rossi S series are. And who wants reviews from people who don’t understand the very basics like this?

    Even if you could hide all that, what I love about Blister are the meticulous A/B and Deep Dive type testing — and the attention to the nuances of one ski over another. For example, the A/B testing you guys did with Enforcer 110 and the Wrenegade 108. Two very similar skis, similar performance, but different feel. Jonathan and Paul went back and forth on the nuances of turn initiation between the heavily rockered Wren vs the more immediate tip engagement of Enforcer. And the smoother on snow feel of the metal laminate Enforcer vs Wren.

    Being highly sensitive to these nuances is what makes Blister the best. So black out the topsheet all you want, but in a small (A/B, deep dive) group of skis I bet every Blister tester will need 3 turns to guess which ski model they are on.

    In short, I appreciate the desire to black out topsheets. But in practical terms, it is at best a futile exercise and at worst will make Blister reviews worse.

  20. Yes of course if you work in the industry you’re going to know what ski you’re on. Blacking out topsheets is a complete waste of time, only complete hubbards would believe that this kind of test could be objective.

  21. I don’t always agree with Blister reviews, but respect the even-handed approach you guys have towards gear testing. I’m pretty sure that you focus on making turns, not how rad your skis look in a lift line or on a skintrack. I even recall a review or three where your objections were about a bad factory tune (which you rectified).

    When you guys drop in, I’m pretty sure you’re focused on getting after it, not your topsheets. Short of some magical pixellated topsheets that wholly masked a ski’s shape, I don’t think there’s a way to hide what you’re skiing anyway (and that’s even worse for MTB’s).

    So…..don’t waste time or money on blacked out topsheets unless you’re trying different constructions of skis that come out of the same press (light core vs. beefy core). That might be interesting.

  22. Pretty sure they got the idea of the video from surfing magazine “Stab in the Dark” where a pro reviews a number of completely white surfboards, no logos, no dimensions/ markings.

    I think putting white or black vinyl or spray paint over the entire ski topsheet and just numbering them + maybe a waist width so that you have a somewhat blind test might be a good idea. That way you can at least keep track of the “categories” of skis. I think with no top sheet or bottom graphic even people who have skied a ton of stuff wouldn’t be able to pick out skis that don’t have a truly iconic shape.

    Everyone has historical bias on some level with brands, but since you guys and your testers have a pretty even hand/ process I don’t think it’d make as much of a difference than if a normal ski magazine did it. If a normal mag did it there would be a lot less “editor choice” awards…or maybe even more ad bribery?

    • Transworld Snowboarding, now defunct, did whiteout snowboard buyer’s guides/reviews for years. Always a good read while on the shitter, but the idea that top sheet graphics whited out actually does anything to lessen the reviewers’ bias is bull, in my opinion.

      Blister people: I’d like my printed buyer’s guide now, please

  23. From a pure research methodology standpoint, it is worth trying. My guess is that the impact is marginal (given the other factors that make Blister reviews trustworthy), but it could still have some effect.

    Maybe your testers can still ID most of the skis based on shape/feel/bottoms, but I suspect that blacking out (or any other color) the skis would still have an impact on implicit psychological biases.

    For instance, regardless of how the ski feels, your brain is wired in a way that it is just going to feel better about skis that you think LOOK good. If you hate graffiti-style topsheet art, your brain might tell you that you feel like you are having slightly more fun on Black Crows skis than on Moments. The reviewer’s opinions about aesthetics shouldn’t actually matter, since tastes vary and the reader can make that call without testing the skis.

    Also, even if your testers can tell (especially when looking at the skis in a lineup of other skis), having unmarked skis on their feet all day makes them less likely to think about branding and model during the test since it isn’t in their face. When you make something “top of mind” by putting it there every time you look down, you raise its mental importance. If brand and model are NOT supposed to be factors in your reviews and you’ve got an easy way to reduce their impact (spraypaint or vinyl layers)…why would you not try it?

    On the flip side, it might make the reviewing and comparison process more difficult for your reviewers. One of the great things about Blister reviews is the comparison to other skis. It is going to be a lot harder to make those comparisons when you have no visual cues in your memory. How hard will it be to say “Despite the width, the carving performance feels a lot like black ski #23 that I tested last season…or wait, was it black ski #22?”. It is not like you are doing a single-session taste test between 5 different colas where you can keep sampling to determine if #1 has a deeper flavor than #3.

    Maybe try it as a single-season experiment where you announce that you will do it for one season of testing, (maybe just one category of skis–e.g. skis wider than 110 will be blacked out) and wait until the end of the season to evaluate whether or not you feel like it makes a difference.

  24. I see the appeal of trying to create a blind test. It would theoretically be more objective and could eliminate bias that reviewers are not aware that they have. If you look at any scientific study this type of blind trial is not only standard, but the lack of such measures can be considered reason to throw out results. However, as mentioned, most skis would be identifiable without their top sheet. The shapes would give them away immediately.
    The solution would be to rig up a set of head gear that every tester would need to wear such that they cannot look down at their skis. Imagine Randy from “A Christmas Story” or perhaps more of a back brace situation. I’m excited to see this implemented and beginning a new standard for true objectivity in ski testing.

  25. For the logistical reasons alone, this seems like a difficult, risky endeavor with little payoff.

    Honestly, my takeaway from the (very interesting) podcast with The Ski Monster was the importance of representation and accessibility. As a consumer it’s important to find a reviewer who describes their style similar to how they feel they approach gear or skiing. From the reviewer’s side, this partially means being open and personal with their style and “favorite” gear. If Jonathan and Luke disagree on a ski, who should the consumer agree with? In my opinion Blister does a good job of highlighting these biases in their reviewers.

    Related to accessibility, I do sometimes wish there was more East Coast or Upper-Midwest representation in reviewers. Logistically this seems almost impossible due to the sheer number of skis you guys have your hands on in a year. Plus, shipping them around like that is time lost where that ski could potentially be ridden. A lot of times the numbers and features I look for in a ski here seem less important at larger resorts. For instance: a 30m radius ski like the 4frnt Hoji might be great for U.P. touring or spots like Mount Bohemia, but would be effectively useless for every other resort. Some reviews also tend to banish thoughts about skis for this area to the footnotes, leading to an epidemic of grossly overly aggressive frontside skis. Far too often I see beginners or low-intermediates here on extreme GS or Slalom skis because some magazine or shop told them that’s what you need for boilerplate ice and groomers.

    Listen there may only be like 40 of us here, but someone has to do something about Chicago soccer moms buying Racetigers as a first ski.

  26. I think a main thing Blister is good at, that some review sites are not, is being honest about your biases. A reviewer who mostly skis park on the east coast is going to have biases that are dramatically different from Jonathan about the all mountain capabilities of a ski like the ARV etc.

    I also think certain shapes favor maritime powder vs continental powder. Having a reviewer saying where they ski and the type of powder/chop they skied is far more useful than a blanket statement about how a ski handles chop.

  27. useless, I think you have the correct test method and I believe in your impartiality and that the fact that there are several testers guarantees it.

  28. I’ve been thinking more about objective benchmarking for stiffness as a way to improve ratings. Similar to the way some bike stems and cranks are tested. A weight is added and deflection is measured. You’d have to account for ski length, but it might be a cool project even if pretty complicated to do.

    I’d also love to see a downloadable spreadsheet available with annual guides, with flex/weight/width etc. Even if you had to pay for it.

  29. Ever hear of a book called Noise by Daniel Kahneman? It is a fair amount of effort to read but anyone making judgements, which it seems may include reviews, might find it interesting and certainly useful. It is amazing how many things influence our judgements and how flawed a human can be in making them. Fortunately the author provides ways to help minimize their impact.

  30. Blister reviews are great, and the point of them as I see it (to lay out the characteristics of the skis for consumers to make informed choices) doesn’t require blind tests. However, if reviewers wanted to pick their favourite ski between similar skis and they were unfamiliar with the shape then this would be a fun gimmicky test like a blind tasting for wine. A special event perhaps to challenge the reviewer in identifying subtle differences, but not required for daily Blister testing.

  31. I infer that Jonathan deems it would be a net-negative change to black-out topsheets at Blister, yet it seems he might consider doing it anyway if enough customers were to demand it. However, looks to me like the people lobbying here for black-out are nowhere near a majority. I’m strongly against black-out, in Blister’s case.

    I do a lot of A/B Testing of skis (not for Blister). Please understand that Blister’s A/B Testing/comparisons/spectrums take A LOT MORE time/effort/focus than just isolated ski reviews without any valid direct comparisons. That time/effort/focus increases factorially as you increase the number of ski models you directly compare (you know, kinda like “exponential growth”, but it’s “factorial”).

    Any Blister effort towards effective blacking-out/blind-coding would mean significantly LESS focus on all the comparisons and deliverables that readers crave—which would result in significantly fewer deliverables published. Not worth it in Blister’s case.

    Furthermore, does anyone really want to throw extra hurdles/distractions at Paul Forward that will hinder HIS “production of deliverables” (which includes keeping clients alive as a heli-guide, saving lives as a doctor, etc.)? Not worth it in Blister’s case.
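
For a rough sense of the scaling mentioned in the A/B testing comment above, here is a minimal sketch. It assumes that the unit of work is one direct pairing of two ski models, which the comment does not actually specify, and the function names are hypothetical. Under that assumption the number of pairings grows quadratically, while the number of possible full rankings of the same skis is what grows factorially.

```python
from math import comb, factorial


def pairwise_comparisons(n_skis: int) -> int:
    """Distinct A/B pairings among n_skis models (n choose 2).

    Assumption: one 'comparison' means one pair of skis ridden back to back.
    """
    return comb(n_skis, 2)


def possible_rankings(n_skis: int) -> int:
    """Number of ways to order n_skis models from first to last (n!)."""
    return factorial(n_skis)


if __name__ == "__main__":
    # Print both counts for a few category sizes.
    for n in (3, 5, 10):
        print(n, pairwise_comparisons(n), possible_rankings(n))
```

Either way the workload climbs quickly as a category gets bigger, which is the practical point of the comment: going from 5 to 10 skis raises the pairing count from 10 to 45.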
