Transcript of Health Ratings Research from “Trust or Consequence: How Failure to Disclose Ad Relationships Threatens to Burst the Search Bubble”

Presenters:
Peter Goldschmidt, Health Improvement Institute

Other Speakers:

Beau Brendler, Consumer Reports WebWatch
Tom Eng, VMD, MPH, eHealth Institute
Patricia Thomas, University of Georgia
Pravin Patel, Patel Communications
Stephen Barrett, Quackwatch
Bonnie Lawrence, Family Caregiver Alliance and the National Center on Caregiving
George Linzer, LabTests Online
Nicole Spelhaug, MayoClinic.com
Ruth Amernick, San Francisco Public Library
Barbara Coll, WebMama.com
Bill Kelm, SEO-SEM Consultant
Kristen Gerencher, CBS Marketwatch.com
John Hopkins, WebMD.com
Janet McDonald, FDA
Chuck Bell, Consumers Union

Note: This is an edited transcript of the proceedings.

Beau Brendler: Okay, so now we’re going to shift gears a bit, topically. We’re going to be talking about health sites and, specifically, the top 20 health information Web sites that Consumer Reports WebWatch and the Health Improvement Institute have come together to rate.

So I want to introduce to you Peter Goldschmidt. He’s actually going to describe for you his background when he stands up. Peter has had a very long and interesting and varied background and history in this kind of work. He’s going to talk about that and also about the ratings that WebWatch and HII have done together, which go live today in beta.

Peter Goldschmidt: I’m the president and founder of the Health Improvement Institute based in Bethesda, Maryland. And by way of disclosure, I am a physician, I have a doctorate in public health and I went to business school. And I’ve spent my professional career doing two things: One is assessing the effectiveness of medical intervention and, secondly, measuring and improving the quality of medical care.

One of the foundations of medical practice is the evidence, the scientific evidence that underpins medicine. In 1976, we started to investigate that under a contract for the Secretary of Health and Human Services. And some of the work that you’ll hear today has conceptual underpinnings that come from that time and they also bring together a focus on how consumers can evaluate healthcare and how we can contribute to consumers’ ability to do that.

So what I’d like to do today is to describe to you this project. What I’ll do is I’ll talk about the partnership, I’ll talk about the methods that were used to rate the websites. We’ll talk about the results and some of the conclusions that come from those results.

Let me talk a little bit about the partnership. I’m very delighted to say that this has been a very successful partnership over the last many years. And I’d like to thank Consumer Reports WebWatch for the productive relationship that has developed. And also the Institute volunteers who rated some of these Web sites. Some of them are here today, so they can give you their personal experiences of how that process went.

The key event that led us to the point where we are was a workshop the Institute held in 1997, in which 60 people – or 60 organizations, I should say; some people sent two representatives – looked at what the quality of information was on the Internet and how it could be improved. From that initiative came several other initiatives, one of which was this quest to rate individual health Web sites.

The purpose of the ratings is twofold: One, to give consumers the opportunity to have independent ratings from the point of view of enabling them to be better consumers of health information.

I should also thank the Medical Library Association for their active support of this project and, as you may know, health librarians, librarians who specialize in healthcare are a focal point for both professionals and the public to find information, both in published literature and on the Web.

On the next slide, I’ve shown many of the efforts that are underway or have been undertaken to try to provide help to consumers. They take the form of criteria, codes of conduct and trust marks. There’s a growing literature on these activities and also growing literature on the quality of the Web sites. So the question becomes, what’s new? As I think you’ll see today, two things are really new: One is independent commitment to ratings, and the other would be the focus on consumers.

Our philosophy for rating Web sites is really to try to understand what they are about in terms of their disclosures and the ease with which they can be used. Now, health Web site ratings have practical value and they’re twofold: One is a predictive value, that is, people can find out whether or not the site is worth looking at; and secondly, they may illuminate issues that consumers would not be able to undertake for themselves.

For example, we’re all familiar with restaurant reviews. One of the reasons that we like them, other than their entertainment value, is that they guide us as to which restaurants we might want to try and which we might want to avoid. They can also provide you with information you may not be able to get as a consumer, because after all, if you go to a restaurant, you can tell whether you liked it or not, but you may not get some of the background information that a reviewer may have.

So what we’re looking at in the health Web sites is not only some of the things that a consumer can judge for him- or herself, such as ease of use, for which they may want predictive information, that is, if I go there, will it be easy to find what I’m looking for? But we can also provide them with information on, for example, the editorial adequacy of the site that they themselves may find very difficult to judge.

So that’s basically the underlying thoughts behind the way in which we’ve set up the conceptual framework for what we’re looking for on these Web sites.

The other thing that’s very interesting to us is that not all health information sites are the same. We have differentiated different types of Web sites. The health communication Web sites, of which health information is one part, can be differentiated, for example, from behavior modification sites. On the next slide, we show some of these different types of sites.

The criteria you need to evaluate these sites are somewhat common, but they’re also somewhat specific. And so we decided to start with health information sites; the information sites are those sites with the intention of providing consumers with access to information about topics that will help them with their healthcare. And these can be clearly differentiated from sites where the goals are somewhat different. We intend to go on to rate these other types of Web sites later on.

So that’s the basic, underlying philosophy of how we’ve approached the problem and what I’d like to do now is give you many more specifics about how we actually went about doing it.

There are on here four lines of activities. And they’re distinct and necessary in order to get to the end point, which are the ratings described. The first thing we have to do is to decide how we’re going to select Web sites, and so far we’ve focused on the most frequently visited sites, according to Nielsen ratings. We took the top 20 of those sites to provide information and ratings.

The second thing one has to do is have some kind of instrument by which you’ll do the ratings. We started off, as some of you may know, in 2003 by looking at all of the criteria that had been used to evaluate the quality of health information Web sites. From that, we then had a process of evaluating those criteria, coming up with an instrument – and I’ll tell you how we did that in a few moments.

The third stream of activities has to do with who’s going to make these judgments. What we have done is put in place a credentialing process by which we select volunteers in order that they can be put onto panels to rate Web sites, and I’ll describe that process.

The final stream of activities has to do with how can we present these informational ratings to the public in a way that is attractive and useful to them. So that’s the fourth stream of activities.

What I’ll do now is go through each of these and describe to you in brief terms how we’ve approached them.

The first thing we had to do was to come to grips with what is it we’re going to evaluate. We distinguished three things that are very important. One is a generic set of activities, of which you’ve heard a lot about today, that have to do with transparency and the accountability of Web sites.

The second one is the editorial adequacy – how content is selected, rated, reviewed and presented on Web sites, and that’s one of our contributions that we’re going to make with these ratings.

And the third one is basically the information reliability, the one ultimately that people want to know: Is it true? That is a question that transcends Web sites and really is all about the practice of medicine. It is asking the question: What do we know and how well do we know it in the whole field of medicine? That’s an impossible question for an organization such as ours to undertake. The reasons for that are as follows.

One, the Web site may contain thousands and thousands of subjects. Two, to evaluate those subjects, you need experts in that particular subject. And thirdly, it’s no more or no less than trying to evaluate the entire state of the science of medical practice – a very large undertaking and, as we concluded, one that we couldn’t undertake. So what we decided to focus on was the first two, the transparency and accountability and the editorial adequacy of health Web sites.

It doesn’t matter whether you’re on a Web site, you’re in a printed article, you’re reading a textbook of medicine, you still have to face the final question, and we’ll talk more about that in this afternoon’s panel.

As I mentioned, we did start with the top 20 of the hundred most-trafficked sites. We intend to expand those ratings in two ways: one, more sites, and the second would be different types of sites, like behavior modification sites, sites which present information on ratings. That’s going to become more important; we’re going to see more and more sites giving you ratings on doctors and hospitals.

And, finally, and what’s more important ultimately, is the fact that these ratings have to be reviewed and, if necessary, revised periodically, and we intend to do that every six months.

On the next slide, we look at the rating instrument and the rating instrument has two parallel sides to it. As I mentioned, one is the accountability side and the other is the editorial adequacy. Consumer Reports WebWatch undertook to refine their instrument for looking at editorial accountability and transparency and what we did was undertake the task of coming up with a set of criteria and instruments that would actually rate the editorial adequacy of health Web sites. These instruments are available.

We ourselves like to be very transparent and tell everyone how we do things, with the point of view of getting feedback so we can improve the process and the outcome and therefore the usability and utility of the resultant ratings for consumers.

We move on to the next one. This was the contents of our instrument. It had four major parts: One, as I mentioned, the transparency; the second one was editorial adequacy. We also, from a research point of view, looked at the HON [Health On the Net Foundation] code compliance, an important issue – how trustworthy are trust marks? And, finally, we wanted to get feedback from raters about how we could improve the process, so that we could refine the instrument going forward, with every iteration being an improvement on the last.

The five areas in which we look for accountability and transparency were the identity of the Web site; advertising and sponsorships; ease of use; whether they had a policy about corrections; and finally, one about privacy. All of these terms are defined and each of them has a set of criteria, which are in the instrument, and which you can access.

In terms of the editorial adequacy, we’re looking at adequacy not only in a general sense, which might apply to print publications as well as Web sites, but also principles that apply specifically to Web sites. What we asked our volunteer raters to do was to go through two sets of activities. One was a set of checklists, which was the A column. And the other, the B column, was to write some brief narratives as a result of doing their checklist.

The advantage of this approach – that is, a highly structured checklist plus then an interpretative narrative – is that you get the best of both worlds. You get some structured information and you also get some perceptive information – having been through the process, what do you as an expert conclude from having these results? And that creates a richness of information that we as developers of instruments and providers of ratings can use to refine the process and then pass on to consumers.

What we’re looking at in terms of editorial adequacy is how they select information, how they grade information, what they say about the authors, what they say about the reviewers, what kind of quality assurance checks do they have before it goes on the Web site and what kind of policies do they have to make corrections, bring to attention things that are new and revised and so on. These are in terms of very structured information items. And then, as I said, interpretative, evaluative items that they used to enrich the information we have available.

Go on to the next slide. I’ll talk about how we found the people who did the ratings from the Institute point of view. The Consumer Reports WebWatch ratings, the ones on transparency and accountability, were done by their staff. But what the Institute does is rely on volunteer experts in order to do ratings and participate in other programs we run.

We periodically issue a call for raters. They send us biographical information. We have a committee that looks at that information and decides whether or not they meet the criteria that we have established to become a rater. Those criteria are called credentialing criteria. If they meet those criteria, they then go into the pool. For any one Web site, we take three of them and try to match their expertise, so that we have a more complete, rounded view.

Also, our ability to use panels allows us to have more than one set of eyes, because sometimes you just may misinterpret something and, secondly, you get a much more full perspective, because everybody comes to this from a slightly different background. I’ll tell you the backgrounds of the people who actually did the ratings in this first wave.

May we go on? This next step was then to actually do the ratings, and the way it was done was that Consumer Reports WebWatch staff did the first set on transparency. Our volunteers did the second set on editorial adequacy. We then integrated the ratings and, from that integration, we produced a final overall rating and then displayed that on the Web site.

The ratings Web site has as its purpose, obviously, to communicate the rating results to the public, and we tried to come up with a terrific design for that. Consumer WebWatch, with its background in publishing, was the driving force behind that. You can judge for yourself the result.

But, basically, it was fairly straightforward. We wanted a way for people to have an easy way to find the ratings, understand the ratings and, hopefully, use the ratings in selecting sites. And, finally, to provide us feedback about how that site was useful to them, what other sites we could rate, how we could improve the process, what else they’d like to know and so on and so forth. The process of feedback is absolutely critical to our success in being able to convince people that these ratings have utility and to find out from them the extent to which they do.

Now, I expect you want to know what the results were. What we’ll do is go through the sites we rated, what the results were in overview. We’ll give you some idea of the ratings Web site and then go on to some conclusions about that. We invite you very much to look at the ratings on Healthratings.org, which is the site that we set up for this purpose.

As for the types of raters that we used, it turned out we used 14 volunteer raters to rate these 20 Web sites. They were assembled into three-rater panels, so that we mixed and matched their expertise. You have to remember that they were all credentialed, they all met the minimum criteria and that was, to remind you: at least five years of relevant experience in fields that related to rating Web sites. It happened that two of them were health practitioners; five of them were health information experts, including, for example, experienced health librarians, information archivists and so on; two of them were health education specialists; and five were in other related fields, including people who designed health Web sites, health journalists and so on.

We came up with ultimately a final set of ratings that conform to the Consumer Reports type of rating, where you get a little red dot meaning excellent and you get a black dot if you’re poor. And you’ll see those on the Web site.

The final set of attributes that we came up with were identity, advertising and sponsorship, ease of use, corrections and currency, privacy, design, coverage, accessibility, contents and then an overall rating.

Now, some of these things can easily be guessed and evaluated by consumers. A consumer can tell you how easy it is to use a Web site, and here we’re giving people information of a predictive type. If you go to this site, you’ll find it easy to use.

Now, when we get down to some of these things, like the contents, it’s very difficult for consumers to understand and evaluate the editorial policies and procedures that some of these sites use. So here we’re giving them information that they normally would not be able to discern themselves.

If we go on to the next slide. These are, in fact, in order of the frequency of visit, the top 20 sites. This does not say that this is the best site, this is just the order of use, the number of hits, according to Nielsen. We then assigned these to panels of people to rate and we came up with some ratings.

This is the result in all of the 20 sites taken together. What we decided to do was to show you the median attribute. That is, if you take all the ratings and line them up from the best to the worst, what’s the one in the middle look like? The one in the middle looks like this. It has an excellent identity, so therefore, you know, it tells you who owns the site, who pays for it and so on and so forth. They all did a great job of that.
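As a rough illustration of the median just described, here is a minimal Python sketch. It assumes the five-point scale used for the ratings (Excellent down to Poor); the function name, the sample ratings and the choice of the lower of the two middle values for an even count are illustrative assumptions, not part of the published method.

```python
# Ordinal rating scale, ordered from worst to best.
SCALE = ["Poor", "Fair", "Good", "Very Good", "Excellent"]

def median_rating(ratings):
    """Return the middle rating once the ratings are lined up in order.

    Assumption: with an even number of ratings, the lower of the two
    middle ratings is returned (a conservative tie-break).
    """
    ranks = sorted(SCALE.index(r) for r in ratings)
    return SCALE[ranks[(len(ranks) - 1) // 2]]

# Hypothetical ratings of one attribute across five sites.
identity_ratings = ["Excellent", "Excellent", "Very Good", "Excellent", "Good"]
print(median_rating(identity_ratings))
```

Because the scale is ordinal rather than numeric, the sketch picks an actual rating from the list instead of averaging the two middle ranks.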

The second one was advertising and sponsorship. Did a good job of that, too. You could tell what content was advertising and the content that was not advertising.

Ease of use was a surprising one and also with design, they weren’t as good as you might think they would be. Those results were rather interesting to us, because we thought they might be better rated at that than they were.

One area which was found to be somewhat deficient was corrections and currency. Few of the sites had an obvious place where you could go to say, “Whoops, we made a mistake and we should have told you this, but we told you that.”

In terms of the four areas that have to do with the editorial adequacy, design was good. Coverage, by which we mean, of the things that the site said it covered, how well did it cover them, was very good.

Accessibility covers three things. The first is navigation ability – when you go on the site, can you move around and find things? Search engine, site map and so on. The second is reading level, which is absolutely critical, because you have to be able to comprehend what you’re looking at in order to gain anything from it. And the third was accessibility to people with special needs. That would include, for example, foreign languages, the visually impaired and so on.

And then, finally, the contents, which dealt with the editorial policies, such as how you found things, what topics you selected, how it ensured the quality and so on.

These are the distributions of the top 20. As you’ll see, six of them were rated Excellent overall, which is 30% of the total. Five were Very Good. Eight were Good, one was Fair and none of them were Poor.
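The shares behind that distribution can be checked with a short Python sketch. The counts are the ones just stated; the function and variable names are illustrative assumptions.

```python
# Overall ratings of the 20 sites, as stated in the talk.
overall_counts = {
    "Excellent": 6,
    "Very Good": 5,
    "Good": 8,
    "Fair": 1,
    "Poor": 0,
}

def rating_shares(counts):
    """Return each rating's share of the total, as a percentage."""
    total = sum(counts.values())
    return {label: 100 * n / total for label, n in counts.items()}

for label, pct in rating_shares(overall_counts).items():
    print(f"{label}: {pct:.0f}%")
```

The counts sum to 20, and the Excellent share works out to the 30% quoted.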

Now, you have to remember that these were the top 20 most-trafficked sites. Now, these results are not generalizable necessarily to every Web site of a health type that’s out there. We did, by way of comparison, do some ratings of other sites and we did in fact find some that were Poor. But these were for our own quality control purposes.

These are, in fact, the five – rather, six of the top 20 sites that were rated excellent. And you’ll notice there’s a mixture in here of commercial sites and non-commercial sites and the two non-commercial sites are Kids Health, which is a non-profit, and the NIH [National Institutes of Health], which is a government-run site.

Interestingly enough, the only site really that was compliant with the special needs is the NIH site, because all government sites must be compliant by law. But the other sites were not compliant with those special needs.

This is an example of the ratings Web page. It is a beta site and by beta, we mean that it is a site that we have put up for feedback and comment in terms of its design and usability for consumers. The actual information displayed here is correct and current. But we would really like to get feedback about how this works. This is obviously a static picture, it’s a pro forma. When you go to the site, it’s much more dynamic, and we encourage you to do that and to give us feedback about how well you think we’ve succeeded in providing a site that communicates information about the ratings that we have made as to the utility of Web sites to consumers.

There are some conclusions we can draw from having done these 20 sites. The first one, I think, may be obvious to everyone, but does bear mentioning that it’s very costly, very complicated and very challenging to provide consumers with high quality, reliable health information. This is not an easy job.

The second thing, and a corollary of this, is it’s a very difficult thing to evaluate these sites, but we think it’s necessary to provide consumers with this predictability and with this independent assessment of the sites’ utility to their particular needs.

Six of the sites were rated Excellent overall, and ultimately all of these sites were limited by the state of medical science. No site can go beyond that. All they can do is try to do the best job to provide information that gets us as close to the state of medical science as possible.

In the next slide, there are some general points we found where some improvement could be made. One is that the editorial policies were often not described or poorly described and, in some cases, we suspect, needed improvement. So this is one area where some improvement could be made, and the topic areas where we thought improvement was necessary were the search for [grading] information, the naming of authors and reviewers, and their credentials – did their credentials have to be relevant to what they were writing about?

They should state in the pieces or the articles when they were last reviewed and updated, and that didn’t occur all the time. There should be references for the basic facts. Where did these facts come from? There should be links to citations or at least the names of citations.

And then, finally, there should be criteria for linking to other Web sites. If somebody goes and says, “Well, here’s a resource,” what was the criterion you were using to say that this was a good thing for you to do, to go look for more information on this site?

We think that there certainly can be some improvement in the design and ease of use. That was one of the surprising things to us, is how poorly the sites rated on these issues. Clearly, there needs to be more attention paid to accessibility. There was only one of these sites that had a foreign language, a Spanish version of it. Others were all in English only.

The second thing we noticed was that the reading level was at a higher level than most consumers could really read. Many independent studies beyond ours have shown that most of the health information on the Web is written at a college level. Now, that may be appropriate for the people who go in and look for things, but ultimately, as the Web gets more and more exposure and more and more people get on, we’ll have to be concerned with how well people can read it and understand what they read.

Finally, in terms of the feedback, we’d like to get your feedback on the utility of these ratings, how we can improve the process, what should be rated and also we’d like to get people involved in rating Web sites who have an interest in doing that.

That’s an overview of what we found. We’re very delighted to be able to announce the ratings of the first 20 Web sites and we encourage you to visit Healthratings.org. And I’d also like to answer any questions you may have.

Yes, please state your name and your affiliation.

Speaker: [unintelligible] Design. Can you please explain once again the difference between accessibility and ease of use?

Peter Goldschmidt: Yeah. Ease of use, as measured here, was what the Consumer Reports WebWatch raters found in terms of structural measures, such as search engine, site map and so on. What our people were concerned about was the navigability of finding information that they had on a site, the accessibility in terms of reading level and, thirdly, the accessibility to people with special needs.

As you may know, the federal government has a set of standards, 508 standards, and there are certain ways in which you can become compliant with them; only the NIH site was compliant. It’s very important for people who are visually impaired. For example, a number of the sites did have a choice of typeface, so you could make it bigger, easier to read.

And the other one that’s obvious is also the foreign language, that some people looking for information who don’t speak English very well are going to have a hard time on these sites, because they’re mostly geared to people who have a college education level. And one of the sites, Medscape, is actually geared to health professionals. So you have to be very good at being able to read those articles if you’re going to get the full information from them.

Tom Eng: Thanks, Peter, very good job. I’m Tom Eng with Helia and eHealth Institute. About six years ago, I was involved in a project, as you know, with the Science Panel on Interactive Communication and Health, so I’m delighted to see that a lot of the template factors and disclosure statements are actually reflected in this instrument. Two questions and then just one comment.

First question is do you know what percentage of the market for health information those top 20 sites represent, number one. And two is, I might have missed it, but did you actually have multiple people review the Web sites and, if so, did you look at the inter-rater reliability, coefficients and all that kind of thing.

And then the third comment is just that I just went on Healthratings.org, and I noticed that it was not actually W3C compliant [unintelligible], so I would suggest you do that.

Peter Goldschmidt: Okay. Beau, do you know the answer to the first question, what percentage these 20 sites represent in terms of hits?

Beau Brendler: No. We can find out.

Peter Goldschmidt: Right. In terms of the second one, yes, there were multiple raters of these two sites. Consumer Reports WebWatch had two people review each site and what they did was then come to a consensus judgment. And what we did was to have three raters review each site independently. They didn’t talk among themselves, but what we did do was to look at the different scores that they had for each of the attributes. And the reliability is quite good. We haven’t yet gotten to the point of calculating that statistically, but that is a guide to how to improve the instruments. Because you want people, when they rate things, to come up with fairly consistent ratings among themselves.
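One simple way to put a number on that consistency is pairwise percent agreement among the three raters of a site. The Python sketch below is an illustration only: the scores are hypothetical, the function name is an assumption, and percent agreement is a cruder measure than a chance-corrected reliability coefficient. It is not a calculation the speakers report having done.

```python
from itertools import combinations

def pairwise_agreement(scores_by_rater):
    """Fraction of attribute-level comparisons on which two raters agree.

    scores_by_rater: one list of ratings per rater, all the same length
    and aligned so that position i is the same attribute for every rater.
    """
    pairs = list(combinations(scores_by_rater, 2))
    matches = sum(a == b for r1, r2 in pairs for a, b in zip(r1, r2))
    comparisons = sum(len(r1) for r1, _ in pairs)
    return matches / comparisons

# Hypothetical ratings by three raters for one site's four editorial
# attributes (design, coverage, accessibility, contents).
rater_1 = ["Good", "Very Good", "Excellent", "Good"]
rater_2 = ["Good", "Very Good", "Very Good", "Good"]
rater_3 = ["Good", "Good", "Excellent", "Good"]

print(round(pairwise_agreement([rater_1, rater_2, rater_3]), 2))
```

Attributes on which the raters often disagree would be the natural places to tighten the instrument's wording.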

Patricia Thomas: Pat Thomas, University of Georgia. You know, Peter, this is a huge undertaking and a lot of this is really impressive, but I am troubled by one thing and that is, if I understand you correctly, you seem to be saying you can’t evaluate the quality, ultimately, the quality of the information provided, because we’re not at the end time yet and medical research goes on and is always changing.

But, from a consumer’s point of view, they have to make real life decisions based on imperfect information every day and that’s why they’re looking at these Web sites. So it feels like in a way to me that you’re sort of dodging the issue on the quality thing and that bothers me. Why did you do that?

Peter Goldschmidt: Well, I could tell you why we did it. If anybody here would like to make a billion dollar down-payment, we’d undertake the job. It is a tremendous job to be able to evaluate the quality of evidence for medical practice. We’ve done this for professionals, we’re now looking at the issue from the point of view of consumers. And the difficulty is – not that we’re dodging the bullet, we’d love to do the job – we need the tools with which to do it, the money.

I do agree with you that ultimately consumers want to know if I’m recommended to have this treatment for breast cancer, is it the best for me? That’s a question that today is very difficult to answer. What we can do at the fringes is to identify, what shall we say, interventions that are clearly beyond what medical science can support. People having coffee enemas, we don’t think that’s going to do them much good. But if you tell me, is the current recommended treatment by oncologists the best treatment for my condition? That’s something that oncologists will have a problem with, because they will argue among themselves as to whether that’s necessarily true.

Patricia Thomas: Well, Peter, just a footnote, I never meant to suggest that any Web site should ever prescribe or diagnose an individual, that would be in Steve Barrett’s [of Quackwatch.com] realm of quackery. But I do feel that presenting consumers with comparisons and relative merits of different courses of treatment is worth trying, even if we don’t have that ultimate outcome study that of course you would like to do and we would all like the answers to.

Peter Goldschmidt: Well, I think it is worth trying and the question becomes how difficult and how costly it is to actually undertake to do that. What we have said is the best we can do with our resources is to look at how the Web sites approach that problem. What kind of search they do for the information, how they grade it. What kind of experts they have write the articles, what kind of experts review the articles and how they assure the quality of what they post on their Web site.

We cannot, because of our limited resources, judge the outcome of that process and say, yep, they’ve got it right or, nope, they didn’t get it right.

Pravin Patel: Pravin Patel, with Patel Communications. Two questions I have. Number one, what is the accuracy of this information – we are talking about the quality and accessibilities and so forth. But I think as the consumer, accuracy is more important, or at least as important, to me, because as we all know, error rate is one of the killer factors for many, many deaths and so forth.

Second question is, would we have access to the slide presentations?

Peter Goldschmidt: Yes, I’d be delighted to answer the last question first, because it’s easy, I’ll say, yes, you may have a copy of this and give me your email address, I’ll email it to you. We’d like to make things available, we’ll put it up on our Web site.

Going back to the first question, which is ultimately a very important question, again, just to reiterate, we did not look at the content. If you look at the contents of a site like WebMD, it is enormous. Or the contents of NIH, there are millions of pages in there. It’s very hard to say, well, all of the information or some of the information is true and this bit isn’t true.

There have been other studies looking at the accuracy of information on Web sites. And, in summary, what they say is that some of it’s accurate and some of it isn’t accurate. And that’s true in the print media, it’s true in medical textbooks, it’s true of medical science in general. That’s why you always come to the same conclusion. It is extremely difficult to actually undertake that task. It’s one that I believe should be undertaken, but it will require a tremendous amount of money to do that.

Steve?

Stephen Barrett: Yeah, Steve Barrett from Quackwatch. I was one of the raters, I rated six sites. And I looked at the content, the quality of content of each site. This was not reflected in the ratings.

I think there’s something in-between, though, and that is that – Pat brought up the idea that, can we help consumers know whether a site has reliable content? Your response was, well, that gets into all of medicine. The issue is not whether the site recommends the best treatment. It really ought to be, does the site reflect the best medical knowledge of today? And I think there are techniques that can be used. It’s very easy to screen out sites that don’t. I’ll talk about how to do that this afternoon. But how good is the quality of a good site?

I want to just make a comment about one of the sites and that was Healthboards. I don’t understand the rating. I don’t know what – I haven’t seen the ratings of the lower ones. Healthboards is a site where anybody can go on. If you’re a health professional, they don’t want you, they want consumers to write in and tell about their experiences and ideas. I can’t think of a more unreliable site, that’s absolutely worthless.

I don’t know how this – I haven’t seen how you applied this system. But this does not – you talk about these sites being excellent, but you don’t get into the – somebody might think you’re talking about quality of content and it doesn’t necessarily correspond. I imagine this was the fair one?

Peter Goldschmidt: Well, if you look at the site, there are – actually, is this one–?

Speaker: No, that’s WebMD.

Peter Goldschmidt: Okay. But if we choose, if we’re online, and we choose the Healthboards, you’ll see what we say about it. The Healthboards is basically a site, as Dr. Barrett said, where consumers can go online and share experiences. So, if it has any value, it is in the shared experience. Information on this site is highly unreliable and, if we look at the weaknesses – can you click on weaknesses? “It has no authoritative information, anyone can post any claim. There are some dead links and no credentialing” and so on.

So we actually say that here, that it has –

Stephen Barrett: How can you rate – did you rate it fair?

Peter Goldschmidt: Well, what was the overall rating on that? Yeah, it was rated with an open circle, which is good, but if you look at where it says “contents,” it was not applicable and we say why.

Stephen Barrett: It’s not a good site, it’s absolutely worthless.

Peter Goldschmidt: Well, maybe–

Stephen Barrett: They have people there who are selling things. They don’t want professional input. If they catch you being professional, they’ll throw you off. There couldn’t be a site set up that’s – the only thing that could be worse is a site that’s run by a quack who wants to sell you something. How the hell can it get a good rating? I love what you’re doing, but something’s wrong.

Peter Goldschmidt: Well, in terms of the actual content, we say it hasn’t got any. But the reason it had a Good rating is it complied with the other criteria. And, secondly, it has some value to people who want to exchange information among themselves.

Okay, let me ask the lady behind you.

Bonnie Lawrence: Well, this is fairly simple. I’m Bonnie Lawrence, with Family Caregiver Alliance and the National Center on Caregiving, and we’re continually discussing the level of the information we put on our Web site. What is now the consensus of the literacy level of acceptable – I mean, we even talked about having two versions, a very simplified one and a more detailed one, but you lose some of the precision when [unintelligible] a very simplified language.

Peter Goldschmidt: Well, that’s correct, it is a tradeoff. And I suppose the people on usability would say that the ideal system is one where you’re automatically gauging the level of the comprehension of the reader and providing information in a form that that individual can comprehend. And that’s a very difficult thing to do, but the NIH Web site, for example, is written at a lower level than –

Bonnie Lawrence: At what level, is there a grade level?

Peter Goldschmidt: I don’t know what their target is, but I think at least an 8th grade level is probably what the ideal would be. But that then is irritating to people who can read at a college level. Okay.

George Linzer: George Linzer, I run the patient Web site LabTests Online, and I was a rater as well.

Before I get to the point that I originally raised my hand for, I just want to address Stephen’s comment about the reliability of the content. There were some criteria that we looked at that were really looking at, essentially, the evidence base from which the content is derived. It wasn’t assessing the content itself, but looking at whether the Web site disclosed information relating to the evidence on which it was based. And I think – maybe it’s an issue of weighting the overall rating to reflect that a little bit more.

The other point I wanted to make, and this goes back I think to Pat’s original comment, without having looked at the site, I’m just wondering if you disclose on there – I’m sure you disclose what the rating does. Do you disclose what it doesn’t do, which is Stephen’s point, that you’re not looking at the actual information, you’re not assessing that? Because most consumers, I think, are going to look at any kind of rating system like this, or a Good Housekeeping seal, and assume that the stamp of approval is on the content itself, not on the structure behind the content.

Peter Goldschmidt: Well, we do on the Web site disclose the methods and describe in that sense what it does not do. Also, in a rating scheme like this, a rating of Good isn’t good. I mean, you’ve got to really be Excellent on that scale.

Nicole Spelhaug: Nicole Spelhaug, MayoClinic.com. I love this. This is great. And my question is, how much media play is it going to get? Or, in addition, to the Web sites?

Peter Goldschmidt: Well, I can’t predict that, but I think we would love the media to look at this and decide whether it’s something consumers ought to know about and encourage consumers to use.

Nicole Spelhaug: Well, this rating is exactly what I was speaking with some colleagues about last night, in terms of being able to go and identify the weaknesses and the strengths of a particular Web site. There isn’t one that’s perfect in all regards, but knowing some strengths and weaknesses is very useful.

Peter Goldschmidt: Yeah. Well, if you look at the – as you know, MayoClinic was one of the sites we rated Excellent. If we look at the MayoClinic rating, let’s see, we can see what we said about that. This – Okay, we said it was “easy to navigate and use, no scrolling.” Great, okay, good. “In-depth articles with reliable health information.” The reason we said they were reliable was that they had some editorial policies, they described what they did and we thought that they were somewhat adequate. “Had relevant links to helpful resources, both within and outside the articles.” And that they were “clearly written and provide updated information.”

So those were real strengths, now let’s look at the weaknesses. “Some articles were very long.” Okay, consumers want information that they can get quickly, if you have to read a 20-page article to find it, it’s really hard. Okay. “Doesn’t name the authors of the articles.” We thought that was a real weakness. It says, Trust us, our authors are great. We don’t like that.

And “lacks a text-only option.” We felt it would be more accessible to people if there was such an option. So those are what we said. There you go. And it has the HON code. Offers various tools, such as calculators, quizzes and recipes.

So, even the very excellent – sites rated Excellent overall had things that were weaknesses and where they were, we listed them. Okay.

Ruth Amernick: I’m Ruth Amernick from San Francisco Public Library, I’m the health and cookbook librarian. You had mentioned before about peer review. There are different types of journals that are peer-reviewed, like The Lancet and the Journal of the American Medical Association, JAMA. But there are also a lot of sites like the Andrew Weil site. He is a doctor, but it’s alternative medicine. And my concern is that there isn’t more about alternative medicine, Chinese medicine, other kinds of things, because in talking about what’s available and what’s accessible and getting the most knowledge we have now, people who look at the Web sites are looking for something that will immediately help with the health problem or whatever. It seems like maybe that’s an area that could be expanded a little bit.

Peter Goldschmidt: Well, that’s a very good point. We know that 80% of people who go online look for health information and that two-thirds of those look for information about a specific problem. And it’s true that there are many specialty sites, like Weil’s site, and we haven’t gotten to rating those yet. That may be useful to do, if you’re suggesting that that’s one of the directions in which we could go.

Barbara Coll: I’m Barbara Coll. I want to talk to the gentleman here who thought Healthboards here was – didn’t have any good content on it. It’s one of the things—

Stephen Barrett: Worse than that, it has bad content.

Barbara Coll: Right.

Stephen Barrett: If you go there seeking advice, you’re likely to get as much bad advice as good advice.

Barbara Coll: Right, but what we have to get used to in all communities in the Web world is the idea of community-based Web sites and community-based ratings, and the blog and the fresh content. So as we get more involved in this, the standards will come out – or maybe they won’t, because really what’s happening is you’ll get more and more information from your peers and more and more information from your local community. They’re going to be interactive. We’re going to get used to, as Web users, screening and filtering our own information. And I think things like Healthboards, while we’ve been doing this – well, some of us know community’s been around for a while online, but really, it’s hitting mainstream just this year.

So you are going to see articles that are rated by peers. And you’re going to see articles that have this kind of awful feedback. But it’ll get better over time.

Peter Goldschmidt: Well, that’s an excellent point. We have the People’s Choice Awards, people have opinions, they share. We know that people choose doctors based on what their friends say. That may not be the most sensible way to do it, but it’s reality. So we’re just providing another source of information that people can have accessible to them to make a choice and that’s our goal.

Beau Brendler: I wanted to just note – this is the statement on our site that Peter was talking about, which says we are not assessing sites to the degree of being an arbiter of medical truth. And my response, obviously, you can count me as biased, because we did this together, but I’ve gotten a lot of bad advice from doctors.

Peter Goldschmidt: Okay.

Bill Kelm: My name is Bill Kelm, I’m speaking as a consumer. My wife gets migraines and she looked at two sites: Berkeleywellnessletter.com, because she’s gotten the letter in print for many, many years, and also a Web site called Health A to Z, which had something on it called the URAC accreditation. And I wondered if you knew anything about that and whether you could comment on both Web sites.

Peter Goldschmidt: Well, I can’t comment on the Web sites, because I haven’t examined them myself and we didn’t examine them in this part of the study. But URAC, as it’s properly known, is the Utilization Review Accreditation Commission, and they started out life, as their name suggests, assessing utilization review companies. And some years ago, they went into a situation where they wanted to help consumers rate Web sites. And, essentially, you pay them a fee and they come out and assess the extent to which you meet their standards and, if you pass, they give you that seal.

Speaker: I just wanted to go back to the comment about the communities, the rise of communities and the ability of people in those communities to screen out unwanted or not credible information. There’s a lot of buzz about blogs, etc., in the marketing world. And it is only a matter of time before the marketers learn how to market through those communities, as well, so that’s going to raise this whole discussion to a whole new level. I don’t know that we can rely on communities always to ultimately give us the answer that we’re looking for.

Kristen Gerencher: Hi, I’m Kristen Gerencher, I’m a healthcare reporter with MarketWatch. I’m very interested to learn, first of all, where people go to get their news and to get their health information.

And, secondly, I wonder, given the last year alone, you know, Vioxx, the flu shot shortage – how many of these sites are competing on currency? It’s sort of like a media organization, in a way. And I wonder what kind of check and balance there is on providing conflict-of-interest information. Pfizer could be a great site for all it sets out to do, but do I want to go there to find out what the consensus now is about pain management?

Peter Goldschmidt: Well, if you look at the Pfizer site, again, this site actually came out with a Poor rating. If you look at the strengths and weaknesses here, the strength was, if you want to find out product information about Pfizer’s products, here’s the place to go, they’ll tell you what the information is.

Well, it was difficult to determine what the objective information was. And so that’s maybe an obvious statement, because everybody knows it’s an advertising site for Pfizer products.

Kristen Gerencher: Right, I’m wondering [unintelligible] stands out. If some of these sites that are advancing different agendas, you know, [unintelligible]. But if they are indeed trying to be first [unintelligible] so slow and so prone to [unintelligible], I just wonder how do you [unintelligible]?

Peter Goldschmidt: Well, that’s a good question. I think most of these sites have news sections, but I’m not sure that they view themselves as news vendors.

Kristen Gerencher: Right, they use the sort of, come to us and [unintelligible] your trusted source on [unintelligible].

Peter Goldschmidt: Correct. Yeah, I mean, you can differentiate a topic that’s current versus one that’s important, for example.

John Hopkins: John Hopkins, with WebMD. I just wanted to add a quick comment about URAC, specifically, and about readability of content. The new URAC health Web site standards have just been released for public comment, and one of their recommendations is that any holder of the seal actually use the OMB’s plain language to help users understand the content. Obviously, we don’t want to lose medical relevance, but it is important to make it readable and usable by what we sometimes refer to as the lowest common denominator.

Peter Goldschmidt: But, again, I think it would be an error to make every site just be a sixth-grade reading level, because that really disenfranchises people who are able to read at a very advanced level. So I think the real trick is finding out how to present information in an interactive way that’s appropriate for the person that’s on the site and not just have a billboard.

Janet McDonald: Janet McDonald, FDA. I had a comment on the MarketWatch comment. You might want to consider sometime, on a site like Pfizer, suggesting that people link to FDA.gov if they want to find out what the government’s position is on drugs that are withdrawn.

Peter Goldschmidt: That’s a very good point. But we also have to be careful not to get ourselves into a situation of endorsing or putting people in touch with contradictory views without having a policy by which we say this is our policy for linking to those other sites.

Beau Brendler: Just to add, in a couple of cases we have included in the ratings material (let me see if I can find the one or two that had it) a statement where we say we recommend an alternative source of health information. So we do say that for a couple of the sites.

Peter Goldschmidt: Yeah. Steve?

Stephen Barrett: When consumers look for information, what I advise them to do is to set up good anchors. Your anchor can be your family doctor or your personal physician. But in the media, I have two favorite anchors. I have a list of about 20, but my two favorites, and I’m going to give a plug here: Consumer Reports on Health is in a class by itself when it comes to publications. It’s probably the best anchor there is. On the Web, the best anchor to learn about general disease is Merck Manual Home Edition. Those would be my starting points for seeking information about a topic. Using search engines to find information is very problematic.

Peter Goldschmidt: Anyone else who has a question? Chuck?

Chuck Bell: Yeah, Chuck Bell from Consumers Union, Consumer WebWatch. I just wanted to underscore that many of the criteria we use are very disclosure-oriented and that’s intentional, that’s on purpose. I was thinking, in particular, on the privacy one, we’re really looking for sites to tell users what their privacy practices are. And somebody who is an intense privacy watchdog might take issue with some of the scores in this area, because they might not like the content of what the site is disclosing.

We’re looking at a somewhat different issue and I think that, in our defense, we have to be able to walk before we can run in this area. We have to look at things that we can actually measure and evaluate and rate in a definitive way. And the disclosure, the core disclosures, some people say, wait, this is way too basic. But I think it is a strategy that we can build from. There’s nothing that excludes us from doing additional types of research projects, investigative journalism and expanding these ratings to include many more types of criteria. But we have to start and have a base and we do think this is a very important place to start.

Peter Goldschmidt: Well, thank you for that clarification.

Chuck Bell: Also, these sites change all the time. Just as, in the beginning, we were talking about how Yahoo! used to have things in red and now it has gone to gray, with the ratings today, in three months or six months, a lot of these sites might change completely. So that’s always an issue with the Internet, that things are ongoing and things change a lot. And maybe the content for certain things is going to be there and maybe it won’t be.

Peter Goldschmidt: Well, that’s right, the dynamism of the Internet is its strength, that you can change things electronically very quickly, you can get them out very quickly. And so our response to that is to re-rate everything every six months. And what we hope is that some of the sites that were not as highly rated as others might look at some of the criteria we have and improve their sites over time.

And, again, just to reinforce Chuck’s point, what we’re really looking for to start with is disclosure – how do you do things? And then we can make a judgment of whether that’s adequate or not. And we have started to do that with the editorial policies and, as time goes on, we’ll become much more sophisticated in what we can rate.

All right, anybody else have a question or comment? Well, we encourage you to come to the session this afternoon on the quality of health Web sites, because we’re going to get into some more interesting issues with more speakers.