So I realize the title of this blog post is a bit "extreme," but
it's designed to illustrate a question we're frequently asked with
regard to how we analyze the information that comes out of an online
research community. Many companies want to know whether some level of
automated text analysis can identify the high-level themes and
connections that emerge in research communities. It's understandable
why we're asked this question. Once companies realize how much
qualitative feedback emerges from an ongoing research community
(especially communities with 300+ members), they question how anyone
can stay on top of all the activity.

While there are a few benefits to automated text analysis, it's
something that we generally approach with caution. Call me "old
fashioned," but I still believe that nothing replaces a team of trained
community managers and qualitative researchers poring over transcripts
and user-generated content to highlight themes and generate
recommendations for the company sponsoring the community. Automated
text analysis can have a role, but its output should not be taken at
face value. Here's why...

Benefits of Automated Text Analysis in Online Research Communities
- Identifies themes that researchers might miss - While a
community manager or moderator is likely to have a good sense of the
high-level themes by reading carefully through each research activity
and following up with probes, there might be underlying themes they
miss. Text analysis tools might help identify some of these underlying
themes.
- Saves time - In a large research community ("large" here
meaning 300+ members), automated text analysis
might help researchers quickly identify themes for later exploration in
targeted activities. It may save them hours of going through every
discussion looking for commonalities among segments.

Drawbacks of Automated Text Analysis in Online Research Communities
- Output is often too "high-level" to be useful - Let's say
you're running a project in your research community on the impact of
the economy on spending decisions (specifically with regard to the
sponsor company's products). The text analysis tools I've seen/used
might output results like "low cost," "saving money," "nervous,"
"recession," and "spending less," along with how frequently each was
mentioned. That's fine, but as a researcher I need to know
much more before I can make this into "actionable" information for my
client. For example, why were these words mentioned? How did this
break down according to respondents' backgrounds? Were there
commonalities among segments? This information doesn't help me answer
whether or not the economy is likely to impact purchase decisions for
my client's product, which in this case is the fundamental question to
be addressed.
- Lack of context - As I started to allude to in the
point above, there is no real context for these results. For example,
if someone notes they are "spending less," is that specifically a
result of the economic environment or a function of their general
situation (e.g., they just bought a house, are saving for a big
purchase, etc.)? Are there any commonalities that might be relevant
for those who said they are "nervous"? A community manager/researcher
would know from working closely with participants where these
comments/themes might be coming from. An automated text analysis tool
might not be able to accomplish this.
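
To make the two drawbacks above a bit more concrete, here is a minimal Python sketch of the kind of phrase-frequency output I'm describing, next to the same counts split by member segment. Everything in it is an assumption for illustration: the posts, the segment labels, and the function names are made up, and real tools are certainly more sophisticated than a substring match.

```python
from collections import Counter, defaultdict

# Phrases taken from the example above; the matching approach is an assumption.
PHRASES = ["low cost", "saving money", "nervous", "recession", "spending less"]

# Made-up community posts as (member segment, text). Purely illustrative data.
POSTS = [
    ("new homeowner", "We are spending less this year because we just bought a house."),
    ("new homeowner", "Spending less on extras until the move is paid off."),
    ("retiree", "The recession makes me nervous, so I look for low cost options."),
    ("retiree", "I am nervous about my savings and spending less as a result."),
    ("student", "Saving money matters to me, recession or not."),
]


def phrase_counts(posts):
    """Count how many posts mention each phrase (case-insensitive substring match)."""
    counts = Counter()
    for _segment, text in posts:
        lowered = text.lower()
        for phrase in PHRASES:
            if phrase in lowered:
                counts[phrase] += 1
    return counts


def phrase_counts_by_segment(posts):
    """Same counts, but split by the segment of the member who wrote the post."""
    by_segment = defaultdict(Counter)
    for segment, text in posts:
        lowered = text.lower()
        for phrase in PHRASES:
            if phrase in lowered:
                by_segment[phrase][segment] += 1
    return by_segment


if __name__ == "__main__":
    print(phrase_counts(POSTS).most_common())
    for segment, count in phrase_counts_by_segment(POSTS)["spending less"].items():
        print(segment, count)
```

The first function gives the "high-level" list with frequencies. The second at least shows which segments a phrase is coming from, but notice that neither one tells you why a new homeowner is "spending less." That part still takes someone reading the posts.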

Recommendations
- Use folksonomy and social tagging to your advantage - One
way around the need for some aspects of automated text analysis is to
employ the concept of "folksonomy" in a community, whereby moderators
and community members tag content as it is added to the site. This
helps identify general areas of interest, shared interests/hobbies,
etc. Though not as activity- and response-specific as text analysis,
it can still be a valuable tool to identify what is important to
members of a research community.
- Maintain a good manager-to-member ratio - A single
moderator trying to manage a 300-person community is going to miss a
lot of the activity that happens, even if they have tools available to
them to analyze the results. It's important to consider the ratio of
community managers/researchers to members (which I'll cover more in a
future blog post).
- Explore options for internet-wide text analysis instead -
I've seen some pretty interesting tools out there (like Umbria - now
part of J.D. Power) that analyze content from around the internet.
Depending on your research objectives, that might yield more
interesting information. Then again, it might not reflect your target
audience specifically...
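
For the tagging recommendation in particular, the tallying side is simple enough to sketch. This is a rough Python illustration, assuming tags are stored as free-text labels on each piece of content. The items, tags, and function names are all hypothetical, and a real community platform would have its own way of exposing this.

```python
from collections import Counter

# Hypothetical tagged content from a community site. Invented for illustration.
TAGGED_ITEMS = [
    {"title": "Grocery budget thread", "tags": ["Budgeting", "coupons"]},
    {"title": "Weekend projects", "tags": ["DIY", "budgeting"]},
    {"title": "Holiday shopping plans", "tags": ["gifts", "budgeting ", "Coupons"]},
]


def normalize(tag):
    """Members type tags inconsistently, so trim whitespace and lowercase them."""
    return tag.strip().lower()


def top_tags(items, n=3):
    """Return the n most common tags, counting each tag once per item."""
    counts = Counter()
    for item in items:
        # A set keeps a tag that is repeated on one item from being counted twice.
        counts.update({normalize(tag) for tag in item["tags"]})
    return counts.most_common(n)


if __name__ == "__main__":
    print(top_tags(TAGGED_ITEMS, 2))
```

The normalization step matters more than it looks: because members choose their own tags, "Budgeting" and "budgeting " would otherwise be counted as different interests.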

What do you think?
Am I missing a benefit or drawback? What kind of experiences have
you had with automated text analysis (specifically with online
qualitative research methodologies)? I'll admit - maybe I haven't seen
automated analysis tools that are more advanced and able to address
some of the drawbacks I've mentioned. If so, any suggestions on tools
I should be looking at instead?