The Internet – a ‘seductive data set’ accessed at the touch of a button

On: September 28, 2010
About Anne Lukas
Before I went to Amsterdam to start the New Media Master, I studied Educational Design, Management & Media (BA) at the University of Twente (Enschede, Netherlands). During this time, I did an internship in Sydney, Australia, for an educational company. There I designed course material for academic institutions, but unfortunately only for print media. In my opinion, this medium is too limited to educate people nowadays because of its lack of interactivity. After this experience I wanted to become more engaged in e-learning. In the future, I hope to become an instructional designer who develops educational software, websites, etc.


The rapid expansion of Internet uptake throughout the world has created the potential for new social experiences and thus offers researchers new environments for social enquiry (Beddows, 2008). As early as 1999, Kaye and Johnson predicted that the World Wide Web and other new electronic technologies might soon become prime survey vehicles because of their convenient, verifiable, low-cost delivery and return systems as well as their easy access and feedback mechanisms. Indeed, more and more researchers now use the Internet for their research. The Internet has become especially interesting for social science researchers because of two main capabilities. The first is the ability to search and retrieve data from large data stores (Jones, 1999): the Internet and constructs like the World Wide Web quickly provide a huge amount of data that can be used for analysis. The second is the interactive communication capability of the Internet. E-mail, chat rooms, etc. are all forms of text-based communication with variations in time, distance, and audience (Jones, 1999). In effect, the Internet gives the research community the chance to interface with respondents in ways that may overcome some of the barriers imposed by conventional research approaches (Illingworth, 2001).

Birnbaum (2004) describes the advantages of Internet research over lab research with the undergraduate ‘subject pool’ as follows: “On the Web one can achieve large samples, making statistical tests very powerful and model fitting very clean. With clean data the ‘signal’ of systematic deviations can be easily distinguished from ‘noise.’ Second, Web studies permit generalization from college students to a wider variety of participants. Third, one can recruit specialized types of participants via the WWW that would be quite rare to find among students.” In conclusion, the Internet has gained the interest of researchers in academia and industry because of its exponential growth, its impact on traditional means of communication, its dynamic nature, and its potential for reaching large and diverse segments of the population. However, along with the benefits of this new technology come new experiences and lessons to be learned and shared by researchers (Kaye & Johnson, 1999).
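To give a feeling for the first point in the quote, here is a minimal sketch of how statistical power grows with sample size. It is my own illustration, not a calculation from Birnbaum (2004): it assumes a two-group comparison with a standardized effect size, uses a normal approximation, and ignores the negligible opposite tail. The function name `approx_power` and the example numbers are made up for this sketch.

```python
from math import sqrt
from statistics import NormalDist


def approx_power(effect_size, n_per_group, alpha=0.05):
    """Approximate power of a two-sided, two-sample z-test.

    effect_size is the standardized mean difference (assumed, not estimated);
    n_per_group is the number of participants in each of the two groups.
    """
    z = NormalDist()                      # standard normal distribution
    z_crit = z.inv_cdf(1 - alpha / 2)     # critical value, about 1.96 for alpha = 0.05
    shift = effect_size * sqrt(n_per_group / 2)
    return z.cdf(shift - z_crit)


if __name__ == "__main__":
    # A small effect in a lab-sized sample versus a Web-sized sample.
    for n in (50, 1000):
        print(n, round(approx_power(0.2, n), 3))
```

Under these assumptions, a small effect that would usually be missed with 50 participants per group is detected almost every time with 1000 per group.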

Having looked at the advantages of online research, we should also consider the potential problems and disadvantages of studies conducted via the Internet. For example, in lab research it is almost impossible for a participant to serve twice in an experiment and thereby reduce the effective degrees of freedom. In Internet research, however, the possibility of multiple submissions has received considerable attention (Birnbaum, 2004). The following table summarizes possible methods for dealing with the problem of multiple submissions (Birnbaum, 2004):

Method – Tactic

  • Instructions – Tell people to participate only once
  • Remove incentives – Rewards not available for those who participate more than once
  • Replace incentive – Provide alternative site for repeated play
  • Use identifiers – Common gateway interface (CGI) script allows only one submission; option: replace previous data or refusal to accept new
  • Use identifiers – Filter data to remove repeats
  • Use Internet protocol (IP), email address – Check for repeated IP addresses
  • Passwords – Allow participation by password only
  • Cookies – Check cookie for previous participation
  • CGI scripts – CGI checks for referring page and other features
  • Log file analysis – Can detect patterns of requests
  • Subsample follow up – Contact participants to verify ID
  • Check for identical data records – Filter identical or nearly identical records
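Two of the tactics in the table, using identifiers and checking for identical data records, can be sketched in a few lines of Python. This is only an illustration under my own assumptions: each submission is a dictionary with the hypothetical fields `email`, `ip` and `answers`, the e-mail address is used as identifier when it is given and the IP address otherwise, and only exactly identical answer sets count as repeats.

```python
def filter_repeats(records):
    """Keep the first submission per identifier and drop identical answer sets.

    records: list of dicts with the (hypothetical) keys "email", "ip", "answers".
    """
    seen_ids = set()
    seen_answers = set()
    kept = []
    for rec in records:
        email = (rec.get("email") or "").strip().lower()
        ip = rec.get("ip")
        # Prefer the e-mail address; fall back to the IP address if none is given.
        if email:
            ident = ("email", email)
        elif ip:
            ident = ("ip", ip)
        else:
            ident = None  # nothing to identify the sender by
        answers = tuple(rec["answers"])
        if (ident is not None and ident in seen_ids) or answers in seen_answers:
            continue  # treated as a repeated submission
        if ident is not None:
            seen_ids.add(ident)
        seen_answers.add(answers)
        kept.append(rec)
    return kept
```

Keep in mind that several people can share one IP address, and that on a short questionnaire two different people can give exactly the same answers, so a filter like this may also remove legitimate participants.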

Another threat to the internal validity of a between-subjects experiment arises when there are dropouts, i.e. people who begin a study and quit before completing it (Birnbaum, 2004). Unfortunately, Internet research shows larger dropout rates than lab studies. The reason is that other people are present in the lab, so a person needs to explain why he/she wants to leave early. In Internet research, by contrast, participants can simply click a button to quit the study without the social pressure or embarrassment they might feel in a lab (Birnbaum, 2004). Unfortunately, dropouts cannot be completely prevented in Internet research.
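Even if dropouts cannot be avoided, they can at least be monitored. The following sketch compares dropout rates between the conditions of a between-subjects experiment. It assumes that every started session is logged as a pair of condition label and completion flag; this log format and the function name `dropout_rates` are my own assumptions, not something prescribed by Birnbaum (2004).

```python
from collections import Counter


def dropout_rates(sessions):
    """Return the share of started sessions that were not completed, per condition.

    sessions: iterable of (condition, completed) pairs, e.g. ("A", True).
    """
    started = Counter()
    finished = Counter()
    for condition, completed in sessions:
        started[condition] += 1
        if completed:
            finished[condition] += 1
    return {c: 1 - finished[c] / started[c] for c in started}


if __name__ == "__main__":
    log = [("A", True), ("A", False), ("B", True), ("B", True)]
    print(dropout_rates(log))
```

A clearly higher rate in one condition than in another is a warning sign that the groups that finished the study may no longer be comparable.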

Besides these two disadvantages, there are further problems that come along with social media: security, privacy, intellectual property and credibility. I won’t discuss these problems in detail because there have already been many debates about the blurring of public and private experience (Beddows, 2008) and other issues. However, I want to point out that only a few researchers have actually tested the quality of data collected on the Internet empirically. Gosling et al. (2004) evaluated six main preconceptions that have been raised as likely limitations of Internet questionnaires. The following table lists the six preconceptions and the actual findings from their comparative analyses of traditional and Internet methods (Gosling et al., 2004).

Preconception – Finding

  1. Internet samples are not demographically diverse – Mixed. Internet samples are more diverse than traditional samples in many domains (e.g., gender), though they are not completely representative of the population.
  2. Internet samples are maladjusted, socially isolated, or depressed – Myth. Internet users do not differ from nonusers on markers of adjustment and depression.
  3. Internet data do not generalize across presentation formats – Myth. Internet findings replicated across two presentation formats of the Big Five Inventory.
  4. Internet participants are unmotivated – Myth. Internet methods provide means for motivating participants (e.g., feedback).
  5. Internet data are compromised by anonymity of participants – Fact. However, Internet researchers can take steps to eliminate repeat responders.
  6. Internet-based findings differ from those obtained with other methods – Myth? Evidence so far suggests that Internet-based findings are consistent with findings based on traditional methods (e.g., on self-esteem, personality), but more data are needed.

In conclusion, Gosling et al. (2004) stress that “Internet samples are certainly not representative or even random samples of the general population, but neither are traditional samples in psychology. In short, the data collected from Internet methods are not as flawed as is commonly believed.”

Having shown the advantages and disadvantages, I would like to present a few techniques and recommendations for using the Internet as a research tool. The easiest way to get started with Internet research is probably to build a survey or experiment with one of the free programs that create the webpage for you (Birnbaum, 2004). Examples of these programs are SurveyMonkey, SurveyWiz, FactorWiz, Free Online Surveys, QuestionPro and SurveyPirate. To build a useful survey or experiment, you should keep the following recommendations by Kaye and Johnson (1999) in mind:

Web Survey Design Considerations:

  • The survey should be as short as possible for quick completion and to minimize excessive scrolling.
  • Simple designs with sparse use of graphics save downloading time.
  • Drop-down boxes save space and clutter by avoiding repeating responses.
  • Instructions should be clearly stated.
  • Responding to questions should be easy and intuitive.
  • Pretests should be conducted to measure length of time and ease of completion.
  • A check of the survey using different browsers will uncover any browser-based design flaws.

Sampling:


  • To increase representativeness, define samples as subsets of Web users based on specific characteristics.
  • Solicit respondents by linking the survey from key online sites and by posting announcements to discussion-type groups that are likely to be used by the targeted population. Or, select a sampling frame from e-mail addresses posted on key Usenet newsgroups, listservs, and chat forums. E-mail the selected sample a request to complete the questionnaire along with an identification number and password required for accessing the Web-based survey.
  • The World Wide Web is truly worldwide, and individuals from any country can complete a questionnaire. Thus, clearly state the intended audience of respondents in the survey’s introduction.

Publicity:


  • Devise a method to systematically publicize the survey daily through various means. In other words, do not spam a few discussion groups while ignoring others. To reduce bias, create awareness from a wide variety of Internet outlets.
  • List the survey with as many of the major search engines as possible. Web sites such as ‘Submit It’ facilitate this process by sending listings to many search engines with just one entry. After listing the survey, try locating it by using different search strategies and terms.
  • When sending announcements about the survey, write the entire URL in the message. In Usenet newsgroup postings and in some e-mail transmissions, the URL becomes a clickable link.
  • Take care not to get flamed. Do not oversell the survey; just announce it.
  • Follow up confirmations of survey links to gauge how long the URL is posted and whether it is visible on the page. Remember that if it is difficult for the researchers to find the survey’s URL, then others probably will overlook it as well.
  • Asking respondents how they found out about the survey is an excellent way in which to gauge the most effective sites and discussion outlets.
  • Placing banner advertisements on selected sites might increase the number of completions.
  • Offer incentives for completing the survey. The incentives can be as simple as promising the respondents the results or as alluring as GVU’s lottery system (which rewards winning respondents with cash). The types of incentives offered clearly depend on the researchers’ budgets.
  • A combination of financial incentives, online and traditional advertising, and public relations and marketing efforts might be needed.

Data collection and responses:

  • Ask for respondents’ e-mail addresses to check for duplication. If the e-mail addresses are not given, then keep track of the senders’ Internet protocol addresses.
  • Surveys should be easy to return with just the click of a mouse button. A thank you or other type of verification page should come up on the sender’s screen on returning the survey so that the respondent is not left wondering whether the survey was indeed transmitted.
  • When the survey returns as an e-mail message, it should be designed so that it returns with each question listed on one line followed by its response and a corresponding numerical value. This makes it easy for researchers to eye the data and facilitates coding the surveys and entering them into a statistical software program.
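The last recommendation can be made concrete with a small parser. Kaye and Johnson (1999) do not prescribe an exact layout here, so this sketch assumes a hypothetical one in which every line of the returned e-mail looks like `Q1: Agree = 4` (question label, response text, numerical value). Lines that do not fit this pattern, such as greetings or signatures, are skipped.

```python
import re

# Assumed line layout: "<question>: <response> = <numeric value>"
LINE = re.compile(
    r"^(?P<question>[^:]+):\s*(?P<response>.*?)\s*=\s*(?P<value>-?\d+)\s*$"
)


def parse_response(body):
    """Turn the text of a returned survey e-mail into (question, response, value) rows."""
    rows = []
    for line in body.splitlines():
        match = LINE.match(line.strip())
        if match:
            rows.append(
                (match["question"].strip(), match["response"], int(match["value"]))
            )
    return rows


if __name__ == "__main__":
    mail = "Q1: Agree = 4\nQ2: Never = 1\n\nThanks for taking part"
    for row in parse_response(mail):
        print(row)
```

Rows in this form can be written to a CSV file and loaded into a statistical software program without retyping the answers by hand.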


Beddows, E. (2008). The methodological issues associated with Internet-based research. International Journal of Emerging Technologies and Society, 6(2), 124-139.

Birnbaum, M.H. (2004). Human research and data collection via the Internet. Annual Review of Psychology, 55, 803–832. DOI: 10.1146/annurev.psych.55.090902.141601.

Gosling, S.D., Vazire, S., Srivastava, S., & John, O.P. (2004). Should we trust web-based studies? A comparative analysis of six preconceptions about Internet questionnaires. American Psychologist, 59(2), 93–104. DOI: 10.1037/0003-066X.59.2.93.

Illingworth, N. (2001). The Internet matters: Exploring the use of the Internet as a research tool. Sociological Research Online, 6(2),

Jones, S. (1999). Doing Internet research: Critical issues and methods for examining the Net. Thousand Oaks, CA: Sage Publications.

Kaye, B.K., & Johnson, T.J. (1999). Research methodology: Taming the cyber frontier: Techniques for improving online surveys. Social Science Computer Review, 17(3), 323–337. DOI: 10.1177/089443939901700307.
